Type: Package
Title: Bayesian Multivariate Receptor Modeling
Version: 2.4.0
Maintainer: Man-Suk Oh <msoh@ewha.ac.kr>
Description: Bayesian analysis of multivariate receptor modeling. The package consists of implementations of the methods of Park and Oh (2015) <doi:10.1016/j.chemolab.2015.08.021>.The package uses 'JAGS'(Just Another Gibbs Sampler) to generate Markov chain Monte Carlo samples of parameters.
Date: 2022-12-14
Depends: R (≥ 4.0.0)
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Encoding: UTF-8
Imports: rjags,coda,ggplot2,grDevices,graphics,gridExtra,rgl,shiny,shinythemes
Suggests: stats
RoxygenNote: 7.2.1
NeedsCompilation: no
Packaged: 2022-12-14 02:05:48 UTC; EKLee
Author: Man-Suk Oh [aut, cre], Eun-Kyung Lee [aut], Eun Sug Park [aut]
Repository: CRAN
Date/Publication: 2022-12-14 07:30:05 UTC

PM2.5 speciation data from El Paso, Texas, USA.

Description

The data frame has the following components:

References

Park, E.S. and Oh, M-S. (2016), Bayesian Quantile Multivariate Receptor Modeling, Chemometrics and intelligent laboratory systems, 159, 174-180.

Examples

data(Elpaso)
Y=Elpaso$Y
muP=Elpaso$muP

Shiny App for exploring the results of Bayesian multivariate receptor modeling

Description

Call Shiny to show the results of Bayesian analysis of multivariate receptor modeling in a web-based application. This object contains

Usage

bayesMRMApp(x)

Arguments

x

an object of class bmrm, the output of the bmrm function

Value

shiny App


Bayesian Analysis of Multivariate Receptor Modeling

Description

Generate posterior samples of the source composition matrix P, the source contribution matrix A, and the error variance \Sigma using 'JAGS', and computes estimates of A,P,\Sigma.

Usage

bmrm(Y, q, muP,errdist="norm", df=4,
            varP.free=100, xi=NULL, Omega=NULL,
              a0=0.01, b0=0.01,
             nAdapt=1000, nBurnIn=5000, nIter=5000, nThin=1,
             P.init=NULL, A.init=NULL, Sigma.init=NULL,...)

Arguments

Y

data matrix

q

number of sources. It must be a positive integer.

muP

(q,ncol(Y))-dimensional prior mean matrix for the source composition matrix P, where q is the number of sources. Zeros need to be assigned to prespecified elements of muP to satisfy the identifiability condition C1. For the remaining free elements, any nonnegative numbers (between 0 and 1 preferably) can be assigned. If no or an insufficient number of zeros are preassigned in muP, estimation can still be performed but the resulting estimates may be subject to rotational ambiguities. (default=0.5 for nonzero elements ).

errdist

error distribution: either "norm" for normal distribution or "t" for t distribution (default="norm")

df

degrees of freedom of a t-distribution when errdist="t" (default=4)

varP.free

scalar value of the prior variance of the free (nonzero) elements of the source composition matrix P (default=100)

xi

prior mean vector of the q-dimensional source contribution vector at time t (default=vector of 1's)

Omega

diagonal matrix of the prior variance of the q-dimensional source contribution vector at time t (default=identity matrix)

a0

shape parameter of the Inverse Gamma prior of the error variance (default=0.01)

b0

scale parameter of the Inverse Gamma prior of the error variance (default=0.01)

nAdapt

number of iterations for adaptation in 'JAGS' (default=1000)

nBurnIn

number of iterations for the burn-in period in MCMC (default=5000)

nIter

number of iterations for monitoring samples from MCMC (default=5000). nIter samples are saved in each chain of MCMC.

nThin

thinning interval for monitoring samples from MCMC (default=1)

P.init

initial value of the source composition matrix P. If omitted, zeros are assigned to the elements corresponding to zero elements in muP and the nonzero elements of P.init will be randomly generated from a uniform distrbution.

A.init

initial value of the source contribution matrix A. If omitted, it will be calculated from Y and P.init.

Sigma.init

initial value of the error variance. If omitted, it will be calculated from Y, A.init and P.init.

...

arguments to be passed to methods

Details

Model

The basic model for Bayesian multivariate receptor model is as follows:

Y_t=A_t P+E_t, t=1,\cdots,T,

where

Priors

Notes

Value

in bmrm object

nsource

number of sources

nobs

number of observations in data Y

nvar

number of variables in data Y

Y

observed data matrix

muP

prior mean of the source composition matrix P

errdist

error distribution

df

degrees of freedom when errdist="t"

A.hat

posterior mean of the source contribution matrix A

P.hat

posterior mean of the source composition matrix P

Sigma.hat

posterior mean of the error variance Sigma

A.sd

posterior standard deviation of the source contribution matrix A

P.sd

posterior standard deviation of the source composition matrix P

Sigma.sd

posterior standard deviation of the error variance Sigma

A.quantiles

posterior quantlies of A for prob=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975)

P.quantiles

posterior quantiles of P for prob=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975)

Sigma.quantiles

posterior quantiles of Sigma for prob=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975)

Y.hat

predicted value of Y computed from A.hat*P.hat

residual

Y-Y.hat

codaSamples

MCMC posterior samples of A, P, and \Sigma in class "mcmc.list"

nIter

number of MCMC iterations per chain for monitoring samples from MCMC

nBurnIn

number of iterations for the burn-in period in MCMC

nThin

thinning interval for monitoring samples from MCMC

References

Park, E.S. and Oh, M-S. (2015), Robust Bayesian Multivariate Receptor Modeling, Chemometrics and intelligent laboratory systems, 149, 215-226.

Plummer, M. 2003. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. Proceedings of the 3rd international workshop on distributed statistical computing, pp. 125. Technische Universit at Wien, Wien, Austria.

Plummer, M. 2015. 'JAGS' Version 4.0.0 user manual.

Examples


data(Elpaso); Y=Elpaso$Y ; muP=Elpaso$muP ; q=nrow(muP)
out.Elpaso <- bmrm(Y,q,muP)
summary(out.Elpaso)
plot(out.Elpaso)


Convergence Diagnostics on MCMC samples in bmrm

Description

Compute convergence diagnostics of Geweke (1992), Heidelberger and Welch (1983), Raftery and Lewis(1992).

Usage

convdiag_bmrm(x , var="P", convdiag="geweke",print=TRUE,...)

Arguments

x

an object of class bmrm, the output of the bmrm function

var

name of a variable to which convergence disagnostics apply. It should be one of "A" (source contribution matrix), "P" (source composition or profile matrix), "Sigma" (error variance).

convdiag

vector of convergence diagnostic methods. It should be any subvector of ("geweke", "heidel","raftery" ) (default="geweke").

print

TRUE/FALSE, print convergence diagnostics results (default=TRUE)

...

arguments to be passed to methods

Details

Geweke's convergence diagnostic for Markov chains is based on a test for equality of the means of the first and last part of a Markov chain (by default the first 10% and the last 50%). If the samples are drawn from the stationary distribution of the chain, the two means should be equal and Geweke's statistic has an asymptotically standard normal distribution. We use the function geweke.diag in coda package (with default option) which provides the test statistics (standard Z-scores) and the upper bound of and p-values.

Heidelberger and Welch's convergence diagnostic tests the null hypothesis that the sampled values come from a stationary distribution. The test is successively applied, firstly to the whole chain, then after discarding the first 10%, 20%, ... of the chain until either the null hypothesis is accepted, or 50% of the chain has been discarded. We use the function heidel.diag (with default option) which provides the staionary test results and p-values.

Raftery and Lewis's diagnostic estimates the minimum number of iterations, burn-in, thinning interval for zero autocorrelation, satisfying specified conditions regarding quantile q of parameters of interest. The conditions are specified by a posterior quantile q of parameters, an acceptable tolerance (accuracy) r for q, a probability s of being within the interval q-r, q+r. We use the function raftery.diag (with default option).

Value

A list of convergence diagnostics results

convdiag

selected convergence diagnostic methods

geweke

Geweke's z-scores and p-values if convdiag includes "geweke", NULL if convdiag does not include "geweke"

heidel

Heidelberger and Welch's stationary test results and p-values if convdiag includes "heidel"; NULL if convdiag does not include "heidel"

raftery

Raftery and Lewis's estimates of burn-in, minimum number of iterations, and thinning if convdiag includes "raftery"; NULL if convdiag does not include "raftery"

References

Geweke, J.(1992) Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bayesian Statistics 4 (ed JM Bernado, JO Berger, AP Dawid and AFM Smith). Clarendon Press.

Heidelberger P, and Welch PD. (1981) A spectral method for confidence interval generation and run length control in simulations. Comm. ACM. 24, 233-245.

Heidelberger P. and Welch PD.(1983) Simulation run length control in the presence of an initial transient. Opns Res., 31, 1109-44,Oxford, UK.

Plummer, M., Best, N., Cowles, K. and Vines K. (2006) CODA: Convergence Diagnosis and Output Analysis for MCMC, R News, Vol 6, pp. 7-11.

Raftery, A.E. and Lewis, S.M. (1992). One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science, 7, 493-497.

Raftery, A.E. and Lewis, S.M. (1995). The number of iterations, convergence diagnostics and generic Metropolis algorithms. In Practical Markov Chain Monte Carlo (W.R. Gilks, D.J. Spiegelhalter and S. Richardson, eds.). London, U.K.: Chapman and Hall.

Examples


data(Elpaso)
Y=Elpaso$Y ; muP=Elpaso$muP
q=nrow(muP)
out.Elpaso <- bmrm(Y,q,muP, nAdapt=1000,nBurnIn=5000,nIter=5000,nThin=1)
conv1<-convdiag_bmrm(out.Elpaso,var="P",convdiag="raftery" )
conv2<-convdiag_bmrm(out.Elpaso,var="A", convdiag="geweke")
conv3<-convdiag_bmrm(out.Elpaso,var="Sigma", convdiag=c("geweke","heidel"))
conv4<-convdiag_bmrm(out.Elpaso,var="Sigma", convdiag=c("geweke","heidel", "raftery"))


Check the identifiability conditions

Description

Check the identifiability conditions C1-C2 of Park and Oh (2015).

Usage

idCond_check(P)

Arguments

P

source composition matrix in multivariate receptor model

Value

idCond TRUE if all the conditions are satisfied, FALSE otherwise


Principal component plot

Description

Draw principal component plots of data (Y) and source profiles (rows) of the estimated source composition matrix P.hat (and P0 if there is another source composition matrix P0 to compare, e.g., P0 could be the true P in simulation or P0 could be another estimate of P)

Usage

pcplot(x, P0, G3D=FALSE,...)

Arguments

x

an object of class bmrm, the output from the function bmrm

P0

estimated value of P (in simulation it can be the true value of P)

G3D

TRUE/FALSE, dynamic 3D plot (default=FALSE)

...

arguments to be passed to methods

Value

plot

Examples


data(Elpaso)
Y=Elpaso$Y ; muP=Elpaso$muP
q=nrow(muP)
out.Elpaso <- bmrm(Y,q,muP, nAdapt=1000,nBurnIn=5000,nIter=5000,nThin=1)
pcplot(out.Elpaso)
pcplot(out.Elpaso,G3D=TRUE)


Produce plots of the parameter estimates

Description

Produce plots of the estimated posterior mean and 95% posterior intervals of A,P, Sigma based on the MCMC samples in bmrm.

Usage

## S3 method for class 'bmrm'
plot(x, type = "both", ...)

Arguments

x

an object of class bmrm, the output of the function bmrm

type

name of a variable (default="P"). It should be one of "P"(source composition or profile matrix P), "A"(source contribution matrix A), "both" (both P and A), "Sigma" (error variance).

...

arguments to be passed to methods

Details

The following types of plots are drawn depending on the selected parameters:

Value

plot


Summarize the output of the bmrm function

Description

An S3 method that summarizes the output of the bmrm function in an object of class bmrm. This object contains the posterior mean, the posterior standard deviation, and (0.025, 0.05,0.25, 0.5, 0.75,0.95, 0.975) posterior quantiles of A, P, \Sigma. It also contains other relevant information about the MCMC procedure such as the burn-in iterations, the number of MCMC chains, etc.

Usage

## S3 method for class 'bmrm'
summary(object, digits = 3, ...)

Arguments

object

an object of class bmrm, the output of the bmrm function

digits

integer indicating the humber of signifiant digits

...

arguments to be passed to methods

Value

summary of bmrm class (output of bmrm function)


Trace and/or ACF plots of elements of a variable in bmrm object

Description

Produce trace and Auto-Correlation Function plots (along with Effective sample size) of MCMC samples of elements of A, nonzero elements of P, elements of Sigma.

Usage

trace_ACF_plot(x,var="P", ACF=FALSE, nplot=0,irow=1, icol=1, saveFile=FALSE,...)

Arguments

x

an object of class bmrm, the output of the bmrm function

var

name of a variable to which the plots apply. It should be one of "A" (source contribution matrix), "P" (source composition matrix), "Sigma" (error variance).

ACF

TRUE/FALSE if TRUE ACF plots will be provided along with effective sample sizes(dafault: FALSE)

nplot

number of elements of 'var' for trace and/or ACF plots. If 'nplot' is smaller than the total number of elements of 'var' then plots of 'nplot' selected elements will be drawn. Otherwise, trace and/or ACF plots of all elements will be drawn. (default=0 implies that all elements will be selected if var="P" or "Sigma", and the first 12 elements will be selected if var="A")

irow

row index of A/P matrix or index of element of Sigma vector. Plots of 'nplot' elements starting from (irow, icol) element of A/P or elements starting from irow element of Sigma will be drawn (default=1).

icol

column number of A/P matrix. Plots of 'nplot' elements starting from (irow, icol) element of A/P will be drawn (default=1).

saveFile

TRUE/FALSE, save the plots in file 'var'-trace.pdf (default=FALSE)

...

arguments to be passed to methods

Value

plot

Examples


data(Elpaso); Y=Elpaso$Y ; muP=Elpaso$muP ; q=nrow(muP)
out.Elpaso <- bmrm(Y,q,muP, nAdapt=1000,nBurnIn=5000,nIter=5000,nThin=1)
trace_ACF_plot(out.Elpaso,"Sigma", ACF=TRUE)
trace_ACF_plot(out.Elpaso,"P", ACF=TRUE)
trace_ACF_plot(out.Elpaso,"A",ACF=TRUE, nplot=12, irow=2, icol=3)