MCMCordfactanal {MCMCpack}    R Documentation

Markov chain Monte Carlo for Ordinal Data Factor Analysis Model

Description

This function generates a posterior density sample from an ordinal data factor analysis model. Normal priors are assumed on the factor loadings and factor scores while improper uniform priors are assumed on the cutpoints. The user supplies data and parameters for the prior distributions, and a sample from the posterior density is returned as an mcmc object, which can be subsequently analyzed with functions provided in the coda package.

Usage

MCMCordfactanal(x, factors, lambda.constraints=list(),
                data=parent.frame(), burnin = 1000, mcmc = 20000,
                thin=1, tune=NA, verbose = FALSE, seed = NA,
                lambda.start = NA, l0=0, L0=0,
                store.lambda=TRUE, store.scores=FALSE,
                drop.constantvars=TRUE, ... )
 

Arguments

x Either a formula or a numeric matrix containing the manifest variables.
factors The number of factors to be fitted.
lambda.constraints List of lists specifying possible equality or simple inequality constraints on the factor loadings. A typical entry in the list has one of three forms: varname=list(d,c), which will constrain the dth loading for the variable named varname to be equal to c; varname=list(d,"+"), which will constrain the dth loading for the variable named varname to be positive; and varname=list(d,"-"), which will constrain the dth loading for the variable named varname to be negative. If x is a matrix without column names, default names of "V1", "V2", ..., etc. will be used. Note that, unlike MCMCfactanal, the Lambda matrix used here has factors+1 columns. The first column of Lambda corresponds to negative item difficulty parameters and should generally not be constrained.
data A data frame.
burnin The number of burn-in iterations for the sampler.
mcmc The number of iterations for the sampler.
thin The thinning interval used in the simulation. The number of iterations must be divisible by this value.
tune The tuning parameter for the Metropolis-Hastings sampling. Can be either a scalar or a k-vector. Must be strictly positive.
verbose A switch which determines whether or not the progress of the sampler is printed to the screen. If TRUE, the iteration number and the Metropolis-Hastings acceptance rate are printed to the screen.
seed The seed for the random number generator. If NA, the Mersenne Twister generator is used with default seed 12345; if an integer is passed it is used to seed the Mersenne Twister. The user can also pass a list of length two to use the L'Ecuyer random number generator, which is suitable for parallel computation. The first element of the list is the L'Ecuyer seed, which is a vector of length six or NA (if NA a default seed of rep(12345,6) is used). The second element of the list is a positive substream number. See the MCMCpack specification for more details.
lambda.start Starting values for the factor loading matrix Lambda. If lambda.start is set to a scalar the starting value for all unconstrained loadings will be set to that scalar. If lambda.start is a matrix of the same dimensions as Lambda then the lambda.start matrix is used as the starting values (except for equality-constrained elements). If lambda.start is set to NA (the default) then starting values for unconstrained elements in the first column of Lambda are based on the observed response pattern, the remaining unconstrained elements of Lambda are set to , and starting values for inequality constrained elements are set to either 1.0 or -1.0 depending on the nature of the constraints.
l0 The means of the independent Normal prior on the factor loadings. Can be either a scalar or a matrix with the same dimensions as Lambda.
L0 The precisions (inverse variances) of the independent Normal prior on the factor loadings. Can be either a scalar or a matrix with the same dimensions as Lambda.
store.lambda A switch that determines whether or not to store the factor loadings for posterior analysis. By default, the factor loadings are all stored.
store.scores A switch that determines whether or not to store the factor scores for posterior analysis. NOTE: This takes an enormous amount of memory, so should only be used if the chain is thinned heavily, or for applications with a small number of observations. By default, the factor scores are not stored.
drop.constantvars A switch that determines whether or not manifest variables that have no variation should be deleted before fitting the model. Default = TRUE.
... further arguments to be passed
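
As an illustration of the argument forms described above, the following sketch builds a lambda.constraints list and the documented seed forms. The variable names, the constrained values, and the seed numbers are hypothetical choices made for this illustration; the sketch shows only the shape of the arguments and is not a recommendation for identifying any particular model.

```r
# Illustrative constraint list for a one-factor model. Lambda then has
# 2 columns: column 1 = negative item difficulty, column 2 = factor loading.
constraints <- list(
  Drawing     = list(2, "+"),   # 2nd loading of Drawing constrained positive
  Colour      = list(2, "-"),   # 2nd loading of Colour constrained negative
  Composition = list(2, 1.0)    # 2nd loading of Composition fixed at 1.0
)

# Forms of the seed argument, following the description above.
seed.mt      <- 20040101                # integer: seeds the Mersenne Twister
seed.lecuyer <- list(rep(12345, 6), 1)  # L'Ecuyer seed vector and substream 1
seed.default <- list(NA, 2)             # default L'Ecuyer seed, substream 2

# The number of iterations must be divisible by the thinning interval.
mcmc <- 20000
thin <- 10
stopifnot(mcmc %% thin == 0)
```

These objects would then be passed as lambda.constraints=constraints and seed=seed.lecuyer (or one of the other seed forms) in the call to MCMCordfactanal.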

Details

The model takes the following form:

Let i=1,...,n index observations and j=1,...,k index response variables within an observation. The typical observed variable x_ij is ordinal with a total of C_j categories. The distribution of X is governed by an n by k matrix of latent variables Xstar and a series of cutpoints gamma. Xstar is assumed to be generated according to:

xstar_i = Lambda phi_i + epsilon_i

epsilon_i ~ N(0, I)

where xstar_i is the k-vector of latent variables specific to observation i, Lambda is the k by d matrix of factor loadings, and phi_i is the d-vector of latent factor scores. It is assumed that the first element of phi_i is equal to 1 for all i.
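
The generative structure can be sketched by simulating from it. This is a minimal illustration, not the sampler itself: the loadings and cutpoints are hypothetical values, the same cutpoints are assumed for every item to keep the code short, and the observed category is assumed to be the interval between adjacent cutpoints in which the latent value falls.

```r
set.seed(1)
n <- 200   # observations
k <- 4     # response variables
d <- 2     # columns of Lambda (one factor plus the constant)

# Hypothetical loadings: column 1 = intercepts, column 2 = factor loadings.
Lambda <- cbind(c(0.5, 0.0, -0.5, 1.0),
                c(1.0, 0.8,  0.6, 0.4))

# Factor scores: first element fixed at 1, the rest drawn from N(0, 1).
phi <- cbind(1, matrix(rnorm(n * (d - 1)), n, d - 1))

# Latent responses: row i holds xstar_i = Lambda phi_i + epsilon_i.
Xstar <- phi %*% t(Lambda) + matrix(rnorm(n * k), n, k)

# Cutpoints for three categories: -Inf, 0, 1.2, Inf (assumed values).
gamma <- c(-Inf, 0, 1.2, Inf)

# Observed ordinal data: findInterval returns c when
# gamma[c] <= xstar < gamma[c + 1], so X takes the values 1, 2, 3.
X <- matrix(findInterval(Xstar, gamma), n, k)
```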

The probability that the jth variable in observation i takes the value c is:

pi_ijc = pnorm(gamma_jc - Lambda'_j phi_i) - pnorm(gamma_j(c-1) - Lambda'_j phi_i)
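
The formula can be evaluated directly in R. The helper category_probs below is a hypothetical name introduced for this sketch, and the parameter values are made up; the outermost cutpoints are taken to be -Inf and Inf so that the probabilities cover the whole real line.

```r
# Category probabilities for one response variable j and one observation i.
category_probs <- function(lambda_j, phi_i, gamma_j) {
  eta  <- sum(lambda_j * phi_i)    # Lambda'_j phi_i
  cuts <- c(-Inf, gamma_j, Inf)    # add the outer cutpoints
  diff(pnorm(cuts - eta))          # pi_ij1, ..., pi_ijC
}

lambda_j <- c(0.5, 1.0)   # hypothetical intercept and loading
phi_i    <- c(1, -0.3)    # first element fixed at 1
gamma_j  <- c(0, 1.2)     # interior cutpoints; the first is normalized to 0

p <- category_probs(lambda_j, phi_i, gamma_j)
p
sum(p)   # the C_j category probabilities sum to 1
```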

The implementation used here assumes independent conjugate priors for each element of Lambda and each phi_i. More specifically we assume:

Lambda_ij ~ N(l0_ij, L0_ij^-1), i=1,...,k, j=1,...,d

phi_i(2:d) ~ N(0, I), i=1,...,n
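
Because L0 is a precision rather than a variance, it can help to convert it to a standard deviation when choosing values. The sketch below uses hypothetical dimensions and the value L0=0.5; it draws one Lambda matrix from the prior purely for illustration.

```r
set.seed(2)
k <- 4   # response variables
d <- 2   # columns of Lambda

# Prior means and precisions as k by d matrices (scalars are also accepted).
l0 <- matrix(0,   k, d)
L0 <- matrix(0.5, k, d)

# A precision of 0.5 corresponds to a prior standard deviation of
# 1 / sqrt(0.5), roughly 1.41.
prior.sd <- 1 / sqrt(L0)

# One draw of Lambda from the independent Normal prior.
Lambda.draw <- matrix(rnorm(k * d, mean = l0, sd = prior.sd), k, d)
```

A precision of 0, the default for L0, corresponds to an infinite prior variance, that is, an improper flat prior, so it cannot be sampled from in this way.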

The standard two-parameter item response theory model with probit link is a special case of the model sketched above.
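
To see the special case concretely, consider a response variable with two categories and a single cutpoint normalized to zero. The sketch below checks numerically that the ordinal formula then matches a two-parameter probit item response function. The names alpha, beta, and theta, the sign convention alpha = -Lambda_j1, and the numeric values are assumptions made for this illustration.

```r
# Binary item: two categories, single cutpoint normalized to 0.
lambda_j <- c(-0.7, 1.3)   # hypothetical: negative difficulty, discrimination
theta    <- 0.4            # hypothetical ability, so phi_i = c(1, theta)

eta <- sum(lambda_j * c(1, theta))

# Probability of the higher category from the ordinal formula,
# with cutpoints 0 and Inf.
p.ordinal <- pnorm(Inf - eta) - pnorm(0 - eta)

# Two-parameter probit form with difficulty alpha and discrimination beta.
alpha <- -lambda_j[1]
beta  <- lambda_j[2]
p.irt <- pnorm(beta * theta - alpha)

all.equal(p.ordinal, p.irt)
```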

MCMCordfactanal simulates from the posterior density using a Metropolis-Hastings within Gibbs sampling algorithm. The algorithm employed is based on work by Cowles (1996). Note that the first element of phi_i is a 1. As a result, the first column of Lambda can be interpreted as negative item difficulty parameters. Further, the first element gamma_1 is normalized to zero, and thus not returned in the mcmc object. The simulation proper is done in compiled C++ code to maximize efficiency. Please consult the coda documentation for a comprehensive list of functions that can be used to analyze the posterior density sample.

Value

An mcmc object that contains the posterior density sample. This object can be summarized by functions provided by the coda package.
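
A few of the coda summaries that apply to such an object are sketched below. The matrix draws is a simulated stand-in for sampler output, and the column names param.a and param.b are hypothetical; they are not the names MCMCordfactanal assigns. The coda calls are guarded so the sketch also runs where coda is not installed.

```r
set.seed(3)
# Stand-in for sampler output: 500 stored draws of two parameters.
draws <- cbind(param.a = rnorm(500,  0.8, 0.1),
               param.b = rnorm(500, -0.2, 0.1))

if (requireNamespace("coda", quietly = TRUE)) {
  post <- coda::mcmc(draws, start = 1001, thin = 1)
  print(summary(post))               # posterior means, sds, and quantiles
  print(coda::effectiveSize(post))   # effective sample size per parameter
  print(coda::HPDinterval(post))     # highest posterior density intervals
}
```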

References

Shawn Treier and Simon Jackman. 2003. "Democracy as a Latent Variable." Paper presented at the Midwest Political Science Association Annual Meeting.

M. K. Cowles. 1996. "Accelerating Monte Carlo Markov Chain Convergence for Cumulative-link Generalized Linear Models." Statistics and Computing. 6: 101-110.

Valen E. Johnson and James H. Albert. 1999. "Ordinal Data Modeling." Springer: New York.

Andrew D. Martin, Kevin M. Quinn, and Daniel Pemstein. 2004. Scythe Statistical Library 1.0. http://scythe.wustl.edu.

Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2002. Output Analysis and Diagnostics for MCMC (CODA). http://www-fis.iarc.fr/coda/.

See Also

plot.mcmc, summary.mcmc, factanal, MCMCfactanal, MCMCirt1d, MCMCirtKd

Examples

   ## Not run: 
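   library(MASS)   # provides the painters data
   data(painters)
   new.painters <- painters[,1:4]
   ## quartiles of each score, one column per variable
   cuts <- apply(new.painters, 2, quantile, c(.25, .50, .75))
   ## recode each score into four ordered categories; the codes 100-400
   ## lie above the raw scores, so recoded values are not matched again
   for (i in 1:4){
      new.painters[new.painters[,i]<cuts[1,i],i] <- 100
      new.painters[new.painters[,i]<cuts[2,i],i] <- 200
      new.painters[new.painters[,i]<cuts[3,i],i] <- 300
      new.painters[new.painters[,i]<100,i] <- 400
   }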

   posterior <- MCMCordfactanal(~Composition+Drawing+Colour+Expression,
                        data=new.painters, factors=1,
                        lambda.constraints=list(Drawing=list(2,"+")),
                        burnin=5000, mcmc=500000, thin=200, verbose=TRUE,
                        L0=0.5, store.lambda=TRUE,
                        store.scores=TRUE, tune=1.2)
   plot(posterior)
   summary(posterior)
   ## End(Not run)

[Package MCMCpack version 0.5-2 Index]