ggm.estimate.pcor {GeneTS}    R Documentation

Graphical Gaussian Models: Small Sample Estimation of Partial Correlation

Description

ggm.estimate.pcor implements several point estimators of partial correlation that are suitable for small-sample data sets. Their statistical properties are investigated in detail in Schaefer and Strimmer (2004).

Usage

ggm.estimate.pcor(x, method = c("observed.pcor", "partial.bagged.cor", "bagged.pcor"), R = 1000, ...)

Arguments

x data matrix (each row corresponds to one multivariate observation)
method method used to estimate the partial correlation matrix. Available options are "observed.pcor" (default), "partial.bagged.cor", and "bagged.pcor".
R number of bootstrap replicates (bagged estimators only)
... options passed to partial.cor, bagged.cor, and bagged.pcor.

Details

The findings of Schaefer and Strimmer (2004) can be summarized as follows (N is the sample size and G is the number of variables):

observed.pcor: Observed partial correlation (Pi-1). Should be used preferentially for N >> G. In this region the other two estimators perform equally well but are slower due to bagging.

partial.bagged.cor: Partial bagged correlation (Pi-2). Best used for small-sample applications with N < G. Here the advantages of Pi-2 are its small variance, its high accuracy as a point estimate, and its overall best power and positive predictive value (PPV). In addition, it is computationally less expensive than Pi-3.

bagged.pcor: Bagged partial correlation (Pi-3). May be used in the critical zone (N = G) and for sample sizes N slightly larger than the number of variables G.

These results favor the partial bagged correlation Pi-2 as the estimator of choice for inferring GGM networks from small-sample (gene expression) data.
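
The guidance above can be turned into a small selection rule. The following is a minimal sketch, not part of GeneTS: the helper name choose.pcor.method and the cutoff large.ratio = 5 (standing in for "N >> G") are assumptions made for illustration, as is the choice to treat every case between N = G and that cutoff as the critical zone.

```r
# Hypothetical helper (not a GeneTS function): map the sample size N and the
# number of variables G to one of the three method names described above.
# large.ratio is an assumed cutoff for "N >> G"; the text gives no number.
choose.pcor.method <- function(N, G, large.ratio = 5) {
  stopifnot(N > 0, G > 0, large.ratio > 1)
  if (N < G) {
    "partial.bagged.cor"   # small sample: Pi-2
  } else if (N >= large.ratio * G) {
    "observed.pcor"        # N >> G: Pi-1
  } else {
    "bagged.pcor"          # N = G or somewhat larger: Pi-3
  }
}

choose.pcor.method(20, 40)    # fewer observations than variables
choose.pcor.method(40, 40)    # critical zone
choose.pcor.method(400, 40)   # many more observations than variables
```

The returned string could be passed as the method argument of ggm.estimate.pcor, for example with N = nrow(x) and G = ncol(x).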

Value

An estimated partial correlation matrix.
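
To illustrate the shape of such a matrix, the sketch below computes partial correlations from the inverse correlation matrix using the textbook definition. It is written in base R and is not the GeneTS implementation: the function name observed.pcor.sketch is hypothetical, the unit diagonal is a convention chosen here, and the code assumes N > G so that the correlation matrix can be inverted with solve.

```r
# Textbook definition, assuming N > G so that cor(x) is invertible:
#   pcor_ij = -omega_ij / sqrt(omega_ii * omega_jj),  omega = solve(cor(x))
observed.pcor.sketch <- function(x) {
  omega <- solve(cor(x))        # inverse correlation matrix
  d <- sqrt(diag(omega))
  pc <- -omega / outer(d, d)    # standardize and flip the sign
  diag(pc) <- 1                 # convention used in this sketch
  pc
}

set.seed(1)
x <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)  # 200 observations, 3 variables
x[, 2] <- x[, 2] + 0.5 * x[, 1]                    # make variables 1 and 2 dependent
p <- observed.pcor.sketch(x)
dim(p)            # one row and one column per variable
print(round(p, 3))
```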

Author(s)

Juliane Schaefer (http://www.stat.uni-muenchen.de/~schaefer/) and Korbinian Strimmer (http://www.stat.uni-muenchen.de/~strimmer/).

References

Schaefer, J., and Strimmer, K. (2004). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics, in press.

See Also

ggm.simulate.data, ggm.estimate.pcor.

Examples

# load GeneTS library
library(GeneTS)

# generate random network with 40 nodes 
# it contains 780=40*39/2 edges of which 5 percent (=39) are non-zero
true.pcor <- ggm.simulate.pcor(40)
  
# simulate data set with 40 observations
m.sim <- ggm.simulate.data(40, true.pcor)

# simple estimate of partial correlations
estimated.pcor <- partial.cor(m.sim)

# comparison of estimated and true model
sum((true.pcor-estimated.pcor)^2)

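# a slightly better estimate: bagged partial correlation (Pi-3),
# intended for the critical zone N = G (here 40 observations, 40 variables)
estimated.pcor.2 <- ggm.estimate.pcor(m.sim, method = "bagged.pcor")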
sum((true.pcor-estimated.pcor.2)^2)
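
The bagged estimators rely on bootstrap replicates (argument R). The sketch below shows one way to read the two bagged variants in base R, without GeneTS: the partial bagged correlation is taken here as the partial correlation of the bootstrap-averaged correlation matrix, and the bagged partial correlation as the bootstrap average of the partial correlations. This reading of the names is an assumption, the functions cor2pcor.sketch and bagged.sketch are hypothetical, and the code assumes N is comfortably larger than G so that every resampled correlation matrix is invertible.

```r
# Hypothetical helper: partial correlations from a correlation matrix r,
# assuming r is invertible.
cor2pcor.sketch <- function(r) {
  omega <- solve(r)
  d <- sqrt(diag(omega))
  pc <- -omega / outer(d, d)
  diag(pc) <- 1
  pc
}

# Hypothetical sketch of bagging with R bootstrap replicates.
bagged.sketch <- function(x, R = 50) {
  G <- ncol(x)
  sum.cor <- matrix(0, G, G)
  sum.pcor <- matrix(0, G, G)
  for (b in seq_len(R)) {
    xb <- x[sample(nrow(x), replace = TRUE), , drop = FALSE]  # resample rows
    rb <- cor(xb)
    sum.cor <- sum.cor + rb
    sum.pcor <- sum.pcor + cor2pcor.sketch(rb)
  }
  list(
    partial.bagged.cor = cor2pcor.sketch(sum.cor / R),  # average first, then invert
    bagged.pcor = sum.pcor / R                          # invert first, then average
  )
}

set.seed(42)
x <- matrix(rnorm(100 * 4), nrow = 100)  # 100 observations, 4 variables
est <- bagged.sketch(x, R = 50)
round(est$partial.bagged.cor, 3)
round(est$bagged.pcor, 3)
```

The second variant inverts a matrix in every replicate, which is consistent with the remark in Details that Pi-2 is computationally less expensive than Pi-3.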

[Package GeneTS version 2.3 Index]