ggm.estimate.pcor {GeneTS}    R Documentation

Graphical Gaussian Models: Small Sample Estimation of Partial Correlation


ggm.estimate.pcor implements several point estimators of partial correlation that are suitable for small-sample data sets. Their statistical properties are investigated in detail in Schaefer and Strimmer (2004).


ggm.estimate.pcor(x, method = c("observed.pcor", "partial.bagged.cor", "bagged.pcor"), R = 1000, ...)


x data matrix (each row corresponds to one multivariate observation)
method method used to estimate the partial correlation matrix. Available options are "observed.pcor" (default), "partial.bagged.cor", and "bagged.pcor".
R number of bootstrap replicates (bagged estimators only)
... options passed to partial.cor, bagged.cor, and bagged.pcor.


The findings of Schaefer and Strimmer (2004) can be summarized as follows (with N being the sample size and G the number of variables):

observed.pcor: Observed partial correlation (Pi-1). Should be used preferentially for N >> G. In this region the other two estimators perform equally well but are slower due to bagging.

partial.bagged.cor: Partial bagged correlation (Pi-2). Best used for small sample applications with N < G. Here the advantages of Pi-2 are its small variance, its high accuracy as a point estimate, and its overall best power and positive predictive value (PPV). In addition it is computationally less expensive than Pi-3.

bagged.pcor: Bagged partial correlation (Pi-3). May be used in the critical zone (N = G) and for sample sizes N slightly larger than the number of variables G.

Overall, these results favor the partial bagged correlation Pi-2 as the estimator of choice for inferring GGM networks from small-sample (gene expression) data.
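The difference between the three estimators is mainly the order in which bagging and the conversion from correlation to partial correlation are applied. The following base-R sketch illustrates that order of operations; it is not the GeneTS implementation. The helper names (pinv, cor.to.pcor and the *.sketch functions) are hypothetical, and the use of a pseudoinverse so that the N < G case stays computable is an assumption made for this illustration.

```r
# Moore-Penrose pseudoinverse via SVD (base R only). Assumed here so that
# a singular correlation matrix (N < G) can still be "inverted".
pinv <- function(m, tol = sqrt(.Machine$double.eps)) {
  s <- svd(m)
  keep <- s$d > tol * max(s$d)
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Correlation matrix -> partial correlation matrix:
# negate the standardized (pseudo)inverse and set the diagonal to 1.
cor.to.pcor <- function(r) {
  p <- -cov2cor(pinv(r))
  diag(p) <- 1
  p
}

# Draw one bootstrap sample of the rows (observations) of x.
boot.rows <- function(x) {
  x[sample.int(nrow(x), replace = TRUE), , drop = FALSE]
}

# Pi-1 analogue: partial correlation of the observed correlation matrix.
observed.pcor.sketch <- function(x) {
  cor.to.pcor(cor(x))
}

# Pi-2 analogue: bag the correlation matrix first, then convert once.
partial.bagged.cor.sketch <- function(x, R = 100) {
  acc <- matrix(0, ncol(x), ncol(x))
  for (b in seq_len(R)) acc <- acc + cor(boot.rows(x))
  cor.to.pcor(acc / R)
}

# Pi-3 analogue: convert every bootstrap replicate, then average.
bagged.pcor.sketch <- function(x, R = 100) {
  acc <- matrix(0, ncol(x), ncol(x))
  for (b in seq_len(R)) acc <- acc + cor.to.pcor(cor(boot.rows(x)))
  acc / R
}

# Toy data: 20 observations of 5 variables.
set.seed(1)
x <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)
p1 <- observed.pcor.sketch(x)
p2 <- partial.bagged.cor.sketch(x, R = 50)
p3 <- bagged.pcor.sketch(x, R = 50)
```

In this sketch the Pi-2 analogue performs a single matrix (pseudo)inversion regardless of R, whereas the Pi-3 analogue performs one per bootstrap replicate, which is consistent with the remark above that Pi-2 is computationally less expensive than Pi-3.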


An estimated partial correlation matrix.


Juliane Schaefer and Korbinian Strimmer


Schaefer, J., and Strimmer, K. (2004). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics in press.

See Also: ggm.estimate.pcor.


# load GeneTS library
library(GeneTS)

# generate random network with 40 nodes 
# it contains 780=40*39/2 edges of which 5 percent (=39) are non-zero
true.pcor <- ggm.simulate.pcor(40)

# simulate data set with 40 observations
m.sim <- ggm.simulate.data(40, true.pcor)

# simple estimate of partial correlations
estimated.pcor <- partial.cor(m.sim)

# comparison of estimated and true model
# (sum of squared differences; smaller is better)
sum((true.pcor - estimated.pcor)^2)

# a slightly better estimate ...
estimated.pcor.2 <- ggm.estimate.pcor(m.sim, method = c("bagged.pcor"))
sum((true.pcor - estimated.pcor.2)^2)

[Package GeneTS version 2.3 Index]