ebayes {limma}    R Documentation

Empirical Bayes Statistics for Differential Expression

Description

Given a series of related parameter estimates and standard errors, compute moderated t-statistics and log-odds of differential expression by empirical Bayes shrinkage of the standard errors towards a common value.

Usage

ebayes(fit,proportion=0.01,stdev.coef.lim=c(0.1,4))
eBayes(fit,proportion=0.01,stdev.coef.lim=c(0.1,4))

Arguments

fit a list object produced by lm.series, gls.series, mrlm or lmFit containing components coefficients, stdev.unscaled, sigma and df.residual
proportion numeric value between 0 and 1, assumed proportion of genes which are differentially expressed
stdev.coef.lim numeric vector of length 2, assumed lower and upper limits for the standard deviation of log2 fold changes for differentially expressed genes

Details

These functions are used to rank genes in order of evidence for differential expression. They use an empirical Bayes method to shrink the gene-wise sample variances towards a common value and, in so doing, augment the degrees of freedom for the individual variances. The functions accept as input the output of lmFit, lm.series, mrlm or gls.series. The estimates s2.prior and df.prior are computed by fitFDist. s2.post is the weighted average of s2.prior and sigma^2 with weights proportional to df.prior and df.residual respectively. The lods is sometimes known as the B-statistic.
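The weighted average that defines s2.post can be sketched in a few lines of base R. The input values below are hypothetical and chosen only for illustration; they are not output from limma, and the sketch assumes df.prior is finite.

```r
# Hypothetical inputs for three genes (illustrative values, not limma output)
s2.prior    <- 0.09               # prior variance, of the kind fitFDist estimates
df.prior    <- 4                  # prior degrees of freedom
sigma       <- c(0.2, 0.4, 0.6)   # gene-wise residual standard deviations
df.residual <- 5                  # residual degrees of freedom for each gene

# Posterior variance: weighted average of s2.prior and sigma^2,
# with weights proportional to df.prior and df.residual
s2.post <- (df.prior * s2.prior + df.residual * sigma^2) / (df.prior + df.residual)

# Each posterior value lies between the sample variance and the prior value
cbind(sample = sigma^2, prior = s2.prior, posterior = s2.post)
```

Each shrunken variance carries df.prior + df.residual degrees of freedom, which is 9 in this sketch.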

eBayes doesn't compute ordinary (unmoderated) t-statistics by default, but these can be easily extracted from the linear model output; see the example below.

ebayes is the earlier and leaner function. eBayes is intended to have a more object-oriented flavor, as it produces objects containing all the necessary components for downstream analysis.

Value

ebayes produces an ordinary list with the following components. eBayes adds the following components to fit to produce an augmented object, usually of class MArrayLM.

t numeric vector or matrix of moderated t-statistics
p.value numeric vector of p-values corresponding to the t-statistics
s2.prior estimated prior value for sigma^2
df.prior degrees of freedom associated with s2.prior
s2.post vector giving the posterior values for sigma^2
lods numeric vector or matrix giving the log-odds of differential expression
var.prior estimated prior value for the variance of the log2-fold-change for differentially expressed genes
F numeric vector of F-statistics for testing whether all contrasts are simultaneously equal to zero
F.p.value numeric vector giving p-values corresponding to F

Author(s)

Gordon Smyth

References

Lönnstedt, I. and Speed, T. P. (2002). Replicated microarray data. Statistica Sinica 12, 31-46.

Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3, No. 1, Article 3. http://www.bepress.com/sagmb/vol3/iss1/art3

See Also

squeezeVar, fitFDist, tmixture.matrix.

An overview of linear model functions in limma is given by 06.LinearModels.

Examples

#  See also lmFit examples

#  Simulate gene expression data:
#  6 microarrays and 100 genes, with one gene differentially expressed
set.seed(2004); invisible(runif(100))
M <- matrix(rnorm(100*6,sd=0.3),100,6)
M[1,] <- M[1,] + 1
fit <- lmFit(M)

#  Ordinary t-statistic
par(mfrow=c(1,2))
ordinary.t <- fit$coefficients / fit$stdev.unscaled / fit$sigma
qqt(ordinary.t,df=fit$df.residual,main="Ordinary t")
abline(0,1)

#  Moderated t-statistic
eb <- eBayes(fit)
qqt(eb$t,df=eb$df.prior+eb$df.residual,main="Moderated t")
abline(0,1)
#  Points off the line may be differentially expressed
par(mfrow=c(1,1))

[Package limma version 2.4.7 Index]