Pool Sample Variances with Unequal Variances


Compute the Satterthwaite (1946) approximation to the distribution of a weighted sum of sample variances.


poolVar(var, df=n-1, multiplier=1/n, n)


var numeric vector of independent sample variances
df numeric vector of degrees of freedom for the sample variances
multiplier numeric vector giving multipliers for the sample variances
n numeric vector of sample sizes


The sample variances var are assumed to follow scaled chi-square distributions. A scaled chi-square approximation to the distribution of sum(multiplier * var) is found by equating first and second moments. On output, the product multiplier * var of the returned components is equal to the sum being approximated, and it follows approximately a scaled chi-square distribution on df degrees of freedom. The approximation was proposed by Satterthwaite (1946).
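
The moment-matching step can be sketched in base R as follows. This is a minimal illustration and not necessarily how poolVar is implemented; the helper name satterthwaite and its return components are hypothetical. It assumes each var[i] is distributed as sigma[i]^2 times a chi-square on df[i] degrees of freedom divided by df[i], and it plugs the sample variances in for the unknown sigma[i]^2.

```r
# Hypothetical helper (not part of any package): Satterthwaite
# approximation for the weighted sum sum(multiplier * var).
satterthwaite <- function(var, df, multiplier) {
  terms <- multiplier * var   # individual weighted variances
  total <- sum(terms)         # the sum to be approximated
  # Equating the mean and variance of 'total' to those of a scaled
  # chi-square on nu degrees of freedom gives
  #   nu = total^2 / sum(terms^2 / df)
  df.eff <- total^2 / sum(terms^2 / df)
  list(total = total, df = df.eff)
}

# Illustrative inputs only: two groups of sizes 10 and 20
res <- satterthwaite(var = c(4, 1), df = c(9, 19), multiplier = c(1/10, 1/20))
res$total
res$df
```

The effective degrees of freedom always lie between min(df) and sum(df), which is a quick sanity check on any implementation.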

If there are only two groups, the degrees of freedom are one less than the sample sizes, and the multipliers are the default 1/n, then multiplier * var on output is the squared denominator of Welch's t-test for unequal variances and df is its degrees of freedom.
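
The two-group case can be checked against t.test using base R alone. The sketch below computes the Welch quantities from their closed forms rather than by calling poolVar; the seed and the simulated data are arbitrary choices made only for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
x <- rnorm(10, mean = 1, sd = 2)
y <- rnorm(20, mean = 2, sd = 1)

s2 <- c(var(x), var(y))
n <- c(length(x), length(y))

# Squared denominator of Welch's t-statistic: s1^2/n1 + s2^2/n2
se2 <- sum(s2 / n)

# Satterthwaite degrees of freedom with df = n - 1 and multiplier = 1/n
df.welch <- se2^2 / sum((s2 / n)^2 / (n - 1))

tstat <- (mean(x) - mean(y)) / sqrt(se2)
pvalue <- 2 * pt(-abs(tstat), df = df.welch)

# t.test performs Welch's test by default (var.equal = FALSE)
fit <- t.test(x, y)
all.equal(unname(fit$statistic), tstat)
all.equal(unname(fit$parameter), df.welch)
all.equal(fit$p.value, pvalue)
```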


A list with components

var effective pooled sample variance
df effective pooled degrees of freedom
multiplier pooled multiplier


Gordon Smyth


#  Welch's t-test with unequal variances
x <- rnorm(10,mean=1,sd=2)
y <- rnorm(20,mean=2,sd=1)
s2 <- c(var(x),var(y))
n <- c(10,20)
out <- poolVar(var=s2,n=n)
tstat <- (mean(x)-mean(y)) / sqrt(out$var*out$multiplier)
pvalue <- 2*pt(-abs(tstat),df=out$df)
#  Equivalent to t.test(x,y)

