poolVar {limma} | R Documentation

## Pool Sample Variances with Unequal Variances

### Description

Compute the Satterthwaite (1946) approximation to the distribution of a weighted sum of sample variances.

### Usage

poolVar(var, df=n-1, multiplier=1/n, n)

### Arguments

| Argument | Description |
|---|---|
| `var` | numeric vector of independent sample variances |
| `df` | numeric vector of degrees of freedom for the sample variances; defaults to `n-1` |
| `multiplier` | numeric vector giving multipliers for the sample variances; defaults to `1/n` |
| `n` | numeric vector of sample sizes |

### Details

The sample variances `var` are assumed to follow scaled chi-square distributions.
A scaled chi-square approximation is found for the distribution of `sum(multiplier * var)` by equating first and second moments.
On output, the sum being approximated equals the product of the returned `multiplier` and `var` components, and it follows approximately a scaled chi-square distribution on the returned `df` degrees of freedom.
The approximation was proposed by Satterthwaite (1946).
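The moment-matching step can be written out as follows. This is a sketch of the standard Satterthwaite argument, not taken from the package source; the symbols $s_i^2$, $d_i$, $m_i$, $\sigma_i^2$ and $\nu$ are notation assumed here for `var`, `df`, `multiplier`, the unknown population variances and the effective degrees of freedom.

```latex
% Notation (assumed for this sketch): s_i^2 = var[i], d_i = df[i], m_i = multiplier[i]
\begin{align*}
s_i^2 &\sim \sigma_i^2 \, \chi^2_{d_i} / d_i, \qquad T = \sum_i m_i s_i^2, \\
\mathrm{E}(T) &= \sum_i m_i \sigma_i^2, \qquad
\mathrm{Var}(T) = 2 \sum_i \frac{m_i^2 \sigma_i^4}{d_i}, \\
T &\approx \mathrm{E}(T) \, \chi^2_{\nu} / \nu
\quad\Longrightarrow\quad
\mathrm{Var}(T) \approx \frac{2\,\mathrm{E}(T)^2}{\nu}, \\
\nu &= \frac{\left(\sum_i m_i \sigma_i^2\right)^2}{\sum_i m_i^2 \sigma_i^4 / d_i}
\;\approx\; \frac{\left(\sum_i m_i s_i^2\right)^2}{\sum_i \left(m_i s_i^2\right)^2 / d_i}.
\end{align*}
```

The last step replaces each unknown $\sigma_i^2$ with its estimate $s_i^2$, which is why the resulting degrees of freedom are themselves only approximate.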

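If there are only two groups, the degrees of freedom are one less than the sample sizes, and the multipliers are the reciprocals of the sample sizes (the defaults), then the product of the output `multiplier` and `var` is the square of the denominator of Welch's t-statistic for unequal variances, and the output `df` is its approximate degrees of freedom.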
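The two-group case can be checked in base R without loading limma. The helper `pool_var_sketch` below is hypothetical and is not the package source; in particular, splitting the weighted sum into a pooled multiplier (taken here as `sum(multiplier)`) and a pooled variance is an assumption, chosen only so that their product reproduces `sum(multiplier * var)` as described above.

```r
# Hypothetical helper, not the limma source: Satterthwaite moment matching in base R.
pool_var_sketch <- function(var, df, multiplier) {
  total <- sum(multiplier * var)                       # weighted sum being approximated
  eff_df <- total^2 / sum((multiplier * var)^2 / df)   # Satterthwaite degrees of freedom
  # Assumption: split the total into a pooled multiplier and a pooled variance
  # so that multiplier * var reproduces the weighted sum.
  pooled_multiplier <- sum(multiplier)
  list(var = total / pooled_multiplier, df = eff_df, multiplier = pooled_multiplier)
}

set.seed(1)
x <- rnorm(10, mean = 1, sd = 2)
y <- rnorm(20, mean = 2, sd = 1)
n <- c(length(x), length(y))

out <- pool_var_sketch(var = c(var(x), var(y)), df = n - 1, multiplier = 1 / n)
tstat <- (mean(x) - mean(y)) / sqrt(out$var * out$multiplier)

welch <- t.test(x, y)  # var.equal = FALSE by default, i.e. Welch's test
all.equal(tstat, unname(welch$statistic))   # compare t-statistics
all.equal(out$df, unname(welch$parameter))  # compare degrees of freedom
```

Both comparisons should agree up to floating-point tolerance, since `t.test` uses the same standard error and the same Welch-Satterthwaite degrees of freedom when equal variances are not assumed.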

### Value

A list with components

| Component | Description |
|---|---|
| `var` | effective pooled sample variance |
| `df` | effective pooled degrees of freedom |
| `multiplier` | pooled multiplier |

### Author(s)

Gordon Smyth

### References

Welch, B. L. (1938). The significance of the difference between two means when the population variances are unequal. *Biometrika* **29**, 350-362.

Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. *Biometrics Bulletin* **2**, 110-114.

Welch, B. L. (1947). The generalization of 'Student's' problem when several different population variances are involved. *Biometrika* **34**, 28-35.

Welch, B. L. (1949). Further note on Mrs. Aspin's tables and on certain approximations to the tabled function. *Biometrika* **36**, 293-296.

### See Also

10.Other

### Examples

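library(limma)

# Welch's t-test with unequal variances
x <- rnorm(10, mean=1, sd=2)
y <- rnorm(20, mean=2, sd=1)
s2 <- c(var(x), var(y))   # sample variances of the two groups
n <- c(10, 20)            # sample sizes
out <- poolVar(var=s2, n=n)
# out$var * out$multiplier is the estimated variance of mean(x) - mean(y)
tstat <- (mean(x) - mean(y)) / sqrt(out$var * out$multiplier)
pvalue <- 2 * pt(-abs(tstat), df=out$df)
# Equivalent to t.test(x, y), which does not assume equal variances by default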

[Package *limma* version 2.4.7 Index]