covMcd {rrcov}    R Documentation

Robust location and scatter estimation with high breakdown point

Description

Compute a multivariate location and scatter estimate with a high breakdown point using the Fast MCD (Minimum Covariance Determinant) estimator.

Usage

covMcd(x, cor=FALSE, alpha=1/2, nsamp=500, seed=0, print.it=FALSE)

Arguments

x a matrix or data frame.
cor should the returned result include a correlation matrix? Default is cor = FALSE
alpha numeric parameter controlling the size of the subsets over which the determinant is minimized. Provide a fraction between 0.5 and 1, indicating the fraction of the data over which the determinant is minimized. The resulting subset size lies between (n+p+1)/2 (the default, alpha = 1/2) and n.
nsamp number of subsets used for initial estimates. Default is nsamp = 500
seed starting value for random generator. Default is seed = 0
print.it whether to print intermediate results. Default is print.it = FALSE
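As a rough illustration of how the fraction alpha relates to a subset size, the following base R sketch interpolates linearly between the two endpoints named above. The helper subset_size() is hypothetical and not part of rrcov, and the interpolation rule is an assumption; the package may round differently.

```r
# Hypothetical helper (not part of rrcov): map the fraction alpha to a subset size h.
# Assumption: linear interpolation between floor((n + p + 1) / 2) at alpha = 0.5
# and n at alpha = 1.
subset_size <- function(alpha, n, p) {
  stopifnot(alpha >= 0.5, alpha <= 1)
  h_min <- floor((n + p + 1) / 2)
  floor(2 * h_min - n + 2 * (n - h_min) * alpha)
}

# Example with n = 75 observations and p = 3 variables
h_default <- subset_size(0.5, n = 75, p = 3)   # smallest admissible subset
h_mid     <- subset_size(0.75, n = 75, p = 3)
h_all     <- subset_size(1, n = 75, p = 3)     # all observations are used
```

With alpha = 1 the subset contains every observation, which corresponds to the classical covariance matrix.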

Details

The minimum covariance determinant estimator of location and scatter implemented in covMcd() is similar to the existing R function cov.mcd() in MASS. The MCD method looks for the h (> n/2) observations (out of n) whose classical covariance matrix has the lowest possible determinant. The raw MCD estimate of location is then the average of these h points, whereas the raw MCD estimate of scatter is their covariance matrix, multiplied by a consistency factor. Based on these raw MCD estimates, a reweighting step is performed, which increases the finite-sample efficiency considerably; see Pison et al. (2002). The implementation in rrcov uses the Fast MCD algorithm of Rousseeuw and Van Driessen (1999) to approximate the minimum covariance determinant estimator.
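The search for a low-determinant subset can be illustrated with a concentration step, the basic building block described by Rousseeuw and Van Driessen (1999). The base R sketch below is not the rrcov implementation: it starts from a single random subset instead of nsamp subsets, and it omits the consistency factor and the reweighting step. The names c_step() and subset_det() and the simulated data are assumptions made for illustration.

```r
# One concentration step: refit on the h observations closest to the current
# fit, measured by (squared) Mahalanobis distance.
c_step <- function(x, subset, h) {
  center  <- colMeans(x[subset, , drop = FALSE])
  scatter <- cov(x[subset, , drop = FALSE])
  d2 <- mahalanobis(x, center, scatter)
  order(d2)[seq_len(h)]
}

# Objective: determinant of the classical covariance matrix of a subset.
subset_det <- function(x, subset) det(cov(x[subset, , drop = FALSE]))

set.seed(1)
n <- 60; p <- 2
x <- matrix(rnorm(n * p), n, p)
x[1:10, ] <- x[1:10, ] + 8            # simulated outliers (illustration only)
h <- floor((n + p + 1) / 2)

subset <- sample(n, h)                # a single random starting subset
det_path <- subset_det(x, subset)
for (iter in 1:100) {                 # iteration cap as a safeguard
  new_subset <- c_step(x, subset, h)
  det_path <- c(det_path, subset_det(x, new_subset))
  converged <- setequal(new_subset, subset)
  subset <- new_subset
  if (converged) break
}

# Raw estimates from the final subset (no consistency factor applied)
raw_center <- colMeans(x[subset, , drop = FALSE])
raw_cov    <- cov(x[subset, , drop = FALSE])
```

Each step cannot increase the determinant, so det_path is non-increasing. A single start may end in a local minimum, which is why many starting subsets are normally tried.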

Value

A list with components

center the final estimate of location.
cov the final estimate of scatter.
cor the (final) estimate of the correlation matrix (only if cor = TRUE).
crit the value of the criterion, i.e. the determinant.
best the best subset found and used for computing the raw estimates. The size of best is equal to quan.
mah Mahalanobis distances of the observations using the final estimate of the location and scatter.
mcd.wt weights of the observations using the final estimate of the location and scatter.
raw.center the raw (not reweighted) estimate of location.
raw.cov the raw (not reweighted) estimate of scatter.
raw.mah Mahalanobis distances of the observations based on the raw estimate of the location and scatter.
raw.weights weights of the observations based on the raw estimate of the location and scatter.
X the input data as a matrix.
n.obs total number of observations.
alpha the size of the subsets over which the determinant is minimized (the default is (n+p+1)/2).
quan the number of observations on which the MCD is based. If quan equals n.obs, the MCD is the classical covariance matrix.
method character string naming the method (Minimum Covariance Determinant).
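To show how the raw and final components relate, here is a base R sketch of a reweighting step. It is an illustration under stated assumptions, not the rrcov code: the 0.975 chi-square cutoff is assumed, correction factors are omitted, and the function reweight() and the simulated data are hypothetical. Classical estimates from a clean block of rows stand in for the raw MCD estimates.

```r
# Sketch of a reweighting step. Assumption: an observation gets weight 1 when
# its squared raw Mahalanobis distance is at most the 0.975 chi-square quantile
# with p degrees of freedom, and weight 0 otherwise.
reweight <- function(x, raw_center, raw_cov, prob = 0.975) {
  x <- as.matrix(x)
  raw_mah <- mahalanobis(x, raw_center, raw_cov)   # squared distances
  raw_weights <- as.numeric(raw_mah <= qchisq(prob, df = ncol(x)))
  keep <- raw_weights == 1
  center <- colMeans(x[keep, , drop = FALSE])      # reweighted location
  scatter <- cov(x[keep, , drop = FALSE])          # reweighted scatter
  list(center = center, cov = scatter,
       mah = mahalanobis(x, center, scatter),
       raw.mah = raw_mah, raw.weights = raw_weights)
}

set.seed(2)
x <- rbind(matrix(rnorm(100), 50, 2),              # 50 regular points
           matrix(rnorm(10, mean = 10), 5, 2))     # 5 simulated outliers
raw_center <- colMeans(x[1:30, ])                  # stand-in for raw.center
raw_cov    <- cov(x[1:30, ])                       # stand-in for raw.cov
fit <- reweight(x, raw_center, raw_cov)

which(fit$raw.weights == 0)                        # observations flagged by the raw fit
```

Observations with weight 0 are excluded from the final estimates, so they no longer influence center and cov.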

References

P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.

P. J. Rousseeuw and K. Van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.

G. Pison, S. Van Aelst and G. Willems (2002) Small sample corrections for LTS and MCD. Metrika 55, 111–123.

Examples


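data(hbk)
# hbk.x is assumed to hold the first three columns of hbk (the explanatory variables)
hbk.x <- data.matrix(hbk[, 1:3])
covMcd(hbk.x)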


[Package rrcov version 0.2-5 Index]