covMcd {rrcov} R Documentation

## Robust location and scatter estimation with high breakdown point

### Description

Compute a multivariate location and scatter estimate with a high breakdown point using the Fast MCD (Minimum Covariance Determinant) estimator.

### Usage

```
covMcd(x, cor=FALSE, alpha=1/2, nsamp=500, seed=0, print.it=FALSE)
```

### Arguments

| Argument | Description |
|---|---|
| `x` | a matrix or data frame. |
| `cor` | should the returned result include a correlation matrix? Default is `cor = FALSE`. |
| `alpha` | the size of the subsets over which the determinant is minimized, given as a fraction of the data between 0.5 and 1. The resulting subset size lies between (n+p+1)/2 (the default) and n. |
| `nsamp` | number of subsets used for initial estimates. Default is `nsamp = 500`. |
| `seed` | starting value for the random generator. Default is `seed = 0`. |
| `print.it` | whether to print intermediate results. Default is `print.it = FALSE`. |
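
To make the relation between the fraction `alpha` and the subset size concrete, here is a minimal sketch. The helper `subset_size()` is hypothetical and not part of the documented interface; it assumes the subset size is interpolated linearly between floor((n+p+1)/2) at `alpha = 0.5` and n at `alpha = 1`. The exact rounding used by the package is not stated on this page.

```r
# Hypothetical helper (an assumption, not a documented rrcov function):
# map the fraction alpha in [0.5, 1] to a subset size between
# floor((n + p + 1) / 2) and n by linear interpolation.
subset_size <- function(alpha, n, p) {
  stopifnot(alpha >= 0.5, alpha <= 1)
  h_min <- (n + p + 1) %/% 2
  floor(2 * h_min - n + 2 * (n - h_min) * alpha)
}

subset_size(0.5, n = 75, p = 3)   # 39, the smallest admissible subset
subset_size(1.0, n = 75, p = 3)   # 75, i.e. all observations
```

With `alpha = 1` every observation enters the subset, so the estimate coincides with the classical covariance matrix.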

### Details

The minimum covariance determinant estimator of location and scatter implemented in `covMcd()` is similar to the existing R function `cov.mcd()` in MASS. The MCD method looks for the h (> n/2) observations (out of n) whose classical covariance matrix has the lowest possible determinant. The raw MCD estimate of location is then the average of these h points, whereas the raw MCD estimate of scatter is their covariance matrix, multiplied by a consistency factor. Based on these raw MCD estimates, a reweighting step is performed which increases the finite-sample efficiency considerably; see Pison et al. (2002). The implementation in rrcov uses the Fast MCD algorithm of Rousseeuw and Van Driessen (1999) to approximate the minimum covariance determinant estimator.
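
The search for a low-determinant subset can be illustrated with a minimal base-R sketch of a concentration step, the building block described by Rousseeuw and Van Driessen (1999). This is not the rrcov implementation: it uses a single random start on simulated data, omits the consistency factor and the reweighting step, and all names (`c_step`, `dets`, `raw_center`, `raw_cov`) are illustrative assumptions.

```r
set.seed(1)
n <- 100; p <- 2
x <- matrix(rnorm(n * p), n, p)
x[1:20, ] <- x[1:20, ] + 8            # 20 shifted observations act as outliers

h <- (n + p + 1) %/% 2                # smallest admissible subset size

# One concentration step: fit mean and covariance on the current subset,
# then keep the h observations with the smallest Mahalanobis distances.
c_step <- function(x, idx, h) {
  sub <- x[idx, , drop = FALSE]
  d2 <- mahalanobis(x, colMeans(sub), cov(sub))
  order(d2)[seq_len(h)]
}

idx  <- sample(n, h)                  # a single random starting subset
dets <- det(cov(x[idx, ]))            # determinant after each accepted step

repeat {
  new_idx <- c_step(x, idx, h)
  new_det <- det(cov(x[new_idx, ]))
  if (new_det >= dets[length(dets)]) break   # no further decrease: stop
  idx  <- new_idx
  dets <- c(dets, new_det)
}

raw_center <- colMeans(x[idx, ])      # raw location estimate
raw_cov    <- cov(x[idx, ])           # raw scatter, without consistency factor
dets                                  # decreasing sequence of determinants
```

A single start can end in a local minimum, which is why the algorithm draws many initial subsets (compare the `nsamp` argument) and keeps the one with the lowest determinant.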

### Value

A list with components

| Component | Description |
|---|---|
| `center` | the final estimate of location. |
| `cov` | the final estimate of scatter. |
| `cor` | the (final) estimate of the correlation matrix (only if `cor = TRUE`). |
| `crit` | the value of the criterion, i.e. the determinant. |
| `best` | the best subset found and used for computing the raw estimates. The size of `best` is equal to `quan`. |
| `mah` | Mahalanobis distances of the observations using the final estimate of the location and scatter. |
| `mcd.wt` | weights of the observations using the final estimate of the location and scatter. |
| `raw.center` | the raw (not reweighted) estimate of location. |
| `raw.cov` | the raw (not reweighted) estimate of scatter. |
| `raw.mah` | Mahalanobis distances of the observations based on the raw estimate of the location and scatter. |
| `raw.weights` | weights of the observations based on the raw estimate of the location and scatter. |
| `X` | the input data as a matrix. |
| `n.obs` | total number of observations. |
| `alpha` | the size of the subsets over which the determinant is minimized (the default is (n+p+1)/2). |
| `quan` | the number of observations on which the MCD is based. If `quan` equals `n.obs`, the MCD is the classical covariance matrix. |
| `method` | character string naming the method (Minimum Covariance Determinant). |
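
How distances and weights such as `raw.mah` and `raw.weights` relate to a reweighted estimate can be sketched in base R. The sketch rests on several assumptions that this page does not confirm: weights are taken to be 0/1 flags, the cutoff is the 0.975 quantile of the chi-squared distribution with p degrees of freedom, and the raw estimates are replaced by a stand-in fitted on rows known to be clean in simulated data. Correction factors are omitted.

```r
set.seed(42)
n <- 100; p <- 2
x <- matrix(rnorm(n * p), n, p)
x[1:20, ] <- x[1:20, ] + 8              # rows 1..20 are planted outliers

# Stand-in for the raw estimates (assumption): fitted on the rows known
# to be clean. A real run would use the raw MCD subset instead.
raw_center <- colMeans(x[21:n, ])
raw_cov    <- cov(x[21:n, ])

# Base R's mahalanobis() returns squared distances.
raw_d2 <- mahalanobis(x, raw_center, raw_cov)

# Assumed cutoff: 0.975 quantile of chi-squared with p degrees of freedom.
cutoff <- qchisq(0.975, df = p)
wt <- as.integer(raw_d2 <= cutoff)      # 1 = kept, 0 = flagged as outlying

# Reweighted estimates: classical mean and covariance of the kept points.
rew_center <- colMeans(x[wt == 1, , drop = FALSE])
rew_cov    <- cov(x[wt == 1, , drop = FALSE])

which(wt == 0)                          # indices of flagged observations
```

Because the reweighted estimates use every observation that passes the cutoff rather than only h of them, they are typically based on more data than the raw estimates.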

### References

P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.

P. J. Rousseeuw and K. Van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.

G. Pison, S. Van Aelst and G. Willems (2002) Small sample corrections for LTS and MCD. Metrika 55, 111–123.

### Examples

```
data(hbk)
hbk.x <- data.matrix(hbk[, 1:3])   # the three explanatory variables of hbk
covMcd(hbk.x)
```

[Package rrcov version 0.2-5 Index]