fanny {cluster} R Documentation

## Fuzzy Analysis Clustering

### Description

Computes a fuzzy clustering of the data into `k` clusters.

### Usage

```
fanny(x, k, diss = inherits(x, "dist"),
      memb.exp = 2, metric = "euclidean", stand = FALSE,
      maxit = 500, tol = 1e-15)
```

### Arguments

- `x`: data matrix or data frame, or dissimilarity matrix, depending on the value of the `diss` argument. In case of a matrix or data frame, each row corresponds to an observation, and each column corresponds to a variable. All variables must be numeric. Missing values (NAs) are allowed. In case of a dissimilarity matrix, `x` is typically the output of `daisy` or `dist`. A vector of length n*(n-1)/2 is also allowed (where n is the number of observations), and will be interpreted in the same way as the output of the above-mentioned functions. Missing values (NAs) are not allowed.
- `k`: integer giving the desired number of clusters. It is required that 0 < k < n/2, where n is the number of observations.
- `diss`: logical flag: if TRUE (default for `dist` or `dissimilarity` objects), then `x` is assumed to be a dissimilarity matrix. If FALSE, then `x` is treated as a matrix of observations by variables.
- `memb.exp`: number r strictly larger than 1 specifying the membership exponent used in the fit criterion; see 'Details' below. Default: `2`, which used to be hardwired inside FANNY.
- `metric`: character string specifying the metric to be used for calculating dissimilarities between observations. The currently available options are "euclidean" and "manhattan". Euclidean distances are root sum-of-squares of differences, and manhattan distances are the sum of absolute differences. If `x` is already a dissimilarity matrix, then this argument will be ignored.
- `stand`: logical; if TRUE, the measurements in `x` are standardized before calculating the dissimilarities. Measurements are standardized for each variable (column), by subtracting the variable's mean value and dividing by the variable's mean absolute deviation. If `x` is already a dissimilarity matrix, then this argument will be ignored.
- `maxit, tol`: maximal number of iterations and tolerance for convergence (relative convergence of the fit criterion) for the FANNY algorithm. The defaults `maxit = 500` and `tol = 1e-15` used to be hardwired inside the algorithm.
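The standardization and the two metrics described above can be reproduced by hand in base R. The following is only a sketch of what the argument descriptions say, not the package's internal code; the helper name `standardize_mad` and the toy matrix are made up for illustration, and missing values are not handled.

```r
# Hypothetical helper (not part of 'cluster'): standardize each column by
# subtracting its mean and dividing by its mean absolute deviation.
standardize_mad <- function(x) {
  x <- as.matrix(x)
  centered <- sweep(x, 2, colMeans(x))   # subtract column means
  mad <- colMeans(abs(centered))         # mean absolute deviation per column
  sweep(centered, 2, mad, "/")           # divide each column by its deviation
}

# Toy data: 4 observations, 2 variables on very different scales
x <- cbind(a = c(1, 2, 3, 6), b = c(10, 20, 60, 30))
z <- standardize_mad(x)

# The two metrics named by the 'metric' argument, computed with dist()
d_euc <- dist(z, method = "euclidean")   # root sum-of-squares of differences
d_man <- dist(z, method = "manhattan")   # sum of absolute differences
```

After standardization every column has mean 0 and mean absolute deviation 1, so variables measured on different scales contribute comparably to the dissimilarities.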

### Details

In a fuzzy clustering, each observation is "spread out" over the various clusters. Denote by u(i,v) the membership of observation i to cluster v. The memberships are nonnegative, and for a fixed observation i they sum to 1. The particular method `fanny` stems from chapter 4 of Kaufman and Rousseeuw (1990) (see the references in `daisy`) and has been extended to allow a user-specified `memb.exp`.

Fanny aims to minimize the objective function

SUM_[v=1..k] (SUM_(i,j) u(i,v)^r u(j,v)^r d(i,j)) / (2 SUM_j u(j,v)^r)

where n is the number of observations, k is the number of clusters, r is the membership exponent `memb.exp` and d(i,j) is the dissimilarity between observations i and j.
Note that r -> 1 gives increasingly crisper clusterings, whereas r -> Inf leads to complete fuzziness. Kaufman and Rousseeuw (1990, p. 191) note that values too close to 1 can lead to slow convergence.
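To make the criterion concrete, it can be evaluated directly for a given membership matrix. The function below is a minimal sketch written from the formula above; `fanny_objective` is a hypothetical name, not a function of the package, and it assumes every cluster has at least one nonzero membership.

```r
# Hypothetical helper: evaluate the objective for an n x k membership
# matrix u, a dissimilarity (matrix or "dist" object) d, and exponent r.
fanny_objective <- function(u, d, r = 2) {
  d  <- as.matrix(d)
  ur <- u^r
  # per cluster v: SUM_(i,j) u(i,v)^r u(j,v)^r d(i,j)
  num <- colSums(ur * (d %*% ur))
  # per cluster v: 2 SUM_j u(j,v)^r
  den <- 2 * colSums(ur)
  sum(num / den)
}

# Three observations with d(1,2) = 1, d(1,3) = 4, d(2,3) = 3
d <- matrix(c(0, 1, 4,
              1, 0, 3,
              4, 3, 0), nrow = 3, byrow = TRUE)

# Crisp assignment: observations 1 and 2 in cluster 1, observation 3 in cluster 2
u_crisp <- rbind(c(1, 0), c(1, 0), c(0, 1))
obj_crisp <- fanny_objective(u_crisp, d)

# Completely fuzzy assignment: every membership equals 1/2
u_fuzzy <- matrix(0.5, nrow = 3, ncol = 2)
obj_fuzzy <- fanny_objective(u_fuzzy, d)
```

For the crisp assignment, cluster 1 contributes (d(1,2) + d(2,1)) / (2 * 2) = 0.5 and cluster 2 contributes 0, so `obj_crisp` is 0.5.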

Compared to other fuzzy clustering methods, `fanny` has the following features: (a) it also accepts a dissimilarity matrix; (b) it is more robust to the "spherical cluster" assumption; (c) it provides a novel graphical display, the silhouette plot (see `plot.partition`).
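Feature (a) can be illustrated by clustering the same data once from the raw matrix and once from a precomputed dissimilarity object. This is a small sketch with made-up data and an arbitrary seed; because `dist()` returns an object of class `"dist"`, the default `diss = inherits(x, "dist")` picks the dissimilarity interpretation automatically.

```r
library(cluster)

set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
x <- rbind(matrix(rnorm(20, 0, 0.5), ncol = 2),
           matrix(rnorm(20, 5, 0.5), ncol = 2))

d <- dist(x)         # Euclidean dissimilarities, class "dist"

f1 <- fanny(x, 2)    # observations by variables; dissimilarities computed internally
f2 <- fanny(d, 2)    # dissimilarity input; 'metric' and 'stand' are ignored

# Compare the two membership matrices up to numerical tolerance
all.equal(f1$membership, f2$membership, check.attributes = FALSE)
```

When only dissimilarities are available (for example from `daisy` on mixed-type data), the second form is the one to use.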

### Value

An object of class `"fanny"` representing the clustering. See `fanny.object` for details.
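As a brief sketch of how the returned object might be inspected, the snippet below looks at the `membership` and `clustering` components, which are documented in `fanny.object`. The data and seed are made up for illustration.

```r
library(cluster)

set.seed(42)  # arbitrary seed, only for reproducibility of this sketch
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
fx <- fanny(x, 2)

u <- fx$membership        # n x k matrix of memberships u(i,v)
range(rowSums(u))         # each row should sum to 1 (up to rounding)
head(round(u, 3))         # memberships of the first observations

table(fx$clustering)      # sizes of the clusters in the crisp clustering
```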

### See Also

`agnes` for background and references; `fanny.object`, `partition.object`, `plot.partition`, `daisy`, `dist`.

### Examples

```
## generate 10+15 objects in two clusters, plus 3 objects lying
## between those clusters.
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)),
           cbind(rnorm( 3,3.2,0.5), rnorm( 3,3.2,0.5)))
fannyx <- fanny(x, 2)
## Note that observations 26:28 are "fuzzy" (closer to # 2):
fannyx
summary(fannyx)
plot(fannyx)

(fan.x.15 <- fanny(x, 2, memb.exp = 1.5)) # 'crisper' for obs. 26:28
(fanny(x, 2, memb.exp = 3))               # more fuzzy in general

data(ruspini)
## Plot similar to Figure 6 in Struyf et al. (1996)
plot(fanny(ruspini, 5))
```

[Package cluster version 1.10.2 Index]