xcluster {ctc} | R Documentation |

Performs a hierarchical cluster analysis on a set of dissimilarities.

xcluster(data,distance="euclidean",clean=FALSE,tmp.in="tmp.txt",tmp.out="tmp.gtr")

`data` |
a matrix (or data frame) which provides the data to analyze |

`distance` |
The distance measure used with Xcluster. This must be one of
`"euclidean"`, `"pearson"` or `"notcenteredpearson"`.
Any unambiguous substring can be given. |

`clean` |
a logical value indicating whether you want the true
distances (`clean=FALSE`), or you want a clean dendrogram |

`tmp.in, tmp.out` |
temporary files for Xcluster |
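
The `distance` argument accepts any unambiguous abbreviation of the three names. How `xcluster` resolves the abbreviation internally is not stated on this page; the sketch below only illustrates the idea with base R's `pmatch`, and the variable names are illustrative.

```r
# The three distance names listed for the `distance` argument.
choices <- c("euclidean", "pearson", "notcenteredpearson")

# pmatch() returns the index of the unique partial match, or NA if there is none.
i_euc <- pmatch("euc", choices)        # abbreviation of "euclidean"
i_not <- pmatch("not", choices)        # abbreviation of "notcenteredpearson"
i_bad <- pmatch("manhattan", choices)  # not one of the listed measures

choices[i_euc]
choices[i_not]
is.na(i_bad)
```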

Available distance measures are (written for two vectors *x* and
*y*):

- Euclidean: Usual square distance between the two vectors (2 norm).
- Pearson: *1 - cor(x,y)*
- Pearson not centered: *1 - [ sum x_i y_i ] / sqrt[ sum x_i^2 * sum y_i^2 ]*
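
The three measures can be written out in plain R to see how they differ. This is a minimal sketch, not Xcluster's own code: the helper names are hypothetical, and the Euclidean entry is read here as the ordinary 2-norm of the difference, which is an assumption about the wording above.

```r
# Hypothetical helpers; they are not part of ctc or Xcluster.
dist_euclidean <- function(x, y) sqrt(sum((x - y)^2))    # 2-norm of x - y
dist_pearson <- function(x, y) 1 - cor(x, y)             # 1 - Pearson correlation
dist_uncentered_pearson <- function(x, y) {
  # Same as Pearson, but without subtracting the means first.
  1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 9)

d_euc <- dist_euclidean(x, y)
d_pea <- dist_pearson(x, y)
d_unc <- dist_uncentered_pearson(x, y)
```

Both Pearson variants are zero for vectors that are positive multiples of each other, whereas the Euclidean distance depends on scale.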

Xcluster does not use the usual agglomerative methods (single, average, complete); instead, it takes the distance between the groups' barycenters as the distance between two groups.

This causes a problem for data such as:

A | 0 | 0 |

B | 0 | 1 |

C | 0.9 | 0.5 |

That is, for a triangle in $\mathbb{R}^2$, the distance between A and B is larger than the distance between the group {A, B} and C (with Euclidean distance).

In that case it can be useful to use `clean=TRUE`, and it means that you must not consider A and B as a group without C.
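
The arithmetic behind this example can be checked directly in R. The sketch below only reproduces the distances for the three points A, B and C with the Euclidean distance; it does not call Xcluster, and the variable names are illustrative.

```r
# The three points from the table above.
pts <- rbind(A = c(0, 0), B = c(0, 1), C = c(0.9, 0.5))

euclid <- function(p, q) sqrt(sum((p - q)^2))

d_AB <- euclid(pts["A", ], pts["B", ])   # distance between A and B
d_AC <- euclid(pts["A", ], pts["C", ])   # distance between A and C
d_BC <- euclid(pts["B", ], pts["C", ])   # distance between B and C

# Barycenter of the group {A, B}, then its distance to C.
bary_AB <- colMeans(pts[c("A", "B"), ])
d_AB_C <- euclid(bary_AB, pts["C", ])

c(d_AB = d_AB, d_AC = d_AC, d_BC = d_BC, d_AB_C = d_AB_C)
```

A and B are the closest pair, so they are grouped first at distance 1, but the barycenter of {A, B} is (0, 0.5), which lies only 0.9 from C. The second merge therefore happens at a lower height than the first.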

An object of class **hclust** which describes the
tree produced by the clustering process.
The object is a list with components:

`merge` |
an n-1 by 2 matrix.
Row i of `merge` describes the merging of clusters
at step i of the clustering.
If an element j in the row is negative,
then observation -j was merged at this stage.
If j is positive then the merge
was with the cluster formed at the (earlier) stage j
of the algorithm.
Thus negative entries in `merge` indicate agglomerations
of singletons, and positive entries indicate agglomerations
of non-singletons. |

`height` |
a set of n-1 non-decreasing real values.
The clustering height: that is, the value of
the criterion associated with the clustering
`method` for the particular agglomeration. |

`order` |
a vector giving the permutation of the original
observations suitable for plotting, in the sense that a cluster
plot using this ordering and matrix `merge` will not have
crossings of the branches. |

`labels` |
labels for each of the objects being clustered. |

`call` |
the call which produced the result. |

`method` |
the cluster method that has been used. |

`dist.method` |
the distance that has been used to create `d`
(only returned if the distance object has a `"method"`
attribute). |
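
Because `xcluster` needs the external Xcluster program, the sketch below uses `hclust` from the stats package as a stand-in to show what these components look like on an object of class hclust. The toy matrix and the choice of average linkage are assumptions made for illustration only.

```r
# Five observations in the plane; row names become the labels.
m <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 6), c(10, 0))
rownames(m) <- paste0("obs", 1:5)

# Stand-in clustering with base R (not Xcluster).
h <- hclust(dist(m), method = "average")

h$merge        # (n-1) x 2 matrix: negative = single observation, positive = earlier step
h$height       # n-1 merge heights
h$order        # plotting order of the observations
h$labels       # labels of the clustered objects
h$method       # agglomeration method used
h$dist.method  # taken from the "method" attribute of the dist object
```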

*Xcluster* is a C program made by *Gavin Sherlock* that performs
hierarchical clustering, K-means and SOM.

*Xcluster* is copyrighted.
To obtain *Xcluster* or find more information about it, see
http://genome-www.stanford.edu/~sherlock/cluster.html

Antoine Lucas, http://genopole.toulouse.inra.fr/~lucas/R

    # Create data
    .Random.seed <- c(1, 416884367, 1051235439)
    m <- matrix(rep(1, 3*24), ncol=3)
    m[9:16, 3] <- 3 ; m[17:24, ] <- 3   # create 3 groups
    m <- m + rnorm(24*3, 0, 0.5)        # add noise
    m <- floor(10*m)/10                 # keep just one decimal digit

    # And once you have the Xcluster program:
    #
    #h <- xcluster(m)
    #
    #library(mva)
    #plot(h)

[Package *ctc* version 1.2.7 Index]