xcluster {ctc}    R Documentation

Hierarchical clustering

Description

Performs a hierarchical cluster analysis on a data matrix by calling the external Xcluster program.

Usage

xcluster(data,distance="euclidean",clean=FALSE,tmp.in="tmp.txt",tmp.out="tmp.gtr")

Arguments

data a matrix (or data frame) which provides the data to analyze
distance The distance measure used with Xcluster. This must be one of "euclidean", "pearson" or "notcenteredpearson". Any unambiguous substring can be given.
clean a logical value indicating whether you want the true distances (clean=FALSE) or a clean dendrogram (clean=TRUE)
tmp.in, tmp.out temporary files for Xcluster
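
The "unambiguous substring" rule for distance can be illustrated with base R's match.arg. This is only a sketch of the behaviour described above: whether xcluster resolves the argument this way internally is an assumption, and resolve_distance is a hypothetical helper, not part of ctc.

```r
# Allowed values, as listed for the `distance` argument
choices <- c("euclidean", "pearson", "notcenteredpearson")

# Hypothetical helper: expand an abbreviation to the full name
resolve_distance <- function(distance) match.arg(distance, choices)

resolve_distance("e")      # expands to "euclidean"
resolve_distance("pear")   # expands to "pearson"
resolve_distance("not")    # expands to "notcenteredpearson"
```

A string that matches none of the choices makes match.arg stop with an error.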

Details

Available distance measures are (written for two vectors x and y):
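
A minimal sketch of the three measures named under Arguments, assuming their conventional definitions (the exact formulas used by Xcluster may differ). The vectors x and y hold made-up values for illustration only.

```r
# Two example vectors (hypothetical values)
x <- c(1, 2, 3, 4)
y <- c(2, 1, 4, 3)

# "euclidean": the usual 2-norm of the difference
d_euclidean <- sqrt(sum((x - y)^2))

# "pearson": one minus the (centered) correlation coefficient
d_pearson <- 1 - cor(x, y)

# "notcenteredpearson": one minus the uncentered correlation,
# i.e. the cosine of the angle between x and y
d_notcentered <- 1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))
```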

Xcluster does not use the usual agglomerative methods (single, average, complete); instead, it computes the distance between two groups as the distance between their barycenters.

This causes a problem for data such as:

A 0 0
B 0 1
C 0.9 0.5

That is, for a triangle in $\mathbf{R}^2$, the distance between A and B is larger than the distance between the group {A,B} and C (with Euclidean distance).

In such a case it can be useful to set clean=TRUE, which means that A and B must not be considered as a group without C.
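
The three points above can be checked directly in base R. This sketch only reproduces the barycenter arithmetic described here; it does not call Xcluster, and the variable names are illustrative.

```r
# The three points from the example above
pts <- rbind(A = c(0, 0), B = c(0, 1), C = c(0.9, 0.5))

# Pairwise Euclidean distances between the points
d <- as.matrix(dist(pts))
d_AB <- d["A", "B"]     # A and B are the closest pair, so they merge first

# Barycenter of the group {A, B}, then its distance to C
bary_AB <- colMeans(pts[c("A", "B"), ])
d_AB_C <- sqrt(sum((bary_AB - pts["C", ])^2))

# TRUE: the second merge happens at a smaller height than the first
d_AB_C < d_AB
```

Here d_AB is 1 and d_AB_C is 0.9, so the heights are not monotone when the true distances are kept.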

Value

An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:

merge an n-1 by 2 matrix. Row i of merge describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation -j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in merge indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons.
height a set of n-1 non-decreasing real values. The clustering height: that is, the value of the criterion associated with the clustering method for the particular agglomeration.
order a vector giving the permutation of the original observations suitable for plotting, in the sense that a cluster plot using this ordering and matrix merge will not have crossings of the branches.
labels labels for each of the objects being clustered.
call the call which produced the result.
method the cluster method that has been used.
dist.method the distance that has been used to create d (only returned if the distance object has a "method" attribute).
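
To make the layout of these components concrete, here is a small sketch using hclust from package stats rather than xcluster (which needs the external program). It assumes that an object returned by xcluster is read in the same way; the four data values are made up.

```r
# Four points on a line (hypothetical values)
x <- c(0, 1, 5, 6.5)

# Standard hclust with its default (complete linkage) method
h <- hclust(dist(x))

h$merge    # rows 1 and 2 join singletons (negative entries),
           # row 3 joins the clusters formed at steps 1 and 2 (positive entries)
h$height   # height of each of the n-1 merges
h$order    # ordering of the observations suitable for plotting
```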

Note

Xcluster is a C program made by Gavin Sherlock that performs hierarchical clustering, K-means and SOM.

Xcluster is copyrighted. To obtain Xcluster or more information about it, see: http://genome-www.stanford.edu/~sherlock/cluster.html

Author(s)

Antoine Lucas, http://genopole.toulouse.inra.fr/~lucas/R

See Also

r2xcluster, xcluster2r, hclust

Examples

# Create data
set.seed(1)                        # fix the seed so the example is reproducible
m <- matrix(rep(1,3*24),ncol=3)
m[9:16,3] <- 3 ; m[17:24,] <- 3    # create 3 groups
m <- m+rnorm(24*3,0,0.5)           # add noise
m <- floor(10*m)/10                # keep one decimal digit

# And once you have the Xcluster program installed:
#
#h <- xcluster(m)
#
#plot(h)    # the hclust plot method is in package stats (formerly mva)

[Package ctc version 1.2.7 Index]