segment {DNAcopy} | R Documentation |

This program segments DNA copy number data into regions of estimated equal copy number using circular binary segmentation (CBS).

segment(x, alpha = 0.01, nperm = 10000, window.size = NULL, overlap = 0.25, trim = 0.025, undo.splits = c("none", "prune", "sdundo"), undo.prune = 0.05, undo.SD = 3, verbose = 1)

`x` |
an object of class CNA |

`alpha` |
significance levels for the test to accept change-points. |

`nperm` |
number of permutations used for p-value computation. |

`window.size` |
size of window used to speed up computations when segment size is too large. Default is NULL (whole segment used). |

`overlap` |
proportion of data that overlap for adjacent windows. |

`trim` |
proportion of data to be trimmed for variance calculation for smoothing outliers and undoing splits based on SD. |

`undo.splits` |
A character string specifying how change-points are to be undone, if at all. Default is "none". Other choices are "prune", which uses a sum of squares criterion, and "sdundo", which undoes splits whose adjacent segment means are fewer than undo.SD SDs apart. |

`undo.prune` |
the proportional increase in sum of squares allowed when eliminating splits if undo.splits="prune". |

`undo.SD` |
the number of SDs between means to keep a split if undo.splits="sdundo". |

`verbose` |
level of verbosity for monitoring the program's progress where 0 produces no printout, 1 prints the current sample, 2 the current chromosome and 3 the current segment. The default level is 1. |

This function implements the circular binary segmentation (CBS) algorithm of Olshen and Venkatraman (2004). Given a set of genomic data, either continuous or binary, the algorithm recursively splits chromosomes into either two or three subsegments based on a maximum t-statistic. A reference distribution, used to decide whether or not to split, is estimated by permutation. Options are given to eliminate splits when the means of adjacent segments are not sufficiently far apart. Note that after the first split the $\alpha$-levels of the tests for splitting are not unconditional.
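The splitting test described above can be sketched in base R. This is a rough illustration, not the package's implementation: the function names `max.cbs.stat` and `cbs.perm.pvalue` are hypothetical, the search is brute force, and the statistic is left unscaled by a standard deviation estimate, on the assumption that for a permutation test on a fixed set of values this scaling does not change the resulting p-value.

```r
# Maximal circular binary segmentation statistic, brute force over all arcs.
# An arc is the run x[(i + 1):j]; its mean is compared with the mean of the rest.
max.cbs.stat <- function(x) {
  n <- length(x)
  s <- c(0, cumsum(x))            # s[i + 1] holds x[1] + ... + x[i]
  best <- 0
  for (i in 0:(n - 1)) {
    for (j in (i + 1):n) {
      k <- j - i                  # number of points inside the arc
      if (k == n) next            # the arc must leave at least one point outside
      arc.sum <- s[j + 1] - s[i + 1]
      z <- (arc.sum / k - (s[n + 1] - arc.sum) / (n - k)) /
        sqrt(1 / k + 1 / (n - k))
      if (abs(z) > best) best <- abs(z)
    }
  }
  best
}

# Permutation reference distribution: shuffle the data, recompute the maximum.
cbs.perm.pvalue <- function(x, nperm = 200) {
  observed <- max.cbs.stat(x)
  permuted <- replicate(nperm, max.cbs.stat(sample(x)))
  mean(permuted >= observed)
}

set.seed(1)
# Simulated data with a raised stretch of 15 points in the middle.
shifted <- c(rnorm(20), rnorm(15, mean = 3), rnorm(25))
p.shifted <- cbs.perm.pvalue(shifted, nperm = 200)
```

A split would be accepted when the p-value falls below `alpha`. An arc that touches either end of the data gives two subsegments, and an interior arc gives three.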

We recommend using one of the undoing options to remove change-points detected due to local trends (see the manuscript below for examples of local trends).
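The two undoing options correspond to the `undo.splits`, `undo.prune` and `undo.SD` arguments in the usage above. A minimal sketch, reusing the simulated data from this page's examples; the `requireNamespace` guard is an addition so that the snippet does nothing if DNAcopy is not installed, and the object names `seg.sd` and `seg.prune` are hypothetical.

```r
# Simulated data as in the examples section of this page.
set.seed(25)
genomdat <- rnorm(500, sd = 0.1) +
  rep(c(-0.2, 0.1, 1, -0.5, 0.2, -0.5, 0.1, -0.2),
      c(137, 87, 17, 49, 29, 52, 87, 42))
chrom <- rep(1:2, c(290, 210))
maploc <- c(1:290, 1:210)

if (requireNamespace("DNAcopy", quietly = TRUE)) {
  library(DNAcopy)
  cna <- CNA(genomdat, chrom, maploc)
  # Undo splits whose adjacent segment means are fewer than 3 SDs apart.
  seg.sd <- segment(cna, undo.splits = "sdundo", undo.SD = 3, verbose = 0)
  # Undo splits using the sum of squares criterion.
  seg.prune <- segment(cna, undo.splits = "prune", undo.prune = 0.05,
                       verbose = 0)
}
```

Comparing either result with a run at `undo.splits = "none"` shows which change-points the undoing step removed.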

Since the segmentation procedure uses a permutation reference distribution, R commands for setting and saving seeds should be used if the user wishes to reproduce the results.
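As a small illustration of setting and saving seeds, the base R sketch below uses `sample()` as a stand-in for the permutation step; the names `saved.seed`, `perm1` and `perm2` are hypothetical. The same pattern should apply when `set.seed()` is called immediately before `segment()`.

```r
set.seed(2004)                      # set the seed explicitly
saved.seed <- .Random.seed          # save the generator state for later

perm1 <- sample(10)                 # first run of a permutation

# Restore the saved state in the global environment, then repeat the run.
assign(".Random.seed", saved.seed, envir = globalenv())
perm2 <- sample(10)

identical(perm1, perm2)             # the two permutations agree
```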

a data frame with six columns. Each row of the data frame contains a segment for which there are six variables: the sample id, the chromosome number, the map position of the start of the segment, the map position of the end of the segment, the number of markers in the segment, and the average value in the segment.
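The returned data frame can be handled with ordinary data frame operations. The sketch below builds a toy data frame by hand rather than calling `segment()`; the column names (`ID`, `chrom`, `loc.start`, `loc.end`, `num.mark`, `seg.mean`) and all values are assumptions made for illustration and should be checked against `names()` of the actual result.

```r
# Toy stand-in for segment() output: one row per segment (values are made up).
segs <- data.frame(
  ID        = rep("sample1", 3),
  chrom     = c(1, 1, 2),
  loc.start = c(1, 138, 1),
  loc.end   = c(137, 290, 210),
  num.mark  = c(137, 153, 210),
  seg.mean  = c(-0.2, 0.4, 0.05)
)

# Segments whose average value exceeds an arbitrary threshold.
gains <- segs[segs$seg.mean > 0.3, ]

# Total number of markers covered on each chromosome.
markers.per.chrom <- tapply(segs$num.mark, segs$chrom, sum)
```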

E. S. Venkatraman and Adam Olshen olshena@mskcc.org

Olshen, A. B., Venkatraman, E. S., Lucito, R., Wigler, M. (2004).
Circular binary segmentation for the analysis of array-based DNA copy
number data. *Biostatistics* 5: 557-572.
http://www.mskcc.org/biostat/~olshena/research.

# test code on an easy data set
library(DNAcopy)
set.seed(25)
genomdat <- rnorm(500, sd=0.1) +
  rep(c(-0.2,0.1,1,-0.5,0.2,-0.5,0.1,-0.2),c(137,87,17,49,29,52,87,42))
plot(genomdat)
chrom <- rep(1:2,c(290,210))
maploc <- c(1:290,1:210)
test1 <- segment(CNA(genomdat, chrom, maploc))

# test code on a noisier and hence more difficult data set
set.seed(51)
genomdat <- rnorm(500, sd=0.2) +
  rep(c(-0.2,0.1,1,-0.5,0.2,-0.5,0.1,-0.2),c(137,87,17,49,29,52,87,42))
plot(genomdat)
chrom <- rep(1:2,c(290,210))
maploc <- c(1:290,1:210)
test2 <- segment(CNA(genomdat, chrom, maploc))

# A real analysis
data(coriell)

# Combine into one CNA object to prepare for analysis on Chromosomes 1-23
CNA.object <- CNA(cbind(coriell$Coriell.05296,coriell$Coriell.13330),
                  coriell$Chromosome,coriell$Position,
                  data.type="logratio",sampleid=c("c05296","c13330"))

# We generally recommend smoothing single point outliers before analysis
# Make sure to check that the smoothing is proper
smoothed.CNA.object <- smooth.CNA(CNA.object)

# Segmentation at default parameters
segment.smoothed.CNA.object <- segment(smoothed.CNA.object, verbose=1)

[Package *DNAcopy* version 1.1.0 Index]