argmax.geno {qtl}    R Documentation

Reconstruct underlying genotypes


Uses the Viterbi algorithm to identify the most likely sequence of underlying genotypes, given the observed multipoint marker data, with possible allowance for genotyping errors.


argmax.geno(cross, step=0, off.end=0, error.prob=0,
            map.function=c("haldane","kosambi","c-f","morgan"))


cross An object of class cross. See read.cross for details.
step Maximum distance (in cM) between positions at which the genotypes are reconstructed. If step = 0, genotypes are reconstructed only at the marker locations.
off.end Distance (in cM) past the terminal markers on each chromosome to which the genotype reconstructions will be carried.
error.prob Assumed genotyping error rate used in the calculation of the penetrance Pr(observed genotype | true genotype).
map.function Indicates whether to use the Haldane, Kosambi, Carter-Falconer or Morgan map function when converting genetic distances into recombination fractions.
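To make the role of map.function concrete, here is a minimal sketch of three of the standard map functions in their textbook forms, converting a distance in cM into a recombination fraction. The function names (mf_haldane, mf_kosambi, mf_morgan) are hypothetical and are not part of the qtl package; the Carter-Falconer function is omitted here.

```r
# Textbook map functions: genetic distance d (in cM) -> recombination fraction.
# These helper names are illustrative only; they are not qtl functions.
mf_haldane <- function(d) 0.5 * (1 - exp(-2 * d / 100))  # no crossover interference
mf_kosambi <- function(d) 0.5 * tanh(2 * d / 100)        # moderate interference
mf_morgan  <- function(d) pmin(d / 100, 0.5)             # complete interference, capped at 1/2

d <- c(1, 10, 50)
rbind(haldane = mf_haldane(d),
      kosambi = mf_kosambi(d),
      morgan  = mf_morgan(d))
```

For short distances the three functions nearly agree; they diverge as the distance grows.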


We use the Viterbi algorithm to calculate arg max_v Pr(g = v | O), where g is the underlying sequence of genotypes and O is the sequence of observed marker genotypes.

This is done by calculating Q[k](v[k]) = max_{v[1], ..., v[k-1]} Pr(g[1] = v[1], ..., g[k] = v[k], O[1], ..., O[k]) for k = 1, ..., n, and then tracing back through the sequence.
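The recursion and traceback can be sketched in base R for a simplified case. This is an illustration only, not the implementation used by the package: it assumes a backcross-like chain with two genotypes coded 1 and 2, equal prior probabilities of 1/2, and a symmetric error model in which an observation disagrees with the true genotype with probability eps. The function name viterbi_bc is hypothetical.

```r
# Minimal Viterbi sketch for a two-genotype (backcross-like) chain.
# obs: observed genotypes (1 or 2), NA at positions with no marker data
# r:   recombination fractions between adjacent positions (length(obs) - 1)
# eps: assumed genotyping error rate (symmetric error model; an assumption)
viterbi_bc <- function(obs, r, eps) {
  n <- length(obs)
  # log penetrance Pr(O[k] | g[k] = v); a missing observation contributes log(1) = 0
  emit <- function(o, v) {
    if (is.na(o)) 0 else if (o == v) log(1 - eps) else log(eps)
  }
  Q    <- matrix(-Inf, nrow = n, ncol = 2)         # Q[k, v] holds log Q[k](v)
  back <- matrix(NA_integer_, nrow = n, ncol = 2)  # best predecessor of state v at k
  for (v in 1:2) Q[1, v] <- log(0.5) + emit(obs[1], v)
  for (k in seq_len(n - 1) + 1) {
    for (v in 1:2) {
      # log transition probability from each previous state into v
      trans <- ifelse(1:2 == v, log(1 - r[k - 1]), log(r[k - 1]))
      cand  <- Q[k - 1, ] + trans
      back[k, v] <- which.max(cand)
      Q[k, v]    <- max(cand) + emit(obs[k], v)
    }
  }
  # Trace back from the best final state
  path <- integer(n)
  path[n] <- which.max(Q[n, ])
  for (k in rev(seq_len(n - 1))) path[k] <- back[k + 1, path[k + 1]]
  path
}

# Two typed markers with two untyped positions between them;
# the middle interval is the widest, so the recombination is placed there.
viterbi_bc(obs = c(1, NA, NA, 2), r = c(0.05, 0.2, 0.05), eps = 0.01)
```

Working on the log scale avoids numerical underflow when the number of positions is large.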


The input cross object is returned with a component, argmax, added to each component of cross$geno. argmax is a matrix of size [n.ind x n.pos], where n.pos is the number of positions at which the reconstructed genotypes were obtained, containing the most likely sequences of underlying genotypes. Attributes "error.prob", "step", and "off.end" are set to the values of the corresponding arguments, for later reference.
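As a brief sketch of how the returned object might be inspected, the following uses the fake.f2 data set shipped with the package. It assumes the qtl package is installed, and it assumes that the attributes are stored on the argmax matrix itself, which the text above does not state explicitly.

```r
library(qtl)   # assumes the qtl package is installed
data(fake.f2)
fake.f2 <- argmax.geno(fake.f2, step = 2, off.end = 5, error.prob = 0.01)

am <- fake.f2$geno[[1]]$argmax   # reconstructions for the first chromosome
dim(am)                          # n.ind rows, n.pos columns
am[1:3, 1:5]                     # first few individuals and positions
attr(am, "error.prob")           # assumption: attribute is attached to the matrix
```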


The Viterbi algorithm can behave badly when step is small but positive. One may observe quite different results for different values of step.

The problem is that, in the presence of data like A----H, the sequences AAAAAA and HHHHHH may each be more likely than any one of the sequences AAAAAH, AAAAHH, AAAHHH, AAHHHH, AHHHHH, even though those recombinant sequences taken together may be far more probable. The Viterbi algorithm produces only a single "most likely" sequence of underlying genotypes.
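The effect can be checked numerically with a small base-R calculation. The sketch below assumes a backcross with genotypes A and H, two typed markers 10 cM apart with observations A and H, the Haldane map function, and a symmetric error model with rate eps. The helper names haldane and path_probs are hypothetical.

```r
# Haldane map function: distance in cM -> recombination fraction
haldane <- function(d_cM) 0.5 * (1 - exp(-2 * d_cM / 100))

# Joint probability of one hidden path and the data "A at the first position,
# H at the last", when the span total_cM is split into n equal intervals.
path_probs <- function(total_cM, n, eps) {
  r <- haldane(total_cM / n)
  c(no_recomb  = 0.5 * (1 - r)^n * (1 - eps) * eps,        # e.g. AA...A: one call is an error
    one_recomb = 0.5 * (1 - r)^(n - 1) * r * (1 - eps)^2)  # e.g. A..AH..H: both calls correct
}

coarse <- path_probs(total_cM = 10, n = 2,  eps = 0.01)  # 5 cM grid
fine   <- path_probs(total_cM = 10, n = 20, eps = 0.01)  # 0.5 cM grid

coarse["no_recomb"] / coarse["one_recomb"]  # below 1: a recombinant path wins
fine["no_recomb"]   / fine["one_recomb"]    # above 1: the all-A path wins
```

On the fine grid each individual recombinant path is less likely than the path that treats one observation as a genotyping error, so the reconstructed sequence changes with step even though the data are the same.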


Karl W Broman


Lange, K. (1999) Numerical analysis for statisticians. Springer-Verlag. Sec 23.3.

Rabiner, L. R. (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257–286.

See Also

sim.geno, calc.genoprob



data(fake.f2)
fake.f2 <- argmax.geno(fake.f2, step=2, off.end=5, error.prob=0.01)


data(fake.bc)
fake.bc <- argmax.geno(fake.bc, step=0, off.end=0)

[Package qtl version 0.98-57]