fdr.int {OLIN} R Documentation

## Assessment of the significance of intensity-dependent bias

### Description

This function assesses the significance of intensity-dependent bias by a one-sided random permutation test. The observed average values of logged fold changes within an intensity neighbourhood are compared to an empirical distribution generated by random permutation. The significance is given by the false discovery rate.

### Usage

`fdr.int(A,M,delta=50,N=100,av="median")`

### Arguments

| Argument | Description |
| --- | --- |
| `A` | vector of average logged spot intensity |
| `M` | vector of logged fold changes |
| `delta` | integer determining the size of the neighbourhood. The actual window size is (`2 * delta + 1`). |
| `N` | number of random permutations performed to generate the empirical distribution |
| `av` | averaging of `M` within the neighbourhood by mean or median (default) |

### Details

The function `fdr.int` assesses the significance of intensity-dependent bias using a one-sided random permutation test. The null hypothesis states that `A` and `M` are independent. To test whether `M` depends on `A`, spots are ordered with respect to `A`. This defines, for each spot, a neighbourhood of spots with similar `A`. Next, a test statistic is defined by calculating the median or mean of `M` ($\bar{M}$) within the spot's symmetrical intensity neighbourhood of chosen size (`2 * delta + 1`). An empirical distribution of the test statistic is produced by calculating it for `N` random intensity orders of spots. The independence of `M` and `A` is assessed by comparing this empirical distribution of $\bar{M}$ with the observed distribution of $\bar{M}$. If `M` is independent of `A`, the empirical distribution of $\bar{M}$ can be expected to be distributed around its mean value.

The false discovery rate (FDR) is used to assess the significance of observing positive deviations of $\bar{M}$. It indicates the expected proportion of false positives among rejected null hypotheses. It is defined as `FDR = q*T/s`, where `q` is the fraction of $\bar{M}$ values larger than a chosen threshold `c` in the empirical distribution, `s` is the number of neighbourhoods with $\bar{M}$ larger than `c` in the distribution derived from the original data, and `T` is the total number of neighbourhoods in the original data. Varying the threshold `c` determines the FDR for each spot neighbourhood. FDRs equal to zero are set to `FDR=1/T*N` for computational reasons, as `log10(FDR)` is plotted by `sigint.plot`. The significance of observing negative deviations of $\bar{M}$ is determined correspondingly.

If the neighbourhood window extends beyond the limits of the intensity scale, the significance is set to `NA`.
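The procedure described above can be sketched in plain R. This is an illustrative re-implementation under stated assumptions, not the OLIN source: the name `fdr_int_sketch` and its internals are hypothetical, the threshold `c` is taken to be each observed neighbourhood average in turn, comparisons use `>=` so that `s` is never zero, zero FDRs are replaced by `1/(T*N)` (one reading of the formula above), and results are returned in the original spot order.

```r
# Illustrative sketch only; NOT the OLIN implementation of fdr.int.
fdr_int_sketch <- function(A, M, delta = 50, N = 100, av = c("median", "mean")) {
  av  <- match.arg(av)
  avg <- if (av == "median") median else mean
  n   <- length(M)
  stopifnot(length(A) == n, n >= 2 * delta + 1)

  # Average of m over each full window of 2 * delta + 1 consecutive values.
  window_avg <- function(m) {
    centres <- (delta + 1):(n - delta)
    vapply(centres, function(i) avg(m[(i - delta):(i + delta)]), numeric(1))
  }

  ord  <- order(A)                      # spots ordered by intensity
  obs  <- window_avg(M[ord])            # observed statistic, one per neighbourhood
  Tn   <- length(obs)                   # T: number of complete neighbourhoods
  perm <- as.vector(replicate(N, window_avg(sample(M))))  # T * N null statistics

  # FDR = q * T / s, with the threshold c0 set to each observed value (assumption).
  fdr <- function(obs, perm) {
    vapply(obs, function(c0) {
      q <- mean(perm >= c0)             # fraction of permuted statistics at or above c0
      s <- sum(obs >= c0)               # observed neighbourhoods at or above c0
      f <- q * Tn / s
      if (f == 0) 1 / (Tn * N) else f   # avoid log10(0) when plotting (assumed floor)
    }, numeric(1))
  }

  # Map results back to the original spot order; incomplete windows stay NA.
  pad <- function(x) {
    out <- rep(NA_real_, n)
    out[ord[(delta + 1):(n - delta)]] <- x
    out
  }

  # Negative deviations are handled by flipping the sign of both distributions.
  list(FDRp = pad(fdr(obs, perm)), FDRn = pad(fdr(-obs, -perm)))
}

# Simulated data in which M deliberately depends on A.
set.seed(1)
A <- runif(400, 6, 14)
M <- 0.3 * (A - 10) + rnorm(400, sd = 0.3)
res <- fdr_int_sketch(A, M, delta = 20, N = 20)
```

With simulated data of this kind, neighbourhoods at high `A` should receive small `FDRp` values and those at low `A` small `FDRn` values, while the `delta` spots at each end of the intensity scale are `NA`.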

### Value

A list of vectors containing the false discovery rates for positive (`FDRp`) and negative (`FDRn`) deviations of median/mean of `M` (of the spot's neighbourhood) is returned.
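Because neighbourhoods at the edges of the intensity scale yield `NA`, summaries of the result need to drop missing values. A minimal sketch follows, using a made-up list with the documented element names in place of a real `fdr.int` result. The values and the cutoff of 0.01 are arbitrary choices for illustration.

```r
# Hypothetical result with the documented element names; the values are invented.
res <- list(FDRp = c(NA, 0.20, 0.004, 0.0001, NA),
            FDRn = c(NA, 0.90, 0.600, 0.3000, NA))

# Count neighbourhoods below an (arbitrary) FDR cutoff, ignoring NA edges.
n_pos <- sum(res$FDRp < 0.01, na.rm = TRUE)
n_neg <- sum(res$FDRn < 0.01, na.rm = TRUE)

# log10 scale, the scale on which the FDR is plotted; NA entries stay NA.
log_fdr_pos <- log10(res$FDRp)
```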

### Author(s)

Matthias E. Futschik (http://itb.biologie.hu-berlin.de/~futschik)

### See Also

`p.int`, `fdr.spatial`, `sigint.plot`

### Examples

```
# To run these examples, "un-comment" them!
#
# data(sw)
# CALCULATION OF SIGNIFICANCE OF SPOT NEIGHBOURHOODS
# For this illustration, N was chosen rather small. For "real" analysis, it should be larger.
# FDR <- fdr.int(maA(sw)[,1],maM(sw)[,1],delta=50,N=10,av="median")
# VISUALISATION OF RESULTS
# sigint.plot(maA(sw)[,1],maM(sw)[,1],FDR$FDRp,FDR$FDRn,c(-5,-5))