03.ReadingData {limma}    R Documentation

Reading Microarray Data from Files


This help page gives an overview of LIMMA functions used to read data from files.

Reading Target Information

The function readTargets is designed to help with organizing information about which RNA sample is hybridized to each channel on each array and which files store information for each array.
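As a minimal sketch of what a targets file might look like, the following writes a small tab-delimited file to a temporary location and reads it back with readTargets. The file contents (file names, the Cy3 and Cy5 columns and the sample labels) are invented for illustration only.

```r
library(limma)

# Hypothetical targets file: one row per array, one column per channel.
targets_file <- tempfile(fileext = ".txt")
writeLines(c("FileName\tCy3\tCy5",
             "array1.spot\tWT\tMutant",
             "array2.spot\tMutant\tWT"),
           targets_file)

# readTargets returns an ordinary data frame describing the experiment.
targets <- readTargets(targets_file)
print(targets)
```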

Reading Intensity Data

The first step in a microarray data analysis is to read into R the intensity data for each array provided by an image analysis program. This is done using the function read.maimages.
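The following sketch illustrates the idea on two tiny made-up files. It assumes a recent limma release in which read.maimages accepts source="generic" together with the columns and annotation arguments; the column names Red and Green and all intensity values are hypothetical and do not correspond to any real image analysis program.

```r
library(limma)

# Create two small tab-delimited files standing in for image analysis output.
dir <- tempfile("arrays")
dir.create(dir)
spots <- data.frame(ID = c("g1", "g2", "g3"))
write.table(cbind(spots, Red = c(100, 200, 300), Green = c(110, 190, 310)),
            file.path(dir, "array1.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(cbind(spots, Red = c(120, 180, 330), Green = c(100, 210, 290)),
            file.path(dir, "array2.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)

# Tell read.maimages which file columns hold the red and green intensities
# and which column holds the probe annotation.
RG <- read.maimages(c("array1.txt", "array2.txt"), source = "generic",
                    path = dir,
                    columns = list(R = "Red", G = "Green"),
                    annotation = "ID")

# One row per spot, one column per array.
dim(RG$R)
```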

read.maimages optionally constructs quality weights for each spot using quality functions listed in QualityWeights.
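As a hedged illustration, the sketch below passes the quality function wtflags to read.maimages through the wt.fun argument. It assumes a recent limma release, a generic input file and a GenePix-style Flags column in which negative values mark poor spots; the file contents are invented.

```r
library(limma)

# A made-up output file with a Flags column (negative = flagged spot).
dir <- tempfile("flagged")
dir.create(dir)
write.table(data.frame(ID    = c("g1", "g2", "g3"),
                       Red   = c(100, 200, 300),
                       Green = c(110, 190, 310),
                       Flags = c(0, -50, 0)),
            file.path(dir, "array1.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)

# wtflags(weight = 0, cutoff = 0) builds a function that gives weight 0
# to spots whose Flags value is below 0 and weight 1 to all other spots.
RGw <- read.maimages("array1.txt", source = "generic", path = dir,
                     columns = list(R = "Red", G = "Green"),
                     annotation = "ID",
                     wt.fun = wtflags(weight = 0, cutoff = 0))

# The weights are stored alongside the intensities.
RGw$weights
```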

read.maimages produces an RGList object and stores only the information required from each image analysis output file. If you wish to read all the image analysis output files into R as individual data frames preserving all the original columns found in the files, you may use read.series. An RGList object can be extracted from the data frames at a later stage using the functions rg.spot, rg.genepix or rg.quantarray.

Another function, rg.series.spot, is very similar to read.maimages with source="spot". This function will be removed in future versions of LIMMA.

read.maimages uses the utility functions removeExt, read.matrix, read.imagene and readImaGeneHeader.

The function as.MAList can be used to convert a marrayNorm object to an MAList object if the data was read and normalized using the marray and marrayNorm packages.

Reading the Gene List

Many image analysis programs provide gene IDs as columns in the image analysis output files, for example ArrayVision, ImaGene and the Stanford Microarray Database. In other cases you may have the probe ID and annotation information in a separate file. The function readGAL reads information from a GenePix Array List (GAL) file. It produces a data frame with known column names. If the probe IDs or names consist of multiple strings separated by a delimiter, then splitName may be used to separate the name and annotation information into separate vectors.
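A minimal sketch of reading a GAL file follows. The file written here is a heavily simplified, made-up stand-in for a real GAL file (a few header records followed by a table with Block, Row, Column, ID and Name columns); real files carry more header fields.

```r
library(limma)

# Simplified, hypothetical GAL file: 2 blocks, each 2 rows by 2 columns.
gal_file <- tempfile(fileext = ".gal")
layout_rows <- expand.grid(Column = 1:2, Row = 1:2, Block = 1:2)
body <- sprintf("%d\t%d\t%d\tid%d\tname%d",
                layout_rows$Block, layout_rows$Row, layout_rows$Column,
                seq_len(nrow(layout_rows)), seq_len(nrow(layout_rows)))
writeLines(c("ATF\t1.0",
             "2\t5",
             "Type=GenePix ArrayList V1.0",
             "BlockCount=2",
             "Block\tRow\tColumn\tID\tName",
             body),
           gal_file)

# readGAL locates the column header line itself and returns a data frame.
gal <- readGAL(gal_file)
head(gal)
```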

The functions readSpotTypes and controlStatus assist with separating control spots from ordinary genes in the analysis and data exploration.
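The sketch below shows one way these two functions might be combined. The spot types file, its patterns and the small gene list are all invented for illustration; the "*" wildcard and the column names SpotType, ID, Name and Color follow the convention used in limma spot types files.

```r
library(limma)

# Hypothetical spot types file: later rows override earlier ones.
types_file <- tempfile(fileext = ".txt")
writeLines(c("SpotType\tID\tName\tColor",
             "gene\t*\t*\tblack",
             "control\tctrl*\t*\tred"),
           types_file)
types <- readSpotTypes(types_file)

# A made-up gene list with two ordinary probes and one control spot.
genes <- data.frame(ID   = c("g1", "g2", "ctrl1"),
                    Name = c("geneA", "geneB", "spike-in"),
                    stringsAsFactors = FALSE)

# controlStatus matches the patterns against the ID and Name columns
# and returns the spot type of each spot.
status <- controlStatus(types, genes)
table(status)
```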

The function getLayout extracts from the GAL-file data frame the print layout information for a spotted array. The functions gridr, gridc, spotr and spotc use the extracted layout to compute grid positions and spot positions within each grid for each spot. The function printorder calculates the print order, plate number and plate row and column position for each spot given information about the printing process. The utility function getSpacing converts character strings specifying spacings of duplicate spots to numeric values.
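To illustrate the position functions, the sketch below builds a small layout by hand rather than calling getLayout, so that it does not depend on a GAL file. The layout values (a 2 by 2 arrangement of grids, each holding 2 by 2 spots) are hypothetical.

```r
library(limma)

# Hand-made print layout with the four components used by the
# position functions: grid rows/columns and spot rows/columns per grid.
layout <- list(ngrid.r = 2, ngrid.c = 2, nspot.r = 2, nspot.c = 2)

# One entry per spot, in the order the spots appear in the data.
positions <- data.frame(GridRow = gridr(layout),
                        GridCol = gridc(layout),
                        SpotRow = spotr(layout),
                        SpotCol = spotc(layout))
positions
```

Each of the 16 spots receives a distinct combination of grid and spot coordinates.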

If each probe is printed more than once on the arrays, then uniquegenelist will remove duplicate names from the GAL file or gene list.
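For example, assuming each probe is printed in duplicate, the following sketch reduces a made-up gene list to one entry per probe, first for side-by-side duplicates and then for duplicates printed three spots apart.

```r
library(limma)

# Duplicates printed side by side (spacing = 1).
adjacent <- c("A", "A", "B", "B", "C", "C")
unique_adjacent <- uniquegenelist(adjacent, ndups = 2, spacing = 1)

# Duplicates printed three spots apart (spacing = 3).
spaced <- c("A", "B", "C", "A", "B", "C")
unique_spaced <- uniquegenelist(spaced, ndups = 2, spacing = 3)

unique_adjacent
unique_spaced
```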

Manipulating Data Objects

cbind, rbind and merge allow different RGList or MAList objects to be combined. cbind combines data from different arrays assuming the layout of the arrays to be the same. merge can combine data even when the order of the probes on the arrays has changed. merge uses the utility function makeUnique.
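The difference between cbind and merge can be sketched with three small hand-made RGList objects. All probe names and intensities are invented, and the objects carry only the R and G components; merge is assumed here to align probes by the row names of the intensity matrices.

```r
library(limma)

# Helper (hypothetical, for this example only): build a one-array RGList.
make_rg <- function(values, probes, array) {
  m <- matrix(values, ncol = 1, dimnames = list(probes, array))
  new("RGList", list(R = m, G = m + 1))
}

rg1 <- make_rg(c(100, 200, 300), c("g1", "g2", "g3"), "a1")
rg2 <- make_rg(c(110, 210, 310), c("g1", "g2", "g3"), "a2")
# Same probes as rg1, but in a different order.
rg3 <- make_rg(c(30, 10, 20), c("g3", "g1", "g2"), "a3")

# cbind assumes the rows already line up.
both <- cbind(rg1, rg2)

# merge reorders the second object to match the first before combining.
merged <- merge(rg1, rg3)
merged$R
```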


Gordon Smyth

[Package limma version 2.4.7 Index]