readCel {affxparser} | R Documentation |
This function reads all or a subset of the data in an Affymetrix CEL file.
readCel(filename, indices = NULL, readHeader = TRUE, readXY = FALSE, readIntensities = TRUE, readStdvs = FALSE, readPixels = FALSE, readOutliers = TRUE, readMasked = TRUE, readMap = NULL, verbose = 0, .checkArgs = TRUE)
filename |
the name of the CEL file. |
indices |
a vector of indices indicating which features to read. If the argument is NULL, all features will be returned. |
readXY |
a logical: will the (x,y) coordinates be returned. |
readIntensities |
a logical: will the intensities be returned. |
readStdvs |
a logical: will the standard deviations be returned. |
readPixels |
a logical: will the number of pixels be returned. |
readOutliers |
a logical: will the outliers be returned. |
readMasked |
a logical: will the masked features be returned. |
readHeader |
a logical: will the header of the file be returned. |
readMap |
a vector remapping cell indices to file indices. If NULL, no mapping is used. |
verbose |
how verbose the function should be. 0 means no output; higher numbers mean more verbose output. At the moment the values 0, 1 and 2 are supported. |
.checkArgs |
If TRUE, the arguments will be validated, otherwise not. Warning: Set this to FALSE only if the arguments have been validated elsewhere! |
A CEL file consists of a header, a set of cell values, and information about outliers and masked cells.
The cell values, which are values extracted for each cell (aka feature
or probe), are the (x,y) coordinate, intensity and standard deviation
estimates, and the number of pixels in the cell.
If indices=NULL, cell values for all cells are returned. Otherwise, only the cell values specified by argument indices are returned.
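As a minimal sketch of reading a subset of cells, the wrapper below restricts readCel() to intensities only. The helper name readCelSubset() and the file name "example.CEL" are hypothetical and not part of affxparser; the sketch assumes the package is installed and returns NULL when the package or the file is unavailable.

```r
# Hypothetical helper (not part of affxparser): read only the intensities
# for a chosen subset of cells.
readCelSubset <- function(filename, indices = NULL) {
  # Return NULL quietly if the package or the file is unavailable
  if (!requireNamespace("affxparser", quietly = TRUE)) return(NULL)
  if (!file.exists(filename)) return(NULL)

  affxparser::readCel(
    filename,
    indices = indices,       # NULL reads all cells
    readHeader = FALSE,
    readIntensities = TRUE,
    readOutliers = FALSE,
    readMasked = FALSE
  )
}

# Cells 1 to 5 only; "example.CEL" is a placeholder file name
cel <- readCelSubset("example.CEL", indices = 1:5)
```

Switching off the elements that are not needed keeps the returned list small, since those elements are then NULL.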
This function returns a named list with components described below:
header |
The header of the CEL file. Equivalent to the output from readCelHeader(); see the documentation for that function. |
x,y |
(cell values) Two integer vectors containing
the x and y coordinates associated with each feature. |
intensities |
(cell value) A numeric vector
containing the intensity associated with each feature. |
stdvs |
(cell value) A numeric vector containing
the standard deviation associated with each feature. |
pixels |
(cell value) An integer vector containing
the number of pixels associated with each feature. |
outliers |
An integer vector of indices specifying which
of the queried cells are flagged as outliers.
Note that there is a difference between outliers=NULL and
outliers=integer(0); the latter case happens when
readOutliers=TRUE but there are no outliers. |
masked |
An integer vector of indices specifying which
of the queried cells are flagged as masked.
Note that there is a difference between masked=NULL and
masked=integer(0); the latter case happens when
readMasked=TRUE but there are no masked features. |
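The distinction between NULL and integer(0) can be checked with is.null() and length(). The helper below, describeFlagged(), is a hypothetical illustration and not part of affxparser; it takes the outliers or masked element of the returned list.

```r
# Hypothetical helper (not part of affxparser): summarise an 'outliers'
# or 'masked' element of the list returned by readCel().
describeFlagged <- function(idx) {
  if (is.null(idx)) {
    "not read"              # e.g. readOutliers=FALSE
  } else if (length(idx) == 0) {
    "read, none flagged"    # e.g. readOutliers=TRUE but no outliers
  } else {
    sprintf("read, %d flagged", length(idx))
  }
}

describeFlagged(NULL)
describeFlagged(integer(0))
describeFlagged(c(3L, 17L))
```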
The elements of the cell values are ordered according to argument indices. The lengths of the cell-value elements equal the number of cells read.
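Because every cell-value element has one entry per queried cell, in the order of indices, the elements can be placed side by side. The sketch below uses a hypothetical helper, celValuesToDataFrame(), and a made-up mock list in place of a real readCel() result, so it runs without a CEL file.

```r
# Hypothetical helper (not part of affxparser): combine the cell-value
# elements of a readCel()-style list into one data frame, one row per
# queried cell, in the order given by 'indices'.
celValuesToDataFrame <- function(cel, indices) {
  fields <- c("x", "y", "intensities", "stdvs", "pixels")
  # Keep only the elements that were actually read (non-NULL)
  present <- Filter(function(f) !is.null(cel[[f]]), fields)
  # Each cell-value element must have one entry per queried cell
  stopifnot(all(lengths(cel[present]) == length(indices)))
  as.data.frame(c(list(cell = indices), cel[present]))
}

# Mock of a readCel() result for indices c(3, 1, 2); the values are made up
idxs <- c(3L, 1L, 2L)
mock <- list(
  x = c(2L, 0L, 1L),
  y = c(0L, 0L, 0L),
  intensities = c(120.5, 98.0, 101.25)
)
df <- celValuesToDataFrame(mock, idxs)
print(df)
```

Elements that were switched off (here stdvs and pixels) are simply left out of the data frame.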
Which of the above elements are returned is controlled by the readNnn arguments. If FALSE, the corresponding element above is NULL, e.g. if readStdvs=FALSE then stdvs is NULL.
The Affymetrix image analysis software flags cells as outliers and masked. This method does not return these flags, but instead vectors of cell indices listing which of the queried cells are outliers and masked, respectively. The current community view seems to be that outlier detection should instead be based on statistical modelling of the actual probe intensities and should depend on the choice of preprocessing algorithm. Most algorithms use only the intensities from the CEL file.
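If per-cell flags are preferred over index vectors, they can be rebuilt from the queried indices. The sketch below is an illustration only: flagsFromIndices() is a hypothetical helper, and it assumes that the returned outliers and masked values use the same cell-index numbering as the indices argument.

```r
# Hypothetical helper (not part of affxparser): turn a vector of flagged
# cell indices into one logical flag per queried cell.
# Assumption: 'flagged' uses the same numbering as 'indices'.
flagsFromIndices <- function(indices, flagged) {
  if (is.null(flagged)) {
    # Element was not read, so the flag status is unknown
    return(rep(NA, length(indices)))
  }
  indices %in% flagged
}

# Made-up values standing in for a readCel() result
idxs <- c(10L, 11L, 12L)
isOutlier <- flagsFromIndices(idxs, flagged = 11L)
```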
The Fusion SDK allocates memory for the entire
CEL file when the file is accessed (but does not actually read the
file into memory). Using the indices argument will therefore
only affect the memory use of the final object (as well as speed), not
the memory allocated in the C function used to parse the file. This
should be a minor problem, however.
It is considered a bug if the file contains information not accessible by this function; please report it.
James Bullard, bullard@stat.berkeley.edu and Kasper Daniel Hansen, khansen@stat.berkeley.edu
readCelHeader()
for a description of the header output.
Often a user only wants to read the intensities; see readCelIntensities() for a function specialized for that use.
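As a brief sketch of that intensities-only route, the snippet below reads the first 100 cells from every CEL file in the current directory. The range 1:100 is arbitrary, and the code does nothing if no CEL files are found or affxparser is not installed.

```r
# Scan the current directory for CEL files
celFiles <- list.files(pattern = "[.](c|C)(e|E)(l|L)$")

if (length(celFiles) > 0 && requireNamespace("affxparser", quietly = TRUE)) {
  # One column per file, one row per requested cell
  Y <- affxparser::readCelIntensities(celFiles, indices = 1:100)
  str(Y)
}
```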
for (zzz in 0) {  # Only so that 'break' can be used

  # Scan current directory for CEL files
  celFiles <- list.files(pattern="[.](c|C)(e|E)(l|L)$")
  if (length(celFiles) == 0)
    break;

  celFile <- celFiles[1]

  # Read a subset of cells
  idxs <- c(1:5, 1250:1500, 450:440)
  cel <- readCel(celFile, indices=idxs, readOutliers=TRUE)
  str(cel)

  # Clean up
  rm(celFiles, celFile, cel)

} # for (zzz in 0)