import {MANOR} | R Documentation |
Load raw data from a text file coming from image analysis and convert it to an arrayCGH
object, using additional information about the array design.
Supported file types are Genepix Results files (.gpr), output files from SPOT, or any text file with appropriate fields "Row" and "Column" and a specified array design.
import(file, var.names=NULL, spot.names=NULL, clone.names=NULL, type=c("default", "gpr", "spot"), id.rep=1, design=NULL, add.lines=FALSE, ...)
file |
a connection or character string giving the name of the file to import. |
var.names |
a vector of variable names used to compute the array design. If the default is not overridden, it is set to c("Block", "Column", "Row", "X", "Y") for gpr files, c("Arr.colx", "Arr.rowy", "Spot.colx", "Spot.rowy") for SPOT files, and c("Col", "Row") for other text files. |
spot.names |
a list with spot-level variable names to be added to
arrayCGH$arrayValues |
clone.names |
a list with clone-level variable names to be added
to arrayCGH$cloneValues (only used in case of within-slide replicates) |
type |
a character value specifying the type of input file: currently .gpr files ("gpr"), spot files ("spot") and other text files with fields 'Col' and 'Row' ("default") are supported |
id.rep |
index of the replicate identifier (e.g. the name of the clone) in the vector clone.names |
design |
a numeric vector of length 4 specifying the array design as number of blocks per column, number of blocks per row, number of columns per block, number of rows per block. This field is mandatory for "default" text files, optional for "gpr" files, and not used for "spot" files. |
add.lines |
boolean value to handle the case when array design does not match number of lines. If TRUE, empty lines are added; if FALSE, execution is stopped |
... |
additional import parameters (e.g. 'sep= ' or 'comment.char= ') to be passed to the read.delim
function. Note that argument as.is=TRUE is always passed to
read.delim, in order to avoid inappropriate conversion of character
vectors to factors. |
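The interplay of design, add.lines and as.is=TRUE can be illustrated with base R only. The following is a minimal sketch, not the code used by import: the helper check_lines, the toy input and the choice to append empty lines at the end are all assumptions made for illustration.

```r
# Hypothetical design: the product of its four entries is the spot count.
design <- c(2, 1, 2, 2)          # 2 * 1 * 2 * 2 = 8 spots expected

# Toy tab-delimited input with only 7 of the 8 expected lines.
txt <- paste(
  "Col\tRow\tClone\tLogRatio",
  "1\t1\tA\t0.10",
  "2\t1\tA\t0.12",
  "1\t2\tB\t-0.40",
  "2\t2\tB\t-0.35",
  "1\t3\tC\t0.02",
  "2\t3\tC\t0.05",
  "1\t4\tD\t0.80",
  sep = "\n"
)

# as.is = TRUE keeps character columns (here Clone) as character vectors.
raw <- read.delim(text = txt, as.is = TRUE)

# Hypothetical helper (not part of MANOR): compare the number of lines
# with the number implied by the design; pad with NA rows or stop.
check_lines <- function(dat, design, add.lines = FALSE) {
  expected <- prod(design)
  n_missing <- expected - nrow(dat)
  if (n_missing == 0) return(dat)
  if (n_missing < 0 || !add.lines) {
    stop("array design implies ", expected, " lines, found ", nrow(dat))
  }
  # Assumption: empty lines are simply appended at the end.
  pad <- dat[rep(NA_integer_, n_missing), , drop = FALSE]
  out <- rbind(dat, pad)
  rownames(out) <- NULL
  out
}

padded <- check_lines(raw, design, add.lines = TRUE)
```

With add.lines = FALSE the same call stops with an error, which mirrors the behaviour described for the add.lines argument.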
Mandatory elements of arrayCGH
objects are the array design and the x and y
absolute coordinates of each spot on the array. Output files
from SPOT contain x and y relative coordinates of each spot within a
block, as well as block coordinates on the array; one can therefore
easily construct the corresponding arrayCGH
object.
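As an illustration of how absolute coordinates can follow from block coordinates and within-block coordinates, here is a minimal base R sketch. The column names are the SPOT variable names listed under var.names; the toy layout and the arithmetic are assumptions for illustration and are not taken from the MANOR source.

```r
# Toy SPOT-like layout: 2 x 2 blocks, each with 3 columns and 2 rows of spots.
spot <- expand.grid(Spot.colx = 1:3, Spot.rowy = 1:2,
                    Arr.colx = 1:2, Arr.rowy = 1:2)

# Block dimensions, read off the relative coordinates.
n_spot_col <- max(spot$Spot.colx)
n_spot_row <- max(spot$Spot.rowy)

# Assumed conversion: shift the relative coordinate by the number of
# spots contained in the blocks that precede it along the same axis.
spot$Col <- (spot$Arr.colx - 1) * n_spot_col + spot$Spot.colx
spot$Row <- (spot$Arr.rowy - 1) * n_spot_row + spot$Spot.rowy
```

Each spot then has a unique (Col, Row) pair on a 6 by 4 grid.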
.gpr files currently only contain x and y relative coordinates of each
spot within a block, and a block index with no specification of the
spatial block design. If the block design is not specified by the user, we
compute it using the real pixel locations of each spot (X and Y variables
in usual .gpr files).
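One way a block layout could be recovered from pixel locations is sketched below in base R. This is an assumed heuristic for illustration, not the algorithm implemented in MANOR: the toy pixel coordinates, the tolerance tol and the helper count_groups are all hypothetical.

```r
# Toy .gpr-like data: 6 blocks laid out as 3 across and 2 down,
# each block holding 3 columns and 2 rows of spots.
spots <- expand.grid(Column = 1:3, Row = 1:2, Block = 1:6)
block_col <- (spots$Block - 1) %% 3
block_row <- (spots$Block - 1) %/% 3
spots$X <- 1000 + block_col * 4500 + (spots$Column - 1) * 150
spots$Y <- 2000 + block_row * 4500 + (spots$Row - 1) * 150

# Pixel centre of each block.
centre_x <- tapply(spots$X, spots$Block, median)
centre_y <- tapply(spots$Y, spots$Block, median)

# Hypothetical helper: count groups of values separated by more than tol.
count_groups <- function(v, tol) 1 + sum(diff(sort(v)) > tol)

tol <- 1000  # assumed to be smaller than the spacing between blocks
n_blocks_across <- count_groups(centre_x, tol)
n_blocks_down   <- count_groups(centre_y, tol)
```

The two counts, together with the maximum Column and Row values within a block, give the four numbers needed for a design vector.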
If clone.names is provided, an additional data frame is created with
clone-level information (e.g. clone names, positions,
chromosomes, quality marks), aggregated from array-level information
using the identifier specified by id.rep. This identifier is also
added to the arrayCGH
object created, with name 'id.rep'.
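The construction of clone-level information from spot-level rows can be sketched in base R as follows. The toy data frame and the use of duplicated() are assumptions for illustration; import may aggregate differently.

```r
# Toy spot-level table with within-slide replicates of clones A and B.
spots <- data.frame(
  Clone      = c("A", "A", "B", "B", "C"),
  Chromosome = c(1, 1, 1, 1, 2),
  Position   = c(100, 100, 250, 250, 40),
  LogRatio   = c(0.10, 0.20, -0.50, -0.30, 0.00),
  stringsAsFactors = FALSE
)

clone.names <- c("Clone", "Chromosome", "Position")
id.rep <- 1                      # the identifier is clone.names[1], i.e. "Clone"
id <- clone.names[id.rep]

# Keep one row of clone-level variables per replicate identifier.
cloneValues <- spots[!duplicated(spots[[id]]), clone.names]
rownames(cloneValues) <- NULL
```

Here cloneValues has one row per clone, while the spot-level table keeps every replicate.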
Due to space limitations, only the first 100 lines of sample 'gpr' and
'spot' files are given in the standard distribution of MANOR.
Complete files are available at http://bioinfo.curie.fr/projects/manor/index.html
an object of class arrayCGH
People interested in tools for array-CGH analysis can visit our web-page: http://bioinfo.curie.fr.
Pierre Neuvial, manor@curie.fr.
dir.in <- system.file("data", package="MANOR")

## import from 'spot' files
spot.names <- c("LogRatio", "RefFore", "RefBack", "DapiFore", "DapiBack",
                "SpotFlag", "ScaledLogRatio")
clone.names <- c("PosOrder", "Chromosome")
edge <- import(paste(dir.in, "/edge.txt", sep=""), type="spot",
               spot.names=spot.names, clone.names=clone.names, add.lines=TRUE)

## import from 'gpr' files
spot.names <- c("Clone", "FLAG", "TEST_B_MEAN", "REF_B_MEAN", "TEST_F_MEAN",
                "REF_F_MEAN", "ChromosomeArm")
clone.names <- c("Clone", "Chromosome", "Position", "Validation")
ac <- import(paste(dir.in, "/gradient.gpr", sep=""), type="gpr",
             spot.names=spot.names, clone.names=clone.names,
             sep="\t", comment.char="@", add.lines=TRUE)