read.SnpSetIllumina {beadarraySNP} | R Documentation |
A SnpSetIllumina object is created from the text files produced by the Illumina GenCall or BeadStudio software.
read.SnpSetIllumina(samplesheet, manifestpath=NULL, reportpath=NULL, rawdatapath=NULL, reportfile=NULL, briefOPAinfo=TRUE, readTIF=FALSE, ...)
samplesheet |
a data.frame or filename, contains the sample sheet |
manifestpath |
a character string for the path containing the manifests / OPA definition files, defaults to path of samplesheet |
reportpath |
a character string for the path containing the report files, defaults to path of samplesheet |
rawdatapath |
a character string for the path containing the intensity data files, defaults to path of samplesheet |
reportfile |
a character string for the name of BeadStudio reportfile |
briefOPAinfo |
logical, if TRUE then only the SNP name, Illumi code, chromosome and basepair position are put into the featureData slot of the result, else all information from the OPA file is put into the featureData slot |
readTIF |
logical, if TRUE the beadarray package and raw TIF files are used to read the data |
... |
arguments are forwarded to readIllumina and can be used to perform bead-level normalization |
The text files from Illumina software are imported to a SnpSetIllumina object.
Result files from both GenCall and BeadStudio can be used.
In both cases the sample sheets from the experiments are used to select the
proper data from the report or data files. The following columns from the
sample sheet file are used for this purpose: ‘Sample_Name’,
‘Sentrix_Position’, and ‘Pool_ID’. The values in columns ‘Sample_Plate’,
‘Pool_ID’, and ‘Sentrix_ID’ should be the same for all samples in the file,
as is the case for processed experiments. The contents of the sample sheet
are put into the phenoData slot.
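The column requirements above can be checked before import. The following is a minimal sketch in base R; `check_samplesheet` is a hypothetical helper, not a function of beadarraySNP, and the values in the example sheet are made up.

```r
# Hypothetical helper, not part of beadarraySNP: sanity-check a sample sheet
# data.frame before passing it to read.SnpSetIllumina().
check_samplesheet <- function(sheet) {
  required <- c("Sample_Name", "Sentrix_Position", "Pool_ID")
  constant <- c("Sample_Plate", "Pool_ID", "Sentrix_ID")

  # All columns named in the documentation must be present
  missing_cols <- setdiff(union(required, constant), names(sheet))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }

  # These columns should hold a single value across all samples
  varying <- constant[vapply(sheet[constant],
                             function(x) length(unique(x)) > 1,
                             logical(1))]
  if (length(varying) > 0) {
    stop("columns with more than one value: ",
         paste(varying, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up example sheet with two samples
sheet <- data.frame(
  Sample_Name      = c("S1", "S2"),
  Sample_Plate     = "Plate1",
  Pool_ID          = "OPA4",
  Sentrix_ID       = "1234567",
  Sentrix_Position = c("R001_C001", "R001_C002"),
  stringsAsFactors = FALSE
)
check_samplesheet(sheet)
```

The check stops with an error on the first problem it finds, which is usually easier to diagnose than a failed import.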
Ideally the OPA manifest file containing SNP annotation should be available.
These files are provided by Illumina. Columns ‘IllCode’, ‘CHR’, and
‘MapInfo’ are put into the featureData slot.
GenCall Data
In order to process experiments that were genotyped using the GenCall software,
the arrays should be scanned with the setting
<SaveTextFiles>true</SaveTextFiles> in the Illumina configuration file
Settings.XML. Three types of files need to be present in the same folder:
the sample sheet, .csv files containing signal intensity data, and the report
file that contains the genotype information. For each sample in the sample
sheet there should be a .csv file with the following file mask:
[sam_id]_R00[yy]_C00[xx].csv, where sam_id is the Illumina ID for the SAM,
and xx and yy are the column and row number respectively. From the report
files the file with mask [Pool_ID]_LocusByDNA[_ExpName].csv is used.
‘Pool_ID’ is the OPA panel used, and ‘_ExpName’ is optional.
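The two file masks can be illustrated with a short base R sketch. Both helper names are hypothetical, and the code assumes single-digit row and column numbers, which is one reading of the mask above; the package's own file lookup may differ.

```r
# Hypothetical helpers, not part of beadarraySNP.

# Intensity file for one sample: [sam_id]_R00[yy]_C00[xx].csv
# (yy = row, xx = column; single-digit values are assumed here)
gencall_csv_name <- function(sam_id, row, col) {
  sprintf("%s_R00%d_C00%d.csv", sam_id, as.integer(row), as.integer(col))
}

# Report file: [Pool_ID]_LocusByDNA[_ExpName].csv, experiment name optional
gencall_report_name <- function(pool_id, exp_name = NULL) {
  suffix <- if (is.null(exp_name)) "" else paste0("_", exp_name)
  paste0(pool_id, "_LocusByDNA", suffix, ".csv")
}

# Made-up identifiers for illustration
gencall_csv_name("1234567", row = 1, col = 2)
gencall_report_name("OPA4")
gencall_report_name("OPA4", "run1")
```

Passing the resulting names to file.exists() in the data folder is a quick way to see which files are still missing.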
BeadStudio Data
To process experiments that were processed with BeadStudio, only two files are
needed: the sample sheet and the Final Report file. The sample sheet must
contain the same columns as for GenCall. The report file should contain the
following columns: ‘SNP Name’, ‘Sample ID’, ‘GC Score’, ‘Allele1 - AB’,
‘Allele2 - AB’, ‘GT Score’, ‘X Raw’, and ‘Y Raw’. ‘SNP Name’ and
‘Sample ID’ are used to form rows and columns in the experimental data,
‘GC Score’ is put in the callProbability matrix, ‘Allele1 - AB’ and
‘Allele2 - AB’ are combined into the call matrix, ‘GT Score’ is added to the
featureData slot, ‘X Raw’ is put in the R matrix and ‘Y Raw’ in the G matrix.
Other columns in the report file are added as matrices in the assayData slot,
or as columns in the featureData slot if values are identical for all samples
in the reportfile.
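The mapping from a long-format report to SNP-by-sample matrices can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy report values are made up, `to_matrix` is a hypothetical helper, and joining the two allele columns by plain concatenation is an assumption about how the calls are combined.

```r
# Toy long-format report with made-up values; a real Final Report file has a
# header section and more columns.
report <- data.frame(
  "SNP Name"     = c("rs1", "rs2", "rs1", "rs2"),
  "Sample ID"    = c("S1", "S1", "S2", "S2"),
  "GC Score"     = c(0.91, 0.85, 0.88, 0.40),
  "Allele1 - AB" = c("A", "A", "B", "A"),
  "Allele2 - AB" = c("A", "B", "B", "B"),
  "X Raw"        = c(1200, 800, 150, 760),
  "Y Raw"        = c(100, 750, 1300, 820),
  check.names = FALSE, stringsAsFactors = FALSE
)

snps    <- unique(report[["SNP Name"]])
samples <- unique(report[["Sample ID"]])

# Hypothetical helper: reshape one report column into a SNP-by-sample matrix,
# using the SNP and sample names as row and column indices.
to_matrix <- function(values) {
  m <- matrix(NA, nrow = length(snps), ncol = length(samples),
              dimnames = list(snps, samples))
  m[cbind(report[["SNP Name"]], report[["Sample ID"]])] <- values
  m
}

callProbability <- to_matrix(report[["GC Score"]])
# Assumption: the two allele columns are simply pasted together
calls <- to_matrix(paste0(report[["Allele1 - AB"]], report[["Allele2 - AB"]]))
R <- to_matrix(report[["X Raw"]])
G <- to_matrix(report[["Y Raw"]])
```

Each matrix has one row per SNP and one column per sample, which is the layout the description above refers to.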
Sample sheets
To help generate a sample sheet for BeadStudio data, a Sample_Map.txt file can
be converted to a sample sheet with the Sample_Map2Samplesheet function.
Manifest/OPA/annotation files
For BeadStudio reportfiles it is not necessary to have a Manifest file if the
columns ‘Chr’ and ‘Position’ are available in the report file. Currently this
is the only way to import data from Infinium arrays, because Illumina does not
supply Manifest files for these arrays.
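Whether a report can be imported without a Manifest file comes down to a column check. A minimal sketch, where `has_position_info` is a hypothetical helper and the column vectors are made up:

```r
# Hypothetical helper, not part of beadarraySNP: TRUE when the report's column
# names include the annotation columns that make a Manifest file unnecessary.
has_position_info <- function(report_columns) {
  all(c("Chr", "Position") %in% report_columns)
}

has_position_info(c("SNP Name", "Sample ID", "GC Score", "Chr", "Position"))
has_position_info(c("SNP Name", "Sample ID", "GC Score"))
```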
This function returns a SnpSetIllumina object.
Jan Oosting
SnpSetIllumina-class, Sample_Map2Samplesheet, readIllumina
# read a SnpSetIllumina object using example textfiles in data directory
datadir <- system.file("testdata", package="beadarraySNP")
SNPdata <- read.SnpSetIllumina(paste(datadir,"4samples_opa4.csv",sep="/"),datadir)