Agi4x44PreProcess-package {Agi4x44PreProcess} | R Documentation |
Agi4x44PreProcess Package Overview
The package allows the preprocessing of Agilent 4x44 array data produced by the Agilent Feature Extraction (AFE) image analysis software. The AFE extracts foreground and background signals, as well as some quality flags. All the extracted information is assembled into the components of an 'RGList' object (see the 'limma' package).
The preprocessing includes: background correction, normalization and filtering probes according to different quality flags that are produced by the AFE.
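To make the background correction and normalization steps concrete, here is a minimal base-R sketch on made-up intensities. It is not the package's implementation: the helper names bg_correct_half and quantile_normalize are hypothetical, the "half"-style rule (subtract the background, floor the result at 0.5, then add an offset) follows the behaviour documented for that method in 'limma', and taking log2 before normalizing is an assumption made here for illustration.

```r
# Toy foreground intensities: 5 probes x 3 arrays (made-up numbers).
fg <- matrix(c(120, 300, 80, 1500, 60,
               150, 280, 95, 1700, 40,
               110, 350, 70, 1400, 90),
             nrow = 5,
             dimnames = list(paste0("probe", 1:5), paste0("array", 1:3)))
# Toy background: a constant 70 for every spot.
bg <- matrix(70, nrow = 5, ncol = 3, dimnames = dimnames(fg))

# "half"-style correction: subtract the background, reset anything
# below 0.5 to 0.5, then add a constant offset.
bg_correct_half <- function(fg, bg, offset = 0) {
  pmax(fg - bg, 0.5) + offset
}

# Quantile normalization between arrays: every column receives the
# same set of values (the row means of the column-wise sorted data),
# assigned according to the rank of each probe within its own column.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

corrected <- bg_correct_half(fg, bg, offset = 50)
# log2 before normalizing is an assumption of this sketch.
normalized <- quantile_normalize(log2(corrected))
print(round(normalized, 3))
```

After this step every array has the same empirical distribution of values, which is the defining property of quantile normalization.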
A 'target' file and the corresponding data files produced by the AFE image analysis software are required as inputs.
The preprocessing steps are the following:
- Reading the targets file
- Reading the array data samples obtained with AFE
- Background correction
- Normalization between samples
- Filtering probes by their Quality Flag
- Summarizing replicated probes
- Creating an ExpressionSet object with the processed data
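The idea behind summarizing replicated probes can be sketched in base R as follows. This is an illustration only: the helper summarize_by_median is hypothetical, the probe identifiers and intensities are made up, and the use of the median as the summary statistic is an assumption of this sketch rather than a statement about what the package does internally.

```r
# Hypothetical normalized log2 intensities: 6 features x 2 arrays.
expr <- matrix(c(7.1, 7.3, 9.0, 6.8, 9.4, 7.2,
                 7.0, 7.4, 9.1, 6.9, 9.3, 7.5),
               nrow = 6,
               dimnames = list(NULL, c("array1", "array2")))
# Made-up probe identifiers; P1 is spotted three times, P2 twice.
probe_name <- c("A_23_P1", "A_23_P1", "A_23_P2",
                "A_23_P3", "A_23_P2", "A_23_P1")

# Collapse all rows sharing a probe identifier into one row
# holding the per-array median (assumed summary statistic).
summarize_by_median <- function(x, ids) {
  groups <- split(seq_len(nrow(x)), ids)
  out <- t(vapply(groups,
                  function(idx) apply(x[idx, , drop = FALSE], 2, median),
                  numeric(ncol(x))))
  colnames(out) <- colnames(x)
  out
}

summarized <- summarize_by_median(expr, probe_name)
print(summarized)
```

The result has one row per distinct probe identifier, so downstream objects hold a single value per probe and array.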
The package also contains two specific functions that allow users to explore the architecture of the chip in terms of probe replication and gene replication. In the first case, it identifies non-control replicated probes (Probe Sets) that are spread over the chip for the purpose of evaluating its reproducibility. In the second case, it picks those genes (according to the ACCNUM code obtained from the corresponding Bioconductor annotation package) that are interrogated by different probes in different locations. These groups of genes are termed 'Gene Sets'.
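The two notions of replication can be illustrated with a small base-R sketch. All data below are made up, the probe and accession identifiers are hypothetical, and the coefficient of variation (standard deviation divided by mean) is assumed here as the reproducibility measure; the package functions may compute and report more than this.

```r
# Made-up raw intensities for one array and their probe identifiers.
raw <- c(200, 220, 180, 1500, 1450, 90)
probe_name <- c("P1", "P1", "P1", "P2", "P2", "P3")

# Probe replication: keep probes spotted more than once and
# compute the coefficient of variation across their replicates.
n_rep <- table(probe_name)
replicated <- names(n_rep)[n_rep > 1]
cv <- vapply(replicated, function(p) {
  v <- raw[probe_name == p]
  sd(v) / mean(v)
}, numeric(1))
print(round(cv, 4))

# Gene replication: a hypothetical probe-to-accession mapping in
# which two different probes interrogate the same accession.
accnum <- c(P1 = "NM_0001", P2 = "NM_0002", P3 = "NM_0001")
probes_by_gene <- split(names(accnum), accnum)
gene_sets <- probes_by_gene[lengths(probes_by_gene) > 1]
print(gene_sets)
```

A low coefficient of variation across the replicates of a probe suggests good within-chip reproducibility, while the accessions retained in gene_sets are those measured by more than one distinct probe.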
The package also contains standard graphical microarray utilities that allow users to evaluate the quality of the data. These graphics also help users decide which of the foreground and background signals provided by the AFE to use in the analysis. A graphical inspection of the data may also help to decide which background correction and between-sample normalization methods are most suitable.
There are also utility functions that write files at different stages of the processing protocol. These files include the list of probes, with information such as their quality flags, normalized intensities and the corresponding information obtained from the annotation package.
Pedro Lopez-Romero plopez@cnic.es
Agilent Feature Extraction Reference Guide. http://www.Agilent.com
Gordon K. Smyth, M. Ritchie, N. Thorne, J. Wettenhall (2007). limma: Linear Models for Microarray Data User's Guide.
Bolstad, B. M. (2001), Probe level quantile normalization of high density oligonucleotide array data. Unpublished Manuscript: http://bmbolstad.com/stuff/qnorm.pdf
Bolstad, B. M., Irizarry R. A., Astrand, M., and Speed, T. P. (2003), A comparison of normalization methods for high density oligonucleotide array data based on bias and variance. Bioinformatics 19, 185-193.
Smyth, G. K. (2005). Limma: linear models for microarray data. In: 'Bioinformatics and Computational Biology Solutions Using R and Bioconductor'. R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W. Huber (eds), Springer, New York, pages 397-420.
## Not run: 
# Reading the targets file and the Agilent Feature Extraction data files
targets=read.targets(infile="targets.txt")
dd=read.AgilentFE(targets,makePLOT=TRUE)
## End(Not run)

## Not run: 
data(dd)
data(targets)
## End(Not run)

# Non-control replicated probes
## Not run: 
CV.rep.probes(dd,"hgug4112a.db",
              foreground="MeanSignal",raw.data=TRUE,writeR=TRUE,targets)
## End(Not run)

# Genes replicated - ensembl
## Not run: 
genes.rpt.agi(dd,annotation.package="hgug4112a.db",raw.data=TRUE,
              WRITE.html=TRUE,REPORT=TRUE)
## End(Not run)

# NORMALIZATION (here the foreground and background are chosen)
## Not run: 
ddNORM=BGandNorm(dd,BGmethod='half',NORMmethod='quantile',
                 foreground='MeanSignal',background='BGMedianSignal',
                 offset=50,makePLOTpre=TRUE,makePLOTpost=TRUE)
## End(Not run)

# FILTERING PROBES
## Not run: 
ddFILT=filter.probes(ddNORM,
                     control=TRUE,
                     wellaboveBG=TRUE,
                     isfound=TRUE,
                     wellaboveNEG=TRUE,
                     sat=TRUE,
                     PopnOL=TRUE,
                     NonUnifOL=TRUE,
                     nas=TRUE,
                     limWellAbove=75,
                     limISF=75,
                     limNEG=75,
                     limSAT=75,
                     limPopnOL=75,
                     limNonUnifOL=75,
                     limNAS=100,
                     makePLOT=TRUE,annotation.package="hgug4112a.db",
                     flag.counts=TRUE,targets)
## End(Not run)

# SUMMARIZING PROBES
## Not run: 
ddPROC=summarize.probe(ddFILT,makePLOT=TRUE,targets)
## End(Not run)

# CREATING EXPRESSIONSET OBJECT
## Not run: 
esetPROC=build.eset(ddPROC,targets,makePLOT=TRUE,
                    annotation.package="hgug4112a.db")
dim(esetPROC)
## End(Not run)

# WRITING EXPRESSIONSET OBJECT: ProcessedData.txt
## Not run: 
write.eset(esetPROC,ddPROC,"hgug4112a.db",targets)
## End(Not run)

# MAPPING VARIABLE
## Not run: 
mappings=build.mappings(esetPROC,annotation.package="hgug4112a.db")
names(mappings)
## End(Not run)

# Gene Set Enrichment Analysis at: http://www.broad.mit.edu/gsea
## Not run: 
gsea.files(esetPROC,targets,annotation.package="hgug4112a.db")
## End(Not run)