run.plgem {plgem} | R Documentation |
This function automatically performs PLGEM fitting and evaluation, determination of observed and resampled PLGEM STN values, and selection of differentially expressed genes/proteins (DEG) using the PLGEM method.
run.plgem(esdata, signLev=0.001, rank=100, covariateNumb=1, baselineCondition=1, Iterations="automatic", fitting.eval=TRUE, plotFile=FALSE, writeFiles=FALSE, Verbose=FALSE)
esdata |
an object of class ExpressionSet; see Details for
important information on how the phenoData slot of this object will
be interpreted by the function. |
signLev |
numeric vector; significance level(s) for the DEG selection. Value(s) must be in (0,1). |
rank |
integer (or coercible to integer); the number of
genes or proteins to be selected according to their PLGEM-STN rank. Only
used if number of available replicates is too small to perform resampling
(see Details). |
covariateNumb |
integer (or coercible to integer); the
covariate used to determine which samples PLGEM is fitted on. |
baselineCondition |
integer (or coercible to integer); the
condition to be treated as the baseline. |
Iterations |
number of iterations for the resampling step; if "automatic", the number is determined automatically. |
fitting.eval |
logical; if TRUE, the fitting is evaluated
by generating a diagnostic plot. |
plotFile |
logical; if TRUE, the generated plot is written to a
file. |
writeFiles |
logical; if TRUE, the generated list of DEG is
written to disk file(s). |
Verbose |
logical; if TRUE, comments are printed while
running. |
The ‘covariateNumb’ covariate (the first one by default) of the phenoData of the ExpressionSet ‘esdata’ is expected to contain the necessary information about the experimental design. The values of this covariate must be sample labels, which have to be identical for samples to be treated as replicates. In particular, the ExpressionSet ‘esdata’ must have at least two conditions in the ‘covariateNumb’ covariate; by default the first one is considered the baseline.
The model is fitted on the most replicated condition. When several conditions share the maximum number of replicates, the condition providing the best fit is chosen.
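The replicate logic described above can be sketched in base R. This is only an illustration under stated assumptions: the `labels` vector is a hypothetical stand-in for the ‘covariateNumb’ column of the phenoData, and the code is not the package's internal implementation. It stops at listing the tied candidates; choosing among them by goodness of fit is left to the function itself.

```r
# Hypothetical sample labels, standing in for one phenoData covariate.
labels <- c("C", "C", "C", "LPS", "LPS", "LPS", "LPS_2h", "LPS_2h")

# Samples sharing a label are treated as replicates of one condition.
replicates <- table(labels)

# At least two conditions are required (a baseline plus one comparison).
stopifnot(length(replicates) >= 2)

# Conditions tied for the maximum number of replicates are the
# candidates for model fitting.
is_max <- as.vector(replicates) == max(replicates)
candidates <- names(replicates)[is_max]
print(candidates)
```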
If fewer than 3 replicates are provided for the condition used for fitting, then the selection is based on ranking according to the observed PLGEM STN values. In this case, the first ‘rank’ genes or proteins are selected for each comparison.
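A rank-based selection of this kind might look like the following base R sketch. The STN values are made up, `n_select` is a hypothetical name playing the role of ‘rank’, and ordering by absolute value is an assumption made for illustration; this page does not state how the package orders the values.

```r
# Hypothetical observed PLGEM STN values for one comparison (made-up numbers).
obs_stn <- c(geneA = 0.4, geneB = -12.1, geneC = 7.3, geneD = -0.9, geneE = 3.2)
n_select <- 2  # plays the role of the 'rank' argument

# Assumption: rank by absolute STN value, largest first.
ord <- order(abs(obs_stn), decreasing = TRUE)

# Keep the top entries, never more than are available.
selected <- obs_stn[ord[seq_len(min(n_select, length(obs_stn)))]]
print(selected)
```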
Otherwise, DEG are selected by comparing the observed and resampled PLGEM STN values at the ‘signLev’ significance level(s), based on p-values obtained via a call to the function plgem.pValue. See References for details.
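To illustrate selection at several significance levels, here is a minimal base R sketch. The p-values are invented rather than produced by plgem.pValue, `sign_lev` is a hypothetical name standing in for ‘signLev’, and the strict `<` comparison is an assumption.

```r
# Hypothetical p-values for one comparison (made-up numbers), named by gene.
p_values <- c(geneA = 0.2, geneB = 0.0004, geneC = 0.004, geneD = 0.6)
sign_lev <- c(0.001, 0.01)  # plays the role of 'signLev'

# Significance levels must lie strictly between 0 and 1.
stopifnot(all(sign_lev > 0 & sign_lev < 1))

# Assumption: a gene is selected when its p-value is below the level.
selected <- lapply(sign_lev, function(alpha) names(p_values)[p_values < alpha])
names(selected) <- as.character(sign_lev)
print(selected)
```

A stricter level can only shrink the selection, so the genes selected at 0.001 are a subset of those selected at 0.01.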
This function returns a list whose number of items equals the number of different significance levels (‘signLev’) given as input. Each item is itself a list, whose number of items corresponds to the number of performed comparisons, i.e. the number of conditions defined in the phenoData of ‘esdata’ minus the baseline. In each list item, the values are the observed PLGEM STN values of the significantly changing genes or proteins, named according to the rownames of the exprs of ‘esdata’.
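The nesting described above can be mimicked with a hand-built mock, which may help when writing code that walks the result. All element names and numbers below are hypothetical; the actual names used by the function are not specified on this page.

```r
# Mock of the return structure; element names and values are hypothetical.
deg_list <- list(
  "0.001" = list(
    "LPS_vs_C" = c(geneB = -12.1)
  ),
  "0.01" = list(
    "LPS_vs_C" = c(geneB = -12.1, geneC = 7.3)
  )
)

# Outer level: one item per significance level.
n_levels <- length(deg_list)

# Inner level: one item per comparison; count the selected genes in each.
n_selected <- lapply(deg_list, lengths)
print(n_selected)

# Names of the genes selected at the 0.01 level in the single mock comparison.
print(names(deg_list[["0.01"]][["LPS_vs_C"]]))
```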
Mattia Pelizzola mattia.pelizzola@gmail.com
Norman Pavelka nxp@stowers-institute.org
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17;5:203; http://www.biomedcentral.com/1471-2105/5/203
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2007 Nov 19; http://www.mcponline.org/cgi/content/abstract/M700240-MCP200v1
plgem.fit, plgem.obsStn, plgem.resampledStn, plgem.pValue, plgem.write.summary
data(LPSeset)
set.seed(123)
LPSdegList <- run.plgem(esdata=LPSeset)