plgem.fit {plgem} | R Documentation |
Function for fitting PLGEM to the exprSet ‘data’ and evaluating the goodness of fit. The fit uses the replicates of condition ‘fit.condition’, partitions the range of expression values into ‘p’ intervals, and uses the ‘q’-th quantile of the expression value standard deviations.
plgem.fit(data, fit.condition, p = 10, q = 0.5, fittingEval = FALSE, plot.file = FALSE, verbose = FALSE)
data |
an object of class ‘exprSet’ with a ‘conditionName’ covariate, see details. |
fit.condition |
number; the condition used for plgem fitting, according to the order of unique values of conditionName covariate. |
p |
number of intervals used to partition the expression value range. |
q |
number in [0,1]; the quantile of standard deviation used for PLGEM fitting. |
fittingEval |
logical; if TRUE, the fitting is evaluated generating a diagnostic plot. |
plot.file |
logical; if TRUE, a png file is written to the current working directory. |
verbose |
logical; if TRUE, comments are printed out while running. |
‘plgem.fit’ fits PLGEM on an expression set and optionally evaluates the goodness of fit. The Power Law Global Error Model (PLGEM) describes the relationship between the standard deviation and the mean of expression values in a set of replicated microarray samples as a power law:
ln(modeledSpread) = PLGEMslope * ln(mean) + PLGEMintercept
The exprSet ‘data’ must have a phenoData slot with a covariate called ‘conditionName’. The values of this covariate must be sample labels, and samples are treated as replicates only if their labels are identical. This function returns the ‘SLOPE’ and ‘INTERCEPT’ of this power law. It also returns the Pearson correlation coefficient ‘DATA.PEARSON’ of the linear model fitted on the original data, and the adjusted R squared ‘ADJ.R2.MP’ of the linear model fitted on the modelling points.
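The following base R sketch illustrates the kind of fit described above on simulated data. It is not the package's implementation: the simulated values, the equal-sized binning by ranked mean, the use of the bin-wise mean of ln(mean) as the x coordinate of each modelling point, and all variable names are assumptions made for illustration.

```r
# Simulated replicates: 2000 genes x 4 samples (all values are made up).
set.seed(1)
n.genes <- 2000
true.mean <- exp(runif(n.genes, min = 3, max = 10))
true.sd <- exp(0.8 * log(true.mean) - 1)   # spread follows a power law
replicates <- matrix(rnorm(n.genes * 4, mean = true.mean, sd = true.sd),
                     nrow = n.genes)

# Per-gene mean and standard deviation across replicates.
gene.mean <- rowMeans(replicates)
gene.sd <- apply(replicates, 1, sd)

p <- 10    # number of intervals
q <- 0.5   # quantile of the standard deviations

# Assumption: intervals hold (roughly) equal numbers of genes ranked by mean.
bin <- cut(rank(gene.mean, ties.method = "first"), breaks = p, labels = FALSE)

# One modelling point per interval, on the log-log scale.
model.x <- as.numeric(tapply(log(gene.mean), bin, mean))
model.y <- as.numeric(tapply(log(gene.sd), bin, quantile, probs = q))

# Linear fit on the modelling points: ln(spread) = slope * ln(mean) + intercept
fit <- lm(model.y ~ model.x)
slope <- unname(coef(fit)[2])
intercept <- unname(coef(fit)[1])
adj.r2 <- summary(fit)$adj.r.squared

# Correlation on the original (per-gene) data, also on the log-log scale.
data.pearson <- cor(log(gene.mean), log(gene.sd))
```

Because the simulated spread was generated with a slope of 0.8, the fitted ‘slope’ should land close to that value; the intercept is shifted slightly because a quantile of the sample standard deviation is modelled rather than the true standard deviation.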
If argument ‘fittingEval’ is TRUE, a diagnostic plot with four panels is generated to check the goodness of the PLGEM fit. The top-left panel shows the power law, characterized by ‘SLOPE’ and ‘INTERCEPT’. The top-right panel shows the distribution of model residuals. The bottom-left panel shows the contour plot of ranked residuals. The bottom-right panel shows the relationship between the distribution of observed residuals and the normal distribution. The goodness of the fit is judged principally by a horizontal, symmetric rank plot and a near-normal distribution of residuals.
‘plgem.fit’ returns a list of five numbers (see details):
SLOPE |
the slope of the fitted PLGEM. |
INTERCEPT |
the intercept of the fitted PLGEM. |
DATA.PEARSON |
the Pearson correlation coefficient of the linear model fitted on the original data. |
ADJ.R2.MP |
the adjusted R squared of PLGEM fitted on the modelling points. |
FIT.CONDITION |
the condition used for fitting PLGEM. |
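As a rough illustration of how the returned values relate to the power law in the details, the sketch below computes the modeled spread for a few mean expression values. The numbers in ‘fit’ are hypothetical stand-ins rather than output of ‘plgem.fit’, the helper ‘modeled.spread’ is not part of the package, and access by name with ‘$’ assumes the result behaves like a named list.

```r
# Hypothetical fit results standing in for the value of plgem.fit().
fit <- list(SLOPE = 0.8, INTERCEPT = -1)

# Illustrative helper (not part of plgem): invert the log-log relationship
# ln(modeledSpread) = SLOPE * ln(mean) + INTERCEPT
modeled.spread <- function(mean.expr, fit) {
  exp(fit$SLOPE * log(mean.expr) + fit$INTERCEPT)
}

# Modeled standard deviation at three mean expression levels.
spread <- modeled.spread(c(100, 1000, 10000), fit)
```

On the original scale, a tenfold increase in mean expression multiplies the modeled spread by 10^SLOPE.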
Mattia Pelizzola mattia.pelizzola@unimib.it and Norman Pavelka norman.pavelka@unimib.it
N. Pavelka et al., BMC Bioinformatics, 2004 Dec 17;5(1):203; http://www.genopolis.it
plgem.obsStn, plgem.resampledStn, plgem.deg, run.plgem
data(LPSeset)
LPSfit <- plgem.fit(data = LPSeset, fittingEval = TRUE)