iterateBMAglm.wrapper {iterativeBMA}R Documentation

Iterative Bayesian Model Averaging

Description

This function repeatedly calls bic.glm from the BMA package until all variables are exhausted. The data is assumed to consist of two classes. Logistic regression is used for classification.

Usage

iterateBMAglm.wrapper (sortedA, y, nbest=10, maxNvar=30, maxIter=20000, thresProbne0=1) 

Arguments

sortedA data matrix where columns are variables and rows are observations. The variables (columns) are assumed to be sorted using a univariate measure. In the case of gene expression data, the columns (variables) represent genes, while the rows (observations) represent samples or experiments.
y class vector for the observations (samples or experiments) in the training data. Class numbers are assumed to start from 0, and the length of this class vector should be equal to the number of rows in sortedA. Since we assume 2-class data, we expect the class vector consists of zero's and one's.
nbest a number specifying the number of models of each size returned to bic.glm in the BMA package. The default is 10.
maxNvar a number indicating the maximum number of variables used in each iteration of bic.glm from the BMA package. The default is 30.
maxIter a number indicating the maximum of iterations of bic.glm. The default is 20000.
thresProbne0 a number specifying the threshold for the posterior probability that each variable (gene) is non-zero (in percent). Variables (genes) with such posterior probability less than this threshold are dropped in the iterative application of bic.glm. The default is 1 percent.

Details

In this function, the variables are assumed to be sorted, and bic.glm is called repeatedly. In the first application of the bic.glm algorithm, the top maxNvar univariate ranked genes are used. After each application of the bic.glm algorithm, the genes with probne0 < thresProbne0 are dropped, and the next univariate ordered genes are added to the BMA window. The function iterateBMAglm.train calls BssWssFast before calling this function. Using this function, users can experiment with alternative univariate measures.

Value

If all variables are exhausted, an object of class bic.glm returned by the last iteration of bic.glm. Otherwise, -1 is returned. The object of class bic.glm is a list consisting of the following components:

namesx the names of the variables in the last iteration of bic.glm.
postprob the posterior probabilities of the models selected.
deviance the estimated model deviances.
label labels identifying the models selected.
bic values of BIC for the models.
size the number of independent variables in each of the models.
which a logical matrix with one row per model and one column per variable indicating whether that variable is in the model.
probne0 the posterior probability that each variable is non-zero (in percent).
postmean the posterior mean of each coefficient (from model averaging).
postsd the posterior standard deviation of each coefficient (from model averaging).
condpostmean the posterior mean of each coefficient conditional on the variable being included in the model.
condpostsd the posterior standard deviation of each coefficient conditional on the variable being included in the model.
mle matrix with one row per model and one column per variable giving the maximum likelihood estimate of each coefficient for each model.
se matrix with one row per model and one column per variable giving the standard error of each coefficient for each model.
reduced a logical indicating whether any variables were dropped before model averaging.
dropped a vector containing the names of those variables dropped before model averaging.
call the matched call that created the bma.lm object.

Note

The BMA and Biobase packages are required.

References

Raftery, A.E. (1995). Bayesian model selection in social research (with Discussion). Sociological Methodology 1995 (Peter V. Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells.

Yeung, K.Y., Bumgarner, R.E. and Raftery, A.E. (2005) Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21: 2394-2402.

See Also

iterateBMAglm.train, iterateBMAglm.train.predict, iterateBMAglm.train.predict.test, BssWssFast

Examples

library (Biobase)
library (BMA)
library (iterativeBMA)
data(trainData)
data(trainClass)

## Use the BSS/WSS ratio to rank all genes in the training data
sorted.vec <- BssWssFast (t(exprs(trainData)), trainClass, numClass = 2)
## get the top ranked 50 genes
sorted.train.dat <- t(exprs(trainData[sorted.vec$ix[1:50], ]))
 
## run iterative bic.glm
ret.bic.glm <- iterateBMAglm.wrapper (sorted.train.dat, y=trainClass)

## The above commands are equivalent to the following 
ret.bic.glm <- iterateBMAglm.train (train.expr.set=trainData, trainClass, p=50)


[Package iterativeBMA version 1.0.0 Index]