compBoostCMA {CMA} | R Documentation
Description:

Roughly speaking, boosting combines 'weak learners' in a weighted manner into a stronger ensemble. The 'weak learners' here are linear functions in one component (variable), as proposed by Buehlmann and Yu (2003). This approach also induces sparsity and can be used for variable selection alone (see GeneSelection).

For S4 method information, see compBoostCMA-methods.
Usage:

compBoostCMA(X, y, f, learnind, loss = c("binomial", "exp", "quadratic"),
             mstop = 100, nu = 0.1, ...)
Arguments:

X: Gene expression data. Can be one of the following:
   - a matrix (rows correspond to observations, columns to variables),
   - a data.frame (together with the formula f),
   - an object of class ExpressionSet.
y: Class labels. Can be one of the following:
   - a numeric vector with values 0 to K-1, where K is the total number of
     different classes in the learning set,
   - a factor,
   - a character specifying the phenotype variable if X is an ExpressionSet.
f: A two-sided formula, if X is a data.frame. The left part corresponds to
   the class labels, the right part to the variables.
learnind: An index vector specifying the observations that belong to the
   learning set. May be missing; in that case, the learning set consists of
   all observations and predictions are made on the learning set.
loss: Character specifying the loss function: one of "binomial" (LogitBoost),
   "exp" (AdaBoost), "quadratic" (L2Boost).
mstop: Number of boosting iterations, i.e. the number of updates to perform.
   The default (100) does not necessarily produce good results; tuning this
   argument via tune is therefore highly recommended.
nu: Shrinkage factor applied to the update steps; defaults to 0.1. In most
   cases it suffices to set nu to a very low value and to concentrate on
   optimizing mstop.
...: Currently unused arguments.
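For orientation, with the two class labels recoded as y in {-1, +1} and f denoting the current boosting fit, the three loss options correspond (up to constants; this is a sketch of the standard definitions, not text taken from the package) to:

```latex
\begin{align*}
L_{\text{binomial}}(\tilde y, f)  &= \log\bigl(1 + e^{-2\tilde y f}\bigr) && \text{(LogitBoost)}\\
L_{\text{exp}}(\tilde y, f)       &= e^{-\tilde y f}                      && \text{(AdaBoost)}\\
L_{\text{quadratic}}(\tilde y, f) &= \tfrac{1}{2}(\tilde y - f)^2         && \text{(L2Boost)}
\end{align*}
```

All three penalize fits f whose sign disagrees with the label; the exponential loss does so most aggressively, which is why LogitBoost is often preferred on noisy data.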
Details:

The method is partly based on code from the package mboost by T. Hothorn and P. Buehlmann. The algorithm for the multiclass case is described in Lutz and Buehlmann (2006) as 'rowwise updating'.
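To make the componentwise idea concrete, the following is a minimal, hypothetical sketch of componentwise boosting with the quadratic loss (L2Boost): in each of mstop iterations, a univariate least-squares base learner is fitted to the current residuals in every component separately, the best-fitting component is selected, and nu times its fit is added to the ensemble. The function name compBoostL2sketch and all implementation details are illustrative assumptions, not code from CMA or mboost.

```r
## Sketch of componentwise L2 boosting (illustrative, not the CMA implementation).
compBoostL2sketch <- function(X, y, mstop = 100, nu = 0.1) {
  Xc <- scale(X, center = TRUE, scale = FALSE)  # center columns: base learners have no intercept
  n <- nrow(Xc); p <- ncol(Xc)
  f <- rep(mean(y), n)          # start from the offset (mean of y)
  selected <- integer(mstop)    # record which component is updated at each step
  for (m in seq_len(mstop)) {
    u <- y - f                  # residuals = negative gradient of the L2 loss
    coefs <- numeric(p); rss <- numeric(p)
    for (j in seq_len(p)) {
      ## univariate least-squares coefficient for component j
      ## (assumes no constant zero columns after centering)
      b <- sum(Xc[, j] * u) / sum(Xc[, j]^2)
      coefs[j] <- b
      rss[j] <- sum((u - b * Xc[, j])^2)
    }
    jstar <- which.min(rss)                    # component reducing the RSS most
    f <- f + nu * coefs[jstar] * Xc[, jstar]   # shrunken update
    selected[m] <- jstar
  }
  list(fitted = f, selected = selected)
}
```

Because each update touches a single variable, components that are never selected drop out of the model entirely, which is the source of the sparsity and variable-selection behavior mentioned above.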
Value:

An object of class clvarseloutput.
Author(s):

Martin Slawski martin.slawski@campus.lmu.de

Anne-Laure Boulesteix http://www.slcmsr.net/boulesteix
References:

Buehlmann, P., Yu, B. (2003).
Boosting with the L2 loss: Regression and Classification.
Journal of the American Statistical Association, 98, 324-339.

Buehlmann, P., Hothorn, T.
Boosting: A statistical perspective.
Statistical Science (to appear).

Lutz, R., Buehlmann, P. (2006).
Boosting for high-multivariate responses in high-dimensional linear regression.
Statistica Sinica, 16, 471-494.
See Also:

dldaCMA, ElasticNetCMA, fdaCMA, flexdaCMA, gbmCMA, knnCMA, ldaCMA, LassoCMA, nnetCMA, pknnCMA, plrCMA, pls_ldaCMA, pls_lrCMA, pls_rfCMA, pnnCMA, qdaCMA, rfCMA, scdaCMA, shrinkldaCMA, svmCMA
Examples:

### load Golub AML/ALL data
data(golub)
### extract class labels
golubY <- golub[,1]
### extract gene expression
golubX <- as.matrix(golub[,-1])
### select learningset
ratio <- 2/3
set.seed(111)
learnind <- sample(length(golubY), size=floor(ratio*length(golubY)))
### run componentwise (logit)-boosting (not tuned)
result <- compBoostCMA(X=golubX, y=golubY, learnind=learnind, mstop = 500)
### show results
show(result)
ftable(result)
plot(result)

### multiclass example:
### load Khan data
data(khan)
### extract class labels
khanY <- khan[,1]
### extract gene expression
khanX <- as.matrix(khan[,-1])
### select learningset
set.seed(111)
learnind <- sample(length(khanY), size=floor(ratio*length(khanY)))
### run componentwise multivariate (logit)-boosting (not tuned)
result <- compBoostCMA(X=khanX, y=khanY, learnind=learnind, mstop = 1000)
### show results
show(result)
ftable(result)
plot(result)