pls_lrCMA {CMA}    R Documentation

Partial Least Squares followed by logistic regression

Description

This method constructs a classifier that extracts Partial Least Squares components and uses them as the covariates in a binary logistic regression model. The Partial Least Squares components are computed by the package plsgenomics.

For S4 method information, see pls_lrCMA-methods.
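The two-step idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by CMA or plsgenomics: the helper pls_scores is hypothetical, the data are simulated, and the logistic regression is an ordinary unpenalised glm fit, so the L2 penalty controlled by lambda is left out.

```r
# Minimal PLS1 (NIPALS-style) score extraction in base R.
# Hypothetical helper, not part of CMA or plsgenomics.
# X: numeric matrix (observations in rows), y: 0/1 response.
pls_scores <- function(X, y, ncomp = 2) {
  Xk <- scale(X, center = TRUE, scale = FALSE)  # column-centre X
  yc <- y - mean(y)                             # centre the response
  scores <- matrix(0, nrow(Xk), ncomp)
  for (k in seq_len(ncomp)) {
    w <- crossprod(Xk, yc)               # weights: covariance of each variable with y
    w <- w / sqrt(sum(w^2))              # normalise to unit length
    tk <- Xk %*% w                       # k-th component (score vector)
    pk <- crossprod(Xk, tk) / sum(tk^2)  # loadings of X on the component
    Xk <- Xk - tk %*% t(pk)              # deflate X before the next component
    scores[, k] <- tk
  }
  colnames(scores) <- paste0("comp", seq_len(ncomp))
  scores
}

# Simulated two-class data (an assumption for illustration only).
set.seed(1)
n <- 60; p <- 10
y <- rep(0:1, each = n / 2)
X <- matrix(rnorm(n * p), n, p)
X[y == 1, 1:3] <- X[y == 1, 1:3] + 0.8   # shift three variables in class 1

# Step 1: extract two PLS components.
Z <- pls_scores(X, y, ncomp = 2)
# Step 2: use them as covariates in a binary logistic regression.
fit <- glm(y ~ comp1 + comp2, data = data.frame(y = y, Z), family = binomial())
prob <- fitted(fit)                      # estimated P(class 1) per observation
pred <- as.integer(prob > 0.5)
table(observed = y, predicted = pred)
```

Because each component is computed after deflating X, the score vectors are mutually orthogonal, which keeps the logistic regression well conditioned even when the original variables are strongly correlated.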

Usage

pls_lrCMA(X, y, f, learnind, comp = 2, lambda = 1e-4, plot = FALSE)

Arguments

X Gene expression data. Can be one of the following:
  • A matrix. Rows correspond to observations, columns to variables.
  • A data.frame, when f is not missing (see below).
  • An object of class ExpressionSet.
y Class labels. Can be one of the following:
  • A numeric vector.
  • A factor.
  • A character string naming the phenotype variable, if X is an ExpressionSet.
  • missing, if X is a data.frame and a proper formula f is provided.
WARNING: The class labels will be re-coded to range from 0 to K-1, where K is the total number of different classes in the learning set.
f A two-sided formula, if X is a data.frame. The left part corresponds to the class labels, the right part to the variables.
learnind An index vector specifying the observations that belong to the learning set. May be missing; in that case, the learning set consists of all observations and predictions are made on the learning set.
comp Number of Partial Least Squares components to extract. Default is 2, which can be suboptimal depending on the dataset. Can be optimized using tune.
lambda Parameter controlling the amount of L2 penalization for logistic regression, usually taken to be a small value in order to stabilize estimation in the case of separable data.
plot If comp <= 2, should the classification space of the Partial Least Squares components be plotted? Default is FALSE.
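The roles of learnind and of the label re-coding mentioned for y can be illustrated with a small base R sketch. The labels below are made up, and the ordering of the codes (sorted label values) is an assumption; the ordering CMA applies internally is not stated here.

```r
# Hypothetical raw inputs: 9 observations with character class labels.
y_raw <- c("ALL", "AML", "ALL", "ALL", "AML", "AML", "ALL", "AML", "ALL")
n <- length(y_raw)

# learnind: indices of the observations used for fitting.
set.seed(111)
learnind <- sort(sample(n, size = floor(2 / 3 * n)))
testind <- setdiff(seq_len(n), learnind)   # everything else is held out

# Recode labels to 0, ..., K-1, with K counted on the learning set.
# Assumption: codes follow the sorted label values.
lev <- sort(unique(y_raw[learnind]))
K <- length(lev)
y_coded <- match(y_raw, lev) - 1L

# Only the two-class case is supported, so check K before fitting.
stopifnot(K == 2)
```

Keeping lev around makes it possible to translate predicted codes back to the original labels, for example with lev[y_coded + 1].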

Value

An object of class cloutput.

Note

Up to now, only the two-class case is supported.

Author(s)

Martin Slawski martin.slawski@campus.lmu.de

Anne-Laure Boulesteix http://www.slcmsr.net/boulesteix

References

Boulesteix, A.L., Strimmer, K. (2007). Partial least squares: a versatile tool for the analysis of high-dimensional genomic data. Briefings in Bioinformatics 7:32-44.

See Also

compBoostCMA, dldaCMA, ElasticNetCMA, fdaCMA, flexdaCMA, gbmCMA, knnCMA, ldaCMA, LassoCMA, nnetCMA, pknnCMA, plrCMA, pls_ldaCMA, pls_rfCMA, pnnCMA, qdaCMA, rfCMA, scdaCMA, shrinkldaCMA, svmCMA

Examples

### load the CMA package and the Golub AML/ALL data
library(CMA)
data(golub)
### extract class labels
golubY <- golub[,1]
### extract gene expression
golubX <- as.matrix(golub[,-1])
### select learning set (two thirds of the observations)
ratio <- 2/3
set.seed(111)
learnind <- sample(length(golubY), size=floor(ratio*length(golubY)))
### run PLS, combined with logistic regression
result <- pls_lrCMA(X=golubX, y=golubY, learnind=learnind)
### show results
show(result)
ftable(result)
plot(result)

[Package CMA version 1.0.0 Index]