xval-methods {MLInterfaces}    R Documentation

support for cross-validatory machine learning with exprSets

Description

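xval applies an MLInterfaces classification method to an exprSet under leave-one-out, leave-group-out, or user-defined cross-validation, with optional feature selection in each iteration. balKfold(K) generates a partition function that can be supplied as indFun.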

Usage

xval( data, classLab, proc, xvalMethod, group, indFun, niter, fsFun=NULL, fsNum=NULL, decreasing=TRUE, ... )
balKfold(K)

Arguments

data instance of class exprSet
classLab character string identifying the phenoData variable to be regarded as the class label
proc an MLInterfaces method that returns an instance of classifOutput
xvalMethod character string identifying cross-validation procedure to use: default is "LOO" (leave one out), alternatives are "LOG" (leave group out) and "FUN" (user-supplied partition extraction function, see Details below)
group a vector (length equal to number of samples) enumerating groups for LOG xval method
indFun a function that returns the set of indices to be used as the training set in a given iteration (the remaining samples are held out); this function must have parameters data, clab, iternum; see Details
niter number of iterations for user-specified partition function to be run
fsFun function computing ranks of features for feature selection
fsNum number of features to be kept for learning in each iteration
decreasing logical; should be TRUE if fsFun gives high scores to high-performing features (e.g., the absolute value of a test statistic) and FALSE if it gives low scores to high-performing features (e.g., the p-value of a test).
... arguments passed to the MLInterfaces generic proc
K number of partitions to be used if balKfold is used as indFun
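The role of group in the "LOG" method can be sketched in base R. This is only an illustration of the partitioning idea, under the assumption that each distinct group value is held out once while the remaining samples are used for training; it is not taken from the xval source, and the name logSplits is hypothetical.

```r
# Hypothetical group vector: 72 samples in 8 groups of 9
# (the same shape as rep(1:8, each=9) in the Examples).
group <- as.integer(rep(1:8, each = 9))

# Assumption: for each distinct group value, the members of that group are
# held out and all other samples form the training set.
logSplits <- lapply(sort(unique(group)), function(g) {
  test <- which(group == g)
  list(test = test, train = setdiff(seq_along(group), test))
})

length(logSplits)       # one split per group
logSplits[[1]]$test     # samples held out in the first split
```

Under this reading, the number of cross-validation iterations equals the number of distinct values in group.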

Details

If xvalMethod is "FUN", then indFun must be a function with parameters data, clab, and iternum. This function returns the indices that identify the training set for the cross-validation iteration passed as the value of iternum. The source of balKfold, which generates such a function, is printed when the Examples on this page are run.
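A partition function of the required shape might look like the following minimal sketch. The generator interleavedKfold is hypothetical and is not part of MLInterfaces; it assumes that data[[clab]] yields the class-label vector with one entry per sample, and, unlike what the name balKfold suggests, it makes no attempt to balance classes across folds. A data frame stands in for the exprSet so the snippet runs without Bioconductor.

```r
# Hypothetical generator, modelled on the calling pattern indFun = balKfold(5).
# Assumption: data[[clab]] returns the class labels, one per sample.
interleavedKfold <- function(K) {
  function(data, clab, iternum) {
    n <- length(data[[clab]])
    fold <- rep(seq_len(K), length.out = n)  # fold membership 1, 2, ..., K, 1, 2, ...
    which(fold != iternum)                   # training indices for this iteration
  }
}

# Stand-in for an exprSet: a data frame holding only the label column.
toy <- data.frame(ALL.AML = rep(c("ALL", "AML"), times = c(6, 4)))
f <- interleavedKfold(5)
f(toy, "ALL.AML", 1)   # every sample except those assigned to fold 1
```

With such a generator, niter would normally be set to K so that each fold is held out exactly once.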

If fsFun is not NULL, then it must be a function with two arguments: the first can be transformed to a feature matrix (rows are objects, columns are features) and the second is a vector of class labels. The function returns a vector of scores, one for each feature. The scores are interpreted according to the value of decreasing in order to select fsNum features. Thanks to Stephen Henderson of University College London for this functionality.
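How decreasing and fsNum interact with the scores can be illustrated in base R. This is a sketch of the selection rule as described above, not the package's internal code; the helper pickFeatures and the score values are hypothetical.

```r
# Hypothetical scores for five features (names and values are illustrative).
tstat <- c(g1 = 0.4,  g2 = 3.1,   g3 = 1.7,  g4 = 2.6,  g5 = 0.2)   # larger is better
pval  <- c(g1 = 0.70, g2 = 0.004, g3 = 0.10, g4 = 0.01, g5 = 0.85)  # smaller is better

# Keep the positions of the fsNum best features under the stated direction.
pickFeatures <- function(scores, fsNum, decreasing = TRUE) {
  order(scores, decreasing = decreasing)[seq_len(fsNum)]
}

pickFeatures(tstat, fsNum = 2, decreasing = TRUE)    # positions of g2 and g4
pickFeatures(pval,  fsNum = 2, decreasing = FALSE)   # the same two features
```

Passing the wrong value of decreasing would select the worst-scoring features instead, so it should always be matched to the kind of score fsFun returns.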

Examples

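library(golubEsets)
data(golubMerge)
# work with a small subset: rows (features) 200 to 250
smallG <- golubMerge[200:250,]
# leave-one-out cross-validation of knnB; group is only a placeholder here
lk1 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", group=as.integer(0))
table(lk1,smallG$ALL.AML)
# leave-group-out: group assigns the samples to 8 groups of 9
lk2 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOG", group=as.integer(
 rep(1:8,each=9)))
table(lk2,smallG$ALL.AML)
# print balKfold to see how a partition function is generated
balKfold
# user-supplied partition function: balKfold(5) as indFun, run for 5 iterations
lk3 <- xval(smallG, "ALL.AML", knnB, xvalMethod="FUN", 0:0, indFun=balKfold(5), niter=5)
table(lk3, smallG$ALL.AML)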
#
# illustrate the xval FUN method in comparison to LOO
#
LOO2 <- xval(smallG, "ALL.AML", knnB, "FUN", 0:0, function(x,y,i) {
  (1:ncol(exprs(x)))[-i] }, niter=72 )
table(lk1, LOO2)
#
# use Stephen Henderson's feature selection extensions
#
t.fun<-function(data, fac)
{
        require(genefilter)
        # deal with the integer storage of the Golub expression values (exprs)!
        xd <- matrix(as.double(exprs(data)), nrow=nrow(exprs(data)))
        return(abs(rowttests(xd,data[[fac]], tstatOnly=FALSE)$statistic))
}
lk3f <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", 0:0, fsFun=t.fun)
table(lk3f$out, smallG$ALL.AML)

[Package MLInterfaces version 1.2.1 Index]