xval-methods {MLInterfaces} | R Documentation |
support for cross-validatory machine learning with exprSets
xval(data, classLab, proc, xvalMethod, group, indFun, niter, fsFun=NULL, fsNum=NULL, decreasing=TRUE, ...)
balKfold(K)
data |
instance of class exprSet |
classLab |
character string identifying phenoData variable to be regarded as the class label |
proc |
an MLInterfaces method that returns an instance of classifOutput |
xvalMethod |
character string identifying cross-validation procedure to use: default is "LOO" (leave one out), alternatives are "LOG" (leave group out) and "FUN" (user-supplied partition extraction function, see Details below) |
group |
a vector (length equal to number of samples) enumerating groups for LOG xval method |
indFun |
a function that returns the set of indices to be used as the training set in a given iteration;
this function must have parameters data, clab, iternum; see Details |
niter |
number of iterations for user-specified partition function to be run |
fsFun |
function computing ranks of features for feature selection |
fsNum |
number of features to be kept for learning in each iteration |
decreasing |
logical, should be TRUE if fsFun provides high scores for high-performing features
(e.g., the absolute value of a test statistic) and FALSE if it provides low scores
for high-performing features (e.g., the p-value of a test). |
... |
arguments passed to the MLInterfaces generic proc |
K |
number of partitions to be used if balKfold is used as indFun |
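The group argument for the "LOG" method can be illustrated with a small base R sketch. It only shows how a grouping vector induces leave-group-out partitions, using the same grouping as this page's example (72 samples in 8 groups of 9). The variable name splits is hypothetical, and how xval handles the partitions internally is an assumption, not something this page documents.

```r
# Grouping vector: one integer per sample, 8 groups of 9 samples each.
group <- as.integer(rep(1:8, each = 9))

# For each distinct group value, hold that group out as the test set
# and train on all remaining samples.
# Assumption: this mirrors the idea of "leave group out", not xval's own code.
splits <- lapply(sort(unique(group)), function(g) {
  list(test = which(group == g), train = which(group != g))
})

length(splits)              # number of cross-validation iterations
splits[[1]]$test            # samples held out in the first iteration
length(splits[[1]]$train)   # samples used for training in the first iteration
```

Under this reading, the number of iterations equals the number of distinct values in group, and every sample is held out exactly once.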
If xvalMethod is "FUN", then indFun must be a function with parameters data, clab, and iternum. This function returns indices that identify the training set for a given cross-validation iteration passed as the value of iternum. An example function is printed out when the example of this page is executed.
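A minimal sketch of a user-written partition function with the required parameters data, clab, and iternum may help here. The constructor makeInterleavedKfold is hypothetical and not part of MLInterfaces. It assumes that data[[clab]] yields one class label per sample (the same accessor is used in the t.fun example on this page), and it uses a plain data.frame as a stand-in for an exprSet so that it runs without Bioconductor.

```r
# Hypothetical helper (not part of MLInterfaces): builds an indFun-style
# function that assigns samples to K folds in round-robin order.
makeInterleavedKfold <- function(K) {
  function(data, clab, iternum) {
    # Assumption: data[[clab]] returns one class label per sample.
    n <- length(data[[clab]])
    fold <- rep(seq_len(K), length.out = n)  # fold id of each sample
    which(fold != iternum)                   # indices of the training set
  }
}

# Stand-in for an exprSet: 10 samples with a class label column.
toy <- data.frame(ALL.AML = rep(c("ALL", "AML"), times = c(6, 4)))

f <- makeInterleavedKfold(5)
f(toy, "ALL.AML", 1)   # training indices for iteration 1
```

Such a function would be supplied as indFun=makeInterleavedKfold(5) together with xvalMethod="FUN" and niter=5, in the same way balKfold(5) is supplied in the example on this page. This sketch does not try to balance class proportions across folds.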
If fsFun is not NULL, then it must be a function with two arguments: the first can be transformed to a feature matrix (rows are objects, columns are features) and the second is a vector of class labels. The function returns a vector of scores, one for each feature. The scores are interpreted according to the value of decreasing to select fsNum features. Thanks to Stephen Henderson of University College London for this functionality.
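The interplay of fsFun, decreasing, and fsNum can be sketched in base R. The scoring function meanDiffScore is hypothetical: it follows the description above (a feature matrix with objects in rows, plus a vector of class labels) and returns one score per feature (column), which is what selecting fsNum features requires. The final ranking step with order() is an assumption about how scores might be turned into a selection, not a copy of xval's internals. Note that the t.fun example on this page receives an exprSet and the name of the label variable instead, so the exact calling convention should be checked against the installed version.

```r
# Hypothetical scoring function: absolute difference of class means,
# one score per feature (column). Assumes exactly two classes.
meanDiffScore <- function(x, labels) {
  x <- as.matrix(x)
  labels <- factor(labels)
  stopifnot(nlevels(labels) == 2)
  m1 <- colMeans(x[labels == levels(labels)[1], , drop = FALSE])
  m2 <- colMeans(x[labels == levels(labels)[2], , drop = FALSE])
  abs(m1 - m2)
}

# Toy data: 4 objects (rows), 3 features (columns).
x <- cbind(f1 = c(1, 2, 1, 2), f2 = c(0, 0, 5, 5), f3 = c(1, 1, 3, 3))
labels <- c("a", "a", "b", "b")

scores <- meanDiffScore(x, labels)

# High scores mark informative features here, so decreasing = TRUE.
fsNum <- 2
keep <- order(scores, decreasing = TRUE)[seq_len(fsNum)]
colnames(x)[keep]   # names of the selected features
```

For a score where small values are better, such as a p-value, the same ranking step would use decreasing = FALSE.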
library(golubEsets)
data(golubMerge)
smallG <- golubMerge[200:250,]
lk1 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", group=as.integer(0))
table(lk1, smallG$ALL.AML)
lk2 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOG", group=as.integer(rep(1:8, each=9)))
table(lk2, smallG$ALL.AML)
balKfold
lk3 <- xval(smallG, "ALL.AML", knnB, xvalMethod="FUN", 0:0, indFun=balKfold(5), niter=5)
table(lk3, smallG$ALL.AML)
#
# illustrate the xval FUN method in comparison to LOO
#
LOO2 <- xval(smallG, "ALL.AML", knnB, "FUN", 0:0,
    function(x,y,i) { (1:ncol(exprs(x)))[-i] }, niter=72)
table(lk1, LOO2)
#
# use Stephen Henderson's feature selection extensions
#
t.fun <- function(data, fac) {
    require(genefilter)
    # deal with the integer storage of golubTrain@exprs!
    xd <- matrix(as.double(exprs(data)), nrow=nrow(exprs(data)))
    return(abs(rowttests(xd, data[[fac]], tstatOnly=FALSE)$statistic))
}
lk3f <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", 0:0, fsFun=t.fun)
table(lk3f$out, smallG$ALL.AML)