varSel.highest.var.eSRG {MCRestimate} | R Documentation |
A collection of functions for variable selection and clustering. These functions are mainly used by the function MCRestimate.
identity(sample.gene.matrix, classfactor, ...)
varSel.highest.t.stat(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=500, ...)
varSel.highest.t.stat.eSRG(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=500, ...)
varSel.highest.var(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=2000, ...)
varSel.highest.var.eSRG(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=2000, ...)
varSel.green.int.max.eSRG(sample.gene.matrix, classfactor, theParameter=NULL, lambda=0.5, ...)
varSel.green.int.sec.eSRG(sample.gene.matrix, classfactor, theParameter=NULL, lambda=0.5, ...)
varSel.AUC(sample.gene.matrix, classfactor, theParameter=NULL, var.numbers=200, ...)
cluster.kmeans.mean(sample.gene.matrix, classfactor, theParameter=NULL, number.clusters=500, ...)
varSel.removeManyNA(sample.gene.matrix, classfactor, theParameter=NULL, NAthreshold=0.25, ...)
varSel.impute.NA(sample.gene.matrix, classfactor, theParameter=NULL, ...)
varSel.svm.rfe(sample.gene.matrix, classfactor, theParameter=NULL, ...)
sample.gene.matrix |
a matrix in which the rows correspond to genes and the columns correspond to samples |
classfactor |
a factor containing the values that should be predicted |
theParameter |
Parameter whose meaning depends on the function. For
'cluster.kmeans.mean' either NULL or an output of the function
kmeans. If it is NULL, kmeans will be used to
form clusters of the genes; otherwise the already existing clusters
will be used. In both cases the metagene intensities are
calculated afterwards. For the other functions either
NULL or a logical vector which indicates for every gene whether it should
be left out of further analysis or not |
number.clusters |
parameter which specifies the number of clusters |
var.numbers |
some methods need an argument which specifies how many variables should be selected |
lambda |
additional parameter for some methods (in the usage, the varSel.green.int.* functions, with default 0.5) |
NAthreshold |
numeric (default 0.25 in the usage); if the fraction of NA values of a variable is higher than this threshold, the variable will be deleted |
... |
Further parameters |
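To illustrate how an NA threshold of this kind can work, here is a minimal base R sketch that drops genes (rows) whose fraction of missing values exceeds the threshold. The helper name `remove_many_na` and the convention that `TRUE` marks a gene to keep are assumptions made for this example; it is not the code of varSel.removeManyNA.

```r
# Hypothetical helper, not part of MCRestimate: drop rows with too many NAs.
remove_many_na <- function(sample.gene.matrix, NAthreshold = 0.25) {
  # Fraction of missing values per gene (row)
  na.fraction <- rowMeans(is.na(sample.gene.matrix))
  # Assumption for this sketch: TRUE means "keep this gene"
  keep <- na.fraction <= NAthreshold
  list(matrix = sample.gene.matrix[keep, , drop = FALSE], parameter = keep)
}

# 3 genes (rows) by 4 samples (columns)
m <- matrix(c( 1, NA, 3, 4,
              NA, NA, 3, 4,
               1,  2, 3, 4), nrow = 3, byrow = TRUE)
res <- remove_many_na(m, NAthreshold = 0.25)
res$parameter  # FALSE for the second gene only
```

The first gene has exactly 25% missing values, which is not higher than the threshold, so it is kept.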
cluster.kmeans.mean performs a k-means clustering with the number of clusters specified by 'number.clusters' and takes the mean of each cluster.
varSel.highest.var selects a number (specified by 'var.numbers') of variables with the highest variance.
varSel.AUC chooses the most discriminating variables according to the AUC criterion (the package ROC is required).
Some variable selection functions only work with MCRestimate.exprSetRG (their names end with .eSRG), and others only work with MCRestimate.default (names without .eSRG).
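The two selection criteria described above can be sketched in base R as follows. The helper names `select_highest_var` and `gene_auc` are hypothetical, the AUC is computed from ranks (the Mann-Whitney identity) instead of using the ROC package, and a two-class factor is assumed; this is an illustration of the idea, not the MCRestimate implementation.

```r
# Hypothetical helpers, not the MCRestimate implementations.

# Keep the var.numbers genes (rows) with the highest variance.
select_highest_var <- function(sample.gene.matrix, var.numbers = 2000) {
  gene.var <- apply(sample.gene.matrix, 1, var, na.rm = TRUE)
  n.keep <- min(var.numbers, nrow(sample.gene.matrix))
  keep <- rep(FALSE, nrow(sample.gene.matrix))
  keep[order(gene.var, decreasing = TRUE)[seq_len(n.keep)]] <- TRUE
  list(matrix = sample.gene.matrix[keep, , drop = FALSE], parameter = keep)
}

# AUC of one gene for a two-class factor, via the rank-sum identity.
gene_auc <- function(x, classfactor) {
  cl <- as.integer(factor(classfactor))  # classes coded 1 and 2
  r <- rank(x)
  n1 <- sum(cl == 1)
  n2 <- sum(cl == 2)
  (sum(r[cl == 2]) - n2 * (n2 + 1) / 2) / (n1 * n2)
}

set.seed(1)
# 3 genes by 10 samples: low variance, high variance, perfectly separating
m <- rbind(rnorm(10, sd = 0.1), rnorm(10, sd = 5), c(1:5, 11:15))
cl <- factor(rep(c("a", "b"), each = 5))

sel <- select_highest_var(m, var.numbers = 2)  # drops the low-variance gene
auc <- apply(m, 1, gene_auc, classfactor = cl) # third gene has AUC 1
```

To rank genes by how discriminating they are in either direction, one option is to order them by `abs(auc - 0.5)`.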
varSel.svm.rfe performs feature selection by SVM-RFE (recursive feature elimination) using a linear kernel. The number of selected features is optimised by internal cross-validation.
Literature: Guyon et al. (2002), Machine Learning 46, 389-422.
Every function returns a list consisting of two components:
matrix |
the result matrix of the variable reduction or the clustering |
parameter |
The parameters needed to reproduce the algorithm: if a gene reduction is performed, a logical vector which indicates for every gene whether it will be left out of further analysis or not; for the clustering algorithm, the output of the function kmeans. |
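The following base R sketch shows one way such a matrix/parameter pair can be produced and reused for the clustering case: the kmeans output from a first call is passed back as `theParameter`, so that new samples are summarised with the same gene clusters. The function `cluster_mean_sketch` is a hypothetical stand-in written for this example and may differ from what cluster.kmeans.mean does internally.

```r
# Hypothetical helper, not the MCRestimate implementation of cluster.kmeans.mean.
cluster_mean_sketch <- function(sample.gene.matrix, theParameter = NULL,
                                number.clusters = 3) {
  # Cluster the genes (rows) only when no earlier kmeans result is supplied
  if (is.null(theParameter)) {
    theParameter <- kmeans(sample.gene.matrix, centers = number.clusters)
  }
  # Metagene intensity: mean over the genes of each cluster, per sample
  metagenes <- rowsum(sample.gene.matrix, theParameter$cluster) /
    as.vector(table(theParameter$cluster))
  list(matrix = metagenes, parameter = theParameter)
}

set.seed(42)
train <- matrix(rnorm(30 * 4), nrow = 30)  # 30 genes, 4 training samples
test  <- matrix(rnorm(30 * 2), nrow = 30)  # same 30 genes, 2 new samples

fit <- cluster_mean_sketch(train, number.clusters = 3)
new <- cluster_mean_sketch(test, theParameter = fit$parameter)
dim(new$matrix)  # 3 metagenes by 2 samples
```

Reusing the stored parameter keeps the gene-to-cluster assignment fixed, so metagenes computed on new samples are comparable with those computed on the original samples.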
Markus Ruschhaupt (m.ruschhaupt@dkfz.de), Patrick Warnat (p.warnat@dkfz-heidelberg.de)
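library(MCRestimate)
# 30 genes (rows) by 2 samples (columns), simulated around three intensity levels
m <- matrix(c(rnorm(10,2,0.5), rnorm(10,4,0.5), rnorm(10,7,0.5),
              rnorm(10,2,0.5), rnorm(10,4,0.5), rnorm(10,2,0.5)), ncol=2)
# group the genes into 3 clusters and take the mean of each cluster
cluster.kmeans.mean(m, number.clusters=3)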