sam.snp {siggenes} | R Documentation |
Performs a SAM (Significance Analysis of Microarrays) analysis for categorical data such as SNP data.
sam.snp(data, cl, B = 1000, med = FALSE, delta = NULL, n.delta = 10, p0 = NA, lambda = seq(0, 0.95, 0.05), ncs.value = "max", ncs.weights = NULL, gene.names = dimnames(data)[[1]], q.version = 1, na.replace = TRUE, check.levels = TRUE, rand = NA)
data |
a matrix or data frame. Each row must correspond to a SNP, and each column to a sample |
cl |
a numeric vector of length ncol(data) indicating to which class
each sample belongs. The recommended way of specifying cl is to use the
integers between 1 and g, where g is the number of different groups,
or, in the two-class case, 0's and 1's |
B |
the number of permutations used in the estimation of the null distribution |
med |
if FALSE (default), the mean number of falsely called SNPs
will be computed. Otherwise, the median number is calculated |
delta |
a numeric vector specifying a set of values for the threshold
Delta that should be used. If NULL, n.delta
Delta values will be computed automatically |
n.delta |
a numeric value specifying the number of Delta values
that will be computed over the range of possible values of Delta
if delta is not specified |
p0 |
a numeric value specifying the prior probability pi0
that a SNP is not differentially expressed. If NA, p0 will
be computed by the function pi0.est |
lambda |
a numeric vector or value specifying the lambda
values used in the estimation of the prior probability. For details, see
?pi0.est |
ncs.value |
a character string. Only used if lambda is a
vector. Either "max" or "paper". For details, see ?pi0.est |
ncs.weights |
a numerical vector of the same length as lambda
containing the weights used in the estimation of pi0. By default
no weights are used. For details, see ?pi0.est |
gene.names |
a character vector of length nrow(data) containing the
names of the SNPs. By default the row names of data are used |
q.version |
a numeric value indicating which version of the q-value should
be computed. If q.version=2, the original version of the q-value, i.e.
min{pFDR}, will be computed. If q.version=1, min{FDR} will be used
in the calculation of the q-value. Otherwise, the q-value is not computed.
For details, see ?qvalue.cal |
na.replace |
if TRUE, the missing values of a SNP will be replaced
by random draws from the empirical distribution of that SNP |
check.levels |
if TRUE, a check is performed to verify that all variables/SNPs have
the same number of levels/categories |
rand |
numeric value. If specified, i.e. not NA, the random number generator
will be set into a reproducible state |
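The input layout and the idea behind na.replace can be illustrated with a minimal base-R sketch. The toy matrix `snp`, the helper `replace_na_row`, and the use of `set.seed` as a stand-in for `rand` are assumptions made for illustration; this is not necessarily how sam.snp performs the replacement internally.

```r
# Hypothetical toy data: 5 SNPs (rows) x 8 samples (columns), genotypes coded 1, 2, 3.
set.seed(123)  # stand-in for the reproducible state that `rand` is described as setting
snp <- matrix(sample(1:3, 5 * 8, replace = TRUE), nrow = 5,
              dimnames = list(paste0("SNP", 1:5), paste0("S", 1:8)))
cl <- rep(c(0, 1), each = 4)  # two-class labels, one per column of snp

# Knock out a few genotypes to mimic missing calls.
snp[1, 2] <- NA
snp[4, c(3, 7)] <- NA

# Replace the missing values of one SNP by random draws from the
# observed (non-missing) genotypes of that same SNP.
# Assumes every SNP has at least one observed genotype.
replace_na_row <- function(x) {
  miss <- is.na(x)
  if (any(miss)) {
    obs <- x[!miss]
    # Index-based sampling avoids sample()'s special case for length-1 numeric input.
    x[miss] <- obs[sample.int(length(obs), sum(miss), replace = TRUE)]
  }
  x
}

# apply() returns the rows as columns, so transpose back to SNPs x samples.
snp_filled <- t(apply(snp, 1, replace_na_row))
```

Drawing from the observed values of the same row keeps the genotype frequencies of each SNP roughly intact, which is the property the description of na.replace points to.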
For each SNP, Pearson's chi-square statistic is computed to test whether the distribution of the SNP differs between several groups. Since the assumptions of the chi-square approximation are very likely not fulfilled, a permutation-based method is used to estimate the null distribution. Because only one null distribution is estimated for all SNPs, as proposed in the original SAM procedure of Tusher et al. (2001), all SNPs must have the same number of levels/categories.
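The general idea of a permutation-based null distribution for per-SNP chi-square statistics can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the siggenes implementation: the helper `chisq_stat`, the simulated data, and the pooled raw p-values are hypothetical, and the Delta thresholds and FDR estimation of the full SAM procedure are left out.

```r
# Pearson chi-square statistic for one SNP (vector x) and class labels cl.
# geno_levels fixes the genotype categories so every table has the same shape.
chisq_stat <- function(x, cl, geno_levels) {
  obs <- table(factor(x, levels = geno_levels), cl)
  expctd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
  keep <- expctd > 0  # skip cells with zero expected count to avoid 0/0
  sum((obs[keep] - expctd[keep])^2 / expctd[keep])
}

set.seed(42)
geno_levels <- 1:3
# Simulated data without any group effect: 50 SNPs x 20 samples.
snp <- matrix(sample(geno_levels, 50 * 20, replace = TRUE), nrow = 50)
cl <- rep(1:2, each = 10)

# Observed statistics, one per SNP.
d_obs <- apply(snp, 1, chisq_stat, cl = cl, geno_levels = geno_levels)

# Permutation null: shuffle the class labels B times and recompute all statistics.
B <- 200
d_perm <- replicate(B, {
  cl_perm <- sample(cl)
  apply(snp, 1, chisq_stat, cl = cl_perm, geno_levels = geno_levels)
})  # matrix with one row per SNP and one column per permutation

# Pool the permuted statistics of all SNPs into a single null distribution
# and derive raw permutation p-values from it.
p_perm <- sapply(d_obs, function(d) mean(d_perm >= d))
```

Pooling the permuted statistics across SNPs only makes sense when all statistics share the same reference distribution, which is why every SNP needs the same number of categories.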
an object of class SAM
This procedure will only work correctly if all SNPs/variables have the same number of levels/categories.
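Before running the analysis, the requirement above can be verified by hand. The following base-R sketch counts the observed categories per SNP; the helper `n_levels_per_row` and the toy matrix are hypothetical and only illustrate the kind of check that check.levels is described as performing.

```r
# Number of distinct non-missing categories observed in each row of a matrix.
n_levels_per_row <- function(data) {
  apply(data, 1, function(x) length(unique(x[!is.na(x)])))
}

# Hypothetical toy data: the third SNP shows only two of the three genotypes.
snp <- rbind(SNP1 = c(1, 2, 3, 1, 2, 3),
             SNP2 = c(3, 3, 2, 1, 1, 2),
             SNP3 = c(1, 1, 2, 2, 1, NA))

n_lev <- n_levels_per_row(snp)

# TRUE only if every SNP shows the same number of categories.
same_levels <- length(unique(n_lev)) == 1

# Names of the SNPs with fewer categories than the maximum observed.
offending <- names(n_lev)[n_lev != max(n_lev)]
```

SNPs listed in `offending` could be removed or analyzed separately so that the remaining rows share one null distribution.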
SAM was developed by Tusher et al. (2001).
!!! There is a patent pending for the SAM technology at Stanford University. !!!
Holger Schwender, holger.schw@gmx.de
Schwender, H. (2004). Modifying Microarray Analysis Methods for Categorical Data – SAM and PAM for SNPs. To appear in: Proceedings of the 28th Annual Conference of the GfKl.
Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. PNAS, 98, 5116-5121.
SAM-class, sam, sam.dstat, sam.wilc