snp.rhs.tests {snpMatrix} | R Documentation
This function fits a generalized linear model with phenotype as the dependent variable and, optionally, one or more potential confounders of a phenotype-genotype association as independent variables. A series of SNPs (or small groups of SNPs) is then tested for additional association with phenotype. To protect against misspecification of the variance function, "robust" tests may be selected.
snp.rhs.tests(formula, family = "binomial", link, weights, subset,
              data = parent.frame(), snp.data, rules = NULL, tests = NULL,
              robust = FALSE,
              control = glm.test.control(maxit = 20, epsilon = 1.e-4, R2Max = 0.98),
              allow.missing = 0.01, score = FALSE)
formula |
The base model formula, with phenotype as dependent variable |
family |
A string defining the generalized linear model family. This currently should (partially) match one of "binomial", "Poisson", "Gaussian" or "gamma" (case-insensitive) |
link |
A string defining the link function for the GLM. This currently should (partially) match one of "logit", "log", "identity" or "inverse". The default action is to use the "canonical" link for the family selected |
data |
The data frame in which the base model is to be fitted |
snp.data |
An object of class "snp.matrix" or "X.snp.matrix" containing the SNP data |
rules |
An object of class "snp.reg.imputation". If supplied, the rules coded in this object are used, together with snp.data, to calculate tests for imputed SNPs |
tests |
Either a vector of SNP names (or numbers) for the SNPs to be tested, or a list of short vectors defining groups of SNPs to be tested (see Details) |
weights |
"Prior" weights in the generalized linear model |
subset |
Array defining the subset of rows of data to use |
robust |
If TRUE, robust tests will be carried out |
control |
An object giving parameters for the IRLS algorithm used to fit the base model and for the acceptable aliasing amongst new terms to be tested. See glm.test.control |
allow.missing |
The maximum proportion of SNP genotypes that can be missing before it becomes necessary to refit the base model |
score |
Is extended score information to be returned? |
The tests used are asymptotic chi-squared tests based on the vector of first and second derivatives of the log-likelihood with respect to the parameters of the additional model. The "robust" form is a generalized score test in the sense discussed by Boos (1992). The "base" model is fitted first, and a score test is then performed for the addition of one or more SNP genotypes to the model. Homozygous SNP genotypes are coded 0 or 2 and heterozygous genotypes are coded 1. For SNPs on the X chromosome, males are coded as if they were homozygous females. For X SNPs, it will often be appropriate to include the sex of the subject in the base model (this is not done automatically).
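The following base R sketch illustrates the kind of score test described above for a single SNP added to a logistic base model. It is not the package's implementation: the data are simulated, all variable names (sex, g, y) are made up for the illustration, and only the ordinary (non-robust) model-based variance is shown.

```r
# Simulated data; every name here is illustrative, not part of snpMatrix.
set.seed(1)
n   <- 500
sex <- rbinom(n, 1, 0.5)               # confounder kept in the base model
g   <- rbinom(n, 2, 0.3)               # genotype coded 0, 1, 2
y   <- rbinom(n, 1, plogis(-0.5 + 0.4 * sex + 0.3 * g))

# 1. Fit the base model, which does not contain the SNP.
base <- glm(y ~ sex, family = binomial)
mu   <- fitted(base)
w    <- mu * (1 - mu)                  # binomial variance at the fitted means
Z    <- model.matrix(base)             # design matrix of the base model

# 2. Score (first derivative) for the SNP term, evaluated at the base fit.
U <- sum(g * (y - mu))

# 3. Variance of the score, adjusted for the estimated base-model parameters.
ZWg <- crossprod(Z, w * g)
V   <- sum(w * g^2) - drop(t(ZWg) %*% solve(crossprod(Z, w * Z), ZWg))

# 4. Asymptotic chi-squared test on 1 degree of freedom.
chi2 <- U^2 / V
p    <- pchisq(chi2, df = 1, lower.tail = FALSE)

# Cross-check against the Rao score test provided by stats::anova().
full <- glm(y ~ sex + g, family = binomial)
rao  <- anova(base, full, test = "Rao")
all.equal(chi2, rao[2, "Rao"], tolerance = 1e-4)
```

Only the base model is fitted iteratively here; the SNP enters through U and V alone, which is what makes it cheap to test many SNPs against one base model.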
If a data argument is supplied, the snp.data and data objects are aligned by rowname. Otherwise all variables in the model formula are assumed to be stored in the same order as the rows of the snp.data object.
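As a rough illustration of what alignment by rowname means, the base R sketch below reorders a phenotype data frame to follow the row names of a genotype matrix. A plain numeric matrix stands in for snp.data, and the object names (geno, pheno) are invented for the example; how the package performs the alignment internally is not shown here.

```r
# Toy stand-ins: a plain matrix plays the role of snp.data (subjects in rows).
geno <- matrix(c(0, 1, 2, 1, 0, 2), nrow = 3,
               dimnames = list(c("s1", "s2", "s3"), c("snpA", "snpB")))

# Phenotype data stored in a different subject order.
pheno <- data.frame(cc = c(1, 0, 1), row.names = c("s3", "s1", "s2"))

# Reorder the phenotype rows so that they follow the genotype row names.
idx           <- match(rownames(geno), rownames(pheno))
pheno_aligned <- pheno[idx, , drop = FALSE]
```

Without a data argument no such matching is possible, so the variables in the formula must already be in the same subject order as snp.data.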
Usually SNPs to be used in tests will be referenced by name. However, they can also be referenced by number: a positive number indicates the appropriate column in the input snp.data, and a negative number indicates (minus) a position in the rules list. Tests involving more than one SNP can use a mixture of observed and imputed SNPs. If the tests argument is missing, single SNP tests are carried out; if a rules object is given, all imputed SNP tests are calculated, otherwise all SNPs in the input snp.data matrix are tested. Note, however, that for single SNP tests the function single.snp.tests will often achieve the same result much faster.
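The different forms of the tests argument might be used as sketched below. The sketch assumes that the snpMatrix package is installed and reuses the testdata objects (subject.data, Autosomes) from the example on this page; the particular SNP numbers and groupings are arbitrary choices made for illustration.

```r
library(snpMatrix)
data(testdata)   # provides subject.data and Autosomes

# Single-SNP tests, with SNPs referenced by column number.
by_number <- snp.rhs.tests(cc ~ strata(region), family = "binomial",
                           data = subject.data, snp.data = Autosomes,
                           tests = 1:10)

# The same SNPs referenced by name (names taken from the column names).
by_name <- snp.rhs.tests(cc ~ strata(region), family = "binomial",
                         data = subject.data, snp.data = Autosomes,
                         tests = colnames(Autosomes)[1:10])

# Joint tests for small groups of SNPs: one test per list element.
by_group <- snp.rhs.tests(cc ~ strata(region), family = "binomial",
                          data = subject.data, snp.data = Autosomes,
                          tests = list(1:3, 4:6, 7:10))

print(by_group)
```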
An object of class snp.tests.glm or snp.tests.glm.score, depending on whether score is set to FALSE or TRUE in the call.
A factor (or several factors) may be included as arguments to the function strata(...) in the formula. This fits all interactions of the factors so included, but leads to faster computation than fitting these in the normal way. Additionally, a cluster(...) call may be included in the base model formula. This identifies clusters of potentially correlated observations (e.g. for members of the same family); in this case, an appropriate robust estimate of the variance of the score test is used.
David Clayton david.clayton@cimr.cam.ac.uk
Boos, Dennis D. (1992) On generalized score tests. The American Statistician, 46:327-333.
snp.tests.glm-class, snp.tests.glm.score-class, single.snp.tests, snp.lhs.tests, impute.snps, snp.reg.imputation-class, snp.matrix-class, X.snp.matrix-class
data(testdata)
slt3 <- snp.rhs.tests(cc ~ strata(region), family = "binomial",
                      data = subject.data, snp.data = Autosomes, tests = 1:10)
print(slt3)