toptable {limma}    R Documentation

Table of Top Genes from Linear Model Fit

Description

Extract a table of the top-ranked genes from a linear model fit.

Usage

topTable(fit,coef=NULL,number=10,genelist=fit$genes,adjust.method="BH",sort.by="B",resort.by=NULL,p.value=1,lfc=0)
toptable(fit,coef=1,number=10,genelist=NULL,A=NULL,eb=NULL,adjust.method="BH",sort.by="B",resort.by=NULL,p.value=1,lfc=0,...)
topTableF(fit,number=10,genelist=fit$genes,adjust.method="BH")

Arguments

fit list containing a linear model fit produced by lmFit, lm.series, gls.series or mrlm. For topTable, fit should be an object of class MArrayLM as produced by lmFit and eBayes.
coef column number or column name specifying which coefficient or contrast of the linear model is of interest. Can also be a vector of column subscripts, in which case the gene ranking is by F-statistic for that set of contrasts.
number maximum number of genes to list
genelist data frame or character vector containing gene information. For topTable only, this defaults to fit$genes.
A matrix of A-values or vector of average A-values. For topTable only, this defaults to fit$Amean.
eb output list from ebayes(fit). If NULL, this will be automatically generated.
adjust.method method used to adjust the p-values for multiple testing. Options, in increasing conservatism, include "none", "BH", "BY" and "holm". See p.adjust for the complete list of options. A NULL value will result in the default adjustment method, which is "BH".
sort.by character string specifying statistic to rank genes by. Possibilities are "logFC", "A", "T", "t", "P", "p" or "B". "M" is allowed as a synonym for "logFC" for backward compatibility.
resort.by character string specifying statistic to sort the selected genes by in the output data.frame. Possibilities are "logFC", "A", "T", "t", "P", "p" or "B". "M" is allowed as a synonym for "logFC" for backward compatibility.
p.value cutoff value for adjusted p-values. Only genes with lower p-values are listed.
lfc cutoff value for log2-fold-change. Only genes with larger fold changes are listed.
... any other arguments are passed to ebayes if eb is NULL

Details

Note that toptable is an earlier interface and is retained only for backward compatibility.

This function summarizes a linear model fit object produced by lmFit, lm.series, gls.series or mrlm by selecting the top-ranked genes for any given contrast. topTable() assumes that the linear model fit has already been processed by eBayes().

The p-values for the coefficient/contrast of interest are adjusted for multiple testing by a call to p.adjust. The "BH" method, which controls the expected false discovery rate (FDR) below the specified value, is the default adjustment method because it is the most likely to be appropriate for microarray studies. Note that the adjusted p-values from this method are bounds on the FDR rather than p-values in the usual sense. Because they relate to FDRs rather than rejection probabilities, they are sometimes called q-values. See help("p.adjust") for more information.
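As an illustration of the adjustment step, the sketch below applies p.adjust from the stats package to a small vector of p-values and checks the "BH" result against the step-up formula computed by hand. The p-values are invented for illustration and do not come from any fit.

```r
# Hypothetical raw p-values, invented for illustration only
p <- c(0.001, 0.008, 0.039, 0.041, 0.60)

# Benjamini-Hochberg adjustment as performed by p.adjust
adj <- p.adjust(p, method = "BH")

# The same quantity by hand: scale the i-th smallest p-value by n/i,
# then enforce monotonicity working down from the largest p-value
n <- length(p)
o <- order(p, decreasing = TRUE)
by.hand <- numeric(n)
by.hand[o] <- pmin(1, cummin(p[o] * n / (n:1)))

all.equal(adj, by.hand)
```

Each adjusted value is the smallest FDR level at which the corresponding gene would be declared significant.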

Note that if there is no good evidence for differential expression in the experiment, it is quite possible for all the adjusted p-values to be large, or even for all of them to be equal to one. The latter can happen when the smallest p-value is no smaller than 1/ngenes, where ngenes is the number of genes with non-missing p-values.
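The following sketch illustrates this situation with invented p-values for ngenes = 10, none of which is smaller than 1/10. With method "holm" every adjusted p-value is then equal to one; with "BH" the adjusted values need not reach one but are all large.

```r
# Hypothetical p-values for ngenes = 10, all at least 1/ngenes = 0.1
p <- c(0.12, 0.25, 0.31, 0.48, 0.52, 0.66, 0.71, 0.83, 0.90, 0.97)

holm <- p.adjust(p, method = "holm")  # every value is 1
bh   <- p.adjust(p, method = "BH")    # every value is 0.97

holm
bh
```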

The sort.by argument specifies the criterion used to select the top genes. The choices are: "logFC" to sort by the (absolute) coefficient representing the log-fold-change; "A" to sort by average expression level (over all arrays) in descending order; "T" or "t" for absolute t-statistic; "P" or "p" for p-values; or "B" for the lods or B-statistic.

Normally the genes appear in order of selection in the output table. If one wants the table to be in a different order, the resort.by argument may be used. For example, topTable(fit, sort.by="B", resort.by="logFC") selects the top genes according to log-odds of differential expression and then orders the resulting genes by log-ratio in decreasing order. Or topTable(fit, sort.by="logFC", resort.by="logFC") would select the genes by absolute log-ratio and then sort them by signed log-ratio from most positive to most negative.
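The two-stage logic of sort.by and resort.by can be mimicked on an ordinary data frame. This is only a sketch of the selection and reordering described above, not the code used inside topTable; the data frame stats and its values are invented for illustration.

```r
# Hypothetical per-gene statistics, invented for illustration
stats <- data.frame(
  ID    = paste("Gene", 1:6),
  logFC = c(-2.1,  0.3, 1.7, -0.2, 2.5, -1.2),
  B     = c( 3.0, -4.1, 4.2, -5.0, 1.1,  2.4)
)
number <- 3

# Stage 1 (sort.by="B"): keep the 'number' genes with the largest B
top <- stats[order(stats$B, decreasing = TRUE), ][seq_len(number), ]

# Stage 2 (resort.by="logFC"): reorder the selected genes by signed
# log-fold-change, most positive first
top <- top[order(top$logFC, decreasing = TRUE), ]
top
```

Gene 5 has the largest log-fold-change but is not selected, because selection is by B; resorting only changes the order of the genes already chosen.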

topTableF ranks genes on the basis of the moderated F-statistic rather than t-statistics. If topTable is called with coef of length greater than 1, then the specified columns will be extracted from fit and topTableF called on the result. topTable with coef=NULL is the same as topTableF, unless the fitted model fit has only one column.

Value

A data frame with one row for each of the number top genes and the following columns:

genelist if genelist was included as input
logFC estimate of the log2-fold-change corresponding to the effect or contrast
AveExpr average log2-expression for the probe over all arrays and channels, same as Amean in the MArrayLM object
t moderated t-statistic
P.Value raw p-value
adj.P.Val adjusted p-value or q-value
B log odds that the gene is differentially expressed

Note

This is not the right function to use to create summary statistics for all the probes on an array. Please consider using write.fit or write for this purpose, rather than using topTable with number=nrow(fit).

Author(s)

Gordon Smyth

See Also

An overview of linear model and testing functions is given in 06.LinearModels. See also p.adjust in the stats package.

Examples

#  See lmFit examples
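For convenience, here is a minimal sketch in the spirit of the lmFit examples. The simulated data, the seed and the design matrix are invented for illustration, and the calls assume the limma package is installed; the exact columns returned may differ between limma versions.

```r
library(limma)

# Simulate log-expression values for 100 genes on 6 arrays
# (all numbers here are invented for illustration)
set.seed(2007)
y <- matrix(rnorm(100 * 6, sd = 0.3), 100, 6)
rownames(y) <- paste("Gene", 1:100)
# Make the first two genes differentially expressed in arrays 4 to 6
y[1:2, 4:6] <- y[1:2, 4:6] + 2

# Two-group design: intercept plus group difference
design <- cbind(Grp1 = 1, Grp2vs1 = c(0, 0, 0, 1, 1, 1))

# topTable expects a fit that has been processed by eBayes
fit <- lmFit(y, design)
fit <- eBayes(fit)

# Top genes for the group difference, selected by B-statistic
tab <- topTable(fit, coef = "Grp2vs1", number = 5)
tab

# Select by B-statistic, then order the selected genes by log-fold-change
topTable(fit, coef = 2, number = 5, sort.by = "B", resort.by = "logFC")

# A vector of coefficients ranks genes by moderated F-statistic
topTable(fit, coef = 1:2, number = 5)
```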

[Package limma version 2.12.0 Index]