get.siggenes {maSigPro}                R Documentation

Extract significant genes for sets of variables in time series gene expression experiments

Description

This function creates lists of significant genes for a set of variables whose significance value has been computed with the T.fit function.

Usage

get.siggenes(tstep, rsq = 0.7, add.IDs = FALSE, IDs = NULL, matchID.col = 1, 
             only.names = FALSE, vars = c("all", "each", "groups"), 
             groups.vector = NULL, trat.repl.spots = "none", 
             index = IDs[, (matchID.col + 1)], match = IDs[, matchID.col], r = 0.7)

Arguments

tstep: a T.fit object
rsq: cut-off level for the R-squared value of the stepwise regression fit. Only genes with an R-squared greater than rsq are selected
add.IDs: logical indicating whether to include additional gene IDs in the result
IDs: matrix containing additional gene ID information (required when add.IDs is TRUE)
matchID.col: number of the matching column in matrix IDs for adding gene IDs
only.names: logical. If TRUE, expression values are omitted from the results
vars: variables for which to extract significant genes (see Details)
groups.vector: required when vars is "groups"
trat.repl.spots: treatment given to replicate spots. Possible values are "none" and "average"
index: argument of the average.rows function to use when trat.repl.spots is "average"
match: argument of the average.rows function to use when trat.repl.spots is "average"
r: minimum Pearson correlation coefficient for replicated spot profiles to be averaged
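
The rsq cut-off can be pictured as a plain filter on per-gene R-squared values. The following is a minimal base R sketch, not the package's own code: the vector r.squared and its values are made up for illustration and do not come from a T.fit object.

```r
# Hypothetical per-gene R-squared values (illustration only, not T.fit output)
r.squared <- c(feature1 = 0.95, feature2 = 0.70, feature3 = 0.42, feature4 = 0.81)
rsq <- 0.7

# Keep only genes whose R-squared is greater than the cut-off
selected <- names(r.squared)[r.squared > rsq]
selected
# "feature1" "feature4"
```

A gene whose R-squared equals the cut-off exactly (feature2 here) is not selected, following the "greater than" wording of the rsq argument.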

Details

There are 3 possible values for the vars argument:
    "all": generates one single matrix or gene list with all significant genes.
    "each": generates as many significant genes extractions as variables in the general regression model. Each extraction contains the significant genes for that variable.
    "groups": generates a significant genes extraction for each experimental group.
The difference between "each" and "groups" is that in the first case the variables of the same group (e.g. "TreatmentA" and "time*TreatmentA") will be extracted separately and in the second case jointly.
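
The contrast between "each" and "groups" can be sketched in base R by grouping variable names. This only illustrates the idea; the variable names and group labels are hypothetical, and the code is not how get.siggenes is implemented.

```r
# Hypothetical model variables and the group each one belongs to (illustration only)
variables     <- c("TreatmentA", "time*TreatmentA", "TreatmentB", "time*TreatmentB")
groups.vector <- c("TreatmentA", "TreatmentA",      "TreatmentB", "TreatmentB")

# vars = "each": one extraction per variable
each.sets <- setNames(as.list(variables), variables)

# vars = "groups": variables that share a group label are taken jointly
group.sets <- split(variables, groups.vector)

length(each.sets)   # 4 extractions, one per variable
length(group.sets)  # 2 extractions, one per experimental group
group.sets$TreatmentA
```

Here "TreatmentA" and "time*TreatmentA" end up in separate extractions under "each" and in the same extraction under "groups".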

When add.IDs is TRUE, a matrix of gene IDs must be provided as the IDs argument; its matchID.col column must have the same levels as the row names of sig.profiles. Setting only.names to TRUE generates a vector of significant genes, or a matrix when add.IDs is also set to TRUE.

When trat.repl.spots is "average", match and index vectors are required for the average.rows function. In the context of gene expression data, the index vector would contain gene IDs and indicate which spots are replicates. The match vector is used to match these gene IDs to rows in the significant genes matrix, and must have the same levels as the row names of sig.profiles.
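
The roles of the match and index vectors can be illustrated with a small base R sketch. It is an assumption-laden simplification and not the average.rows implementation: the data are made up, the names match.ids and index.ids are hypothetical, and the correlation threshold r and missing values are ignored.

```r
# Hypothetical significant genes matrix: 4 spots x 3 arrays (illustration only)
sig.profiles <- matrix(c(1, 3, 5, 7,
                         2, 4, 6, 8,
                         3, 5, 7, 9),
                       nrow = 4,
                       dimnames = list(paste("feature", 1:4, sep = ""),
                                       paste("Array", 1:3, sep = "")))

# IDs: column 1 matches the row names (match), column 2 holds the gene IDs (index)
IDs <- cbind(paste("feature", 1:4, sep = ""),
             rep(paste("gene", 1:2, sep = ""), each = 2))
match.ids <- IDs[, 1]
index.ids <- IDs[, 2]

# Look up the gene ID of every spot, then average the spots that share a gene ID
gene.of.spot <- index.ids[match(rownames(sig.profiles), match.ids)]
sums     <- rowsum(sig.profiles, gene.of.spot)
counts   <- rowsum(rep(1, nrow(sig.profiles)), gene.of.spot)
averaged <- sums / as.vector(counts)
averaged
```

In this toy case feature1 and feature2 are replicate spots of gene1, so the gene1 row of averaged is the mean of their two profiles.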

add.IDs = TRUE and trat.repl.spots = "average" are not compatible argument values. add.IDs = TRUE and only.names = TRUE are compatible argument values.

Value

summary: a vector or matrix listing significant genes for the variables given by the function parameters
sig.genes: a list with detailed information on the significant genes found for the variables given by the function parameters. Each element of the list is also a list containing:
    sig.profiles: expression values of significant genes
    coefficients: regression coefficients for significant genes
    t.score: t.score of significant genes
    sig.pvalues: p-values of the regression coefficients for significant genes
    g: number of genes
    ...: arguments passed by previous functions

Author(s)

Ana Conesa, aconesa@ivia.es; María José Nueda, mj.nueda@ua.es

References

Conesa, A., Nueda, M.J., Ferrer, A., Talón, M. 2005. maSigPro: a Method to Identify Significant Differential Expression Profiles in Time-Course Microarray Experiments.

Examples


#### GENERATE TIME COURSE DATA
## generate n random gene expression profiles of a data set with 
## one control plus 3 treatments, 3 time points and r replicates per time point.

tc.GENE <- function(n, r,
             var11 = 0.01, var12 = 0.01, var13 = 0.01,
             var21 = 0.01, var22 = 0.01, var23 = 0.01,
             var31 = 0.01, var32 = 0.01, var33 = 0.01,
             var41 = 0.01, var42 = 0.01, var43 = 0.01,
             a1 = 0, a2 = 0, a3 = 0, a4 = 0,
             b1 = 0, b2 = 0, b3 = 0, b4 = 0,
             c1 = 0, c2 = 0, c3 = 0, c4 = 0)
{

  tc.dat <- NULL
  for (i in 1:n) {
    Ctl <- c(rnorm(r, a1, var11), rnorm(r, b1, var12), rnorm(r, c1, var13))  # Ctl group
    Tr1 <- c(rnorm(r, a2, var21), rnorm(r, b2, var22), rnorm(r, c2, var23))  # Tr1 group
    Tr2 <- c(rnorm(r, a3, var31), rnorm(r, b3, var32), rnorm(r, c3, var33))  # Tr2 group
    Tr3 <- c(rnorm(r, a4, var41), rnorm(r, b4, var42), rnorm(r, c4, var43))  # Tr3 group
    gene <- c(Ctl, Tr1, Tr2, Tr3)
    tc.dat <- rbind(tc.dat, gene)
  }
  tc.dat
}
## Create 270 flat profiles
flat <- tc.GENE(n = 270, r = 3)
## Create 10 genes with profile differences between Ctl and Tr1 groups
twodiff <- tc.GENE(n = 10, r = 3, b2 = 0.5, c2 = 1.3)
## Create 10 genes with profile differences between Ctl, Tr2, and Tr3 groups
threediff <- tc.GENE(n = 10, r = 3, b3 = 0.8, c3 = -1, a4 = -0.1, b4 = -0.8, c4 = -1.2)
## Create 10 genes with profile differences between Ctl and Tr2 and different variance
vardiff <- tc.GENE(n = 10, r = 3, a3 = 0.7, b3 = 1, c3 = 1.2, var32 = 0.03, var33 = 0.03)
## Create dataset
tc.DATA <- rbind(flat, twodiff, threediff, vardiff)
rownames(tc.DATA) <- paste("feature", c(1:300), sep = "")
colnames(tc.DATA) <- paste("Array", c(1:36), sep = "")
tc.DATA[sample(c(1:(300*36)), 300)] <- NA  # introduce missing values

#### CREATE EXPERIMENTAL DESIGN
Time <- rep(c(rep(c(1:3), each = 3)), 4)
Replicates <- rep(c(1:12), each = 3)
Control <- c(rep(1, 9), rep(0, 27))
Treat1 <- c(rep(0, 9), rep(1, 9), rep(0, 18))
Treat2 <- c(rep(0, 18), rep(1, 9), rep(0,9))
Treat3 <- c(rep(0, 27), rep(1, 9))
edesign <- cbind(Time, Replicates, Control, Treat1, Treat2, Treat3)
rownames(edesign) <- paste("Array", c(1:36), sep = "")

tc.p <- p.vector(tc.DATA, design = make.design.matrix(edesign), Q = 0.01) 
tc.tstep <- T.fit(data = tc.p , alfa = 0.05)

## This will obtain significant genes per experimental group 
## which have a regression model R-squared > 0.9
tc.sigs <- get.siggenes(tc.tstep, rsq = 0.9, vars = "groups")

## This will obtain all significant genes regardless of the R-squared value. 
## Replicated genes are averaged.
## IDs: column 1 matches the row names of the data, column 2 holds the gene IDs
IDs <- cbind(paste("feature", c(1:300), sep = ""), 
       rep(paste("gene", c(1:150), sep = ""), each = 2))
tc.sigs.ALL <- get.siggenes(tc.tstep, rsq = 0, vars = "all", IDs = IDs,
                            trat.repl.spots = "average")


[Package maSigPro version 1.2.0 Index]