getRelevantEGenes {nem}        R Documentation

Automatic selection of most relevant E-genes

Description

1. A-priori filtering of E-genes: select E-genes whose pattern of differential expression across experiments is unlikely to have arisen by chance.
2. Automated E-gene subset selection: select the E-genes that have the highest likelihood under the given network hypothesis.

Usage

filterEGenes(Porig, D, Padj=NULL, ntop=100, fpr=0.05, adjmethod="bonferroni", cutoff=0.05)

getRelevantEGenes(Phi, D, para=NULL, hyperpara=NULL,Pe=NULL,Pm=NULL,lambda=0, delta=1, type="CONTmLLDens", nEgenes=min(10*nrow(Phi), nrow(D)))

Arguments

For method filterEGenes:

Porig matrix of raw p-values, typically from the complete array
D data matrix. Columns correspond to the nodes in the silencing scheme. Rows are effect reporters.
Padj matrix of false positive rates. If not provided, the Benjamini-Hochberg method is used to compute them.
ntop number of top genes to consider from each knock-down experiment
fpr significance cutoff for the FDR
adjmethod adjustment method for pattern p-values
cutoff significance cutoff for patterns

For method getRelevantEGenes (D as above):

Phi adjacency matrix with unit main diagonal
type scoring type: one of "mLL", "FULLmLL", "CONTmLL", "CONTmLLBayes" or "CONTmLLMAP". "CONTmLLDens" and "CONTmLLRatio" are identical to "CONTmLLBayes" and "CONTmLLMAP", respectively, and are still supported for compatibility reasons; see nem.
para Vector with parameters a and b (for "mLL" with count data)
hyperpara Vector with hyperparameters a0, b0, a1, b1 for "FULLmLL"
Pe prior position of effect reporters. Default: uniform over nodes in silencing scheme
Pm prior on model graph (n x n matrix) with entries 0 <= Pm[i,j] <= 1 describing the probability of an edge between gene i and gene j.
lambda regularization parameter to incorporate prior assumptions.
delta regularization parameter for automated E-gene subset selection (CONTmLLMAP only)
nEgenes number of E-genes to select (used for types "CONTmLL", "FULLmLL" and "mLL")
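
The following base-R sketch shows one way the Phi and Pm arguments might be constructed. The S-gene names, the chain A -> B -> C, the prior values, and the reading of Phi[i,j] = 1 as an edge from i to j are all assumptions made for illustration; they are not taken from the package.

```r
# Hypothetical S-gene names (assumption, not from the package data)
sgenes <- c("A", "B", "C")
n <- length(sgenes)

# Network hypothesis A -> B -> C, stored as an adjacency matrix.
# diag(n) supplies the unit main diagonal that Phi is documented to have.
Phi <- diag(n)
dimnames(Phi) <- list(sgenes, sgenes)
Phi["A", "B"] <- 1
Phi["B", "C"] <- 1

# Edge prior: an n x n matrix whose entries are probabilities in [0, 1].
# The values 0.5 and 0.9 are illustrative only.
Pm <- matrix(0.5, n, n, dimnames = list(sgenes, sgenes))
diag(Pm) <- 1
Pm["A", "B"] <- 0.9

# The documented default for nEgenes, evaluated for a hypothetical
# data matrix with 25 rows (E-genes)
n_rows_D <- 25
nEgenes_default <- min(10 * nrow(Phi), n_rows_D)
```

With three S-genes and 25 E-genes the default caps at the number of rows in the data, so every E-gene would be eligible.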

Details

The method filterEGenes performs an a-priori filtering of the complete microarray. It estimates how often a given pattern of differential expression across experiments would be expected to occur purely by chance, and keeps only those E-genes whose pattern is observed more often than chance predicts.
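
The idea of testing patterns against chance can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the data are synthetic, the ntop pre-selection is omitted, false positive calls are assumed independent across experiments, and a one-sided binomial test is used as one plausible choice of test.

```r
set.seed(1)

# Synthetic adjusted p-values: 200 E-genes in 3 knock-down experiments
n_genes <- 200
n_exp <- 3
padj <- matrix(runif(n_genes * n_exp), nrow = n_genes)

# Plant a real signal: genes 1..20 respond in experiments 1 and 2 only
padj[1:20, 1:2] <- 0.001
padj[1:20, 3] <- 0.9

fpr <- 0.05
disc <- (padj <= fpr) * 1          # binary differential-expression calls

# Encode each gene's pattern as a string such as "110"
pattern <- apply(disc, 1, paste, collapse = "")
observed <- table(pattern[pattern != strrep("0", n_exp)])

# Probability of each pattern if every call were an independent false positive
chance_prob <- sapply(names(observed), function(p) {
  bits <- as.integer(strsplit(p, "")[[1]])
  prod(fpr^bits * (1 - fpr)^(1 - bits))
})

# One-sided binomial test: is the pattern seen more often than chance predicts?
p_values <- mapply(function(k, p0) {
  binom.test(k, n_genes, p0, alternative = "greater")$p.value
}, as.integer(observed), chance_prob)
names(p_values) <- names(observed)

# Adjust the pattern p-values and keep genes that show a significant pattern
p_values <- p.adjust(p_values, method = "bonferroni")
significant <- names(p_values)[p_values <= 0.05]
selected <- which(pattern %in% significant)
```

Here fpr plays the role of the threshold on the adjusted p-values, while the final 0.05 corresponds to the cutoff applied to the pattern p-values.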

The method getRelevantEGenes looks for the E-genes that have the highest likelihood under the given network hypothesis. For scoring type "CONTmLLBayes", these are all E-genes with a positive contribution to the total log-likelihood. For type "CONTmLLMAP", all E-genes not assigned to the "null" S-gene are returned; this involves the prior probability delta/(number of S-genes) of leaving out an E-gene. In all other cases ("CONTmLL", "FULLmLL", "mLL"), the nEgenes E-genes with the highest likelihood under the given network hypothesis are returned.
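
Two of the selection rules described above can be illustrated with a small base-R sketch. The per-gene log-likelihood values and the helper names select_positive and select_top are hypothetical and exist only for this example; the package computes the likelihoods from the data and the network hypothesis.

```r
# Hypothetical per-E-gene log-likelihood contributions under a fixed network
ll_per_gene <- c(g1 = 2.3, g2 = -0.7, g3 = 0.4, g4 = -1.5, g5 = 3.1, g6 = 0.0)

# Rule described for "CONTmLLBayes": keep E-genes with a positive contribution
select_positive <- function(ll) {
  names(ll)[ll > 0]
}

# Rule described for "CONTmLL", "FULLmLL" and "mLL":
# keep the nEgenes E-genes with the highest likelihood
select_top <- function(ll, nEgenes) {
  nEgenes <- min(nEgenes, length(ll))
  names(sort(ll, decreasing = TRUE))[seq_len(nEgenes)]
}

positive <- select_positive(ll_per_gene)
top3 <- select_top(ll_per_gene, 3)
```

The first rule lets the data determine how many E-genes are kept, whereas the second always returns a fixed number, which is why nEgenes matters only in the second case.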

Value

I index of selected E-genes
dat subset of original data according to I
patterns significant patterns
nobserved number of cases per observed pattern
selected selected E-genes
mLL marginal likelihood of a phenotypic hierarchy
pos posterior distribution of effect positions in the hierarchy
mappos maximum a posteriori estimate of effect positions
LLperGene likelihood per selected E-gene

Author(s)

Holger Froehlich

See Also

nem, score, mLL, FULLmLL

Examples

   library(nem)

   # Drosophila RNAi and microarray data from Boutros et al., 2002
   data("BoutrosRNAi2002")
   D <- BoutrosRNAiDiscrete[,9:16]

   # enumerate all possible models for 4 genes
   models <- enumerate.models(unique(colnames(D)))  
   
   getRelevantEGenes(models[[64]], D, para=c(.13,.05), type="mLL")


[Package nem version 2.6.0 Index]