evalClusterHyper {goCluster}    R Documentation

Evaluates a clustering result with regard to an enrichment of annotation terms in specific clusters.

Description

The function evalClusterHyper runs through a tree of gene groups and calls the function evalAnnosetHyper for each of them. This second function employs the hypergeometric distribution to calculate a p-value for each annotation term that is associated with the genes in the group.

Usage

evalClusterHyper(X, uniqueid, Annoset)
evalAnnosetHyper(Selection, uniqueid, Annoset)

Arguments

X The tree (list of lists) of clusters.
Annoset A list in which each element holds a different annotation dataset. Each dataset is composed of two columns, with the second column holding the gene ids while the first column holds the corresponding annotation terms.
uniqueid The unique id of the elements in the dataset.
Selection A list of genes that comprises one cluster. The gene ids given have to match ids from the first column of the annotation datasets (Annoset).

Details

The function evalClusterHyper analyses a "tree" (list of lists) of gene clusters and uses the hypergeometric distribution to determine how probable the observed frequency of each annotation term within a cluster is. The statistical evaluation of each individual gene cluster is performed by evalAnnosetHyper. This function first determines all annotation terms that are associated with the genes in the cluster. For each of these terms it then counts the matching genes over the whole list of genes (not only the cluster). Finally, for each annotation term, the ratio of matching genes within the cluster to the total number of genes in the cluster is compared to the ratio of matching genes over the whole list to the total number of genes in the list. From these four counts a probability can be calculated according to the hypergeometric distribution.
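
The counting scheme described above can be illustrated with base R alone. The following is a minimal sketch and not the package's own implementation: the toy gene list, the two-column annotation table anno and the cluster are hypothetical, and the use of the upper tail P(X >= k) from phyper is an assumption about how the p-value is defined.

```r
## Hypothetical toy data: 10 genes in the whole list
genes <- paste0("g", 1:10)

## Hypothetical annotation table (first column: term, second column: gene id)
anno <- data.frame(
  term = c("GO:A", "GO:A", "GO:A", "GO:B", "GO:B", "GO:A"),
  gene = c("g1",   "g2",   "g3",   "g3",   "g7",   "g8"),
  stringsAsFactors = FALSE
)

## Hypothetical cluster of three genes
cluster <- c("g1", "g2", "g3")

## All annotation terms associated with the genes in the cluster
terms <- unique(anno$term[anno$gene %in% cluster])

## Number of genes in the whole list and in the cluster
elementsTotal <- length(genes)
selectedTotal <- length(cluster)

## Genes carrying each term over the whole list and within the cluster
elementsPerAnnotation <- sapply(terms, function(tm)
  length(unique(anno$gene[anno$term == tm])))
selectedPerAnnotation <- sapply(terms, function(tm)
  length(unique(anno$gene[anno$term == tm & anno$gene %in% cluster])))

## Assumed definition: probability of seeing at least the observed
## number of annotated genes when drawing selectedTotal genes at random
pvalues <- phyper(selectedPerAnnotation - 1,
                  elementsPerAnnotation,
                  elementsTotal - elementsPerAnnotation,
                  selectedTotal,
                  lower.tail = FALSE)
names(pvalues) <- terms
pvalues   # GO:A: 4/120 (about 0.033), GO:B: 64/120 (about 0.533)
```

In this toy case the term GO:A is carried by all three cluster genes but by only four of the ten genes overall, so it receives a small p-value, while GO:B is carried by a single cluster gene and is not enriched.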

Value

pvalues p-values according to the hypergeometric distribution.
selectedPerAnnotation A vector that holds the number of times the annotation was found in the given selection.
elementsPerAnnotation A vector that holds the number of times the annotation was found over all elements.
selectedTotal Total number of annotation terms in the given selection.
elementsTotal Total number of annotation terms over all elements.

Author(s)

Gunnar Wrobel, work@gunnarwrobel.de, http://www.gunnarwrobel.de.

See Also

clusterStatisticHyper-class

Examples

## We will first create a goCluster object to get the gene ontology
## annotation from it
data(benomylsetupsmall)
test <- new("goCluster")
setup(test) <- benomylsetupsmall
## Executing the data object will also execute the annotation
## object associated with it. The "execute" function needs
## to specify the "test" object a second time since we need
## to specify a parent object when executing a goCluster subobject.
annotation <- execute(test@data, test)
## Extract the annotation datasets and the unique ids
Annoset  <- annotation@anno@annoset
Uniqueid <- annotation@uniqueid

## Test clusters (the genes are specified by their position in
## the dataset)
testclusters <- list(
                     list(
                          c(68, 78),
                          c(32,  7, 72)
                          ),
                     list(c(31, 78)
                     ))

evalClusterHyper(testclusters, Uniqueid, Annoset)


[Package goCluster version 1.4.0 Index]