termSim {SemSim} | R Documentation |
Estimates the semantic similarity or distance between two terms in the same GO subcategory, using information content-based measures.
termSim(GOID1, GOID2, measure = "Resnik", db = "all")
GOID1 |
Identifier of a GO term. |
GOID2 |
Identifier of a GO term. |
measure |
Measure to use: one of "Resnik", "Lin", "Rel", or "Jiang". |
db |
Annotation database from which the information content of each GO term is derived: "all" (default), "human", "mouse", "rat", "yeast", "plant", or "microbe". |
The Resnik, Lin, and Relevance methods estimate the semantic similarity of two GO terms, while Jiang's method calculates the semantic distance between them. The simplest measure (Resnik) defines the similarity as the information content of the lowest common ancestor of the two terms, while the other three measures also take into account the information content of the query terms themselves. A detailed description of each measure can be found in Lord et al. (2003) and Schlicker et al. (2006).
The information content of a term is based on its relative frequency of occurrence in an annotation database. By default, information content is calculated from all available annotations submitted to the GO database. An organism-specific database may also be used to estimate the information content.
The options "human", "mouse", "rat", "yeast", "plant", and "microbe" of the argument db represent data from all human gene product annotations in UniProt and from annotations in the MGI, RGD, SGD, TAIR, and TIGR CMR data sources, respectively.
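For orientation, the four measures are usually written as shown below. This is a sketch of the standard definitions from the literature, not a statement of the exact formulas implemented by termSim. The notation is introduced here for illustration only: p(t) is the relative annotation frequency of term t in the chosen database, IC(t) is its information content, and t_A is the common ancestor of t_1 and t_2 with the highest information content, assumed here to correspond to the "lowest common ancestor" described above.

```latex
\begin{align*}
  \mathrm{IC}(t) &= -\log p(t) \\
  \mathrm{sim}_{\mathrm{Resnik}}(t_1, t_2) &= \mathrm{IC}(t_A) \\
  \mathrm{sim}_{\mathrm{Lin}}(t_1, t_2) &= \frac{2\,\mathrm{IC}(t_A)}{\mathrm{IC}(t_1) + \mathrm{IC}(t_2)} \\
  \mathrm{sim}_{\mathrm{Rel}}(t_1, t_2) &= \frac{2\,\mathrm{IC}(t_A)}{\mathrm{IC}(t_1) + \mathrm{IC}(t_2)} \,\bigl(1 - p(t_A)\bigr) \\
  \mathrm{dist}_{\mathrm{Jiang}}(t_1, t_2) &= \mathrm{IC}(t_1) + \mathrm{IC}(t_2) - 2\,\mathrm{IC}(t_A)
\end{align*}
```

Under these definitions, larger Resnik, Lin, and Relevance values indicate more closely related terms, whereas a larger Jiang value indicates terms that are further apart.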
Sim |
Value of semantic similarity or distance between two terms. |
Lord, P.W., Stevens, R.D., Brass, A., and Goble, C.A. (2003) Semantic similarity measures as tools for exploring the Gene Ontology. In Pacific Symposium on Biocomputing, 8:601-612.
Schlicker, A., Domingues, F.S., Rahnenfuhrer, J., and Lengauer, T. (2006) A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics, 7(1):302.
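# Default settings: Resnik measure, information content from all annotations
termSim("GO:0043044", "GO:0006348")
# Relevance measure, information content from human annotations
termSim("GO:0015801", "GO:0015813", measure="Rel", db="human")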