GeneSet-class {GSEABase} | R Documentation |
A GeneSet contains a set of gene identifiers. Each gene set has a geneIdType, indicating how the gene identifiers should be interpreted (e.g., as Entrez identifiers), and a collectionType, indicating the origin of the gene set (perhaps including additional information about the set, as in the BroadCollection type).
Conversion between identifiers, subsetting, and logical (set) operations can be performed. Relationships between genes and phenotype in a GeneSet can be summarized using coloring to create a GeneColorSet. A GeneSet can be exported to XML with toBroadXML.
Construct a GeneSet with a GeneSet method (e.g., from a character vector of gene names, or an ExpressionSet), or from gene sets stored as XML (locally or on the internet; see getBroadSets).
setName: Object of class "ScalarCharacter" containing a short name (single word is best) to identify the set.
setIdentifier: Object of class "ScalarCharacter" containing a (unique) identifier for the set.
geneIdType: Object of class "GeneIdentifierType" containing information about how the gene identifiers are encoded. See GeneIdentifierType and related classes.
geneIds: Object of class "character" containing the gene symbols.
collectionType: Object of class "CollectionType" containing information about how the geneIds were collected, including perhaps additional information unique to the collection methodology. See CollectionType and related classes.
shortDescription: Object of class "ScalarCharacter" representing a short description (1 line) of the gene set.
longDescription: Object of class "ScalarCharacter" providing a longer description (e.g., like an abstract) of the gene set.
organism: Object of class "ScalarCharacter" representing the organism the gene set is derived from.
pubMedIds: Object of class "character" containing PubMed ids related to the gene set.
urls: Object of class "character" containing urls used to construct or manipulate the gene set.
contributor: Object of class "character" identifying who created the gene set.
version: Object of class "Versions", a version number, manually curated (i.e., by the contributor) to provide a consistent way of tracking a gene set.
creationDate: Object of class "character" containing the character string representation of the date on which the gene set was created.
Gene set construction:
See GeneSet methods and getBroadSets for convenient construction.
Slot access (e.g., setName) and replacement (e.g., setName<-):
signature(object = "GeneSet", value = "CollectionType")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(x = "GeneSet", y = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
signature(object = "GeneSet", verbose=FALSE, value = "character"), signature(object = "GeneSet", verbose=FALSE, value = "GeneIdentifierType"): These methods attempt to coerce geneIds from the current type to the type named by value. Successful coercion requires an appropriate method for mapIdentifiers.
signature(object = "GeneSet")
signature(object = "GeneSet", value = "Versions")
signature(object = "GeneSet")
signature(object = "GeneSet", value = "character")
signature(object = "GeneSet")
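As a minimal sketch of the access and replacement pattern listed above, the following builds a small gene set, reads a few slots, and then replaces the set name. The identifiers "geneA" to "geneC" and the name "example" are made up for illustration, and the GSEABase package is assumed to be installed.

```r
library(GSEABase)

## Made-up identifiers, for illustration only
gs <- GeneSet(geneIds = c("geneA", "geneB", "geneC"))

## slot access
geneIds(gs)
geneIdType(gs)
collectionType(gs)

## slot replacement, followed by access to confirm the change
setName(gs) <- "example"
setName(gs)
```

Replacement returns a modified copy assigned back to gs, following the usual R replacement-function semantics; other references to the original object are not changed.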
Logical and subsetting operations:
signature(x = "GeneSet", y = "GeneSet"): ...
signature(e1 = "GeneSet", e2 = "GeneSet"): calculate the logical 'or' (union) of two gene sets. The sets must contain elements of the same geneIdType.
signature(e1 = "GeneSet", e2 = "character"), signature(e1 = "character", e2 = "GeneSet"): calculate the logical 'or' (union) of a gene set and a character vector, i.e., add the geneIds named in the character vector to the gene set.
signature(x = "GeneSet", y = "GeneSet"):
signature(e1 = "GeneSet", e2 = "GeneSet"): calculate the logical 'and' (intersection) of two gene sets.
signature(e1 = "GeneSet", e2 = "character"), signature(e1 = "character", e2 = "GeneSet"): calculate the logical 'and' (intersection) of a gene set and a character vector, creating a new gene set containing only those genes named in the character vector.
signature(x = "GeneSet", y = "GeneSet"), signature(x = "GeneSet", y = "character"), signature(x = "character", y = "GeneSet"): calculate the logical set difference between two gene sets, or between a gene set and a character vector.
signature(x = "GeneSet", i="character")
signature(x = "GeneSet", i="numeric"): subset the gene set by index (i="numeric") or value (i="character"). Genes are re-ordered as required.
signature(x = "ExpressionSet", i = "GeneSet"): subset the expression set, using genes in the gene set to select features. Genes in the gene set are coerced to appropriate annotation type if necessary (by consulting the annotation slot of the expression set, and using geneIdType<-).
signature(x = "GeneSet"): select a single gene from the gene set.
signature(x = "GeneSet"): select a single gene from the gene set, allowing partial matching.
Useful additional methods include:
signature(type = "GeneSet"): create a 'color' gene set from a GeneSet, containing information about phenotype. This method has a required argument phenotype, a character string describing the phenotype for which color is available. See GeneColorSet.
GeneIdentifierType to another. See mapIdentifiers and specific methods in GeneIdentifierType for additional detail.
See incidence-methods.
See toGmt.
signature(object = "GeneSet"): display a short summary of the gene set.
signature(object = "GeneSet"): display additional information about the gene set. See details.
signature(.Object = "GeneSet"): Used internally during gene set construction.
Martin Morgan <mtmorgan@fhcrc.org>
GeneColorSet
CollectionType
GeneIdentifierType
## Empty gene set
GeneSet()
## Gene set from ExpressionSet
data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109])
## GeneSet from Broad XML; 'fl' could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs2 <- getBroadSets(fl)[[1]] # actually, a list of two gene sets
## GeneSet from list of geneIds
geneIds <- geneIds(gs2) # any character vector would do
gs3 <- GeneSet(geneIds=geneIds)
## unspecified set type, so...
is(geneIdType(gs3), "NullIdentifier") == TRUE
## update set type to match encoding of identifiers
geneIdType(gs2)
geneIdType(gs3) <- SymbolIdentifier()
## Convert between set types; this consults the 'annotation'
## information encoded in the 'AnnotationIdentifier' set type and the
## corresponding annotation package.
## Not run:
gs4 <- gs1
geneIdType(gs4) <- EntrezIdentifier()
## End(Not run)
## logical (set) operations
gs5 <- GeneSet(sample.ExpressionSet[100:109], setName="subset1")
gs6 <- GeneSet(sample.ExpressionSet[105:114], setName="subset2")
## intersection: 5 'genes'; note the set name '(subset1 & subset2)'
gs5 & gs6
## union: 15 'genes'; note the set name
gs5 | gs6
## an identity
gs7 <- gs5 | gs6
gs8 <- setdiff(gs5, gs6) | (gs5 & gs6) | setdiff(gs6, gs5)
identical(geneIds(gs7), geneIds(gs8))
identical(gs7, gs8) == FALSE # gs7 and gs8 setNames differ
## output
tmp <- tempfile()
toBroadXML(gs2, tmp)
noquote(readLines(tmp))
## must be BroadCollection() collectionType
try(toBroadXML(gs1))
gs9 <- gs1
collectionType(gs9) <- BroadCollection()
toBroadXML(gs9, tmp)
unlink(tmp)
toBroadXML(gs9) # no connection --> character vector
## list of geneIds --> vector of Broad GENESET XML
gs10 <- getBroadSets(fl) # two sets
entries <- sapply(gs10, function(x) toBroadXML(x)[[2]])
## list mapIdentifiers available for GeneSet
showMethods("mapIdentifiers", classes="GeneSet", inherit=FALSE)
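The examples do not exercise subsetting or set operations against a plain character vector, both described under "Logical and subsetting operations". The sketch below illustrates them under two assumptions: that the signatures with an i argument correspond to the usual R operators `[` and `[[`, and that GSEABase is installed. The identifiers are made up for illustration.

```r
library(GSEABase)

## Made-up identifiers, for illustration only
gs <- GeneSet(geneIds = c("geneA", "geneB", "geneC", "geneD"))

## subset by index: keep the first two identifiers
byIndex <- gs[1:2]
geneIds(byIndex)

## subset by value: keep only the named identifiers
byValue <- gs[c("geneD", "geneA")]
geneIds(byValue)

## select a single gene
gs[[3]]

## union with a character vector adds identifiers to the set
bigger <- gs | c("geneE", "geneF")
length(geneIds(bigger))
```

Subsetting with `[` yields another gene set, so accessors such as geneIds() apply to the result.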