si {exonmap} | R Documentation |
Calculates the splicing index for the probesets in one or more genes, as defined in the Affymetrix white paper "Alternative Transcript Analysis Methods for Exon Arrays".
si(x, v, group, gps, median.gene=FALSE, median.probeset=FALSE, unlogged=TRUE)
x |
eSet containing expression data |
v |
Character vector of Ensembl gene names |
group |
If defined, the column name in the ExpressionSet's pData object in which to look for gps |
gps |
The two sets of arrays to compare |
median.gene |
Use the median instead of the mean when calculating averages across genes |
median.probeset |
Use the median instead of the mean when calculating averages across probesets in each replicate group |
unlogged |
Unlog the expression data before calculating the splicing index (and then re-log afterwards) |
The splicing index gives a measure of the difference in expression level
for each probeset in a gene between two sets of arrays, relative to the
gene-level average in each set. This is calculated only for those
probesets that are defined as exon targeting and non-multitargetted (see
select.probewise and exclude.probewise for more details of how this
filtering is performed).
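As a hedged summary of the definition referred to above, the splicing index can be written as a log ratio of gene-normalised probeset intensities. The symbols here are illustrative and do not appear in the package: P_{i,g} is the intensity of probeset i averaged over the arrays in set g, and G_g is the corresponding gene-level average. Exactly how each average is taken in si depends on the median.gene, median.probeset and unlogged arguments.

```latex
\[
\mathrm{NI}_{i,g} = \frac{P_{i,g}}{G_{g}}, \qquad
\mathrm{SI}_{i} = \log_2\!\left(\frac{\mathrm{NI}_{i,1}}{\mathrm{NI}_{i,2}}\right)
\]
```

A value near zero means the probeset changes in step with the gene as a whole; a large positive or negative value means the probeset changes more, or less, than the gene does between the two sets.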
The two sets of arrays can be specified in two ways. First, by using
numeric indices defining the appropriate columns in the expression
data. This is done by supplying these as a list to gps
(e.g. gps=list(1:3,4:6) will calculate the splicing index
between arrays 1,2,3 and 4,5,6). Alternatively, the annotation in the
pData object from x can be used
(e.g. group="treatment",gps=c("a","b") will compare between
the arrays labelled "a" and "b" in the "treatment" column of
pData(x)).
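The relationship between the two ways of specifying the arrays can be sketched in base R. This is an illustration only: the data.frame pd stands in for pData(x), the labels are invented, and labels.to.indices is a hypothetical helper, not an exonmap function. How si resolves labels internally is not documented here.

```r
# Stand-in for pData(x): one row per array, with invented labels.
pd <- data.frame(
  treatment = c("a", "a", "a", "b", "b", "b"),
  row.names = paste0("array", 1:6)
)

# Hypothetical helper (not part of exonmap): turn a pData column name and
# two labels into a list of column indices, one integer vector per set.
labels.to.indices <- function(pd, group, gps) {
  lapply(gps, function(label) which(pd[[group]] == label))
}

# Equivalent, for this toy annotation, to writing gps = list(1:3, 4:6).
idx <- labels.to.indices(pd, group = "treatment", gps = c("a", "b"))
```

Using labels rather than indices keeps a call valid if the column order of the expression data changes.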
The implementation also calculates a p.value and t.statistic for each
probeset; these are returned alongside the splicing index.
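One plausible way to obtain such statistics is a two-sample t-test on the per-array, gene-normalised values of a single probeset. The sketch below assumes that approach and uses the default Welch test from stats::t.test; whether si uses this exact test is an assumption, and all numbers are invented.

```r
# Invented per-array values for one probeset: log2(probeset) - log2(gene)
# in each of six arrays.
ni <- c(0.52, 0.47, 0.55, -0.98, -1.10, -1.03)
gps <- list(1:3, 4:6)

# Welch two-sample t-test between the two sets of arrays (assumed test).
tt <- t.test(ni[gps[[1]]], ni[gps[[2]]])

# Collect the per-probeset results under the documented column names.
res <- data.frame(
  si          = mean(ni[gps[[1]]]) - mean(ni[gps[[2]]]),
  p.value     = tt$p.value,
  t.statistic = unname(tt$statistic)
)
```

With only a few replicates per set the test has limited power, so the p-values are best read together with the size of the splicing index.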
By default, the splicing index is calculated using the mean across genes
and samples. Specifying median.gene=TRUE or median.probeset=TRUE
will use the median instead (for the gene or
probeset level averages, respectively). It is calculated using the
unlogged data, unless unlogged=FALSE. This only affects the
internal calculations; values in x are always assumed to be
logged, and the splicing index is always returned on the log2 scale.
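The effect of these options can be illustrated with a small base R sketch on a toy matrix for a single gene. toy.si is a hypothetical helper, not the exonmap implementation: the order of averaging, the direction of the ratio (first set over second) and the handling of unlogged=FALSE are all assumptions made for illustration, and the data are invented.

```r
# Toy log2 expression matrix: 4 probesets (rows) of one gene, 6 arrays.
expr <- matrix(
  c(8, 8, 8,  9,  9,  9,
    7, 7, 7,  8,  8,  8,
    6, 6, 6,  9,  9,  9,
    9, 9, 9, 10, 10, 10),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ps", 1:4), paste0("array", 1:6))
)
gps <- list(1:3, 4:6)

# Hypothetical helper (not part of exonmap).
toy.si <- function(expr, gps, median.gene = FALSE,
                   median.probeset = FALSE, unlogged = TRUE) {
  avg.gene <- if (median.gene) median else mean
  avg.ps   <- if (median.probeset) median else mean
  # Work on the linear scale if requested; input is assumed to be log2.
  vals <- if (unlogged) 2^expr else expr
  # Gene-level value per array: average over the gene's probesets.
  gene <- apply(vals, 2, avg.gene)
  # Probeset relative to gene: a ratio when unlogged, a difference in log space.
  ni <- sweep(vals, 2, gene, if (unlogged) "/" else "-")
  # Average the normalised values within each set of arrays.
  g1 <- apply(ni[, gps[[1]], drop = FALSE], 1, avg.ps)
  g2 <- apply(ni[, gps[[2]], drop = FALSE], 1, avg.ps)
  # Always return on the log2 scale.
  if (unlogged) log2(g1 / g2) else g1 - g2
}

si.toy <- toy.si(expr, gps)
```

In this toy gene, ps3 rises by 3 log2 units between the sets while the other probesets rise by 1, so ps3 stands out with the most extreme value (negative under the first-over-second convention assumed here).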
A list, one element for each gene. Each element contains a
data.frame with the results for a given gene. Each row
corresponds to a probeset, and there are four columns in the
data.frame: "si", "p.value", "t.statistic" and "gene.av".
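Given that return shape, results can be filtered gene by gene with lapply. The sketch below builds a mock list with the documented column names rather than calling si; the gene names, probeset names, values and cut-offs are all invented for illustration.

```r
# Mock of the documented return shape: one data.frame per gene,
# one row per probeset. All names and numbers are invented.
spl.idx <- list(
  geneA = data.frame(
    si          = c(0.10, -1.60, 0.25),
    p.value     = c(0.700, 0.001, 0.400),
    t.statistic = c(0.4, -9.2, 0.9),
    gene.av     = c(8.1, 8.1, 8.1),
    row.names   = c("ps1", "ps2", "ps3")
  ),
  geneB = data.frame(
    si          = c(1.20, 0.05),
    p.value     = c(0.004, 0.900),
    t.statistic = c(6.3, 0.1),
    gene.av     = c(6.7, 6.7),
    row.names   = c("ps4", "ps5")
  )
)

# Keep probesets with a large splicing index and a small p-value.
# The thresholds are arbitrary examples, not recommendations.
hits <- lapply(spl.idx, function(df) {
  df[abs(df$si) >= 1 & df$p.value < 0.01, , drop = FALSE]
})

# Stack the per-gene tables into a single data.frame.
all.hits <- do.call(rbind, unname(hits))
```

Keeping drop = FALSE ensures that a gene with a single surviving probeset still yields a data.frame rather than a vector.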
Crispin J Miller with contributions from Carla Moller Levet and Michal J Okoniewski
http://bioinformatics.picr.man.ac.uk/
if(interactive()) {
  xmapConnect()
  data(exonmap)
  gg <- probeset.to.gene(c("2326780","2326822"))
  spl.idx <- si(x, gg, "group", c("a","b"))
  spl.idx <- si(x, gg, gps=list(1:3,4:6))
}