srFilter {ShortRead} | R Documentation |
These functions create user-defined (srFilter) or built-in instances
of SRFilter objects. Filters can be applied to objects from ShortRead,
returning a logical vector that can be used to subset the objects to
include only those components satisfying the filter.
srFilter(fun, name = NA_character_, ...)
## S4 method for signature 'missing':
srFilter(fun, name=NA_character_)
## S4 method for signature 'function':
srFilter(fun, name=NA_character_)

compose(filt, ..., .name)

chromosomeFilter(regex=character(0), .name="ChromosomeFilter")
strandFilter(strandLevels=character(0), .name="StrandFilter")
nFilter(threshold=0L, .name="CleanNFilter")
polynFilter(threshold=0L, nuc=c("A", "C", "T", "G", "other"),
            .name="PolyNFilter")
srdistanceFilter(subject=character(0), threshold=0L,
                 .name="SRDistanceFilter")
alignQualityFilter(threshold=0L, .name="AlignQualityFilter")
alignDataFilter(expr=expression(), .name="AlignDataFilter")
fun |
An object of class function to be used as a
filter. fun must accept a single named argument x, and
is expected to return a logical vector such that x[fun(x)]
selects only those elements of x satisfying the conditions of
fun. |
name |
A character(1) object to be used as the name of the
filter. The name is useful for debugging and reference. |
filt |
An SRFilter object, to be used with
additional arguments to create a composite filter. |
.name |
An optional character(1) object used to override
the name applied to default filters. |
regex |
Either character(0) or a character(1)
regular expression used as grep(regex, chromosome(x)) to
filter based on chromosome. The default (character(0))
performs no filtering. |
strandLevels |
Either character(0) or character(1)
containing strand levels to be selected. ShortRead objects
have standard strand levels NA, "+", "-", "*", with NA
meaning strand information not available and "*" meaning
strand information not relevant. |
threshold |
A numeric(1) value representing a minimum
(srdistanceFilter, alignQualityFilter) or maximum
(nFilter, polynFilter) criterion for the filter. The
minima and maxima are closed-interval (i.e., x >= threshold
for a minimum and x <= threshold for a maximum, for some
property x of the object being filtered). |
nuc |
A character vector containing IUPAC symbols for
nucleotides or the value "other" corresponding to all
non-nucleotide symbols, e.g., N. |
subject |
A character() of any length, to be used as the
corresponding argument to srdistance. |
expr |
An expression to be evaluated in the context of
pData(alignData(x)). |
... |
Additional arguments for subsequent methods; these arguments are not currently used. |
srFilter allows users to construct their own filters. The fun
argument to srFilter must be a function accepting a single argument x
and returning a logical vector that can be used to select elements of
x satisfying the filter with x[fun(x)]. The signature(fun="missing")
method creates a default filter that returns a vector of TRUE values
with length equal to length(x).
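The contract described above can be sketched in base R without ShortRead. This is only an illustration under stated assumptions: plain character strings stand in for ShortRead objects, and min_width and pass_all are hypothetical names, not part of the package.

```r
# Stand-in data: plain character strings instead of ShortRead objects (assumption).
reads <- c("ACGTACGT", "ACGNNNGT", "TTTT", "ACGTNACG")

# A filter function takes a single argument x and returns one logical per element.
min_width <- function(x) nchar(x) >= 8

# Analogue of the default filter: TRUE for every element of x.
pass_all <- function(x) rep(TRUE, length(x))

keep <- min_width(reads)
reads[keep]             # only elements satisfying the filter
reads[pass_all(reads)]  # every element is retained
```

A function with this shape is what srFilter expects as its fun argument; the body would then operate on a ShortRead object rather than on character strings.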
compose constructs a new filter from one or more existing filters.
The result is a filter that returns a logical vector with indices
corresponding to components of x that pass all filters. If not
provided, the name of the filter consists of the names of all
component filters, each separated by " o ".
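The behaviour described for compose (an element passes only if it passes every component filter, and the default name joins the component names with " o ") can be imitated in base R. The helper compose_predicates below is hypothetical and is not ShortRead's implementation; it assumes the component functions are supplied as named arguments.

```r
# Hypothetical helper, not ShortRead's compose(): combine predicates with logical AND.
compose_predicates <- function(...) {
  preds <- list(...)
  combined <- function(x) {
    # Apply every predicate to x, then AND the logical vectors element-wise.
    Reduce(`&`, lapply(preds, function(p) p(x)))
  }
  # Mirror the default naming convention: component names joined by " o ".
  attr(combined, "name") <- paste(names(preds), collapse = " o ")
  combined
}

no_n  <- function(x) !grepl("N", x, fixed = TRUE)  # no 'N' symbol present
long8 <- function(x) nchar(x) >= 8                 # at least 8 characters

both <- compose_predicates(NoN = no_n, Long8 = long8)
reads <- c("ACGTACGT", "ACGNNNGT", "TTTT", "ACGTNACG")
result <- both(reads)
reads[result]
attr(both, "name")
```

Only the first string passes both predicates here, so the composite selects a single element.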
The remaining functions documented on this page are built-in filters
that accept an argument x and return a logical vector of length(x)
indicating which components of x satisfy the filter.
chromosomeFilter selects elements satisfying
grep(regex, chromosome(x)).
strandFilter selects elements satisfying
match(strand(x), strandLevels, nomatch=0) > 0.
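The match(..., nomatch=0) > 0 idiom matters because strand values may be NA. A small base R sketch, using a plain character vector as a stand-in for strand(x) (an assumption; the values are made up):

```r
# Stand-in for strand(x): a character vector that includes a missing value.
strand_values <- c("+", "-", NA, "*", "+")
strandLevels <- "+"

# match() returns 0 (the nomatch value) for NA and for non-matching levels,
# so the comparison never produces NA.
keep <- match(strand_values, strandLevels, nomatch = 0) > 0
keep

# A direct comparison, by contrast, propagates NA.
naive <- strand_values == strandLevels
naive
```

Because keep contains no NA, it can be used directly as a subsetting index.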
nFilter selects elements with no more than threshold 'N' symbols in
each element of sread(x).
polynFilter selects elements with no more than threshold copies of
any nucleotide indicated by nuc.
srdistanceFilter selects elements at an edit distance of at least
threshold from all sequences in subject.
alignQualityFilter selects elements with alignQuality(x) of at least
threshold.
alignDataFilter selects elements with pData(alignData(x)) satisfying
expr. expr should be formulated as though it were to be evaluated as
eval(expr, pData(alignData(x))).
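The eval(expr, data) form can be tried in base R with an ordinary data frame in place of pData(alignData(x)). The data frame pd and its values below are made up for illustration; only the evaluation pattern is taken from the text.

```r
# Stand-in for pData(alignData(x)): a data frame with x and y columns (made-up values).
pd <- data.frame(x = c(100, 450, 900), y = c(800, 520, 100))

# Column names in the expression are resolved against the data frame.
expr <- expression(abs(x - 500) > 200 & abs(y - 500) > 200)
keep <- eval(expr, pd)

pd[keep, ]  # rows whose x and y are both more than 200 away from 500
```

The expression therefore needs to refer to column names that actually exist in the data it is evaluated against.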
srFilter returns an object of class SRFilter.
Built-in filters return a logical vector of length(x), with TRUE
indicating components that pass the filter.
Martin Morgan <mtmorgan@fhcrc.org>
sp <- SolexaPath(system.file("extdata", package="ShortRead"))
aln <- readAligned(sp, "s_2_export.txt") # Solexa export file, as example

# a 'chromosome 5' filter
filt <- chromosomeFilter("chr5.fa")
aln[filt(aln)]
# filter during input
readAligned(sp, "s_2_export.txt", filter=filt)

# x- and y- coordinates stored in alignData, when source is SolexaExport
xy <- alignDataFilter(expression(abs(x-500) > 200 & abs(y-500) > 200))
aln[xy(aln)]

# both filters
chr5xy <- compose(filt, xy)
aln[chr5xy(aln)]

# custom filter: minimum calibrated base call quality >20
goodq <- srFilter(function(x) {
    apply(as(quality(x), "matrix"), 1, min) > 20
}, name="GoodQualityBases")
goodq
aln[goodq(aln)]