align-utils {Biostrings} | R Documentation |
A variety of functions for working with sequence alignments.
mismatchTable(x, shiftLeft=0L, shiftRight=0L, ...)
mismatchSummary(x, ...)

## S4 method for signature 'AlignedXStringSet':
coverage(x, start=NA, end=NA, weight=1L)

## S4 method for signature 'PairwiseAlignedFixedSubject':
coverage(x, start=NA, end=NA, weight=1L)

compareStrings(pattern, subject)

## S4 method for signature 'character':
consensusMatrix(x, freq=FALSE)

## S4 method for signature 'XStringSet':
consensusMatrix(x, baseOnly=FALSE, freq=FALSE)

consensusString(x)
x |
A character vector or matrix, XStringSet, XStringViews,
PairwiseAlignedFixedSubject, or list of FASTA records containing the equal-length
strings.
|
shiftLeft, shiftRight |
A non-positive and a non-negative integer, respectively, specifying how many characters before and after the mismatch position to include in the mismatch substrings. |
... |
Further arguments to be passed to or from other methods. |
start, end |
See ?coverage.
|
weight |
An integer vector specifying how much each element in x counts.
|
pattern, subject |
The strings to compare. Can be of type character, XString,
XStringSet, AlignedXStringSet, or, in the case of
pattern, PairwiseAlignedFixedSubject. If pattern is a
PairwiseAlignedFixedSubject object, then subject must be missing.
|
baseOnly |
TRUE or FALSE.
If TRUE, the returned matrix only contains frequencies for the
letters in the "base" alphabet, i.e. "A", "C", "G", "T" if x
is a "DNA input", and "A", "C", "G", "U" if x is "RNA input".
When x is a BString object (or an XStringViews
object with a BString subject, or a BStringSet object),
then the baseOnly argument is ignored.
|
freq |
If TRUE, then letter frequencies (per position) are reported; otherwise, counts are reported.
|
mismatchTable: a data.frame containing the positions and substrings of the mismatches for the AlignedXStringSet or PairwiseAlignedFixedSubject object.
mismatchSummary: a list of data.frame objects containing counts and frequencies of the mismatches for the AlignedXStringSet or PairwiseAlignedFixedSubject object.
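
To make the role of shiftLeft and shiftRight concrete, here is a minimal base-R sketch of how mismatch positions and their surrounding substrings could be tabulated for two equal-length, already aligned strings. The function name mismatch_sketch and its column names are hypothetical and chosen for illustration only; this is not the Biostrings implementation, and the real mismatchTable output may be laid out differently.

```r
# Hypothetical helper (not part of Biostrings): list the mismatches between
# two equal-length, already aligned character strings.
mismatch_sketch <- function(pattern, subject, shiftLeft = 0L, shiftRight = 0L) {
  stopifnot(nchar(pattern) == nchar(subject), shiftLeft <= 0L, shiftRight >= 0L)
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  pos <- which(p != s)
  # Widen each mismatch position into a window, clipped to the string bounds.
  first <- pmax(pos + shiftLeft, 1L)
  last <- pmin(pos + shiftRight, length(p))
  window <- function(string) {
    vapply(seq_along(pos),
           function(i) substr(string, first[i], last[i]),
           character(1))
  }
  data.frame(
    Position = pos,                      # illustrative column names
    PatternSubstring = window(pattern),
    SubjectSubstring = window(subject),
    stringsAsFactors = FALSE
  )
}

# One character of context on each side of every mismatch.
mm <- mismatch_sketch("ACGTAC", "ACCTAG", shiftLeft = -1L, shiftRight = 1L)
print(mm)
```

With shiftLeft = 0L and shiftRight = 0L the substrings reduce to the single mismatching characters.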
compareStrings combines two equal-length strings that are assumed to be aligned into a single character string that replaces mismatches with "?", insertions with "+", and deletions with "-".
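
The encoding described above can be sketched in base R as follows. The function name compare_sketch is hypothetical, and the sketch assumes that a gap ("-") in the pattern counts as a deletion and a gap in the subject counts as an insertion; it is an illustration of the rule, not the Biostrings code.

```r
# Hypothetical base-R sketch of the comparison rule; not the Biostrings code.
compare_sketch <- function(pattern, subject) {
  stopifnot(nchar(pattern) == nchar(subject))
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  # Matching positions keep their letter; everything else starts as a mismatch.
  out <- ifelse(p == s, p, "?")
  # Assumption: a gap in the pattern is reported as a deletion.
  out[p == "-" & s != "-"] <- "-"
  # Assumption: a gap in the subject is reported as an insertion.
  out[p != "-" & s == "-"] <- "+"
  paste(out, collapse = "")
}

cmp <- compare_sketch("ACT-GA", "AGTCG-")
print(cmp)
```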
consensusMatrix
computes a consensus matrix for a set of equal-length strings that
are assumed to be aligned.
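
As an illustration of what a consensus matrix holds, the following base-R sketch counts letters per position for a character vector of equal-length strings, with one row per letter and one column per position. The function name consensus_matrix_sketch is hypothetical, and the sketch ignores baseOnly; it is not the Biostrings implementation.

```r
# Hypothetical base-R sketch: per-position letter counts (or frequencies)
# for a set of equal-length strings that are assumed to be aligned.
consensus_matrix_sketch <- function(x, freq = FALSE) {
  stopifnot(length(unique(nchar(x))) == 1L)
  # One row per string, one column per alignment position.
  chars <- do.call(rbind, strsplit(x, ""))
  letters_seen <- sort(unique(as.vector(chars)))
  counts <- matrix(0L, nrow = length(letters_seen), ncol = ncol(chars),
                   dimnames = list(letters_seen, NULL))
  for (j in seq_len(ncol(chars))) {
    counts[, j] <- tabulate(match(chars[, j], letters_seen),
                            nbins = length(letters_seen))
  }
  # With freq = TRUE, each column is divided by the number of strings.
  if (freq) counts / length(x) else counts
}

cm <- consensus_matrix_sketch(c("ACGT", "ACGA", "TCGA"))
cmf <- consensus_matrix_sketch(c("ACGT", "ACGA", "TCGA"), freq = TRUE)
print(cm)
```

Every column of the count matrix sums to the number of strings, and every column of the frequency matrix sums to 1.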
consensusString creates the string based on a 50% + 1 vote from the consensus matrix, with unknowns labeled with "?".
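
A minimal base-R sketch of such a vote is shown below. It assumes that "50% + 1" means the most frequent letter at a position must hold a strict majority of the counts in that column; the function name consensus_string_sketch and the small count matrix are hypothetical and serve only as an illustration.

```r
# Hypothetical base-R sketch of a majority vote over a consensus count matrix
# (rows are letters, columns are positions); not the Biostrings code.
consensus_string_sketch <- function(counts) {
  total <- colSums(counts)
  top <- apply(counts, 2, max)
  winner <- apply(counts, 2, which.max)
  # Assumed reading of "50% + 1": the top letter needs more than half the votes.
  out <- ifelse(top > total / 2, rownames(counts)[winner], "?")
  paste(out, collapse = "")
}

# Illustrative counts for three positions across three strings.
counts <- matrix(c(2L, 0L, 0L, 1L,
                   0L, 3L, 0L, 0L,
                   1L, 1L, 1L, 0L),
                 nrow = 4,
                 dimnames = list(c("A", "C", "G", "T"), NULL))
cs <- consensus_string_sketch(counts)
print(cs)
```

The third position has no letter with a strict majority, so it is labeled "?".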
pairwiseAlignment,
XString-class, XStringSet-class, XStringViews-class,
AlignedXStringSet-class, PairwiseAlignedFixedSubject-class,
match-utils
## Compare two globally aligned strings
string1 <- "ACTTCACCAGCTCCCTGGCGGTAAGTTGATC---AAAGG---AAACGCAAAGTTTTCAAG"
string2 <- "GTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATC"
compareStrings(string1, string2)

## Create a consensus matrix
nw1 <-
  pairwiseAlignment(AAStringSet(c("HLDNLKGTF", "HVDDMPNAL")), AAString("SMDDTEKMSMKL"),
                    substitutionMatrix = "BLOSUM50", gapOpening = -3, gapExtension = -1)
consensusMatrix(nw1)

## Examine the consensus between the bacteriophage phi X174 genomes
data(phiX174Phage)
phageConsmat <- consensusMatrix(phiX174Phage, baseOnly = TRUE)
phageDiffs <- which(apply(phageConsmat, 2, max) < length(phiX174Phage))
phageDiffs
phageConsmat[,phageDiffs]

## Read in ORF data
file <- system.file("extdata", "someORF.fa", package="Biostrings")
orf <- read.DNAStringSet(file, "fasta")

## To illustrate, the following example assumes the ORF data
## to be aligned for the first 10 positions (patently false):
orf10 <- DNAStringSet(orf, end=10)
consensusMatrix(orf10, baseOnly=TRUE, freq=TRUE)
consensusString(sort(orf10)[1:5])

## For the character matrix containing the "exploded" representation
## of the strings, do:
as.matrix(orf10, use.names=FALSE)