align-utils {Biostrings}    R Documentation

Utility functions related to sequence alignment

Description

A collection of utility functions for working with sequence alignments.

Usage

  mismatchTable(x, shiftLeft=0L, shiftRight=0L, ...)
  mismatchSummary(x, ...)
  ## S4 method for signature 'AlignedXStringSet':
  coverage(x, start=NA, end=NA, weight=1L)
  ## S4 method for signature 'PairwiseAlignedFixedSubject':
  coverage(x, start=NA, end=NA, weight=1L)
  compareStrings(pattern, subject)
  ## S4 method for signature 'character':
  consensusMatrix(x, freq=FALSE)
  ## S4 method for signature 'XStringSet':
  consensusMatrix(x, baseOnly=FALSE, freq=FALSE)
  consensusString(x)

Arguments

x A character vector or matrix, XStringSet, XStringViews, PairwiseAlignedFixedSubject, or list of FASTA records containing equal-length strings.
shiftLeft, shiftRight A non-positive and a non-negative integer, respectively, specifying how many characters before and after the mismatch position to include in the mismatch substrings.
... Further arguments to be passed to or from other methods.
start, end See ?coverage.
weight An integer vector specifying how much each element in x counts.
pattern, subject The strings to compare. Can be of type character, XString, XStringSet, AlignedXStringSet, or, in the case of pattern, PairwiseAlignedFixedSubject. If pattern is a PairwiseAlignedFixedSubject object, then subject must be missing.
baseOnly TRUE or FALSE. If TRUE, the returned matrix only contains frequencies for the letters in the "base" alphabet, i.e., "A", "C", "G", "T" if x is a "DNA input", and "A", "C", "G", "U" if x is an "RNA input". When x is a BString object (or an XStringViews object with a BString subject, or a BStringSet object), the baseOnly argument is ignored.
freq If TRUE, then letter frequencies (per position) are reported, otherwise counts.

Details

mismatchTable returns a data.frame containing the positions and substrings of the mismatches for the AlignedXStringSet or PairwiseAlignedFixedSubject object.
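
To make the role of shiftLeft and shiftRight concrete, here is a minimal base R sketch of how a mismatch substring could be cut out around a mismatch position. The helper mismatch_window is hypothetical and not part of Biostrings, and clamping the window to the ends of the string is an assumption made for this illustration only.

```r
## Hypothetical helper (not a Biostrings function): extract the substring
## around a mismatch position, widened by shiftLeft and shiftRight.
mismatch_window <- function(s, pos, shiftLeft = 0L, shiftRight = 0L) {
  stopifnot(shiftLeft <= 0L, shiftRight >= 0L)
  ## Assumption: the window is clamped to the ends of the string.
  start <- pmax(pos + shiftLeft, 1L)
  end <- pmin(pos + shiftRight, nchar(s))
  substring(s, start, end)
}

## With no shift, only the mismatching character itself is returned.
mismatch_window("ACGTACGT", 4L)
## One preceding and two succeeding characters around position 4.
mismatch_window("ACGTACGT", 4L, shiftLeft = -1L, shiftRight = 2L)
```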

mismatchSummary returns a list of data.frame objects containing counts and frequencies of the mismatches for the AlignedXStringSet or PairwiseAlignedFixedSubject object.

compareStrings combines two equal-length strings that are assumed to be aligned into a single character string in which mismatches are replaced with "?", insertions with "+", and deletions with "-".
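
The combination rule can be sketched in base R as follows. The helper compare_sketch is hypothetical and only illustrates the encoding. It assumes that a gap ("-") in the first string marks a deletion and a gap in the second string marks an insertion; this is not a substitute for compareStrings itself.

```r
## Hypothetical helper (not a Biostrings function): combine two aligned,
## equal-length strings into one string that encodes their differences.
compare_sketch <- function(pattern, subject) {
  stopifnot(nchar(pattern) == nchar(subject))
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  out <- p                                    # start from the first string
  ## Assumption: a gap in the second string is reported as an insertion.
  out[p != "-" & s == "-"] <- "+"
  ## Two different letters at the same position are a mismatch.
  out[p != "-" & s != "-" & p != s] <- "?"
  ## Gaps in the first string are left as "-" (deletions).
  paste(out, collapse = "")
}

compare_sketch("ACT-GA", "AGTC-A")
```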

consensusMatrix computes a consensus matrix for a set of equal-length strings that are assumed to be aligned.
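
As an illustration of what such a matrix holds, the base R sketch below counts letters per alignment position, with one row per letter and one column per position. The helper consensus_matrix_sketch is hypothetical; it handles plain character vectors only and ignores the baseOnly argument.

```r
## Hypothetical helper (not a Biostrings function): per-position letter
## counts (or frequencies) for a set of aligned, equal-length strings.
consensus_matrix_sketch <- function(x, freq = FALSE) {
  stopifnot(length(unique(nchar(x))) == 1L)
  ## One row per string, one column per alignment position.
  chars <- do.call(rbind, strsplit(x, ""))
  letters_seen <- sort(unique(as.vector(chars)))
  counts <- matrix(0L, nrow = length(letters_seen), ncol = ncol(chars),
                   dimnames = list(letters_seen, NULL))
  for (j in seq_len(ncol(chars))) {
    counts[, j] <- tabulate(match(chars[, j], letters_seen),
                            nbins = length(letters_seen))
  }
  ## Every column sums to length(x), so dividing gives frequencies.
  if (freq) counts / length(x) else counts
}

aligned <- c("ACGT", "ACGA", "TCGA")
consensus_matrix_sketch(aligned)
consensus_matrix_sketch(aligned, freq = TRUE)
```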

consensusString builds a consensus string from the consensus matrix by a 50% + 1 vote at each position; positions with no winning letter are labeled with "?".
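
The voting rule can be sketched in base R as below. The helper consensus_string_sketch is hypothetical, and it reads "50% + 1" as "strictly more than half of the strings at that position", which is an interpretation rather than a statement about the actual implementation.

```r
## Hypothetical helper (not a Biostrings function): derive a consensus
## string from a count matrix (rows = letters, columns = positions).
consensus_string_sketch <- function(counts) {
  totals <- colSums(counts)
  winner <- apply(counts, 2, which.max)   # row index of the top letter
  top <- apply(counts, 2, max)            # count of the top letter
  ## Assumption: a letter wins only with a strict majority of the votes.
  out <- ifelse(top > totals / 2, rownames(counts)[winner], "?")
  paste(out, collapse = "")
}

## Small hand-made count matrix for three strings of length 4.
counts <- matrix(c(2L, 0L, 0L, 1L,
                   0L, 3L, 0L, 0L,
                   0L, 0L, 3L, 0L,
                   1L, 0L, 1L, 1L),
                 nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))
consensus_string_sketch(counts)
```

In the last column no letter reaches a majority, so that position is reported as "?".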

See Also

pairwiseAlignment, XString-class, XStringSet-class, XStringViews-class, AlignedXStringSet-class, PairwiseAlignedFixedSubject-class, match-utils

Examples

  ## Compare two globally aligned strings
  string1 <- "ACTTCACCAGCTCCCTGGCGGTAAGTTGATC---AAAGG---AAACGCAAAGTTTTCAAG"
  string2 <- "GTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATC"
  compareStrings(string1, string2)

  ## Create a consensus matrix
  nw1 <-
    pairwiseAlignment(AAStringSet(c("HLDNLKGTF", "HVDDMPNAL")), AAString("SMDDTEKMSMKL"),
      substitutionMatrix = "BLOSUM50", gapOpening = -3, gapExtension = -1)
  consensusMatrix(nw1)

  ## Examine the consensus between the bacteriophage phi X174 genomes
  data(phiX174Phage)
  phageConsmat <- consensusMatrix(phiX174Phage, baseOnly = TRUE)
  phageDiffs <- which(apply(phageConsmat, 2, max) < length(phiX174Phage))
  phageDiffs
  phageConsmat[,phageDiffs]

  ## Read in ORF data
  file <- system.file("extdata", "someORF.fa", package="Biostrings")
  orf <- read.DNAStringSet(file, "fasta")

  ## To illustrate, the following example assumes the ORF data
  ## to be aligned for the first 10 positions (patently false):
  orf10 <- DNAStringSet(orf, end=10)
  consensusMatrix(orf10, baseOnly=TRUE, freq=TRUE)
  consensusString(sort(orf10)[1:5])

  ## For the character matrix containing the "exploded" representation
  ## of the strings, do:
  as.matrix(orf10, use.names=FALSE)

[Package Biostrings version 2.10.22 Index]