matchPWM {Biostrings}    R Documentation

A simple PWM matching function and related utilities

Description

A function implementing a simple algorithm for matching a set of patterns represented by a Position Weight Matrix (PWM) to a DNA sequence. PWMs for amino acid sequences are not supported.
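
The algorithm itself is not spelled out on this page. As a rough illustration only, the following base-R sketch assumes the common interpretation: the score of a window is the sum of the weights selected by the base found at each position, and a window matches when its score reaches the threshold. The helpers score_window and naive_match_pwm are hypothetical names introduced here, not Biostrings functions, and the subject is a plain character string rather than a DNAString.

```r
## Hypothetical helpers for illustration only; not part of Biostrings.
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

## Score of the window of width ncol(pwm) starting at 'start' in 'seq',
## where 'seq' is a plain character string over A, C, G, T.
score_window <- function(pwm, seq, start) {
  bases <- strsplit(substr(seq, start, start + ncol(pwm) - 1L), "")[[1]]
  ## Pick one weight per position: row = base, column = position.
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_along(bases))])
}

## Start positions of all windows whose score is at least 'min.score'
## (an absolute score, assumed here to be inclusive).
naive_match_pwm <- function(pwm, seq, min.score) {
  starts <- seq_len(max(0L, nchar(seq) - ncol(pwm) + 1L))
  scores <- vapply(starts, function(s) score_window(pwm, seq, s), numeric(1))
  starts[scores >= min.score]
}

score_window(pwm, "AGTAAACAA", 2)      # 139
naive_match_pwm(pwm, "AGTAAACAA", 112) # 2
```

This scans every window explicitly and is meant to show the scoring rule, not to be fast; for real subjects use matchPWM.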

Usage

  matchPWM(pwm, subject, min.score="80%")
  countPWM(pwm, subject, min.score="80%")

  ## Utility functions for basic manipulation of the Position Weight Matrix
  maxWeights(pwm)
  maxScore(pwm)
  #reverseComplement(x, ...) # S4 method for matrix objects

Arguments

pwm         A Position Weight Matrix (integer matrix with row names A, C, G and T).
subject     A DNAString object containing the subject sequence.
min.score   The minimum score for counting a match. Can be given as a percentage (e.g. "85%") of the highest possible score or as an integer.
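
To make the two forms of min.score concrete, here is a minimal base-R sketch of how a percentage string could be turned into an absolute threshold. The function min_score_to_threshold is a hypothetical helper, and the way matchPWM itself parses or rounds the percentage is an assumption not confirmed by this page.

```r
## Hypothetical helper for illustration only; not part of Biostrings.
min_score_to_threshold <- function(pwm, min.score) {
  ## Highest possible score: sum of the largest weight in each column.
  max.score <- sum(apply(pwm, 2, max))
  if (is.character(min.score)) {
    ## "80%" -> 80, then take that percentage of the highest score.
    pct <- as.numeric(sub("%$", "", min.score))
    max.score * pct / 100
  } else {
    ## Already an absolute score: use it unchanged.
    min.score
  }
}

## Toy 3-position PWM whose highest possible score is 5 + 3 + 2 = 10.
toy <- rbind(A=c(5, 0, 1),
             C=c(0, 3, 0),
             G=c(0, 0, 2),
             T=c(1, 1, 0))

min_score_to_threshold(toy, "80%")  # 8
min_score_to_threshold(toy, 7)      # 7
```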

Value

An XStringViews object for matchPWM.
A single integer for countPWM.
An integer vector containing the max weight for each position in pwm for maxWeights.
The highest possible score for a given Position Weight Matrix for maxScore.
A PWM obtained by reversing the column order of the PWM x and reassigning each row to its complementary nucleotide for reverseComplement.
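
The values described above can be reproduced with plain matrix operations. The sketch below uses only base R and follows the descriptions in this section; it is not the package's implementation, and the variable names are chosen here for illustration.

```r
pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
             C=c( 1,  0,  1,  0,  1, 18,  0,  2),
             G=c(17,  0,  0,  0,  1,  0,  0,  3),
             T=c( 1, 20,  0,  0,  0,  1,  0,  8))

## Max weight for each position (column), as described for maxWeights.
max_weights <- apply(pwm, 2, max)   # 17 20 19 20 18 18 20 8

## Highest possible score, as described for maxScore.
max_score <- sum(max_weights)       # 140

## Reverse complement: reverse the column order and give each row the
## label of its complementary nucleotide (A <-> T, C <-> G).
rc <- pwm[c("T", "G", "C", "A"), ncol(pwm):1]
rownames(rc) <- c("A", "C", "G", "T")
rc
```

In rc, row A holds the original T weights in reverse order, row C the original G weights, and so on, so scoring rc against the plus strand corresponds to scoring pwm against the minus strand.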

See Also

matchPattern, reverseComplement, DNAString-class, XStringViews-class

Examples

  pwm <- rbind(A=c( 1,  0, 19, 20, 18,  1, 20,  7),
               C=c( 1,  0,  1,  0,  1, 18,  0,  2),
               G=c(17,  0,  0,  0,  1,  0,  0,  3),
               T=c( 1, 20,  0,  0,  0,  1,  0,  8))
  maxWeights(pwm)
  maxScore(pwm)
  reverseComplement(pwm)

  subject <- DNAString("AGTAAACAA")
  PWMscore(pwm, subject, c(2:1, NA))

  library(BSgenome.Dmelanogaster.UCSC.dm3)
  chr3R <- unmasked(Dmelanogaster$chr3R)
  chr3R

  ## Match the plus strand
  matchPWM(pwm, chr3R)
  countPWM(pwm, chr3R)

  ## Match the minus strand
  matchPWM(reverseComplement(pwm), chr3R)

[Package Biostrings version 2.10.22 Index]