SAGELyzer {SAGElyzer} | R Documentation |
This function finds the k nearest neighbors of a given SAGE tag based on the expression of SAGE tags across selected SAGE libraries. The calculations are based on data stored in a table in a database.
SAGELyzer(dbArgs, targetSAGE, libs = "*", normalize = "min", tagColName = "tag",
          k = 500, dist = "euclidean", trans = "sqrt")
getSAGESQL(dbArgs, conn, targetSAGE, libs, tagColName, chunk = FALSE,
           cursor = "sageRows", ignorZeros = TRUE, what = c("map", "counts", "info"))
getTotalRNum(dbArgs, conn, tagColName, what = "counts")
getKNN(dbArgs, targetSAGE, libs, tagColName, normalize, k, dist, trans, max = 10000)
noChunkKNN(dbArgs, conn, targetSAGE, libs, tagColName, normalize, k, dist, trans)
chunkKNN(dbArgs, conn, targetSAGE, libs, tagColName, normalize, k, dist, trans,
         rowNum, max = 50000)
findNeighborTags(targetRow, data, k, NF, dist, trans)
getColNames(dbArgs, conn, what = "counts")
dbArgs |
dbArgs a list containing the arguments needed to connect to a
database and run queries against a table. The elements
include a DSN under Windows, or a database name, user name, password,
and host under Unix, plus the names of three tables that will be
used by SAGElyzer |
targetSAGE |
targetSAGE a character string for the SAGE
tag whose neighbors will be sought |
libs |
libs a vector of character strings for the column names
of the database table where SAGE library data are stored |
normalize |
normalize a character string naming the method used to
perform data normalization. Can be "min", "max", or "none" |
tagColName |
tagColName a character string for the column
name of a database table where SAGE tags are stored |
k |
k an integer for the number of nearest neighbors to be
sought |
dist |
dist a character string corresponding to an
existing R object for calculating distances between two data sets |
trans |
trans a character string corresponding to an
existing R object that will be used to transform the data |
conn |
conn a connection to a database |
chunk |
chunk a boolean indicating whether data will be
processed in chunks to avoid running out of space |
ignorZeros |
ignorZeros a boolean indicating whether data
rows with all 0s will be ignored |
what |
what a character string for the type of database
table to use for getting data. Must be either "map", "counts", or
"info" |
max |
max an integer for the maximum number of data rows
in a chunk to be processed |
rowNum |
rowNum an integer for row number |
NF |
NF a vector of numerical data that will be used as
normalization factor for SAGE counts |
targetRow |
targetRow a vector of character strings
containing data for the target SAGE tag |
data |
data a matrix containing SAGE counts across
selected libraries |
cursor |
cursor a character string for the name of a
cursor to retrieve data in chunks from a database table |
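The chunk, max, and cursor arguments describe processing the table in pieces rather than all at once. The following is a minimal in-memory sketch of that idea, not the code used by chunkKNN: the function name chunked_nearest, the plain Euclidean distance, and the simulated data are all assumptions made for illustration. It keeps only the k smallest distances seen so far while walking through the rows max_rows at a time.

```r
# Hypothetical helper (not part of SAGElyzer): find the k smallest
# Euclidean distances to target_row while reading `data` in chunks.
chunked_nearest <- function(data, target_row, k, max_rows) {
  best <- numeric(0)
  starts <- seq(1, nrow(data), by = max_rows)
  for (s in starts) {
    idx <- s:min(s + max_rows - 1, nrow(data))
    chunk <- data[idx, , drop = FALSE]
    # Distance from every row of the chunk to the target row
    d <- sqrt(rowSums(sweep(chunk, 2, target_row, "-")^2))
    # Merge with the running result and keep only the k smallest
    best <- head(sort(c(best, d)), k)
  }
  best
}

# Simulated counts: 10 tags across 4 libraries (made-up data)
set.seed(1)
data <- matrix(rpois(40, 5), nrow = 10,
               dimnames = list(paste0("tag", 1:10), paste0("lib", 1:4)))
target_row <- data["tag1", ]

chunked <- chunked_nearest(data, target_row, k = 3, max_rows = 4)
# Same calculation done in one pass, for comparison
full <- head(sort(sqrt(rowSums(sweep(data, 2, target_row, "-")^2))), 3)
```

Because only k distances are retained between chunks, memory use depends on max_rows and k rather than on the total number of rows.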
Two database tables (default names "sagecounts" and "sageinfo") have to exist (the tables can be created using other functions in this package). One table (sagecounts) contains the counts of SAGE tags for each library, and the other (sageinfo) contains mappings between the column names used in "sagecounts" and the SAGE libraries whose data they store.
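The relationship between the two tables can be pictured with a pair of small data frames. This is only an illustration of the layout described above: the column names colname and library, the library labels, and the counts are invented here and may differ from what the package actually stores.

```r
# Made-up stand-in for the "sagecounts" table: one row per SAGE tag,
# one column per library (column names are hypothetical).
sagecounts <- data.frame(
  tag  = c("AAAAAAAAAA", "CCCCCCCCCC"),
  lib1 = c(4L, 0L),
  lib2 = c(9L, 1L),
  stringsAsFactors = FALSE
)

# Made-up stand-in for the "sageinfo" table: maps each count column
# in sagecounts to the library it holds data for.
sageinfo <- data.frame(
  colname = c("lib1", "lib2"),
  library = c("example library A", "example library B"),
  stringsAsFactors = FALSE
)

# Every count column should have an entry in the mapping table
lib_cols <- setdiff(names(sagecounts), "tag")
mapped <- sageinfo$library[match(lib_cols, sageinfo$colname)]
```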
Functions in this package are normally called by interactive interfaces that are invoked when the package is loaded.
SAGELyzer
returns a named vector with SAGE tags being
the names and the corresponding calculated distances to a given tag
being the values.
getSAGESQL
returns a character string for a SQL
statement to use to query a database.
getTotalRNum
returns an integer for the total number of rows
in a database table.
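As a rough idea of what a query string for the counts table might look like, the sketch below assembles one with paste. It is not the statement produced by getSAGESQL, whose exact form is not documented here; the helper build_count_query and its arguments are hypothetical, and the zero-row filter is only one possible reading of ignorZeros.

```r
# Hypothetical helper (not part of SAGElyzer): build a SELECT statement
# for the tag column and the chosen library columns. Identifiers are
# pasted in as-is, so they must come from a trusted source.
build_count_query <- function(table, tag_col, libs, ignore_zeros = TRUE) {
  cols <- paste(c(tag_col, libs), collapse = ", ")
  sql <- paste("SELECT", cols, "FROM", table)
  if (ignore_zeros) {
    # Skip rows whose counts are 0 in every selected library
    all_zero <- paste(paste(libs, "= 0"), collapse = " AND ")
    sql <- paste0(sql, " WHERE NOT (", all_zero, ")")
  }
  sql
}

query <- build_count_query("sagecounts", "tag", c("lib1", "lib2"))
# query is:
# "SELECT tag, lib1, lib2 FROM sagecounts WHERE NOT (lib1 = 0 AND lib2 = 0)"
```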
Jianhua Zhang
# No example is given as the code requires data with existing tables
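The nearest-neighbor calculation itself can still be illustrated on a small in-memory matrix. The sketch below is not the package's findNeighborTags: the function nearest_tags, the toy counts, and in particular the meaning given to normalize ("min" and "max" taken to rescale each library to the smallest or largest library total) are assumptions. It returns a named vector of distances with tags as names, the same shape described for the return value of SAGELyzer.

```r
# Hypothetical helper (not part of SAGElyzer): k nearest tags to `target`
# in a tags-by-libraries count matrix.
nearest_tags <- function(counts, target, k = 2, normalize = "min",
                         dist = "euclidean", trans = "sqrt") {
  stopifnot(target %in% rownames(counts))
  totals <- colSums(counts)
  # Assumed normalization: rescale libraries to a common total
  nf <- switch(normalize,
               min  = min(totals) / totals,
               max  = max(totals) / totals,
               none = rep(1, ncol(counts)),
               stop("unknown normalize option"))
  scaled <- sweep(counts, 2, nf, "*")
  # `trans` names an existing R function, e.g. "sqrt"
  scaled <- match.fun(trans)(scaled)
  others <- scaled[rownames(scaled) != target, , drop = FALSE]
  # `dist` is passed to stats::dist as the distance method
  d <- apply(others, 1, function(row) {
    as.numeric(stats::dist(rbind(scaled[target, ], row), method = dist))
  })
  head(sort(d), k)
}

# Toy counts for four tags in three libraries (made-up data)
counts <- matrix(c( 4,  9, 16,
                    4,  9, 16,
                    0,  1,  0,
                   16, 36, 64),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("AAAAAAAAAA", "CCCCCCCCCC",
                                   "GGGGGGGGGG", "TTTTTTTTTT"),
                                 c("lib1", "lib2", "lib3")))

res <- nearest_tags(counts, "AAAAAAAAAA", k = 2, normalize = "none")
```

Here res holds the two tags closest to "AAAAAAAAAA" after the square-root transformation, ordered from nearest to farthest.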