cmp.search {ChemmineR}    R Documentation

Search a descriptor database for compounds similar to query compound

Description

Given the descriptor of a query compound and a database of compound descriptors, search for compounds that are similar to the query compound. The output can be limited by supplying either a similarity cutoff or a maximum number of compounds to return. The function can also return the similarity scores together with the compounds.

Usage

    cmp.search(db, query, cutoff = 0.5, return.score = FALSE, quiet = FALSE,
                    mode = 1, visualize=FALSE, visualize.browse=TRUE, visualize.query=NULL)

Arguments

db The compound descriptor database returned by 'cmp.parse'.
query The query descriptor, which is usually returned by 'cmp.parse1'.
cutoff The cutoff similarity (when cutoff <= 1) or the maximum number of compounds to be returned (when cutoff > 1).
return.score Whether to return similarity scores. If set to TRUE, a data frame will be returned; otherwise, only the compounds' indices in the database will be returned in the order of decreasing scores.
quiet Whether to disable progress information.
mode Mode used when computing similarity scores. This value is passed to 'cmp.similarity'.
visualize Whether to visualize the search result in a webpage.
visualize.browse Whether to open the browser automatically if you choose to visualize the search result.
visualize.query A filename or URL, or a character string containing the SDF of the query structure. Set this if you also want the query to appear in the search result visualization webpage.

Details

'cmp.search' goes through all the compound descriptors in the database and calculates the similarity between the query compound and each compound in the database. When 'cutoff' is a similarity score (cutoff <= 1), compounds with a similarity score higher than the cutoff are returned. When 'cutoff' is a maximum number of compounds N (cutoff > 1), the N compounds with the highest similarity scores are returned.
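The two meanings of 'cutoff' can be illustrated with a small sketch that does not use ChemmineR at all. Everything in it is an assumption made for illustration: the helper names 'tanimoto' and 'sketch_search' are hypothetical, descriptors are plain numeric vectors treated as sets, a Tanimoto coefficient stands in for whatever 'cmp.similarity' computes, and the strict "higher than" comparison simply follows the wording above. It is not the package's implementation.

```r
# Tanimoto coefficient between two descriptor vectors treated as sets.
# Assumption: this stands in for the score computed by 'cmp.similarity'.
tanimoto <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  n_union <- length(union(a, b))
  if (n_union == 0) return(0)
  length(intersect(a, b)) / n_union
}

# Hypothetical sketch of the cutoff logic; not the package's own code.
sketch_search <- function(db_descs, query_desc, cutoff = 0.5,
                          return.score = FALSE) {
  # score every database entry against the query
  scores <- vapply(db_descs, tanimoto, numeric(1), b = query_desc)
  # database indices in order of decreasing score
  ord <- order(scores, decreasing = TRUE)
  if (cutoff <= 1) {
    # similarity cutoff: keep entries scoring higher than the cutoff
    keep <- ord[scores[ord] > cutoff]
  } else {
    # top-N style: keep the N best-scoring entries
    keep <- head(ord, cutoff)
  }
  if (return.score) data.frame(ids = keep, scores = scores[keep]) else keep
}

# toy descriptors, made up for illustration
toy_db <- list(c(1, 2, 3, 4), c(1, 2, 9), c(7, 8), c(1, 2, 3))
toy_query <- c(1, 2, 3)

sketch_search(toy_db, toy_query, cutoff = 0.4)
sketch_search(toy_db, toy_query, cutoff = 2, return.score = TRUE)
```

The first call treats 0.4 as a similarity threshold; the second treats 2 as the number of hits to keep and returns the scores alongside the indices.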

If 'visualize' is set to TRUE, 'sdf.visualize' will be called to send the search results and the scores to the ChemMine website. If 'visualize.browse' is set to TRUE, the browser will open to show the structures in the search result with their corresponding scores. Otherwise, a URL pointing to that webpage will be printed. By default, 'visualize.query' is not set, and the query structure will not be uploaded. To include the query in the visualization webpage as well, set this argument to a character string containing the SDF of the query, or to a filename or URL pointing to a file containing the SDF of the query. If the character string or the file contains multiple SDFs, only the first will be used as the SDF of the query.

Value

When 'return.score' is set to FALSE, a vector of matching compounds' indices in the database will be returned. Otherwise, a data frame will be returned:

ids The indices of matching compounds in the database.
scores The similarity scores between the matching compounds and the query compound.
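As a minimal sketch of working with the data frame form, the snippet below builds a stand-in result by hand rather than calling 'cmp.search'. The values are made up for illustration, and the 0.6 threshold is arbitrary.

```r
# Hand-built stand-in for a result of cmp.search(..., return.score = TRUE);
# the ids and scores are invented for illustration only.
result <- data.frame(ids = c(12L, 3L, 47L), scores = c(0.91, 0.66, 0.42))

# keep only the hits scoring at least 0.6 (arbitrary threshold)
strong <- result[result$scores >= 0.6, ]

# the 'ids' column corresponds to what is returned on its own
# when return.score = FALSE
strong_ids <- strong$ids
```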

Author(s)

Y. Eddie Cao, Li-Chang Cheng

References

Chen X and Reynolds CH (2002). "Performance of similarity measures in 2D fragment-based similarity searching: comparison of structural descriptors and similarity coefficients", in J Chem Inf Comput Sci.

See Also

cmp.parse1, cmp.parse, cmp.cluster, cmp.similarity, sdf.visualize

Examples

# load sample database from web
db <- cmp.parse("http://bioweb.ucr.edu/ChemMineV2/static/example_db.sdf")
# (optionally) save the db for future use
save(db, file="db.rda", compress=TRUE)
# load SDF of query structure from web
url <- "http://bioweb.ucr.edu/ChemMineV2/compound/Aurora/b32:NNQS2MBRHAZTI===/sdf"
query <- cmp.parse1(url)
# search for similar compounds using similarity cutoff
cmp.search(db, query, cutoff=0.4)
# search for similar compounds using similarity cutoff; request to return scores
cmp.search(db, query, cutoff=0.4, return.score=TRUE)
# search for similar compounds using return-the-top-N style
cmp.search(db, query, cutoff=10, return.score=TRUE)

# you may visualize the search result in ChemMine
cmp.search(db, query, cutoff=10, visualize=TRUE, visualize.browse=FALSE, visualize.query=url)

## in the next session, you may load a saved db and do the search:
load("db.rda")
cmp.search(db, query, cutoff=3)
## you may also use the loaded db to do clustering:
cmp.cluster(db, cutoff=0.35)

[Package ChemmineR version 1.2.0 Index]