cmp.parse {ChemmineR} | R Documentation |
'cmp.parse' takes an SDF file, parses all the compounds encoded in it, computes their atom-pair descriptors, and returns the descriptors as a list. The list has two named elements, 'descdb' and 'cids'. 'descdb' is a vector of descriptors, and 'cids' is a list of the names of the compounds found in the SDF file. The returned list is usually used as a database against which a similarity search can be performed with the 'cmp.search' function. 'cmp.parse' always parses every compound in the SDF file; to parse a single compound, use 'cmp.parse1' instead.
cmp.parse(filename, quiet=FALSE)
filename |
The file name of the SDF file |
quiet |
Whether to suppress the output of progress information |
The 'filename' can be a local file or a URL. The function is interactive and displays the parsing progress. Because parsing also computes the atom-pair descriptors, it is time-consuming. At the end of parsing, you are reminded to save the result for future use.
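Because parsing is slow, it can help to parse once and reuse the saved result. The following is a minimal sketch of that pattern, not part of ChemmineR: the helper name 'load_or_parse', its 'cache' and 'parser' arguments, and the default cache file name are all assumptions made for illustration. It relies only on 'cmp.parse' as documented here and on base R 'save' and 'load'.

```r
# Hypothetical helper (not part of ChemmineR): parse an SDF file once and
# reuse the saved result in later sessions.
load_or_parse <- function(filename, cache = "db.rda",
                          parser = ChemmineR::cmp.parse) {
  if (file.exists(cache)) {
    # load() restores objects under their saved names; use a private
    # environment so nothing in the caller's workspace is overwritten.
    env <- new.env()
    load(cache, envir = env)
    return(env$db)
  }
  # No cache yet: run the slow parse, then save it under the name 'db'.
  db <- parser(filename, quiet = TRUE)
  save(db, file = cache, compress = TRUE)
  db
}
```

A call such as load_or_parse("http://bioweb.ucr.edu/ChemMineV2/static/example_db.sdf") would parse on the first run and read 'db.rda' afterwards. The 'parser' argument exists only so the slow step can be swapped out, for example when trying the helper without network access.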
Returns a list that can be used as the database against which a similarity search is performed. The 'cmp.search' and 'cmp.cluster' functions both expect a database returned by 'cmp.parse'.
descdb |
A vector containing the descriptors for all the compounds. |
cids |
Compound ID information found in the SDF file: the first line of each compound's SDF record. |
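The structure of the returned list can be checked before passing it to other functions. The sketch below is illustrative only: 'check_db' is a hypothetical helper, the one-entry-per-compound relationship between 'descdb' and 'cids' is an assumption drawn from the descriptions above, and 'mock_db' is a hand-made stand-in with placeholder values rather than real atom-pair descriptors.

```r
# Hypothetical helper (not part of ChemmineR): sanity-check a parsed database.
check_db <- function(db) {
  # The result should be a list with both documented elements present.
  stopifnot(is.list(db), all(c("descdb", "cids") %in% names(db)))
  # Assumption: one descriptor entry per compound ID.
  stopifnot(length(db$descdb) == length(db$cids))
  length(db$cids)  # number of compounds in the database
}

# Hand-made stand-in with placeholder values, not real descriptors.
mock_db <- list(descdb = list(c(1L, 2L), c(2L, 3L)),
                cids   = c("cmp1", "cmp2"))
check_db(mock_db)
```

With a real database, the same call would be check_db(db) on the value returned by 'cmp.parse'.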
Y. Eddie Cao, Li-Chang Cheng
Chen X and Reynolds CH (2002). "Performance of similarity measures in 2D fragment-based similarity searching: comparison of structural descriptors and similarity coefficients", in J Chem Inf Comput Sci.
cmp.parse1, cmp.search, cmp.cluster, cmp.similarity
# load sample database from web
db <- cmp.parse("http://bioweb.ucr.edu/ChemMineV2/static/example_db.sdf")

# (optionally) save the db for future use
save(db, file="db.rda", compress=TRUE)

# ...
# later, in a separate session, you can load it back:
load("db.rda")