Bibliography
- Milne GM. Pharmaceutical productivity – the imperative for new paradigms. Annu Rep Med Chem 2003;38:383-96
- Slater T, Bouton C, Huang ES. Beyond data integration. Drug Discov Today 2008;13:584-9
- Williams AJ. A perspective of publicly accessible/open-access chemistry databases. Drug Discov Today 2008;13:495-501
- Williams AJ. Public chemical compound databases. Curr Opin Drug Discov Devel 2008;11:393-404
- Bolton EE, Wang Y, Thiessen PA, Bryant SH. PubChem: integrated platform of small molecules and biological activities. Annu Rep Comput Chem 2008;4:217-41
- PubChem substance data source information. Available from: http://pubchem.ncbi.nlm.nih.gov/sources/sources.cgi. [Last accessed 16 June 2009]
- Irwin JJ, Shoichet BK. ZINC – a free database of commercially available compounds for virtual screening. J Chem Inf Model 2005;45:177-82
- ChemSpider. Available from: http://www.chemspider.com. [Last accessed 16 June 2009]
- eMolecules. Available from: http://www.emolecules.com. [Last accessed 16 June 2009]
- Apweiler R, Bairoch A, Wu CH. Protein sequence databases. Curr Opin Chem Biol 2004;8:76-80
- Uniprot Consortium. The universal protein resource. Nucleic Acids Res 2007;35:193-7
- Uniprot. Available from: http://www.uniprot.org. [Last accessed 22 June 2009]
- Bernstein FC, Koetzle TF, Williams GJB, et al. The protein data bank: a computer-based archival file for macromolecular structures. J Mol Biol 1977;112:535-42
- Research Collaboratory for Structural Bioinformatics. Available from: http://home.rcsb.org. [Last accessed 22 June 2009]
- The protein data bank. Available from: http://www.pdb.org. [Last accessed 22 June 2009]
- Schaefer CF. Pathway databases. Ann NY Acad Sci 2004;1020:77-91
- Kanehisa M, Araki M, Goto S, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res 2008;36:480-4
- Kanehisa M, Goto S, Hattori M, et al. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res 2006;34:354-7
- Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27-30
- Karp PD, Ouzounis CA, Moore-Kochlacs C, et al. Expansion of the BioCyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Res 2005;33(19):6083-9
- Cochrane G, Akhtar R, Bonfield J, et al. Petabyte-scale innovations at the European Nucleotide Archive. Nucleic Acids Res 2009;37(Database issue):D19-25
- Benson DA, Karsch-Mizrachi I, Lipman DJ, et al. GenBank. Nucleic Acids Res 2008;36:25-30
- NCBI nucleotide databases. Available from: http://www.ncbi.nlm.nih.gov/About/tools/restable_nuc.html. [Last accessed 29 June 2009]
- Dalkilic M, Costello J, Clark WT, Radivojac P. From protein-disease associations to disease informatics. Front Biosci 2008;13:3391-407
- Radivojac P, Peng K, Clark WT, et al. An integrated approach to inferring gene-disease associations in humans. Proteins Struct Funct Bioinform 2008;72:1030-7
- Disease gene database. Available from: http://www.proteinlounge.com/disease_proteins.asp. [Last accessed 26 June 2009]
- Thorisson GA, Muilu J, Brookes AJ. Genotype–phenotype databases: challenges and solutions for the post-genomic era. Nat Rev Genet 2009;10:9-18
- Kahraman A, Avramov A, Nashev LG, et al. PhenomicDB: a multi-species genotype/phenotype database for comparative phenomics. Bioinformatics 2005;21:418-20
- Wild DJ, Beckman R. The future of chemical information searching. In: Banville D, editor, Chemical information mining: facilitating literature-based discovery. CRC Press; 2008
- PubMed. Available from: http://www.ncbi.nlm.nih.gov/pubmed. [Last accessed 22 June 2009]
- PubMed Central. Available from: http://pubmedcentral.nih.gov. [Last accessed 22 June 2009]
- NIH public access policy. Available from: http://publicaccess.nih.gov. [Last accessed 22 June 2009]
- Han J, Kamber M. Data mining: concepts and techniques. 1st edition. Morgan Kaufmann; 2000
- Wang H, Klinginsmith J, Dong X, et al. Chemical data mining of the NCI human tumor cell line database. J Chem Inf Model 2007;47(6):2063-76
- Brown N. Chemoinformatics – an introduction for computer scientists. ACM Comput Surv 2009;41:2
- Altschul SF, Madden TL, Schaffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997;25:3389-402
- MacQueen JB. Some methods for classification and analysis of multivariate observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability 1967;281-97
- Kaufman L, Rousseeuw PJ. Finding groups in data: an introduction to cluster analysis. John Wiley; 1990
- Ng RT, Han J. Efficient and effective clustering methods for spatial data mining. 1994 International Conference Very Large Data Bases (VLDB'94), Santiago, Chile 1994:144-55
- Zhang T, Ramakrishnan R, Livny M. BIRCH: an efficient data clustering method for very large databases. 1996 ACM-SIGMOD International Conference Management of Data (SIGMOD'96), Montreal, Canada. 1996:103-14
- Downs GM, Barnard JM. Clustering methods and their uses in computational chemistry. Rev Comput Chem 2002;18:1-40
- Guha S, Rastogi R, Shim K. CURE: an efficient clustering algorithm for large databases. 1998 ACM-SIGMOD International Conference Management of Data (SIGMOD'98), Seattle, WA. 1998:73-84
- Karypis G, Han EH, Kumar V. CHAMELEON: a hierarchical clustering algorithm using dynamic modeling. Computer 1999;68-75
- Ester M, Kriegel HP, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases. 1996 International Conference of Knowledge Discovery and Data Mining (KDD'96), Portland, OR. 1996:226-31
- Ankerst M, Breunig MM, Kriegel HP, Sander J. OPTICS: ordering points to identify the clustering structure. 1999 ACM-SIGMOD International Conference Management of Data (SIGMOD'99) Philadelphia, PA. 1999:49-60
- Hoschka P, Klösgen W. A support system for interpreting statistical data. In: Knowledge discovery in databases. AAAI/MIT Press; 1991:325-46
- Wang W, Yang J, Muntz RR. STING: a statistical information grid approach to spatial data mining. 1997 International Conference of Very Large Databases (VLDB'97), Athens, Greece. 1997:186-95
- Sheikholeslami G, Chatterjee S, Zhang A. WaveCluster: a multi-resolution clustering approach for very large spatial databases. 1998 International Conference of Very Large Data Bases (VLDB'98), New York. 1998:428-39
- Agrawal R, Gehrke J, Gunopulos D, Raghavan P. Automatic subspace clustering of high dimensional data for data mining applications. 1998 ACM-SIGMOD International Conference Management of Data (SIGMOD'98), Seattle, WA. 1998:94-105
- Wild DJ, Blankley CJ. Comparison of 2D fingerprint types and hierarchy level selection methods for structural grouping using Ward's clustering. J Chem Inf Comput Sci 2000;40:155-62
- Wild DJ, Blankley CJ. VisualiSAR: a web-based application for clustering, structure browsing and SAR study. J Mol Graph Model 1999;17:85-9
- Brown RD, Martin YC. Use of structure-activity data to compare structure-based clustering methods and descriptors for use in compound selection. J Chem Inf Comput Sci 1996;36:572-84
- Sturn A, Quackenbush J, Trajanoski Z. Genesis: cluster analysis of microarray data. Bioinformatics 2002;18:207-8
- Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. PNAS 1998;95(25):14863-8
- Witten IH, Frank E. Data mining: practical machine learning tools and techniques. 2nd edition. Morgan Kaufmann; 2006
- Carlin BP, Louis TA. Bayesian methods for data analysis. 3rd edition. Chapman & Hall/CRC; 2008
- Liaw A, Wiener M. Classification and regression by randomForest. R News 2002;2(3):18-22
- Rusinko III A, Farmen MW, Lambert CG, et al. Analysis of a large structure/biological activity data set using recursive partitioning. J Chem Inf Comput Sci 1999;39(6):1017-26
- Ihaka R, Gentleman R. R: a language for data analysis and graphics. J Comput Graphical Stat 1996;5:299-314
- SPSS. Available from: http://www.spss.com. [Last accessed 6 July 2009]
- Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases. ACM-SIGMOD International Conference Management of Data (SIGMOD'93). 1993:207-16
- Kersey P, Apweiler R. Linking publication, gene and protein data. Nat Cell Biol 2006;8:11
- Durinck S, Moreau Y, Kasprzyk A, et al. BioMart and bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 2005;21:3439-40
- Xtractor: data mining simplified. Available from: http://www.xtractor.in. [Last accessed 22 June 2009]
- Belleau F, Nolin M, Tourigny N, et al. Bio2RDF: towards a mashup to build bioinformatics knowledge systems. J Biomed Inform 2008;41(5):706-16
- Royal society of chemistry project prospect. Available from: http://www.projectprospect.org. [Last accessed 22 June 2009]
- Wishart DS, Knox C, Guo AC, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res 2008;36(Database issue):D901-6
- Schreyer A, Blundell T. CREDO: a protein-ligand interaction database for drug discovery. Chem Biol Drug Des 2009;73(2):157-67
- Günther S, Kuhn M, Dunkel M, et al. SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res 2008;36(Database issue):D919-22
- Berners-Lee T, Hendler J, Lassila O. The semantic web. Sci Am 2001;284(5):34-43
- Hendler J, Berners-Lee T, Miller E. Integrating applications on the semantic web. J Inst Electrical Eng Japan 2002;122(10):676-80
- XML. Available from: http://www.w3.org/XML. [Last accessed 4 August 2009]
- Murray-Rust P, Rzepa HS. Chemical markup language and XML part I. Basic principles. J Chem Inf Comput Sci 1999;39(6):928-42
- Hucka M, Finney A, Sauro HM, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 2003;19(4):524-31
- XML subsets for the life sciences. Available from: http://www.visualgenomics.ca/gordonp/xml/. [Last accessed 4 August 2009]
- OWL. Available from: http://www.w3.org/TR/OWL-guide. [Last accessed 4 August 2009]
- RDF. Available from: http://www.w3.org/RDF. [Last accessed 4 August 2009]
- Gardner SP. Ontologies and semantic data integration. Drug Discov Today 2005;10(14):1001-7
- RSS guide. Available from: http://www.xml.com/pub/a/2002/12/18/dive-into-xml.html. [Last accessed 4 August 2009]
- Murray-Rust P, Rzepa HS. Towards the chemical semantic web. An introduction to RSS. Internet J Chem 2003;6:Article 4
- WSDL. Available from: http://www.w3.org/TR/WSDL. [Last accessed 4 August 2009]
- SOAP. Available from: http://www.w3.org/TR/SOAP. [Last accessed 4 August 2009]
- REST wiki. Available from: http://rest.blueoxen.net/cgi-bin/wiki.pl. [Last accessed 4 August 2009]
- UDDI. Available from: http://uddi.xml.org. [Last accessed 4 August 2009]
- Hendler J. Is there an intelligent agent in your future? Nat Web Matters 1999
- Wooldridge M, Jennings N. Intelligent agents: theory and practice. Knowledge Eng Rev 1995;10(2)
- An inference engine for RDF. Available from: http://www.agfa.com/w3c/2002/02/thesis/An_inference_engine_for_RDF.html. [Last accessed 4 August 2009]
- Dong X, Gilbert KE, Guha R, et al. Web service infrastructure for chemoinformatics. J Chem Inf Model 2007;47:1303-7
- Hur J, Wild DJ. PubChemSR: a search and retrieval tool for PubChem. Chem Cent J 2008;2:11
- Willighagen E, O'Boyle NM, Gopalakrishnan H, et al. Userscripts for the life sciences. BMC Bioinformatics 2007;8:487
- Torrey Path. Available from: http://www.torreypath.com. [Last accessed 26 June 2009]
- Wild DJ. Strategies for using information effectively in early-stage drug discovery. In: Ekins S, editor, Computer applications in pharmaceutical research and development. Wiley-Interscience, Hoboken; 2006
- Hassan M, Brown RD, Varma-O'Brien S, Rogers D. Cheminformatics analysis and learning in a data pipelining environment. Mol Divers 2006;10:283-99
- Berthold MR, Cebron N, Dill F, et al. KNIME: The Konstanz Information Miner. In: Preisach, et al, editor, Data analysis, machine learning and applications. Springer; 2008
- Hull D, Wolstencroft K, Stevens R, et al. Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics 2004;20:3045-54
- Inforsense. Available from: http://www.inforsense.com. [Last accessed 6 July 2009]
- Dong X, Wild DJ. An automatic drug discovery workflow generation tool using semantic web technologies. Proceedings of the 4th IEEE conference on eScience. 2008:652-7
- Wild DJ. Grand challenges for cheminformatics. J Cheminformatics 2009;1:1