Cancer in silico Drug Discovery

Large-scale cancer data sets such as The Cancer Genome Atlas (TCGA) allow researchers to profile tumors based on a wide range of clinical and molecular characteristics. Subsequently, TCGA-derived gene expression profiles can be analyzed with the Connectivity Map (CMap) to find candidate drugs to target tumors with specific clinical or molecular phenotypes. This represents a powerful computational approach for candidate drug identification, but due to the complexity of TCGA and technology differences between CMap and TCGA experiments, such analyses are challenging to conduct and reproduce. CiDD (Cancer in silico Drug Discovery) is a computational drug discovery platform that addresses these challenges. CiDD integrates data from TCGA, CMap and the Cancer Cell Line Encyclopedia (CCLE) to perform computational drug discovery experiments, generating hypotheses for the following three general problems: 1) determining whether specific clinical or molecular phenotypes induce gene expression signatures, 2) finding candidate drugs to repress these expression signatures, and 3) identifying cell lines that resemble the tumors being studied for subsequent in vitro experiments. The input to CiDD is a clinical or molecular characteristic. The output is a biologically annotated list of candidate drugs to target that characteristic and a list of cell lines for in vitro experimentation. We applied CiDD to identify candidate drugs to treat colorectal cancers harboring mutations in BRAF. CiDD identified EGFR and proteasome inhibitors, while proposing five specific cell lines for in vitro testing. CiDD facilitates phenotype-driven, systematic drug discovery based on clinical and molecular data from TCGA.
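The second problem above, finding drugs that repress a phenotype-induced expression signature, can be illustrated with a toy calculation. The following is a minimal sketch and not CiDD's actual scoring method: it assumes a signature is given as lists of up- and down-regulated genes, that each drug is described by per-gene expression changes (for example log fold changes), and it uses a simple difference of means where a real connectivity analysis would use a rank-based statistic. The gene names, drug names, values, and the functions `reversal_score` and `rank_drugs` are all hypothetical.

```python
from statistics import mean


def reversal_score(up_genes, down_genes, drug_profile):
    """Score how strongly a drug opposes a signature.

    The score is positive when the drug lowers the signature's
    up-regulated genes and raises its down-regulated genes.
    Genes missing from the drug profile are ignored.
    """
    up = [drug_profile[g] for g in up_genes if g in drug_profile]
    down = [drug_profile[g] for g in down_genes if g in drug_profile]
    if not up or not down:
        raise ValueError("drug profile shares no genes with the signature")
    return mean(down) - mean(up)


def rank_drugs(up_genes, down_genes, drug_profiles):
    """Return (drug, score) pairs, strongest signature reversal first."""
    scores = {
        drug: reversal_score(up_genes, down_genes, profile)
        for drug, profile in drug_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical signature: genes up or down in tumors with the phenotype.
up = ["GENE_A", "GENE_B"]
down = ["GENE_C", "GENE_D"]

# Hypothetical drug-induced expression changes (illustrative values only).
profiles = {
    "drug_x": {"GENE_A": -1.0, "GENE_B": -0.5, "GENE_C": 0.8, "GENE_D": 0.4},
    "drug_y": {"GENE_A": 0.9, "GENE_B": 0.7, "GENE_C": -0.6, "GENE_D": -0.2},
    "drug_z": {"GENE_A": 0.1, "GENE_B": -0.1, "GENE_C": 0.0, "GENE_D": 0.2},
}

for drug, score in rank_drugs(up, down, profiles):
    print(f"{drug}\t{score:+.2f}")
```

In this sketch, drugs at the top of the ranking push expression in the direction opposite to the signature and would be the candidates carried forward; drugs with negative scores mimic the signature instead.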

CiDD requires data sets produced by several institutions and working groups. Some of these data sets require registration before download, as described in the user documentation. The TCGA data use policy and publication guidelines call for the responsible use of TCGA data. Please note that by downloading CiDD you acknowledge that you, and any collaborators who use CiDD, will conduct research and publish in accordance with the TCGA guidelines on responsible use of data, available from the NIH.

CiDD was developed by Anthony San Lucas with assistance from Jerry Fowler, Eduardo Vilar Sanchez, and Paul Scheet.

CiDD is freely available under the GNU GPL version 3 license.

Please register to obtain the CiDD software.
