Translational Medicine in the Age of Data

Billions of clinical measurements are recorded every day and stored in electronic health systems around the world. Each one of these experiments is a window into the human system, and together they create the most comprehensive and diverse medical data set ever imagined. Unfortunately, traditional statistical techniques were not developed to handle such diversity; instead, they excel at analyzing homogeneous data sets with first-order effects. Because of this, these techniques are simply unable to untangle the sophisticated web of biological pathways and genetic interactions governing the human system.

With enormous data comes enormous opportunity

Data Science is a new field dedicated to developing the methods, algorithms, and tools to unravel the complexities of enormous data. In our lab we advance data science by designing rigorous computational and mathematical methods that address the fundamental challenges of health data science. Foremost, we integrate our medical observations with systems and chemical biology models to not only explain drug effects, but also further our understanding of basic biology and human disease.

One particular area of interest is the integration of high-throughput data capture technologies, such as next-generation genome and transcriptome sequencing, metabolomics, and proteomics, with the electronic medical record to study the complex interplay between genetics, environment, and disease.

For more in-depth information on our research areas of interest, see our reviews in WIREs Systems Biology and Medicine, Science Translational Medicine, and Clinical Pharmacology & Therapeutics.

News and Events

Disease Heritability Inferred from Familial Relationships Reported in Medical Records

"On the cover: Vast amounts of medical records are stored in hospitals and clinics around the world. Digitalization of these records has made them available for secondary use in research of human disease and treatment. In this issue of Cell, Polubriaginof et al. (1692–1704) use these records to study disease heritability. The cover image is an artistic representation of the digital transformation process of paper medical records that enables this study."

The future of drug safety is not in clinical trials

In this commentary, Dr. Tatonetti argues for a different approach to monitoring drug safety in the marketplace, one that combines data from spontaneous reporting systems in humans with experiments in model systems. This approach pairs the speed and agility of observational research with the rigor and robustness of prospective trials. Read the whole commentary here.

Featured publications

Alexandre Yahi, Rami Vanguri, Noémie Elhadad, Nicholas P Tatonetti
Generative Adversarial Networks for Electronic Health Records: A Framework for Exploring and Evaluating Methods for Predicting Drug-Induced Laboratory Test Trajectories.
31st Conference on Neural Information Processing Systems (NIPS 2017). Long Beach, CA, USA. Source.

Nicholas P Tatonetti
Translational medicine in the Age of Big Data.
Briefings in Bioinformatics, 2017. Source.

See more publications.


Our lab is in the Department of Biomedical Informatics, the Department of Systems Biology, and the Department of Medicine at Columbia University. We are a member of the Data Science Institute at Columbia.

Prospective graduate students should apply to the Department of Biomedical Informatics Training Program or the Computational Biology Training Program at Columbia.