Chagas cardiomyopathy prediction using statistical models.

Work Group

Computational Biology

Project Lead(s)

Andrew Ghazi, PhD

Project Status

Complete

Project Deliverables

To analyze and leverage the data that Dr. Seidman’s group has on hand to better understand progression to Chagas cardiomyopathy.

Collaborator(s)

Jonathan G. Seidman, PhD

HMS Department

Genetics

Project Description

CCB worked with the Seidman Lab to process, explore, and analyze the data that the lab has collected. This involved cleaning the data as appropriate and re-examining the epitope quantification pipeline and batch effect correction steps. We tested a suite of standard machine learning techniques, using random forests as a starting point, to develop a predictive classifier that uses the epitope data and other available covariates to distinguish between the Chagas cardiomyopathy and indeterminate patient categories.
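As a rough illustration of the random forest starting point described above, the sketch below fits a baseline classifier with scikit-learn and scores it by stratified cross-validation. It is not the project's actual code: the data are synthetic, and the function names, matrix dimensions, label encoding, and hyperparameters are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score


def make_synthetic_cohort(n_patients=120, n_epitopes=50, seed=0):
    """Simulate an epitope-enrichment matrix. This is NOT real patient data."""
    rng = np.random.default_rng(seed)
    # Hypothetical label encoding: 1 = cardiomyopathy, 0 = indeterminate.
    y = rng.integers(0, 2, size=n_patients)
    X = rng.normal(size=(n_patients, n_epitopes))
    # Shift the first five simulated epitopes in cases so there is signal to find.
    X[y == 1, :5] += 1.0
    return X, y


def evaluate_baseline(X, y, seed=0):
    """Return per-fold ROC AUC for a random forest under stratified 5-fold CV."""
    model = RandomForestClassifier(
        n_estimators=200,          # assumed value, not taken from the project
        class_weight="balanced",   # guards against unequal class sizes
        random_state=seed,
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=cv, scoring="roc_auc")


X, y = make_synthetic_cohort()
auc_per_fold = evaluate_baseline(X, y)
print(f"mean ROC AUC: {auc_per_fold.mean():.2f}")
```

In a real analysis the feature matrix would hold the batch-corrected epitope values plus any clinical covariates, and other model families could be swapped in for `RandomForestClassifier` while keeping the same cross-validation splits for a like-for-like comparison.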

Outcomes

We reviewed and updated the Seidman group’s QC pipeline for a PhIP-Seq assay focused on cardiomyopathy patients. The newly corrected data was then used as input to a variety of machine learning models aimed at predicting disease status.