Projects

Collaborate on a Project with Us

 

CCB is here to support your data science and computational needs. If you are interested in collaborating or need related assistance, please email the details to ccbhelp@hms.harvard.edu.

Additional information can be found on our Collaboration Page.


Browse Projects

Project Title | Work Group(s) | Status | Collaborator(s) | Project Deliverables
OpenGWAS Phenotype Mapping
  • Knowledge Representation
Ongoing
  • MRC Integrative Epidemiology Unit, University of Bristol
Tool for ontology mapping, and mappings of OpenGWAS phenotypes
23andMe Phenotype Mapping
  • Knowledge Representation
Ongoing
  • 23andMe
Mappings of 23andMe GWAS phenotypes
Benchmark of Ontology Mapping Tools
  • Knowledge Representation
Ongoing
  • Kohane Lab, Department of Biomedical Informatics, Harvard Medical School
Survey and benchmark test of ontology mapping tools
Programmatic Interface to HuBMAP Ontology
  • Knowledge Representation
Ongoing
  • Cyberinfrastructure for Network Science Center, Indiana University
Tool for programmatic interaction with HuBMAP ontology
Single-cell characterization of acute inflammation in patients with COVID-19
  • Computational Biology
Ongoing
  • Jonathan Kagan, PhD
1) An assessment of the quality of the scRNA-seq data obtained for the first 10 samples; 2) a characterization of the cell type composition of the first 10 samples; and 3) a statistical analysis of differential cell type abundance and investigation of T-cell receptor sequencing data and cytokine measurements between samples for defined contrasts.
Transparency In Coverage
  • Data & Analytics Platforms
Ongoing
  • Mike Chernew, PhD
Data warehouse for insurance cost-sharing data
HSDM Research Data Repository
  • Data & Analytics Platforms
Ongoing
  • Jane Barrow
Data warehouse containing copy of production EHR data to support research activities
Research Design and Analysis - R Component
  • Education
Ongoing
  • Catherine Hayes, DMD, SM, DMSc
R programming workshop as a supplementary component to the Research Design and Analysis Course
Identification of cell-cell interactions from high-resolution spatial transcriptomics data
  • Computational Biology
Ongoing
  • Martin Hemberg, PhD
Develop an open-source Python package and publish it on the Python Package Index (PyPI)
Aging intervention multi-omics data integration and biomarker discovery
  • Computational Biology
Ongoing
  • Lee Rubin, PhD
To develop an R/Bioconductor experimental data package, integrate signals from multiple aging intervention omics datasets and identify biomarker sets through machine learning and gene set analysis, and develop a data portal that enables interactive exploration of the data.
Extensions and updates to the BioPlex R and Python packages
  • Computational Biology
Ongoing
  • Edward Huttlin, PhD
Update the BioPlex R and Python packages to provide access to newly generated PPI, TMT, and PTM datasets
C. elegans Database
  • Data & Analytics Platforms
Complete
  • Max Heiman, PhD
Database
Proteome-scale protein-protein interaction networks from the BioPlex project
  • Computational Biology
Complete
  • Steven Gygi, PhD
  • Wade Harper, PhD
R & Python packages
RNA sequencing atlas of vascular endothelial cells
  • Computational Biology
Complete
  • Ulrich von Andrian
O2-based RNA-seq pipeline & interactive data exploration platform
Harvey Mudd College CS Clinic 21/22
  • Knowledge Representation
Complete
  • Harvey Mudd College
Components of ontology mapping tool
Multiplexed Error-Robust Fluorescence In Situ Hybridization (MERFISH)
  • Computational Biology
Complete
  • Jeffrey Moffitt, PhD
R/Bioconductor package, repository of applications and downstream analyses, and interactive gallery of publicly available MERFISH datasets
AlphaFold & ColabFold
  • Computational Biology
Complete
  • HMS Research Computing
  • Edward Huttlin, PhD
Modules on the O2 HPC cluster
Drugging the undruggable – machine-learning-based cancer immunotherapy design
  • Computational Biology
  • Data & Analytics Platforms
Complete
  • Ming-Ru Wu, MD, PhD
Promoter visualization platform
Designmatch Container
  • Data & Analytics Platforms
Complete
  • José R. Zubizarreta, PhD
Develop a Docker image to collect and install all the necessary resources to run Designmatch in R.
Leveraging geographic information systems for spatial transcriptomics
  • Data & Analytics Platforms
  • Computational Biology
Complete
  • Harvey Mudd College
Implement a GIS database backend to represent and analyze spatial transcriptomics data
Whole-genome sequencing analysis of fluoroquinolone resistance acquisition in Mycobacterium tuberculosis
  • Computational Biology
Complete
  • Maha Farhat, MD, MSc
Wrangle data into a fully processed, analysis-ready large WGS MTB sample set; construct dated phylogenies that relate thousands of MTB isolates from different genetic backgrounds; identify key mutations using phylogeny dating tools; and identify the emergence of FLQ antibiotic resistance in time and geographically
Thyroid hormone influence on the brain
  • Computational Biology
Complete
  • Bernardo Sabatini, PhD
1) Gene expression quantification and quality control for 16 bulk RNA-seq samples obtained from 8 mice; 2) differential expression analysis for defined contrasts; and 3) an exploration of alternative splicing for the ROBO3 gene
Computational pipeline for whole-genome sequencing analysis of yeast strains
  • Computational Biology
Complete
  • Fred Winston, PhD
Computational pipeline for whole-genome sequencing data analysis of yeast strains on HMS’ O2 cluster.
MERFISH Mouse Brain Data Viewer
  • Computational Biology
Complete
  • Jeffrey Moffitt, PhD
Posit Connect interface that enables interactive exploration of the data
Single-cell atlas of human variation in hematopoietic tissue
  • Computational Biology
Complete
  • Allon Klein, PhD
Processing, analysis, annotation, visualization, and exploration of a large single-cell hematopoiesis reference dataset