Project Description

Harvey Mudd’s Computer Science Clinic is a program that brings students together with academic or industry sponsors. Our work with the 2021/22 Harvey Mudd CS Clinic team involves researching and developing methods and software components for mapping free-text biomedical concepts to ontology terms. In the 2021/22 project we developed three new components for a tool that semi-automatically maps concepts to ontologies in bulk: (1) a set of rich user interfaces to support user interaction, such as browsing, verifying, or editing the generated mappings; (2) integration of word and ontology embedding models (BioBERT and OWL2Vec); and (3) a test harness for continuous quality assurance of the generated mappings as the software evolves. These components are designed to make the tool easier to use and more effective for scientists in data curation workflows.
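
To make the mapping task and the role of a test harness concrete, the following is a minimal sketch and not the tool's actual implementation. It assumes a tiny made-up ontology (the `EX:` identifiers and labels are hypothetical), uses plain string similarity from Python's standard library in place of the embedding models named above, and compares the generated mappings against a small hand-written gold standard. The function names `map_concept` and `check_against_gold` are illustrative only.

```python
from difflib import SequenceMatcher

# Hypothetical ontology: illustrative term IDs and labels, not project data.
ONTOLOGY_TERMS = {
    "EX:0001": "asthma",
    "EX:0002": "diabetes mellitus",
    "EX:0003": "hypertension",
}


def map_concept(text, terms=ONTOLOGY_TERMS, min_score=0.6):
    """Return (term_id, label, score) for the most similar label,
    or None if no label reaches min_score.

    Assumption: string similarity stands in for whatever scoring
    the real tool uses.
    """
    query = text.strip().lower()
    best = None
    for term_id, label in terms.items():
        score = SequenceMatcher(None, query, label.lower()).ratio()
        if best is None or score > best[2]:
            best = (term_id, label, score)
    if best is not None and best[2] >= min_score:
        return best
    return None


def check_against_gold(gold, mapper=map_concept):
    """Compare mapper output with expected term IDs.

    Returns a list of (text, expected_id, actual_id) for every mismatch,
    so an empty list means all gold mappings were reproduced.
    """
    failures = []
    for text, expected_id in gold.items():
        result = mapper(text)
        actual_id = result[0] if result else None
        if actual_id != expected_id:
            failures.append((text, expected_id, actual_id))
    return failures
```

A check of this kind could be rerun whenever the mapping logic changes, for example `check_against_gold({"Asthma": "EX:0001", "diabetes melitus": "EX:0002"})`, with any non-empty result flagging a regression in mapping quality.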