Opportunities and Challenges
The growing volume of multi-omics data available for human pathogens brings new data-analytic challenges. The NIH has invested significant resources in data repositories (e.g., GEO and GenBank), creating a critical need for data science methodologies capable of fully exploiting these data. The novel data science methods being developed by UC San Diego CHARM have been shown to scale to the global volume of new genome and transcriptome sequences; indeed, their utility increases with scale. These methods are enabling new discoveries in large data sets, leading to new hypotheses and ambitious experimental frameworks that would not otherwise have been formulated or contemplated.