The enormous increase in the availability of full human genomic sequences presents a great opportunity for understanding the impact of our individual genetic profiles on our bodies and health. Based on current knowledge, however, it remains highly challenging to predict the consequences of a genetic sequence alteration (variant) that has never been observed, or one that is too rare for standard association testing. Each of us harbors tens of thousands of genetic variants that are too rare for standard methods to predict their consequences accurately. Integrative machine learning methods that combine information from personal genome sequencing with diverse molecular measurements from the same individual offer a promising avenue. We have developed a flexible and extensible Bayesian machine learning framework that integrates multiple sources of personal and population data to identify the genetic variants most likely to have consequences for each individual. We demonstrate that integrative models perform better than predictions from DNA sequencing or RNA sequencing alone, and discuss avenues for improving and utilizing machine learning in personal genomics and clinical applications.
Alexis Battle, Johns Hopkins University