Smita Krishnaswamy: Deep Geometric and Topological Representations for Extracting Insights from Biomedical Data
Abstract: High-throughput, high-dimensional data has become ubiquitous in the biomedical sciences because of breakthroughs in measurement technologies. These large datasets, containing millions of observations of cells, molecules, brain voxels, and people, hold great potential for understanding the underlying state space of the data, as well as drivers of differentiation, disease, and progression. However, they pose new challenges in terms of noise, missing data, measurement artifacts, and the “curse of dimensionality.” In this talk, I will show how to leverage data geometry and topology, embedded within modern machine learning frameworks, to understand these types of complex scientific data. First, I will use data geometry to obtain representations that enable denoising, dimensionality reduction, and visualization. Next, I will show how to combine diffusion geometry with topology to extract multi-granular features from the data for predictive analysis. Then, I will move up from the local geometry of individual data points to the global geometry of data clouds and graphs, using graph signal processing to derive representations of these entities and optimal transport for distances between them. Finally, I will demonstrate how two neural networks use geometric inductive biases for generation and inference: GRASSY (geometric scattering synthesis network) for generating new molecules and molecular fold trajectories, and TrajectoryNet for performing dynamic optimal transport between time-course samples to understand the dynamics of cell populations. Throughout the talk, I will include examples of how these methods shed light on the inner workings of biomedical and cellular systems. I will finish by highlighting future directions of inquiry.
Zoom link: https://wse.zoom.us/j/95448608570