About
With support from a $1.5 million, three-year Transdisciplinary Research in Principles of Data Science (TRIPODS) grant from the National Science Foundation, a multidisciplinary team of researchers at Johns Hopkins’ Mathematical Institute for Data Science (MINDS) has created the TRIPODS Institute for the Foundations of Graph and Deep Learning at Johns Hopkins University to boost data-driven discovery.
The new institute will bring together mathematicians, statisticians, theoretical computer scientists, and electrical and biomedical engineers to develop the foundations for the next generation of data-analysis methods, which will integrate model-based and data-driven approaches. Data science fellows trained at the institute will be jointly supervised by faculty members with complementary expertise in model-based and data-driven approaches.
The mission of this new TRIPODS institute comprises both research and education. The team will develop a multidisciplinary research agenda around the foundations of model-based and data-driven approaches to data science, with a focus on the foundations of deep neural and generative models, as well as integrated models that draw strength from both. In addition, the institute will become a regional destination for collaborative work by organizing semester-long focused research themes and workshops, an annual symposium, and a research-intensive summer school and workshop on the foundations of data science.
TRIPODS researchers will also develop a unified curriculum for a new minor and master’s program, which will be offered jointly by the departments of Computer Science and Applied Mathematics and Statistics.
Faculty
- Raman Arora, Assistant Professor, Department of Computer Science
- Amitabh Basu, Assistant Professor, Department of Applied Mathematics and Statistics
- Vladimir Braverman, Assistant Professor, Department of Computer Science
- Donald Geman, Professor, Department of Applied Mathematics and Statistics
- Mauro Maggioni, Bloomberg Distinguished Professor, Department of Mathematics and Department of Applied Mathematics and Statistics
- Enrique Mallada, Assistant Professor, Department of Electrical and Computer Engineering
- Carey Priebe, Professor, Department of Applied Mathematics and Statistics
- Jeremias Sulam, Assistant Professor, Department of Biomedical Engineering
- Soledad Villar, Assistant Professor, Department of Applied Mathematics and Statistics
- René Vidal, Herschel Seder Professor, Department of Biomedical Engineering
TRIPODS Fellows
Spring 2020: Joshua Agterberg, Vittorio Loprinzo, Poorya Mianjy, Hancheng Min, Anirbit Mukherjee, Ambar Pal, Ravi Shankar, Eli Sherman

Spring 2021: Joshua Agterberg, Aditya Chattopadhyay, Niharika Shimona D’Souza, Noam Finkelstein, Teresa Huang, Hancheng Min, Cong Mu, Ambar Pal, Ravi Shankar, Salma Tarmoun, Xuan Wu

Summer 2021: Teresa Huang, Ramchandran Muthukumar, Thabo Samakhoana, Salma Tarmoun

Spring 2023: Tianqi Zheng, Jacopo Teneggi, Salma Tarmoun, Taha Entesari, Konstantinos Emmanouilidis, Teresa Huang, Ning Liu, George A. Kevrekidis, Anastasia Georgiou, Leo Du
Research
TRIPODS will develop a multidisciplinary research agenda on the foundations of model-based and data-driven approaches to data science, with a focus on the foundations of deep neural models (e.g., feed-forward networks, recurrent networks, generative adversarial networks) and generative models (e.g., attributed graphs, dynamical systems) of complex, structured data (e.g., images, shapes, networks), as well as integrated models that benefit from the strengths of both types of models.
Theme I: Foundations of Deep Learning
Recently, deep neural networks (DNNs) have led to dramatic improvements in the performance of pattern-recognition systems. For instance, DNNs have revolutionized computer vision, enabling powerful new technologies for face and object recognition in images and videos. However, the mathematical understanding of DNNs remains shallow. This TRIPODS research theme will focus on developing a mathematical framework, based on principles from statistics, optimization, and learning theory, for understanding the generalization, optimization, and approximation properties of DNNs.
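As a purely illustrative sketch of the generalization question (our own construction, not the institute's code; all dimensions, step sizes, and iteration counts are arbitrary choices), the following example trains a small two-layer ReLU network by gradient descent on synthetic regression data and reports the gap between training and held-out error, the quantity that generalization bounds aim to control.

```python
# Illustrative sketch only: measure the train/test gap of a small
# two-layer ReLU network trained by full-batch gradient descent.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = <w*, x> + noise.
d, n_train, n_test, h = 10, 200, 1000, 64
w_star = rng.normal(size=d)
X = rng.normal(size=(n_train + n_test, d))
y = X @ w_star + 0.1 * rng.normal(size=n_train + n_test)
Xtr, ytr, Xte, yte = X[:n_train], y[:n_train], X[n_train:], y[n_train:]

# Two-layer network f(x) = w2 . relu(W1 x).
W1 = rng.normal(size=(h, d)) / np.sqrt(d)
w2 = rng.normal(size=h) / np.sqrt(h)

def forward(X):
    Z = np.maximum(X @ W1.T, 0.0)  # hidden ReLU activations, shape (n, h)
    return Z, Z @ w2               # network outputs, shape (n,)

lr = 0.01
for _ in range(500):
    Z, pred = forward(Xtr)
    r = pred - ytr                 # residuals on the training set
    # Backpropagated gradients of the MSE (constant factor absorbed in lr).
    g_w2 = Z.T @ r / n_train
    g_W1 = ((np.outer(r, w2) * (Z > 0)).T @ Xtr) / n_train
    w2 -= lr * g_w2
    W1 -= lr * g_W1

train_mse = np.mean((forward(Xtr)[1] - ytr) ** 2)
test_mse = np.mean((forward(Xte)[1] - yte) ** 2)
print(f"train MSE: {train_mse:.4f}  test MSE: {test_mse:.4f}  "
      f"gap: {test_mse - train_mse:.4f}")
```

The reported gap between training and held-out error is exactly the kind of quantity that the theme's statistical and optimization analyses aim to explain and bound.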
Theme II: Foundations of Graph Learning
In many modern applications, ranging from social-network analysis to information extraction from large, high-dimensional data sets, large graphs and the processes that unfold on them (random walks, epidemics, etc.) play a fundamental role. These graphs are often noisy, being derived from measurements or partial observations, and they often evolve in time, with the numbers of vertices and edges changing stochastically. Ever-richer statistical models and machine learning algorithms are needed to model graphs and their dynamics. The study of graphs has grown substantially over the last decade, attracting interest across multiple communities, including statistical signal processing, statistics, computer science, and computational mathematics, and the related research areas and problems are strongly interconnected. In the brief space at our disposal, we organize the discussion by distinguishing questions about the analysis on graphs (processes defined over a fixed graph) from questions about the analysis of graphs (the structure of the graph itself).
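The random dot product graph, one of the generative graph models appearing in the institute's publications below, gives a concrete instance of the "analysis of graphs" question. The sketch below (illustrative only; all sizes and cluster centers are our own choices) samples such a graph from latent positions and recovers them, up to the orthogonal rotation that makes them nonidentifiable, via an adjacency spectral embedding.

```python
# Illustrative sketch only: sample a random dot product graph (RDPG) and
# estimate its latent positions with an adjacency spectral embedding.
import numpy as np

rng = np.random.default_rng(0)

# Latent positions: two noisy clusters in the unit square; each edge (i, j)
# appears independently with probability P_ij = <x_i, x_j>.
n, d = 300, 2
centers = np.array([[0.7, 0.2], [0.2, 0.7]])
X = centers[rng.integers(0, 2, size=n)] + 0.05 * rng.normal(size=(n, d))
X = np.clip(X, 0.01, 0.99)

P = np.clip(X @ X.T, 0.0, 1.0)             # edge-probability matrix
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                # symmetric adjacency, no self-loops

# Adjacency spectral embedding: the top-d eigenpairs of A (by magnitude;
# for this P they are positive) give Xhat with Xhat @ Xhat.T ~ P, so Xhat
# estimates X only up to an orthogonal rotation.
vals, vecs = np.linalg.eigh(A)
idx = np.argsort(np.abs(vals))[::-1][:d]
Xhat = vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

err = np.linalg.norm(Xhat @ Xhat.T - P) / np.linalg.norm(P)
print(f"relative Frobenius error of Xhat @ Xhat.T vs P: {err:.3f}")
```

The rotational ambiguity visible here is the same nonidentifiability of latent position random graphs studied in the publications listed below.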
Education and Training
- The institute will train Data Science Fellows in the foundations of data science; the fellows will be jointly supervised by faculty with complementary expertise in model-based and data-driven approaches.
- The institute will organize a series of collaborative events, including a Seminar Series, a hackathon, and an Annual Symposium on the foundations of data science.
- The institute will also fund an Annual Summer Research School and Workshop on the foundations of data science, where a team of 3 to 4 faculty, 2 to 3 graduate students, and 2 to 3 undergraduates works on its dream research topic for a period of 8 weeks.
- The institute will also create a new master’s program in data science, which will be jointly offered by the departments of Computer Science and Applied Mathematics and Statistics.
Publications
- Ambar Pal, Connor Lane, René Vidal, and Benjamin Haeffele, “On the Regularization Properties of Structured Dropout,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [pdf]
- Poorya Mianjy and Raman Arora, “On Convergence and Generalization of Dropout Training,” Advances in Neural Information Processing Systems (NeurIPS), 2020. [pdf]
- Raman Arora, Peter Bartlett, Poorya Mianjy, and Nathan Srebro, “Dropout: Explicit Forms and Capacity Control,” 2020. [pdf]
- Amitabh Basu, Tu Nguyen, and Ao Sun, “Admissibility of solution estimators in stochastic optimization,” to appear in SIAM Journal on Mathematics of Data Science, 2020.
- Anirbit Mukherjee and Ramchandran Muthukumar, “A study of neural training with non-gradient and noise assisted gradient methods,” 2020. [pdf]
- Anirbit Mukherjee and Ramchandran Muthukumar, “Guarantees on adversarial robustness of training depth-2 neural networks with a stochastic algorithm,” 2020. [pdf]
- Jason Miller, Sui Tang, Ming Zhong, and Mauro Maggioni, “Learning Theory for Inferring Interaction Kernels in Second-Order Interacting Agent Systems,” https://arxiv.org/pdf/2010.03729.pdf.
- Zhongyang Li, Fei Lu, Mauro Maggioni, Sui Tang, and Cheng Zhang, “On the identifiability of interaction functions in systems of interacting particles,” to appear in Stochastic Processes and their Applications, https://arxiv.org/pdf/1912.11965.pdf.
- Hancheng Min and Enrique Mallada, “Dynamics Concentration of Large-Scale Tightly-Connected Networks,” IEEE 58th Conference on Decision and Control (CDC), pp. 758-763, 2019. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9029796
- Vittorio Loprinzo, Laurent Younes, and Donald Geman, “A neural network generative model for random dot product graphs,” in preparation, 2020.
- Joshua Agterberg, Minh Tang, and Carey Priebe, “On Two Forms of Nonidentifiability in Latent Position Random Graphs,” submitted, 2020.
- Joshua Agterberg, Minh Tang, and Carey Priebe, “Consistent Nonparametric Hypothesis Testing for Low Rank Random Graphs with Negative and Repeated Eigenvalues,” in preparation, 2020.
- Eli Sherman, David Arbour, and Ilya Shpitser, “General Identification of Dynamic Treatment Regimes Under Interference,” Conference on Artificial Intelligence and Statistics (AISTATS), PMLR 108:3917-3927, 2020. [pdf]
- Eli Sherman, David Arbour, and Ilya Shpitser, “Policy Interventions Under Interference,” NeurIPS Workshop on Machine Learning and Causal Inference for Improved Decision Making, 2019.
- David Arbour, Eli Sherman, Avi Feller, and Alex Franks, “Multitask Gaussian Processes for Causal Inference with Panel Data,” under review, 2020.
- Ravi Shankar, Hsi-Wei Hsieh, Nicolas Charon, and Archana Venkataraman, “Multi-speaker Emotion Conversion via Latent Variable Regularization in Chained Encoder-Decoder-Predictor Network,” InterSpeech, 2020.
- Ravi Shankar, Jacob Sager, and Archana Venkataraman, “Unsupervised Emotion Conversion via Cycle-GAN and Pair Discriminator,” InterSpeech, 2020.
- Jingfeng Wu, Difan Zou, Vladimir Braverman, and Quanquan Gu, “Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate,” submitted.