Scientific Machine Learning Symposium
March 17, 2023 @ Great Hall in San Diego


Recent progress in Artificial Intelligence (AI) and Machine Learning (ML) has provided groundbreaking methods for processing large data sets. These new techniques are particularly powerful for scientific data with complex structures, non-linear relationships, and unknown uncertainties that are challenging to model and analyze with traditional tools. This has triggered a flurry of activity in science and engineering to develop new methods for problems that were previously intractable or extremely difficult to tackle.

The goal of this symposium is to bring together researchers and practitioners at the intersection of AI and Science, to discuss opportunities to use AI to accelerate scientific discovery, and to explore the potential of scientific knowledge to guide AI development. The symposium will provide a platform to nurture the research community, to cross-fertilize interdisciplinary ideas, and to shape the vision of future developments in the rapidly growing field of AI + Science.

We plan to use the symposium as the launch event for the AI + Science event series, co-hosted by Computer Science and Engineering (CSE), the Halıcıoğlu Data Science Institute (HDSI), and the Scripps Institution of Oceanography (SIO) at UC San Diego. The symposium will include a combination of invited talks, posters, panel discussions, and social and networking events. The first event will place particular emphasis on AI + physical sciences. We will invite contributions and participation from physics, engineering, and oceanography, among other fields. Part of the program will highlight research from climate science, building on our DOE-funded scientific ML project for tackling climate extremes.


8:00 am - 8:00 pm (Pacific Time)
March 17, 2023

Main Program (Friday; tentative)

8:00 - 8:45 am  Registration & Breakfast
8:45 - 9:00 am  Welcome & Program Overview
9:00 - 9:50 am  Invited Talk: Rebecca Willett
9:50 - 10:40 am  Invited Talk: Chris Bretherton
10:40 - 11:00 am  Coffee Break/Social
11:00 am - 12:00 pm  Organizing Team Project Highlights
12:00 - 1:00 pm  Lunch
1:00 - 1:50 pm  Invited Talk: Aditya Grover
1:50 - 2:40 pm  Invited Talk: Shirley Ho
2:40 - 3:00 pm  Coffee Break/Social
3:00 - 3:50 pm  Invited Talk: Frederick Eberhardt
3:50 - 4:30 pm  Contributed Session/Poster Highlights
4:30 - 5:30 pm  Panel Discussion: Opportunities and Challenges in Using AI for Science
5:30 - 7:00 pm  Poster Session and Reception

Keynote Speakers

Chris Bretherton

Senior Director
Allen Institute for Artificial Intelligence (AI2)

Professor Emeritus
University of Washington

Improving climate models using corrective machine learning vs. other emulation approaches

AI2, with GFDL, has developed a corrective machine learning (ML) methodology to improve weather forecast skill and reduce climate biases in a computationally efficient coarse-grid climate model. The corrective ML is trained by nudging the 3D temperature, humidity and wind fields forecast by the coarse-grid model to a time-dependent global reference and learning the ‘nudging tendencies’ as a function of the column state of the model. The reference can be a reanalysis (for present-climate simulation) or a finer-grid version of the same model that may be more trustworthy across a range of climates. The ML is interpreted as a correction to the combined physics parameterizations of the coarse-grid model. We trained the ML on global 25 km simulations in multiple climates, and separately on a year-long 3 km simulation, and applied it in 200 km coarse-grid simulations. The ML reduced annual-mean land temperature and precipitation pattern biases by up to 50% and enhanced weather forecast skill. We compare strengths and weaknesses of this method vs. other strategies for emulating reference climate models, including another hybrid approach using reservoir computing, and full model emulation.
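The corrective-ML idea in the abstract above can be sketched in a few lines: learn the nudging tendency as a function of the model's column state, then add the learned correction to the model's own physics tendency at each step. This is a toy illustration only; the sizes, the synthetic data, and the ridge-regression "ML" stand in for the actual AI2/GFDL setup, which is not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a column "state" of temperature and humidity on
# 5 vertical levels. In the real method, nudging a coarse model toward a
# reference run yields nudging tendencies dX = (X_ref - X_model) / tau;
# here we synthesize them from an arbitrary linear map plus noise.
n_samples, n_levels = 500, 5
n_state = 2 * n_levels
state = rng.normal(size=(n_samples, n_state))               # model column states
true_map = rng.normal(scale=0.1, size=(n_state, n_state))   # unknown "truth"
nudging_tendency = state @ true_map + 0.01 * rng.normal(size=(n_samples, n_state))

# "Corrective ML": fit a map from column state to nudging tendency.
# A ridge regression stands in for whatever regressor is actually used.
lam = 1e-3
A = state.T @ state + lam * np.eye(n_state)
W = np.linalg.solve(A, state.T @ nudging_tendency)

def corrected_step(x, dt=1.0):
    """One coarse-model step with the learned ML correction added on top
    of the model's own (here fabricated) physics tendency."""
    physics_tendency = -0.05 * x        # stand-in for the parameterized physics
    ml_correction = x @ W               # learned nudging tendency
    return x + dt * (physics_tendency + ml_correction)

x0 = rng.normal(size=n_state)
x1 = corrected_step(x0)
```

The key design point, reflected above, is that the correction is a function of the local column state only, which keeps it cheap to evaluate inside a coarse-grid simulation.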
Chris Bretherton is the Senior Director of Climate Modeling at AI2, where he leads a research group using machine learning to improve climate models, in collaboration with NOAA's Geophysical Fluid Dynamics Laboratory in Princeton. From 1985 to 2021 he was a professor of atmospheric science and applied mathematics at the University of Washington, studying cloud formation, turbulence, and how to better represent them in global climate and weather forecast models. He was a lead author of the IPCC Fifth Assessment Report in 2013. In 2012, he received the Jule G. Charney Award from the American Meteorological Society, and he was the 2019 AMS Haurwitz Lecturer. He is a Fellow of the AMS and AGU, and a member of the National Academy of Sciences and the Washington State Academy of Sciences.

Rebecca Willett

University of Chicago

Machine learning and data assimilation in the natural sciences

The potential for machine learning to revolutionize scientific research is immense, but its transformative power cannot be fully harnessed through the use of off-the-shelf tools alone. To unlock this potential, novel methods are needed to integrate physical models and constraints into learning systems, accelerate simulations, and quantify model prediction uncertainty. In this presentation, we will explore the opportunities and emerging tools available to address these challenges in the context of inverse problems, data assimilation, and simulator calibration. By leveraging ideas from statistics, optimization, scientific computing, and signal processing, the AI and Science community can develop new and more effective machine learning methods that improve predictive accuracy and computational efficiency in the natural sciences.
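One of the workhorse tools in the data-assimilation setting mentioned above is the analysis step that optimally blends a physical model's forecast with noisy observations. As a concrete (and deliberately minimal) illustration, here is the standard Kalman analysis update; the two-dimensional state, observation operator, and covariances are made-up numbers, not from the talk.

```python
import numpy as np

def kalman_analysis(x_forecast, P, y_obs, H, R):
    """Combine a forecast (mean x_forecast, covariance P) with observations
    y_obs, given observation operator H and observation-error covariance R.
    Returns the analysis mean and covariance."""
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    x_analysis = x_forecast + K @ (y_obs - H @ x_forecast)
    P_analysis = (np.eye(len(x_forecast)) - K @ H) @ P
    return x_analysis, P_analysis

# Toy example: a 2-component state of which only the first is observed.
x_f = np.array([1.0, 2.0])
P = np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
y = np.array([1.4])
x_a, P_a = kalman_analysis(x_f, P, y, H, R)    # x_a -> [1.32, 2.0]
```

Machine-learning approaches of the kind the abstract describes typically replace or augment pieces of this pipeline, for example by learning the forecast model or the error statistics from data rather than prescribing them.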
Rebecca Willett is a Professor of Statistics and Computer Science at the University of Chicago. Her research is focused on machine learning, signal processing, and large-scale data science. Willett received the National Science Foundation CAREER Award in 2007, was a member of the DARPA Computer Science Study Group, received an Air Force Office of Scientific Research Young Investigator Program award in 2010, was named a Fellow of the Society for Industrial and Applied Mathematics in 2021, and was named a Fellow of the IEEE in 2022. She is a co-principal investigator and member of the Executive Committee for the Institute for the Foundations of Data Science, helps direct the Air Force Research Lab University Center of Excellence on Machine Learning, and currently leads the University of Chicago's AI+Science Initiative.

Frederick Eberhardt

California Institute of Technology

Causal Models at Multiple Levels of Granularity

Methods of causal inference and discovery have focused on identifying causal relations among a set of given causal variables. Such a starting point presupposes that the causal variables themselves are identified and clearly defined. Moreover, it fixes from the outset a granularity at which the causal interaction is presumed to occur. This presentation explores the constraints and principles of how one might learn and construct causal variables at a granularity different from that of the measurement process, and under what circumstances causal relations may be described at multiple levels of granularity. I will illustrate the approach with results from climate science and neuroscience.
Frederick Eberhardt's research primarily focuses on methods for causal inference and how we might learn about causal relations from data. His research projects generally fall in an area of overlap between philosophy, machine learning, statistics, and cognitive science. He has also done historical work on the philosopher Hans Reichenbach, especially on his frequentist interpretation of probability. Before coming to Caltech in 2013, Eberhardt was an assistant professor in the Philosophy-Neuroscience-Psychology program in the department of philosophy at Washington University in St. Louis. He spent a year as a McDonnell Postdoctoral Fellow at the Institute of Cognitive and Brain Sciences at the University of California, Berkeley. He holds a PhD in Logic, Computation and Methodology from the department of philosophy at Carnegie Mellon University (CMU), and a master's in Knowledge Discovery and Data Mining from what is now CMU's Machine Learning Department.

Shirley Ho

Group Leader
Flatiron Institute

Deep learning as a last resort

For the last 10 years, we have seen a rapid adoption of deep learning techniques across many disciplines, ranging from self-driving vehicles and credit scoring to biomedicine. Along with this wave, we have seen rapid adoption and rejection in the nascent field of Machine Learning and Sciences. While more and more people are working in this area, there are also quite a number of skeptics (sometimes for very good reasons). Some of us believe in using deep learning as a last resort, and I will showcase a few of these scientific challenges, ranging from understanding our Universe, the Milky Way, and the Solar System to our genome.
Shirley Ho is an American astrophysicist and machine learning expert, currently at the Center for Computational Astrophysics at the Flatiron Institute in New York City, at New York University, and at Carnegie Mellon University. Ho also has a visiting appointment at Princeton University. A widely cited expert in cosmology, machine learning applications in astrophysics, and data science, her interests include developing and deploying deep learning techniques to better understand our Universe and other astrophysical phenomena.

Aditya Grover

Assistant Professor
UCLA

ClimaX: A foundation model for weather and climate

Most state-of-the-art approaches for weather and climate modeling are based on physics-informed numerical models of the atmosphere. These approaches aim to model the non-linear dynamics and complex interactions between multiple variables, which are challenging to approximate. Additionally, many such numerical models are computationally intensive, especially when modeling atmospheric phenomena at a fine-grained spatial and temporal resolution. Recent data-driven approaches based on machine learning instead aim to directly solve a downstream forecasting or projection task by learning a data-driven functional mapping using deep neural networks. However, these networks are trained using curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of numerical models. In this talk, I will present ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained using heterogeneous datasets spanning different variables, spatio-temporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute while maintaining general utility. The pre-trained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatio-temporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality in ClimaX results in superior performance on benchmarks for weather forecasting and climate projections, even when pretrained at lower resolutions and compute budgets. Towards the end of the talk, I will present ClimateLearn, our open-source library to standardize machine learning for climate science.
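The encoding-and-aggregation idea in the abstract above can be made concrete with shapes alone: each variable's gridded field is split into patches, embedded per variable, and then aggregated across variables so that heterogeneous variable sets map to a fixed-length token sequence. The sketch below is a numpy simplification under stated assumptions; the patch size, embedding dimension, and the mean-pooling aggregation (ClimaX itself uses a learned cross-attention aggregation) are illustrative, not the actual architecture.

```python
import numpy as np

# Illustrative sizes: 3 climate variables on a 32 x 64 grid, 8 x 8 patches.
n_vars, H, W, patch = 3, 32, 64, 8
fields = np.random.default_rng(0).normal(size=(n_vars, H, W))

# 1. Patchify each variable independently -> (n_vars, n_patches, patch*patch).
patches = fields.reshape(n_vars, H // patch, patch, W // patch, patch)
patches = patches.transpose(0, 1, 3, 2, 4).reshape(n_vars, -1, patch * patch)

# 2. Per-variable linear embedding into a shared d_model token space, so each
# variable gets its own projection (a random matrix stands in for learned weights).
d_model = 16
embed = np.random.default_rng(1).normal(size=(n_vars, patch * patch, d_model))
tokens = np.einsum('vpd,vde->vpe', patches, embed)   # (n_vars, n_patches, d_model)

# 3. Aggregate across variables (mean-pool here as a stand-in for attention),
# so the sequence length is independent of how many variables were supplied.
sequence = tokens.mean(axis=0)                        # (n_patches, d_model)
```

The point of this design is step 3: because aggregation collapses the variable axis, the downstream Transformer sees the same sequence shape regardless of which (or how many) variables a given dataset provides, which is what enables pretraining on heterogeneous datasets.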
Aditya Grover is an assistant professor of computer science at UCLA. His goal is to develop efficient machine learning approaches that can interact and reason with limited supervision with a focus on deep generative models and their intersection with sequential decision making and causal inference. He is also an affiliate faculty at the UCLA Institute of the Environment and Sustainability, where he grounds his research in real-world applications in climate science. Aditya's 40+ research works have been published at top venues including Nature, deployed in production at major technology companies, and covered in popular press venues. His research has been recognized with two best paper awards, four research fellowships, four faculty awards, the ACM SIGKDD doctoral dissertation award, and the AI Researcher of the Year Award by Samsung. Aditya received his postdoctoral training at UC Berkeley, PhD from Stanford, and bachelors from IIT Delhi, all in computer science.



The event will be hosted at the Great Hall. You can locate the venue on the UCSD map.


The symposium is free of charge, thanks to the generous support from our sponsors. You can RSVP via Eventbrite.

Poster Contribution

We highly encourage poster contributions from students and postdocs. You can register your poster via Poster Submission Form.


Organizing Committee

Elias Bareinboim (Columbia University)

Pierre Gentine (Columbia University)

Stephan Mandt (UC Irvine)

Mike Pritchard (UC Irvine)

Lawrence Saul (Flatiron Institute)

Yian Ma (UC San Diego)

Rose Yu (UC San Diego)

Local Organizing Committee

Yian Ma (UC San Diego)

Nick Lutsko (UC San Diego)

Yuanyuan Shi (UC San Diego)

Rose Yu (UC San Diego)