Ann B Lee

Professor, Co-Director of the STAMPS@CMU Research Center

Department of Statistics & Data Science / Machine Learning Department, Carnegie Mellon University

About me

I am a professor in the Department of Statistics & Data Science at Carnegie Mellon University, with a joint appointment in the Machine Learning Department. Prior to joining CMU, I was the J.W. Gibbs Assistant Professor in the department of mathematics at Yale University, and before that I served a year as a visiting research associate in the department of applied mathematics at Brown University.

My research interests are in developing statistical methodology for complex data and problems in the physical sciences. I am particularly interested in trustworthy scientific inference and reliable uncertainty quantification, and in bridging classical statistics and machine learning for simulation-based inference and experimental design. My recent work includes likelihood-free inference, calibrated probabilistic forecasting, interpretable diagnostics of generative models, and applications in astronomy and hurricane intensity guidance involving satellite imagery and large surveys.

I co-founded the STAtistical Methods for the Physical Sciences (STAMPS) Research Center at Carnegie Mellon University together with my colleague Mikael Kuusela.

See Google Scholar for my publications and preprints (I’ve been slow at updating my list of selected publications).

Interests

  • Scientific Machine Learning
  • Trustworthy Uncertainty Quantification
  • Likelihood-Free Inference
  • Statistical Methods for the Physical Sciences

Education

  • PhD in Physics

    Brown University

  • MSc/BSc in Engineering Physics

    Chalmers University of Technology, Sweden

News & Recent Events

Recent Papers

(2024). Classification under Nuisance Parameters and Generalized Label Shift in Likelihood-Free Inference. Proceedings of the Forty-First International Conference on Machine Learning (ICML 2024), PMLR 235, 2024.

Preprint

(2022). Simulator-Based Inference with WALDO: Confidence Regions by Leveraging Prediction Algorithms and Posterior Estimators for Inverse Problems. Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023), PMLR 206:2960-2974, 2023. (Finalist at the ASA SPES and Q&P Student Paper Competition).

Preprint PDF Code

(2022). Detecting Distributional Differences in Labeled Sequence Data with Application to Tropical Cyclone Satellite Imagery. Annals of Applied Statistics 17(2):1260-1284, June 2023. (Selected for The Best of AOAS invited paper session at JSM 2023).

Preprint DOI

(2021). Diagnostics for Conditional Density Models and Bayesian Inference Algorithms. Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI 2021). PMLR 161:1830-1840, 2021.

Preprint PDF

(2020). Wildfire Smoke and Air Quality: How Machine Learning Can Guide Forest Management. Tackling Climate Change with Machine Learning workshop at NeurIPS 2020 (Spotlight talk).

Preprint Slides Video

Talks

(some recorded)

  • “Calibrated Uncertainty Quantification in Simulator-Based Inference” at Hammers & Nails: Frontiers in Machine Learning in Cosmology, Astro & Particle Physics, Ascona, November 2, 2023. Slides.
  • “Detecting Distributional Differences in Labeled Sequences of Tropical Cyclone Satellite Imagery”, “Best of AOAS” invited session, Joint Statistical Meetings, Toronto, August 9, 2023. Slides.
  • “2-Sample and GoF Testing via Regression” at PHYSTAT-2samples workshop, June 2, 2023. Slides. Video recording.
  • “Likelihood-Free Frequentist Inference: Confidence Sets with Correct Conditional Coverage”, ISSI-STAMPS joint seminar with discussant Minge Xie (Rutgers University), June 16, 2022. Poster. Slides. Video recording.

Workshops

Some recent workshops in Stats/ML for physics that I’ve co-organized:

Group

I direct the STAtistical Methods for the Physical Sciences (STAMPS) Research Center at CMU together with my colleague Dr Mikael Kuusela.

I am fortunate to advise the following amazing students:

Current PhD Students

  • James Carzon
  • Elizabeth Cucuzella
  • Joshua D. Ingram

PhD Graduates (only listing my thesis advisees)

  • Luca Masserano
    – PhD May 2025, Department of Statistics & Data Science and the Machine Learning Department, CMU
    – Thesis title: Trustworthy Scientific Inference with Machine Learning

  • David Zhao
    – PhD May 2023, Department of Statistics & Data Science and the Machine Learning Department, CMU
    – Thesis title: Calibrated Conditional Density Models and Predictive Inference via Local Diagnostics

  • Trey (Tria) McNeely
    – PhD June 2022, Department of Statistics & Data Science, CMU
    – Thesis title: Quantifying Spatio-temporal Convective Structure in Tropical Cyclones

  • Niccolò (Nic) Dalmasso
    – PhD May 2021, Department of Statistics & Data Science, CMU
    – Thesis title: Uncertainty Quantification in Simulation-based Inference
    – 2021 ASA Student of the Year, Pittsburgh Chapter

  • Taylor Pospisil
    – PhD May 2019, Department of Statistics & Data Science, CMU
    – Thesis title: Conditional Density Estimation for Regression and Likelihood-Free Inference

  • Rafael Izbicki
    – PhD April 2014, Department of Statistics, CMU
    – Thesis title: A Spectral Series Approach to High-Dimensional Nonparametric Inference
    – 2014 Best Thesis Award, Department of Statistics, CMU

  • Di Liu
    – PhD July 2012, Department of Statistics, CMU
    – Thesis title: Comparing Data Sources in High Dimensions

  • Andrew Crossett
    – co-advised with Kathryn Roeder
    – PhD May 2012, Department of Statistics, CMU
    – Thesis title: Using Dimension Reduction Techniques to Model Genetic Relationships for Association Studies

  • Susan Buchman
    – co-advised with Chad Schafer
    – PhD March 2011, Department of Statistics, CMU
    – Thesis title: High-Dimensional Adaptive Basis Density Estimation

  • Joseph W. Richards
    – co-advised with Chad Schafer
    – PhD July 2010, Department of Statistics, CMU
    – Thesis title: Fast and Accurate Estimation for Astrophysical Problems in Large Databases
    – 2010 ASA Student of the Year, Pittsburgh Chapter

  • Diana Luca
    – co-advised with Kathryn Roeder
    – PhD Sept 2008, Department of Statistics, CMU
    – Thesis title: Genetic Matching by Ancestry in Genome-Wide Association Studies

Teaching

  • Probability and Mathematical Statistics (STAT 36-700). Fall 2024-2025.
  • Regression Analysis (STAT 36-707). Fall 2021, 2023.
  • Modern Ideas in Statistics and AI for Climate and Environmental Sciences (STAT 36-722). Spring 2021.
  • Advanced Methods for Data Analysis (STAT 36-402/608). Spring 2017-2023.
  • Modern Regression (STAT 36-401/607). Fall 2018, 2022. Spring 2025.
  • Advanced Data Analysis II (STAT 36-758). Fall 2015-2017.
  • Mathematical Statistics Honors (STAT 36-326). Spring 2014-2016.
  • Probability and Statistics I (STAT 36-625). Fall 2005-2007, 2013-2014.
  • Statistical Practice (STAT 36-726). Spring 2012, 2016.
  • Engineering Statistics and Quality Control (STAT 36-220). Fall 2010-2011.
  • Machine Learning Journal Club (ML 10-915), Machine Learning Department, CMU. Fall 2009-2010.
  • Probability and Statistics II (STAT 36-626). Spring 2006-2008, 2010.
  • Probability and Statistics for Business Applications (STAT 36-207). Fall 2009.
  • Applied Mathematics and Engineering I (AMTH 251), Yale University. Fall 2003, 2004.
  • Introduction to Calculus in Several Variables (MATH 118), Yale University. Spring 2004.
  • Pattern Theory and its Applications (STAT 2), 12th Jyväskylä Ph.D. Summer School, Aug 2002, Finland.

Contact