The list below is roughly half journal papers and half full-length peer-reviewed conference papers published in the proceedings of top-tier AI/ML venues, where conference publication is the norm.
-
Data fission: splitting a single data point (with J. Leiner, B. Duan, L. Wasserman), J of American Stat. Assoc., 2023
arXiv |
TLDR
We devise data fission, an alternative to data splitting that uses external randomization to split the information in a single data point, more efficiently in many circumstances; we then apply it to several problems in post-selection inference: interactive multiple testing, fixed-design linear regression, generalized linear models, and trend filtering.
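As a toy illustration of the simplest (Gaussian) instance of the idea, not code from the paper: a single observation X ~ N(mu, sigma^2) can be split with external noise into two independent pieces that each retain information about mu.

```python
import numpy as np

# Gaussian data fission sketch: X ~ N(mu, sigma^2) is split using external
# noise Z ~ N(0, sigma^2) into f(X) = X + Z and g(X) = X - Z. The pieces are
# jointly Gaussian with zero covariance, hence independent, and each has
# mean mu (with variance 2 sigma^2).
rng = np.random.default_rng(0)
mu, sigma, n = 3.0, 1.0, 200_000

X = rng.normal(mu, sigma, size=n)
Z = rng.normal(0.0, sigma, size=n)   # external randomization
fX, gX = X + Z, X - Z                # the two fissioned pieces

corr = np.corrcoef(fX, gX)[0, 1]     # should be close to 0
```

The paper also introduces a tuning parameter controlling how much information each piece receives, and analogous constructions beyond the Gaussian case.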
-
Adaptive privacy composition for accuracy-first mechanisms (with R. Rogers, G. Samorodnitsky, S. Wu), Conf. on Neural Information Processing Systems (NeurIPS), 2023
arXiv |
TLDR
We derive basic and advanced composition results and privacy filters for noise-reduction mechanisms that allow an analyst to adaptively switch between differentially private and ex-post private mechanisms subject to an overall privacy guarantee.
-
Sequential predictive two-sample and independence testing (with A. Podkopaev), Conf. on Neural Information Processing Systems (NeurIPS), 2023
arXiv
-
Auditing fairness by betting (with B. Chugg, S. Cortes-Gomez, B. Wilder), Conf. on Neural Information Processing Systems (NeurIPS), 2023
arXiv | code
-
Counterfactually comparing abstaining classifiers (with Y. J. Choe, A. Gangrade), Conf. on Neural Information Processing Systems (NeurIPS), 2023
arXiv
-
An efficient doubly-robust test for the kernel treatment effect (with D. Martinez-Taboada, E. Kennedy), Conf. on Neural Information Processing Systems (NeurIPS), 2023
arXiv
-
On the sublinear regret of GP-UCB (with J. Whitehouse, S. Wu), Conf. on Neural Information Processing Systems (NeurIPS), 2023
arXiv |
TLDR
By appropriately regularizing simple confidence sequences in Hilbert spaces, we derive (for the first time) sublinear regret for GP-UCB for any kernel with polynomially decaying eigenvalues (including the Matérn kernel).
-
A composite generalization of Ville's martingale theorem
(with J. Ruf, M. Larsson, W. Koolen),
Elec. J. of Prob., 2023
arXiv
-
Online multiple hypothesis testing (with D. Robertson, J. Wason), Statistical Science, 2023
arXiv
-
Nonparametric two-sample testing by betting
(with S. Shekhar), IEEE Trans. on Info. Theory, 2023
arXiv | code | slides | TLDR
We develop a general framework for designing sequential two-sample tests, and obtain a general characterization of the power of these tests in terms of the regret of an associated online prediction game. This yields the "right" sequential generalizations of many offline nonparametric two-sample tests, such as the Kolmogorov-Smirnov and kernel-MMD tests.
-
E-values as unnormalized weights in multiple testing (with N. Ignatiadis, R. Wang), Biometrika, 2023
arXiv | proc
-
Comparing sequential forecasters (with Y.J. Choe), Operations Research, 2023
arXiv | code | talk | poster | slides (Citadel, Research Showcase Runner-up)
-
Game-theoretic statistics and safe anytime-valid inference
(with P. Grunwald, V. Vovk, G. Shafer), Statistical Science, 2023
arXiv
-
Martingale methods for sequential estimation of convex functionals and divergences (with T. Manole),
IEEE Trans. on Information Theory, 2023
arXiv | article | talk (Student Research Award, Stat Soc Canada) | TLDR
We derive confidence sequences for convex functionals, with an emphasis on convex divergences such as the kernel Maximum Mean Discrepancy and Wasserstein distances; our main technical contribution is to show that empirical plugins of convex functionals/divergences (and more generally processes satisfying a leave-one-out property) are partially ordered reverse submartingales, coupled with maximal inequalities for such processes.
-
Estimating means of bounded random variables by betting (with I. Waudby-Smith),
J. of the Royal Statistical Society, Series B, 2023
arXiv (Discussion paper) | code
-
Sequential change detection via backward confidence sequences (with S. Shekhar).
Intl. Conf. on Machine Learning (ICML), 2023
arXiv | code | slides |
TLDR
We derive a general reduction from constructing confidence sequences (CSs) for some functional (say $\theta$) to detecting changes in that functional: given a stream of observations, construct a single CS (for $\theta$) in the forward direction and a new CS in the backward direction with each new observation, then stop and declare a changepoint as soon as the two do not intersect. We obtain tight guarantees on the average run length (ARL) and the detection delay for this general strategy, and instantiate them for several classical and modern change detection problems.
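A minimal sketch of this reduction for detecting a change in the mean of a bounded stream, using crude union-bound Hoeffding confidence sequences (illustrative only; the paper's CSs and guarantees are far tighter):

```python
import math
import numpy as np

def hoeffding_ci(xs, alpha):
    """Fixed-sample Hoeffding interval for the mean of [0, 1]-valued xs."""
    w = math.sqrt(math.log(2.0 / alpha) / (2 * len(xs)))
    m = float(np.mean(xs))
    return m - w, m + w

def crude_cs(xs, alpha):
    """Confidence sequence by intersecting Hoeffding intervals, spending
    alpha / (j (j + 1)) at sample size j (valid by a union bound)."""
    lo, hi = 0.0, 1.0
    for j in range(1, len(xs) + 1):
        l, h = hoeffding_ci(xs[:j], alpha / (j * (j + 1)))
        lo, hi = max(lo, l), min(hi, h)
    return lo, hi

def detect_change(stream, alpha=0.05):
    """Backward-CS reduction: declare a change at the first t where the
    forward CS on x_1..x_t and the backward CS on x_t..x_1 are disjoint."""
    stream = [float(x) for x in stream]
    for t in range(1, len(stream) + 1):
        flo, fhi = crude_cs(stream[:t], alpha)        # forward CS
        blo, bhi = crude_cs(stream[:t][::-1], alpha)  # backward CS
        if fhi < blo or bhi < flo:
            return t
    return None
```

On a stream whose mean jumps from 0.1 to 0.9 at time 100, this sketch flags a change shortly after the jump, and never flags a constant stream.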
-
Fully adaptive composition in differential privacy (with J. Whitehouse, R. Rogers, Z. S. Wu),
Intl. Conf. on Machine Learning (ICML), 2023
arXiv
-
Online Platt scaling with calibeating (with C. Gupta),
Intl. Conf. on Machine Learning (ICML), 2023
arXiv
-
A nonparametric extension of randomized response for locally private confidence sets (with I. Waudby-Smith, Z. S. Wu),
Intl. Conf. on Machine Learning (ICML), 2023
arXiv | code (oral talk)
-
Sequential kernelized independence testing
(with A. Podkopaev, P. Bloebaum, S. Kasiviswanathan),
Intl. Conf. on Machine Learning (ICML), 2023
arXiv
-
Risk-limiting financial audits via weighted sampling without replacement (with S. Shekhar, Z. Xu, Z. Lipton, P. Liang),
Intl. Conf. Uncertainty in AI (UAI), 2023
arXiv |
TLDR
We introduce the notion of risk-limiting financial audits (RLFA), where the goal is to design statistical procedures to verify an assertion about a set of reported financial transactions. We propose a general RLFA strategy using confidence sequences constructed with weighted sampling without replacement, and also develop techniques that can incorporate any available side information (such as predictions from AI models).
-
Huber-robust confidence sequences
(with H. Wang),
Intl. Conf. on AI and Statistics (AISTATS), 2023
arXiv (full oral talk) | TLDR
Under a slight generalization of Huber's epsilon-contamination model (where epsilon fraction of the points are arbitrarily corrupted), we derive confidence sequences for univariate means only assuming a finite p-th moment (for p between 1 and 2), which are minimax optimal and perform very well in practice.
-
Catoni-style confidence sequences for heavy-tailed mean estimation
(with H. Wang),
Stochastic Processes and Applications, 2023
arXiv | article | code |
TLDR
We derive confidence sequences, which are confidence intervals valid at arbitrary stopping times, for univariate means only assuming a finite p-th moment (for p between 1 and 2), which are minimax optimal and perform very well in practice.
-
Anytime-valid confidence sequences in an enterprise A/B testing platform (with A. Maharaj, R. Sinha, D. Arbour, I. Waudby-Smith, S. Liu, M. Sinha, R. Addanki, M. Garg, V. Swaminathan),
ACM Web Conference (WWW), 2023
arXiv
-
Dimension-agnostic inference using cross U-statistics (with I. Kim),
Bernoulli, 2023
arXiv |
TLDR
We introduce dimension-agnostic inference, which is a novel approach for high-dimensional inference that ensures asymptotic validity regardless of how the dimension and sample size scale, while preserving minimax optimal power across diverse scenarios; our main tool is a cross U-statistic, which drops half of the terms of a degenerate U-statistic to yield a limiting Gaussian distribution.
-
On the power of conditional independence testing under model-X (with E. Katsevich),
Electronic J. Stat, 2023
arXiv | article
-
Permutation tests using arbitrary permutation distributions (with R. Barber, E. Candes, R. Tibshirani),
Sankhya A, 2023
arXiv | article
-
Conformal prediction beyond exchangeability (with R. Barber, E. Candes, R. Tibshirani),
Annals of Stat., 2023
arXiv | article
-
Faster online calibration without randomization: interval forecasts and the power of two choices (with C. Gupta),
Conf. on Learning Theory (COLT), 2022
arXiv | article
-
Top-label calibration and multiclass-to-binary reductions (with C. Gupta),
Intl. Conf. on Learning Representations, 2022
arXiv | article
-
Gaussian universal likelihood ratio testing (with R. Dunn, S. Balakrishnan, L. Wasserman),
Biometrika, 2022
arXiv | article |
TLDR
Under a Gaussian setting, we present the first in-depth exploration of the size, power, and relationships between several universal inference variants. We find that in this setting, the power of universal inference has the same behavior in n, d, alpha and SNR as the classical Wilks' Theorem approach, losing only a small constant factor of about 2.
-
A permutation-free kernel two sample test (with S. Shekhar, I. Kim),
Conf. on Neural Information Processing Systems (NeurIPS), 2022
arXiv | article | code | (oral talk) |
TLDR
We propose a new kernel-MMD statistic that drops half the terms of the original, and show that it has a standard normal limiting null distribution in low and high dimensional regimes. This results in a test that is easy to calibrate and up to two orders of magnitude faster than running the permutation test, at the price of a small ($\approx \sqrt{2}$ in effective sample size) reduction in power.
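A short sketch of the statistic under stated assumptions (Gaussian kernel, equal sample sizes; not the authors' implementation):

```python
import numpy as np

def gauss_k(a, b, bw=1.0):
    """Gaussian kernel matrix between the rows of a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bw ** 2))

def cross_mmd_stat(x, y, bw=1.0):
    """Studentized cross-MMD sketch: split each sample in half and keep
    only the cross terms between the halves; the statistic is
    asymptotically N(0, 1) under the null, so no permutations are needed."""
    n = min(len(x), len(y)) // 2
    x1, x2, y1, y2 = x[:n], x[n:2 * n], y[:n], y[n:2 * n]
    h = (gauss_k(x1, x2, bw) + gauss_k(y1, y2, bw)
         - gauss_k(x1, y2, bw) - gauss_k(y1, x2, bw))
    u = h.mean(axis=1)   # one row-averaged value per first-half index
    return np.sqrt(n) * u.mean() / u.std(ddof=1)
```

Calibration is then just a normal quantile: reject at level alpha when the statistic exceeds $z_{1-\alpha}$.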
-
Testing exchangeability: fork-convexity, supermartingales, and e-processes (with J. Ruf, M. Larsson, W. Koolen).
Intl J. of Approximate Reasoning, 2022
arXiv | article
-
Tracking the risk of a deployed model and detecting harmful distribution shifts (with A. Podkopaev).
Intl. Conf. on Learning Representations (ICLR), 2022
arXiv | article
-
Brownian noise reduction: maximizing privacy subject to accuracy constraints (with J. Whitehouse, Z.S. Wu, R. Rogers),
Conf. on Neural Information Processing Systems (NeurIPS), 2022
arXiv | article
-
Sequential estimation of quantiles with applications to A/B-testing and best-arm identification (with S. Howard),
Bernoulli, 2022
arXiv | article | code
-
Brainprints: identifying individuals from magnetoencephalograms (with S. Wu, L. Wehbe),
Nature Communications Biology, 2022
bioRxiv | article
-
Interactive rank testing by betting (with B. Duan, L. Wasserman),
Conf. on Causal Learning and Reasoning (CLeaR), 2022
arXiv | article (oral talk)
-
Large-scale simultaneous inference under dependence (with J. Tian, X. Chen, E. Katsevich, J. Goeman),
Scandinavian J. of Stat., 2022
arXiv | article
-
False discovery rate control with e-values (with R. Wang),
J. of the Royal Stat. Soc., Series B, 2022
arXiv | article
-
Nested conformal prediction and quantile out-of-bag ensemble methods (with C. Gupta, A. Kuchibhotla),
Pattern Recognition, 2022
arXiv | article | code
-
Distribution-free prediction sets for two-layer hierarchical models (with R. Dunn, L. Wasserman),
J of American Stat. Assoc., 2022
arXiv | article | code |
TLDR
Conformal methods typically rely on exchangeable data to provide valid prediction sets in finite samples, but we extend conformal methods to construct prediction sets in a nonexchangeable two-layer hierarchical setting, where N groups of data are exchangeable, and the observations within each group are also exchangeable.
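One simple strategy consistent with this two-layer setting (a hedged sketch, not necessarily the paper's main method): subsampling a single observation per group restores exchangeability across the N groups, after which the usual split-conformal quantile applies.

```python
import numpy as np

def group_conformal_upper(groups, alpha=0.1, seed=0):
    """Sketch: draw one observation per exchangeable group; the N drawn
    values plus a future group's observation are then exchangeable, so the
    standard conformal rank gives a level-(1 - alpha) upper prediction
    bound for a new group's observation."""
    rng = np.random.default_rng(seed)
    scores = np.array([rng.choice(np.asarray(g)) for g in groups])
    n = len(scores)
    k = int(np.ceil((1 - alpha) * (n + 1)))   # conformal rank
    if k > n:
        return float("inf")                   # too few groups for this alpha
    return float(np.sort(scores)[k - 1])
```

Subsampling discards within-group information; the paper develops sets that use all the data.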
-
Fast and powerful conditional randomization testing via distillation (with M. Liu, E. Katsevich, L. Janson),
Biometrika, 2021
arXiv | article | code
-
Uncertainty quantification using martingales for misspecified Gaussian processes (with W. Neiswanger),
Algorithmic Learning Theory (ALT), 2021
arXiv | article | code | talk
-
RiLACS: Risk-limiting audits via confidence sequences (with I. Waudby-Smith, P. Stark),
Intl. Conf. for Electronic Voting (EVoteID), 2021
arXiv | article | code (Best Paper award)
-
Predictive inference with the jackknife+ (with R. Barber, E. Candes, R. Tibshirani),
Annals of Stat., 2021
arXiv | article | code
-
Path length bounds for gradient descent and flow (with C. Gupta, S. Balakrishnan),
J. of Machine Learning Research, 2021
arXiv | article | blog
-
Nonparametric iterated-logarithm extensions of the sequential generalized likelihood ratio test (with J. Shin, A. Rinaldo),
IEEE J. on Selected Areas in Info. Theory, 2021
arXiv | article
-
Time-uniform, nonparametric, nonasymptotic confidence sequences (with S. Howard, J. Sekhon, J. McAuliffe),
The Annals of Stat., 2021
arXiv | article | code | tutorial
-
Off-policy confidence sequences (with N. Karampatziakis, P. Mineiro),
Intl. Conf. on Machine Learning (ICML), 2021
arXiv | article
-
Best arm identification under additive transfer bandits (with O. Neopane, A. Singh),
Asilomar Conf. on Signals, Systems and Computers, 2021
arXiv | article (Best Student Paper award)
-
On the bias, risk and consistency of sample means in multi-armed bandits (with J. Shin, A. Rinaldo),
SIAM J. on the Math. of Data Science, 2021
arXiv | article | talk
-
Dynamic algorithms for online multiple testing (with Z. Xu),
Conf. on Math. and Scientific Machine Learning, 2021
arXiv | article | talk | slides | code |
TLDR
We develop the first practically powerful algorithms that provably control the supremum of the false discovery proportion with high probability in online multiple testing.
-
Online control of the familywise error rate (with J. Tian),
Statistical Methods in Medical Research, 2021
arXiv | article
-
Asynchronous online testing of multiple hypotheses (with T. Zrnic, M. Jordan),
J. of Machine Learning Research, 2021
arXiv | article | code | blog
-
Classification accuracy as a proxy for two sample testing (with I. Kim, A. Singh, L. Wasserman),
Annals of Stat., 2021
arXiv | article | (JSM Stat Learning Student Paper Award) | TLDR
We explore the use of classification accuracy for two-sample testing with general classifiers, proving in particular that the accuracy test based on Fisher's LDA achieves minimax rate-optimal power, and establishing consistency conditions for general classifiers.
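A compact sketch of the accuracy-test idea (hand-rolled Fisher LDA and a normal approximation to the binomial null; details differ from the paper):

```python
import numpy as np

def lda_accuracy_test(x, y, seed=0):
    """Two-sample test via held-out accuracy of Fisher's LDA: pool and
    label the two samples, train LDA on a random half, and reject if the
    held-out accuracy exceeds 1/2 by a one-sided normal-approximation
    binomial margin (level roughly 0.05)."""
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    lab = np.r_[np.zeros(len(x)), np.ones(len(y))]
    idx = rng.permutation(len(z))
    z, lab = z[idx], lab[idx]
    m = len(z) // 2
    ztr, ltr, zte, lte = z[:m], lab[:m], z[m:], lab[m:]
    mu0, mu1 = ztr[ltr == 0].mean(axis=0), ztr[ltr == 1].mean(axis=0)
    cov = np.cov(ztr, rowvar=False) + 1e-6 * np.eye(ztr.shape[1])
    w = np.linalg.solve(cov, mu1 - mu0)            # Fisher's direction
    thresh = w @ (mu0 + mu1) / 2.0
    acc = float(np.mean((zte @ w > thresh) == (lte == 1)))
    crit = 0.5 + 1.645 * np.sqrt(0.25 / len(zte))  # one-sided, alpha ~ 0.05
    return acc, acc > crit
```

Under the null the held-out accuracy is Binomial(m, 1/2)/m, which is what the critical value above calibrates against.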
-
Distribution-free calibration guarantees for histogram binning without sample splitting (with C. Gupta),
Intl. Conf. on Machine Learning, 2021
arXiv | article
-
Distribution-free uncertainty quantification for classification under label shift (with A. Podkopaev),
Conf. on Uncertainty in AI, 2021
arXiv | article
-
Distribution-free binary classification: prediction sets, confidence intervals and calibration (with C. Gupta, A. Podkopaev),
Conf. on Neural Information Processing Systems (NeurIPS), 2020
arXiv | article (spotlight talk)
-
The limits of distribution-free conditional predictive inference (with R. Barber, E. Candes, R. Tibshirani),
Information and Inference, 2020
arXiv | article
-
Analyzing student strategies in blended courses using clickstream data (with N. Akpinar, U. Acar),
Educational Data Mining, 2020
arXiv | article | talk (oral talk)
-
The power of batching in multiple hypothesis testing (with T. Zrnic, D. Jiang, M. Jordan),
Intl. Conf. on AI and Statistics, 2020
arXiv | article | talk
-
Online control of the false coverage rate and false sign rate (with A. Weinstein),
Intl. Conf. on Machine Learning (ICML), 2020
arXiv | article
-
Confidence sequences for sampling without replacement (with I. Waudby-Smith),
Conf. on Neural Information Processing Systems (NeurIPS), 2020
arXiv | article | code (spotlight talk)
-
Universal inference (with L. Wasserman, S. Balakrishnan),
Proc. of the National Academy of Sciences, 2020
arXiv | article | talk
-
A unified framework for bandit multiple testing (with Z. Xu, R. Wang),
Conf. on Neural Information Processing Systems (NeurIPS), 2020
arXiv | article |
talk | slides | code |
TLDR
Using e-values (or e-processes) and the e-BH procedure, we formulate a framework which provides false discovery rate (FDR) control at any stopping time for multiple testing in the bandit setting, that is robust to the dependencies induced by the user’s sampling and stopping policies.
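For reference, the e-BH procedure used above takes only a few lines (a sketch of the standard definition):

```python
def ebh(evalues, alpha=0.05):
    """e-BH: given n e-values, reject the hypotheses carrying the k*
    largest e-values, where k* = max{k : (k-th largest e-value) >= n / (alpha k)}.
    This controls FDR at level alpha under arbitrary dependence."""
    n = len(evalues)
    order = sorted(range(n), key=lambda i: evalues[i], reverse=True)
    kstar = 0
    for k in range(1, n + 1):
        if evalues[order[k - 1]] >= n / (alpha * k):
            kstar = k
    return sorted(order[:kstar])
```

For example, with e-values (100, 50, 1, 0.5, 2) at alpha = 0.1, only the first two hypotheses clear their thresholds (50 and 25) and are rejected.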
-
Simultaneous high-probability bounds on the FDP in structured, regression and online settings (with E. Katsevich),
Annals of Stat., 2020
arXiv | article | code
-
Time-uniform Chernoff bounds via nonnegative supermartingales (with S. Howard, J. Sekhon, J. McAuliffe),
Prob. Surveys, 2020
arXiv | article | talk
-
STAR: A general interactive framework for FDR control under structural constraints (with L. Lei, W. Fithian),
Biometrika, 2020
arXiv | article | poster | code
-
Familywise error rate control by interactive unmasking (with B. Duan, L. Wasserman),
Intl. Conf. on Machine Learning (ICML), 2020
arXiv | article | code
-
Interactive martingale tests for the global null (with B. Duan, S. Balakrishnan, L. Wasserman),
Electronic J. of Stat., 2020
arXiv | article | code
-
On conditional versus marginal bias in multi-armed bandits (with J. Shin, A. Rinaldo),
Intl. Conf. on Machine Learning (ICML), 2020
arXiv | article
-
Are sample means in multi-armed bandits positively or negatively biased? (with J. Shin, A. Rinaldo),
Conf. on Neural Information Processing Systems (NeurIPS), 2019
arXiv | article | poster
-
A higher order Kolmogorov-Smirnov test (with V. Sadhanala, Y. Wang, R. Tibshirani),
Intl. Conf. on AI and Statistics, 2019
arXiv | article
-
ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls (with J. Tian),
Conf. on Neural Information Processing Systems (NeurIPS), 2019
arXiv | code | article
-
A unified treatment of multiple testing with prior knowledge using the p-filter (with R. F. Barber, M. Wainwright, M. Jordan),
Annals of Stat., 2019
arXiv | article | code
-
DAGGER: A sequential algorithm for FDR control on DAGs (with J. Chen, M. Wainwright, M. Jordan),
Biometrika, 2019
arXiv | article | code
-
Conformal prediction under covariate shift (with R. Tibshirani, R. Barber, E. Candes),
Conf. on Neural Information Processing Systems (NeurIPS), 2019
arXiv | article | poster
-
Optimal rates and tradeoffs in multiple testing (with M. Rabinovich, M. Wainwright, M. Jordan),
Statistica Sinica, 2019
arXiv | article | poster
-
Function-specific mixing times and concentration away from equilibrium (with M. Rabinovich, M. Wainwright, M. Jordan),
Bayesian Analysis, 2019
arXiv | article | poster
-
Decoding from pooled data (II): sharp information-theoretic bounds (with A. El-Alaoui, F. Krzakala, L. Zdeborova, M. Jordan),
SIAM J. on Math. of Data Science, 2019
arXiv | article
-
Decoding from pooled data (I): phase transitions of message passing (with A. El-Alaoui, F. Krzakala, L. Zdeborova, M. Jordan),
IEEE Trans. on Info. Theory, 2018
arXiv | article
-
On the power of online thinning in reducing discrepancy (with R. Dwivedi, O. N. Feldheim, O. Gurel-Gurevich),
Prob. Theory and Related Fields, 2018
arXiv | article | poster
-
On kernel methods for covariates that are rankings (with H. Mania, M. Wainwright, M. Jordan, B. Recht),
Electronic J. of Stat., 2018
arXiv | article
-
SAFFRON: an adaptive algorithm for online FDR control (with T. Zrnic, M. Wainwright, M. Jordan),
Intl. Conf. on Machine Learning (ICML), 2018
arXiv | article | code (full oral talk)
-
Online control of the false discovery rate with decaying memory (with F. Yang, M. Wainwright, M. Jordan),
Conf. on Neural Information Processing Systems (NeurIPS), 2017
arXiv | article | poster | talk (from 44:00) (full oral talk)
-
MAB-FDR: Multi-(A)rmed/(B)andit testing with online FDR control (with F. Yang, K. Jamieson, M. Wainwright),
Conf. on Neural Information Processing Systems (NeurIPS), 2017
arXiv | article | code (spotlight talk)
-
QuTE: decentralized FDR control on sensor networks (with J. Chen, M. Wainwright, M. Jordan),
IEEE Conf. on Decision and Control, 2017
arXiv | article | code | poster
-
Iterative methods for solving factorized linear systems (with A. Ma, D. Needell),
SIAM J. on Matrix Analysis and Applications, 2017
arXiv | article
-
Rows vs. columns: randomized Kaczmarz or Gauss-Seidel for ridge regression (with A. Hefny, D. Needell),
SIAM J. on Scientific Computing, 2017
arXiv | article
-
On Wasserstein two sample testing and related families of nonparametric tests (with N. Garcia, M. Cuturi),
Entropy, 2017
arXiv | article
-
Generative models and model criticism via optimized maximum mean discrepancy (with D. Sutherland, H. Tung, H. Strathmann, S. De, A. Smola, A. Gretton),
Intl. Conf. on Learning Representations, 2017
arXiv | article | poster | code
-
Minimax lower bounds for linear independence testing (with D. Isenberg, A. Singh, L. Wasserman),
IEEE Intl. Symp. on Information Theory, 2016
arXiv | article
-
p-filter: multi-layer FDR control for grouped hypotheses (with COAUTHORS),
J. of the Royal Stat. Society, Series B, 2016
arXiv | article | code | poster
-
Sequential nonparametric testing with the law of the iterated logarithm (with A. Balsubramani),
Conf. on Uncertainty in AI, 2016
arXiv | article | errata
-
Asymptotic behavior of Lq-based Laplacian regularization in semi-supervised learning (with A. El-Alaoui, X. Cheng, M. Wainwright, M. Jordan),
Conf. on Learning Theory, 2016
arXiv | article
-
Regularized brain reading with shrinkage and smoothing (with L. Wehbe, R. Steorts, C. Shalizi),
Annals of Applied Stat., 2015
arXiv | article
-
On the high-dimensional power of a linear-time two sample test under mean-shift alternatives (with S. Reddi, A. Singh, B. Poczos, L. Wasserman),
Intl. Conf. on AI and Statistics, 2015
arXiv | article | errata
-
On the decreasing power of kernel and distance based nonparametric hypothesis tests in high dimensions (with S. Reddi*, B. Poczos, A. Singh, L. Wasserman),
AAAI Conf. on Artificial Intelligence, 2015
arXiv | article | supp
-
Fast two-sample testing with analytic representations of probability measures (with K. Chwialkowski, D. Sejdinovic, A. Gretton),
Conf. on Neural Information Processing Systems (NeurIPS), 2015
arXiv | article | code
-
Nonparametric independence testing for small sample sizes (with L. Wehbe),
Intl. Joint Conf. on AI, 2015
arXiv | article (oral talk)
-
Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods (with A. Ma, D. Needell),
SIAM J. on Matrix Analysis and Applications, 2015
arXiv | article | code
-
Fast & flexible ADMM algorithms for trend filtering (with R. Tibshirani),
J. of Computational and Graphical Statistics, 2015
arXiv | article | talk | code
-
Towards a deeper geometric, analytic and algorithmic understanding of margins (with J. Pena),
Opt. Methods and Software, 2015
arXiv | article
-
Margins, kernels and non-linear smoothed perceptrons (with J. Pena),
Intl. Conf. on Machine Learning (ICML), 2014
arXiv | article | poster | talk (oral talk)
-
Simultaneously uncovering the patterns of brain regions involved in different story reading subprocesses (with L. Wehbe, B. Murphy, P. Talukdar, A. Fyshe, T. Mitchell),
PLoS ONE, 2014
website | article
-
An analysis of active learning with uniform feature noise (with A. Singh, L. Wasserman, B. Poczos),
Intl. Conf. on AI and Statistics, 2014
arXiv | article | poster | talk (oral talk)
-
Algorithmic connections between active learning and stochastic convex optimization (with A. Singh),
Conf. on Algorithmic Learning Theory (ALT), 2013
arXiv | article | poster
-
Optimal rates for stochastic convex optimization under Tsybakov's noise condition (with A. Singh),
Intl. Conf. on Machine Learning (ICML), 2013
arXiv | article | poster | talk (oral talk)