Aaditya Ramdas – Sequential uncertainty quantification

A large fraction of published research in top journals in applied sciences such as medicine and psychology has been claimed to be irreproducible. In light of this ‘replicability crisis’, traditional methods for hypothesis testing, most notably those based on p-values, have come under intense scrutiny. One central problem is the following: if our test result is promising but inconclusive (say, p = 0.07), we cannot simply decide to gather a few more data points. While this practice is ubiquitous in science, it invalidates p-values and error guarantees and makes the results of standard meta-analyses very hard to interpret. This issue is not unique to p-values: other approaches, such as replacing testing by estimation with confidence intervals, suffer from similar optional continuation problems. Over the last few years several distinct but closely related solutions have been proposed, such as anytime-valid confidence sequences and p-values, and safe tests.
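
To make the optional continuation problem concrete, here is a minimal simulation sketch. It rests on assumptions not taken from the text above: the data are standard normal with known unit variance, the null hypothesis is that the mean is zero, and the analyst recomputes a two-sided z-test after every new observation and stops at the first "significant" result. All function names and parameter values are hypothetical.

```python
import math
import random


def fixed_n_rejects(rng, n=200, z_crit=1.96):
    """Classical test: collect exactly n points, then test once."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return abs(total / math.sqrt(n)) > z_crit


def peeking_rejects(rng, n_start=10, n_max=200, z_crit=1.96):
    """Optional stopping: test after every new point from n_start onward
    and stop as soon as the nominal 5% threshold is crossed."""
    total = 0.0
    for n in range(1, n_max + 1):
        total += rng.gauss(0.0, 1.0)  # the null is true: mean 0, variance 1
        if n >= n_start and abs(total / math.sqrt(n)) > z_crit:
            return True
    return False


def false_positive_rate(test, n_sims=2000, seed=0):
    """Fraction of simulated null datasets on which `test` rejects."""
    rng = random.Random(seed)
    return sum(test(rng) for _ in range(n_sims)) / n_sims


if __name__ == "__main__":
    print("fixed sample size:", false_positive_rate(fixed_n_rejects))
    print("with peeking:     ", false_positive_rate(peeking_rejects))
```

In this setup the fixed-sample test should reject close to 5% of the time, while the peeking analyst rejects far more often, even though the null hypothesis holds in every simulated dataset.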

Remarkably, all these approaches can be understood in terms of (sequential) gambling. One formulates a gambling strategy under which one would not expect to gain any money if the null hypothesis were true. If for the given data one would have won a large amount of money in this game, this provides evidence against the null hypothesis. The test statistic in traditional statistics gets replaced by the gambling strategy; the p-value gets replaced by the (virtual) amount of money gained. In more mathematical terms, evidence against the null and confidence sets are derived in terms of nonnegative supermartingales. While this idea in essence goes back to Wald’s sequential testing of the 1940s and its extensions by Robbins and coauthors in the 1960s and Lai in the 1970s, it never really caught on because it used to be applicable only to very simple statistical models and testing scenarios. However, recent work shows that this idea is essentially universally applicable: one can design supermartingales for large classes of nonparametric tests and many estimation problems, and one can analyze them using novel tools such as nonasymptotic versions of the law of the iterated logarithm. These directions also go some way toward uniting Bayesian and frequentist ways of thinking: prior knowledge can be used explicitly, and correct frequentist inference is often obtained using Bayesian techniques.
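
The gambling picture can be sketched in a few lines for a toy problem: testing whether a coin is fair. The setup below is an illustrative assumption, not a method taken from the papers listed on this page. The gambler starts with wealth 1 and bets a fixed fraction of it on heads at even odds in every round, so under the null the wealth is a nonnegative martingale. By Ville's inequality, the probability that the wealth ever reaches 1/alpha under the null is at most alpha, so rejecting the first time that happens is valid at any stopping time. The names `betting_wealth`, `rejects_anytime` and `rejection_rate`, and the fixed bet fraction of 0.2, are hypothetical choices.

```python
import random


def betting_wealth(flips, bet_fraction=0.2):
    """Wealth path of a gambler who stakes `bet_fraction` of current wealth
    on heads (x = 1) at even odds. Under a fair coin the expected multiplier
    per round is 1, so the wealth is a nonnegative martingale."""
    wealth = 1.0
    path = []
    for x in flips:
        wealth *= 1.0 + bet_fraction * (2 * x - 1)  # win: x1.2, lose: x0.8
        path.append(wealth)
    return path


def rejects_anytime(flips, alpha=0.05, bet_fraction=0.2):
    """Reject the null as soon as the wealth has ever reached 1/alpha."""
    return any(w >= 1.0 / alpha for w in betting_wealth(flips, bet_fraction))


def rejection_rate(p_heads, n_flips=500, n_sims=2000, seed=0):
    """Fraction of simulated coin sequences on which the test rejects."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_sims):
        flips = [1 if rng.random() < p_heads else 0 for _ in range(n_flips)]
        count += rejects_anytime(flips)
    return count / n_sims


if __name__ == "__main__":
    print("fair coin (null true):", rejection_rate(0.5))
    print("biased coin (p = 0.7):", rejection_rate(0.7))
```

Unlike a repeatedly recomputed p-value, this test may be monitored after every flip without inflating the false positive rate. A fixed bet fraction is the simplest possible strategy; adaptive strategies that learn how to bet from past data would typically gain wealth faster against a wider range of alternatives.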

Anytime-valid, safe confidence intervals and p-values (package) (tutorial)

  • Admissible anytime-valid sequential inference must rely on nonnegative martingales
    A. Ramdas, J. Ruf, M. Larsson, W. Koolen       arxiv

  • Testing exchangeability: fork-convexity, supermartingales, and e-processes
    A. Ramdas, J. Ruf, M. Larsson, W. Koolen       Intl J of Approximate Reasoning, 2021   arxiv   proc

  • Estimating means of bounded random variables by betting
    I. Waudby-Smith, A. Ramdas       arxiv  

  • Comparing sequential forecasters
    Y.J. Choe, A. Ramdas       arxiv  

  • RiLACS: Risk-limiting audits via confidence sequences
    I. Waudby-Smith, P. Stark, A. Ramdas       EVoteID, 2021   arxiv   (Best Paper award)

  • Sequential estimation of convex functionals and divergences
    T. Manole, A. Ramdas       arxiv   video   (Student Research Award, Statistical Society of Canada)

  • Off-policy confidence sequences
    N. Karampatziakis, P. Mineiro, A. Ramdas       ICML, 2021   arxiv  

  • Doubly robust confidence sequences for sequential causal inference
    I. Waudby-Smith, D. Arbour, R. Sinha, E. Kennedy, A. Ramdas       arxiv   package  

  • Time-uniform, nonparametric, nonasymptotic confidence sequences
    S. Howard, A. Ramdas, J. Sekhon, J. McAuliffe       The Annals of Stat., 2021   arxiv   proc   code   tutorial

  • Time-uniform Chernoff bounds via nonnegative supermartingales
    S. Howard, A. Ramdas, J. Sekhon, J. McAuliffe       Prob. Surveys, 2020   arxiv   proc   talk

  • Universal inference
    L. Wasserman, A. Ramdas, S. Balakrishnan       PNAS, 2020   arxiv   proc   talk

  • Sequential nonparametric testing with the law of the iterated logarithm
    A. Balsubramani*, A. Ramdas*       Uncertainty in AI, 2016   arxiv   proc   errata

  • Nonparametric iterated-logarithm extensions of the sequential generalized likelihood ratio test
    J. Shin, A. Ramdas, A. Rinaldo       IEEE J. on Selected Areas in Info. Theory, 2021   arxiv   proc

  • Sequential estimation of quantiles with applications to A/B-testing and best-arm identification
    S. Howard, A. Ramdas       Bernoulli, 2022   arxiv   code

Multi-armed bandits

All of the aforementioned techniques come in handy when designing new algorithms for multi-armed bandit problems, as well as when trying to understand, in considerable generality, what existing algorithms are doing.

  • A unified framework for bandit multiple testing
    Z. Xu, R. Wang, A. Ramdas       NeurIPS, 2021   arxiv  

  • Best arm identification under additive transfer bandits
    O. Neopane, A. Singh, A. Ramdas       Asilomar, 2021   (Best Student Paper award)

  • On conditional versus marginal bias in multi-armed bandits
    J. Shin, A. Ramdas, A. Rinaldo       ICML, 2020   arxiv

  • Are sample means in multi-armed bandits positively or negatively biased?
    J. Shin, A. Ramdas, A. Rinaldo       NeurIPS, 2019   arxiv   poster   proc

  • On the bias, risk and consistency of sample means in multi-armed bandits
    J. Shin, A. Ramdas, A. Rinaldo       SIAM J Math of Data Science (SIMODS), 2022   arxiv   talk

  • MAB-FDR: Multi (A)rmed/(B)andit testing with online FDR control
    F. Yang, A. Ramdas, K. Jamieson, M. Wainwright       NeurIPS, 2017   arxiv   code   30-min talk   proc   (spotlight talk)