Aaditya Ramdas – Sequential uncertainty quantification

A large fraction of published research in top journals in applied sciences such as medicine and psychology has been claimed to be irreproducible. In light of this 'replicability crisis', traditional methods for hypothesis testing, most notably those based on p-values, have come under intense scrutiny. One central problem is the following: if our test result is promising but inconclusive (say, p = 0.07), we cannot simply decide to gather a few more data points. While this practice is ubiquitous in science, it invalidates p-values and error guarantees, and it makes the results of standard meta-analyses very hard to interpret. This issue is not unique to p-values: other approaches, such as replacing testing by estimation with confidence intervals, suffer from similar optional continuation problems. Over the last few years, several distinct but closely related solutions have been proposed, such as anytime-valid confidence sequences and p-values, and safe tests.
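The optional continuation problem described above can be seen in a small simulation. The following is only an illustrative sketch under assumptions not taken from the papers listed here: the data are standard normal (so the null of zero mean is true), the test is a two-sided z-test with known variance, and the function names and parameters (`false_positive_rate`, `peek`, `n_min`, `n_max`) are hypothetical. It compares an analyst who tests once at a fixed sample size with one who recomputes the p-value after every new observation and stops as soon as it drops below the nominal level.

```python
import math
import random


def two_sided_p_value(z):
    """Two-sided p-value of a z-statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rate(peek, n_max=200, n_min=10, alpha=0.05,
                        n_runs=2000, seed=0):
    """Estimate how often a true null is rejected.

    peek=False: a single test after n_max observations.
    peek=True:  a test after every observation from n_min onward,
                stopping at the first p-value below alpha.
    """
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_runs):
        total = 0.0
        rejected = False
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # null is true: mean 0, variance 1
            look = (peek and n >= n_min) or (n == n_max)
            if look and two_sided_p_value(total / math.sqrt(n)) < alpha:
                rejected = True
                break
        false_positives += rejected
    return false_positives / n_runs


if __name__ == "__main__":
    print("fixed sample size:", false_positive_rate(peek=False))
    print("peeking every step:", false_positive_rate(peek=True))
```

In this sketch the fixed-sample analyst should reject at roughly the nominal rate, while the peeking analyst rejects noticeably more often, even though each individual p-value is computed correctly.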

Remarkably, all these approaches can be understood in terms of (sequential) gambling. One formulates a gambling strategy under which one would not expect to gain any money if the null hypothesis were true. If, for the given data, one would have won a large amount of money in this game, this provides evidence against the null hypothesis. The test statistic of traditional statistics is replaced by the gambling strategy; the p-value is replaced by the (virtual) amount of money gained. In more mathematical terms, evidence against the null and confidence sets are derived in terms of nonnegative supermartingales. While this idea in essence goes back to Wald's sequential testing of the 1940s and its extensions by Robbins and coauthors in the late 1960s and Lai in the 1970s, it never really caught on because it used to be applicable only to very simple statistical models and testing scenarios. However, recent work shows that this idea is essentially universally applicable: one can design supermartingales for large classes of nonparametric tests and many estimation problems, and one can analyze them using novel tools such as nonasymptotic versions of the law of the iterated logarithm. These directions also go some way toward uniting Bayesian and frequentist ways of thinking: prior knowledge can be used explicitly, and correct frequentist inference is often obtained using Bayesian techniques.
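To make the gambling picture concrete, here is a minimal sketch for the simplest possible case, which is an assumption chosen for illustration and not the method of any particular paper below: the null hypothesis is that a coin is fair, and the gambler bets a fixed fraction `bet` of current wealth on heads in every round. Under the null each round multiplies the wealth by a factor with expectation 1, so the wealth is a nonnegative martingale starting at 1. By Ville's inequality, the probability under the null that the wealth ever reaches 1/alpha is at most alpha, so one may stop and reject at any time. All function and parameter names are hypothetical.

```python
import random


def wealth_process(flips, bet=0.2):
    """Wealth path of a gambler betting a fixed fraction on heads.

    flips: iterable of 1 (heads) or 0 (tails).
    bet:   fraction of wealth staked each round, strictly between -1 and 1
           so that the wealth stays positive.
    """
    wealth = 1.0
    path = []
    for x in flips:
        # Heads multiplies wealth by (1 + bet), tails by (1 - bet);
        # under a fair coin the expected multiplier is exactly 1.
        wealth *= 1.0 + bet * (2 * x - 1)
        path.append(wealth)
    return path


def ever_rejects(flips, alpha=0.05, bet=0.2):
    """Reject the null if the wealth ever reaches 1/alpha."""
    return any(w >= 1.0 / alpha for w in wealth_process(flips, bet))


def rejection_rate(p_heads, n_flips=500, n_runs=2000, alpha=0.05, seed=0):
    """Fraction of simulated coin sequences on which the null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_runs):
        flips = [1 if rng.random() < p_heads else 0 for _ in range(n_flips)]
        rejections += ever_rejects(flips, alpha)
    return rejections / n_runs


if __name__ == "__main__":
    print("fair coin (null true):", rejection_rate(0.5))
    print("biased coin (p = 0.7):", rejection_rate(0.7))
```

The fixed betting fraction is a deliberately crude strategy; a better-chosen or adaptive bet would grow wealth faster under the alternative, but the guarantee under the null does not depend on that choice.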

Anytime-valid, safe confidence intervals and p-values (package)

  • Sequential estimation of quantiles with applications to A/B-testing and best-arm identification
    S. Howard, A. Ramdas   arxiv   code

  • Uniform, nonparametric, nonasymptotic confidence sequences
    S. Howard, A. Ramdas, J. Sekhon, J. McAuliffe   arxiv   talk

  • Exponential line-crossing inequalities
    S. Howard, A. Ramdas, J. Sekhon, J. McAuliffe  
    (PS) Probability Surveys, 2020   arxiv

  • Universal inference using the split likelihood ratio test
    L. Wasserman, A. Ramdas, S. Balakrishnan   arxiv

  • Sequential nonparametric testing with the law of the iterated logarithm
    A. Balsubramani*, A. Ramdas*
    (UAI) Uncertainty in Artificial Intelligence, 2016   arxiv   UAI   errata

Multi-armed bandits

All of the aforementioned techniques come in handy when designing new algorithms for multi-armed bandit problems, and when analyzing what existing algorithms are doing, in considerable generality.

  • On conditional versus marginal bias in multi-armed bandits
    J. Shin, A. Ramdas, A. Rinaldo   arxiv

  • Are sample means in multi-armed bandits positively or negatively biased?
    J. Shin, A. Ramdas, A. Rinaldo  
    (NeurIPS) Neural Information Processing Systems, 2019   arxiv

  • On the bias, risk and consistency of sample means in multi-armed bandits
    J. Shin, A. Ramdas, A. Rinaldo   arxiv   talk

  • MAB-FDR: Multi (A)rmed/(B)andit testing with online FDR control
    F. Yang, A. Ramdas, K. Jamieson, M. Wainwright
    (NeurIPS) Neural Information Processing Systems, 2017   arxiv   code   30-min talk   NeurIPS   (spotlight talk)