Safe, Anytime-Valid Inference (SAVI) and Game-theoretic Statistics, 2022

~~May 25-29, 2020 (Covid), Jun 28 to Jul 2, 2021 (Covid)~~ May 30-Jun 3, 2022 in Eindhoven, Netherlands

Official workshop page at EURANDOM

A large fraction of published research in top journals in applied sciences such as medicine and psychology has been claimed to be irreproducible. In light of this 'replicability crisis', traditional methods for hypothesis testing, most notably those based on p-values, have come under intense scrutiny. One central problem is the following: if our test result is promising but inconclusive (say, p = 0.07), we cannot simply decide to gather a few more data points. While this practice is ubiquitous in science, it invalidates p-values and error guarantees and makes the results of standard meta-analyses very hard to interpret. This issue is not unique to p-values: other approaches, such as replacing testing by estimation with confidence intervals, suffer from similar optional stopping/continuation problems. Over the last few years several distinct but closely related solutions have been proposed, such as confidence sequences, anytime-valid p-values, and e-processes/e-values. These form the basis of safe, anytime-valid inference.
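The optional stopping problem described above can be illustrated with a small simulation. The following is a minimal sketch, not taken from the workshop material: it assumes standard normal data under the null hypothesis, a two-sided z-test, and an analyst who checks the p-value after every observation from `n_min` to `n_max`; all function names and parameter values are illustrative choices.

```python
import math
import random


def two_sided_p_value(z):
    """Two-sided p-value of a z-statistic under the standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def rejects_fixed_n(rng, n, alpha):
    """Classical test: look at the data exactly once, after n observations."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return two_sided_p_value(total / math.sqrt(n)) < alpha


def rejects_with_peeking(rng, n_min, n_max, alpha):
    """Optional stopping: test after every observation, stop at the first p < alpha."""
    total = 0.0
    for n in range(1, n_max + 1):
        total += rng.gauss(0.0, 1.0)
        if n >= n_min and two_sided_p_value(total / math.sqrt(n)) < alpha:
            return True
    return False


def type_one_error_rates(n_sims=2000, n_min=10, n_max=100, alpha=0.05, seed=0):
    """Estimate the false rejection rate of both procedures when the null is true."""
    rng = random.Random(seed)
    fixed = sum(rejects_fixed_n(rng, n_max, alpha) for _ in range(n_sims))
    peeking = sum(rejects_with_peeking(rng, n_min, n_max, alpha) for _ in range(n_sims))
    return fixed / n_sims, peeking / n_sims


if __name__ == "__main__":
    fixed_rate, peeking_rate = type_one_error_rates()
    print(f"fixed sample size: {fixed_rate:.3f}")
    print(f"with peeking:      {peeking_rate:.3f}")
```

In this setup the fixed-sample test should reject a true null in roughly 5% of runs, while the peeking analyst rejects far more often, even though every individual p-value was computed correctly.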

Remarkably, all these approaches can be understood in terms of (sequential) gambling. One formulates a gambling strategy under which one would not expect to gain any money if the null hypothesis were true. If, for the given data, one would have won a large amount of money in this game, this provides evidence against the null hypothesis. The test statistic in traditional statistics gets replaced by the gambling strategy; the p-value gets replaced by the (virtual) amount of money gained. In more mathematical terms, evidence against the null and confidence sets are derived in terms of nonnegative supermartingales, or more generally, e-processes. While this idea in essence goes back to Wald’s sequential testing of the 1940s and its extensions by Robbins, Darling, Siegmund and Lai in the 1960s and 1970s, it never really caught on because it used to be applicable only to very simple statistical models and testing scenarios. Another line of ideas, beginning with Ville (1939), has been resurrected by Shafer and Vovk under the name of game-theoretic probability. The applications and generalizations of all these ideas form an active area that we call game-theoretic statistics.
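The gambling interpretation can be made concrete with a toy example. The sketch below is an illustration under stated assumptions rather than a method from the workshop: the null hypothesis is that a coin is fair, the gambler starts with wealth 1 and stakes a fixed fraction of current wealth on heads in every round, and the null is rejected as soon as wealth reaches 1/alpha. Under the null this wealth is a nonnegative martingale, so by Ville's inequality the probability of ever reaching 1/alpha is at most alpha, no matter when one stops. The names `wealth_process`, `rejects_at_any_time`, `false_alarm_rate` and the value `bet_fraction=0.5` are hypothetical.

```python
import random


def wealth_process(outcomes, bet_fraction=0.5):
    """Wealth after each round when betting on heads against H0: P(heads) = 1/2.

    Each outcome is 1 (heads) or 0 (tails). A fraction `bet_fraction` of the
    current wealth is staked at even odds, so wealth is multiplied by
    1 + bet_fraction on heads and by 1 - bet_fraction on tails.
    """
    if not -1.0 <= bet_fraction <= 1.0:
        raise ValueError("bet_fraction must lie in [-1, 1] to keep wealth nonnegative")
    wealth = 1.0
    path = []
    for x in outcomes:
        payoff = 1.0 if x == 1 else -1.0
        wealth *= 1.0 + bet_fraction * payoff
        path.append(wealth)
    return path


def rejects_at_any_time(outcomes, alpha=0.05, bet_fraction=0.5):
    """Reject H0 if the wealth ever reaches 1/alpha along the sequence."""
    return any(w >= 1.0 / alpha for w in wealth_process(outcomes, bet_fraction))


def false_alarm_rate(n_sims=2000, horizon=200, alpha=0.05, seed=0):
    """Fraction of fair-coin sequences on which the test ever rejects."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        flips = [rng.randint(0, 1) for _ in range(horizon)]
        hits += rejects_at_any_time(flips, alpha)
    return hits / n_sims


if __name__ == "__main__":
    print(wealth_process([1, 1, 0]))  # wealth after heads, heads, tails
    print(false_alarm_rate())         # should stay near or below alpha
```

A fixed stake is the simplest possible strategy. A gambler who adapts the stake to the data seen so far still produces a valid test, as long as each bet is chosen before the next outcome is revealed.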

Recent work shows that these ideas are essentially universally applicable: one can design nonnegative martingales, supermartingales or e-processes for large classes of tests and for many estimation problems, both parametric and nonparametric, thereby also yielding sequential tests, anytime-valid p-values using nonasymptotic versions of the law of the iterated logarithm, and confidence sequences by inversion. All these techniques have both a gambling and a Bayes factor interpretation. Thus, these directions go some way toward uniting Bayesian and frequentist ways of thinking: they offer the explicit ability to use prior knowledge together with frequentist error control and confidence bounds, often by means of Bayesian techniques.
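As one illustration of a confidence sequence obtained by inversion, the sketch below uses a Gaussian mixture martingale. It assumes independent observations with unknown mean mu and known unit variance, and a N(0, 1/rho) mixing distribution over the betting parameter; the tuning value `rho` and all function names are illustrative assumptions. With S_n the sum of (X_i - mu), the mixture wealth is sqrt(rho / (n + rho)) * exp(S_n^2 / (2 (n + rho))), and the confidence sequence at time n collects every mu for which this wealth is still below 1/alpha.

```python
import math
import random


def mixture_wealth(centered_sum, n, rho=1.0):
    """Gaussian-mixture martingale evaluated at S_n = sum of (X_i - mu)."""
    return math.sqrt(rho / (n + rho)) * math.exp(centered_sum ** 2 / (2.0 * (n + rho)))


def confidence_radius(n, alpha=0.05, rho=1.0):
    """Half-width at time n: the set of mu with mixture_wealth < 1/alpha
    is the interval (sample mean - radius, sample mean + radius)."""
    return math.sqrt((n + rho) * math.log((n + rho) / (rho * alpha ** 2))) / n


def confidence_sequence(observations, alpha=0.05, rho=1.0):
    """Return a (lower, upper) interval after every observation."""
    intervals = []
    total = 0.0
    for n, x in enumerate(observations, start=1):
        total += x
        mean = total / n
        radius = confidence_radius(n, alpha, rho)
        intervals.append((mean - radius, mean + radius))
    return intervals


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.3, 1.0) for _ in range(1000)]  # hypothetical true mean 0.3
    intervals = confidence_sequence(data)
    for n in (10, 100, 1000):
        low, high = intervals[n - 1]
        print(f"n={n:4d}: [{low:.3f}, {high:.3f}]")
```

Under these assumptions the intervals cover the true mean simultaneously for all n with probability at least 1 - alpha, so the analyst may stop or continue at any time. The price is an interval somewhat wider than the fixed-sample interval at any single n.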

This workshop aims to get together two groups of researchers --- those who have been developing the mathematical, probabilistic and statistical foundations of this area, and practitioners who have studied and written about the reproducibility crisis in the sciences.

There has been much progress on these martingale techniques in the last 5 years or so, and researchers within the first group often do not know or understand each other’s work very well; they also don’t know what the most pressing issues in ‘reproducibility practice’ are. Researchers within the second group, in turn, often do not know about the new (or resurrected old) approaches and what they can achieve. The aim of this workshop is to bring these two groups together, hopefully resulting in new theory informed by what is really important in practice, and in practical applications of highly promising theoretical developments. We will start with two tutorials that should be understandable for everyone with basic knowledge of classical statistics, and thus for all invitees. In the remainder of the workshop we will intersperse more general talks (given by researchers in the second group and a subset of researchers in the first) with more technical talks by researchers in the first group. While some of these may be quite technical, we will strive to ensure that a large fraction of every day's talks are accessible to the full intended audience.

This conference supports the Welcoming Environment Statement of the Association for Women in Mathematics (AWM).

This workshop is being co-sponsored by EURANDOM, 6PAC and the IMS.

Organizing Committee

Aaditya Ramdas
Carnegie Mellon University

Peter Grunwald
CWI, Amsterdam

Administrative support:
Patty Koorn
EURANDOM

Registration Details and Confirmed Attendees

A partial list of confirmed attendees is available at the EURANDOM website linked above, and is updated from time to time. Please email the organizers if you wish to attend; approval will be granted on the basis of available space (limited due to COVID restrictions).

Dates, Times, and Location

The workshop runs for a full 5 days. The talks will begin around 8:30am on Monday May 30, 2022 and will end by 5pm on Friday Jun 3, 2022. The workshop will take place at EURANDOM in Eindhoven, Netherlands.

Tentative Schedule

Contributed Reading List

Click here to contribute to the above reading list for SAVI

Poster Session

There will be one or two poster sessions. Anyone wishing to present a poster should indicate this on the online registration form. The topic of the poster should be related to the content of the workshop. Please check with the organizers if you are unsure about the suitability of your poster topic.

Potential Topics include (but are certainly not limited to):
Confidence sequences (anytime-valid confidence intervals) and techniques to construct them
Always-valid p-values (and their relationship to confidence sequences)
E-values/e-processes: how they are similar to and different from p-values, and how to construct them
Sequential testing using the sequential likelihood ratio test and its nonparametric generalizations
The mixture method, self-normalization, nuisance parameters, and the law of the iterated logarithm
Gambling, betting, Bayes factors, supermartingales and friends
Reproducibility, or lack thereof, in science, and the relationship of the crisis to anytime/safe inference
Existing public software packages for these new methods, and how to use them
Other applications of the aforementioned methods, outside of the sciences
Historical perspectives: what did Wald, Robbins, Ville, Doob and others do over 50 years ago?