Safe, Anytime-Valid Inference (SAVI).

A large fraction of the research published in top journals in applied sciences such as medicine and psychology has been claimed to be irreproducible. In light of this 'replicability crisis', traditional methods for hypothesis testing, most notably those based on p-values, have come under intense scrutiny. One central problem is the following: if a test result is promising but inconclusive (say, p = 0.07), we cannot simply decide to gather a few more data points. While this practice is ubiquitous in science, it invalidates p-values and error guarantees and makes the results of standard meta-analyses very hard to interpret. The issue is not unique to p-values: other approaches, such as replacing testing by estimation with confidence intervals, suffer from similar optional continuation problems. Over the last few years several distinct but closely related solutions have been proposed, such as anytime-valid confidence sequences, anytime-valid p-values, and safe tests.
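To make the optional continuation problem concrete, here is a minimal simulation sketch in Python (an illustration, not part of the workshop materials). It assumes standard normal data with known variance, a two-sided z-test at level 0.05, and an analyst who checks the p-value after every observation from n = 10 to n = 100 and declares success at the first 'significant' look. The function names and all parameter values are hypothetical choices made for this example.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rates(n_sims=2000, n_min=10, n_max=100, alpha=0.05, seed=0):
    """Simulate data for which the null (mean 0) is TRUE and return
    (rate for one test at n_max, rate when peeking at every n >= n_min)."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        total = 0.0
        significant_at_some_look = False
        p = 1.0
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)
            if n >= n_min:
                # z-statistic for the mean with known unit variance
                p = two_sided_p(total / math.sqrt(n))
                if p < alpha:
                    significant_at_some_look = True
        # After the loop, p is the p-value of the single test at n_max.
        if p < alpha:
            fixed_hits += 1
        if significant_at_some_look:
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims


if __name__ == "__main__":
    fixed, peeking = false_positive_rates()
    print(f"fixed sample size: {fixed:.3f}   with peeking: {peeking:.3f}")
```

Under these assumptions the single test at the planned sample size rejects a true null at roughly the nominal rate, while the analyst who keeps looking rejects it considerably more often, which is exactly the loss of error control described above.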

Remarkably, all these approaches can be understood in terms of (sequential) gambling. One formulates a gambling strategy under which one would not expect to gain any money if the null hypothesis were true. If for the given data one would have won a large amount of money in this game, this provides evidence against the null hypothesis. The test statistic in traditional statistics gets replaced by the gambling strategy; the p-value gets replaced by the (virtual) amount of money gained. In more mathematical terms, evidence against the null and confidence sets are derived in terms of nonnegative supermartingales. While this idea in essence goes back to Wald’s sequential testing of the 1950s and its extensions by Robbins, Darling, Siegmund and Lai in the 1960s and 1970s, it never really caught on because it used to be applicable only to very simple statistical models and testing scenarios.
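The gambling picture can be sketched for the simplest possible case: testing whether a coin is fair. This is an illustrative toy example, not a method prescribed by the workshop. The gambler starts with 1 unit and stakes a fixed fraction of current wealth on heads at even odds, so that under the null the wealth is a nonnegative martingale. By Ville's inequality, the probability that the wealth ever reaches 1/alpha is then at most alpha, regardless of when one stops. The betting fraction 0.4, the sample sizes, and the function names are assumptions chosen for this sketch; with this fraction each multiplier equals the likelihood ratio of a coin with heads probability 0.7 against a fair coin.

```python
import random


def wealth_path(flips, bet=0.4):
    """Wealth after each flip for a gambler who starts with 1 unit and
    stakes a fraction `bet` of current wealth on heads at even odds.
    Each flip is 1 (heads) or 0 (tails)."""
    wealth = 1.0
    path = []
    for x in flips:
        # Heads multiplies wealth by (1 + bet), tails by (1 - bet);
        # for a fair coin the expected multiplier is exactly 1.
        wealth *= 1.0 + bet * (2 * x - 1)
        path.append(wealth)
    return path


def rejection_rate(p_heads, n_flips=200, alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated sequences in which the wealth reaches
    1/alpha at ANY time up to n_flips, i.e. the null 'fair coin' is
    rejected under unrestricted optional stopping."""
    rng = random.Random(seed)
    threshold = 1.0 / alpha
    rejections = 0
    for _ in range(n_sims):
        flips = [1 if rng.random() < p_heads else 0 for _ in range(n_flips)]
        if max(wealth_path(flips)) >= threshold:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("fair coin (null true):   ", rejection_rate(0.5))
    print("biased coin (p = 0.7):   ", rejection_rate(0.7))
```

In this sketch the rejection rate for a fair coin stays at or below alpha up to simulation noise even though the wealth is monitored after every flip, while a coin biased toward heads is detected in most runs.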

However, recent work shows that this idea is essentially universally applicable: one can design supermartingales for large classes of tests, both parametric and nonparametric, and for many estimation problems, yielding anytime-valid p-values using nonasymptotic versions of the law of the iterated logarithm. A variation of the idea, the S-value, can even be applied to completely arbitrary tests, at the price of allowing only a weaker form of optional stopping. All these techniques have both a gambling and a Bayes factor interpretation. These directions therefore go some way toward uniting Bayesian and frequentist ways of thinking: they allow the explicit use of prior knowledge and often rely on Bayesian techniques, while retaining frequentist error control and confidence bounds.

This workshop aims to bring together two groups of researchers: those who have been developing the mathematical, probabilistic and statistical foundations of this area, and practitioners who have studied and written about the reproducibility crisis in the sciences.

There has been much progress on these martingale techniques in the last 5 years or so, yet researchers in the first group often do not know or understand each other's work very well; they also do not know what the most pressing issues in 'reproducibility practice' are. Researchers in the second group, in turn, often do not know about the new (or revived older) approaches and what they can achieve. By bringing the two groups together, we hope to stimulate new theory informed by what is really important in practice, as well as practical applications of highly promising theoretical developments. We will start with two tutorials that should be understandable to everyone with a basic knowledge of classical statistics, and thus to all invitees. In the remainder of the workshop we will intersperse more general talks (given by researchers in the second group and a subset of researchers in the first) with more technical talks by researchers in the first group. While some of these may be quite technical, we will ensure that every day a large fraction of the talks is accessible to the full intended audience.

This conference supports the Welcoming Environment Statement of the Association for Women in Mathematics (AWM).

This workshop is being co-sponsored by EURANDOM, 6PAC and the IMS.

Organizing Committee

- Aaditya Ramdas

Carnegie Mellon University

- Peter Grunwald

CWI, Amsterdam

- Supporting organizer:

Emilie Kaufmann

INRIA, Lille

- Administrative support:

Patty Koorn

EURANDOM

Confirmed attendees

Glenn Shafer

Professor, Rutgers University (Statistics)

Christian Robert

Professor, Université Paris-Dauphine + Warwick University (Statistics)

Vladimir Vovk

Professor, Royal Holloway London (CS)

Larry Wasserman

Professor, Carnegie Mellon University (Stat+ML)

Chris Jennison

Professor, University of Bath (Stat)

Leonhard Held

Professor, University of Zurich (Biostatistics, Center for Reproducible Science)

Eric-Jan Wagenmakers

Professor, University of Amsterdam (Psychology)

Deborah Mayo

Professor, Virginia Tech (Philosophy)

Luigi Pace

Professor, University of Udine (Economics, Statistics)

Alessandra Salvan

Professor, University of Padova (Statistics)

Daniel Lakens

Associate Professor, Eindhoven University of Technology (Human-Technology Interaction)

Theis Lange

Associate Professor, University of Copenhagen (Biostatistics)

Ruodu Wang

Associate Professor, University of Waterloo (Statistics, Actuarial Science)

Mark Simmonds

Sr. Research Fellow, University of York (Center for Reviews and Dissemination)

Wouter Koolen

Sr. Researcher, CWI, Amsterdam (CS)

Alexander Ly

Postdoc, University of Amsterdam and CWI (Psychology, CS)

Akshay Balsubramani

Postdoc, Stanford University (Genetics, CS)

Leonid Pekelis

Statistician, Opendoor

Alan Malek

Statistician, Optimizely

Anne Lyngholm Sørensen

PhD student, University of Copenhagen (Biostatistics)

Boyan Duan

PhD student, Carnegie Mellon University (Statistics)

Rianne de Heide

PhD student, CWI (CS)

Tudor Manole

PhD student, Carnegie Mellon University (Statistics)

Judith ter Schure

PhD student, CWI (CS)

Ian Waudby-Smith

PhD student, Carnegie Mellon University (Statistics)

Rosanne Turner

PhD student, CWI/University Medical Centre Utrecht (Statistics)

Aaditya Ramdas

Assistant Professor, Carnegie Mellon University (Statistics, Machine Learning)

Peter Grunwald

Professor, CWI (CS)

Registration Details

TBD

Dates, Times, and Location

The workshop is a full 5 days. The talks will begin around 8:30am on Monday May 25, 2020 and will end by 5pm on Friday May 29, 2020. The workshop will take place at EURANDOM in Eindhoven, Netherlands.

Availability of Funding

TBD

Poster Session

There will be one or two poster sessions. Anyone wishing to present a poster should indicate this on the online registration form. The topic of the poster should be related to the content of the workshop. Please check with the organizers if you are unsure about the suitability of your poster topic.

Lodging Information

TBD

Potential Topics include (but are certainly not limited to):

Confidence sequences (anytime valid confidence intervals) and techniques to construct them

Always-valid p-values (and their relationship to confidence sequences)

Safe testing and S-values: how they are similar to and different from p-values, and how to construct them

Sequential testing using the sequential likelihood ratio test and its nonparametric generalizations

The mixture method, self-normalization, nuisance parameters, and the law of the iterated logarithm

Gambling, betting, Bayes factors, supermartingales and friends

Reproducibility, or lack thereof, in science, and the relationship of the crisis to anytime/safe inference

Existing public software packages for these new methods, and how to use them

Other applications of the aforementioned methods, outside of the sciences

Historical perspectives: what did Wald, Robbins, Ville, Doob and others do over 50 years ago?

(Acknowledgments to Todd Kuffner from whom this workshop webpage template was borrowed.)