R package to scrape soccer commentary and statistics from ESPN.


An R package to compute WAR for offensive players using nflscrapR.


Developing an R package with Max Horowitz and Sam Ventura that lets R users access and analyze data from the National Football League (NFL) API. The functions in this package allow users to perform analysis at the play and game levels on single games and entire seasons. With open-source data, reproducible advanced NFL metrics can be developed at a more rapid pace, helping to grow the football analytics community.


Enter marquis de Laplace

In my first post on Bayesian data analysis, I did a brief overview of how Bayesian updating works using grid approximation to arrive at posterior distributions for our parameters of interest, such as a wide receiver’s catch rate. While the grid-based approach is simple and easy to follow, it’s just not practical. Before we turn to MCMC, in this post we’ll cover the popular approach known as Laplace approximation, aka quadratic approximation.
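As a rough illustration of the two approaches mentioned above, here is a minimal base R sketch for a single catch rate. It is not necessarily the code used in the post: the data (8 catches on 12 targets), the flat prior, and all variable names are assumptions made up for this example. The Laplace step finds the posterior mode numerically and uses the curvature of the log posterior at that mode to set the standard deviation of a normal approximation.

```r
# Hypothetical data, made up for illustration: 8 catches on 12 targets.
catches <- 8
targets <- 12

# --- Grid approximation with a flat prior over the catch rate ---
p_grid <- seq(0, 1, length.out = 1001)
prior <- rep(1, length(p_grid))
likelihood <- dbinom(catches, size = targets, prob = p_grid)
posterior_grid <- likelihood * prior
posterior_grid <- posterior_grid / sum(posterior_grid)  # normalize to sum to 1

# --- Laplace (quadratic) approximation ---
# Negative log posterior up to a constant (the flat prior contributes nothing).
neg_log_post <- function(p) {
  -dbinom(catches, size = targets, prob = p, log = TRUE)
}

# Step 1: find the posterior mode by minimizing the negative log posterior.
opt <- optimize(neg_log_post, interval = c(0.001, 0.999))
post_mode <- opt$minimum

# Step 2: estimate the curvature at the mode with a central finite difference.
h <- 1e-4
curvature <- (neg_log_post(post_mode + h) - 2 * neg_log_post(post_mode) +
                neg_log_post(post_mode - h)) / h^2

# Step 3: the normal approximation has sd = 1 / sqrt(curvature).
post_sd <- sqrt(1 / curvature)

# Normal density on the same grid, for comparison with posterior_grid.
laplace_density <- dnorm(p_grid, mean = post_mode, sd = post_sd)
```

With so few targets the true posterior is noticeably skewed, so plotting `laplace_density` against the grid posterior is a quick way to see where the quadratic approximation falls short.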


First steps

I was originally thinking of writing a blog post about multilevel models (aka hierarchical, mixed, random effects) because of how useful they are for measuring player performance in sports (shameless self promotion for nflWAR here!). But the more I thought about it, the more I realized how ill-advised that idea was. Instead, I want to build up the intuition for how and why one would want to use a full Bayesian multilevel model.


First post!

Since I have successfully survived my first year of grad school, I’ve decided to start blogging while working on my various projects. I’m doing this for three reasons: (1) to share what I’m working on (instead of just tweeting), (2) to learn by attempting to write instructive blog posts, and (3) because multiple people keep telling me this is a good idea… we’ll find out. Hopefully you’ll find my posts useful and educational, as I’ll be providing R code thanks to the incredible blogdown package by Yihui Xie (more info here).


Baseball Analytics Workshop


Hosted by the CMU Statistics & Data Science Department and Carnegie Mellon Sports Analytics Club, the Carnegie Mellon University Baseball Analytics Workshop is an interactive workshop focusing on data exploration skills with baseball data!

Intro to Exploring Baseball Data with R

The starter code script for the first half of the workshop, using dplyr and ggplot2 to explore historical baseball data from the Lahman package, is available here.

PITCHf/x and Statcast Resources

The starter code script for the second half of the workshop, dedicated to using PITCHf/x and Statcast data for creating a game plan for the Pirates against the Reds, is available here.

Additional resources that will be useful for working with this data include:

The repository for all of the workshop’s material is located here.

All PITCHf/x and Statcast data, made available by MLBAM, was accessed using the baseballr package.