Tartan Data Science Cup

Episode II: Analytics Strikes Back!

Problem Overview

Understanding whether a customer is likely to buy a particular product is an important factor in decision-making at Kroger.

Using the customer purchase history data provided here, your team is asked to predict which customers will purchase eggs in the following week.

Specifically, your team should answer the following questions in your report:

Additional information on the problem can be found here.


Please create a two-column .csv file that contains the household_key and probability of purchasing eggs in the next week for each of the 967 households in the training dataset. A template for your submission can be found here.
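
As a rough illustration of the file described above, the following Python sketch writes a two-column .csv from a dictionary of predictions. The function name write_submission, the output file name, and the header label "probability" are assumptions made for this example; the official template defines the actual header and layout, and it takes precedence.

```python
import csv


def write_submission(probabilities, path="submission.csv", prob_column="probability"):
    """Write a {household_key: probability} mapping to a two-column CSV file.

    The header label for the second column is an assumption; replace it
    with whatever the official template file uses.
    """
    # Probabilities must lie in [0, 1]; fail early on anything else.
    for key, p in probabilities.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for household {key} is outside [0, 1]: {p}")

    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["household_key", prob_column])
        # Sort by household_key so the output order is deterministic.
        for key in sorted(probabilities):
            writer.writerow([key, probabilities[key]])
```

Calling write_submission({1: 0.9, 2: 0.25}) would produce a header row followed by one row per household. A real submission would contain one row for each of the 967 households.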

IMPORTANT: When submitting your predictions, your file must match the exact format of the template file:


You may not use any data sources other than the dataset provided above. Exactly how you justify your answer is up to you. That said, we suggest the following:

Submissions: Each team should submit a single .zip file that contains:

Submission constitutes permission to post (anonymized) winning team entries online.

Finalists: Eight teams will make the finals, based on the following criteria: