Tartan Data Science Cup: Episode II - Analytics Strike Back

The data for Episode II can be found here.

The problem and rules for Episode II can be found here.

We were especially excited to partner with 84.51° for this episode. Participants actively engaged in a real data analytics scenario provided by 84.51°. Their team of analysts and recruiters was on campus to engage with participants, answer questions, and judge the finalists.

1st place: Polar Bear (Michael Rosenberg)
2nd place: #1 Hammer Bosses (David Hashe, Justin Kim, Kevin Ouyang)
3rd place: Why (Boxiang Lyu, Sheng Xu, Xue Yang)

Organizer's Choice: The Chi-squared Factor (Mohin Banker, James Eby, Suvrath Penmetcha)
Master's Division: Heteroscadasticity Terminator (Connie Chen, Mengran He, Alex Lam)

Episode II - Analytics Strike Back Timeline

Information Session: Wednesday, October 5th, 2016, 5:30-6:30pm

The information session will provide additional information on the format and logistics of the competition. An analyst from 84.51° will also be available for questions. This is also an opportunity for students to form teams and register for the competition. Students do not have to attend the information session to participate.

Pizza and drinks will be provided!
Specific information about the data or research topics, however, will not.

Registration Deadline: Thursday, October 6th, 2016, 11:59pm

All currently enrolled Carnegie Mellon University undergraduate students on the Pittsburgh campus are eligible to participate. Teams can have 1-3 students, and each student can participate on only one team. All student names and Andrew IDs must be included when registering. Registration must also include a (non-identifying) team name.
To register, click here.

Release the Data! Friday, October 7th, 2016

The data set and variable descriptions will be available on Friday evening but without details about the specific competition questions. Participants should try to do some exploratory data analysis prior to the competition in order to focus their efforts on Sunday.

Meet-and-Greet with 84.51°: Saturday, October 8th, 2016, 5-7pm, CUC Peter/McKenna/Wright

Please join us at a reception sponsored by 84.51° to network and talk about statistics, data analytics, or the TDSC!

84.51° analysts and recruiters will also be at the reception to answer questions about the TDSC data and analytics employment opportunities. Stop by!

Competition Begins! Sunday, October 9th, 2016, 9am, Baker Hall 136A (Adamson Wing)

The research problem and competition question(s) will be released on this website at 9am. Students are welcome to work anywhere, but Baker Hall 136A will be open all day as the TDSC Homebase. TDSC organizers will also be available during the day to answer questions.

Lunch will be provided for participants in the TDSC Homebase at 12pm.

Time's Up! Sunday, October 9th, 2016, 5pm

At 5pm, submissions are due. Each team should submit a single .zip file to this website.
The zip file should contain:

  • all (well-documented!) code used to analyze the data, obtain results, create graphics, etc.
    (any programming language/software is acceptable)
  • a set of predictions (more info on TDSC Sunday)
  • a 1-2 page report describing the key results and methods used to analyze the data (made as or converted to a pdf file)
  • up to 3 slides for a 5-minute research presentation (made as or converted to a pdf file)

Submission constitutes permission to post winning team entries online (under non-identifying team name).

Presentations and Final Results: Sunday, October 9th, 2016, 7pm, Porter 100

There will be a panel of judges from 84.51° and the Department of Statistics. The judges will review the code, reports, and slides from 5-7pm and then watch the slide presentations at 7pm. Students are encouraged to practice their presentations over the 5-7pm dinner break.

The top 8 teams will be given five minutes to present their methods and results to the judges, the other teams, and anyone else who wishes to attend. Teams can have up to three slides, but be careful -- you will be cut off after exactly five minutes! Teams outside of the top 8 are still eligible to win other prizes and encouraged to stay and watch the final presentations.

The judging criteria include:

  • Code: Is the submitted code reproducible, well-documented, correct, and easy to understand?
  • Report: Does the submitted report describe the problem, methods, and results in a clear and concise manner?
  • Presentation: Does the team communicate its problem, methods, and results effectively to an uninformed audience?
  • Results: Do the team's results sufficiently and effectively answer the research problems presented?

Prizes (sponsored by 84.51°)

1st place: $500
2nd place: $300
3rd place: $200

Additionally, the 1st place team will receive the Tartan Data Science Cup. After each competition, the Cup is presented to the winning team, who are allowed to keep the Cup and gloat for a short period of time. Members of the winning team will have their names engraved onto the Cup.

TDSC Organizer Contact Information:

Sam Ventura, Rebecca Nugent