Carnegie Mellon's Department of Statistics

presents

Tartan Data Science Cup



The Tartan Data Science Cup is a series of Kaggle-like data analysis competitions exclusively for CMU undergraduates. Each competition will have a different theme, research scenario, goals, and solutions. The problem description, research question, and data set will not be released until a specified date/time. Students will submit their answers by a specified deadline; selected finalists will present in front of a panel of judges.

Winning teams will receive cash prizes, (temporary ownership of) the Tartan Data Science Cup, and glory. So much glory.


Typical Episode Timeline


Information Session: Wednesday evening

The information session will provide additional information on the format and logistics of the competition. This is also an opportunity for students to form teams and register for the competition. Students do not have to attend the information session to participate.

Pizza and drinks will be provided!
Specific information about the data or research topics, however, will not.


Registration Deadline: Thursday evening

All currently enrolled Carnegie Mellon University undergraduate students on the Pittsburgh campus are eligible to participate. Teams may have one to three students, and each student may participate on only one team. All student names and Andrew IDs must be included when registering. Registration must also include a (non-identifying) team name.

Release the Data! Friday evening

The data set and variable descriptions will be available on Friday evening but without details about the specific competition questions. Participants should try to do some exploratory data analysis prior to the competition in order to focus their efforts on Sunday.


Meet-and-Greet Reception: Saturday evening

Take advantage of an opportunity to network with company representatives and ask questions about the data set.


Competition Begins! Sunday morning

The research problem and competition question(s) will be released on this website. Students are welcome to work anywhere, but the TDSC Homebase will be open all day. TDSC organizers will also be available during the day to answer questions.

Lunch will be provided for participants in the TDSC Homebase.


Time's Up! Sunday late afternoon

Submissions are due. Each team should submit a single .zip file to this website.
The zip file should contain:

  • all (well-documented!) code used to analyze the data, obtain results, create graphics, etc.
    (any programming language/software is acceptable)
  • a set of predictions (more information will be provided on TDSC Sunday)
  • a 1-2 page report describing the key results and methods used to analyze the data (made as or converted to a pdf file)
  • up to 3 slides for a 5-minute research presentation (made as or converted to a pdf file)
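
For teams that want to script the packaging step, the checklist above could be assembled along these lines. This is only a sketch under assumptions: the file names `predictions.csv`, `report.pdf`, and `slides.pdf`, the `code` folder, and the function `build_submission` are all hypothetical, since the competition does not prescribe a layout and the prediction format is announced on TDSC Sunday.

```python
import zipfile
from pathlib import Path

# Hypothetical layout: the announcement does not prescribe these names.
REQUIRED_FILES = ("predictions.csv", "report.pdf", "slides.pdf")
CODE_DIR = "code"


def build_submission(team_dir, zip_path):
    """Zip the code folder plus the three required files found in team_dir.

    Raises FileNotFoundError if the code folder is empty or a required
    file is missing, so an incomplete archive is never written.
    Returns the sorted list of names stored in the archive.
    """
    team_dir = Path(team_dir)
    code_root = team_dir / CODE_DIR
    code_files = sorted(p for p in code_root.rglob("*") if p.is_file())
    if not code_files:
        raise FileNotFoundError(f"no code files found under {code_root}")

    missing = [n for n in REQUIRED_FILES if not (team_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing from {team_dir}: {', '.join(missing)}")

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store code with paths relative to the team folder, e.g. code/analysis.R
        for path in code_files:
            zf.write(path, path.relative_to(team_dir).as_posix())
        for name in REQUIRED_FILES:
            zf.write(team_dir / name, name)

    with zipfile.ZipFile(zip_path) as zf:
        return sorted(zf.namelist())
```

Calling `build_submission("my_team", "my_team.zip")` on a folder laid out this way would produce the single .zip file to upload; the returned list of names gives a quick last check before the deadline.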

Submission constitutes permission to post winning team entries online (under the non-identifying team name).


Presentations and Final Results: Sunday early evening

A panel of judges will review the code, reports, and slides and then watch the slide presentations. Students are encouraged to practice their presentations over the dinner break.

The top 8 teams will be given five minutes to present their methods and results to the judges, the other teams, and anyone else who wishes to attend. Teams can have up to three slides, but be careful: you will be cut off after exactly five minutes! Teams outside of the top 8 are still eligible to win other prizes and are encouraged to stay and watch the final presentations.

The judging criteria include:

  • Code: Is the submitted code reproducible, well-documented, correct, and easy to understand?
  • Report: Does the submitted report describe the problem, methods, and results in a clear and concise manner?
  • Presentation: Did the team effectively communicate their problem, methods, and results to an audience unfamiliar with the work?
  • Results: Do the team's results sufficiently and effectively answer the research problems presented?


(Typical) Prizes


1st place: $500
2nd place: $300
3rd place: $200

Additionally, the 1st place team will receive the Tartan Data Science Cup. After each competition, the Cup is presented to the winning team, who are allowed to keep the Cup and gloat for a short period of time. Members of the winning team will have their names engraved onto the Cup.

TDSC Organizer Contact Information:

Sam Ventura (sventura@stat.cmu.edu), Rebecca Nugent (rnugent@stat.cmu.edu).