Rebecca Nugent

Department of Statistics & Data Science
Dietrich College of Humanities & Social Sciences
Carnegie Mellon University

ISLE and the Science of Data Science
http://www.stat.cmu.edu/isle

Integrated Statistics (Subject) Learning Environment: ISLE. Fully integrated educational software platform that supports lectures, videos, interactive statistics widgets, group collaboration on data analysis; remote learning; industry upskilling/retraining in data science

Tracking and characterizing how people do data science; crowd-sourcing distributions of data analysis workflows; science of data science; how do people interact with data; researching optimization of group collaboration

Grants

  • Carnegie Mellon ProSEED/Simon Initiative, Summer 2020
    Integrating a Statistical Learning Environment into the Writing Classroom; David West Brown (PI), Rebecca Nugent (co-PI), Philipp Burckhardt (co-PI); $15,000
  • Carnegie Mellon ProSEED/Simon Initiative, Summer 2018
    Data Analysis Think-Alouds: Student Engagement and Workflow on an Online Interactive Statistical Analysis Tool, Rebecca Nugent (PI) with Philipp Burckhardt (Co-Investigator); $15,000

Publications

  • Burckhardt P, Nugent R, Genovese C. "Teaching Statistical Concepts and Modern Data Analysis with a Computing-Integrated Learning Environment", Journal of Statistics Education, Accepted with revisions, June 2020.

Presentations

For related presentations, see CV.

Clustering Methodology

Nonparametric multivariate analysis methods; clustering; mode analysis; cluster trees; spanning trees; generalized single linkage clustering; assessing significance of clusters; pruning and merging procedures

Graphics/Visualization; high-dimensional density estimation and visualization; characterizing subspaces of density estimates; variable/subspace selection

Grants

  • CMU Berkman Faculty Development Fund, Summer 2014
    Determining the Cluster Structure in High-Dimensional Data: A Visualization Tool for Merging Clusters; graduate student summer support ~$3000
  • Association for Women in Mathematics, Summer 2011
    Self-Tuning Diffusion Maps: Finding Local Cluster Structure while Reducing Dimensionality, research/travel grant, $2000 (declined due to last-minute unavailability to travel)
  • The Royal Society of Edinburgh, Summer 2008
    Merging Clustering Methodologies for Visualization and Estimation of Group Structure
    PI with Nema Dean (PI, Dept of Statistics, Univ of Glasgow), research/travel grant, $5500

Publications

  • Flynt, A., Dean, N., Nugent, R. "A soft agreement measure for class partitions incorporating assignment probabilities", Advances in Data Analysis and Classification, March 2019, Vol 13, Number 1, p.303-323.
  • Dean, N. and Nugent, R. "Clustering Student Skill Set Profiles in a Unit Hypercube using Mixtures of Multivariate Betas" Advances in Data Analysis and Classification, Vol 7, No. 3, p.339-357, 2013.
  • Rinaldo, A, Singh, A, Nugent, R, Wasserman, L. "Stability of Density-Based Clustering". Journal of Machine Learning Research 13 (Apr):905-948, 2012 JMLR Link
  • Friedenberg, D and Nugent, R "Exploration of the Use of a Self-Tuning Diffusion Map Framework". International Statistical Institute Proceedings (2011) (invited). ISI 2011 IPS040.02, Jan 2012.
  • Dean, N and Nugent, R "Comparing Different Clustering Methods on the Unit Hypercube". International Statistical Institute Proceedings (2011) (invited). ISI 2011 IPS040.03, Jan 2012.
  • Nugent, R., Rinaldo, A, Singh, A, Wasserman, L. "Discussion of Stability Selection by Meinshausen and Buhlmann" Journal of the Royal Statistical Society B (2010). Vol 72, Part 4, p. 465 (authorship in alphabetical order).
  • Nugent, R., Meila, M. "An Overview of Clustering Applied to Molecular Biology" Book chapter. Statistical Methods in Molecular Biology: Humana Press, Springer, 2010.
  • Nugent, R., Stuetzle, W. "Clustering with Confidence: A Low-Dimensional Binning Approach."
    "Classification as a Tool for Research". Proceedings of the 11th International Federation Classification Societies Conference (refereed), Herman Locarek-Junge, Claus Weihs (editors), University of Dresden, Germany, March 13-18, 2009. Springer-Verlag, Heidelberg-Berlin, 2010, p. 117-126.
  • Stuetzle W and Nugent R. "A generalized single linkage method for estimating the cluster tree of a density". The Journal of Computational and Graphical Statistics. (2010), Vol. 19, 2, p. 397-418.

Invited Discussions

  • Nugent R, Lorenzi E, Frisoli K. "Discussion of A Bayesian Information Criterion for Singular Models by Drton and Plummer", Journal of the Royal Statistical Society, Series B (2017), Vol 79, Issue 2, p.371.
  • Ventura S and Nugent R. "Discussion of Of quantiles and expectiles: consistent scoring functions, Choquet representations and forecast rankings by Ehm, Gneiting, Jordan, and Kruger", Journal of the Royal Statistical Society, Series B (2016), Vol 78, Issue 3, p.555.
  • Flynt, A. and Nugent R. "Discussion of Statistical Modelling of Citation Exchange Among Statistics Journals by Varin, Cattelan, and Firth". Journal of the Royal Statistical Society: Series A, Vol 179, Issue 1, p.47-49 (2016).
  • Nugent, R and Lorenzi E. "Discussion of Analysis of forensic DNA mixtures with artefacts by Cowell, Graversen, Lauritzen, and Mortera" Journal of the Royal Statistical Society: Series C, Vol 64, Issue 1, p.43 (2015).
  • Nugent, R. and Flynt, A. "Discussion of How to find an appropriate clustering for mixed type variables with application to socio-economic stratification by Hennig and Liao". Journal of the Royal Statistical Society: Series C, Vol 62, Part 3, pp. 47-48 (2013).

In Revision/Submitted

  • Yurko, R and Nugent, R. "MMA: Maximum Model Agreement for Model-Based Clustering with Variable Selection", Submitted October 2019.

Presentations

For related presentations, see CV.

Large-Scale Record Linkage Methodology

*see here for more information about our prior NSF-Census Research Network node*

Text mining, record linkage, disambiguation; large-scale clustering/classification algorithms for disambiguating and linking text records or files; projects below (collaborators)

  • characterizing early 20th century Irish immigration and religious change through family structure changes in Ireland Census data (UC Dublin Mathematics & Statistics)
  • building acquaintance/colleague networks in 15th/16th century England (CMU English/Digital Humanities, Six Degrees of Francis Bacon)
  • building longitudinal and intergenerational databases with vital records data from the late 19th and 20th centuries; collaboration with Michigan LIFE-M Project
  • quantifying human rights violations in civil war conflicts (Human Rights Data Analysis Group, CMU Center for Human Rights Sciences)
  • developing linking methods for large-scale administrative data (U.S. Census Bureau)
  • determining and characterizing networks of colleagues and innovation (CMU Dept of Engineering & Public Policy)

Grants

  • NSF: Division of Mathematical Sciences October 2017-September 2019
    Improving Probabilistic Record Linkage; sub-grant from Jared Murray, University of Texas; $92,348
  • NIH: National Institute on Aging January 2018-May 2018
    How Does Automated Record Linkage Affect Inferences about Population Health?; sub-grant from Martha Bailey (LIFE-M), University of Michigan; $27,049
  • NSF: The NSF-Census Research Network Supplement, September 2016-August 2017
    Census Research Node: Data Integration, Online Data Collection, and Privacy Protection for Census 2020; co-PI with S Fienberg (PI, Dept of Statistics, CMU), W Eddy (PI, Dept of Statistics, CMU), A Acquisti (co-PI, Heinz School, CMU); $650,000
  • NSF: The NSF-Census Research Network, Sept 2011-Sept 2016
    Census Research Node: Data Integration, Online Data Collection, and Privacy Protection for Census 2020; co-PI with S Fienberg (PI, Dept of Statistics, CMU), W Eddy (PI, Dept of Statistics, CMU), A Acquisti (co-PI, Heinz School, CMU); $3,000,000

Publications

  • Frisoli, K, LeRoy B, Nugent, R. "A novel record linkage interface that incorporates group structure to rapidly collect richer labels". 6th IEEE International Conference on Data Science and Advanced Analytics, September 2019, pp. 580-589. DOI: 10.1109/DSAA.2019.00073
  • Frisoli, K and Nugent, R. "Exploring the effect of household structure in historical record linkage of early 1900s Ireland census records". IEEE International Conference on Data Mining Workshops (ICDMW), November 2018, pp. 502-509. DOI: 10.1109/ICDMW.2018
  • Ventura S, Nugent R, Fuchs E. "Seeing the non-stars: (Some) sources of bias in past disambiguation approaches and a new public tool leveraging labeled records". Research Policy (Special Issue on Big Data), Vol 44, Issue 9, Nov 2015, p.1672-1701. DOI

  • Ventura S, Nugent R and Fuchs, E. "Hierarchical Clustering with Distributions of Distances for Large-Scale Record Linkage". Privacy in Statistical Databases (Lecture Notes in Computer Science 8744), ed. J. Domingo-Ferrer, Springer, p.283-298 (2014). paper (Erratum with correct authorship listing here)

Invited Discussions

  • Frisoli, K and Nugent, R. "Discussion of Statistical Challenges of administrative and transaction data by Hand", Journal of the Royal Statistical Society, Series A (2018), Vol 181, Issue 3, p.590.

Presentations

For related presentations, see CV.

Psychometrics/Educational Data Mining

*see here for more information about our CMART program in collaboration with RAND*

Exploring and modeling hierarchical latent class structures; developing alternative methodology to cognitive diagnosis models to identify disparate skill set profiles; integrating instructional assisting and ability assessment

Collaborators include Department of Modern Languages (CMU), Human Computer Interaction Institute (CMU), CMART program (joint at CMU Statistics and RAND), Dept of Psychology at University of Missouri, Dept of Statistics at University of Glasgow

Grants

  • IES: PIER Program in Interdisciplinary Educational Research Submitted September 2019
    predoctoral training program at Carnegie Mellon University; member of proposed steering committee; five years

Publications

  • Youngs, B., Prakash, A., Nugent, R. "Statistically-driven Visualizations of Student Interactions in a French Online Course Video", Journal of Computer-Assisted Language Learning, Special Edition on Learning Analytics. Published online, September 2017. link
  • Ayers, E, Rabe-Hesketh, S, Nugent, R. "Incorporating Student Covariates in Cognitive Diagnosis Models". Journal of Classification, 30: 195-224 (2013)
  • Rupp, A, Nugent R, Nelson B. "Evidence-centered Design for Diagnostic Assessment within Digital Learning Environments: Integrating Modern Psychometrics and Educational Data Mining". Journal of Educational Data Mining, Volume 4, Issue 1, October 2012. Pages 1-10. link
  • Nelson, B, Nugent R, Rupp, A. "On Instructional Utility, Statistical Methodology, and the Added Value of ECD: Lessons Learned from the Special Issue". Journal of Educational Data Mining, Volume 4, Issue 1, October 2012. Pages 224-230. link
  • Nugent, R, Dean, N, Ayers, E. "Skill Set Profile Clustering: The Empty K-Means Algorithm with Automatic Specification of Starting Cluster Centers". Educational Data Mining 2010: 3rd International Conference on Educational Data Mining, Proceedings (refereed). Baker, R.S.J.d, Merceron, A., Pavlik, P.I. Jr.(Eds), p.151-160. link
  • Nugent, R, Ayers, E, Dean, N. "Conditional Subspace Clustering with Skill Mastery Information: Identifying Skills that Separate Students". Educational Data Mining 2009: 2nd International Conference on Educational Data Mining, Proceedings (refereed). Barnes, T., Desmarais, M., Romero, C., and Ventura, S. (Eds), Cordoba, Spain, July 1-3, 2009, p.101-110. link
  • Ayers, E., Nugent, R., Dean, N. "A Comparison of Student Skill Knowledge Estimates". Educational Data Mining 2009: 2nd International Conference on Educational Data Mining, Proceedings (refereed). Barnes, T., Desmarais, M., Romero, C., and Ventura, S. (Eds), Cordoba, Spain, July 1-3, 2009, p.1-10. link
  • Ayers, E., Nugent, R., Dean, N. "Skill Set Profile Clustering Based on Student Capability Vectors Computed from Online Tutoring Data". Educational Data Mining 2008: 1st International Conference on Educational Data Mining, Proceedings (refereed). R.S.J.d. Baker, T. Barnes, and J.E. Beck (Eds), Montreal, Quebec, Canada. June 20-21, 2008. p.210-217. link

Presentations

For related presentations, see CV.

Public Health

Current research concentrates primarily on two areas:

  • characterizing sleep duration, quality, consistency and its association with health, particularly obesity; sleep stability
  • medical education and information science; see the Education section for related work

Collaborators include Dept of Internal Medicine, Texas Tech University

Grants

  • CMU Undergraduate Research Office Summer Fellowship Program, Summer 2008
    Using Statistical Techniques to Improve Disease Classification
    student research support (Ryan Sieberg, BS Mathematical Sciences, minors in Statistics & Neuroscience 2009, Master's in Statistical Practice 2010 (CMU))
    declined due to a conflicting support opportunity

Publications

  • Nugent E, Nugent A, Nugent R, Nugent K, Nugent C. "The management of women's health care by internists with a focus on the utility of ultrasound". The American Journal of the Medical Sciences, In press, online May 2020. Link
  • Nugent K, Raj R, and Nugent R. "Sleep Patterns and Health Behaviors in Health Care Students" Southern Medical Journal, Vol 113, No. 3, March 2020, p.104-110.
  • Nugent E, Nugent A, Nugent R, Nugent K. "Zika virus: epidemiology, pathogenesis, and human disease". The American Journal of the Medical Sciences, Vol. 353, No. 5, May 2017, p. 466-473.
  • Narayanan R, Nugent R, Nugent K. "An Investigation of the Variety and Complexity of Statistical Methods Used in Current Internal Medicine Literature". Southern Medical Journal, Vol 108, No. 10, Oct 2015.
    **Selected for Invited Commentary**
  • Nugent R, Althouse A, Yaqub Y, Nugent K, Raj R. "Modeling the relationship between obesity and sleep parameters in children referred for dietary weight reduction intervention". Southern Medical Journal, Special Series: Obesity, Vol 107, Issue 8, p. 473-480 (2014).
  • Ngamruengphong, S, Nugent, A, Nugent, K, Nugent, R. (authorship in alphabetical order) "Case 49: Prostate-Ca-Survival." Case Files Geriatrics (LANGE Case Files), Andrew Dentino, MD (editor). McGrawHill Medical (2013).
  • Nourbaksh, E., Nugent, R., Wang, H., Cevik, C., and Nugent, K. "Medical Literature Searches: PubMed Central or Google Scholar?" Health Information and Libraries Journal. 2012; 29:214-22.
    **Additionally selected to be part of a special issue on The Role of the Health Information Professional marking the CILIP Health Libraries Group Conference, Oxford, 2014**
  • Wang, H., Nugent, R., Nugent, C., Nugent, K., Phy, M. "A Commentary on the Use of the Internal Medicine In-Training Examination". The American Journal of Medicine. Vol 122, No 9, September 2009, p.879-883.
  • Buscemi D, Kumar A, Nugent R, Nugent K. "Short Sleep Times Predict an Increased Body Mass Index in Internal Medicine Clinic Patients". Journal of Clinical Sleep Medicine. Vol 3, Number 7, Dec 15, 2007. 681-688.
  • Glaser SL, Clarke CA, Keegan THM, Gomez SL, Nugent RA, Topol B, Stearns CB, Stewart SL. "Attenuation of social class and reproductive risk factors for Hodgkin lymphoma due to selection bias in controls". Cancer Causes Control: 2004; 15:731-9.
  • Glaser SL, Clarke CA, Nugent RA, Stearns CB, Dorfman RF. "Reproductive Factors in Hodgkin's disease in women". American Journal of Epidemiology: 2003; 158(6):553-563.
  • Glaser SL, Clarke CA, Nugent RA, Dorfman RF, Stearns CB. "Social class and risk of Hodgkin's disease in young adult women in 1988-94". International Journal of Cancer: 2002; 98(1):110-17.

Accepted/In Revision/Submitted

  • Nugent E, Nugent A, Nugent R, Nugent K, Nugent C. "The management of women's health care by internists with a focus on the utility of ultrasound". Submitted, January 2020.

Presentations

For related presentations, see CV.

Teaching Statistics

The Teaching Statistics group engages in a combination of research, pedagogy, and classroom training. Members are interested in updating and modernizing curriculum and assessment, pedagogical philosophy and development, best classroom practices, student engagement, and outreach to a diverse community.

Visit the group website to learn more.

Accepted/In Revision/Submitted

  • Reinhart A, Evans C, Luby A, Orellana J, Meyer M, Wieczorek J, Elliott P, Burckhardt P, Nugent R. "Think-aloud interviews: A tool for exploring student statistical reasoning". Revise and resubmit, Summer 2020.

Presentations

  • Burckhardt P, Elliott P, Hyun S, Lin K, Luby A, Makris CP, Orellana J, Reinhart A, Wieczorek J, Weinberg G, Nugent R. Assessment of Student Learning and Misconception Identification in Intro Statistics, CMU Eberly Teaching and Learning Summit, October 2017.
  • Burckhardt P, Chouldechova A, Nugent R. The ISLE Experience: Enhancing Classroom Instruction with Interactive E-Learning Tools, CMU Eberly Teaching and Learning Summit, October 2017.
For other related presentations, see CV.

Domestic Violence Recidivism

Research focuses on building risk assessment tools for use in domestic violence recidivism; specific emphasis on characterizing factors associated with potential violations of issued protective orders; public policy component aims to improve circumstances for all alleged parties involved through education and available social services

In collaboration with Mark Patterson (CMU Quantitative Social Science Scholars Program), Maggie McGannon, Farhod Yuldashev, and various students

Semantic Organization

*see here for more information about the Cognitive Development Lab, Department of Psychology, CMU (Director: Anna Fisher)*

Research focuses on understanding cognitive development with respect to semantic organization; how do children group items? Thematically? Taxonomically? How do these relationships change over time? In collaboration with Dept of Psychology, CMU

Publications

  • Unger L, Fisher A, Nugent R, Ventura S, and MacLellan C. "Developmental Changes in the Semantic Organization of Living Kinds". Journal of Experimental Child Psychology, Vol. 146, June 2016, p.202-222.

Technology, Innovation and Entrepreneurship

*see here for more information about Erica Fuchs, the director of the EPP group*
Characterizing and better understanding how manufacturing decisions (e.g. offshoring facilities) affect research and innovation; analysis on inventors' careers, mobility, trajectories. In collaboration with Erica Fuchs, Department of Engineering & Public Policy (CMU)

Publications

  • Yang C, Nugent R, Fuchs E. "Gains from Others' Losses: Technology Trajectories and the Global Division of Firms". Research Policy, Vol 45, Issue 3, April 2016, p.724-745.

  • Ventura S, Nugent R, Fuchs E. "Seeing the non-stars: (Some) sources of bias in past disambiguation approaches and a new public tool leveraging labeled records". Research Policy (Special Issue on Big Data), Vol 44, Issue 9, Nov 2015, p.1672-1701. DOI (also listed under Record Linkage)

Presentations

For related presentations, see CV.

Current Postdocs:

  • Philipp Burckhardt, PhD, Dept of Statistics & Data Science, CMU home page
    ISLE: Integrated Statistics Learning Environment; science of data science; educational software development; studying data analysis workflows, group collaboration, and their optimization

Current Students:

  • Kayla Frisoli, PhD student, Dept of Statistics & Data Science, CMU home page
    thesis work on record linkage/clustering methodology specifically focusing on extending models to include label uncertainty and labeler disagreement; includes development and analysis of crowd-sourcing linkage interfaces; applications include early 20th century Irish Census records

  • Xiaoyi Yang, PhD student, Dept of Statistics & Data Science, CMU home page
    thesis work on extending network models for text data to include covariate-driven penalty structures and incorporate record linkage techniques for de-duplication; applications include acquaintance networks in early modern Britain;
    co-advised with Nynke Niezink, Statistics & Data Science, CMU

    previous research project on Historical Record Linkage
    co-advised with Jared Murray, Dept of Information, Risk, & Operations Mgmt, UT Austin
    in collaboration with University of Michigan LIFE-M Project (PI: Bailey, UM Economics)

  • Ron Yurko, PhD student, Dept of Statistics & Data Science, CMU home page
    research project on variable selection for consistent clustering; clustering/classification work on ensemble prediction; Carnegie Mellon Sports Analytics; thesis work advised by Kathryn Roeder

  • Frank Kovacs, Master's/undergrad, Dept of Statistics & Data Science, CMU
    ISLE (Integrated Statistics Learning Environment) Development
    statistical computing/software development

  • Carlo Duffy, undergraduate, Dept of Statistics & Data Science, QSSS program, CMU
    honors thesis co-advised by Mark Patterson, Social & Decision Sciences/QSSS

Former Students:

PhD students
  • Brendan McVeigh, PhD student, Dept of Statistics & Data Science, CMU (2020) home page
    thesis work on Bayesian record linkage methodology
    primary advisor: Jared Murray, Dept of Information, Risk, & Operations Mgmt, UT Austin
    Currently at Waymo

  • Sam Ventura, PhD, Dept of Statistics, CMU (2015) home page
    Large Scale Classification and Clustering Methods with Applications in Record Linkage
    2016 Classification Society Distinguished Dissertation Honorable Mention (2nd)
    2015 ASA Pittsburgh Chapter Student of the Year (CMU)

    Visiting Assistant Professor at CMU Statistics (2015-2017)
    Currently the Director of Hockey Research for the NHL Pittsburgh Penguins

  • Alan Mishler, PhD student, Dept of Statistics & Data Science, CMU home page
    research project on Online Learning for Introductory French
    co-advised by Bonnie Youngs, Dept of Modern Languages, CMU;
    thesis currently advised by Edward Kennedy

  • David Friedenberg, PhD, Dept of Statistics, CMU (2010) (advisor: Chris Genovese)
    Exploring the Use of a Self-Tuning Diffusion Map Framework (dissertation project)

    Currently at Battelle Memorial Institute

  • Elizabeth Ayers, PhD, Dept of Statistics, CMU (2010) (advisor: Brian Junker)
    Developing Skill Set Profile Clustering Methods (dissertation project); CMU home page

    2008 ASA Pittsburgh Chapter Student of the Year (CMU)
    Postdoc in the Graduate School of Education at Berkeley
    Currently at the American Institutes of Research in D.C.
Master's Students
  • Joseph Pane, Master's in Stat Practice (2016), BS Statistics (2015), CMU
    Hitting the Wall: Mixture Models of Long Distance Running Trajectories
    developing clustering methodology for mixed variable types in the presence of missing or sparse categories

    2016 ASA Pittsburgh Chapter Student Poster Session Co-Winner
    Dietrich HSS Honors Thesis, 2nd place in Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium

    The Use of Electronic Cigarettes as Cessation Aids and in Conjunction with Conventional Cigarettes: A Meta-Analysis of Survey Evidence
    this project co-advised with Jared Murray, Dept of Statistics, CMU

    Starting at RAND in August 2016

  • Elizabeth Lorenzi, Master's in Stat Practice (2014), BS Econ-Stat (2013), CMU
    Hierarchical Modal Structures of Density Estimates; Local Merging and Visualization of Mixture Model Components

    Work partially funded by the Dietrich Faculty Development Fund (Summer 2014)
    Recipient of Mihaela Serban Travel Award (2014)
    Started her PhD at Duke Statistics in Fall 2014

  • Daniel Park, Master's in Stat Practice (2011), BS in Economics-Statistics (2010), CMU
    Characterizing Performance of Medical Residency Programs

    Working at the Department of Public Finance and Social Policy, Korea Development Institute (Summer 2011)

  • Xia-Yi (Sandy) Shen, MS, Elec & Comp Engineering (2009); BS, Statistics, Elec & Comp Engineering (2008), CMU
    Cluster-Based Modeling: Exploring the Linear Regression Model Space

    Currently at Citigroup Hong Kong
Undergraduate Students
  • Corey Emery, undergrad, Dept of Statistics & Data Science, CMU (graduating 2020)
    Historical Record Linkage of early 20th Century Ireland

  • Yeuk Yu (Tiffany) Lee, BS in Statistics & Machine Learning, CMU (2019)
    User-Driven Visualization Tools
    co-advised with Robin Mejia, Center for Human Rights Science, CMU

  • Josh Ragen, BS in Economics-Statistics, CMU (2019)
    Developing Risk Assessment Tools for Domestic Violence Recidivism
    co-advised with Mark Patterson, Quantitative Social Science Scholars Program, CMU Dietrich College

  • Theo Peterson, BS in Economics-Statistics (2017), CMU
    Characterizing Domestic Violence Re-Offenders
    (co-advised with Mark Patterson, QSSS program, CMU)

    Working for PNC

  • Lina Sheremet, BS in Statistics (2017), CMU
    Active Learning for Mixture Models
    (lead advisor: Sam Ventura, Dept of Statistics, CMU)

    Currently working for Thermo Fisher Scientific in Boston

  • Akhil Prakash, BS in Mathematical Sciences, Statistics (2016), CMU
    Modeling Student Navigation and Engagement in Hybrid Online French Course
    Co-advised with Bonnie Youngs, Dept of Modern Languages, CMU

    Started Master's in Statistics at Stanford in Fall 2016

  • Ronald Yurko, BS in Statistics (2015), CMU
    Prediction with Ensembles using Distribution Summaries (PREDS)
    Co-advised with Sam Ventura, Dept of Statistics, CMU

    Andrew Carnegie Society Scholar (2015); starting PhD in Statistics at CMU in Fall 2017

  • Adrian Botta, BS in Economics-Statistics (2015), CMU
    Characterizing Clustering Methods by their Maximum Clustering Similarity

  • Emily Wright, BS Economics-Statistics (CMU, 2014)
    The Next Big Thing: Predicting Radio Airplay using Social Media Metrics

    Dietrich H&SS Senior Honors Thesis; Best Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium
    Currently working in analytics

  • Yi Xiang Chong, BS, Mathematical Sciences, Economics-Statistics (CMU), 2012
    Incorporating Flexibility into the Normalized-Cut Image Segmentation Algorithm

    Dietrich H&SS Senior Honors Thesis (2011-2012); Best Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium
    Currently working for Central Bank of Malaysia

  • Neha Nandakumar, BS, Depts of ChemE/EPP (2013), CMU
    Disambiguating Optoelectronics Inventors and their Patents
    Project co-supervised with Erica Fuchs, Dept of Engineering & Public Policy (CMU)

  • Stephanie Kao, BS, Mathematical Sciences (2012), CMU.
    State-Space Modeling of Inventor Mobility
    Project co-supervised with Erica Fuchs, Dept of Engineering & Public Policy (CMU)

    Currently working at Agilex in D.C.

  • Christopher Peter Makris, Master's in Stat Practice (2012); BS, Statistics (2011), CMU.
    Exploration of Imputation Methods for Missingness in Image Segmentation

    Dietrich H&SS Senior Honors Thesis (2010-2011); Best Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium

  • Andrew Klein, Master's in Stat Practice (2015); BS, Information Systems, Statistics (CMU), 2011
    Semi-Automated Collection of Pitch Location and Intent in Baseball

    Dietrich H&SS Senior Honors Thesis (2010-2011); (co-advised with Andrew Thomas)
    Worked at Accenture in D.C.

  • Hannah Pileggi, MS in Human Centered Computing, School of Interactive Computing (2013), Georgia Tech; BS, Statistics (CMU), 2011
    Characterizing the Quantity and Perceived Quality of Sleep in a Health Care Student Population; joint with Texas Tech Univ Dept of Internal Medicine

    The Impact of Offshoring on Firm vs. Individual Technology Trajectories in Optoelectronics
    Statistical Graphics project co-supervised with Erica Fuchs, Dept of EPP (CMU)

    Currently working at Facebook.

  • Ryan Sieberg, BS Mathematical Sciences, minors in Statistics & Neuroscience 2009,
    Master's in Statistical Practice 2010 (CMU)
    Using Statistical Methods to Improve Disease Classification

    Research fellowship with the Center for the Neural Basis of Cognition (2008-2009)
    Currently in Northwestern University Medical School (Fall 2011)

  • Alex(andra) Tronetti, BS, Statistics & Economics 2009 (CMU)
    Finding the "Important" Structure in a Corpus of Sleep Habit Questionnaires

    Graduated with a Master's in Biostatistics from UNC-Chapel Hill (2011).

  • Andrew Althouse, PhD Epidemiology (2014), M.S. Applied Statistics (2010), University of Pittsburgh; B.S. Statistics 2008 (CMU)
    Modeling the Relationship Between Sleep Characteristics and Pediatric Obesity

    Currently at the UPMC Heart and Vascular Institute

  • Xiaoyi Fei, B.S. Computer Science 2007 (CMU)
    Clustering with a Density-based Similarity Measure

    Currently in industry; returning to graduate school.
© 2013-2020 Department of Statistics & Data Science, Carnegie Mellon University. Website design adapted from a design by Mikhail Popov.