Nonparametric multivariate analysis methods; clustering; mode analysis; cluster trees; spanning trees; generalized single linkage clustering; assessing significance of clusters; pruning and merging procedures
Graphics/Visualization; high-dimensional density estimation and visualization; characterizing subspaces of density estimates; variable/subspace selection
- CMU Berkman Faculty Development Fund, Summer 2014
Determining the Cluster Structure in High-Dimensional Data: A Visualization Tool for Merging Clusters; graduate student summer support ~$3000
- Association for Women in Mathematics, Summer 2011
Self-Tuning Diffusion Maps: Finding Local Cluster Structure while Reducing Dimensionality, research/travel grant, $2000 (declined due to last-minute unavailability to travel)
- The Royal Society of Edinburgh, Summer 2008
Merging Clustering Methodologies for Visualization and Estimation of Group Structure
PI with Nema Dean (PI, Dept of Statistics, Univ of Glasgow), research/travel grant, $5500
- Dean, N. and Nugent, R. "Clustering Student Skill Set Profiles in a Unit Hypercube using Mixtures of Multivariate Betas" Advances in Data Analysis and Classification, Vol 7, No. 3, p.339-357, 2013.
- Rinaldo, A, Singh, A, Nugent, R, Wasserman, L. "Stability of
Journal of Machine Learning Research 13 (Apr):905-948, 2012 JMLR Link
- Friedenberg, D and Nugent, R "Exploration of the Use of a Self-Tuning Diffusion Map Framework". International Statistical Institute Proceedings (2011) (invited). ISI 2011 IPS040.02, Jan 2012.
- Dean, N and Nugent, R "Comparing Different Clustering Methods on the Unit Hypercube". International Statistical Institute Proceedings (2011) (invited). ISI 2011 IPS040.03, Jan 2012.
- Nugent, R., Rinaldo, A, Singh, A, Wasserman, L. "Discussion of
Stability Selection by Meinshausen and Buhlmann"
Journal of the Royal Statistical Society B (2010). Vol 72, Part 4,
p. 465 (authorship in alphabetical order).
- Nugent, R., Meila, M. "An Overview of Clustering Applied to
Molecular Biology" Book chapter. Statistical Methods in
Molecular Biology: Humana Press, Springer, 2010.
- Nugent, R., Stuetzle, W. "Clustering with Confidence: A
Low-Dimensional Binning Approach."
"Classification as a Tool
for Research". Proceedings of the 11th International Federation
Classification Societies Conference (refereed), Herman
Locarek-Junge, Claus Weihs (editors), University of Dresden, Germany,
March 13-18, 2009. Springer-Verlag, Heidelberg-Berlin,
2010, p. 117-126.
- Stuetzle W and Nugent R. "A generalized single linkage method for
estimating the cluster tree of a density". The Journal of
Computational and Graphical Statistics. (2010), Vol. 19, 2, p. 397-418.
- Nugent R, Lorenzi E, Frisoli K. "Discussion of A Bayesian Information Criterion for Singular Models by Drton and Plummer", Journal of the Royal Statistical Society, Series B (2017), Vol 79, Issue 2, p.371.
- Ventura S and Nugent R. "Discussion of Of quantiles and expectiles: consistent scoring functions, Choquet representations and forecast rankings by Ehm, Gneiting, Jordan, and Kruger", Journal of the Royal Statistical Society, Series B (2016), Vol 78, Issue 3, p.555.
- Flynt, A. and Nugent R. "Discussion of Statistical Modelling of Citation Exchange Among Statistics Journals by Varin, Cattelan, and Firth". Journal of the Royal Statistical Society: Series A, Vol 179, Issue 1, p.47-49 (2016).
- Nugent, R and Lorenzi E. "Discussion of Analysis of forensic DNA mixtures with artefacts by Cowell, Graversen, Lauritzen, and Mortera" Journal of the Royal Statistical Society: Series C, Vol 64, Issue 1, p.43 (2015).
- Nugent, R. and Flynt, A. "Discussion of How to find an appropriate clustering for mixed type variables with application to socio-economic stratification by Hennig and Liao". Journal of the Royal Statistical Society: Series C, Vol 62, Part 3, pp. 47-48 (2013).
- Flynt, A., Dean, N., Nugent, R. "sARI: A soft agreement measure for class partitions incorporating assignment probabilities" Submitted, November 2017.
For related presentations, see CV.
Large-Scale Record Linkage Methodology
*see here for more information about our prior NSF-Census Research Network node*
Text mining, record linkage, disambiguation; large-scale clustering/classification algorithms for disambiguating and linking text records or files; projects below (collaborators)
- characterizing early 20th century Irish immigration and religious change through family structure changes in Ireland Census data (UC Dublin Mathematics & Statistics)
- building acquaintances/colleague networks in 15th/16th century England (CMU English/Digital Humanities, Six Degrees of Francis Bacon)
- building longitudinal and intergenerational databases with vital records data from the late 19th and 20th centuries; collaboration with Michigan LIFE-M Project
- quantifying human rights violations in civil war conflicts (Human Rights Data Analysis Group, CMU Center for Human Rights Sciences)
- developing linking methods for large-scale administrative data (U.S. Census Bureau)
- determining and characterizing networks of colleagues and innovation (CMU Dept of Engineering & Public Policy)
- NSF: The NSF-Census Research Network Supplement, September 2016-August 2017
Census Research Node: Data Integration, Online Data Collection, and Privacy Protection for Census 2020; co-PI with S Fienberg (PI, Dept of Statistics, CMU), W Eddy (PI, Dept of Statistics, CMU), A Acquisti (co-PI, Heinz School, CMU); $650,000
- NSF: The NSF-Census Research Network, Sept 2011-Sept 2016
Census Research Node: Data Integration, Online Data Collection, and Privacy Protection for Census 2020; co-PI with S Fienberg (PI, Dept of Statistics, CMU), W Eddy (PI, Dept of Statistics, CMU), A Acquisti (co-PI, Heinz School, CMU); $3,000,000
- Ventura S, Nugent R, Fuchs E. "Seeing the non-stars: (Some) sources of bias in past disambiguation approaches and a new public tool leveraging labeled records". Research Policy (Special Issue on Big Data), Vol 44, Issue 9, Nov 2015, p.1672-1701. DOI
- Ventura S, Nugent R and Fuchs, E. "Hierarchical Clustering with Distributions of Distances for Large-Scale Record Linkage". Privacy in Statistical Databases (Lecture Notes in Computer Science 8744), ed. J. Domingo-Ferrer, Springer, p.283-298 (2014). paper (Erratum with correct authorship listing here)
For related presentations, see CV.
Psychometrics/Educational Data Mining
*see here for more information about our CMART program in collaboration with RAND*
Exploring and modeling hierarchical latent class structures; developing alternative methodology to cognitive diagnosis models to identify disparate skill set profiles; integrating instructional assisting and ability assessment
Collaborators include Department of Modern Languages (CMU), Human Computer Interaction Institute (CMU), CMART program (joint at CMU Statistics and RAND), Dept of Psychology at University of Missouri, Dept of Statistics at University of Glasgow
- Youngs, B., Prakash, A., Nugent, R. "Statistically-driven Visualizations of Student Interactions in a French Online Course Video", Journal of Computer-Assisted Language Learning, Special Edition on Learning Analytics. Published online, September 2017. link
- Ayers, E, Rabe-Hesketh, S, Nugent, R. "Incorporating Student
Covariates in Cognitive Diagnosis Models". Journal of Classification, 30: 195-224 (2013)
- Rupp, A, Nugent R, Nelson B. "Evidence-centered Design for Diagnostic Assessment within Digital Learning Environments: Integrating Modern Psychometrics and Educational Data Mining". Journal of Educational Data Mining,
Volume 4, Issue 1, October 2012. Pages 1-10. link
- Nelson, B, Nugent R, Rupp, A. "On Instructional Utility, Statistical Methodology, and the Added Value of ECD: Lessons Learned from the Special Issue". Journal of Educational Data Mining,
Volume 4, Issue 1, October 2012. Pages 224-230. link
- Nugent, R, Dean, N, Ayers, E. "Skill Set Profile Clustering:
The Empty K-Means Algorithm with Automatic Specification of Starting
Cluster Centers". Educational Data Mining 2010: 3rd
International Conference on Educational Data Mining, Proceedings
(refereed). Baker, R.S.J.d, Merceron, A., Pavlik, P.I. Jr. (Eds), p.151-160. link
- Nugent, R, Ayers, E, Dean, N. "Conditional Subspace Clustering with Skill Mastery Information: Identifying Skills that Separate Students".
Educational Data Mining 2009: 2nd International Conference on Educational Data Mining, Proceedings (refereed). Barnes, T., Desmarais, M., Romero, C., and Ventura, S. (Eds), Cordoba, Spain, July 1-3, 2009, p.101-110. link
- Ayers, E., Nugent, R., Dean, N. "A Comparison of Student Skill Knowledge Estimates". Educational Data Mining 2009: 2nd International Conference on Educational Data Mining, Proceedings (refereed). Barnes, T., Desmarais, M., Romero, C., and Ventura, S. (Eds), Cordoba, Spain, July 1-3, 2009, p.1-10. link
- Ayers, E., Nugent, R., Dean, N. "Skill Set Profile Clustering
Based on Student Capability Vectors Computed from Online Tutoring
Data". Educational Data Mining 2008: 1st International
Conference on Educational Data Mining, Proceedings
(refereed). R.S.J.d. Baker, T. Barnes, and J.E. Beck (Eds), Montreal,
Quebec, Canada. June 20-21, 2008. p.210-217.
- Prakash, A., Nugent, R., Youngs, B. "Clustering Transition Matrices to Identify Student Patterns of Engagement in Online Second Language Acquisition", Submitted, October 2017.
For related presentations, see CV.
Current research concentrates primarily on two areas:
- characterizing sleep duration, quality, consistency and its association with health, particularly obesity; sleep stability
- medical education and information science; see the Education section for related work
Collaborators include Dept of Internal Medicine, Texas Tech University
- CMU Undergraduate Research Office Summer Fellowship Program, Summer 2008
Using Statistical Techniques to Improve Disease Classification
student research support (Ryan Sieberg, BS Mathematical Sciences, minors in Statistics & Neuroscience 2009, Master's in Statistical Practice 2010 (CMU))
declined due to a conflicting support opportunity
- Nugent E, Nugent A, Nugent R, Nugent K. "Zika virus: epidemiology, pathogenesis, and human disease". The American Journal of the Medical Sciences, Vol. 353, No. 5, May 2017, p. 466-473.
- Narayanan R, Nugent R, Nugent K. "An Investigation of the Variety and Complexity of Statistical Methods Used in Current Internal Medicine Literature". Southern Medical Journal, Vol 108, No. 10, Oct 2015.
**Selected for Invited Commentary**
- Nugent R, Althouse A, Yaqub Y, Nugent K, Raj R. "Modeling the relationship between obesity and sleep parameters in children referred for dietary weight reduction intervention". Southern Medical Journal, Special Series: Obesity, Vol 107, Issue 8, p. 473-480 (2014).
- Ngamruengphong, S, Nugent, A, Nugent, K, Nugent, R. (authorship in alphabetical order) "Case 49: Prostate-Ca-Survival." Case Files Geriatrics (LANGE Case Files), Andrew Dentino, MD (editor). McGrawHill Medical (2013).
- Nourbaksh, E., Nugent, R., Wang, H., Cevik, C., and
Nugent, K. "Medical Literature Searches: PubMed Central or Google
Scholar?" Health Information and Libraries Journal. 2012; 29:214-22.
**Additionally selected to be part of a special issue on The Role of the Health Information Professional marking the CILIP Health Libraries Group Conference, Oxford, 2014**
- Wang, H., Nugent, R., Nugent, C., Nugent, K., Phy, M. "A Commentary
on the Use of the Internal Medicine In-Training Examination".
The American Journal of Medicine. Vol 122, No 9, September 2009, p.879-883.
- Buscemi D, Kumar A, Nugent R, Nugent K. "Short Sleep Times Predict an
Increased Body Mass Index in Internal Medicine Clinic
Patients". Journal of Clinical Sleep Medicine. Vol 3, Number
7, Dec 15, 2007. 681-688.
- Glaser SL, Clarke CA, Keegan THM, Gomez SL, Nugent RA, Topol B,
Stearns CB, Stewart SL.
"Attenuation of social class and reproductive risk factors for
Hodgkin lymphoma due to
selection bias in controls". Cancer Causes Control: 2004; 15:731-9.
- Glaser SL, Clarke CA, Nugent RA, Stearns CB, Dorfman RF.
"Reproductive Factors in Hodgkin's disease in women". American
Journal of Epidemiology:
- Glaser SL, Clarke CA, Nugent RA, Dorfman RF, Stearns CB. "Social
class and risk of Hodgkin's disease in young adult women in
1988-94". International Journal
of Cancer: 2002; 98(1):110-17.
For related presentations, see CV.
The Teaching Statistics group engages in a combination of research, pedagogy, and classroom training. Members are interested in updating and modernizing curriculum and assessment, pedagogical philosophy and development, best classroom practices, student engagement, and outreach to a diverse community.
Visit the group website to learn more.
- NSF Division of Undergraduate Education: IUSE
"Supercharging the Data Science Classroom: Giving students agency and reach with new, interactive technologies", $2,000,000, five years, Submitted December 2017.
PI: Nugent with C Genovese (PI), P Burckhardt (Investigator)
- SIGKDD Impact Program, $50K, 2018, Submitted December 2017.
"The Carnegie Mellon Data Science Experience: Why take a course in Data Science when you can Experience it?". PI: Nugent; Key Personnel: Weinberg, Burckhardt; Collaborators: Alba, Genovese
- Burckhardt P, Chouldechova A, Nugent R. The ISLE Experience: Enhancing Classroom Instruction with Interactive E-Learning Tools, CMU Eberly Teaching and Learning Summit, October 2017.
- Burckhardt P, Elliott P, Hyun S, Lin K, Luby A, Makris CP, Orellana J, Reinhart A, Wieczorek J, Weinberg G, Nugent R. Assessment of Student Learning and Misconception Identification in Intro Statistics, CMU Eberly Teaching and Learning Summit, October 2017.
Domestic Violence Recidivism
Research focuses on building risk assessment tools for use in domestic violence recidivism; specific emphasis on characterizing factors associated with potential violations of issued protective orders; public policy component aims to improve circumstances for all alleged parties involved through education and available social services
In collaboration with Mark Patterson (CMU Quantitative Social Science Scholars Program), Maggie McGannon, Farhod Yuldashev, and various students
*see here for more information about the Cognitive Development Lab,
Department of Psychology, CMU (Director: Anna Fisher)*
Research focuses on understanding cognitive development with respect to semantic organization; how do children group items? Thematically? Taxonomically? How do these relationships change over time? In collaboration with Dept of Psychology, CMU
- Unger L, Fisher A, Nugent R, Ventura S, and MacLellan C. "Developmental Changes in the Semantic Organization of Living Kinds". Journal of Experimental Child Psychology, Vol. 146, June 2016, p.202-222.
Technology, Innovation and Entrepreneurship
*see here for more information about Erica Fuchs, the director of the EPP group*
Characterizing and better understanding how manufacturing decisions (e.g. offshoring facilities) affect research and innovation; analysis on inventors' careers, mobility, trajectories. In collaboration with Erica Fuchs, Department of Engineering & Public Policy (CMU)
- Yang C, Nugent R, Fuchs E. "Gains from Others' Losses: Technology Trajectories and the Global Division of Firms". Research Policy, Vol 45, Issue 3, April 2016, p.724-745.
- Ventura S, Nugent R, Fuchs E. "Seeing the non-stars: (Some) sources of bias in past disambiguation approaches and a new public tool leveraging labeled records". Research Policy (Special Issue on Big Data), Vol 44, Issue 9, Nov 2015, p.1672-1701. DOI (also listed under Record Linkage)
For related presentations, see CV.
- Kayla Frisoli, PhD student, Dept of Statistics & Data Science, CMU home page
thesis work on record linkage/clustering methodology
- Philipp Burckhardt, PhD student, Heinz College/Dept of Statistics & Data Science, CMU home page
Science of Data Science; developing ISLE: Integrated Statistics Learning Environment; educational software development for intro level statistics & data science
- Brendan McVeigh, PhD student, Dept of Statistics & Data Science, CMU
thesis work on Bayesian record linkage methodology
primary advisor: Jared Murray, Dept of Information, Risk, & Operations Mgmt, UT Austin
- Alan Mishler, PhD student, Dept of Statistics & Data Science, CMU
research project on Online Learning for Introductory French
co-advised by Bonnie Youngs, Dept of Modern Languages, CMU
- Xiaoyi Yang, PhD student, Dept of Statistics & Data Science, CMU
research project on Historical Record Linkage
co-advised with Jared Murray, Dept of Information, Risk, & Operations Mgmt, UT Austin
in collaboration with University of Michigan LIFE-M Project (PI: Bailey, UM Economics)
- Ron Yurko, PhD student, Dept of Statistics & Data Science, CMU home page
variable selection for consistent clustering; clustering/classification work on ensemble prediction
- Yeuk Yu (Tiffany) Lee, undergrad, Dept of Statistics & Data Science, CMU
User-Driven Visualization Tools
co-advised with Robin Mejia, Center for Human Rights Science, CMU
- Josh Ragen, undergrad, Dept of Statistics & Data Science, CMU
Developing Risk Assessment Tools for Domestic Violence Recidivism
co-advised with Mark Patterson, Quantitative Social Science Scholars Program, CMU Dietrich College
- Frank Kovacs, undergrad, Dept of Statistics & Data Science, CMU
ISLE (Integrated Statistics Learning Environment) Development
statistical computing/software development
- Corey Emery, undergrad, Dept of Statistics & Data Science, CMU
Historical Record Linkage of early 20th Century Ireland
ongoing research project
- Sam Ventura, PhD, Dept of Statistics, CMU (2015) home page
Large Scale Classification and Clustering Methods with Applications in Record Linkage
2016 Classification Society Distinguished Dissertation Honorable Mention (2nd)
2015 ASA Pittsburgh Chapter Student of the Year (CMU)
Visiting Assistant Professor at CMU Statistics (2015-2017)
Currently the Director of Hockey Research for the NHL Pittsburgh Penguins
- David Friedenberg, PhD, Dept of Statistics, CMU (2010) (advisor: Chris Genovese)
Exploring the Use of a Self-Tuning Diffusion Map Framework (dissertation project)
Currently at Battelle Memorial Institute
- Elizabeth Ayers, PhD, Dept of Statistics, CMU (2010) (advisor: Brian Junker)
Developing Skill Set Profile Clustering Methods (dissertation project); CMU home page
2008 ASA Pittsburgh Chapter Student of the Year (CMU)
Postdoc in the Graduate School of Education at Berkeley
Currently at the American Institutes of Research in D.C.
- Joseph Pane, Master's in Stat Practice (2016), BS Statistics (2015), CMU
Hitting the Wall: Mixture Models of Long Distance Running Trajectories
developing clustering methodology for mixed variable types in the presence of missing or sparse categories
2016 ASA Pittsburgh Chapter Student Poster Session Co-Winner
Dietrich HSS Honors Thesis, 2nd place in Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium
The Use of Electronic Cigarettes as Cessation Aids and in Conjunction with Conventional Cigarettes: A Meta-Analysis of Survey Evidence
this project co-advised with Jared Murray, Dept of Statistics, CMU
Starting at RAND in August 2016
- Elizabeth Lorenzi, Master's in Stat Practice (2014), BS Econ-Stat (2013), CMU
Hierarchical Modal Structures of Density Estimates; Local Merging and Visualization of Mixture Model Components
Work partially funded by the Dietrich Faculty Development Fund (Summer 2014)
Recipient of Mihaela Serban Travel Award (2014)
Started her PhD at Duke Statistics in Fall 2014
- Daniel Park, Master's in Stat Practice (2011), BS in Economics-Statistics (2010), CMU
Characterizing Performance of Medical Residency Programs
Working at the Department of Public Finance and Social Policy, Korea Development Institute (Summer 2011)
- Xia-Yi (Sandy) Shen, MS, Elec & Comp Engineering (2009); BS, Statistics, Elec & Comp
Engineering (2008), CMU
Modeling: Exploring the Linear Regression Model Space
Currently at Citigroup Hong Kong
- Theo Peterson, BS in Economics-Statistics (2017), CMU
Characterizing Domestic Violence Re-Offenders
(co-advised with Mark Patterson, QSSS program, CMU)
Working for PNC
- Lina Sheremet, BS in Statistics (2017), CMU
Active Learning for Mixture Models
(lead advisor: Sam Ventura, Dept of Statistics, CMU)
Currently working for Thermo Fisher Scientific in Boston
- Akhil Prakash, BS in Mathematical Sciences, Statistics (2016), CMU
Modeling Student Navigation and Engagement in Hybrid Online French Course
Co-advised with Bonnie Youngs, Dept of Modern Languages, CMU
Started Master's in Statistics at Stanford in Fall 2016
- Ronald Yurko, BS in Statistics (2015), CMU
Prediction with Ensembles using Distribution Summaries (PREDS)
Co-advised with Sam Ventura, Dept of Statistics, CMU
Andrew Carnegie Society Scholar (2015); starting PhD in Statistics at CMU in Fall 2017
- Adrian Botta, BS in Economics-Statistics (2015), CMU
Characterizing Clustering Methods by their Maximum Clustering Similarity
- Emily Wright, BS Economics-Statistics (CMU, 2014)
The Next Big Thing: Predicting Radio Airplay using Social Media Metrics
Dietrich H&SS Senior Honors Thesis; Best Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium
Currently working in analytics
- Yi Xiang Chong, BS, Mathematical Sciences, Economics-Statistics (CMU), 2012
Incorporating Flexibility into the Normalized-Cut Image Segmentation Algorithm
Dietrich H&SS Senior Honors Thesis (2011-2012); Best Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium
Currently working for Central Bank of Malaysia
- Neha Nandakumar, BS, Depts of ChemE/EPP (2013), CMU
Disambiguating Optoelectronics Inventors and their Patents
Project co-supervised with Erica Fuchs, Dept of Engineering &
Public Policy (CMU)
- Stephanie Kao, BS, Mathematical Sciences (2012), CMU.
State-Space Modeling of Inventor Mobility
Project co-supervised with Erica Fuchs, Dept of Engineering & Public Policy (CMU)
Currently working at Agilex in D.C.
- Christopher Peter Makris, Master's in Stat Practice (2012); BS, Statistics (2011), CMU.
Exploration of Imputation Methods for Missingness in Image
Dietrich H&SS Senior Honors Thesis (2010-2011); Best Oral Presentation of an Honors Thesis, Statistics Competition, Meeting of the Minds Research Symposium
- Andrew Klein, Master's in Stat Practice (2015); BS, Information Systems, Statistics (CMU), 2011
Semi-Automated Collection of Pitch Location and Intent in Baseball,
Dietrich H&SS Senior Honors Thesis (2010-2011); (co-advised with Andrew Thomas)
Worked at Accenture in D.C.
- Hannah Pileggi, MS in Human Centered Computing, School of Interactive Computing (2013), Georgia Tech; BS, Statistics (CMU), 2011
Characterizing the Quantity and Perceived Quality of Sleep in a
Health Care Student Population; joint with Texas Tech Univ Dept of Internal Medicine
The Impact of Offshoring on Firm vs. Individual Technology
Trajectories in Optoelectronics
Statistical Graphics project
co-supervised with Erica Fuchs, Dept of EPP (CMU)
Currently working at Facebook.
- Ryan Sieberg BS Mathematical Sciences, minors in
Statistics & Neuroscience 2009,
Master's in Statistical Practice 2010 (CMU)
Using Statistical Methods to Improve Disease Classification
Research fellowship with the Center for the Neural Basis of Cognition (2008-2009)
Currently in Northwestern University Medical School (Fall 2011)
- Alex(andra) Tronetti, BS, Statistics & Economics 2009 (CMU)
Finding the "Important" Structure in a Corpus of Sleep Habit Questionnaires
Graduated with a Master's in Biostatistics from UNC-Chapel Hill (2011).
- Andrew Althouse, PhD Epidemiology (2014), M.S. Applied Statistics (2010), University of Pittsburgh; B.S. Statistics 2008 (CMU)
the Relationship Between Sleep
Characteristics and Pediatric Obesity
Currently at the UPMC Heart and Vascular Institute
- Xiaoyi Fei, B.S. Computer Science 2007 (CMU);
Clustering with a Density-based
Currently in industry; returning to graduate school.