Nicolás Kim :: (Ph.D. Student, Statistician)


About Me

I am a Ph.D. student in the Department of Statistics at Carnegie Mellon University. Previously, I was at Boston University where I graduated with a B.A. in Mathematics with Honors.

Alessandro Rinaldo is my advisor. My current research interests are elements of:
Statistics ⋂ (Networks ⋃ Algebra).

My email is [AndrewID]
My AndrewID is nicolask.



Nicolás Kim, Dane Wilburne, Sonja Petrović, Alessandro Rinaldo. (2016). On the Geometry and Extremal Properties of the Edge-Degeneracy Model. Third SDM Workshop on Mining Networks and Graphs.

Nicolás Kim. (2016). The Effect of Data Swapping on Analyses of American Community Survey Data. Journal of Privacy and Confidentiality: Vol. 7: Iss. 1, Article 3. Code, arXiv link.


I was a statistics research intern at Bell Labs over this past summer.

From May 5–7, 2016, I will be in Miami, Florida to give a talk at the Third SDM Workshop on Mining Networks and Graphs: slides.

I will be presenting a poster at the ASA Pittsburgh Spring Banquet on March 17th. The poster will be about my work on a particular algebraic statistical network model, the edge-degeneracy ERGM. Link to the poster. This poster will also be presented at the Innovation with Impact meeting at CMU on April 7th.

I gave a talk in Washington, D.C., at the NSF-Census Research Network Fall 2015 meeting; here are the slides.

From July 1 to July 10, 2015, I was in Osaka, Japan for a summer school and conference on Gröbner Bases. Here is the website link.

I gave a talk about functional programming and Haskell for the Department of Statistics computing seminar, Stat Bytes. Here are the slides.

Network models

Global community detection for giant social networks is intractable. In my most recent work, I proposed a highly efficient algorithm that solves a local version of the community detection problem, and also established statistical guarantees for this procedure.

Algebraic statistics

I wrote some slides to introduce the ideas of Diaconis and Sturmfels (1998) along with some basic computational algebraic geometry. These are intended for a 30-minute talk.
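As a rough illustration of the kind of idea involved (this is not taken from the slides), here is a minimal Python sketch of a random walk on two-way contingency tables with fixed row and column sums. It uses the basic moves that add +1 and -1 in a checkerboard pattern on a 2x2 subtable, which are known to connect all such tables. The function names and the example table are my own assumptions, chosen for illustration. The walk as written has the uniform distribution on the fiber as a stationary distribution; targeting the hypergeometric distribution would need an extra Metropolis acceptance step, which is omitted here.

```python
import random


def row_sums(table):
    return [sum(row) for row in table]


def col_sums(table):
    return [sum(col) for col in zip(*table)]


def basic_move(table, rng):
    """Propose a basic move: choose two rows and two columns, then add
    +1 / -1 in a checkerboard pattern. Margins are unchanged by construction."""
    n_rows, n_cols = len(table), len(table[0])
    i1, i2 = rng.sample(range(n_rows), 2)
    j1, j2 = rng.sample(range(n_cols), 2)
    proposal = [row[:] for row in table]  # copy, so the input is not mutated
    proposal[i1][j1] += 1
    proposal[i2][j2] += 1
    proposal[i1][j2] -= 1
    proposal[i2][j1] -= 1
    return proposal


def walk(table, steps, seed=0):
    """Run the walk for a number of steps, staying put whenever a
    proposed move would create a negative cell."""
    rng = random.Random(seed)
    current = [row[:] for row in table]
    for _ in range(steps):
        proposal = basic_move(current, rng)
        if all(cell >= 0 for row in proposal for cell in row):
            current = proposal
    return current


if __name__ == "__main__":
    start = [[3, 1, 2], [0, 4, 1], [2, 2, 5]]  # hypothetical 3x3 table
    end = walk(start, steps=1000, seed=1)
    print(end)
    print(row_sums(end) == row_sums(start), col_sums(end) == col_sums(start))
```

Every table visited by the walk has the same margins as the starting table, which is the property that makes such moves useful for conditional sampling.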

For those interested in learning more about this new field (which is at the intersection of statistics and algebraic geometry), I recommend the book "Lectures on Algebraic Statistics" by Drton, Sturmfels, and Sullivant. Readers considering study in algebraic statistics may also be interested in Ulf Grenander's "Probabilities on Algebraic Structures".

Probability Theory

Extremal asymptotics of Exponential Random Graph Models (ERGMs).

Minimum spanning trees and randomized algorithms.

Census Bureau confidentiality methods

I am interested in improving the quality of publicly released Census data for the Decennial Census as well as non-census surveys.

This research was partially funded by Stephen E. Fienberg and William F. Eddy through NSF grant SES 1130706.


The CMU wordmark as a scatterplot of data, clustered with DBSCAN

Code repositories

My directory listing where you can download code, images, etc.

My GitHub page.

An index of my Haskell code.

Learn You a Data Analysis!

Check out my new project, Learn You a Data Analysis!

CMU Statistics T-shirt

I wrote the code to generate the Department of Statistics' T-shirt: data-ify your logo or wordmark!

Functional programming and Haskell

I am currently learning Haskell. A very useful package for algebraic statistics is HLearn.

The book I'm following is Learn You a Haskell for Great Good (LYAH). It's pretty much the canonical text for beginners to Haskell, and I can also recommend it as a great introductory text to the ideas of functional programming.

An exciting direction for programming is probabilistic programming. I am considering spending some time with Hakaru, an embedded language for Haskell. There is a paper by Martin Erwig and Steve Kollmansberger that seems worth reading.

I'm also following Bartosz Milewski's Category Theory for Programmers, which is superb, although it is still in progress.

If you're just interested in learning a bit about what functors, monads, and applicatives are in Haskell, then this blog post should be helpful.


I have TA'ed for the following courses:

  • Modern Regression (Undergraduate course)
  • Introduction to Probability Theory (Undergraduate)
  • Sampling, Survey, and Society (Undergraduate)
  • Statistical Graphics and Visualization (Graduate)
  • Applied Multivariate Methods (Undergraduate/Graduate)

My CV/résumé, LinkedIn, blog, and ORCID.

Find out what the NSF Census Research Network CMU node is doing.

If you are involved in the Census project at CMU and would like to learn how to access the project GitHub, follow these instructions.

David Mumford's thoughts on the role of statistical inference and probability theory in the future of pure and applied mathematics: The Dawning of the Age of Stochasticity.

Check out PHD Comics, Dinosaur Comics, Abstruse Goose, SMBC (Saturday Morning Breakfast Cereal), and XKCD (X-treme Kansas College of Dentistry).

Take the Springer GTM Test (I'm Doob's Measure Theory).

A great page to do holiday shopping for your colleagues: Nausicaa Distribution (Etsy).

Updated October 2016