Nicolás Kim


About me

I am a Ph.D. candidate in the Department of Statistics at Carnegie Mellon University. Previously, I was at Boston University, where I graduated with a B.A. in Mathematics with Honors.

Alessandro Rinaldo is my advisor. My current research interests are elements of:
Statistics ⋂ (Networks ⋃ Algebra).

My email is [AndrewID]
My AndrewID is nicolask.



Nicolas Kim, Alessandro Rinaldo (2017). Edge-Induced Sampling from Graphons. Preprint. [pdf].

Nicolas Kim, Alessandro Rinaldo (2017). Community Detection on Ego Networks via Mutual Friend Counts. Submitted. [pdf].

Jin Cao, Sining Chen, Sean Kennedy, Nicolas Kim, Lisa Zhang (2017). Extracting Mobile User Behavioral Similarity via Cell-Level Location Trace. 20th IEEE Global Internet Symposium (GI 2017).

Nicolas Kim*, Dane Wilburne*, Sonja Petrović, Alessandro Rinaldo (2016). On the Geometry and Extremal Properties of the Edge-Degeneracy Model. Third SDM Workshop on Mining Networks and Graphs. [arXiv].

Nicolas Kim (2016). The Effect of Data Swapping on Analyses of American Community Survey Data. Journal of Privacy and Confidentiality: Vol. 7: Iss. 1, Article 3. [Published] [arXiv] [Code].

Recent activities

This upcoming summer, I will be a statistics intern at Lilly.

I passed my thesis proposal on February 27, 2017.

I was a statistics research intern at Bell Labs during the summer of 2016.

From May 5–7, 2016, I was in Miami, Florida to give a talk at the Third SDM Workshop on Mining Networks and Graphs: slides.

I presented a poster at the American Statistical Association's Pittsburgh Spring Banquet on March 17, 2016. The poster is about some work on a particular algebraic statistical network model, the edge-degeneracy ERGM. Link to the poster. This poster was also presented at the Innovation with Impact meeting at CMU on April 7, 2016.

From July 1 to July 10, 2015, I was in Osaka, Japan for a summer school and conference on Gröbner Bases. Here is the website link.

Social networks

Global community detection for giant social networks is intractable. In my most recent work, I proposed a highly efficient algorithm that solves a local version of the community detection problem, and also established statistical guarantees for this procedure.

Algebraic statistics

I wrote some slides to introduce the ideas of Diaconis and Sturmfels (1998) along with some basic computational algebraic geometry. These are intended for a 30-minute talk.
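As a rough illustration of the kind of algorithm Diaconis and Sturmfels (1998) study (this sketch is not taken from the slides), consider two-way contingency tables with fixed row and column sums. The "basic moves" that add +1/-1 on the corners of a 2x2 subtable preserve both sets of margins, and a random walk built from them explores the tables sharing those margins. The function names and the starting table below are hypothetical, and the walk is the plain symmetric version; targeting a specific distribution would need an additional Metropolis acceptance step.

```python
import random

def basic_move_step(table, rng):
    """Propose one +1/-1 move on a random 2x2 subtable; stay put if it goes negative."""
    n_rows, n_cols = len(table), len(table[0])
    i, k = rng.sample(range(n_rows), 2)  # two distinct rows
    j, l = rng.sample(range(n_cols), 2)  # two distinct columns
    sign = rng.choice((1, -1))
    proposal = [row[:] for row in table]
    # Each touched row and column receives one +sign and one -sign,
    # so every row sum and column sum is unchanged.
    proposal[i][j] += sign
    proposal[k][l] += sign
    proposal[i][l] -= sign
    proposal[k][j] -= sign
    if min(proposal[i][j], proposal[k][l], proposal[i][l], proposal[k][j]) < 0:
        return table  # reject: counts must stay nonnegative
    return proposal

def margins(table):
    """Return (row sums, column sums) of a table given as a list of rows."""
    return [sum(row) for row in table], [sum(col) for col in zip(*table)]

rng = random.Random(0)
start = [[3, 1, 2], [0, 4, 1], [2, 2, 5]]  # hypothetical 3x3 table of counts
current = start
for _ in range(1000):
    current = basic_move_step(current, rng)
```

After any number of steps, `margins(current)` equals `margins(start)` and every cell remains a nonnegative integer, which is what makes the walk usable for conditional inference given the margins.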

For those interested in learning more about this new field (which is at the intersection of statistics and algebraic geometry), I recommend the book "Lectures on Algebraic Statistics" by Drton, Sturmfels, and Sullivant. Readers considering study in algebraic statistics may also be interested in Ulf Grenander's "Probabilities on Algebraic Structures".

Probability theory

Extremal asymptotics of Exponential Random Graph Models (ERGMs).

Minimum spanning trees and randomized algorithms.

Data privacy

I am interested in improving the quality of publicly released Census data for the Decennial Census as well as non-census surveys.

I gave a talk in Washington, D.C., at the NSF-Census Research Network Fall 2015 meeting; here are the slides.

This research was partially funded by Stephen E. Fienberg and William F. Eddy through NSF grant SES 1130706.


The CMU wordmark as a scatterplot of data, clustered with DBSCAN

Code repositories

My directory listing where you can download code, images, etc.

My GitHub page.

An index of my Haskell code.

Learn You a Data Analysis!

Check out my new project, Learn You a Data Analysis!

CMU Statistics T-shirt

I wrote the code to generate the Department of Statistics' T-shirt: data-ify your logo or wordmark!
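The idea of "data-ifying" a wordmark can be sketched as follows. This is not the code used for the T-shirt; it is a minimal, dependency-free illustration that assumes a hand-drawn bitmap of the letters, turns each filled cell into a jittered point, and groups the points with a small textbook DBSCAN. The bitmap, the jitter size, and the `eps` and `min_pts` values are all assumptions chosen so that each letter forms one cluster.

```python
import random

# Hypothetical 5-row bitmap of "CMU"; '#' marks a filled cell.
BITMAP = [
    "###.#####.#.#",
    "#...#.#.#.#.#",
    "#...#.#.#.#.#",
    "#...#.#.#.#.#",
    "###.#.#.#.###",
]

def bitmap_to_points(bitmap, rng, jitter=0.1):
    """Place one slightly jittered point at the center of every filled cell."""
    return [(c + rng.uniform(-jitter, jitter), -r + rng.uniform(-jitter, jitter))
            for r, row in enumerate(bitmap)
            for c, ch in enumerate(row) if ch == "#"]

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN; returns one label per point, with -1 meaning noise."""
    NOISE = -1
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels

rng = random.Random(0)
points = bitmap_to_points(BITMAP, rng)
labels = dbscan(points, eps=1.3, min_pts=2)
```

With these settings, horizontally or vertically adjacent cells are always within `eps` of each other, while the blank column between letters is always wider than `eps`, so the three letters come out as three separate clusters. Plotting `points` colored by `labels` gives a scatterplot version of the wordmark.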

Functional programming and Haskell

I am currently learning Haskell. A very useful package for algebraic statistics is HLearn.

The book I'm following is Learn You a Haskell for Great Good (LYAH). It's pretty much the canonical text for beginners to Haskell, and I can also recommend it as a great introductory text to the ideas of functional programming.

An exciting direction for programming is probabilistic programming. I am considering spending some time with Hakaru, a probabilistic programming language embedded in Haskell. There is a paper by Martin Erwig and Steve Kollmansberger that seems worth reading.

I'm also following Bartosz Milewski's Category Theory for Programmers, which is superb, although it is still in progress.

If you're just interested in learning a bit about what functors, monads, and applicatives are in Haskell, then this blog post should be helpful.

My CV/résumé, LinkedIn, blog, and ORCID.

David Mumford's thoughts on the role of statistical inference and probability theory in the future of pure and applied mathematics: The Dawning of the Age of Stochasticity.

Take the Springer GTM Test (I'm Doob's Measure Theory).

A great page to do holiday shopping for your colleagues: Nausicaa Distribution (Etsy).

Updated February 2017