R package TDA: Statistical Tools for Topological Data Analysis

The R package TDA provides tools for Topological Data Analysis. In particular, it implements functions that, given some data, provide topological information about the underlying space: the distance function, the distance to a measure, the kNN density estimator, the kernel density estimator, and the kernel distance. The salient topological features of the sublevel sets (or superlevel sets) of these functions can be quantified with persistent homology. We provide an R interface for the efficient algorithms of the C++ libraries GUDHI, Dionysus and PHAT, including a function for the persistent homology of the Rips filtration and one for the persistent homology of sublevel sets (or superlevel sets) of arbitrary functions evaluated over a grid of points. The significance of the features in the resulting persistence diagrams can be analyzed with functions that implement the methods discussed in (Fasy et al. 2014), (Chazal, Fasy, Lecci, Rinaldo, et al. 2014) and (Chazal, Fasy, Lecci, Michel, et al. 2014). The package also implements an algorithm for density clustering, which identifies the spatial organization of the probability mass associated with a density function and visualizes it by means of a dendrogram, the cluster tree.
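
To make the first two of these functions concrete, here is a minimal base-R sketch of the distance function and the distance to a measure (DTM) evaluated over a grid. It is an illustration under stated assumptions, not the package's implementation: the helper names `pairDist`, `distFctSketch` and `dtmSketch` are hypothetical, and the DTM is assumed to use exponent 2, that is, the root mean squared distance to the `ceiling(m0 * n)` nearest sample points.

```r
# Hypothetical helpers for illustration; not part of the TDA package.

# Euclidean distances between every row of Grid and every row of X.
pairDist <- function(X, Grid) {
  X <- as.matrix(X)
  Grid <- as.matrix(Grid)
  sq <- outer(rowSums(Grid^2), rowSums(X^2), "+") - 2 * Grid %*% t(X)
  sqrt(pmax(sq, 0))  # clamp tiny negative values caused by rounding
}

# Distance function: distance from each grid point to the nearest sample point.
distFctSketch <- function(X, Grid) {
  apply(pairDist(X, Grid), 1, min)
}

# Distance to a measure (assumed exponent 2): root mean squared distance
# from each grid point to its k = ceiling(m0 * n) nearest sample points.
dtmSketch <- function(X, Grid, m0) {
  k <- ceiling(m0 * nrow(as.matrix(X)))
  D <- pairDist(X, Grid)
  apply(D, 1, function(d) sqrt(mean(sort(d)[seq_len(k)]^2)))
}

# Sample 100 points on the unit circle and evaluate both functions on a grid.
set.seed(1)
theta <- runif(100, 0, 2 * pi)
X <- cbind(cos(theta), sin(theta))
s <- seq(-1.6, 1.6, by = 0.2)
Grid <- as.matrix(expand.grid(s, s))

d0 <- distFctSketch(X, Grid)
d1 <- dtmSketch(X, Grid, m0 = 0.1)
```

With the package installed, the corresponding calls would be `distFct(X, Grid)` and `dtm(X, Grid, m0)`. The sketch only shows what such functions compute: because the DTM averages over several neighbors instead of taking the single nearest one, it is never smaller than the distance function and is less sensitive to isolated outliers.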

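# Example: persistence diagram of the DTM with a bootstrap confidence band.
library(TDA)

# Sample 100 points uniformly from the unit circle.
X <- circleUnif(100)

# Grid limits (x range, then y range) and grid step.
lim <- rep(c(-1.6, 1.6), 2)
by <- 0.05

# Grid of points over which the function is evaluated.
Xseq <- seq(lim[1], lim[2], by = by)
Yseq <- seq(lim[3], lim[4], by = by)
Grid <- expand.grid(Xseq, Yseq)

# Smoothing parameter of the distance to a measure (DTM).
m0 <- 0.1

# Persistence diagram of the sublevel sets of the DTM over the grid.
Diag <- gridDiag(X = X, FUN = dtm, lim = lim,
                 by = by, sublevel = TRUE, library = "Dionysus",
                 printProgress = FALSE, m0 = m0)

# Bootstrap confidence band for the DTM: B resamples, level 1 - alpha.
B <- 20
alpha <- 0.1
band <- bootstrapBand(X = X, FUN = dtm, Grid = Grid, B = B,
                      parallel = FALSE, alpha = alpha, m0 = m0)

# Left panel: the sample. Right panel: the diagram with the confidence band.
par(mfrow = c(1, 2))
plot(X, pch = 16, cex = 0.5, xlab = "", ylab = "")
plot(Diag[["diagram"]], band = 2 * band[["width"]])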
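
The width returned by `bootstrapBand` can be read as a bootstrap quantile of sup-norm deviations. The base-R sketch below shows that idea under stated assumptions; it is not the package's code. The helper names `dtmSketch` and `bootstrapWidthSketch` are hypothetical, the DTM is assumed to use exponent 2, and the width is assumed to be the (1 - alpha) quantile of the maximum absolute difference, over the grid, between the function computed on a resample and on the original sample.

```r
# Hypothetical helpers for illustration; not part of the TDA package.

# Distance to a measure with exponent 2 (assumption): root mean squared
# distance from each grid point to its ceiling(m0 * n) nearest sample points.
dtmSketch <- function(X, Grid, m0) {
  k <- ceiling(m0 * nrow(X))
  apply(Grid, 1, function(g) {
    d2 <- rowSums(sweep(X, 2, g)^2)  # squared distances from g to all points
    sqrt(mean(sort(d2)[seq_len(k)]))
  })
}

# Bootstrap band width: (1 - alpha) quantile of the sup-norm deviations.
bootstrapWidthSketch <- function(X, FUN, Grid, B, alpha, ...) {
  n <- nrow(X)
  f0 <- FUN(X, Grid, ...)                      # function on the full sample
  supDev <- replicate(B, {
    idx <- sample.int(n, n, replace = TRUE)    # resample rows with replacement
    max(abs(FUN(X[idx, , drop = FALSE], Grid, ...) - f0))
  })
  unname(quantile(supDev, 1 - alpha))
}

# Sample 100 points on the unit circle and build a coarse grid.
set.seed(1)
theta <- runif(100, 0, 2 * pi)
X <- cbind(cos(theta), sin(theta))
s <- seq(-1.6, 1.6, by = 0.2)
Grid <- as.matrix(expand.grid(s, s))

width <- bootstrapWidthSketch(X, dtmSketch, Grid, B = 20, alpha = 0.1, m0 = 0.1)
```

Under this reading, a sup-norm band of width `width` around the function moves points of the persistence diagram by at most `width` in bottleneck distance, so a feature would be treated as significant when its persistence (death minus birth) exceeds twice that width. This is consistent with the example plotting the diagram with `band = 2 * band[["width"]]`.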

Chazal, Frédéric, Brittany Terese Fasy, Fabrizio Lecci, Bertrand Michel, Alessandro Rinaldo, and Larry Wasserman. 2014. “Robust Topological Inference: Distance-to-a-Measure and Kernel Distance.” Technical Report.

Chazal, Frédéric, Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, and Larry Wasserman. 2014. “Stochastic Convergence of Persistence Landscapes and Silhouettes.” In Annual Symposium on Computational Geometry, 474–83. ACM.

Fasy, Brittany Terese, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, and Aarti Singh. 2014. “Confidence Sets for Persistence Diagrams.” Annals of Statistics.