Jing Lei


Research interests

My research is largely motivated by the need to understand the new challenges and methods that arise in modern statistics when dealing with complicated data, including dynamic systems, nonparametric methods, high-dimensional inference, and data privacy.

Numerical weather forecasting and particle filters. My PhD thesis work is on the theory and application of Sequential Monte Carlo (SMC) methods in data assimilation problems. Data assimilation is a statistical inverse problem in numerical weather forecasting that is closely related to the sequential filtering problem in time series. Because they naturally incorporate physical dynamics into probabilistic models, state space models have found wide application in engineering, biology, finance, and geophysics. However, inference becomes challenging when the dynamics are nonlinear. SMC methods were developed as a computationally feasible inference tool for nonlinear state space models. My work in this area focuses on developing new algorithms for complicated high-dimensional data in state space models, as well as on the theoretical analysis of some common SMC algorithms.

Data privacy. In general, statistical database privacy studies the tradeoff between the utility that databases (for example, those produced by the US Census) can offer and the privacy they afford their constituents. Interesting questions in this area include: How can statistical analysis be carried out under privacy constraints? How can statistical methods help achieve privacy? Some of my work tries to apply statistical ideas such as robustness and regularization to database privacy, and I believe there is much more to explore in this direction. Currently, I am interested in understanding the limits of differential privacy in large contingency tables.

Nonparametric inference. Nonparametric inference has a long history in statistics. My interest in this area is two-fold: first, to develop new nonparametric methods for prediction regions and confidence bands by taking advantage of new ideas emerging in related areas such as machine learning; second, to understand the behavior of smoothing parameters in new inference tasks, for example the bandwidth choice in prediction region construction. Moreover, nonparametric methods have been found useful in high-dimensional problems when used in conjunction with classical parametric methods; a good example is false discovery rate control in multiple testing.

High-dimensional inference. High-dimensional data is currently by far the most popular topic in statistics, with a focus on the so-called "large p, small n" problems. While many practical approaches have been proposed, theoretical analysis would help us better understand both the algorithms and the nature of the problems. I started working in this area with Vincent Vu, on minimax estimation for sparse principal component analysis.