Jing Lei

Research interests

My research is largely motivated by the need to understand new challenges and methods in modern statistics for dealing with complicated data, including dynamic systems, nonparametric methods, high dimensional inference, and data privacy.

Numerical weather forecasting and particle filters.
My PhD thesis work is on the theory and application of Sequential Monte Carlo
(SMC) methods in data assimilation problems. Data assimilation is a
statistical inverse problem in numerical weather forecasting, which is
closely related to the sequential filtering problem in time series. Because of their natural capacity to incorporate physical dynamics into probabilistic models, state space models have found wide applications in engineering, biology, finance, and geophysics. However, inference becomes challenging when the dynamics are nonlinear. SMC methods were developed as a computationally feasible inference tool for nonlinear state space models. My work in this area focuses on developing new algorithms for high-dimensional, complicated data in state space models, as well as on the theoretical analysis of some common SMC algorithms.
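
To make this concrete, here is a minimal sketch of a bootstrap particle filter for a scalar nonlinear state space model. The dynamics, observation likelihood, and particle count below are illustrative assumptions, not a specific algorithm from my work.

import numpy as np

def bootstrap_particle_filter(y, propagate, obs_lik, x0_sampler, n_particles=1000, rng=None):
    # Bootstrap particle filter: propagate particles through the dynamics,
    # reweight by the observation likelihood, then resample.
    rng = np.random.default_rng() if rng is None else rng
    particles = x0_sampler(n_particles, rng)
    filtering_means = []
    for y_t in y:
        particles = propagate(particles, rng)                 # nonlinear state transition
        w = obs_lik(y_t, particles)                           # observation density p(y_t | x_t), up to a constant
        w = w / w.sum()
        filtering_means.append(np.sum(w * particles))         # approximate E[x_t | y_1:t]
        idx = rng.choice(n_particles, size=n_particles, p=w)  # multinomial resampling
        particles = particles[idx]
    return np.array(filtering_means)

# Illustrative scalar nonlinear model:
#   x_t = 0.5 x_{t-1} + 8 cos(x_{t-1}) + N(0, 1),   y_t = x_t^2 / 20 + N(0, 1)
rng = np.random.default_rng(0)
propagate = lambda x, r: 0.5 * x + 8.0 * np.cos(x) + r.normal(size=x.shape)
obs_lik = lambda y_t, x: np.exp(-0.5 * (y_t - x**2 / 20.0) ** 2)
x0_sampler = lambda n, r: r.normal(0.0, 2.0, size=n)

x_true = np.zeros(30)
for t in range(1, 30):
    x_true[t] = 0.5 * x_true[t - 1] + 8.0 * np.cos(x_true[t - 1]) + rng.normal()
y = x_true ** 2 / 20.0 + rng.normal(size=30)
print(bootstrap_particle_filter(y, propagate, obs_lik, x0_sampler, rng=rng)[:5])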

Data privacy.
In general, statistical
database privacy studies the tradeoff between the utility that databases
(for example, those produced by the US Census) can offer and the privacy they
afford their constituents. Interesting questions in this area include: How can statistical analysis be carried out under privacy constraints? How can statistical methods help achieve privacy? Some of my work tries to use statistical ideas such as robustness and regularization in database privacy, and I believe there is much more to explore in this direction. Currently, I am interested in understanding the limits of differential privacy in large contingency tables.
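
For concreteness, here is a minimal sketch of the standard Laplace mechanism applied to the cell counts of a contingency table; the table and the privacy budget epsilon below are illustrative, and this is generic differential privacy machinery rather than a method of mine.

import numpy as np

def laplace_noisy_table(counts, epsilon, rng=None):
    # Adding or removing one individual changes exactly one cell count by 1,
    # so the L1 sensitivity of the vector of cell counts is 1 and Laplace
    # noise with scale 1/epsilon gives epsilon-differential privacy.
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    return np.clip(np.round(noisy), 0, None)  # post-processing (rounding, truncation) preserves privacy

# Illustrative 2x3 table of counts and a privacy budget of epsilon = 0.5
table = np.array([[12, 30, 7],
                  [ 5, 22, 9]])
print(laplace_noisy_table(table, epsilon=0.5, rng=np.random.default_rng(1)))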

Nonparametric inference.
Nonparametric inference has a long history in statistics. My interest in this area is two-fold: first, to develop new
nonparametric methods for prediction regions and confidence bands, by taking
advantage of new ideas emerging in related areas such as machine learning;
second, to understand the behavior of smoothing parameters in new inference
tasks, for example the bandwidth choice in prediction region construction.
Moreover, nonparametric methods have been found useful in high dimensional problems when used in conjunction with classical parametric methods; a good example is false discovery rate control in multiple testing.
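
As an illustration of the role of the smoothing parameter, here is a minimal sketch of a plug-in prediction region built from a one-dimensional kernel density estimate: the region is the upper level set of the estimated density, cut at the empirical alpha-quantile of the fitted density values. The bandwidth h, the data, and the miscoverage level alpha below are illustrative assumptions.

import numpy as np

def kde_prediction_region(x, h, alpha=0.1):
    # Gaussian kernel density estimate with bandwidth h, and the threshold
    # t_alpha defining the plug-in prediction region {x : fhat(x) >= t_alpha}.
    # t_alpha is the alpha-quantile of the fitted density at the sample points,
    # so the region cuts off roughly the lowest-density alpha fraction of the data.
    x = np.asarray(x, dtype=float)

    def fhat(grid):
        z = (np.asarray(grid, dtype=float)[:, None] - x[None, :]) / h
        return np.exp(-0.5 * z ** 2).mean(axis=1) / (h * np.sqrt(2.0 * np.pi))

    t_alpha = np.quantile(fhat(x), alpha)
    return fhat, t_alpha

# Illustrative bimodal sample; the bandwidth h = 0.4 is an arbitrary choice
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 0.5, 200)])
fhat, t_alpha = kde_prediction_region(x, h=0.4, alpha=0.1)
print("0.5 inside the 90% region:", fhat([0.5])[0] >= t_alpha)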

High dimensional inference.
High dimensional data is currently by far the most popular topic in statistics, with a focus on so-called "large p, small n" problems. While many practical approaches have been proposed, theoretical analysis helps us better understand the algorithms as well as the nature of the problems. I started working in this area with Vincent Vu, on minimax estimation for sparse principal component analysis.
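
To give a flavor of the problem, here is a minimal sketch of a simple heuristic for estimating a sparse leading principal component, a power iteration with soft-thresholding of the loadings. It illustrates the kind of sparse PCA estimator this line of work concerns, not the specific estimators in our minimax analysis; the threshold lam and the simulated data are assumptions for illustration.

import numpy as np

def sparse_leading_pc(S, lam, n_iter=200):
    # Power iteration with soft-thresholding of the loading vector: a simple
    # heuristic that encourages a sparse leading principal component when the
    # p x p covariance S comes from "large p, small n" data.
    v = np.linalg.eigh(S)[1][:, -1]  # start from the ordinary (dense) leading eigenvector
    for _ in range(n_iter):
        u = S @ v
        u = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)  # soft-threshold the loadings
        norm = np.linalg.norm(u)
        if norm == 0.0:
            break
        v = u / norm
    return v

# Illustrative example: p = 50, n = 40, true sparse direction supported on 5 coordinates
rng = np.random.default_rng(3)
p, n = 50, 40
v_true = np.zeros(p)
v_true[:5] = 1.0 / np.sqrt(5.0)
X = rng.normal(size=(n, p)) + 3.0 * rng.normal(size=(n, 1)) * v_true
S = np.cov(X, rowvar=False)
v_hat = sparse_leading_pc(S, lam=0.5)
print("estimated support:", np.flatnonzero(np.abs(v_hat) > 1e-8))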