**NOTE:** I am not taking on any new students until the 2018–2019 academic year at the earliest.

- Bayesian convergence under mis-specification: specific applications, predictive properties (see below under learning theory)
- Model checking to identify, measure, and if possible fix mis-specification, as in my paper with Gelman
- Efficient implementation of double bootstrap, cross-validation, or other bias corrections to posterior predictive tests
- Tests based on de-Bayesed prediction intervals (a la Larry, following the "conformal predictors" crowd)
- Tests that tell us how to change the model, not just that something is wrong (as with CSSR, or causal discovery algorithms)
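As a concrete starting point for the de-Bayesed-interval idea, here is a minimal split-conformal sketch; the toy data, the least-squares point predictor, and the 90% level are all illustrative assumptions, not part of any project above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 2x + Gaussian noise
x = rng.uniform(0, 1, 200)
y = 2 * x + rng.normal(0, 0.1, 200)

# Split the sample: fit the point predictor on one half,
# calibrate the interval width on the other
fit_idx, cal_idx = np.arange(100), np.arange(100, 200)
slope, intercept = np.polyfit(x[fit_idx], y[fit_idx], 1)

def predict(z):
    return intercept + slope * z

# Conformity scores: absolute residuals on the calibration half
scores = np.abs(y[cal_idx] - predict(x[cal_idx]))
alpha = 0.10
n = len(scores)
# Finite-sample-valid quantile with the (n + 1) correction
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

# 90% prediction interval at a new point: distribution-free,
# requiring exchangeability but no correctly specified model
x_new = 0.5
lo, hi = predict(x_new) - q, predict(x_new) + q
```

The coverage guarantee comes from exchangeability of the calibration scores alone, which is exactly what makes such intervals useful as model-free benchmarks for Bayesian ones.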

- Ensemble methods (see below under growing ensemble)
- Bootstrap mis-specification tests for parametric regression based on non-parametric smoothers: computational implementation, power (see relevant chapter in ADAfaEPoV)
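The logic of that test can be sketched quickly; everything below (the linear null, the Gaussian-kernel smoother, the fixed bandwidth, the in-sample MSE gap as test statistic) is an illustrative choice rather than the chapter's exact recipe:

```python
import numpy as np

rng = np.random.default_rng(1)

def smoother(x, y, x0, h=0.1):
    """Nadaraya-Watson kernel regression with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

def discrepancy(x, y):
    """Test statistic: in-sample MSE of the parametric (linear)
    fit minus that of the non-parametric smoother."""
    b, a = np.polyfit(x, y, 1)
    mse_lin = np.mean((y - (a + b * x)) ** 2)
    mse_smooth = np.mean((y - smoother(x, y, x)) ** 2)
    return mse_lin - mse_smooth, a, b, np.sqrt(mse_lin)

# Data actually generated from the linear model (null is true here)
x = rng.uniform(0, 1, 150)
y = 1 + 2 * x + rng.normal(0, 0.2, 150)
t_obs, a, b, s = discrepancy(x, y)

# Bootstrap from the *fitted parametric model* to get the
# null distribution of the discrepancy
t_boot = []
for _ in range(200):
    y_star = a + b * x + rng.normal(0, s, len(x))
    t_boot.append(discrepancy(x, y_star)[0])
p_value = np.mean(np.array(t_boot) >= t_obs)
```

The open questions above concern exactly this loop: its computational cost (refitting the smoother inside the bootstrap) and its power against interesting departures from the parametric form.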

- Cross-validation: theoretical properties of the "latin CV" of Dabbs and Junker
- Bootstrapping networks
- Adapting the "pigeonhole" bootstrap of Owen and Eckles
- The "empirical graphon" (AG)
- The snowball bootstrap of Eldardiry and Neville: This seems to over-sample high-degree nodes, but are there functionals it is good for? Are there corrections?

- Smoothing adjacency matrices and graph-sequence limits (BK, LW)
- Relation of network sufficient statistics to projectibility and prediction (AR)
- Are the *only* projectible ERGMs dyadic-independence models?
- Asymptotic distribution of MLE in projectible exponential families: can the Gärtner-Ellis result be extended to get a Gaussian distribution?
- Algebraic characterization of projectibility in ERGMs
- Ditto without exponential-family assumption (perhaps assuming completeness of sufficient statistics?)

- Model selection (CM)
- For block models specifically
- More generally for network models, especially when graphs are sparse

- Statistical approaches to discovering communities (more generally, blocks)
- Detecting network change (CG, AT, DA, LW)
- Significance of fluctuations in network summary statistics
- Testing for differences in higher-level network structure

- Effects of aggregating nodes on network inference (SM)

- Causal inference of contagion/influence on networks
- Use of community discovery (HW, MK, EM)
- What is identified by random-neighbor assignment?
- Bounds/partial identification (AT)
- What is the "largest" parameter identified in the usual case?
- Experimental design: when is it better to experiment on nodes vs. edges? (AR, VK)
- Analogues to "genomic control" to measure typical size of pure-homophily effects?
- Adaptation of "cryptic relatedness" measures from genetics?

- Use of social networks as sensor networks (DA)
- Distinguishing cultural from biological transmission (DA)
- Distributed learning and problem-solving (HF)
- Effects of network structure on institutional change (HF)

- Mathematical construction of maximal identified parameter
- Application to social influence
- Application to macroeconomic models
- Connection to "partial identification" in Manski's sense?

- Consistency and convergence of Bayesian nonparametrics for stochastic processes
- More comprehensible ("primitive") sufficient conditions for convergence
- Consistency/convergence of PDFAs with Pitman-Yor priors
- Consistency/convergence for infinite dynamic Bayesian networks
- Proof of convergence in risk under Kullback-Leibler loss
- Extension to more complicated index sets
- Large deviations for location of posterior in space of distributions
- Gaussian process approximations to posterior distribution (from expanding existing LDP?)

- Measuring dependence and effective sample size (DM, MS)
- Estimating measures of weak dependence other than beta-mixing, along the lines of our estimation of beta-mixing coefficients
- Purely finite-dimensional bounds on generalization error (as opposed to the current bounds, which invoke functionals of the infinite-dimensional distribution)

- Model complexity for time series (DM, MS)
- Rademacher complexity of time-series models
- Estimation of empirical Rademacher complexity
- Other possible noise-correlation notions of complexity
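For a finite class, the empirical Rademacher complexity can be estimated by straightforward Monte Carlo; the "class" below is a hypothetical random matrix of predictor outputs, just to keep the sketch self-contained, and the time-series variants above would replace the i.i.d. signs with a suitably dependent noise process:

```python
import numpy as np

def empirical_rademacher(preds, n_draws=500, seed=0):
    """Monte Carlo estimate of empirical Rademacher complexity
    for a finite class.  preds is an (n_models, n_samples) array
    of each model's outputs on the fixed sample; for each draw
    of i.i.d. signs sigma we record
    max_j (1/n) sum_i sigma_i * preds[j, i]."""
    rng = np.random.default_rng(seed)
    n_models, n = preds.shape
    sups = np.empty(n_draws)
    for t in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)
        sups[t] = np.max(preds @ sigma) / n
    return sups.mean()

# Hypothetical class: 20 bounded predictors evaluated at 100 points
rng = np.random.default_rng(1)
preds = rng.uniform(-1, 1, size=(20, 100))
rad = empirical_rademacher(preds)
```

For infinite classes the max becomes a supremum, and estimating it is part of what makes the problem above non-trivial.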

- Implicit constraints from stationarity
- Complexity of general state-space models
- Complexity of specific restricted forms of state-space model

- Bootstrap-type bounds on forecasting error (DM, RL)
- Validity of cross-validation for mixing processes, e.g., based on Kontorovich-Ramanan concentration of measure results
- Learning theory on mixtures of processes
- Construction of Rademacher bounds for predictive risk
- Other bounds on predictive risk
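For context, the scheme whose validity for mixing processes is at issue looks like this "blocked" cross-validation, where a buffer of observations around each test block is discarded so that training and test sets are nearly independent under mixing; the fold count and gap size here are arbitrary illustrations:

```python
import numpy as np

def blocked_cv_splits(n, n_folds=5, gap=10):
    """Blocked CV for a length-n time series: each fold's test set
    is a contiguous block, and a 'gap' of observations on either
    side of it is dropped from the training set, so that under
    mixing the two sets are approximately independent."""
    fold_size = n // n_folds
    for k in range(n_folds):
        test = np.arange(k * fold_size, (k + 1) * fold_size)
        keep = np.ones(n, dtype=bool)
        lo = max(0, test[0] - gap)
        hi = min(n, test[-1] + 1 + gap)
        keep[lo:hi] = False  # drop test block plus buffer
        yield np.where(keep)[0], test

# Example: 5 folds over 100 time points with a 10-step buffer
splits = list(blocked_cv_splits(100))
```

A Kontorovich-Ramanan-style concentration argument would be one route to showing that the resulting risk estimates are close to the true predictive risk.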

- Learning theory for infinite-memory prediction
- Characterization of uniform asymptotic-equipartition-style convergence (perhaps VC dimension of the functions *X*^{*} -> *P*(*X*)?)
- Explicit risk bounds for same


- Improved algorithms for time series (KLK, SS)
- Classification of time series (KLK)
- Bootstrap theory for uncertainty estimates (GDM)
- BIC for tuning control settings
- Ensemblification by randomizing over hypothesis tests
- Automatic identification of order parameters: given a complexity field, what function of the immediate state best matches it?

- How far can they be justified as asymptotic approximations from large deviations principles?
- Projectibility and the algebraic form of sufficient statistics (AR; see under network structure)

- Lebesgue smoothing (GMG, LW)
- Use of fused lasso to decide how much partial pooling to do in hierarchical models
- Using non-parametric smoothers to test parametric specifications (see above)
- Distribution of typical regression coefficients for random low-dimensional projections of sparse high-dimensional systems (i.e., what is the right null model for linear regression?)

- Asymptotic probabilistic analysis
- What aspects are identifiable in pre-asymptotic regime? Discrimination from factor models
- Adaptation to discrete-choice models, e.g. ideal point models, NOMINATE (JG)

- LDP for stochastic automata
- LDP for adaptive-population processes
- Exponential-family connections (see under "exponential families" above)

- Growing ensembles
- Regret and risk under stationary sources (MS)
- Regret bounds in terms of variation of losses (MS)
- Tuning of epoch length, fixed share, weight of new model (MS)
- Practical applications (AZJ, AC)

- When does low regret imply a generalization error bound?
- Working with infinite spaces of models
- Working with limited feedback

- Remapping in fMRI (CG, EM)
- E-mail networks

- Consistency conditions for indirect inference (LZ, MS)
- Indirect inference with non-parametric auxiliary models (SH)
- Indirect inference for network models (MF)
- Indirect inference for agent-based models
- Tractability of indirect inference with chaotic dynamics
- "Approximate Bayesian computation" with non-parametric summaries; what advantages, if any, does ABC offer over indirect inference?
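For reference, the baseline both methods would be compared against is plain rejection ABC; the Gaussian toy model, flat prior, and tolerance below are hypothetical choices, and the simulation step exploits the known sampling distribution of the mean as a shortcut:

```python
import numpy as np

rng = np.random.default_rng(2)

# "Observed" data from a toy model: N(theta, 1) with unknown mean
theta_true = 1.5
x_obs = rng.normal(theta_true, 1.0, size=100)
s_obs = x_obs.mean()  # summary statistic: the sample mean

def abc_rejection(s_obs, n_draws=20000, eps=0.05):
    """Rejection ABC: draw theta from the prior, simulate the
    summary, keep the draws whose simulated summary lands
    within eps of the observed one."""
    thetas = rng.uniform(-5, 5, size=n_draws)  # flat prior
    # Shortcut: the mean of 100 N(theta, 1) draws is N(theta, 1/10),
    # so simulate the summary statistic directly
    sims = rng.normal(thetas, 0.1)
    return thetas[np.abs(sims - s_obs) < eps]

post = abc_rejection(s_obs)  # approximate posterior sample
```

The question above is what changes when the hand-picked summary is replaced by a non-parametric one, and whether anything then separates ABC from indirect inference.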

- Consistency (and rate?) of Clauset's estimator of the tail cut-off
- Semi-parametric estimation, with non-parametric density estimation below threshold and power-law tail; properties
- Practical non-parametric density estimation with heavy tails (extending Markovitch and Krieger)
- Test of Yule-Simon model for citations (AC)
- Exact distributions for fluctuating feedback (NW)
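For the first two items, the object under study is the KS-minimizing cut-off estimator; a bare-bones version, using continuous data, the Hill-type MLE for the exponent, and a synthetic Pareto sample (all illustrative simplifications), looks like:

```python
import numpy as np

rng = np.random.default_rng(4)

def fit_tail(x):
    """Clauset-style tail fit: for each candidate cut-off xmin,
    estimate the power-law exponent by maximum likelihood and
    pick the xmin minimizing the KS distance between the tail
    data and the fitted power law."""
    best = (np.inf, None, None)
    for xmin in np.unique(x)[:-10]:  # keep some points above the cut-off
        tail = x[x >= xmin]
        n = len(tail)
        alpha = 1 + n / np.sum(np.log(tail / xmin))  # continuous MLE
        # KS distance: empirical CDF of the tail vs. fitted CDF
        s = np.sort(tail)
        cdf_fit = 1 - (s / xmin) ** (1 - alpha)
        cdf_emp = np.arange(1, n + 1) / n
        ks = np.max(np.abs(cdf_emp - cdf_fit))
        if ks < best[0]:
            best = (ks, xmin, alpha)
    return best  # (KS distance, xmin, alpha)

# Synthetic pure Pareto sample with exponent 2.5 and xmin = 1,
# drawn by inverting the CDF
x = (1 - rng.uniform(size=1000)) ** (-1 / 1.5)
ks, xmin, alpha = fit_tail(x)
```

The consistency and rate questions concern exactly this data-driven choice of xmin, where the usual MLE asymptotics for alpha no longer apply directly.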

- Flesh out calculations about life-span distribution of findings
- Compare to data from least-favorite field
- Modifications to basic model, e.g., initial finding inhibits replication but excites testing of related hypotheses