A Radical Proposal for the Provision of Micro-Data Samples and the Preservation of Confidentiality

Stephen E. Fienberg


The problem of preserving confidentiality in the release of micro-data samples is reconsidered in the context of estimating an empirical cumulative distribution function. A proposal is made for the use of a bootstrap-like approach to the generation of synthetic micro-data files. Many details in the proposal require careful attention, and the implications of the proposed method for data disclosure still need to be explored empirically. The method suggested here bears a remarkable similarity to a proposal by Rubin (1993) for the use of multiple imputation for data disclosure limitation.

Keywords: Bootstrap; Data-disclosure limitation; Multiple imputation; Simulated micro-data sets; Smoothed cumulative distribution function.
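
To make the idea of bootstrap-like generation from a smoothed distribution function concrete, the following is a minimal illustrative sketch of a generic one-dimensional smoothed bootstrap. It is not the procedure developed in the report. The function name `smoothed_bootstrap`, the Gaussian kernel, the normal-reference bandwidth rule, and the example values are all assumptions introduced here for illustration only.

```python
import random
import statistics


def smoothed_bootstrap(sample, n_synthetic, bandwidth=None, seed=None):
    """Draw synthetic values from a kernel-smoothed empirical distribution.

    Hypothetical helper for illustration: each synthetic value is a record
    resampled with replacement from `sample`, plus Gaussian noise with
    standard deviation `bandwidth`.
    """
    rng = random.Random(seed)
    if bandwidth is None:
        # Assumed default: normal-reference rule of thumb for the bandwidth.
        bandwidth = 1.06 * statistics.stdev(sample) * len(sample) ** (-1 / 5)
    return [
        rng.choice(sample) + rng.gauss(0.0, bandwidth)
        for _ in range(n_synthetic)
    ]


# Made-up confidential values (for example, incomplete in thousands).
original = [21.5, 34.0, 28.2, 45.9, 39.1, 30.7, 52.3, 26.4]

# Several synthetic replicates, loosely in the spirit of multiple imputation.
replicates = [smoothed_bootstrap(original, len(original), seed=s) for s in range(3)]
```

With a bandwidth of zero this reduces to the ordinary bootstrap, which would release original values unchanged. The added smoothing is what keeps released values from coinciding with actual records, although how much disclosure protection this gives is exactly the kind of question the abstract says must be explored empirically.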
