Administrative tax data contain a wealth of information that is potentially valuable for research and analysis. However, the legal and ethical imperative to protect taxpayer privacy has restricted access to a small number of government analysts and select researchers. We propose to develop, in consultation with experts at the Statistics of Income division of the IRS, a fully synthetic tax database: a file that preserves many of the statistical characteristics of the restricted data without containing any identifiable tax return information. We will test our procedures on the existing public use file and then adapt them to run on the confidential tax data. Working with the IRS, we also hope to develop a procedure through which researchers can submit statistical programs that have been tested on the synthetic data to run on IRS computers, subject to a review to guarantee that the output satisfies disclosure avoidance protocols. A fee structure would be set to defray costs.