PROJECT: Quantitative Data Analysis

    The Urban-Brookings Tax Policy Center’s large-scale microsimulation model produces revenue and distribution estimates of the US federal tax system. The model is similar to those used by the Congressional Budget Office (CBO), the Joint Committee on Taxation (JCT), and the Treasury's Office of Tax Analysis.

    The model is based on data from the 2004 public-use file (PUF) produced by the Statistics of Income Division (SOI) of the Internal Revenue Service (IRS). The PUF contains 150,047 records with detailed information from federal individual income tax returns filed in the 2004 calendar year. We attach additional information on demographics and sources of income that are not reported on tax returns through a constrained statistical match of the public-use file with the March 2005 Current Population Survey (CPS) of the US Census Bureau. That match also generates a sample of individuals who do not file income tax returns (“nonfilers”). The dataset combining filers from the PUF (augmented by demographic and other information from the CPS) and nonfilers from the CPS allows us to carry out distribution analysis on the entire population rather than just the segment that files individual income tax returns, and to model tax proposals that would potentially affect current nonfilers.

    The tax model consists of two components. The first is a statistical routine that “ages,” or extrapolates, the 2004 data to create a representative sample of both filing and nonfiling tax units for future years.[1] The second is a set of detailed tax calculators that compute individual income tax liability for all filers in the sample under current law and under alternative policy proposals, compute the employee and employer shares of payroll taxes for Social Security and Medicare, assign the burden of the corporate income tax to tax units, and determine the expected value of estate tax liability for each tax unit in the sample using an estate tax calculator combined with age-specific mortality rates.

    Aging and extrapolation

    For 2005 to 2024, we age the data based on CBO forecasts and projections for growth in various types of income, CBO and JCT baseline revenue projections, IRS estimates of future growth in the number of tax returns, JCT estimates of the distribution of tax units by income, and Census Bureau data on the size and age composition of the population. We use actual data for 2005–10 where they are available.[2]

    A two-step process produces a representative sample of the filing and nonfiling population in years beyond 2004. We first inflate the dollar amounts of income, adjustments, deductions, and credits on each record by their appropriate forecasted per capita growth rates. We use CBO’s forecast for per capita growth in major income sources such as wages, capital gains, and nonwage income (interest, dividends, Social Security benefits, and others). We assume that most other items grow at CBO’s projected growth rate for per capita personal income. Second, we use a linear programming algorithm to adjust the weights on each record so the major income items, adjustments, and deductions match aggregate targets. We also attempt to adjust the overall distribution of income to match published information from the SOI division of the IRS for 2004 through 2010 and published estimates of the 2011 and 2012 distributions from JCT. We extrapolate recent trends to obtain projected distributions for years beyond 2012 and modify those distributions to hit CBO's published forecasts for baseline individual income tax revenue.
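    The two-step process described above can be sketched in a few lines of Python. This is a minimal illustration, not the model's code: the record layout, the growth factors, the wage target, and every function name are hypothetical. The second step here is deliberately simplified to one uniform weight factor that hits a single target; the model itself uses a linear programming algorithm, which can adjust weights record by record to match many targets at once.

```python
# Illustrative two-step aging sketch. All names and numbers are hypothetical.

def inflate_amounts(amounts, growth, default_item="personal_income"):
    """Step 1: scale each dollar amount by its per capita growth factor.

    Items without their own forecast fall back to the personal income factor.
    """
    return {
        item: value * growth.get(item, growth[default_item])
        for item, value in amounts.items()
    }


def weighted_total(records, item):
    """Weighted aggregate of one item across all records."""
    return sum(r["weight"] * r["amounts"].get(item, 0.0) for r in records)


def scale_weights_to_target(records, item, target):
    """Simplified stand-in for step 2: one uniform factor for one target.

    A linear program would instead choose record-level weight changes that
    satisfy many aggregate targets simultaneously.
    """
    factor = target / weighted_total(records, item)
    return [
        {"weight": r["weight"] * factor, "amounts": r["amounts"]}
        for r in records
    ]


records = [
    {"weight": 1000.0,
     "amounts": {"wages": 40000.0, "capital_gains": 0.0, "alimony": 0.0}},
    {"weight": 500.0,
     "amounts": {"wages": 90000.0, "capital_gains": 5000.0, "alimony": 2000.0}},
]

# Hypothetical per capita growth factors between the base year and a target year.
growth = {"wages": 1.03, "capital_gains": 1.10, "personal_income": 1.04}

aged = [
    {"weight": r["weight"], "amounts": inflate_amounts(r["amounts"], growth)}
    for r in records
]

wage_target = 90_000_000.0  # hypothetical aggregate wage target
reweighted = scale_weights_to_target(aged, "wages", wage_target)
```

    With a single uniform factor, the wage total matches its target but other aggregates move by the same proportion, which is why a method that handles multiple constraints is needed in practice.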

    Individual income tax calculator

    Based on the extrapolated dataset, we can simulate policy options using a detailed tax calculator that captures most features of the federal individual income tax system, including the alternative minimum tax. The model's current-law baseline reflects major income tax legislation enacted through the beginning of 2013, including the American Taxpayer Relief Act of 2012 (ATRA), which was signed into law in January 2013. ATRA made permanent most of the provisions enacted in the Economic Growth and Tax Relief Reconciliation Act of 2001 and the Jobs and Growth Tax Relief Reconciliation Act of 2003, permanently patched the alternative minimum tax, extended for five years the enhancements to individual income tax credits originally enacted in the 2009 stimulus legislation, and temporarily extended certain other tax provisions.

    In our distribution tables, we assume that the burden of the individual income tax falls on the payer. CBO, JCT, and Treasury use the same assumption.

    Payroll tax calculator

    Using the extrapolated dataset, we also calculate federal payroll taxes for Social Security and Medicare. One complication is that for married couples, our tax return data report combined earnings, whereas payroll taxes are based on individual earnings. The distinction matters because the earnings subject to the Social Security portion of payroll taxes are capped for each worker; the cap is $113,700 for 2013 and is indexed annually. For married couples, we therefore assign earnings to each spouse using the split in wages observed on the CPS record to which the PUF record is matched.
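    A short sketch shows why the split of a couple's wages matters. The $113,700 cap comes from the text; the 6.2 percent Social Security rate and 1.45 percent Medicare rate (each paid by both employee and employer) are the standard statutory rates and are assumptions here, as are the function names. The sketch ignores self-employment tax and the Additional Medicare Tax on high earners.

```python
# Illustrative payroll tax sketch for wage earners in 2013.
SS_WAGE_CAP_2013 = 113_700.0  # cap stated in the text
SS_RATE = 0.062               # assumed rate, paid by employee and by employer
MEDICARE_RATE = 0.0145        # assumed rate, paid by employee and by employer


def payroll_tax(earnings):
    """Employee share, employer share, and total for one worker."""
    ss_base = min(earnings, SS_WAGE_CAP_2013)  # Social Security applies up to the cap
    one_side = SS_RATE * ss_base + MEDICARE_RATE * earnings  # Medicare is uncapped
    return {"employee": one_side, "employer": one_side, "total": 2 * one_side}


def couple_payroll_tax(combined_wages, primary_share):
    """Total payroll tax for a couple, given the primary earner's wage share.

    In the model the share would come from the matched CPS record; here it
    is simply passed in.
    """
    primary = combined_wages * primary_share
    secondary = combined_wages - primary
    return payroll_tax(primary)["total"] + payroll_tax(secondary)["total"]


# Same combined wages, two different splits.
even_split = couple_payroll_tax(200_000.0, 0.5)     # both spouses below the cap
single_earner = couple_payroll_tax(200_000.0, 1.0)  # one spouse above the cap
```

    Under these assumptions the evenly split couple owes about $30,600 in combined employee and employer payroll taxes, while the single-earner couple owes about $19,899, because only the first $113,700 of one worker's earnings is subject to the Social Security portion.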

    In our distribution tables, we assume that the worker bears the burden of both the employer and employee portions of payroll taxes. This premise is widely accepted among economists. CBO, JCT, and Treasury make the same assumption for their distribution tables.

    Assigning corporate tax burden to individuals

    Although firms pay the corporate income tax, the economic incidence of the tax falls on individuals. The model therefore distributes the burden of the tax to individuals. The incidence of the corporate tax, however, is an unsettled theoretical issue. The tax could be borne by the owners of corporate stock or passed on to labor in the form of lower real wages, to consumers in the form of higher prices, or to the owners of some or all capital in the form of lower real rates of return.

    In September 2012, we updated the assumptions we use to distribute the corporate income tax: we estimate that 60 percent is borne by shareholders, 20 percent by all capital owners, and 20 percent by labor. Based on our review of research on the issue, we do not assign any of the burden to consumers. Previously, we assumed that the entire burden fell on all owners of capital. Our current assumptions are similar to those now made by CBO and Treasury. The JCT does not distribute the corporate tax.
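    The 60/20/20 split can be illustrated with a small allocation routine. The incidence shares come from the text; the choice of allocation bases (each unit's share of corporate equity income, of all capital income, and of labor income) and all field names are assumptions made for this sketch, not a description of the model's internals.

```python
# Illustrative corporate tax incidence sketch. Shares are from the text;
# the allocation bases and field names are hypothetical.
INCIDENCE_SHARES = {"shareholders": 0.60, "all_capital": 0.20, "labor": 0.20}
BASE_FOR_GROUP = {
    "shareholders": "equity_income",
    "all_capital": "capital_income",
    "labor": "labor_income",
}


def distribute_corporate_tax(total_tax, units):
    """Return the corporate tax burden assigned to each tax unit.

    Each incidence share is spread across units in proportion to their
    holdings of the corresponding base. Assumes every weighted base total
    is positive.
    """
    burdens = [0.0] * len(units)
    for group, share in INCIDENCE_SHARES.items():
        base = BASE_FOR_GROUP[group]
        total_base = sum(u["weight"] * u[base] for u in units)
        for i, u in enumerate(units):
            burdens[i] += total_tax * share * u[base] / total_base
    return burdens


units = [
    {"weight": 100.0, "equity_income": 0.0,
     "capital_income": 1000.0, "labor_income": 50000.0},
    {"weight": 10.0, "equity_income": 20000.0,
     "capital_income": 30000.0, "labor_income": 100000.0},
]

total_corporate_tax = 1_000_000.0  # hypothetical aggregate liability
burdens = distribute_corporate_tax(total_corporate_tax, units)

# The weighted burdens add back up to the aggregate liability.
allocated = sum(u["weight"] * b for u, b in zip(units, burdens))
```

    In this toy example, the unit with no corporate equity still bears part of the tax through its capital and labor income, while the unit holding all the equity bears most of it.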

    We rely on CBO for our projections of baseline corporate tax liability and, when available, on JCT estimates of changes in corporate tax liability due to tax proposals.

    Estate tax

    Because the income tax data in our model contain no direct information about wealth holdings, we rely on information from the Survey of Consumer Finances (SCF) to develop imputations of assets and liabilities. Specifically, we impute asset items and liabilities to each record in the income-tax file based on regressions of those wealth components against explanatory variables that exist in both the SCF and SOI datasets. To mitigate the problem of the SCF’s small sample size (it contains fewer than 5,000 observations), we pool data from the 2001 and 2004 surveys. In addition to roughly doubling the sample size, combining data from two years smooths out some temporal variation in asset values. We then calibrate the imputed number of individuals owning each type of asset (and liability) and their aggregate values to match SCF totals, augmented by the net worth of the Forbes 400.[3] We further adjust the imputed distribution of each asset and liability by income class to more closely resemble the distribution reported in the SCF.
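    The general idea of regression-based imputation followed by calibration can be sketched as follows. This is a heavily simplified illustration under stated assumptions: it uses a single explanatory variable (income) and ordinary least squares, whereas the text describes regressions on multiple variables common to both datasets plus calibration of ownership counts. The data, the aggregate target, and the function names are all hypothetical.

```python
# Illustrative imputation sketch: fit on a donor survey, predict on tax
# records, then calibrate to an aggregate target. All data are made up.

def fit_line(xs, ys):
    """Ordinary least squares for one explanatory variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Donor sample standing in for the wealth survey.
survey_income = [20000.0, 50000.0, 100000.0, 200000.0]
survey_assets = [10000.0, 40000.0, 90000.0, 190000.0]
slope, intercept = fit_line(survey_income, survey_assets)

# Recipient records standing in for the income tax file.
tax_records = [
    {"weight": 100.0, "income": 30000.0},
    {"weight": 50.0, "income": 150000.0},
]

# Impute, flooring predictions at zero.
for rec in tax_records:
    rec["assets"] = max(0.0, intercept + slope * rec["income"])

# Calibrate so the weighted aggregate matches a hypothetical survey total.
aggregate_target = 9_900_000.0
imputed_total = sum(r["weight"] * r["assets"] for r in tax_records)
calibration = aggregate_target / imputed_total
for rec in tax_records:
    rec["assets"] *= calibration

calibrated_total = sum(r["weight"] * r["assets"] for r in tax_records)
```

    A uniform calibration factor preserves the relative distribution produced by the regression; adjusting the distribution by income class, as the text describes, would require separate factors for each class.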

    We assign values for most estate tax deductions and credits based on averages calculated from SOI estate tax data. Our estate tax calculator then determines estate tax liability for each record in the database based on values for gross estate, deductions, and credits, and the relevant estate tax rates and brackets. Finally, we calculate each record’s expected value of gross estate and net estate tax liability by multiplying each amount by the record’s age-specific mortality rate. We employ a linear programming algorithm to reweight the records to ensure that our baseline estimates of the distribution of and aggregate values for the gross estate and its components match the most recent published estate tax data from SOI.[4]
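    The expected-value step amounts to weighting each record's liability at death by the probability of death in the year. The sketch below assumes a flat rate above a single exemption, which is a simplification of the actual rate schedule; the exemption, rate, mortality rate, and all names are hypothetical values chosen for illustration.

```python
# Illustrative expected estate tax sketch. Parameters are hypothetical and
# the flat-rate schedule is a simplification of actual rates and brackets.

def estate_tax(gross_estate, deductions, exemption, rate, credits=0.0):
    """Liability if the person died this year, under a flat-rate schedule."""
    taxable = max(0.0, gross_estate - deductions - exemption)
    return max(0.0, rate * taxable - credits)


def expected_estate_values(record, exemption, rate):
    """Multiply gross estate and tax liability by the mortality rate."""
    liability = estate_tax(
        record["gross_estate"], record["deductions"], exemption, rate
    )
    q = record["mortality_rate"]  # age-specific probability of death this year
    return {
        "expected_gross_estate": q * record["gross_estate"],
        "expected_tax": q * liability,
    }


record = {
    "gross_estate": 8_000_000.0,
    "deductions": 1_000_000.0,
    "mortality_rate": 0.02,
}

result = expected_estate_values(record, exemption=5_000_000.0, rate=0.40)
```

    Here the liability at death would be $800,000, so a 2 percent mortality rate gives an expected liability of about $16,000 for the year.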

    In our distribution tables, we assume the estate tax is borne by decedents (the same assumption that Treasury used when it distributed the burden of estate taxes). Neither CBO nor JCT includes the estate tax in its incidence analysis.

    Additional features

    In recent years, the Tax Policy Center (TPC) has updated the tax model's estate tax module to incorporate the latest data on estate tax filers from SOI and the retirement savings module to be consistent with 2004 data. We also expanded the retirement module to allow us to model the revenue and distributional implications of implementing automatic enrollment in Individual Retirement Accounts and 401(k) retirement plans. In the latest version of the tax model, we improved our methodology for measuring the present value of the tax savings from tax-deferred retirement accounts. The latest version also includes improved imputations of mortgage interest on second homes and of deductible interest on home equity loans. The model also contains imputations for all itemizable deductions, including charitable contributions, medical expenses, and home mortgage interest, for “nonitemizers” (people who claim only the standard deduction on their tax return). These imputations allow us to model the distribution and revenue implications of proposals to replace certain deductions with credits that would be available to all taxpayers regardless of itemization status.

    The latest version of our microsimulation model also includes a completely overhauled and expanded education module. The education module incorporates the most recent data available from the Department of Education—including detailed demographic and financial characteristics of postsecondary students from the 2008 National Postsecondary Student Aid Study, and actual and projected numbers of Pell grant recipients and awards—and education tax credit and tuition and fees deduction data publicly released by the IRS. We use the data to impute student status, characteristics, and education expenditures onto the tax model database. The module allows us to analyze both current tax incentives for education—the American Opportunity Tax Credit, the HOPE and Lifetime Learning tax credits, and the tuition and fees deduction—and the Pell grant program. We also use the module to examine revenue and distributional implications from modifying these tax incentives or Pell grant rules.

    We have also made several improvements that allow us to model various indirect taxes, including certain excise taxes, broad-based consumption taxes (e.g., a value-added tax), and environmental taxes. Using data from the Consumer Expenditure Survey, the Medical Expenditure Panel Survey, and the American Housing Survey, we produce detailed estimates of the consumption expenditures of individuals in the tax model database. We also use the Urban Institute’s DYNASIM model to estimate the amount of future consumption financed out of current wealth, which allows us to analyze transitional issues for options that move the tax system from an income base to a consumption base. This work gives TPC the ability to estimate the distributional impact of hybrid income-consumption tax systems and other comprehensive reform options, such as the plans endorsed by the President’s Advisory Panel for Federal Tax Reform in 2005 and, more recently, by the Bipartisan Policy Center’s Debt Reduction Task Force.

    In 2013, TPC developed an income concept called expanded cash income (ECI) for the purpose of distributional analysis. We construct ECI as a broad measure of pretax income, and we use it both to rank tax units in our distribution tables and to calculate effective tax rates. We define ECI as adjusted gross income plus above-the-line adjustments (e.g., Individual Retirement Account deductions, student loan interest, and self-employed health insurance deductions), employer-paid health insurance and other nontaxable fringe benefits, employee and employer contributions to tax-deferred retirement savings plans, tax-exempt interest, nontaxable Social Security benefits, nontaxable pension and retirement income, accruals within defined benefit pension plans, inside buildup within defined contribution retirement accounts, cash and cash-like (e.g., Supplemental Nutrition Assistance Program) transfer income, the employer’s share of payroll taxes, and imputed corporate income tax liability.
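    Because ECI is defined as a sum of components, it can be written down directly. The component list below follows the definition in the text, but the field names, the sample tax unit, and the helper functions are hypothetical, and the effective tax rate is computed simply as tax divided by ECI.

```python
# Illustrative ECI sketch. Field names and sample values are hypothetical.
ECI_ADDITIONS = (
    "above_the_line_adjustments",
    "employer_health_and_fringe_benefits",
    "retirement_plan_contributions",        # employee and employer
    "tax_exempt_interest",
    "nontaxable_social_security",
    "nontaxable_pension_income",
    "defined_benefit_accruals",
    "defined_contribution_inside_buildup",
    "cash_and_cash_like_transfers",
    "employer_payroll_tax",
    "imputed_corporate_tax",
)


def expanded_cash_income(unit):
    """Adjusted gross income plus every ECI addition present on the unit."""
    return unit["agi"] + sum(unit.get(name, 0.0) for name in ECI_ADDITIONS)


def effective_tax_rate(tax, eci):
    """Tax as a share of ECI; undefined (None) when ECI is not positive."""
    return tax / eci if eci > 0 else None


unit = {
    "agi": 60000.0,
    "above_the_line_adjustments": 2000.0,
    "employer_health_and_fringe_benefits": 8000.0,
    "employer_payroll_tax": 4590.0,
}

eci = expanded_cash_income(unit)
rate = effective_tax_rate(9000.0, eci)
```

    For this sample unit, ECI is $74,590, so a $9,000 tax burden corresponds to an effective rate of roughly 12 percent, lower than the 15 percent it would represent relative to adjusted gross income alone.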

    Finally, using data from the Medical Expenditure Panel Survey, TPC collaborated with Urban’s Health Policy Center to impute details of health insurance eligibility, coverage, and medical expenses in the tax model database. With a modification to the tax calculator, the imputed information allows us to analyze policies that change the tax treatment of health insurance, such as repealing or limiting the currently unlimited exclusion of employer-provided health insurance. We have revised the health module to account for provisions in the Patient Protection and Affordable Care Act and modifications in subsequent legislation, such as the Medicare and Medicaid Extenders Act of 2010.



    1. A tax unit is an individual, or a married couple who file a tax return jointly, along with all dependents of that individual or married couple. A tax unit therefore differs from a family or a household in certain situations. For example, two people cohabiting would be considered one household, but if they were not legally married they would file separate tax returns and thus be considered two tax units.

    2. In our latest update, we also use the available preliminary data for the 2011 tax year.

    3. The SCF specifically omits data on the Forbes 400. We need to add them to the file to account for the substantial share of assets that they own. For 2004, we add approximately $1 trillion in net worth to the $67 trillion implied by the SCF.

    4. For a detailed description of TPC's estate tax methodology, see “Back from the Grave: Revenue and Distributional Effects of Reforming the Federal Estate Tax.”
