Project: Quantitative Data Analysis



    One serious threat to interpreting quantitative analysis is bias: estimates that are not right on average. A related problem is asymptotic bias, or inconsistency in the statistical sense: estimates that do not approach the right answer even as datasets become very large. Bias can arise from biased sampling, measurement error, or selection into treatment. The last of these, selection bias, can arise when participation in programs is not randomly assigned.

    For example, people with poor job prospects who sign up for a job search program might have poor labor market outcomes relative to nonparticipants because of their already weaker prospects, not because of the program. This is an example of negative selection. Or, motivated individuals who sign up for a job search program might have good labor market outcomes relative to nonparticipants because of their motivation alone, not because of the program. This is an example of positive selection.
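A small simulation can make the two directions of selection concrete. The sketch below is illustrative only: the function name, the `selection` parameter, and the data-generating process are assumptions made for this example, not something taken from any study. The program is given a true effect of zero, so any gap between participants and nonparticipants comes purely from who signs up.

```python
import random
import statistics


def simulate_naive_gap(selection, n=20000, true_effect=0.0, seed=0):
    """Return the naive participant-minus-nonparticipant outcome gap.

    selection < 0: people with weaker unobserved prospects enroll more often
                   (negative selection).
    selection > 0: people with stronger unobserved prospects (for example,
                   more motivation) enroll more often (positive selection).
    selection = 0: enrollment is unrelated to prospects, as if randomized.
    All names and numbers here are hypothetical.
    """
    rng = random.Random(seed)
    participants, nonparticipants = [], []
    for _ in range(n):
        prospects = rng.gauss(0, 1)  # unobserved by the analyst
        enrolls = selection * prospects + rng.gauss(0, 1) > 0
        # The outcome depends on prospects and, if nonzero, on the program.
        outcome = prospects + true_effect * enrolls + rng.gauss(0, 1)
        (participants if enrolls else nonparticipants).append(outcome)
    return statistics.mean(participants) - statistics.mean(nonparticipants)


negative_gap = simulate_naive_gap(selection=-1.0)
positive_gap = simulate_naive_gap(selection=1.0)
random_gap = simulate_naive_gap(selection=0.0)

print(f"negative selection: naive gap = {negative_gap:+.2f}")
print(f"positive selection: naive gap = {positive_gap:+.2f}")
print(f"random assignment:  naive gap = {random_gap:+.2f}")
```

Under these assumptions the naive gap is clearly negative with negative selection, clearly positive with positive selection, and close to zero only when enrollment is unrelated to prospects, even though the program does nothing in all three cases.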

    This concern is not merely hypothetical. Past writers convinced themselves that vitamin C might prevent cancer, but experimental trials found that the vitamin had no detectable causal effect on cancer rates; the earlier association appears to have arisen because people with lower cancer risk were more likely to consume more vitamin C. Even worse than finding an effect where there is none is finding an effect of the wrong sign: concluding that police cause crime because more police are deployed where there is more crime, or that food assistance causes hunger because recipients go hungrier than people without food assistance.

    If people who apply for food assistance have the same income, education, and other observed characteristics as nonapplicants but are worse off in ways we do not or cannot see, our estimates of the impact of food assistance on hunger suffer from negative selection. Past writers have struggled mightily with exactly this type of negative selection. They have found that conditioning extensively on observables, using quasi-experimental methods such as regression adjustment and propensity score methods, cannot eliminate it, but that panel regression and instrumental variables can.
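The contrast between conditioning on observables and using an instrument can be sketched with simulated data. Everything in the sketch is an assumption made for illustration: the variable names, the coefficients, and especially the instrument, a hypothetical randomly assigned outreach offer that raises participation but is assumed to have no direct effect on hunger. It is not the specification used in the literature described above, and it covers only the instrumental variables case, not panel regression.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    return cov(x, y) / cov(x, x)


def residualize(y, x):
    """Residuals from a simple regression of y on x (with an intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


rng = random.Random(42)
TRUE_EFFECT = -0.5  # hypothetical: assistance reduces hunger
income, offer, assistance, hunger = [], [], [], []
for _ in range(50000):
    need = rng.gauss(0, 1)  # unobserved hardship; never used in estimation
    x = rng.gauss(0, 1)     # observed income (standardized)
    z = 1.0 if rng.random() < 0.5 else 0.0  # hypothetical random outreach offer
    # Needier and lower-income people, and those offered outreach, participate more.
    d = 1.0 if 0.8 * need - 0.5 * x + z + rng.gauss(0, 1) > 0.5 else 0.0
    y = TRUE_EFFECT * d + need - 0.5 * x + rng.gauss(0, 1)
    income.append(x)
    offer.append(z)
    assistance.append(d)
    hunger.append(y)

# Naive comparison: for a 0/1 treatment this slope is the difference in means.
naive = slope(assistance, hunger)
# Regression adjustment for observed income, computed by partialling income
# out of both variables (the Frisch-Waugh-Lovell approach).
adjusted = slope(residualize(assistance, income), residualize(hunger, income))
# Instrumental variables (Wald) estimator using the outreach offer.
iv = cov(offer, hunger) / cov(offer, assistance)

print(f"true effect:         {TRUE_EFFECT:+.2f}")
print(f"naive difference:    {naive:+.2f}")
print(f"adjusted for income: {adjusted:+.2f}")
print(f"instrumental vars:   {iv:+.2f}")
```

In this setup the naive and income-adjusted estimates both come out positive, the wrong sign, because unobserved need drives both participation and hunger. The instrumental variables estimate lands near the true negative effect, but only because the simulated offer is random and affects hunger solely through participation; with real data those conditions have to be argued, not assumed.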
