Project: Quantitative Data Analysis


Linear Regression

    A linear regression predicts an outcome, y, as a function of observable predictors, X. The function need not be linear in the explanatory variables themselves, because elements of X can include squares or logs of variables, or any other transformation. The key requirement is that the regression model is linear in b, the set of parameters to be estimated.
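A minimal numpy sketch of this point (the specific transformations and coefficient values are illustrative): the design matrix below mixes x, its square, and its log, yet the model stays linear in the parameters b, so an ordinary least-squares solver recovers them.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 5.0, size=500)

# y = b0 + b1*x + b2*x^2 + b3*log(x) + e is nonlinear in x
# but linear in the parameters b, so least squares still applies.
X = np.column_stack([np.ones_like(x), x, x**2, np.log(x)])
b_true = np.array([1.0, 0.5, -0.2, 2.0])
y = X @ b_true + rng.normal(scale=0.05, size=x.size)

# np.linalg.lstsq minimizes ||y - Xb||^2 over b
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b_hat)  # close to b_true
```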

    If we write y = Xb + e, with e a random error that has mean zero, then the conditional mean of y is Xb. If we further assume that e is independently and identically distributed and uncorrelated with X (that is, e is exogenous), then ordinary least squares is the best (minimum-variance) linear unbiased estimator of b, by the Gauss-Markov theorem.
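A simulation sketch of the unbiasedness claim under these assumptions (sample size, coefficients, and replication count are illustrative): averaging the closed-form OLS estimate (X'X)^{-1}X'y over many draws of an exogenous, iid error should land close to the true b.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
b_true = np.array([2.0, -1.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Redraw the error many times, holding X fixed, and average the estimates.
estimates = []
for _ in range(2000):
    y = X @ b_true + rng.normal(size=n)        # e: iid, mean zero, independent of X
    b_hat = np.linalg.solve(X.T @ X, X.T @ y)  # closed form: (X'X)^{-1} X'y
    estimates.append(b_hat)

mean_b = np.mean(estimates, axis=0)
print(mean_b)  # close to b_true, illustrating unbiasedness
```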

    When e is not independently and identically distributed, we need a weighted least squares estimator or a robust inference method. When e is correlated with X (e is endogenous), ordinary least squares is biased, and we need a quasi-experimental method, such as instrumental variables, to estimate b.
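A sketch of the non-iid case (the error-variance pattern and weights are illustrative assumptions): when the error variance grows with x, weighted least squares downweights the noisy observations, and heteroskedasticity-robust (White-style sandwich) standard errors give valid inference for plain OLS.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0.5, 3.0, size=n)
X = np.column_stack([np.ones(n), x])
b_true = np.array([1.0, 2.0])

# Heteroskedastic errors: the variance grows with x, so e is not iid.
sigma = 0.5 * x
y = X @ b_true + rng.normal(scale=sigma)

# Weighted least squares with weights 1/variance: (X'WX)^{-1} X'Wy
w = 1.0 / sigma**2
Xw = X * w[:, None]
b_wls = np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Robust (sandwich) standard errors for plain OLS:
# (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b_ols
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * resid[:, None]**2)
robust_se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
print(b_wls, robust_se)
```

In practice a library such as statsmodels automates both steps; the closed forms above just make the mechanics of the estimators visible.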