PROJECT: Quantitative Data Analysis

  • A linear regression predicts an outcome, y, as a function of observable predictors, X. The function need not be linear in the explanatory variables, because the elements of X can include squares, logs, or any other transformation of those variables. What matters is that the regression model is linear in b, the set of parameters to be estimated.
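As a minimal sketch of "linear in b, not in X", the example below fits y = b0 + b1*log(x) by ordinary least squares. The data, the coefficient values 2 and 3, and the helper name `fit_simple_ols` are made up for illustration; in practice a statistics library would normally do the fitting.

```python
import math


def fit_simple_ols(z, y):
    """Return (intercept, slope) from regressing y on a single predictor z."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of z, and cross-deviations of z and y.
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    slope = s_zy / s_zz
    intercept = y_bar - slope * z_bar
    return intercept, slope


# Hypothetical data: y is curved in x but exactly linear in log(x).
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [2.0 + 3.0 * math.log(xi) for xi in x]

# The transformation is applied to the predictor; b0 and b1 still enter linearly.
z = [math.log(xi) for xi in x]
b0, b1 = fit_simple_ols(z, y)
print(round(b0, 6), round(b1, 6))
```

Because the made-up data contain no noise, the fit should recover the coefficients used to generate y, even though a plot of y against x would be a curve.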

    If we write y = Xb + e, with e a random error that has mean zero conditional on X, then the conditional mean of y is Xb. If we assume that e is independently and identically distributed and uncorrelated with X (e is exogenous), then ordinary least squares is the best (minimum-variance) linear unbiased estimator of b, by virtue of the Gauss-Markov theorem.
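The unbiasedness claim can be traced in a few lines. This is a sketch under the assumptions stated above, plus the added assumption that X has full column rank; the notation X' for the transpose and σ² for the common error variance is introduced here and does not appear in the original text.

```latex
\begin{aligned}
\hat{b} &= (X'X)^{-1}X'y = b + (X'X)^{-1}X'e \\
E[\hat{b} \mid X] &= b + (X'X)^{-1}X'\,E[e \mid X] = b \\
\operatorname{Var}(\hat{b} \mid X) &= \sigma^{2}(X'X)^{-1}
\end{aligned}
```

The second line uses the mean-zero condition on e; the third uses the assumption that the errors share one variance and are uncorrelated with each other.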

    When e is not independently and identically distributed, we need a weighted least squares estimator or a robust inference method. When e is correlated with X (e is endogenous), ordinary least squares is biased and we need a quasi-experimental method to estimate b.
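One common robust inference method is White's heteroskedasticity-consistent (HC0) standard error. The sketch below compares it with the conventional standard error for the slope of a one-predictor regression. The data are made up so that the error size grows with x, and the function name `ols_with_standard_errors` is hypothetical; statistical packages typically offer this and related variants directly.

```python
import math


def ols_with_standard_errors(x, y):
    """Return (slope, conventional SE, HC0 robust SE) for y on x with intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    s_xx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / s_xx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional SE: assumes every error has the same variance.
    sigma2 = sum(r * r for r in resid) / (n - 2)
    se_conventional = math.sqrt(sigma2 / s_xx)

    # HC0 SE: each squared residual stands in for that observation's variance.
    meat = sum(d * d * r * r for d, r in zip(dx, resid))
    se_robust = math.sqrt(meat) / s_xx
    return slope, se_conventional, se_robust


# Hypothetical data: errors alternate in sign and grow in size with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
e = [0.1 * xi * (1 if i % 2 == 0 else -1) for i, xi in enumerate(x)]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]

slope, se_conventional, se_robust = ols_with_standard_errors(x, y)
print(slope, se_conventional, se_robust)
```

The slope estimate is the same under both approaches; only the standard error changes. HC0 can be noticeably off in samples this small, so the comparison is illustrative only.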
