Descriptive Data Analysis

Descriptive techniques typically include tables of means and quantiles, measures of dispersion such as the variance or standard deviation, and cross-tabulations ("crosstabs"), which can be used to examine many different hypotheses. Those hypotheses often concern observed differences across subgroups. Specialized descriptive techniques are used to measure segregation, discrimination, and inequality; discrimination is often measured using audit studies or decomposition methods. Greater segregation by type or inequality of outcomes need not be wholly good or bad in itself, but it is often considered a marker of unfair social processes. Accurate measurement of its level across time and space is a prerequisite to understanding those processes.
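
The basic summaries named at the start of this section (a mean, quantiles, and measures of dispersion) can be sketched with Python's standard library. The weekly-hours figures below are invented purely for illustration, and the variance and standard deviation shown are the sample versions, which is an assumption about which definition is wanted.

```python
import statistics

# Hypothetical weekly hours worked for eight people (made-up data).
hours = [35, 40, 40, 42, 38, 45, 50, 30]

mean_hours = statistics.mean(hours)            # central tendency
quartiles = statistics.quantiles(hours, n=4)   # three cut points: Q1, median, Q3
variance = statistics.variance(hours)          # sample variance (divides by n - 1)
std_dev = statistics.stdev(hours)              # sample standard deviation

print("mean:", mean_hours)
print("quartiles:", quartiles)
print("variance:", round(variance, 3))
print("standard deviation:", round(std_dev, 3))
```

Computing the same summaries separately for each subgroup gives the kind of table of means and dispersion that descriptive analysis usually starts from.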

A table of means by subgroup can show important differences across subgroups, and this kind of descriptive analysis often invites causal inference. When we see a gap in earnings, for example, we naturally want to infer the reasons the gap exists. But this enters the province of measuring impacts, where different techniques are needed. Means also often differ simply because of random variation, so statistical inference is needed to determine whether an observed difference could have arisen by chance alone.
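
One way to ask whether a gap in subgroup means could stem from chance is a permutation test, sketched below. This is only one of several possible inferential techniques, chosen here because it needs nothing beyond the standard library. The earnings figures, the group names, and the helper `permutation_p_value` are all hypothetical.

```python
import random
import statistics

# Hypothetical earnings (in thousands) for two made-up subgroups.
earnings = {
    "group_a": [42, 51, 38, 47, 55, 40, 49, 44],
    "group_b": [39, 45, 36, 50, 41, 37, 43, 46],
}

# Table of means by subgroup, and the observed gap between them.
means = {group: statistics.mean(values) for group, values in earnings.items()}
observed_gap = means["group_a"] - means["group_b"]


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Share of random relabelings whose absolute mean gap is at least
    as large as the observed absolute gap (a two-sided test)."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        gap = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if abs(gap) >= observed:
            hits += 1
    return hits / n_perm


p_value = permutation_p_value(earnings["group_a"], earnings["group_b"])
print("means:", means)
print("observed gap:", observed_gap)
print("permutation p-value:", p_value)
```

A large p-value would indicate that gaps of this size arise often under random relabeling, so the observed difference alone is weak evidence of a real difference between the groups. Even a small p-value says nothing about why the gap exists.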

A crosstab, or two-way tabulation, shows the proportion of units with each combination of values of two variables; these are the cell proportions. For example, we might ask what proportion of the population has a high school degree and receives food or cash assistance, which requires a crosstab of education against receipt of assistance. We might then examine row proportions, the fraction of each education group that receives assistance, perhaps finding that receipt of assistance is sharply lower at higher education levels.
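
As a rough illustration of cell and row proportions, the sketch below builds a crosstab of education against receipt of assistance from invented records, with education treated as the row variable. The counts and category labels are assumptions made up for the example, not real data.

```python
from collections import Counter

# Hypothetical records: (education level, receives assistance?). Made-up counts.
records = (
    [("less than HS", True)] * 30 + [("less than HS", False)] * 70
    + [("HS degree", True)] * 40 + [("HS degree", False)] * 260
    + [("college", True)] * 30 + [("college", False)] * 570
)

counts = Counter(records)  # cell counts of the two-way table
n = len(records)           # total population in the example

# Cell proportions: each cell as a share of the whole population.
cell_props = {cell: count / n for cell, count in counts.items()}

# Row proportions: each cell as a share of its education group.
row_totals = Counter(education for education, _ in records)
row_props = {
    (education, receives): count / row_totals[education]
    for (education, receives), count in counts.items()
}

print("HS degree and receiving assistance:", cell_props[("HS degree", True)])
for education in ("less than HS", "HS degree", "college"):
    print(education, "share receiving:", round(row_props[(education, True)], 3))
```

In this invented example the share receiving assistance falls from 0.30 to about 0.13 to 0.05 as education rises. With pandas, `pandas.crosstab` with `normalize="all"` or `normalize="index"` produces the same two kinds of table.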

We could also look at column proportions, the fraction of recipients at each level of education, but these condition on the outcome rather than on the presumed cause, which is the opposite direction from any causal effect. We might see a surprisingly high number or proportion of recipients with a college education, but this might simply reflect there being many more college graduates than people with less than a high school degree in the population (the distribution of education in the total population, without regard to receipt of assistance).
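
The point about column proportions can be seen in a small sketch. The cell counts below are invented so that college graduates are a large share of the population but rarely receive assistance; all names and numbers are hypothetical.

```python
# Hypothetical cell counts: (education level, receives assistance?) -> people.
counts = {
    ("less than HS", True): 30, ("less than HS", False): 70,
    ("HS degree", True): 40, ("HS degree", False): 260,
    ("college", True): 30, ("college", False): 570,
}
n = sum(counts.values())

# Column totals: number of recipients and of non-recipients.
col_totals = {}
for (education, receives), count in counts.items():
    col_totals[receives] = col_totals.get(receives, 0) + count

# Column proportions: education mix within recipients and within non-recipients.
col_props = {
    (education, receives): count / col_totals[receives]
    for (education, receives), count in counts.items()
}

# Distribution of education in the whole population, ignoring assistance.
edu_totals = {}
for (education, _), count in counts.items():
    edu_totals[education] = edu_totals.get(education, 0) + count
edu_share = {education: total / n for education, total in edu_totals.items()}

for education in ("less than HS", "HS degree", "college"):
    print(education,
          "share of recipients:", round(col_props[(education, True)], 3),
          "share of population:", round(edu_share[education], 3))
```

In these made-up numbers, college graduates account for 30 percent of recipients, the same share as people with less than a high school degree, even though only 5 percent of college graduates receive assistance compared with 30 percent of the least-educated group. The explanation is that college graduates are 60 percent of this invented population and the least-educated group only 10 percent.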