Combining Racial Groups in Data Analysis Can Mask Important Differences in Communities
Surveys, datasets, and published research often lump together racial and ethnic groups, which can erase the experiences of certain communities. Combining groups with different experiences can mask how specific groups and communities are faring and, in turn, affect how government funds are distributed, how services are provided, and how groups are perceived.
Large surveys that collect data on race and ethnicity are used to disburse government funds and services in a number of ways. The US Department of Housing Urban Development, for instance, distributes millions of dollars annually to Native American tribes through the Indian Housing Block Grant. And statistics on race and ethnicity are used as evidence in employment discrimination lawsuits and to help determine whether banks are discriminating against people and communities of color.
Despite the potentially large effects these data can have, researchers don’t always disaggregate their analysis to more racial groups. Many point to small sample sizes as a limitation for including more race and ethnicity categories in their analysis, but efforts to gather more specific data and disaggregate available survey results are critical to creating better policy for everyone.
To illustrate how aggregating racial groups can mask important variation, we looked at the 2019 poverty rate across 139 detailed race categories in the Census Bureau’s annual American Community Survey (ACS). The ACS provides information that helps determine how more than $675 billion in government funds is distributed each year.
The official poverty rate in the United States stood at 10.5 percent in 2019, with significant variation across racial and ethnic groups. The primary question in the ACS concerning race includes 15 separate checkboxes, with space to print additional names or races for some options (a separate question refers to Hispanic or Latino origin).
Although the survey offers ample latitude for interviewees to respond with their race, researchers have a tendency to aggregate racial categories. People who identify as Asian or Pacific Islander (API), for example, are often combined in economic analyses.
This aggregation can mask variation within racial or ethnic categories. As an example, one analysis that used the ACS showed 11 percent of children in the API group are in poverty, relative to 18 percent of the overall population. But that estimate could understate the poverty rate among children who identify as Pacific lslanders and could overstate the poverty rate among children who identify as Asian, which itself is a broad grouping that encompasses many different communities with various experiences. Similar aggregating can be found across economic literature, including on education, immigration (PDF), and wealth.
We find similar variation for our estimates of the 2019 poverty rate within several aggregated racial groups. In the graph below, each dot represents an estimated poverty rate for the 139 separate detailed racial groups, lumped together in some of the more common aggregated categories. (Hispanic/Latino is not shown in our chart because it is asked in a separate question about ethnicity.)
Within the American Indian or Alaska Native category, poverty rates vary from 5.9 percent for the Aleut (an Indigenous community primarily living in Alaska) to 36.9 percent for those who identify as part of the Sioux Native American tribe (living primarily in Nebraska, North Dakota, and South Dakota). Within the Asian or Pacific Islander category, we find a range of poverty rates from 4.5 percent for those who identify as both Chinese and Japanese to 27.8 percent for those who identify as Mongolian. Even within the Pacific Islander category, we find a range of poverty rates from 6.3 percent for Fijians to 24.6 percent for people who identify as more than one Micronesian race.
Recognizing that aggregate categories can mask significant variation across groups could help improve service and benefit provision across the country. Researchers collecting and using data can take a few steps to better represent the variation in their data.
- Carefully inspect detailed race and ethnicity data to better understand the variation across all groups.
- Seek out organizations led by people and communities of color to learn from their expertise (PDF) and weave that expertise into the analysis.
- Make an effort to collect more data from smaller and underrepresented communities.
If researchers took a different perspective to collecting and reporting data on race and ethnicity, we could better understand the needs and experiences of different communities and ensure funding and services reach everyone who needs them.
(Pekic / Getty Images)