Brief: Five Ethical Risks to Consider before Filling Missing Race and Ethnicity Data
Workshop Findings on the Ethics of Data Imputation and Related Methods
Megan Randall, Alena Stern, Yipeng Su

A growing set of methods from data science and statistics could fill critical gaps in race and ethnicity data by matching, imputing, or otherwise adding demographic and locational characteristics to existing datasets. As the potential for appending race and ethnicity variables grows, however, so does the risk of ethical violations and harm to Black, Indigenous, and other people of color. In November 2020, the Urban Institute’s Racial Equity Analytics Lab and Office of Technology and Data Science convened experts from the data science, government, racial justice, and data privacy fields to discuss the ethics of using advanced statistical methods to fill gaps in race and ethnicity data. This brief summarizes the five ethical risk areas that surfaced during the workshop: 1) excluding people and communities of color from ownership of their data and from decisions on research process and methods; 2) violating individual informed consent; 3) compromising individual privacy or confidentiality; 4) producing inaccurate estimates and misleading conclusions; and 5) generating data for purposes that harm people or communities of color.

Research Areas: Race and equity
Tags: Racial and ethnic disparities; Structural racism in research, data, and technology; Data and technology capacity of nonprofits; Racial Equity Analytics Lab
Policy Centers: Center on Nonprofits and Philanthropy; Office of Race and Equity Research
Research Methods: Research methods and data analytics