PROJECT: Applying Equity Awareness in Data Privacy Methods

On his first day in office, President Biden signed an executive order directing federal agencies and White House offices to examine barriers to racial equity and initiated several efforts to address inequitable policies that affect people and communities of color. One such barrier, a growing concern among researchers and public policymakers, is that statistical data privacy (or statistical disclosure control) methods, which give researchers and policymakers access to data while preserving participants’ privacy, often do not explicitly consider racial equity.

Although these methods—such as suppressing data, adding random noise under differential privacy, or generating synthetic data—try to balance the need for accurate information against privacy concerns, all have equity implications for different racial groups stemming from the utility-risk trade-off. If equity is not considered, researchers can create unintended harm by inequitably distributing either the privacy risks or the utility of the information obtained from the data.
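To make the utility side of this trade-off concrete, here is a minimal sketch of how noise of the same size can distort a small group's count far more than a large group's. It assumes the Laplace mechanism, one common way of adding noise under differential privacy, for which the expected absolute noise equals the noise scale. The group names, counts, and the privacy parameter `epsilon` are hypothetical values chosen only for illustration.

```python
def expected_relative_error(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Expected |noise| divided by the true count for Laplace noise.

    The Laplace mechanism adds noise with scale = sensitivity / epsilon,
    and the expected absolute value of that noise equals the scale.
    """
    scale = sensitivity / epsilon
    return scale / true_count


# Hypothetical group sizes (not real data).
group_counts = {"larger_group": 5000, "smaller_group": 25}
epsilon = 0.5  # hypothetical privacy-loss parameter

errors = {
    group: expected_relative_error(count, epsilon)
    for group, count in group_counts.items()
}

for group, err in errors.items():
    print(f"{group}: expected relative error = {err:.2%}")
```

Under these assumed numbers, both groups receive noise of the same expected size, but it is a much larger share of the smaller group's count, so the published statistic is less useful for that group.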

For instance, the number of Hispanic people working in meat-processing plants, which are mostly in rural communities, has been increasing for decades. But the share of the rural population identifying as Hispanic remains small. Most data privacy and confidentiality methods would remove Hispanic people’s data to protect their privacy. Doing so, however, erases evidence of the linguistic and cultural needs of these rural areas and workplaces, and of problems such as insufficient warnings and health procedures for preventing and combating the spread of COVID-19 or increased discrimination.
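The effect described here can be sketched with a simple small-cell suppression rule, one common form of data suppression. This is an illustration under stated assumptions, not the method any particular agency uses: the function name `suppress_small_cells`, the threshold of 10, and the counts are all hypothetical.

```python
from typing import Dict, Optional


def suppress_small_cells(counts: Dict[str, int], threshold: int) -> Dict[str, Optional[int]]:
    """Return published counts, withholding (None) any cell below the threshold."""
    return {
        group: (count if count >= threshold else None)
        for group, count in counts.items()
    }


# Hypothetical counts for one rural county (not real data).
county_counts = {"group_a": 4200, "group_b": 7}

published = suppress_small_cells(county_counts, threshold=10)
print(published)
```

In this sketch the smaller group's cell is withheld entirely, so anyone working only from the published table cannot see that the group is present in the county at all.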
