The text below is an excerpt from the complete document. Read the full report in PDF format.
Abstract
As part of the Reentry Housing Forum, "Reducing the Revolving Door of Incarceration and Homelessness in the District of Columbia," this paper presents the number of people who used jail only; shelter only; jail and shelter; or jail, shelter, and Fire and Emergency Medical Services (FEMS). It also reports the number with multiple spells in each system and the number with a mental illness disability. The counts cover people using the D.C. Jail between October 1, 2004 and March 31, 2008, public emergency shelters between October 1, 2005 and September 30, 2007, and FEMS between January 1 and August 31, 2008.
Introduction
Specific Data Used, Matching Techniques, and Limitations
The basic technique used to produce the results we report is matching: comparing information on people in one system with information on people in another system to see whether they are the same people. Cross-system data matching is at least as much an art as a technique, because of missing data, misspellings, different forms of the same name, and other issues. Usually about 75-80 percent of matches are clear and straightforward, but it is not unusual for 20-25 percent to require some degree of judgment to decide whether someone in one system is the same person who appears in another system. The reader may want to know the decision rules we followed for the matching we report; they are given below.
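The split between clear matches and matches that need judgment can be illustrated with a short sketch. This is not the procedure used for this report, whose decision rules are not reproduced in this excerpt. It assumes names arrive as single strings, uses Python's standard `difflib` similarity ratio as a stand-in for human judgment, and the function names and the 0.85 cutoff are hypothetical values chosen only for illustration.

```python
from difflib import SequenceMatcher

# Illustrative cutoff only; it is an assumption, not a value from the report.
REVIEW_THRESHOLD = 0.85


def normalize(name):
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def triage(name_a, name_b):
    """Sort a candidate pair into 'clear', 'review', or 'no_match'."""
    a, b = normalize(name_a), normalize(name_b)
    if a == b:
        return "clear"  # identical after normalization
    similarity = SequenceMatcher(None, a, b).ratio()
    # Near-identical spellings are set aside for a person to judge.
    return "review" if similarity >= REVIEW_THRESHOLD else "no_match"


pairs = [
    ("John Smith", "john  smith"),   # same name, different formatting
    ("Jon Smith", "John Smith"),     # probable misspelling
    ("Maria Lopez", "John Smith"),   # unrelated names
]
labels = [triage(a, b) for a, b in pairs]
```

A similarity score of this kind only flags candidates. Deciding whether a flagged pair is really one person still calls for the kind of judgment described above.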
Due to missing observations for many of the descriptive demographic variables in the data sets, most notably HMIS and FEMS, we were forced to merge on first and last names only, an inefficient and imperfect matching method. This posed two limitations. First, when a first or last name is missing or misspelled, we will likely not pick up the overlap across data sets. For FEMS, we were forced to disregard all records entered under "Jane Doe" or "John Doe," many of which could belong to homeless individuals unable or unwilling to give identification information. Second, merging on names also links different people who share the same first and last name if both appear in the data. This type of matching error is difficult or impossible to fix if the two data sets do not contain other identifiers (e.g., age, race) on which to match, and records for many people in the HMIS system are missing these identifiers. Given the amount of missing or inaccurate variables across the FEMS and HMIS data sets (DOC data were generally complete and accurate), manually sifting through all 200,000+ data points for questionable spellings and individuals with the same name was prohibitive in the time we had. For now, our analysis can be used only as a rough but useful estimate of the overlap.
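Both limitations can be seen in a minimal sketch of a name-only merge. This is an illustration under stated assumptions, not the code used for this report: the field names (`id`, `first`, `last`), the helper functions, and the sample records are all hypothetical, and each record is assumed to stand for one distinct person.

```python
from collections import defaultdict

# Placeholder identities that are dropped before matching.
PLACEHOLDER_NAMES = {("john", "doe"), ("jane", "doe")}


def name_key(record):
    """Return a normalized (first, last) key, or None if the record is unusable."""
    first = (record.get("first") or "").strip().lower()
    last = (record.get("last") or "").strip().lower()
    if not first or not last:
        return None  # a missing name cannot be matched
    if (first, last) in PLACEHOLDER_NAMES:
        return None  # "John Doe" / "Jane Doe" entries are disregarded
    return (first, last)


def index_by_name(records):
    """Group usable records by their normalized name key."""
    index = defaultdict(list)
    for rec in records:
        key = name_key(rec)
        if key is not None:
            index[key].append(rec)
    return index


def match_on_names(left, right):
    """List names present in both data sets, flagging names that are not unique."""
    left_idx, right_idx = index_by_name(left), index_by_name(right)
    matches = []
    for key in sorted(left_idx.keys() & right_idx.keys()):
        # More than one record under a name means the link may join different people.
        ambiguous = len(left_idx[key]) > 1 or len(right_idx[key]) > 1
        matches.append({"name": key, "ambiguous": ambiguous})
    return matches


jail = [
    {"id": "J1", "first": "Maria", "last": "Lopez"},
    {"id": "J2", "first": "John", "last": "Smith"},
    {"id": "J3", "first": "John", "last": "Smith"},  # different person, same name
    {"id": "J4", "first": "Jon", "last": "Reed"},    # misspelling of "John"
]
shelter = [
    {"id": "S1", "first": " maria ", "last": "LOPEZ"},
    {"id": "S2", "first": "John", "last": "Smith"},
    {"id": "S3", "first": "John", "last": "Reed"},
    {"id": "S4", "first": "John", "last": "Doe"},    # placeholder, excluded
    {"id": "S5", "first": None, "last": "Brown"},    # missing first name, excluded
]
results = match_on_names(jail, shelter)
```

In this sample, "Maria Lopez" links cleanly, "John Smith" links but is flagged because two jail records carry that name, and the misspelled "Jon Reed" is missed entirely. Without a further identifier such as age or race, the flagged link cannot be resolved from the names alone.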
The nonpartisan Urban Institute publishes studies, reports, and books on timely topics worthy of public consideration. The views expressed are those of the authors and should not be attributed to the Urban Institute, its trustees, or its funders.