The Open Health Services and Policy Journal

2008, 1: 12-18
Published online 2008 August 22. DOI: 10.2174/1874924000801010012
Publisher ID: TOHSPJ-1-12

Variation in Hispanic Self-Identification, Spanish Surname, and Geocoding: Implications for Ethnicity Data Collection

Debra P. Ritzwoller, Nikki Carroll, Bridget Gaglio, Anna Sukhanova, Fabio A. Almeida, Melanie A. Stopponi and Diego Osuna
Institute for Health Research, Kaiser Permanente Colorado, Denver, CO 80237-8066, USA.

ABSTRACT

This study examines the variation in surname analysis and geocoding, and their association with self-identified Hispanic ethnicity, in an HMO. We collected ethnicity data from three studies and employed Spanish surname software and census tract-level geocoding to create proxies for Hispanic ethnicity. We computed sensitivity and specificity, and estimated multivariate logistic regression models to examine the variation in the likelihood of a match between self-identified Hispanic ethnicity and Spanish surname. Sensitivity and specificity with respect to surname varied across the three studies, ranging from 57% to 91% and from 89% to 96%, respectively. Relative to self-report, the sensitivity of the census tract measure of Hispanic density varied from 5% to 15%. Multivariate models suggest that the likelihood of a match between self-identified Hispanic ethnicity and Spanish surname was not associated with age or gender. Self-identified Hispanics living in neighborhoods with the highest density of Hispanics were less likely than those in more mixed neighborhoods to have a Spanish surname. Applying the Spanish surname software only to census tracts with a high density of Hispanic residents may not always improve the likelihood of correctly identifying Hispanic subjects.
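As a reading aid, the sensitivity and specificity reported above can be illustrated with a minimal Python sketch that treats self-report as the reference standard and a binary proxy (for example, a Spanish surname flag) as the test. The function name, variable names, and the ten-person toy data set are hypothetical and invented for illustration; they are not the study's data, software, or results.

```python
def sensitivity_specificity(self_report, proxy):
    """Return (sensitivity, specificity) of a binary proxy against self-report.

    self_report: iterable of bools, True if the person self-identifies as Hispanic.
    proxy:       iterable of bools, True if the proxy flags the person as Hispanic.
    """
    pairs = list(zip(self_report, proxy))
    tp = sum(1 for s, p in pairs if s and p)          # true positives
    fn = sum(1 for s, p in pairs if s and not p)      # false negatives
    tn = sum(1 for s, p in pairs if not s and not p)  # true negatives
    fp = sum(1 for s, p in pairs if not s and p)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one self-reported positive and one negative")
    sensitivity = tp / (tp + fn)  # share of self-identified Hispanics the proxy flags
    specificity = tn / (tn + fp)  # share of non-Hispanics the proxy does not flag
    return sensitivity, specificity


# Toy data, invented for illustration only (NOT from the study).
self_report = [True, True, True, True, False, False, False, False, False, False]
surname_flag = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(self_report, surname_flag)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy example the proxy misses one of four self-identified Hispanics and wrongly flags one of six non-Hispanics, which is the kind of mismatch the two measures quantify.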