Buolamwini & Gebru's (2018) Gender Shades study became one of the most influential papers in AI ethics. They audited three commercial facial recognition systems (Microsoft, IBM, Face++) and found:
- Darker-skinned women: error rates up to 34.7%
- Lighter-skinned men: error rates of 0.8% or lower
- The disparity was intersectional: neither gender nor skin tone alone explained the gap; it was the combination that mattered most
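The core methodological move is disaggregated evaluation: computing error rates per intersectional subgroup rather than per single attribute. The sketch below illustrates that idea in Python; the data, field names, and functions are hypothetical examples, not the paper's actual benchmark or code.

```python
# Illustrative sketch of an intersectional audit: error rates are computed
# per subgroup (gender x skin tone), not just per single attribute.
# All records below are made up for demonstration.

def error_rate(records):
    """Fraction of records whose predicted label is wrong."""
    errors = sum(1 for r in records if r["pred"] != r["true"])
    return errors / len(records)

def disaggregate(records, keys):
    """Group records by the given attribute keys; report each group's error rate."""
    groups = {}
    for r in records:
        groups.setdefault(tuple(r[k] for k in keys), []).append(r)
    return {g: error_rate(rs) for g, rs in groups.items()}

# Hypothetical audit records: subject attributes plus true and predicted labels.
records = [
    {"gender": "F", "skin": "darker",  "true": "F", "pred": "M"},
    {"gender": "F", "skin": "darker",  "true": "F", "pred": "M"},
    {"gender": "F", "skin": "darker",  "true": "F", "pred": "F"},
    {"gender": "F", "skin": "lighter", "true": "F", "pred": "F"},
    {"gender": "M", "skin": "darker",  "true": "M", "pred": "M"},
    {"gender": "M", "skin": "lighter", "true": "M", "pred": "M"},
]

# A single-attribute audit averages away the gap that the
# intersectional breakdown exposes.
by_gender = disaggregate(records, ["gender"])
by_both = disaggregate(records, ["gender", "skin"])
print(by_gender[("F",)])           # → 0.5
print(by_both[("F", "darker")])    # highest error rate sits at the intersection
```

In this toy data, women overall show a 50% error rate, but darker-skinned women show roughly 67% while every other subgroup is at 0%, which is exactly the kind of pattern a gender-only audit would hide.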
Impact
This paper catalyzed the AI fairness movement. It led to:
- IBM, Microsoft, and Amazon all improving their systems or pausing sales to law enforcement
- Multiple U.S. cities banning facial recognition technology
- The EU AI Act classifying remote biometric identification systems as "high-risk"
Source
Buolamwini, J., & Gebru, T. (2018). Gender Shades: Intersectional accuracy disparities in commercial gender classification. Proceedings of Machine Learning Research, 81, 77-91.