A number of findings in machine learning have raised questions about what it means for automated scoring or decision-making systems to be fair. One focal point of this discussion is whether such systems ought to satisfy classification parity (which requires parity in accuracy across groups defined by protected attributes) or calibration (which requires similar predictions to have similar meanings across groups defined by protected attributes). Central to this discussion are impossibility results, due to Kleinberg et al. (2016), Chouldechova (2017), and Corbett-Davies et al. (2017), which show that classification parity and calibration are often incompatible. This paper argues that classification parity, calibration, and a newer measure called counterfactual fairness are unsatisfactory measures of fairness, offers a general diagnosis of why these measures fail, and sketches an alternative approach to understanding fairness in machine learning.