Algorithmic Fairness Explained

Updated May 2026
Algorithmic fairness is the study of how to design AI systems that treat different groups of people equitably. It encompasses multiple mathematical definitions of fairness, including demographic parity, equalized odds, and predictive parity, each capturing a different intuition about what "fair" means. Several of these definitions are provably incompatible whenever groups differ in their base rates and predictions are imperfect, which describes most real-world scenarios. Implementing fairness in AI therefore requires explicit value judgments about which form of equity to prioritize for each application.

Why Fairness Needs a Mathematical Definition

When humans make decisions, we have an intuitive sense of fairness that incorporates context, intention, history, and competing considerations. A hiring manager can consider whether a candidate pool is representative, whether evaluation criteria might disadvantage certain groups, and whether past hiring patterns suggest systemic issues. AI systems have no such intuition. A machine learning model optimizes for whatever objective function it is given, and if that objective is pure prediction accuracy, it will happily exploit any pattern in the data, including patterns that reflect historical discrimination. Making an AI system "fair" requires translating the human concept of fairness into a mathematical constraint that can be incorporated into the model's optimization objective.

This translation is harder than it sounds because fairness is not a single concept. It encompasses at least three distinct intuitions. The first is equal treatment: similar individuals should receive similar outcomes regardless of their group membership. The second is equal outcomes: different groups should receive positive outcomes at similar rates. The third is equal accuracy: the system should make errors at similar rates across groups. These intuitions seem compatible in the abstract, but when applied to real data where groups have different base rates (different underlying rates of the outcome being predicted), they become mathematically incompatible.

The field of algorithmic fairness formalizes these intuitions, studies their relationships, proves impossibility results about when they can and cannot be simultaneously satisfied, and develops practical techniques for implementing each definition in working systems. It draws on computer science, statistics, philosophy, law, and social science, because fairness in AI is simultaneously a technical, ethical, legal, and social challenge.

The Major Fairness Definitions

Demographic parity (also called statistical parity or group fairness) requires that the model's positive prediction rate be equal across protected groups. If a hiring algorithm approves 30% of male applicants, it must also approve 30% of female applicants. Formally, for a binary classifier with output Y-hat and protected attribute A: P(Y-hat = 1 | A = a) = P(Y-hat = 1 | A = b) for all groups a and b. Demographic parity captures the intuition that outcomes should be distributed equally across groups, regardless of differences in input features. Its strength is simplicity and measurability. Its weakness is that it ignores whether the groups actually have different qualification rates, and it can be satisfied by random selection, which is fair in one sense but useless as a predictive system.
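As a minimal sketch of how this definition can be measured, the snippet below computes the positive prediction rate per group and the largest gap between groups. The function names and the toy data are illustrative assumptions, not part of any particular library.

```python
from collections import defaultdict

def selection_rates(y_pred, groups):
    """Return P(Y-hat = 1 | A = g) for each group g."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for pred, g in zip(y_pred, groups):
        totals[g] += 1
        positives[g] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_difference(y_pred, groups):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())

# Toy data, invented for illustration: 10 applicants per group.
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] + [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
groups = ["m"] * 10 + ["f"] * 10

print(selection_rates(y_pred, groups))
print(demographic_parity_difference(y_pred, groups))
```

Here the two groups are selected at 30% and 10%, so the gap is roughly 0.2 and demographic parity is violated. Note that the calculation never looks at true outcomes, which is exactly the weakness described above.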

Equalized odds requires that the model's true positive rate and false positive rate be equal across groups. Among people who actually qualify (actual positives), the model should identify them at equal rates across groups. Among people who do not qualify (actual negatives), the model should incorrectly approve them at equal rates across groups. Formally: P(Y-hat = 1 | Y = y, A = a) = P(Y-hat = 1 | Y = y, A = b) for all y, a, b. Equalized odds captures the intuition that the model's errors should not fall disproportionately on any group. A relaxed version, equal opportunity, requires only that the true positive rate be equal across groups, ignoring false positive rates.
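The check can be sketched as follows: compute the true positive rate and false positive rate for each group, then take the larger of the two gaps. The helper names and data are hypothetical, and the sketch assumes every group contains at least one actual positive and one actual negative.

```python
def rates_by_group(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)}.

    Assumes each group has at least one actual positive and one actual
    negative; otherwise a rate would be undefined (division by zero).
    """
    counts = {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if yt == 1:
            c["tp" if yp == 1 else "fn"] += 1
        else:
            c["fp" if yp == 1 else "tn"] += 1
    return {
        g: (c["tp"] / (c["tp"] + c["fn"]), c["fp"] / (c["fp"] + c["tn"]))
        for g, c in counts.items()
    }

def equalized_odds_gap(y_true, y_pred, groups):
    """Max of the TPR gap and the FPR gap across groups (0 means equalized odds)."""
    rates = rates_by_group(y_true, y_pred, groups)
    tprs = [tpr for tpr, _ in rates.values()]
    fprs = [fpr for _, fpr in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))

# Toy data, invented for illustration: 4 actual positives and 4 actual
# negatives in each group.
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["a"] * 8 + ["b"] * 8

print(rates_by_group(y_true, y_pred, groups))
print(equalized_odds_gap(y_true, y_pred, groups))
```

Equal opportunity corresponds to looking only at the true positive rate gap and ignoring the false positive rate gap.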

Predictive parity requires that the model's positive predictive value (precision) be equal across groups. Among people the model predicts will reoffend, the same fraction should actually reoffend regardless of their group. Formally: P(Y = 1 | Y-hat = 1, A = a) = P(Y = 1 | Y-hat = 1, A = b). Predictive parity captures the intuition that a positive prediction should mean the same thing for all groups. If the model says a person is high risk, that prediction should be equally reliable whether the person is Black or white, male or female.
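A minimal sketch of the corresponding measurement is shown below: among the people the model flags, what fraction actually had the outcome, per group. The function name and data are illustrative assumptions.

```python
def positive_predictive_values(y_true, y_pred, groups):
    """Return {group: P(Y = 1 | Y-hat = 1, A = group)}.

    A group with no positive predictions maps to None, because its
    precision is undefined.
    """
    hits, flagged = {}, {}
    for yt, yp, g in zip(y_true, y_pred, groups):
        flagged.setdefault(g, 0)
        hits.setdefault(g, 0)
        if yp == 1:
            flagged[g] += 1
            hits[g] += int(yt == 1)
    return {g: (hits[g] / flagged[g] if flagged[g] else None) for g in flagged}

# Toy data, invented for illustration: each group has 4 people flagged
# as high risk out of 6.
y_pred = [1, 1, 1, 1, 0, 0] + [1, 1, 1, 1, 0, 0]
y_true = [1, 1, 1, 0, 0, 1] + [1, 1, 0, 0, 0, 0]
groups = ["a"] * 6 + ["b"] * 6

print(positive_predictive_values(y_true, y_pred, groups))
```

In this toy data a "high risk" label is correct 75% of the time for group a but only 50% of the time for group b, so predictive parity fails even though both groups are flagged at the same rate.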

Individual fairness, proposed by Dwork et al. in 2012, takes a fundamentally different approach from group-based definitions. It requires that similar individuals receive similar predictions: if two people are similar in all relevant respects, the model should treat them similarly, regardless of their group membership. Formally, for a distance metric d on individuals and a distance metric D on outcomes: D(M(x), M(y)) <= d(x, y) for all individuals x and y. The challenge is defining the distance metric d: what makes two individuals "similar" in all relevant respects is context-dependent and often contested.
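The condition can be audited pairwise, as in the sketch below. The choice of metrics is the assumption that matters here: d is taken to be Euclidean distance on two already-normalized features and D the absolute difference between scores, both picked purely for illustration. The two models are hypothetical.

```python
from itertools import combinations
import math

def lipschitz_violations(individuals, model, d, D):
    """Return index pairs (i, j) where D(M(x_i), M(x_j)) > d(x_i, x_j)."""
    violations = []
    for (i, x), (j, y) in combinations(enumerate(individuals), 2):
        if D(model(x), model(y)) > d(x, y):
            violations.append((i, j))
    return violations

# Assumed metrics: Euclidean distance between feature vectors, absolute
# difference between scores.
d = math.dist
D = lambda s, t: abs(s - t)

# Hypothetical individuals: (normalized experience, normalized test score).
individuals = [(0.5, 0.49), (0.5, 0.51), (0.9, 0.9)]

# A hard cutoff treats near-identical people very differently.
hard_cutoff = lambda x: 1.0 if x[1] >= 0.5 else 0.0
# A smooth score changes no faster than the inputs do.
smooth_score = lambda x: x[1]

print(lipschitz_violations(individuals, hard_cutoff, d, D))
print(lipschitz_violations(individuals, smooth_score, d, D))
```

The hard cutoff violates the condition for the two applicants whose test scores differ by only 0.02, while the smooth score produces no violations under these metrics. A different choice of d could reverse either verdict, which is why the metric is the contested part.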

The Impossibility Results

In 2017, Chouldechova showed that when base rates differ between groups, an imperfect classifier cannot simultaneously satisfy equalized odds and predictive parity: equal positive predictive values combined with unequal base rates force the false positive rate, the false negative rate, or both to differ across groups. The only escape is a degenerate case such as perfect prediction. Independently, Kleinberg, Mullainathan, and Raghavan proved a closely related result: calibration (a form of predictive parity) and balance (related to equalized odds) cannot be simultaneously achieved unless base rates are equal or prediction is perfect. These proofs are not about limitations of current algorithms. They are mathematical theorems that apply to any possible classifier.

The intuition behind the impossibility is straightforward. Suppose two groups have different base rates for reoffending: Group A reoffends at 30% and Group B reoffends at 50%. A perfectly calibrated model would assign higher risk scores to Group B on average, reflecting the true base rate difference. This satisfies predictive parity (a "high risk" prediction means the same thing for both groups) but violates demographic parity (Group B receives more "high risk" labels). Adjusting the model to equalize prediction rates across groups would destroy calibration: labeling as many people "high risk" in Group A as in Group B means reaching further down Group A's risk distribution, so a "high risk" label for Group A would now correspond to a lower actual risk than the same label for Group B. The same base rate gap drives the conflict between predictive parity and equalized odds.
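The conflict between predictive parity and equalized odds can be made concrete with a short calculation. From the definitions, PPV = p * TPR / (p * TPR + (1 - p) * FPR), where p is the base rate; solving for FPR gives the relationship used below. The sketch holds PPV and TPR equal across the two groups and plugs in the 30% and 50% base rates from the example above. The PPV and TPR values are hypothetical.

```python
def implied_fpr(base_rate, ppv, tpr):
    """False positive rate forced by a given base rate, PPV, and TPR.

    Derived from PPV = p*TPR / (p*TPR + (1 - p)*FPR), solved for FPR.
    """
    return (base_rate / (1 - base_rate)) * ((1 - ppv) / ppv) * tpr

# Hypothetical values, held equal for both groups.
ppv, tpr = 0.6, 0.7

fpr_a = implied_fpr(0.30, ppv, tpr)  # Group A, 30% base rate
fpr_b = implied_fpr(0.50, ppv, tpr)  # Group B, 50% base rate

print(round(fpr_a, 3), round(fpr_b, 3))
```

With identical precision and identical true positive rates, Group A's false positive rate comes out near 0.2 and Group B's near 0.47. Once base rates differ, equal PPV and equal TPR leave no freedom to also equalize the false positive rate.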

This impossibility means that deploying a fair AI system requires choosing which definition of fairness to prioritize, and that choice is a normative judgment with real consequences for real people. In criminal justice, prioritizing equalized odds (equal false positive rates) protects innocent defendants from all groups equally, but may result in different overall prediction rates across groups. Prioritizing demographic parity ensures equal prediction rates but may mean that a "high risk" label carries different meanings for different groups. Prioritizing predictive parity ensures that labels mean the same thing for all groups but may result in disproportionate error burdens. There is no technical solution that avoids this choice.

Fairness in Practice

Organizations implementing algorithmic fairness must navigate a sequence of decisions. First, identify the protected groups relevant to the application. Anti-discrimination law provides a starting point: race, gender, age, disability status, and national origin are protected in many jurisdictions, though the exact list varies by country and by sector. But relevant groups may extend beyond legally protected categories. A healthcare algorithm might need to consider socioeconomic status, geography, or insurance type. A content moderation system might need to consider language, political viewpoint, or cultural context.

Second, choose a fairness metric appropriate to the context. This decision should involve stakeholders from affected communities, legal experts familiar with applicable regulations, and domain experts who understand the consequences of different types of errors. A credit scoring system where the primary harm is denying qualified applicants might prioritize equal opportunity (equal true positive rates). A criminal justice system where the primary harm is wrongly detaining innocent people might prioritize equal false positive rates. A medical screening system where the primary harm is missing sick patients might prioritize equal false negative rates.

Third, measure the chosen metric on the current system and determine whether disparities exceed acceptable thresholds. This requires demographic data, which creates a tension with privacy: measuring fairness requires knowing group membership, but collecting and storing group membership data creates privacy risks. Some organizations use Bayesian Improved Surname Geocoding (BISG) to probabilistically infer race and ethnicity from surnames and addresses, avoiding the need to directly collect self-reported demographic data while still enabling fairness measurement.
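The core of BISG is a Bayesian update, which can be sketched as follows: a prior over groups given the surname is multiplied by the likelihood of the person's location given each group, then normalized. This assumes surname and location are conditionally independent given group. The group labels and all probabilities below are invented placeholders, not real census figures, and the function name is hypothetical.

```python
def bisg_posterior(p_group_given_surname, p_geo_given_group):
    """Combine a surname-based prior with a geography likelihood.

    posterior(g) is proportional to
        P(group = g | surname) * P(location | group = g)
    """
    unnormalized = {
        g: p_group_given_surname[g] * p_geo_given_group[g]
        for g in p_group_given_surname
    }
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}

# Invented numbers for illustration only.
prior_from_surname = {"group_1": 0.7, "group_2": 0.3}
likelihood_of_location = {"group_1": 0.001, "group_2": 0.004}

posterior = bisg_posterior(prior_from_surname, likelihood_of_location)
print({g: round(p, 3) for g, p in posterior.items()})
```

The output is a probability for each group rather than a single label, so downstream fairness metrics are typically computed as probability-weighted estimates, and errors in the inference carry through to the measured disparities.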

Fourth, apply mitigation techniques to reduce identified disparities. Pre-processing methods modify the training data, for example by reweighting samples to balance representation or learning fair representations that remove demographic information while preserving predictive features. In-processing methods add fairness constraints to the model's loss function, penalizing the model for violations of the chosen fairness criterion during training. Post-processing methods adjust the model's decision thresholds for different groups to equalize the chosen metric. Open-source toolkits such as Fairlearn, AI Fairness 360, and Aequitas provide fairness metrics and implementations of many of these techniques.
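As one example of the pre-processing family, the sketch below implements a simple form of sample reweighting, commonly called reweighing: each training example gets the weight P(A = a) * P(Y = y) / P(A = a, Y = y), so that group and label become statistically independent in the weighted data. This is a standalone illustration with hypothetical function names and toy data, not the API of any of the libraries named above.

```python
from collections import Counter

def reweighing_weights(labels, groups):
    """Weight each example by P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]

def weighted_positive_rate(labels, groups, weights, group):
    """Weighted share of positive labels within one group."""
    num = sum(w for y, g, w in zip(labels, groups, weights) if g == group and y == 1)
    den = sum(w for g, w in zip(groups, weights) if g == group)
    return num / den

# Toy training set, invented for illustration: group "a" has 4 of 6
# positive labels, group "b" has 1 of 4.
groups = ["a"] * 6 + ["b"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]

weights = reweighing_weights(labels, groups)
print(weighted_positive_rate(labels, groups, weights, "a"))
print(weighted_positive_rate(labels, groups, weights, "b"))
```

After reweighting, both groups have the same weighted positive rate, equal to the overall rate of 50%. The weights would then be passed to a learner that accepts per-sample weights. Whether this reduces disparity in the trained model's predictions still has to be measured.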

Fairness Across Different Domains

The appropriate fairness framework varies dramatically by application domain. In hiring, U.S. regulators use the "four-fifths rule" as a practical threshold: under the federal Uniform Guidelines on Employee Selection Procedures, a selection rate for a protected group that is less than 80% of the rate for the most-selected group is generally regarded as evidence of adverse impact, after which the employer is expected to demonstrate that the selection criteria are job-related and consistent with business necessity. This is essentially a relaxed version of demographic parity. Many organizations implement hiring algorithms with this threshold as an explicit constraint, accepting some reduction in overall predictive accuracy to maintain legal compliance.
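The four-fifths comparison reduces to a ratio of selection rates, sketched below. The function names, group labels, and counts are illustrative assumptions, and the sketch assumes at least one group has a nonzero selection rate.

```python
def adverse_impact_ratios(selected, applicants):
    """Each group's selection rate divided by the highest group's rate.

    Assumes at least one group has a nonzero selection rate.
    """
    rates = {g: selected[g] / applicants[g] for g in applicants}
    top = max(rates.values())
    return {g: rate / top for g, rate in rates.items()}

def flags_four_fifths(selected, applicants, threshold=0.8):
    """Groups whose impact ratio falls below the four-fifths threshold."""
    ratios = adverse_impact_ratios(selected, applicants)
    return [g for g, r in ratios.items() if r < threshold]

# Toy counts, invented for illustration.
applicants = {"group_x": 100, "group_y": 100}
selected = {"group_x": 30, "group_y": 21}

print(adverse_impact_ratios(selected, applicants))
print(flags_four_fifths(selected, applicants))
```

Here group_y is selected at 21% against 30% for group_x, a ratio of about 0.7, so it falls below the 0.8 threshold. A ratio below the threshold is a screening signal, not a legal conclusion.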

In lending, the Equal Credit Opportunity Act prohibits discrimination on the basis of race, color, religion, national origin, sex, marital status, age, or receipt of public assistance. The Fair Housing Act extends similar protections to mortgage lending. Lenders must ensure that their algorithms do not produce disparate impact, which is typically assessed by comparing approval rates across protected groups. The Consumer Financial Protection Bureau has issued guidance clarifying that algorithmic discrimination violates existing fair lending laws, even when the algorithm does not use protected characteristics as explicit inputs.

In healthcare, fairness considerations must account for biological differences between groups that are legitimate predictors alongside social differences that should not influence treatment. A model predicting cardiovascular risk may legitimately consider sex because biological differences in cardiovascular physiology are well-established. But it should not allow race to serve as a proxy for socioeconomic factors that affect health outcomes through discrimination rather than biology. The challenge is disentangling these effects in observational data where biological and social factors are confounded.

In criminal justice, the stakes of both false positives and false negatives are exceptionally high. A false positive (labeling an innocent person as high risk) can result in detention, higher bail, longer sentences, and denied parole. A false negative (labeling a dangerous person as low risk) can result in crimes committed against future victims. The COMPAS controversy illustrated how different fairness definitions lead to different conclusions: ProPublica found that COMPAS violated equalized odds (different false positive rates across racial groups), while Northpointe, the company that built COMPAS, argued that it satisfied predictive parity (the risk score meant the same thing across groups). Both claims were mathematically correct and simultaneously true. The impossibility results explain why the dispute could not be settled by fixing the model: no imperfect system can satisfy both criteria when base rates differ.

Key Takeaway

Algorithmic fairness involves choosing between mathematically incompatible definitions of equity, each capturing a valid intuition about what "fair" means. No imperfect classifier can simultaneously satisfy all reasonable fairness criteria when groups have different base rates, making fairness implementation an inherently normative decision that requires input from affected communities and domain experts.