Practical Guide for Causal Pathways and Sub-group Disparity Analysis

Farnaz Kohankhaki¹, Shaina Raza¹, Oluwanifemi Bamgbose¹,
Deval Pandya¹, Elham Dolatabadi^1,2,*,

Abstract

In this study, we introduce the application of causal disparity analysis to unveil intricate relationships and causal pathways between sensitive attributes and the targeted outcomes within real-world observational data. Our methodology involves employing causal decomposition analysis to quantify and examine the causal interplay between sensitive attributes and outcomes. We also emphasize the significance of integrating heterogeneity assessment in causal disparity analysis to gain deeper insights into the impact of sensitive attributes within specific sub-groups on outcomes. Our two-step investigation focuses on datasets where race serves as the sensitive attribute. The results on two datasets indicate the benefit of leveraging causal analysis and heterogeneity assessment not only for quantifying biases in the data but also for disentangling their influences on outcomes. We demonstrate that the sub-groups identified by our approach to be affected the most by disparities are the ones with the largest ML classification errors. We also show that grouping the data only based on a sensitive attribute is not enough, and through these analyses, we can find sub-groups that are directly affected by disparities. We hope that our findings will encourage the adoption of such methodologies in future ethical AI practices and bias audits, fostering a more equitable and fair technological landscape.

Introduction

Fairness in data science and machine learning (ML) is indispensable for the responsible development and deployment of ethical artificial intelligence (AI) technologies (Díaz-Rodríguez et al. 2023). Key tools in data science, including Aequitas (Saleiro et al. 2018), AI Fairness 360 (Bellamy et al. 2018), and Fairlearn (Bird et al. 2020) play a pivotal role in addressing fairness challenges in ML models, focusing on concepts such as demographic parity and equalizing statistics across sensitive attribute groups (Saha et al. 2020; Feldman et al. 2015; Zhang and Bareinboim 2018a; Barocas, Hardt, and Narayanan 2023a; Raza et al. 2024). However, these approaches can lead to fairness gerrymandering, where broad fairness across high-level groups masks unfair treatment within sub-groups (Kearns et al. 2018). Sub-group fairness approaches (Yang, Cisse, and Koyejo 2020; Shui et al. 2022) have emerged to address this, aiming to reconcile group and individual fairness notions (Pfohl et al. 2023).

Furthermore, understanding and quantifying the extent to which the observed disparity in outcomes, such as those seen with demographic parity, is attributed to the causal influence of sensitive attributes is crucial in fields, including health and social sciences (Mehrabi et al. 2021; Barocas, Hardt, and Narayanan 2023b; Braveman et al. 2011; Glymour and Hamad 2018). Causality-based fairness frameworks view disparity as the causal effect of sensitive attributes $S$ on outcomes $Y$ , raising fundamental questions about how changes in these attributes affect average outcomes (Kearns et al. 2018). These methodologies revolve around a central question: if the sensitive attribute $S$ changed (e.g., changing from marginalized group $s_{1}$ to non-marginalized group $s_{2}$ ), how would the outcome $Y$ change on average?

Two prominent causal frameworks, the structural causal model (SCMs) (Wu et al. 2019) and the potential outcome framework (Khademi et al. 2019), have been utilized for causal fairness analysis and more particularly to quantify the disparity (MacKinnon, Fairchild, and Fritz 2007; Pearl 2014). SCMs assume that we have full knowledge of the causal graph, enabling us to decompose the causal effect of any variable into different paths, such as direct and indirect effects. On the other hand, the potential outcome framework (Rubin 2005) does not assume the availability of the causal graph and instead focuses on estimating the causal effects of treatment variables. However, a common challenge across all causal models is identifiability, referring to whether they can be uniquely measured from observational data (Morgan and Winship 2015). This poses a critical barrier to applying these notions to real-world scenarios.

Randomized experiments, considered the gold standard for inferring causal relationships in statistics, are often not feasible or cost-effective in the context of disparity analysis (Hariton and Locascio 2018). Therefore, in most cases, the causal relationship must be inferred from observational data rather than controlled experiments. This limitation has spurred a stream of research aiming to address these challenges and develop more practical and effective methodologies for causal fairness analysis. Early literature in the SCM primarily utilized linear and parametric methods, limiting its capacity to offer a universal approach for analyzing natural and social phenomena characterized by non-linearities and interactions (MacKinnon, Fairchild, and Fritz 2007). Later, Pearl introduced the causal mediation formula designed for arbitrary non-parametric models, serving as a valuable tool for decomposing total effects (Pearl 2014). Subsequently, a substantial body of literature emerged, focusing on causal effect decomposition under the rubric of mediation analysis and proposing various optimization problems to adapt the causal framework for fairness analysis (Zhang and Bareinboim 2018b; Wu et al. 2019; Zhang, Wu, and Wu 2016). One notable framework (Plecko and Bareinboim 2024) addresses spurious effects in the decomposition of causal effects and explores the relationships between causal and spurious effects with demographic parity, offering practical insights for data science and fairness considerations.

In the realm of fairness through causal analysis research, a significant focus lies on sub-group analysis and heterogeneity, approached from two perspectives: one being heterogeneous treatment effects (Wager and Athey 2018), which directly aligns with our study, and the other involving ’counterfactually fair’ algorithms for individuals, a topic not directly relevant to our current research (Kearns et al. 2019; Kavouras et al. 2024). The former involves systematically quantifying variations in the causal impact of sensitive attributes on the outcome of interest across sub-groups (Pearl 2022). Approaches for estimating heterogeneous causal effects encompass classical non-parametric methods such as nearest-neighbour matching, kernel methods, and series estimation, demonstrating efficacy in scenarios with a limited number of covariates (Crump et al. 2008; Lee 2009; Willke et al. 2012). More recently, data-driven ML algorithms including causal forest which can be adept at handling numerous moderating variables have shown promising results in heterogeneity analysis (Wager and Athey 2018).

Building on the urgency of adopting causal reasoning techniques in fairness analysis, the main aim of this study is to leverage causal analysis for sub-group disparity assessment. First, we demonstrate the application of causal disparity analysis to uncover the intricate relationships and causal pathways between sensitive attributes and the outcome of interest in real-world observational data. Then, we close the loop by employing causal disparity analysis for sub-group fairness within the context of ML, showcasing how a causal-aware approach can enhance sub-group fairness evaluation. Our overarching goal is to pave the way for conducting disparity audits that lay the foundation for ethical and equitable ML. The novelty of this study lies not in the specific methodologies used but in recognizing causal reasoning as a novel technique for conceptualizing and quantifying disparity, making it suitable for promoting fairness in data science.

•

We demonstrate the application of causal disparity analysis to quantify and decompose causal pathways between sensitive attributes and the targeted outcomes within two real-world observational data. We successfully indicate the capability of our approach to uncover hidden disparities, even in cases where observed disparities are nearly zero.
•

We pioneer a novel sub-group discovery method rooted in the concept of Heterogeneity of Treatment Effect, enabling the identification of variations in the magnitude and direction of decomposed causal effects among individuals.
•

We evaluate the efficacy and utility of our proposed causal disparity analysis in a fairness ML experiment. Our method demonstrates its ability to identify biased performance within each sub-group of individuals, particularly those identified quantitatively as most affected by disparities.

Materials and Methods

In this section, we will introduce causal disparity analysis through the lens of counterfactual inference and non-parametric SCM proposed by Pearl (Pearl 2014) and expanded by Zhang et al. (Zhang and Bareinboim 2018b). Following this approach, various causal effects can be defined as the difference between two counterfactual outcomes (Holland 1986; Rubin 1974) along the causal pathway from sensitive attributes (causes) to outcomes. We will elucidate how these effects can be quantitatively measured and estimated from data through the experiments.

Preliminaries

Our study is based on a basic causal structure which consists of four random variables $(Y,S,X,M)$ sampled from unknown distribution; $S$ represents the random variable for the sensitive attribute (whose effect we seek to measure). $Y$ represents the random variable for the outcome of interest. $X$ represents the random variable for all in-sensitive attributes, including observed confounders, denoted by $C$ , and mediators, denoted by $M$ . The lowercase $(y,s,x,m)$ represents the values that variables may take. As a running example, $S$ stands for race, $M$ stands for the job title, and $Y$ stands for income amount. Here we consider two potential outcomes $Y_{s1}$ and $Y_{s2}$ for sensitive attributes, $S={s_{1},s_{2}}$ . $E[Y_{s},m]$ stands for $E[Y|do(S=s,M=m)]$ which is interpreted as the expectation of potential outcome $Y$ when the sensitive attribute $S$ is set to $s$ and the mediator variable $M$ is set to $m$ .

Sensitive attributes, $S$ , that serve as the basis for disparity encompass a range of personal characteristics that have historically been unfairly targeted to differentiate individuals (Mehrabi et al. 2021). These attributes are pivotal in discussions surrounding equity, inclusion, and human rights and are commonly discussed in anti-discrimination laws (Nations 2023), regulations, and human rights frameworks around the globe. Among these attributes, a notable array includes race, nationality, ethnic origin, colour, religion, age, sex, sexual orientation, gender identity or expression, marital status, family status, genetic characteristics, or disability (Verma and Rubin 2018).

In this study, the term sensitive category denotes individuals grouped based on their sensitive attributes. The term sub-group refers to individuals grouped according to the quantity of their estimated causal effects.

Causal Disparity Analysis

Within the context of counterfactual fairness, the causal effect is characterized as the difference between two potential (also called counterfactual) outcomes: one outcome, $Y_{s_{1}}$ , if the sensitive attribute is $s1$ (for instance, if the individual is female), and the other outcome, $Y_{s_{2}}$ for $s2$ (in this case, if the individual is not female). Due to the presence of the mediator, the potential outcomes are not only dependent on sensitive attributes but also on mediator values [4]. This way the causal effect can be decomposed into effects such as counterfactual causal effect, counterfactual indirect effect and spurious effect. The counterfactual measures of direct and indirect effects, are conditional versions of the natural direct and indirect effect introduced by Pearl (Pearl 2014) and are widely popular throughout the empirical sciences. Here we define the causal and non-causal fairness criteria we have used in this study;

Total Variation (TV) also known as demographic parity represents the statistical distinction in the conditional distribution of the outcome between two groups when simply observing that $S=s_{1}$ , compared to $S=s_{2}$ :

TV(Y)=P(Y|S=s_{2})-P(Y|S=s_{1})

(1)

The counterfactual direct effect (ctf-DE) is the average difference between two potential outcomes when the sensitive attribute transitions from $S=s_{1}$ (female) to $S=s_{2}$ (not female), while the mediator is set to whatever value it would have naturally attained prior to the change in $S=s_{1}$ , for a specific sub-group of the population, $s$ .

ctf-DE_{s_{1},s_{2}}(Y|s)=E[Y_{s_{2}},M_{s_{1}}-Y_{s_{1}},M_{s_{1}}|s]

(2)

The counterfactual indirect Effect (ctf-IE) is the average difference between two potential outcomes when the sensitive attribute remains constant at $s_{1}$ (female), while the mediator changes from its values under $s_{1}$ to whatever value it would have attained for each individual under $s_{2}$ (not female), for a specific sub-group of the population, $s$ .

ctf-IE_{s_{1},s_{2}}(Y|s)=E[Y_{s_{1}},M_{s_{2}}-Y_{s_{1}},M_{s_{1}}|s]

(3)

According to Zhang (Zhang and Bareinboim 2018b), direct and indirect causal effects can be linearly combined and contribute to total variation by introducing an additional term that uncovers the spurious relations between $S$ and $Y$ through confounding variables, $X$ .

TV_{s_{1},s_{2}}(Y)=DE_{s_{1},s_{2}}(Y|s)-IE_{s_{2},s_{1}}(Y|s)-SE_{s_{2},s_{1% }}(Y)

(4)

Counterfactual Spurious Effect (ctf-SE) measures the average difference in outcome $Y$ had $S$ been $s_{1}$ by intervention compared to settings that would naturally choose $S$ to be $s_{1}$ . SE, in fact, measures all paths between $S$ and $Y$ except the causal ones (direct and indirect),

ctf-SE_{s_{1},s_{2}}(Y|s)=E[Y_{s_{1}}|s_{2}-Y_{s}|s_{1}]

(5)

In order to estimate these counterfactual quantities from data, we assume the presence of unconfoundedness between the sensitive attribute and outcome, along with the assumption of conditional ignorability. Moreover, leveraging the following two assumptions (1) none of the confounders are descendants of $S$ and (2) confounders block all backdoor paths from mediators to $Y$ , we can express counterfactual quantities in terms of conditional distributions.

Sub-group discovery for heterogeneity assessment

We conduct sub-group discovery to identify and quantify causal effect heterogeneity among individuals from distinct sensitive categories (sensitive attributes are changed while keeping all other relevant variables constant). Generalized Random Forest (GRF) (Athey and Wager 2019) is employed in this study to measure conditional distribution which is an extension of traditional Random Forest by maximizing heterogeneity when splitting nodes in a decision tree. It incorporates a statistical criterion known as the Causal tree-splitting criterion, which integrates sensitive attribute assignments and outcome variables. GRF provides estimates of both average and individual causal effects, facilitating the detection of differential effects among sub-groups. This allows for the clustering and grouping of individual effects to reveal varying causal impacts. Essentially, GRF compares individuals within a sub-group to counterparts with different sensitive attributes while aiming to closely match all other relevant attributes.

Experiment and Setting

We have leveraged causal and sub-group analysis for disparity analysis using the pipeline shown in Figure 1 on two publicly available datasets, where we selected race as the sensitive attribute. We identified individuals with one race as the $s_{1}$ group and the rest as the $s_{2}$ group. Please refer to Table 1 for more details on the attributes designated as confounders, mediators, and outcomes. Within our pipeline, we utilized the faircause library (Plecko and Bareinboim 2022) and GRF (Wager and Athey 2018) for causal effect estimations.

Refer to caption — Figure 1: The steps involved in our approach to achieving fairness in ML classification models through causal pathway decomposition and sub-group analysis.

Table 1: The description of variables and their corresponding nodes in our causal graph for the disparity analysis experiments. where the sensitive attribute is defined as the White vs. non-White (Asian vs. Non-Asian) race.

Dataset

Attributes

Node

Types and Variables

Adults

Sensitive (S)

Race (S1: Non-White, S2: White)

Outcome (Y)

Salary (2 categories:

\leq

$50K/>$50K)

Confounders (X)

Age (continuous),

Applicant Sex (2 categories),

Marital Status (7 categories)

Mediators (M)

Education (16 categories)

Workclass (8 categories)

Occupation (14 categories)

Capital Gain (continuous)

Capital Loss (continuous)

Hours/Week (continuous)

HDMA

Sensitive (HDMA-White)

Race (S1: Non-White, S2: White)

Sensitive (HDMA-Asian)

Race (S1: Asian, S2: Non-Asian)

Outcome

Loan Status (Accepted/Rejected)

Confounders

Property Type (2 categories)

Owner Occupancy (3 categories)

Applicant Sex (2 categories)

Loan Type (4 categories)

Mediators

Loan Amount (continuous)

Applicant Income (continuous)

Adult. The adult dataset (Becker and Kohavi 1996) is a multivariate dataset designed to predict whether an individual’s annual income will exceed $50,000. This prediction is based on census data and is commonly known as the ’Census Income’ dataset. The data extraction was carried out by Barry Becker, utilizing the 1994 Census database. In this dataset, the goal is to identify and quantify the basic impact of individuals’ race (specifically white) on income as listed in detail in Table 1.

HDMA. The Home Mortgage Disclosure Act (HMDA) (government 2016) mandates numerous financial institutions to uphold, report, and openly divulge mortgage-related information. These publicly accessible data hold significance as they provide insights into whether lenders are effectively addressing their communities’ housing requirements. They also furnish public officials with valuable information to facilitate decision-making and policy formation, while also unveiling lending trends that could potentially exhibit bias. For our experiments, we leveraged the HDMA “Washington State Home Loans, 2016” dataset comprising a total of 466,566 instances of home loans within the state of Washington. The variables encompass a diverse range of information, including demographic details, location-specific data, loan status, property and loan types, loan objectives, and the originating agency. For HDMA, we conducted two sets of experiments, referred to as HDMA-White and HDMA-Asian to explore the impact of the presence and absence of specific races on the outcome. Please refer to Table 1 for more details.

Results

Causal aware disparity analysis

In Table 2, we present total variations along with decomposed causal effects using causal forest for experimental datasets. All metrics are computed based on the difference in outcomes when the sensitive attribute transitions from $s_{1}$ to $s_{2}$ . Positive results for our experiment favour individuals with $s_{2}$ , while negative results favour the other sensitive group, which is $s_{1}$ . Furthermore, we conducted comparisons of the causal effect estimates in our pipeline with two other widely-used causal decomposition libraries, as illustrated in Supplementary Table 5.

Table 2: Total Variation (TV), Direct Effect (DE), Indirect Effect (IE), and Spurious Effect (SE) estimations for each experiment.

Dataset

TV¹¹1Total Variation

ctf-DE²²2Direct Effect

ctf-IE³³3Indirect Effect

ctf-SE⁴⁴4Spurious Effect

Adult

0.104

0.015

0.032

0.057

HDMA-White

0.041

0.055

-0.005

-0.009

HDMA-Asian

0.005

0.030

-0.009

-0.017

Our findings reveal that within the Adult dataset, individuals from $s_{1}$ group are approximately 10.4% more inclined to obtain an annual income exceeding $50,000 than the $s_{2}$ group. Through causal analysis, we discern that about 1.5% of this 10.4% can be directly attributed to the causal influence of the sensitive attribute on the annual income (the full distribution of the ctf-DE is shown in the supplementary materials Figure 5). Additionally, approximately 3.2% can be attributed to an indirect effect mediated through other factors shown in table 1, while the remaining 6% is attributable to spurious effects.

In both experiments conducted with the HDMA dataset, minimal disparities were observed through TV. Specifically, there was a mere 4% difference and nearly zero TV between the two sensitive groups in loan acceptance status. In the first HDMA experiment, featuring a 4% TV, a 5.5% direct effect from the race to loan status was noted, while both the indirect and spurious effects were negligible. However, the analysis of the second experiment (last row in Table 2) yielded intriguing results. Despite the absence of TV and an indirect effect, there was a 3% direct effect observed alongside an approximately 1.7% negative spurious effect from the relevant factor to loan status. Additionally, as shown in the supplementary materials (Figures 5 and 6), the full distribution of direct causal effects for both HDMA experiments is skewed positively in favour of White and non-Asian sub-populations.

Figure 2 presents the top attributes identified within each experimental setting for direct causal effect estimation. In the Adult dataset, key attributes include age, education, workclass, occupation, and hours per week,while for HDMA, loan amount and application income are pivotal.

Sub-group analysis

Drawing on the distribution of the ctf-DE, as shown in the supplementary materials (Figure 5), we examined the trade-off between ensuring consistency across datasets with varying distributions and maintaining intra-group alignment on ctf-DE ranges for each dataset to minimize variations. We, therefore, determined four distinct sub-groups with the summary shown in Table 3 for categorical variables and Figure 3 for continuous variables, across all two experimental datasets. The sub-groups are arranged from negative to positive direct causal effect values, with Sub-group 1 representing ctf-DE values less than $-0.01$ (negative effects in favour of individuals in $s_{1}$ category), Sub-group 2 comprising ctf-DE values between $-0.01$ and $0.01$ (around zero effects), Sub-group 3 encompassing ctf-DE values between $0.01$ and $0.05$ (positive effects in favour of individuals in $s_{2}$ category), and Sub-group 4 indicating values greater than $0.05$ (very positive effects in favour of individuals in $s_{2}$ category) for the Adult dataset. For the HDMA dataset, the sub-groups are slightly different due to the ctf-DE values being skewed positively. Sub-group 1 has ctf-DE values less than $-0.005$ , Sub-group 2 ctf-DE values between $-0.005$ and $0.025$ , Sub-group 3 ctf-DE values between $0.025$ and $0.07$ (in favour of White or non-Asian), and Sub-group 4 indicating values greater than $0.07$ . For the categorical variables, we have reported the counts for the majority and non-majority categories for each sensitive (racial) group within each of the sub-groups. Evidently, there are remarkable similarities in both majority and minority counts and mean and standard deviation between the two sensitive categories within each sub-group.

Table 3: Summary of sub-group analysis for the Adult experiment with categorical variables. Sub-group 1 represents ctf-DE values less than

-0.01

, Sub-group 2 represents ctf-DE values between

-0.01

and

0.01

(around zero effects), Sub-group 3 represents ctf-DE values between

0.01

and

0.05

, and Sub-group 4 represents ctf-DE values greater than

0.05

		Education-Num		Workclass		Occupation
		Majority	Minority	Majority	Minority	Majority	Minority
Adults
Sub-group 1	White	College (%57.6)	Preschool (%0.2)	Private (%86.1)	State Gov. (%0.5)	Craft Repair (%43.1)	Transportation (%0.4)
(TV:0.04)	Non-White	College (%57.7)	Masters (%3.8)	Private (%88.5)	Self-emp.(NI) (%3.8)	Craft Repair (%53.8)	MOI ⁵⁵5Machine Operator Inspection (%1.9)
Sub-group 2	White	High School (%32.2)	Preschool (%0.2)	Private (%73.9)	Without Pay (%0.1)	Craft Repair (%13.0)	Armed-Forces (%0.0)
(TV:0.09)	Non-White	High School (%34.4)	Preschool (%0.3)	Private (%74.5)	Without Pay (%0.0)	Other Service (%18.2)	Armed-Forces (%0.0)
Sub-group 3	White	High School (%38.3)	1st-4th Grade (%0.1)	Private (%69.8)	Without Pay (%0.0)	Prof. Specialty (%20.0)	Armed-Forces (%0.0)
(TV:0.12)	Non-White	High School (%37.0)	11th Grade (%0.2)	Private (%68.5)	Self-emp.(I) (%3.0)	Prof. Specialty (%23.2)	Armed-Forces (%0.2)
Sub-group 4	White	Bachelors (%50.7)	Assoc. Voc (%0.2)	Private (%70.0)	Local Gov. (%0.2)	Exec-managerial (%40.1)	Protective Serv. (%0.4)
(TV:0.1)	Non-White	Bachelors (%37.7)	College (%3.8)	Private (%66.0)	Federal Gov. (%1.9)	Prof. Specialty (%49.1)	MOI (%1.9)
		Loan Purpose		Applicant Sex		Loan Type
		Majority	Minority	Majority	Minority	Majority	Minority
HDMA - White
Sub-group 1	White	Refinancing (%77.0)	Home Improv. (%1.3)	Male (%100.0)	Male (%100.0)	Conventional (%100.0)	Conventional (%100.0)
(TV:0.05)	Non-White	Refinancing (%73.7)	Home Purchase (%26.3)	Male (%100.0)	Male (%100.0)	Conventional (%100.0)	Conventional (%100.0)
Sub-group 2	White	Refinancing (%63.9)	Home Improv. (%2.2)	Male (%93.6)	Female (%6.4)	Conventional (%98.8)	FHA-insured (%0.4)
(TV:-0.02)	Non-White	Refinancing (%49.6)	Home Improv. (%1.0)	Male (%93.7)	Female (%6.3)	Conventional (%99.6)	FHA-insured (%0.1)
Sub-group 3	White	Home Purchase (%69.2)	Home Improv. (%2.6)	Male (%77.7)	Female (%22.3)	Conventional (%71.6)	FSA/RHS (%1.3)
(TV:0.03)	Non-White	Home Purchase (%71.8)	Home Improv. (%1.7)	Male (%74.8)	Female (%25.2)	Conventional (%78.1)	FSA/RHS (%0.3)
Sub-group 4	White	Refinancing (%61.8)	Home Improv. (%9.2)	Male (%65.8)	Female (%34.2)	Conventional (%78.5)	FSA/RHS (%0.8)
(TV:0.1)	Non-White	Refinancing (%65.9)	Home Improv. (%7.7)	Male (%62.9)	Female (%37.1)	Conventional (%79.9)	FSA/RHS (%0.3)
		Loan Purpose		Applicant Sex		Occupation Type
		Majority	Minority	Majority	Minority	Majority	Minority
HDMA - Asian
Sub-group 1	White	Refinancing (%95.3)	Home Improv. (%0.3)	Male (%95.7)	Female (%4.3)	Principal (%98.1)	N/A (%0.0)
(TV:0.03)	Non-White	Refinancing (%92.3)	Home Improv. (%0.1)	Male (%96.2)	Female (%3.8)	Principal (%96.4)	Non-Principal (%3.6)
Sub-group 2	White	Refinancing (%75.3)	Home Improv. (%1.5)	Male (%86.3)	Female (%13.7)	Principal (%90.8)	N/A (%0.0)
(TV:-0.01)	Non-White	Refinancing (%58.7)	Home Improv. (%0.8)	Male (%85.6)	Female (%14.4)	Principal (%86.2)	Non-Principal (%13.8)
Sub-group 3	White	Home Purchase (%59.3)	Home Improv. (%6.3)	Male (%74.6)	Female (%25.4)	Principal (%90.4)	N/A (%0.0)
(TV:0.02)	Non-White	Home Purchase (%67.8)	Home Improv. (%3.3)	Male (%71.0)	Female (%29.0)	Principal (%82.9)	N/A (%0.0)
Sub-group 4	White	Refinancing (%55.6)	Home Improv. (%8.5)	Male (%63.0)	Female (%37.0)	Principal (%94.1)	N/A (%0.1)
(TV:0.08)	Non-White	Refinancing (%62.5)	Home Improv. (%5.7)	Male (%58.0)	Female (%42.0)	Principal (%90.2)	Non-Principal (%9.8)

Applications for fairness in Machine Learning

In order to assess the practical utility and effectiveness of our causal disparity analysis in ML and automated decision-making, we trained an XGBoost classifier on both datasets to predict outcomes (referred to as the outcome node in Table 1). We created an 80-20% train-test split using stratified sampling from all sub-groups (direct causal effect values). We computed classification results for the test set within each sub-group using AI Fairness 360 (Bellamy et al. 2018) library. Tables 4 present the classification results, with the first row indicating the average performance and the last four rows representing the heterogeneity of performance across sub-groups. Across all experiments, performance varies among the sub-groups, with Sub-group 4 exhibiting worse performance and higher variability for all datasets except for the recall value for the Adult dataset; the Adult dataset (Precision: $0.76$ (95% CI interval: $0.016$ ), Recall: $0.71$ ( $0.007$ ), and Accuracy: $0.74$ ( $0.007$ )) for HDMA-White (Precision:0.69(0.000), Recall: 0.89(0.013), Accuracy: 0.68(0.005)) and -Asian (Precision: 0.68(0.001), Recall: 0.86(0.011), Accuracy: 0.67(0.003). In total, the performance of the Sub-groups 1 and 4 are lower than the other Sub-groups.

To better gauge the fairness of our ML classifier in our experiments and evaluate how decisions would differ if the circumstances were different, we plotted the performance gaps for the accuracy, recall, and precision between any two sensitive categories ( $s_{2}$ - $s_{1}$ ) in Figure 4 across all sub-groups. Notably, almost 70% of the performance gaps (positive gaps) favour the sensitive category $s_{2}$ , which corresponds to the White individuals for Adult and HDMA-White, and the Non-Asian category for HDMA-Asian. As the plots indicate, the absolute value of the aggregated performance gap ranges from 0 to 0.07, whereas within sub-groups, the variation is more pronounced. The largest gaps are observed in Sub-group 1 (Precision:0.5, Recall:0.27, and Accuracy:0.29) for the Adult dataset and Sub-group 4 for HDMA-White (Precision:0.06, Recall:0.09, and Accuracy:0.06) and HDMA-Asian (Precision:0.05, Recall:0.08, and Accuracy:.04). The combination of performance gaps and lower performance in Sub-group 4 indicates the model’s bias toward one of the sensitive categories. Of particular interest are the significant gaps in recall measures (higher false negative rates for one of the sensitive categories) among individuals in Sub-group 4 for both HDMA experiments.

Table 4: The XGBoost model classification performance.

	Adult (mean (95% CI interval))			HDMA-White(mean(95% CI interval))			HDMA-Asian(mean(95% CI interval))
	Precision	Recall	Accuracy	Precision	Recall	Accuracy	Precision	Recall	Accuracy
Entire Test Set	0.77(0.012)	0.66(0.004)	0.87(0.003)	0.78(0.001)	0.95(0.006)	0.76(0.003)	0.78(0.001)	0.96(0.005)	0.77(0.002)
Sub-group 1	0.76(0.016)	0.61(0.008)	0.77(0.007)	0.77(0.002)	0.99(0.001)	0.77(0.002)	0.78(0.001)	0.99(0.005)	0.77(0.003)
Sub-group 2	0.78(0.021)	0.63(0.004)	0.93(0.002)	0.82(0.001)	0.99(0.003)	0.81(0.001)	0.82(0.001)	0.99(0.004)	0.81(0.002)
Sub-group 3	0.78(0.012)	0.66(0.003)	0.87(0.003)	0.83(0.001)	0.99(0.004)	0.82(0.001)	0.79(0.001)	0.96(0.004)	0.77(0.001)
Sub-group 4	0.76(0.009)	0.71(0.007)	0.74(0.007)	0.69(0.000)	0.89(0.013)	0.68(0.005)	0.68(0.001)	0.86(0.011)	0.67(0.003)

Discussion

Main Findings

In this study, we have demonstrated the utilization of causal disparity analysis to show the complex relationships and causal pathways linking sensitive attributes (such as race) to real-world observational data outcomes (such as loan status or income) to supplement total variation (TV) also referred to as demographic parity. Our analysis is rooted in the assumptions of a basic causal graph, from which all findings are derived. Notably, our key finding reveals a direct causal link between race and loan status or income, which might not have been apparent from the observed disparities alone. In the Adult dataset, our analysis reveals the presence of indirect effects through mediators, a phenomenon that resonates with prior research by Binkytė et al. (Binkytė et al. 2023). However, the author’s exploration of fairness measures across different causal discovery algorithms and causal paths demonstrated significant variability in the observed discrimination.

Considering the presence of direct causal effects within our datasets, we delved deeper into the variability among individuals regarding how race directly influences their outcomes. This variability led to the identification of four distinct sub-groups, each sharing similar characteristics except for race. In other words, within each sub-group, all covariates except race remained consistent, with race being hypothetically randomized. The ML model used in our study showed varying performance across these sub-groups. Sub-groups with higher and positive direct causal effects, which exhibited larger disparities in outcomes attributed to race, experienced lower model performance. This performance gap within these sub-groups indicates potential unfairness and bias in the ML model, suggesting that race may be a factor contributing to disparate outcomes. In all three experiments, the larger gap in false negative rates for Sub-group 4, which is not in favor of non-whites and Asians, suggests that the classifier tends to incorrectly predict loan status as rejected when it is actually accepted among these individuals, compared to white individuals within the same sub-group. This indicates a bias in the predictions against non-white individuals. Similarly, in the HDMA-Asian dataset, there is a similar disparity where predictions are biased against non-Asians. Furthermore, for the Adult dataset, in addition to the large recall gap for Sub-group 4, there is a large gap in the true positive rates in Sub-group 1 in favour of white individuals. This implies that the classifier is more successful at correctly predicting high income among white individuals in Sub-group 1 compared to non-white individuals within the same sub-group. This suggests a bias in favor of white individuals in predicting high income.

In essence, this is a nuanced finding that cannot be captured solely by dividing the entire sample size into privileged and unprivileged groups based on the sensitive attribute alone which is race in our case. Our research findings are in accordance with existing literature in two significant respects. First, employing decomposed and structural causal analysis, our results resonate with a substantial body of research delving into mediating mechanisms by estimating both natural direct and indirect effects within the potential outcome framework (Jackson 2021; Park, Qin, and Lee 2022; Jeffries et al. 2019). Our causal methodology experiments echo the trajectory of research pioneered by counterfactual causal fairness analysis (Plecko and Bareinboim 2022) working on quantifying discrimination, decomposing variations, and deriving empirical measures of fairness from data. Second, considering heterogeneity in causal effects, our approach and findings align with other studies where the concept of heterogeneous treatment effects and the use of causal forest have been employed (Dandl et al. 2024). For instance, similar methodologies have been leveraged in analyzing environmental policy effects (Miller 2020), conducting cost-effectiveness analyses encompassing outcomes, costs, and net monetary benefits (Bonander and Svensson 2021), as well as in assessing educational interventions and grading discrimination (Jin, Naghi, and Pick 2019).

Limitations and Future Directions

As ML advances at an unprecedented pace, its societal implications have attracted heightened scrutiny. Consequently, the importance of conducting disparity analysis has been emphasized in the contemporary landscape. While this study has provided valuable insights into causal disparity analysis, it’s essential to acknowledge several limitations and explore potential avenues for improvement. The analysis primarily focused on disparities related to a single protective attribute, such as race. However, this narrow focus may not fully capture the intricate interplay of multiple factors contributing to discrimination and bias in real-world scenarios. Future research should consider incorporating intersectional disparity analysis, which examines how multiple protective attributes intersect and interact to shape outcomes. In line with this, future work should also involve a thorough exploration of diverse causal discovery algorithms and identification methods. It’s worth noting that the reliance solely on a basic causal graph framework in this study presents a limitation, as it may oversimplify the intricate causal relationships inherent in real-world data. Additionally, the datasets used in this study may not comprehensively represent the diversity and complexity of real-world populations. Limited diversity within the datasets can lead to biased results and may not encompass the full range of experiences and challenges faced by individuals from marginalized or underrepresented groups. Future work should involve utilizing more diverse and representative datasets, validating the findings within specific contexts, and identifying any context-specific factors that may influence fairness and bias.

To conclude, our study emphasized the imperative of delving into causal pathways, decomposing them, and assessing heterogeneity among individuals. This approach not only offers a comprehensive understanding of disparities within the data but also enables targeted interventions and strategies to promote fairness and equity.

References

Athey and Wager (2019) Athey, S.; and Wager, S. 2019. Estimating treatment effects with causal forests: An application. Observational studies, 5(2): 37–51.
Barocas, Hardt, and Narayanan (2023a) Barocas, S.; Hardt, M.; and Narayanan, A. 2023a. Fairness and Machine Learning: Limitations and Opportunities. MIT Press.
Barocas, Hardt, and Narayanan (2023b) Barocas, S.; Hardt, M.; and Narayanan, A. 2023b. Fairness and machine learning: Limitations and opportunities. MIT Press.
Becker and Kohavi (1996) Becker, B.; and Kohavi, R. 1996. Adult. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5XW20.
Bellamy et al. (2018) Bellamy, R. K. E.; Dey, K.; Hind, M.; Hoffman, S. C.; Houde, S.; Kannan, K.; Lohia, P.; Martino, J.; Mehta, S.; Mojsilovic, A.; Nagar, S.; Ramamurthy, K. N.; Richards, J.; Saha, D.; Sattigeri, P.; Singh, M.; Varshney, K. R.; and Zhang, Y. 2018. AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias.
Binkytė et al. (2023) Binkytė, R.; Makhlouf, K.; Pinzón, C.; Zhioua, S.; and Palamidessi, C. 2023. Causal discovery for fairness. In Workshop on Algorithmic Fairness through the Lens of Causality and Privacy, 7–22. PMLR.
Bird et al. (2020) Bird, S.; Dudík, M.; Edgar, R.; Horn, B.; Lutz, R.; Milan, V.; Sameki, M.; Wallach, H.; and Walker, K. 2020. Fairlearn: A toolkit for assessing and improving fairness in AI. Technical Report MSR-TR-2020-32, Microsoft.
Bonander and Svensson (2021) Bonander, C.; and Svensson, M. 2021. Using causal forests to assess heterogeneity in cost-effectiveness analysis. Health Economics, 30(8): 1818–1832.
Braveman et al. (2011) Braveman, P. A.; Kumanyika, S.; Fielding, J.; LaVeist, T.; Borrell, L. N.; Manderscheid, R.; and Troutman, A. 2011. Health disparities and health equity: the issue is justice. American journal of public health, 101(S1): S149–S155.
Coffman et al. (2023) Coffman, D. L.; Schuler, M. S.; Nguyen, T. Q.; and McCaffrey, D. F. 2023. Weighting Estimators for Causal Mediation. In Handbook of Matching and Weighting Adjustments for Causal Inference, 373–412. Chapman and Hall/CRC.
Crump et al. (2008) Crump, R. K.; Hotz, V. J.; Imbens, G. W.; and Mitnik, O. A. 2008. Nonparametric tests for treatment effect heterogeneity. The Review of Economics and Statistics, 90(3): 389–405.
Dandl et al. (2024) Dandl, S.; Haslinger, C.; Hothorn, T.; Seibold, H.; Sverdrup, E.; Wager, S.; and Zeileis, A. 2024. What makes forest-based heterogeneous treatment effect estimators work? The Annals of Applied Statistics, 18(1): 506–528.
Díaz-Rodríguez et al. (2023) Díaz-Rodríguez, N.; Del Ser, J.; Coeckelbergh, M.; de Prado, M. L.; Herrera-Viedma, E.; and Herrera, F. 2023. Connecting the dots in trustworthy Artificial Intelligence: From AI principles, ethics, and key requirements to responsible AI systems and regulation. Information Fusion, 101896.
Feldman et al. (2015) Feldman, M.; Friedler, S. A.; Moeller, J.; Scheidegger, C.; and Venkatasubramanian, S. 2015. Certifying and removing disparate impact. In proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, 259–268.
Glymour and Hamad (2018) Glymour, M. M.; and Hamad, R. 2018. Causal thinking as a critical tool for eliminating social inequalities in health.
government (2016) government, U. S. 2016. Home Mortgage Disclosure Act.
Hariton and Locascio (2018) Hariton, E.; and Locascio, J. J. 2018. Randomised controlled trials—the gold standard for effectiveness research. BJOG: an international journal of obstetrics and gynaecology, 125(13): 1716.
Holland (1986) Holland, P. W. 1986. Statistics and Causal Inference. Journal of the American Statistical Association, 81(396): 945–960.
Jackson (2021) Jackson, J. W. 2021. Meaningful causal decompositions in health equity research: definition, identification, and estimation through a weighting framework. Epidemiology, 32(2): 282–290.
Jeffries et al. (2019) Jeffries, N.; Zaslavsky, A. M.; Diez Roux, A. V.; Creswell, J. W.; Palmer, R. C.; Gregorich, S. E.; Reschovsky, J. D.; Graubard, B. I.; Choi, K.; Pfeiffer, R. M.; et al. 2019. Methodological approaches to understanding causes of health disparities. American journal of public health, 109(S1): S28–S33.
Jin, Naghi, and Pick (2019) Jin, F. F.; Naghi, A.; and Pick, A. 2019. Heterogeneous Treatment Effects of Educational Interventions by using Random Forests.
Kavouras et al. (2024) Kavouras, L.; Tsopelas, K.; Giannopoulos, G.; Sacharidis, D.; Psaroudaki, E.; Theologitis, N.; Rontogiannis, D.; Fotakis, D.; and Emiris, I. 2024. Fairness Aware Counterfactuals for Subgroups. Advances in Neural Information Processing Systems, 36.
Kearns et al. (2018) Kearns, M.; Neel, S.; Roth, A.; and Wu, Z. S. 2018. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In International conference on machine learning, 2564–2572. PMLR.
Kearns et al. (2019) Kearns, M.; Neel, S.; Roth, A.; and Wu, Z. S. 2019. An empirical study of rich subgroup fairness for machine learning. In Proceedings of the conference on fairness, accountability, and transparency, 100–109.
Khademi et al. (2019) Khademi, A.; Lee, S.; Foley, D.; and Honavar, V. 2019. Fairness in algorithmic decision making: An excursion through the lens of causality. In The World Wide Web Conference, 2907–2914.
Lee (2009) Lee, M.-j. 2009. Non-parametric tests for distributional treatment effect for randomly censored responses. Journal of the Royal Statistical Society Series B: Statistical Methodology, 71(1): 243–264.
MacKinnon, Fairchild, and Fritz (2007) MacKinnon, D. P.; Fairchild, A. J.; and Fritz, M. S. 2007. Mediation analysis. Annu. Rev. Psychol., 58: 593–614.
Mehrabi et al. (2021) Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; and Galstyan, A. 2021. A survey on bias and fairness in machine learning. ACM computing surveys (CSUR), 54(6): 1–35.
Miller (2020) Miller, S. 2020. Causal forest estimation of heterogeneous and time-varying environmental policy effects. Journal of Environmental Economics and Management, 103: 102337.
Morgan and Winship (2015) Morgan, S. L.; and Winship, C. 2015. Counterfactuals and causal inference. Cambridge University Press.
Nations (2023) Nations, U. 2023. Universal Declaration of Human Rights.
Park, Qin, and Lee (2022) Park, S.; Qin, X.; and Lee, C. 2022. Estimation and sensitivity analysis for causal decomposition in health disparity research. Sociological Methods & Research, 00491241211067516.
Pearl (2014) Pearl, J. 2014. Interpretation and identification of causal mediation. Psychological methods, 19(4): 459.
Pearl (2022) Pearl, J. 2022. Detecting latent heterogeneity. In Probabilistic and Causal Inference: The Works of Judea Pearl, 483–506. ACM Books.
Pfohl et al. (2023) Pfohl, S. R.; Harris, N.; Nagpal, C.; Madras, D.; Mhasawade, V.; Salaudeen, O. E.; Heller, K. A.; Koyejo, S.; and D’Amour, A. N. 2023. Understanding subgroup performance differences of fair predictors using causal models. In NeurIPS 2023 Workshop on Distribution Shifts: New Frontiers with Foundation Models.
Plecko and Bareinboim (2022) Plecko, D.; and Bareinboim, E. 2022. Causal fairness analysis. arXiv preprint arXiv:2207.11385.
Plecko and Bareinboim (2024) Plecko, D.; and Bareinboim, E. 2024. A Causal Framework for Decomposing Spurious Variations. Advances in Neural Information Processing Systems, 36.
Raza et al. (2024) Raza, S.; Ghuge, S.; Ding, C.; Dolatabadi, E.; and Pandya, D. 2024. FAIR Enough: Develop and Assess a FAIR-Compliant Dataset for Large Language Model Training? Data Intelligence, 6(2): 559–585.
Rubin (1974) Rubin, D. B. 1974. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of educational Psychology, 66(5): 688.
Rubin (2005) Rubin, D. B. 2005. Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association, 100(469): 322–331.
Saha et al. (2020) Saha, D.; Schumann, C.; Mcelfresh, D.; Dickerson, J.; Mazurek, M.; and Tschantz, M. 2020. Measuring non-expert comprehension of machine learning fairness metrics. In International Conference on Machine Learning, 8377–8387. PMLR.
Saleiro et al. (2018) Saleiro, P.; Kuester, B.; Stevens, A.; Anisfeld, A.; Hinkson, L.; London, J.; and Ghani, R. 2018. Aequitas: A Bias and Fairness Audit Toolkit. arXiv preprint arXiv:1811.05577.
Shui et al. (2022) Shui, C.; Xu, G.; Chen, Q.; Li, J.; Ling, C. X.; Arbel, T.; Wang, B.; and Gagné, C. 2022. On learning fairness and accuracy on multiple subgroups. Advances in Neural Information Processing Systems, 35: 34121–34135.
Verma and Rubin (2018) Verma, S.; and Rubin, J. 2018. Fairness definitions explained. In Proceedings of the international workshop on software fairness, 1–7.
Wager and Athey (2018) Wager, S.; and Athey, S. 2018. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523): 1228–1242.
Willke et al. (2012) Willke, R. J.; Zheng, Z.; Subedi, P.; Althin, R.; and Mullins, C. D. 2012. From concepts, theory, and evidence of heterogeneity of treatment effects to methodological approaches: a primer. BMC medical research methodology, 12: 1–12.
Wu et al. (2019) Wu, Y.; Zhang, L.; Wu, X.; and Tong, H. 2019. Pc-fairness: A unified framework for measuring causality-based fairness. Advances in neural information processing systems, 32.
Yang, Cisse, and Koyejo (2020) Yang, F.; Cisse, M.; and Koyejo, S. 2020. Fairness with overlapping groups; a probabilistic perspective. Advances in neural information processing systems, 33: 4067–4078.
Zhang and Bareinboim (2018a) Zhang, J.; and Bareinboim, E. 2018a. Equality of opportunity in classification: A causal approach. Advances in neural information processing systems, 31.
Zhang and Bareinboim (2018b) Zhang, J.; and Bareinboim, E. 2018b. Fairness in decision-making—the causal explanation formula. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.
Zhang, Wu, and Wu (2016) Zhang, L.; Wu, Y.; and Wu, X. 2016. A causal framework for discovering and removing direct and indirect discrimination. arXiv preprint arXiv:1611.07509.

Appendix A Natural Effects

In Table 5, we provide estimations of natural effects using three methods—CFA-CRF (Plecko and Bareinboim 2022), CFA-MedDML (Plecko and Bareinboim 2022), and twangmediation (Coffman et al. 2023)—for all three datasets.

Table 5: Summary of natural effect measurments. Each value in the table is formatted as mean (standard deviation).

	White		HDMA-White		HDMA-Asian
	NDE	NIE	NDE	NIE	NDE	NIE
CFA-CRF (Plecko and Bareinboim 2022)	0.016 (0.0001)	0.034 (0.0002)	0.062 (0.0001)	-0.01 (0.0001)	0.041 (0.0001)	-0.016 (0.0001)
CFA-MedDML (Plecko and Bareinboim 2022)	0.006 (0.0051)	0.044 (0.0019)	0.059 (0.0026)	-0.005 (0.0003)	0.041 (0.003)	-0.011 (0.0004)
twangmediation (Coffman et al. 2023)	0.017 (0.008)	0.032 (0.001)	0.065 (0.003)	-0.007 (0.007)	0.048 (0.004)	-0.179 (0.492)

Appendix B Histogram of ctf-DE and NDE Values

In Figure 5, we provide histogram plots of ctf-DE values for the three datasets: Adult, HDMA-White, and HDMA-Asian. Each histogram provides a visual representation of the distribution and spread of ctf-DE values within each dataset. These figures provide us with the knowledge to find optimal sub-groups. In Figure 6, we provide histogram plots of NDE values for all three datasets as well.