Perhaps the greatest challenge in the AI (Artificial Intelligence) industry at present is reaching consensus on the definition of algorithmic fairness. While exploring fairness, several tradeoffs need to be acknowledged, such as:
- Discrimination versus ethics
- Justice, fairness and equality
- Individual versus group fairness
- Outcome versus procedural fairness
One of the key dimensions of fairness is how it differs from the regulation and enforcement of civil rights: although the two overlap, there is a distinction between legality and ethics. Fairness is considered the avenue to explore these ethical considerations.
“When thinking about fairness and considering only discrimination, it does not mean it is wrong, it is just incomplete.” –Yochai Benkler
There are many social forces at play in defining fairness, yet ultimately it comes down to a human decision that balances notions of justice, equality, and fairness. Some examples of how fairness definitions change across the AI industry include, but are not limited to:
- Fairness as equality
- Fairness as the absence of bias
- Fairness as addressing present injustices
- Fairness as mitigation of bias towards sensitive attribute groups
Taking the GDPR as an example, one recognizes that formal respect of procedures (in terms of transparency, lawfulness, or accountability) does not guarantee the substantive mitigation of unfair imbalances that create situations of “vulnerability.” While data protection law specifies what data may be collected and the thresholds that must be met for sufficient data protection, it is not clear what is sufficient when it comes to ethics in AI. Under data protection law, users have the right to object to the collection of their data and to know who owns it. For fairness, it is generally conceded that businesses should be measuring and maintaining estimates of the impact of fairness risks and data use. This is well established across industries that use AI systems for high-stakes decision-making, such as healthcare, finance, or aerospace. Beyond this, a common ask is the need to determine more precisely which practices should be deemed de facto unfair.
Another challenge in aligning on a definition of fairness is the distinction between individual and group fairness. While group fairness as an ML (Machine Learning) concept most closely represents egalitarianism and anti-discrimination, individual fairness champions individual justice and consistency for users. However, fairness measurement, and thus mitigation, relies on assumptions about the source of the disparity. Disparities due to personal choices and effort differ from those due to structural injustice, and different group fairness metrics are necessary to evaluate each of these dimensions.
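As a concrete illustration, two widely used group fairness metrics can be computed directly from predictions. The sketch below assumes a binary classifier, binary labels, and a binary sensitive attribute; the variable names and toy data are illustrative, not taken from any particular system.

```python
# Minimal sketch of two common group-fairness metrics, assuming binary
# predictions, binary labels, and a binary sensitive attribute (all illustrative).
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Difference in positive-prediction rates between the two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return y_pred[group == 0].mean() - y_pred[group == 1].mean()

def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true-positive rates (recall) between the two groups."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tpr_a = y_pred[(group == 0) & (y_true == 1)].mean()
    tpr_b = y_pred[(group == 1) & (y_true == 1)].mean()
    return tpr_a - tpr_b

# Toy usage with hypothetical data
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_difference(y_pred, group))
print(equal_opportunity_difference(y_true, y_pred, group))
```

Which of these metrics is appropriate depends on the assumed source of the disparity, which is exactly the judgment call described above.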
Trust can be viewed as an accepted dependence, while dependability is the ability to avoid service failures. The concept of trust can be split into two different perspectives: socially, everyone interprets it in different ways, while technically, it focuses on accuracy and efficiency. Therefore, we can make the following claim:
“Trust in society is different.”
On the other hand, when analyzing trustworthiness in practice, there are two major trends that serve as pillars for a trustworthy machine learning system:
- Principled AI frameworks, which represent a high-level approach and are usually set by policy makers, governments, and industries.
- Technological solutions, which represent a low-level approach and are developed by computer scientists and developers.
A variety of technologies represent trust-enhancing solutions and can be used to categorize different types of frameworks, such as:
- Fairness Technologies
- Explainability Technologies
- Auditability Technologies
- Safety Technologies
Given these frameworks, trust propagates gradually and iteratively through the machine learning pipeline as the algorithm moves through the stages of its lifecycle, so that technology decisions in one stage impact the others and individuals can respond effectively to accidents, sudden breakdowns of trust, or failure.
When thinking about the impact of trust in AI-assisted decision-making systems, measuring its accuracy and efficiency first requires determining whether the person will take the AI’s suggestion or not. The idea of trust calibration is an objective for prediction-specific information design. However, a large discrepancy between human and AI performance would by default reward trusting the AI, regardless of calibration. To address this, software engineers may provide local explanations and show the AI’s confidence, helping users decide for themselves whether to trust the AI.
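One simple way to make trust calibration concrete is to compare how often users accept the AI’s suggestion with how often the AI is actually correct, bucketed by the model’s stated confidence. The sketch below assumes a small hypothetical interaction log; all column names and values are illustrative.

```python
# Minimal sketch of measuring trust calibration: compare acceptance rates with
# AI accuracy per confidence bucket. The interaction log is hypothetical.
import pandas as pd

log = pd.DataFrame({
    "ai_confidence": [0.55, 0.62, 0.71, 0.78, 0.85, 0.90, 0.93, 0.97],
    "ai_correct":    [0,    1,    0,    1,    1,    1,    1,    1],
    "user_accepted": [1,    1,    1,    1,    1,    1,    1,    1],
})

log["bucket"] = pd.cut(log["ai_confidence"], bins=[0.5, 0.7, 0.9, 1.0])
calibration = log.groupby("bucket", observed=True).agg(
    ai_accuracy=("ai_correct", "mean"),
    acceptance_rate=("user_accepted", "mean"),
)
# Well-calibrated trust: acceptance_rate tracks ai_accuracy per bucket rather
# than sitting at 1.0 (blind trust) across all buckets.
print(calibration)
```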
There are different levels of trust exemplified by human beings when being guided by AI:
- Human beings have a cognitive bias with respect to socially-interactive agents
- Individual cognitive biases include a primacy effect.
- Human behavior can be influenced based on understanding violations of trust.
- Humans’ cognitive bias can lead to the fundamental attribution error.
Algorithm fairness and explainability have long had a nuanced relationship in the industry, and both are rising as major focal points across verticals employing AI systems in society. Black-box models create business risk and hurt trust in AI. The need for explainable AI has never been more pressing, and significant research and iteration are needed to unlock its power. Many have called explainable AI the “third wave of AI,” with the first being symbolic AI and the second being statistical AI.
Explainable AI introduces human feedback and understanding during the recommendation stage of an AI system, which has the potential to change how we build, deploy, and maintain AI systems. Explainable systems need to be functional, operational, usable, safe, and valid. Explainability may play different roles depending on the stakeholder: engineers want to debug models and explain recommendations, researchers value model-agnostic theoretical validity, and auditors require interpretable, valid, invariant explanations.
Many feature explainer algorithms exist with the purpose of explaining models, features, and decisions within a defined context. The most popular, which are already being used in research across Silicon Valley’s tech giants, include the following (a minimal sketch of the ablation approach appears after the list):
- Ablations: An algorithm which drops a feature and measures feature attribution as the prediction delta; mostly used for local debugging due to computational efficiency.
- Shapley (SHAP): A framework which uses game theory attribution to allocate output of any machine learning model across features (local or global explanation).
- Integrated Gradients (IG): An algorithm which integrates gradients along a straight-line path from a defined baseline to the input feature values.
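To make the ablation idea concrete, here is a minimal sketch that resets one feature to a baseline value and treats the resulting change in the prediction as that feature’s attribution. The dataset, model, and mean-value baseline are all illustrative choices, not prescriptions.

```python
# Minimal sketch of an ablation-style explainer: replace one feature with a
# baseline value and treat the change in prediction as that feature's attribution.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def ablation_attributions(model, x, baseline):
    """Per-feature attribution = prediction(x) - prediction(x with feature ablated)."""
    base_pred = model.predict_proba(x.reshape(1, -1))[0, 1]
    attributions = []
    for i in range(len(x)):
        x_ablated = x.copy()
        x_ablated[i] = baseline[i]          # "drop" the feature by resetting it
        pred = model.predict_proba(x_ablated.reshape(1, -1))[0, 1]
        attributions.append(base_pred - pred)
    return np.array(attributions)

baseline = X.mean(axis=0)                    # mean feature values as a simple baseline
print(ablation_attributions(model, X[0], baseline))
```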
One of the clearest applications of explainability in AI is debugging models. Google’s What-If Tool, with its counterfactual analysis, is perhaps the most straightforward example of applied debugging. The toolkit provides analytics which surface example data points for inspection and counterfactuals, subgroup evaluation, feature relationship measurement, and Shapley attribution of features.
A key dimension to future success in model debugging will be automation of bug detection using explainability and deployed fixes.
Beyond model debugging, explainable AI provides insights while answering complex questions. For example, by slicing and explaining samples with SHAP, one can isolate populations and explain differences in credit loan decisions. Users can then explore counterfactuals and understand how recommendations might change based on certain characteristics.
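A minimal sketch of this kind of population slicing is below, assuming the shap package is installed. The dataset, model, and sensitive-attribute column are synthetic stand-ins rather than real credit data.

```python
# Minimal sketch of slicing SHAP attributions by population to compare what
# drives decisions for two groups. Data, model, and the `group` column are toy.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
group = np.random.RandomState(0).randint(0, 2, size=len(X))  # hypothetical sensitive attribute
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)       # per-sample, per-feature attributions

# Compare mean absolute attribution per feature for each population slice.
for g in (0, 1):
    mean_attr = np.abs(shap_values[group == g]).mean(axis=0)
    print(f"group {g}: top feature by attribution -> {mean_attr.argmax()}")
```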
Another important dynamic of explainable AI is its use in building trust in AI. Users may prefer simulation-based explanations, such as Monte Carlo simulations, to understand model error. However, explainability inherently lacks robustness. Algorithms which explain AI focus on the input-output mapping and do not consider the black-box mechanisms responsible for differences in prediction; therefore they do not consider the robustness of their own explanations. This is common with simple feature explainers, and it enables those who deploy explainable AI to strategically design answers to questions.
The biggest drawback of explainable AI is also its greatest power: it is highly contextual and requires careful design and evaluation to confirm the explanation is the one that was intended. Explainability in AI should be used to drive understanding in a fairness context, but should not be used to justify decision-making.
Moreover, given the present strong interest in regulating the tech industry, there need to be clear rules on how engineers should monitor and evaluate algorithmic bias. Yet there is still a large open question of how to proceed, especially with regard to sensitive data for measurement. An end-to-end framework for internal algorithm auditing can be developed with the goal of addressing the current gap between principles and practice. Furthermore, the following aspects can be taken into account while supporting algorithmic equity:
- Building in mechanisms/levers that expose design decisions and support understandability, in order to connect the engineers who build algorithms with expert decision-makers
- Engaging early with communications and policy makers
- Building auditability into the infrastructure
Countless approaches have been proposed in the industry to measure fairness in AI; a few are outlined below.
Preference Informed Fairness
To begin with, individual fairness and envy-freeness are two fairness concepts that are highly restrictive and often conflicting. Individual fairness requires treating similar individuals similarly, while envy-freeness requires that each user prefer their own allocation of resources over everyone else’s. Because individuals have diverse preferences over outcomes, adding such constraints to a system may unintentionally lead to less-preferred outcomes for the very people the constraints were created to protect. The concept of PIIF (Preference-Informed Individual Fairness) allows deviations from individual fairness as long as they align with individuals’ preferences. As such, the multiple-task setting in the presence of individual preferences provides an important model for investigating formal guarantees of non-discrimination without being overly restrictive for the decision maker.
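The toy sketch below contrasts the two constraints on a single two-person, two-item allocation; the utilities, allocation, and similarity value are entirely hypothetical. It shows an allocation that violates individual fairness yet is envy-free, the kind of deviation that PIIF permits because it matches individual preferences.

```python
# Minimal sketch contrasting individual fairness and envy-freeness on a toy
# allocation. All numbers are hypothetical.
import numpy as np

# allocation[i] = distribution over items shown to individual i
allocation = np.array([[0.8, 0.2],
                       [0.2, 0.8]])
# utility[i]  = how much individual i values each item
utility = np.array([[1.0, 0.1],
                    [0.3, 1.0]])

def individually_fair(allocation, similarity, tol=0.1):
    """Similar individuals must receive similar allocations (Lipschitz-style check)."""
    dist = np.abs(allocation[0] - allocation[1]).sum() / 2   # total variation distance
    return dist <= similarity + tol

def envy_free(allocation, utility):
    """Each individual weakly prefers their own allocation to everyone else's."""
    own = (allocation * utility).sum(axis=1)
    envy = utility @ allocation.T          # utility[i] under allocation[j]
    return bool(np.all(own >= envy.max(axis=1) - 1e-9))

print(individually_fair(allocation, similarity=0.2))  # False: allocations differ a lot
print(envy_free(allocation, utility))                 # True: each prefers their own slate
```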
Most fairness analyses have focused on static settings without considering long-term dynamics. Typical workflows evaluate a model’s metrics on a dataset in a static environment and do not take into account feedback loops or long-term consequences for society. In order to simulate long-term dynamics, a model that makes assumptions about the environment can be developed. For example, using equality of opportunity to improve short-term fairness can result in a larger gap in credit scores over the long run, even though equality-of-opportunity policies grant many more loans to the disadvantaged group.
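A minimal sketch of such a simulation is below. The score distributions, repayment model, score updates, and thresholds are stylized assumptions, not calibrated to real data; the point is only to show how a lending policy can be rolled forward to observe its long-term effect on average scores per group.

```python
# Minimal sketch of simulating long-term credit-score dynamics under a lending
# policy. All distributions and parameters are stylized assumptions.
import numpy as np

rng = np.random.default_rng(0)
scores = {"A": rng.normal(650, 50, 1000), "B": rng.normal(600, 50, 1000)}

def repay_probability(score):
    # Stylized assumption: higher scores repay more often.
    return np.clip((score - 450) / 400, 0.05, 0.98)

def simulate(scores, thresholds, rounds=20, reward=15, penalty=-30):
    scores = {g: s.copy() for g, s in scores.items()}
    for _ in range(rounds):
        for g, s in scores.items():
            granted = s >= thresholds[g]
            repaid = rng.random(len(s)) < repay_probability(s)
            s[granted & repaid] += reward       # successful repayment raises the score
            s[granted & ~repaid] += penalty     # default lowers it
    return {g: round(s.mean(), 1) for g, s in scores.items()}

# Compare a single threshold against group-specific thresholds chosen to widen
# opportunity for group B (threshold values are hypothetical).
print(simulate(scores, {"A": 640, "B": 640}))
print(simulate(scores, {"A": 640, "B": 600}))
```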
Best Practice Algorithms: Fairness Ranking by LinkedIn
From the industry perspective, LinkedIn’s approach to fairness-aware ranking might be relevant for many stakeholders in the field. LinkedIn’s goal is that the top-ranked candidate results should follow a desired distribution over gender/age. Their fairness-aware reranking algorithm was inspired by the idea of equal opportunity (“Diversity by Design”) and is composed of three steps (a minimal sketch follows the list):
- Partition the set of potential candidates into different buckets for each attribute value
- Rank candidates in each bucket according to the scores assigned by the ML model
- Merge the ranked lists, balancing the representativeness requirements and the model’s selection
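The sketch below illustrates this bucket-and-merge idea under simplifying assumptions; it is not LinkedIn’s production implementation, and the candidate records, scores, and desired distribution are hypothetical.

```python
# Minimal sketch of a bucket-and-merge fairness-aware reranker (illustrative,
# not LinkedIn's production algorithm). Candidates and targets are hypothetical.
from collections import defaultdict

def fairness_aware_rerank(candidates, desired_dist, k):
    """candidates: list of (id, attribute_value, model_score); returns top-k reranked."""
    # 1. Partition into buckets per attribute value and rank each by model score.
    buckets = defaultdict(list)
    for cand in candidates:
        buckets[cand[1]].append(cand)
    for b in buckets.values():
        b.sort(key=lambda c: c[2], reverse=True)

    # 2. Merge: at each position, pick the attribute value furthest below its
    #    desired representation so far, breaking ties by model score.
    ranked, counts = [], defaultdict(int)
    while len(ranked) < k and any(buckets.values()):
        def deficit(attr):
            return desired_dist[attr] * (len(ranked) + 1) - counts[attr]
        attr = max((a for a in buckets if buckets[a]),
                   key=lambda a: (deficit(a), buckets[a][0][2]))
        cand = buckets[attr].pop(0)
        ranked.append(cand)
        counts[attr] += 1
    return ranked

cands = [("c1", "f", 0.9), ("c2", "m", 0.85), ("c3", "m", 0.8), ("c4", "f", 0.6), ("c5", "m", 0.5)]
print(fairness_aware_rerank(cands, {"f": 0.5, "m": 0.5}, k=4))
```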
In order to validate their approach, LinkedIn used a metric called gender representativeness and determined that 95% of all searches are representative of the qualified population for the search. The business metrics obtained from an A/B test over LinkedIn Recruiter users for two weeks showed no significant change. During this process, the following lessons were learnt:
- Definitions of fairness (e.g., an equal distribution matching an input ideal distribution) do not always match
A post-processing approach was desirable because:
- Model-agnostic (scalable across different model choices for an application)
- Acts as a fail-safe (robust to application specific business logic)
- Easier to incorporate as part of existing systems (build a stand-alone service or component for post-processing)
- Complementary to efforts to reduce bias from training data and during model training
- Collaboration/consensus across key stakeholders
FlipTest: Fairness Testing via Optimal Transport
Most existing methods rely on flipping the value of the protected attribute (e.g., gender), but this only probes whether the model directly uses the protected attribute to discriminate. Indirect discrimination is not considered, and flipping a single attribute may produce out-of-distribution samples due to correlated attributes. FlipTest’s goal is to answer the following question:
Had an individual been of a different protected status, would the model have treated them differently?
In order to address this question, FlipTest adopts a technique based on optimal transport mappings instead of relying on causal information as previous approaches do. An optimal transport map transforms one probability distribution into another while minimizing a given cost defined over their respective supports. In particular, an optimal transport map can be used to map the distribution of men to that of women in order to obtain (female, male) pairs, which are then used to query the model. Discriminatory behavior can be detected by noticing different outputs from the model for the two individuals in a pair.
From a high-level view, the FlipTest framework is composed of four steps (sketched after the list):
- Sample populations;
- Approximate optimal transport mapping;
- Run pairs through the model;
- Compare outcomes.
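Below is a minimal sketch of these four steps, assuming small, equally sized samples so the transport map can be approximated with an exact minimum-cost matching; the data, features, and model are toy stand-ins, and a full framework would approximate the mapping at scale.

```python
# Minimal sketch of the four FlipTest steps, using exact minimum-cost matching
# as a stand-in for the optimal transport map. All data and the model are toy.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# 1. Sample populations (two groups with slightly shifted feature distributions).
X_a = rng.normal(0.0, 1.0, size=(200, 3))
X_b = rng.normal(0.3, 1.0, size=(200, 3))

# Toy model trained on pooled data with synthetic labels.
X = np.vstack([X_a, X_b])
y = (X.sum(axis=1) + rng.normal(0, 0.5, len(X)) > 0).astype(int)
model = LogisticRegression().fit(X, y)

# 2. Approximate the optimal transport mapping via minimum-cost matching.
cost = cdist(X_a, X_b, metric="sqeuclidean")
rows, cols = linear_sum_assignment(cost)      # pairs (X_a[rows[i]], X_b[cols[i]])

# 3. Run each pair through the model.
pred_a = model.predict(X_a[rows])
pred_b = model.predict(X_b[cols])

# 4. Compare outcomes: pairs with differing outputs form the "flip set".
flip_rate = (pred_a != pred_b).mean()
print(f"fraction of pairs with differing outcomes: {flip_rate:.2%}")
```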
Interventions for Ranking in the Presence of Implicit Bias
Implicit bias represents the unconscious association of qualities with members of a particular group. In the context of ranking hiring candidates, individuals placed later in the ranking are less likely to receive positive outcomes due to the presence of implicit bias. In order to mitigate bias in subset selection, constraints such as the Rooney Rule can be imposed. The idea of the Rooney Rule is to interview at least one candidate from an underprivileged group; it was introduced in 2003 for head coach positions in the National Football League.
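A minimal sketch of a Rooney-Rule-style constraint is below; the candidate names, group labels, and scores are hypothetical, and the swap heuristic is just one simple way to enforce the constraint.

```python
# Minimal sketch of a Rooney-Rule-style shortlist: top-k by score, with at least
# one slot guaranteed for the protected group. Candidate data is hypothetical.
def rooney_rule_shortlist(candidates, k, protected_group):
    """candidates: list of (name, group, score); returns a shortlist of size k."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    shortlist = ranked[:k]
    if not any(c[1] == protected_group for c in shortlist):
        best_protected = next((c for c in ranked if c[1] == protected_group), None)
        if best_protected is not None:
            shortlist[-1] = best_protected   # swap in the strongest protected-group candidate
    return shortlist

cands = [("p1", "maj", 0.92), ("p2", "maj", 0.90), ("p3", "min", 0.85), ("p4", "maj", 0.80)]
print(rooney_rule_shortlist(cands, k=2, protected_group="min"))
```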
The rule may be appealing because of its simplicity and understandability. However, by adopting the “equal outcomes” path, several possible implementations might need to be explored:
- use of sensitive dimensions to force a proportional budget spend by group
- use of sensitive dimensions to enforce a minimum budget allocation to a group
- force a certain fraction of ads from a particular vertical for consideration at the final ranking stage for a given sensitive group etc.
Given the popularity of deep learning due to its ability to deliver better accuracy, the focus so far has not been on the cost of this improved accuracy. By favoring accuracy, software engineers are automatically forced into a trade-off against interpretability and fairness. The environment is also impacted: training one large state-of-the-art deep learning model for natural language processing can produce a carbon footprint equivalent to that of an average car over its lifetime. This alone should be more than enough to motivate a shift in current models.