Right now, there is enormous awareness and discussion (and rightly so!) around systemic racial bias against the Black community in the US. I am not an expert in sociology, so I will avoid offering an opinion on how to resolve that. What I will talk about is the next wave of systemic bias: algorithmic bias. It is not discussed as much, but it will be more widespread and more detrimental to society if we don’t make an effort to address it. It should not be news that decision-making is moving towards machines as our machines become smarter. But these machines depend on algorithms that are written by human beings and (sometimes) trained on datasets that reflect the same underlying systemic biases.
Machine Learning, or Artificial Intelligence as the media prefers to call it to match the image people have in mind from the Iron Man or Terminator movies, ultimately relies on how you train the algorithm: the features and the dataset you choose. Hence these algorithms are no better than humans if the features and dataset used to train them are biased in the first place. In layman’s terms, features are the characteristics that help machines differentiate between data points, similar to how in the physical world some people correlate skin color, race, gender, education, or neighborhood with certain things or activities.
Here is a real-world consequence of these (intentional or unintentional) biases. Optum designed an algorithm for hospitals (used in over 50 organizations and probably thousands of hospitals serving millions of patients in the US) that tells doctors which of their patients need additional attention based on their current health. It turned out the algorithm was biased towards healthier white patients and gave them priority over sicker Black patients because the algorithm writer* used “cost” as a feature to rank patients. Historically, healthcare spending on Black patients has been lower than on white patients, so cost understates how sick they are, and this feature shouldn’t have been selected in the first place. Ideally, patients should have been ranked only on their chronic conditions and not on how much they pay for the additional care they receive from physicians. We don’t know how much damage this biased algorithm caused before the fix, but we probably won’t see any protests against biased algorithms like this one or the companies that own them. These AI systems are doing exactly what they were trained to do, in the way they were trained to do it (sound familiar?!). This is just one example, but you can find hundreds of similar studies showing the same thing in various industries, including résumé screening that is influenced by name and gender, and men getting higher credit limits than women with the same credit history when applying for a new credit card (Disclaimer: I am not accusing any of these companies of doing this intentionally).
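To make the “cost” problem concrete, here is a minimal sketch. It is not Optum’s actual algorithm, and every record, field name, and number in it is invented for illustration. It only assumes what the paragraph above describes: one group incurs lower spending even when it has more chronic conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    name: str
    group: str               # hypothetical demographic label, used only for auditing
    chronic_conditions: int  # stand-in for actual health need
    annual_cost: float       # stand-in for past healthcare spending


# Invented records: group "B" patients are sicker but have lower past spending.
PATIENTS = [
    Patient("p1", "A", 1, 5200.0),
    Patient("p2", "A", 1, 7400.0),
    Patient("p3", "A", 2, 9800.0),
    Patient("p4", "B", 3, 6100.0),
    Patient("p5", "B", 4, 7000.0),
    Patient("p6", "B", 5, 8300.0),
]


def top_k(patients, key, k):
    """Return the k patients with the highest value of `key`."""
    return sorted(patients, key=key, reverse=True)[:k]


def group_counts(patients):
    """Count how many patients from each group appear in a list."""
    counts = {}
    for p in patients:
        counts[p.group] = counts.get(p.group, 0) + 1
    return counts


# Same patients, same program size, two different ranking features.
by_cost = top_k(PATIENTS, key=lambda p: p.annual_cost, k=3)
by_need = top_k(PATIENTS, key=lambda p: p.chronic_conditions, k=3)

print("ranked by cost:", [p.name for p in by_cost], group_counts(by_cost))
print("ranked by need:", [p.name for p in by_need], group_counts(by_need))
```

With these made-up numbers, ranking by cost selects two group A patients and one group B patient, while ranking by chronic conditions selects three group B patients. The code never looks at `group` when ranking; the skew comes entirely from the choice of feature.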
Hence, as we move towards fully automated or hybrid decision-making systems (where the machine filters first and human beings decide among the options the machine provides), it is critical that we create the right culture, environment, processes, policies, and training to address these unintended biases. I believe that the next wave of policies (I am calling them Policies 2.0) will have serious consequences for how our societies operate, and that they will be driven not by policymakers on Capitol Hill but by tech companies and their algorithm writers, predominantly men in their 20s and 30s sitting in front of their wide-screen monitors and whiteboards. But that also means there is a big opportunity to actually fix these “systemic biases” if this is done right (not just with the best intentions but also with action, and with openness to accepting mistakes and correcting course).
Here are a few ideas that may be the first baby steps in addressing these issues:
- Diversity: It is not news that there is a racial as well as a gender imbalance in the tech industry workforce. We need to fix this imbalance so that there are different opinions in the room and on the whiteboards. We need to fix the 9:1 men-to-women ratio in our tech industry. Saying that there are not enough candidates should not be an excuse anymore. In recent years, big tech companies have acknowledged the problem and taken encouraging actions. Still, progress has been really slow, and it is high time that companies set measurable and actionable goals. Things are even worse when it comes to Black and Latino representation in the tech industry. If we can come up with driverless cars, then I am sure we can fix this too, provided large tech companies make it a real priority and not just a good gesture or marketing.
- Feature engineering: This is one of the most critical, least acknowledged, and hardest-to-detect sources of bias in algorithms. Here is an oversimplified explanation of how machine learning algorithms are built in most big tech companies: algorithm writers obtain or generate a labeled dataset, split it into training and test data (e.g., 3:1), build their algorithm on the training set, and strive to match its output to the labels. To achieve this, algorithm writers identify features, which are characteristics of the data points. These features can be biased if the dataset is biased or if the algorithm writer lacks the domain knowledge to select unbiased ones. For example, if an Ivy League university decided tomorrow to automate its undergraduate admission process, it would design an algorithm that looks at ACT/SAT scores as one feature, high school GPA as a second feature, and (maybe) parents’ jobs and education as a third feature to predict each applicant’s probability of success. Algorithm writers are obsessed with matching the output of their algorithms to the labels (aka ground truth), and they assign weights to individual features to “fit” that output. With the re-emergence of Deep Learning in the past few years, even feature engineering and weighting are becoming automated. Hence the quality of decision-making by an AI system will essentially depend on the quality of the data itself. Tech companies need to put a review process in place (maybe run by a third party and some regulatory body) for any AI system that makes decisions (fully automated or hybrid) in any public institution that serves human beings, such as healthcare, education, law enforcement, and transportation. Facebook recently created an oversight board to do something similar for policing content on its platform. It is a great first step, but it remains to be seen how Mark Zuckerberg will react when the board overrules his decision.
Similarly, algorithm writers should be conscious of these biases and should raise a flag when they see underlying biases in datasets. Our job as algorithm writers should not be just to increase accuracy as measured by precision and recall, but also to make sure decision-making is better, more effective, and unaffected by who uses these systems and how they are used in the future (tough one, right?!). Tech companies can also put training in place, using case studies to teach their engineers how to identify and flag these biases.
- Unbiased Datasets: I have already explained above how biases in datasets creep into AI systems when the algorithm writer’s job is to fit the model to the labeled data, which is what most current systems do. It is unfair to expect an unbiased AI system if the dataset is biased in the first place. Hence our machine learning strategy should include an additional step of vetting the dataset for biases (e.g., through crowdsourcing). We should verify that the dataset represents the diversity of the population it is going to make predictions about, and that the labeled data is fair and impartial. There is plenty of theoretical work on this, but we have rarely seen it practiced at large scale, because doing so increases the time to deliver a project (and consequently increases cost and lowers profit). Hence, leadership and culture play a critical role in managing these priorities.
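The workflow described under Feature engineering above (a 3:1 split, fitting to the labels, then scoring with precision and recall) can be sketched in a few lines. This is only an illustration under stated assumptions: the data is invented, `score` is a hypothetical feature that runs 10 points lower for group B at the same true outcome, and the “model” is a single cutoff rather than anything a real team would ship. The point is the last step, which reports precision and recall per group instead of only in aggregate.

```python
import random

# Invented data: the true outcome is positive from score 50 up in group A,
# but already from score 40 up in group B (the feature undersells group B).
ROWS = []
for i in range(40):
    score = 30 + i
    ROWS.append({"group": "A", "score": score, "label": score >= 50})
for i in range(40):
    score = 20 + i
    ROWS.append({"group": "B", "score": score, "label": score >= 40})


def split_3_to_1(rows, seed=0):
    """Shuffle and split rows into training (75%) and test (25%) sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """Pick the score cutoff that best reproduces the training labels."""
    def accuracy(t):
        return sum((r["score"] >= t) == r["label"] for r in train) / len(train)
    return max(sorted({r["score"] for r in train}), key=accuracy)


def precision_recall(rows, threshold):
    """Precision and recall of the rule `score >= threshold` on rows."""
    tp = sum(1 for r in rows if r["score"] >= threshold and r["label"])
    fp = sum(1 for r in rows if r["score"] >= threshold and not r["label"])
    fn = sum(1 for r in rows if r["score"] < threshold and r["label"])
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def per_group(rows, threshold):
    """Same metrics, computed separately for each group."""
    groups = sorted({r["group"] for r in rows})
    return {g: precision_recall([r for r in rows if r["group"] == g], threshold)
            for g in groups}


train, test = split_3_to_1(ROWS)
threshold = fit_threshold(train)
print("overall:", precision_recall(test, threshold))
for group, (p, r) in per_group(test, threshold).items():
    print(group, "precision=%.2f recall=%.2f" % (p, r))
```

As a fixed reference point, a cutoff of 50 applied to the full invented dataset gives group A a recall of 1.0 and group B a recall of 0.5, even though precision is 1.0 for both. A single aggregate number would hide that gap.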
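The dataset vetting step suggested under Unbiased Datasets could start with something as simple as the sketch below. It assumes each record carries a `group` field and a boolean `label`, and that we have reference population shares from some trusted source. The function names, the 5-point tolerance, and all numbers here are made up for illustration. A check like this does not prove a dataset is fair; it only flags where a human should look more closely.

```python
from collections import Counter


def representation_gaps(records, population_shares, tolerance=0.05):
    """Compare each group's share of the dataset with its assumed population share."""
    counts = Counter(r["group"] for r in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("dataset is empty")
    report = {}
    for group, expected in population_shares.items():
        observed = counts.get(group, 0) / total
        report[group] = {
            "observed": observed,
            "expected": expected,
            "under_represented": observed < expected - tolerance,
        }
    return report


def positive_label_rate(records):
    """Share of positive labels per group; a large gap deserves a closer look."""
    totals, positives = Counter(), Counter()
    for r in records:
        totals[r["group"]] += 1
        positives[r["group"]] += int(r["label"])
    return {g: positives[g] / totals[g] for g in totals}


# Invented dataset: 80 records from group A (half positive),
# 20 records from group B (a quarter positive).
RECORDS = (
    [{"group": "A", "label": i < 40} for i in range(80)]
    + [{"group": "B", "label": i < 5} for i in range(20)]
)
# Hypothetical reference shares for the population the model will serve.
POPULATION = {"A": 0.6, "B": 0.4}

gaps = representation_gaps(RECORDS, POPULATION)
rates = positive_label_rate(RECORDS)
print(gaps)
print(rates)
```

Here group B makes up 20% of the records against an assumed 40% of the population, so it is flagged as under-represented, and its positive label rate (0.25) is half that of group A (0.5). Neither number says why, which is where the review by people with domain knowledge comes in.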
Again, this is a hard problem to solve, and the purpose of this write-up is by no means to provide solutions. This will be the next big evolution of our society and of human beings. The 21st century will be a defining moment in designing these policies and practices; maybe we need a new UN-like global body to which everyone is answerable. I have personally become more conscious of my unconscious biases after the recent discussions and incidents, which will hopefully take us to a better place than before. But right now, what we need is (1) an acknowledgment of these biases in AI systems, or at least of their possibility, and (2) a debate about how we can address them, along with processes for doing so. Acknowledgment itself is the first stepping stone of the long journey ahead of us.
P.S. Any thoughts or ideas shared in this blog are entirely personal and have nothing to do with my current or any of my previous employer(s). Also, I am not blaming any particular tech company or person for any intentional discrimination based on race, gender, sexual orientation, or anything else. All information shared here is based on publicly available articles, and no confidential information was used.
*I use the term “algorithm writer” instead of programmer or engineer because many people are involved when AI systems are designed, including the managers of these programmers, the middle and senior leadership that sets goals and outcome metrics, and the culture of the company, which is ultimately driven by the executive leadership. All of these people are algorithm writers and play a role in shaping the outcome.