The dangers of algorithmic bias
Algorithmic bias occurs when the results of an algorithm are not neutral, fair, or equitable (Wikipedia).
AI systems used in the United States by the judiciary or police to predict the risk of recidivism or crime have been shown to be biased against Black and Latino populations.
Predictive policing algorithms are racist. They need to be dismantled.
The problem lies with the data the algorithms feed upon. For one thing, predictive algorithms are easily skewed by…
In 2015, Amazon tested a recruitment algorithm trained on hundreds of thousands of résumés the company had received over the previous 10 years. The algorithm selected mostly men because past hires were overwhelmingly male. It therefore “learned” not to give women a chance.
Amazon scraps secret AI recruiting tool that showed bias against women
SAN FRANCISCO (Reuters) — Amazon.com Inc’s machine-learning specialists uncovered a big problem: their new recruiting…
A study recently published in the Journal of General Internal Medicine found that a diagnostic algorithm for estimating kidney function that takes race into account assigns Black patients healthier scores, thus underestimating the severity of their kidney disease.
If the algorithm were corrected, one-third of the 2,225 Black patients studied would be reclassified as having more severe chronic kidney disease, and 64 of them could receive a kidney transplant that the algorithm would otherwise have denied.
How a racist algorithm kept Black patients from getting kidney transplants
In recent years, doctors have started to lean on algorithms when determining certain aspects of patient treatments…
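The mechanics of such a race correction can be sketched with the 2009 CKD-EPI creatinine equation, one widely used eGFR formula that multiplied the result by 1.159 for patients recorded as Black (the coefficients below are the published 2009 values; the patient values and the eGFR ≤ 20 transplant-referral cutoff are illustrative, and this is not the exact algorithm from the study):

```python
# Sketch of the 2009 CKD-EPI creatinine equation, which included a race term.
# Coefficients are the published 2009 values; not clinical code.

def egfr_ckd_epi_2009(creatinine_mg_dl, age, female, black):
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    egfr = (141
            * min(creatinine_mg_dl / kappa, 1) ** alpha
            * max(creatinine_mg_dl / kappa, 1) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159  # race "correction": same labs, higher (healthier) score
    return egfr

# Same patient, same lab value: the race term alone raises the score by 15.9%.
without_race = egfr_ckd_epi_2009(3.4, 60, female=False, black=False)  # ~18.5
with_race = egfr_ckd_epi_2009(3.4, 60, female=False, black=True)      # ~21.5
```

With a referral threshold such as eGFR ≤ 20, the uncorrected score (~18.5) would qualify this hypothetical patient for the transplant waiting list, while the race-corrected score (~21.5) would label them "too healthy" — exactly the denial pattern the study describes.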
Such discriminatory results are largely explained by biased training data: algorithms trained on biased data become biased in turn.
In an August 2020 research paper, “Hidden in Plain Sight — Reconsidering the Use of Race Correction in Clinical Algorithms,” researchers show that across many areas of medicine (from oncology to cardiology to urology), biased algorithms favor white patients over racial and ethnic minorities in the allocation of resources and care.
Detecting algorithmic biases
For Yannic Kilcher, some algorithmic biases are caused by unbalanced data sampling. In that case, correcting the bias simply means correcting the sampling of the data.
Biases can also arise when the data faithfully reflects a reality that is itself discriminatory — not what we want, but what is. In that case, correcting the bias amounts to distorting reality.
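Kilcher's first case — bias from unbalanced sampling — can be illustrated with simple oversampling of the under-represented group (the groups and proportions below are invented for illustration):

```python
import random

# Toy illustration of correcting an unbalanced sample by oversampling.
# Group labels and the 90/10 split are invented for the example.
random.seed(0)

# 90% of the historical sample comes from group A, only 10% from group B.
sample = ["A"] * 900 + ["B"] * 100

# Resample the under-represented group until the two are balanced.
group_b = [x for x in sample if x == "B"]
balanced = sample + random.choices(group_b, k=800)

print(balanced.count("A"), balanced.count("B"))  # 900 900
```

This only works when the imbalance is a sampling artifact; if the imbalance is the reality itself (Kilcher's second case), rebalancing the data is a normative choice, not a correction.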
PAIR (People + AI Research) is a multidisciplinary team at Google that explores the sociological impact of AI. They have created simple interactive animations, called “AI Explorables,” to explain how algorithmic biases arise.
People + AI Research
Big ideas in machine learning, simply explained The rapidly increasing usage of machine learning raises complicated…
One of the animations proposed by PAIR Research measures the fairness of an algorithm based on the data it is trained on. The example algorithm predicts whether people are sick. Researchers can choose between two configurations when evaluating the model’s performance.
One choice is to never miss the disease. The risk is an algorithm that too often predicts that people are sick when they are in fact healthy (false positives).
Researchers may instead want an algorithm that almost never flags healthy people as sick, but the risk is then missing genuinely sick patients by predicting they are healthy (false negatives).
How aggressive the model should be is up to the researchers and practitioners, depending on how they intend to use their algorithm. The unfairness arises when the proportions of healthy and sick patients differ across age groups (unbalanced sampling).
At the same “aggressiveness,” the model will wrongly diagnose more healthy adults as sick, because the sample contains far more healthy adults than sick ones. This is not the case in the children’s sample.
In this scenario, researchers have no leverage to correct the inequity: no matter how the sliders are moved, the two measures can never be right at the same time. The bias in this algorithm stems from reality itself — children are sick more often than adults — and the data sampling merely reflects it.
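The impossibility the Explorable demonstrates can be checked with a few lines of arithmetic: even a classifier with identical error rates in both groups produces very different shares of wrong "sick" calls when the base rates differ (all numbers below are invented, not PAIR's):

```python
# Toy numbers showing why one threshold can't be fair to groups with
# different base rates. Assume the model flags 90% of sick people and
# 10% of healthy people in BOTH groups -- it treats everyone identically.

def wrong_sick_call_share(n, sick_rate, tpr=0.9, fpr=0.1):
    sick = n * sick_rate
    healthy = n - sick
    flagged = tpr * sick + fpr * healthy      # everyone predicted "sick"
    false_alarms = fpr * healthy              # healthy people flagged sick
    return false_alarms / flagged             # share of "sick" calls that are wrong

adults = wrong_sick_call_share(1000, sick_rate=0.05)    # adults rarely sick
children = wrong_sick_call_share(1000, sick_rate=0.40)  # children sick more often

print(round(adults, 2), round(children, 2))  # 0.68 0.14
```

With the same error rates everywhere, 68% of the adults flagged as sick are actually healthy, versus only 14% of the children — no slider setting can equalize both figures at once, which is exactly the deadlock described above.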
This article was written from the following resources:
MIT 6.S897 Machine Learning for Healthcare, Spring 2019, Instructor: Peter Szolovits — Fairness
PAIR AI Explorables | Is the problem in the data? Examples on Fairness, Diversity, and Bias — Yannic Kilcher