When Machines Know Sin: the Algorithmic Bias of Technology
Technology and machines are becoming more and more like us as they advance. But this mimicry is already crossing over to our darker side: machines are adopting the same prejudices that have plagued society for so long.
Machine learning, which is the most rapidly advancing manifestation of AI, is being utilized everywhere from pointlessly fun app features like Snapchat filters to gravely consequential applications like law enforcement.
The day isn’t far when decisions of political and social import will be made by AI assistants. However, I can’t help but question the wisdom of placing more trust in machines than in a plain old-fashioned human being occupying high office.
Not, at least, while we continue training AI algorithms on lopsided datasets that compound gender and racial stereotypes, reinforcing the privilege of those established at the top of the social order.
And herein lies the danger of designing technology with complete disregard for its social repercussions.
Gender bias and racism in machine learning data
It sounds strange to suggest that an algorithm can be stereotypical and show preference towards certain groups in its complex computations. Does it even make sense to challenge the cold objectivity of mathematical algorithms and have the audacity to hold them guilty of biased judgment?
I hate to say it, but we have all the reason to doubt the impartiality of our technology.
Let’s take a look at a few examples:
A team of researchers from the University of Virginia found that image datasets used for training machine learning algorithms are characterized by gender bias, containing predominantly male images. Females are outrageously underrepresented in COCO, the image dataset co-sponsored by Microsoft and Facebook, as well as in the University of Washington’s ImSitu.
The researchers found a classic case of gender stereotyping when a visual recognition algorithm trained on ImSitu mislabeled a man standing in a kitchen as “woman”. But sexism isn’t the only sin sophisticated machines are guilty of.
In another study, of software that predicts criminal recidivism, researchers found that the algorithm assigned high re-offending risk scores to black defendants twice as often as to white defendants. A black woman with four juvenile misdemeanors was labelled “high risk”, while a white man with two prior armed robberies was labelled “low risk”. Sure enough, the woman didn’t reoffend, but the man went on to commit a grand theft.
Similar racial and gender discrimination is found in face recognition software from Microsoft and IBM, which is far less accurate at identifying black women than white men.
So, what’s going on here?
To put it simply, machine learning algorithms are dependent on the datasets used to train them. If the data is skewed, the software will exhibit a preference towards the group that has majority representation in the dataset. Tech companies are simply feeding their own biases into machines.
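To see how a skewed dataset produces a skewed model, consider a toy sketch. The numbers and labels below are invented purely for illustration (echoing the ImSitu kitchen example), and the “classifier” is the simplest possible one: predict whatever label dominated the training data.

```python
from collections import Counter

# Hypothetical training set mirroring a lopsided dataset:
# 90% of "cooking" images happen to show women.
training_data = [("cooking", "woman")] * 90 + [("cooking", "man")] * 10

def majority_label(data, scene):
    """Naive classifier: predict the most frequent label seen for a scene."""
    labels = [label for s, label in data if s == scene]
    return Counter(labels).most_common(1)[0][0]

# The model now labels *every* person it sees cooking as "woman" --
# including the man standing in the kitchen.
prediction = majority_label(training_data, "cooking")
print(prediction)  # -> woman
```

Real visual recognition models are vastly more sophisticated, but the underlying failure mode is the same: when one group dominates the training data, the cheapest way for the model to minimize error is to default to that group.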
The above cases reveal only one of the two ways that machines can be made to discriminate unfairly. These are all instances of bad datasets being used to train good algorithms. But what if the algorithm itself is biased?
If you’re looking for an example of such an algorithm, look no further than Google’s search engine.
The Bias of Google
Search “Asian girls” on Google, for instance, and the results will push scantily clad Asian women in seductive poses above any other context in which Asian girls might appear. The same goes for Latinas.
Google uses predictive technologies to anticipate user intent when a search query is typed. Now, I won’t be surprised if most people actually type in “Asian girls” with sexual amusement as the intent, but Google has a responsibility to display results in a balanced context, rather than blindly fueling racial stereotypes.
Correcting these issues might entail a modification of the existing Google algorithm, which at present considers some 200 factors when generating search results for a query. Ensuring non-discriminatory search results without undermining the accuracy of reading user intent will probably be a challenge, but it is one that deserves Google’s immediate attention.
As it stands, rewarding the privilege of majority on a platform as ubiquitous as Google is a disservice to the struggle for a fairer and equal society.
It is clear that both the data and the algorithm (or in other words, the interpretation of the data) need to be free from bias to refine accuracy and improve fairness in AI systems.
The ghost of scientific racism
In a way, the problem of discriminatory technology hearkens back to scientific racism. There’s an instructive example of good data leading to bad conclusions in 19th-century research into the comparative capabilities of different races, a subject that piqued the interest of several influential scientists.
Chief among these was the American physician Samuel Morton, the founder of craniometry. He collected some 900 skulls from five different ethnic groups which he categorized as Caucasian, African, Native American, Mongolian, and Malay.
From 1839 to 1844, he conducted his research, measuring the sizes of these skulls and averaging the results for each ethnic group. He found that Caucasians had the largest skulls and Africans the smallest. Since the scientific wisdom of the time held brain size to be the sole determinant of intelligence, Morton had supposedly found a scientific basis for the superiority of Caucasians and the inferiority of Africans.
At around the same time, a German anatomist, Friedrich Tiedemann, was performing the same kind of craniometric experiment on five racial groups, but reached an entirely different conclusion. Noting the significant overlap between the skull sizes of all the measured races, it was obvious to Tiedemann that no significant difference exists between the cranial capacities of the races, and thus that there are no scientific grounds for racism.
The important point here is that the data acquired by the two scientists was nearly identical and scientifically sound, yet it led them to diametrically opposite conclusions. Had Tiedemann focused on averages, he might have been tempted to draw the same conclusion as Morton. But the measure that dominated Tiedemann’s attention was the range of skull sizes within each racial group, rather than the group averages that occupied Morton’s.
This difference in interpretive approach gave Tiedemann clear evidence of the injustice of slavery and the oppression of black populations, while Morton gave way to his prejudices in pronouncing all Africans inferior to Caucasians.
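The two lenses can be sketched in a few lines. The measurements below are invented for illustration only, not Morton’s or Tiedemann’s actual figures; the point is simply that two groups can show a gap in their averages while their ranges overlap almost completely.

```python
import statistics

# Hypothetical skull-volume measurements for two groups
# (invented numbers, purely to illustrate the statistical point).
group_a = [82, 85, 87, 90, 92, 95, 97]
group_b = [78, 83, 86, 88, 91, 93, 96]

# Morton's lens: compare group averages in isolation.
mean_gap = statistics.mean(group_a) - statistics.mean(group_b)
print(mean_gap)  # a small gap, easy to over-interpret

# Tiedemann's lens: compare the full ranges of each group.
overlap = (max(min(group_a), min(group_b)), min(max(group_a), max(group_b)))
print(overlap)  # the ranges overlap almost completely
```

The same data supports either story; which summary statistic you choose to look at determines the conclusion you walk away with.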
If there’s something to take from this case and apply to our present scenario, where the interpreters and decision-makers are machines, it is to purge our algorithms of bias. Otherwise, modern technology will only perpetuate dangerous prejudices akin to scientific racism.
Where do we go now?
The problem with our misbehaving machines, as I see it, stems indirectly from the inequality of social structures around the world, and directly from the lack of diversity in leading tech companies such as Google, Facebook, Apple, and Amazon.
Google has the worst black representation out of all major tech companies. In fact, female and minority representation in the tech industry as a whole is abysmally low.
The lack of diversity fosters an environment devoid of checks against the transmission of discriminatory beliefs into products. Anima Anandkumar, a professor at the California Institute of Technology who has previously worked on Amazon’s AI systems, says, “Diverse teams are more likely to flag problems that could have negative social consequences before a product has been launched.”
There is arguably no industry with a greater impact on society than the technology sector. If the firms that comprise it are staffed almost entirely by white men, it is hardly surprising that their products end up biased against other groups.
As long as tech firms are monopolized by homogenous groups of people with little to no diversity among them, biases are going to creep their way into the systems and software these firms create.
It is not easy to fathom all the social implications of biased machines. But if there is one thing we can say about the future, it is that it will centre heavily on technology, and Google, Facebook, and Amazon will likely be key stakeholders in that world. Marching into that future with stereotypical biases baked into our machines would be calamitous, serving only to accentuate the social polarization of modern society.
To err is human, and that is exactly why we need technology.
I just hope that the propensity for error remains confined to the human race, for our sake.