🌱 Edge#182: What Responsible AI starts with
🔥 These are the LAST three days to subscribe to TheSequence Edge, our Premium newsletter that keeps you up to date with everything important in the ML & AI world, at a massive 50% discount. Subscribe and share with your colleagues! Thank you for your support.🔥
💥 What’s New in AI: Responsible AI starts with ethical data labeling
The rise of artificial intelligence has been nothing short of staggering. In just a few years, AI has made inroads into a number of industries, transforming how we live and work.
Healthcare, self-driving cars, the stock market – these are just a few of the areas where AI is making a major impact. But as the AI industry continues to grow, there are more and more questions about ethics and bias in algorithms, and about what systemic changes would get us closer to Responsible AI.
It’s safe to say that Responsible AI should begin with groundwork that’s inclusive from the start, which brings us to data labeling.
Let’s explore why it’s so important that data labeling companies take steps to ensure their workers are treated fairly and that the labels they produce help keep algorithms free from bias. As explored in a recent TechTimes article, Isahit, Toloka, CloudFactory, and Sama are a few of the companies leading the way in this regard.
What is Responsible AI?
Responsible AI is a governance framework that ensures AI is designed, developed, and deployed with good intent: empowering employees and businesses, impacting customers and society fairly, eliminating bias, and engendering trust.
“Responsible AI is a very important topic, but it is only as good as it is actionable,” said Olga Megorskaya, CEO of Toloka AI, a global data labeling platform. “If you are a business, applying AI responsibly means constantly monitoring the quality of the models you have deployed in production at every moment and understanding where the decisions made by AI come from. You must understand the data on which these models were trained and constantly update the trained models to the current context in which the model is operating. Secondly, responsible AI means responsible treatment of the people who are actually acting behind the scenes of training AI models. And this is where we tightly cooperate with many researchers and universities.”
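To make the first half of that advice concrete, here is a minimal sketch of production quality monitoring, assuming a simple audit loop in which a fresh sample of the model's live predictions is relabeled by humans (for example, through a crowdsourcing platform) and compared against the model's offline baseline. The function name and the 5% tolerance are illustrative choices on our part, not part of any platform's API.

```python
from statistics import mean

def check_model_quality(live_predictions, live_labels, baseline_accuracy, tolerance=0.05):
    """Flag a deployed model whose live accuracy drifts below its offline baseline.

    live_predictions / live_labels: model outputs and fresh human labels on an
    audit sample. The tolerance is an illustrative threshold, not a standard.
    """
    live_accuracy = mean(
        1.0 if prediction == label else 0.0
        for prediction, label in zip(live_predictions, live_labels)
    )
    needs_retraining = live_accuracy < baseline_accuracy - tolerance
    return live_accuracy, needs_retraining

# Example: a model validated offline at 92% accuracy, audited on five fresh samples
accuracy, drifted = check_model_quality(
    live_predictions=["spam", "ham", "ham", "spam", "ham"],
    live_labels=["spam", "ham", "spam", "spam", "ham"],
    baseline_accuracy=0.92,
)
print(f"live accuracy={accuracy:.2f}, needs retraining={drifted}")
```

In practice the audit sample would be much larger and drawn continuously, but the loop stays the same: relabel, compare, retrain when the gap grows.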
How to Treat Annotators Responsibly
The so-called gig economy has been a boon for many businesses, but it has also created a new class of workers who are paid very little and often have few benefits. This is especially true for those who do data labeling for AI algorithms.
A recent Bloomberg article featured Professor Saiph Savage, recognized for her research on AI for good that empowers digital workers. Her project, carried out in collaboration with Toloka, was named the most impactful on the IRCAI Global Top 100, a list of the 100 AI projects doing the most worldwide to address the United Nations’ 17 Sustainable Development Goals (SDGs).
By interviewing digital workers, the team found that workers faced unfair evaluations that cost them wages and, in some cases, their jobs. The team then designed an intelligent system that uses deep learning to detect when a worker receives an unfair review and, armed with that information, recommends learning tasks that reinforce and develop specific skills.
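The team's actual deep learning system isn't reproduced here, but the core idea can be sketched as a text classifier trained on past reviews labeled fair or unfair, then used to flag new evaluations that lack an actionable justification. The example reviews, their labels, and the TF-IDF plus logistic regression stand-in below are purely illustrative.

```python
# Toy stand-in for the unfair-review detector: the real system used deep
# learning; here TF-IDF + logistic regression plays its role, and the
# reviews and labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "Rejected: answer did not follow the updated guidelines",
    "Rejected: no reason given",
    "Rejected: task expired before the submission was reviewed",
    "Rejected: label disagrees with the gold answer on item 12",
]
labels = [0, 1, 1, 0]  # 1 = likely unfair (no actionable justification)

detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
detector.fit(reviews, labels)

new_review = "Rejected: reviewer gave no explanation"
if detector.predict([new_review])[0] == 1:
    print("Flag for appeal and recommend skill-building tasks")
```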
This is a great example of AI being used for good rather than perpetuating bias. By giving data labelers a voice and empowering them with tools, we can build a fairer, more just society.
How to Eliminate Bias via Data Labeling
Another key issue when it comes to ethics and AI is bias. We've seen time and time again how AI can perpetuate bias and discrimination.
A 2012 study co-authored by an FBI technologist found that facial recognition systems used by the police were significantly more accurate at identifying Caucasian subjects than they were at identifying African Americans.
This kind of bias can have serious real-world consequences: predictive policing can intensify surveillance of certain groups, and AI-powered hiring tools can reinforce gender and racial bias in the workplace. Even everyday user experience suffers when people of color find interactions with virtual assistants fraught with frustration.
The good news is that there are ways to reduce bias in AI. One key step is to use a diverse, inclusive crowd of labelers.
If, for instance, only white people label the data for a facial recognition system, that system is likely to perform better on white faces than on darker skin tones. But if a diverse group of people from different backgrounds labels the data, the system is likely to be more accurate for everyone.
The same holds for gender: a sentiment analysis algorithm whose data was labeled only by men is likely to be biased against women, whereas data labeled by a diverse group tends to yield an algorithm that works well for everyone.
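One way to turn these claims into something testable is a per-group accuracy audit: break a model's accuracy down by demographic group and treat a large gap between groups as a red flag that the training labels, or the crowd that produced them, lacked diversity. A minimal sketch, with invented group names and records:

```python
# Per-group accuracy audit; the group names and records are invented.
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, predicted_label, true_label) tuples."""
    correct, total = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        correct[group] += int(predicted == actual)
    return {group: correct[group] / total[group] for group in total}

audit = [
    ("group_a", "match", "match"),
    ("group_a", "match", "match"),
    ("group_a", "no_match", "match"),
    ("group_b", "match", "match"),
    ("group_b", "no_match", "match"),
    ("group_b", "no_match", "match"),
]
per_group = accuracy_by_group(audit)
print(per_group)  # ≈ {'group_a': 0.67, 'group_b': 0.33}
print("accuracy gap:", max(per_group.values()) - min(per_group.values()))
```

A gap like the one above, measured on real evaluation data, is exactly the kind of signal a diverse labeling crowd helps shrink.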
Ethical Data Labeling: Leading the Way
There are data labeling companies that are beginning to take steps to address these issues.
The aforementioned Toloka works with researchers and universities to establish standards for Responsible AI, making transparency, empowerment, and fairness central to its business model. It also aims for social impact through the kinds of crowdsourcing tasks it offers – such as enabling AI that detects whether a garbage receptacle is full so that collection services can be organized, or helping to detect missing persons in a forest.
Isahit is another data labeling company that is making a positive social impact. The company operates on a model of impact sourcing, which is "a practice of social inclusion through employment, whereby we give work to young people from disadvantaged backgrounds."
This means they hire students, people looking for a job, and even entrepreneurs in developing countries who wouldn't otherwise have access to work.
CloudFactory and Sama are other examples of responsible data labeling companies.
The Bottom Line
The data labeling industry is growing rapidly, but there are still systemic issues we have to confront sooner rather than later: the exploitation of workers and the problem of algorithmic bias. When biased systems are deployed, they can have a profound impact on people's lives. This problem needs to be addressed, and ethical data labeling can play a central role in reducing AI bias.