CDT examines challenges of moderating low-resource languages

Alexandra Reeve Givens, President & CEO of the Center for Democracy & Technology | Official website

The Center for Democracy & Technology (CDT) has spent the past 18 months studying content moderation systems in the Global South, focusing on South Asia, North and East Africa, and South America. The research centered on four low-resource languages: Maghrebi Arabic dialects, Kiswahili, Tamil, and Quechua. These languages are considered low-resource because relatively little training data is available for building accurate AI models in them.

In this comprehensive study, CDT engaged with social media users, digital rights advocates, language activists, tech company representatives, content moderators, and creators through interviews and surveys. Over 560 frequent social media users from various regions participated in an online survey. Roundtables and focus group sessions were also organized to better understand the digital environments of these regions.

A significant challenge the researchers identified was the culture of secrecy surrounding technology companies' content moderation practices, a field that remains largely closed to public scrutiny. Despite these obstacles, CDT's findings offer valuable insight into content moderation challenges in the Global South.

The report compares insights from four case studies and offers recommendations for improving content moderation in low-resource languages. It emphasizes that each language carries its own unique history and linguistic features that must be acknowledged when discussing general content moderation strategies.

One key finding is that global tech companies generally take one of two approaches to content moderation: a global approach that applies uniform policies worldwide, or a local approach that tailors policies to specific regions. The local approach can sometimes constrain users who are challenging local norms that violate their rights. A notable exception is JamiiForums' "multi-country approach" in Kiswahili-speaking regions, which assigns moderators who are native speakers in order to improve user satisfaction.

Users across all four case studies expressed concerns about misinformation and hate speech on social media platforms in their regions. Tamil and Quechua users particularly noted issues with wrongful removal of their content.

Four major outsourcing service providers—Teleperformance, Majorel, Sama, and Concentrix—dominate the market for moderating the low-resource languages examined in this study. Moderators are often overworked without adequate psychological support or wellbeing breaks, and hiring processes lack diversity.

User resistance to perceived undue moderation is common in the Global South; tactics include changing letters within words or using emojis creatively, a practice known as "algospeak."

Finally, many natural language processing (NLP) researchers from these regions have developed tools aimed at improving moderation, but they feel their expertise is underutilized by tech companies, even though it could significantly strengthen current systems if properly leveraged.
