Hate speech and misinformation on social media can have a devastating impact, particularly on marginalized communities. But what if AI could be used to combat such harmful content? That’s the goal of a team of University of Toronto researchers who were awarded a Catalyst Grant by the Data Sciences Institute to develop an AI system that addresses the marginalization of communities in data-centric systems, including social media platforms like Twitter.
The collaborative research team consists of Professors Syed Ishtiaque Ahmed, Department of Computer Science (Faculty of Arts & Science); Shohini Bhattasali, Department of Language Studies (University of Toronto Scarborough); and Shion Guha (Faculty of Information). The team intends to make content moderation more inclusive by involving the communities affected by harmful or hateful content on social media. It is collaborating with two Canadian non-profit organizations: the Chinese Canadian National Council for Social Justice (CCNC-SJ) and the Islam Unravelled Anti-Racism Initiative.
Professor Ahmed says that historically marginalized groups are most affected by content moderation failures because they are underrepresented among human moderators and less of their data is available to algorithms. He says, “While most social media platforms have taken measures to moderate and identify harmful content and limit its spread, human moderators and AI algorithms often fail to identify it correctly and take proper actions.”
The team plans to design, develop, deploy, and evaluate the proposed system to address potentially Islamophobic and Sinophobic posts on Twitter. The AI system aims to democratize content moderation by including diverse voices in two primary ways. First, it allows users to contest a decision, which makes the moderation process more transparent and trustworthy for users who are victims of online harms. Second, it takes user input and retrains machine learning (ML) models so that the positions users take when contesting decisions are reflected in the prescreening ML system.
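To make the contest-and-retrain idea concrete, here is a minimal sketch in Python. It is an illustration only, not the team's system: the `KeywordModel` word-count classifier, the `apply_contestations` helper, and the placeholder post texts are all hypothetical stand-ins for a real prescreening model and real community-reviewed data.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Post:
    text: str
    label: str  # "harmful" or "ok"


@dataclass
class KeywordModel:
    """Toy stand-in for a prescreening ML model (hypothetical)."""
    harmful_counts: Counter = field(default_factory=Counter)
    ok_counts: Counter = field(default_factory=Counter)

    def train(self, posts):
        # Rebuild per-label word counts from scratch on every retraining.
        self.harmful_counts.clear()
        self.ok_counts.clear()
        for post in posts:
            target = self.harmful_counts if post.label == "harmful" else self.ok_counts
            target.update(post.text.lower().split())

    def predict(self, text):
        # Flag a post when its words were seen more often in harmful posts.
        words = text.lower().split()
        harmful = sum(self.harmful_counts[w] for w in words)
        ok = sum(self.ok_counts[w] for w in words)
        return "harmful" if harmful > ok else "ok"


def apply_contestations(training_posts, contestations):
    """Return a new training set that includes community-contested labels."""
    return training_posts + [Post(text, label) for text, label in contestations]


# Placeholder data; a real system would use community-labeled posts.
training = [
    Post("have a nice day", "ok"),
    Post("you are a placeholder_insult", "harmful"),
    Post("community festival this weekend", "ok"),
]
model = KeywordModel()
model.train(training)

contested_text = "coded_phrase about the community"
before = model.predict(contested_text)  # the model misses the coded phrase

# A community member contests the decision and supplies the correct label.
training = apply_contestations(training, [(contested_text, "harmful")])
model.train(training)
after = model.predict(contested_text)

print(before, "->", after)
```

In this toy run, the contested post is first passed as acceptable and is flagged after retraining, which is the feedback loop the paragraph above describes. A production system would also need a review step to decide which contestations to accept before retraining.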
Ahmed explains, “Annotating data becomes challenging when the annotators are divided in their opinions. Resolving this issue democratically requires involving different communities, which is currently not common in data science practices. This project addresses the issue by designing, developing, and evaluating a pluralistic framework of justification and contestation in data science while working with two historically marginalized communities in Toronto.”
The AI system will integrate the wisdom, knowledge, and experiences of community members into the process of reducing hateful content directed toward their communities. The team is using a participatory data curation methodology: researchers learn how the different kinds of harmful content affecting a community are characterized, and they include members of that community in the data labeling process to ensure data quality.
“We are grateful to DSI for their generous support for this project. The DSI community has also helped us connect with people conducting similar research and learn from them. Thanks to the wonderful DSI community, whose mission includes innovating and adopting various data-centric approaches to social justice,” says Ahmed.
The research project is a promising initiative to address the issue of harmful content on social media and is expected to have far-reaching impacts beyond the two communities it is currently focusing on.