HaSPI – Hate Speech Prevention through Imitation

Using imitation learning to effectively combat hate speech on the German-language internet.

Background and Project Content

Hate speech and hateful comments have become a widespread problem on social media, forums, and similar platforms. Tackling this issue requires significant technical and human resources. Automated tools can help detect and remove hateful content, easing the workload and addressing the problem more effectively. However, most existing tools are not optimized for German-language content. HaSPI aims to close this gap by developing moderation software that identifies problematic content and initiates appropriate interventions.

Target Groups

Our software's target audience includes operators of German-language forums who want to promote open and transparent debate while preventing hateful comments. This group consists of both commercial providers and private individuals, such as bloggers.

We also aim to engage the scientific community and developers of content moderation software by demonstrating the benefits of imitation learning, in which software learns from user examples.

Another target group is users of online forums. Detecting and deleting hateful comments improves their user experience and fosters a better discussion culture. This can lead to a greater diversity of opinion and bring back individuals who have shied away from contributing because of online hate.

Methodology and Scope

Our software leverages machine learning methods, trained on a dataset of comments and posts from standard.at ("One Million Posts" corpus, of which about 11,000 will be labelled). The software learns from user actions so that it can then imitate them (i.e., imitation learning). To increase accuracy and performance, we incorporate platform-specific contextual information, such as user history and the thematic focus of content. We evaluate the software's performance against other models currently in use and test whether it can be deployed successfully on standard.at. Additionally, we are developing a user-friendly interface that allows users to easily classify posts as problematic and to detect and delete hateful comments.
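The idea of learning from user actions and adding platform context can be illustrated with a minimal sketch. It is not the HaSPI implementation: the project description does not name a model, so a tiny Naive Bayes classifier is swapped in purely for illustration. The names `featurize` and `ImitationModerator`, the action labels "keep" and "delete", and the toy English posts are all hypothetical; the real training data would be German posts from the corpus described above.

```python
import math
from collections import Counter, defaultdict


def featurize(text, topic, prior_removals):
    """Turn a post plus platform context into a bag of feature tokens.

    `topic` and `prior_removals` stand in for the contextual signals
    (thematic focus, user history); their exact form is an assumption.
    """
    tokens = text.lower().split()
    tokens.append(f"topic={topic}")
    # Bucket the author's history so it acts as a categorical feature.
    tokens.append("history=flagged" if prior_removals > 0 else "history=clean")
    return tokens


class ImitationModerator:
    """Multinomial Naive Bayes that imitates logged moderator actions."""

    def __init__(self):
        self.action_counts = Counter()            # how often each action was taken
        self.token_counts = defaultdict(Counter)  # token frequencies per action
        self.vocab = set()

    def fit(self, demonstrations):
        """Learn from (features, action) pairs recorded from human moderators."""
        for features, action in demonstrations:
            self.action_counts[action] += 1
            self.token_counts[action].update(features)
            self.vocab.update(features)

    def predict(self, features):
        """Return the action a moderator would most likely have taken."""
        total = sum(self.action_counts.values())
        best_action, best_score = None, -math.inf
        for action, count in self.action_counts.items():
            score = math.log(count / total)
            # Laplace smoothing keeps unseen tokens from zeroing the score.
            denom = sum(self.token_counts[action].values()) + len(self.vocab)
            for tok in features:
                score += math.log((self.token_counts[action][tok] + 1) / denom)
            if score > best_score:
                best_action, best_score = action, score
        return best_action


# Hypothetical demonstrations: what moderators did with four toy posts.
demonstrations = [
    (featurize("you people are vermin", "politics", 2), "delete"),
    (featurize("get out of our country vermin", "migration", 1), "delete"),
    (featurize("i disagree with this tax proposal", "politics", 0), "keep"),
    (featurize("thanks for the interesting article", "migration", 0), "keep"),
]

model = ImitationModerator()
model.fit(demonstrations)

print(model.predict(featurize("those vermin should get out", "politics", 3)))
print(model.predict(featurize("thanks for this interesting proposal", "politics", 0)))
```

In this sketch the context tokens are treated exactly like words, which is the simplest way to let user history and topic influence the prediction. A production system would more plausibly use a stronger text model and keep a human moderator in the loop for borderline cases.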

Outcome and Outlook

In the HaSPI project, we are developing a software solution that uses imitation learning to automatically detect and remove hate speech. The software is optimized for the German-language internet and will be freely available. We use data from the standard.at forum; to expand the software’s functionality, future projects will need to train it on additional data corpora (e.g., the dataset from Rheinische Post). To demonstrate the model’s broader applicability, we will also test it on smaller forums and blogs.