Combating hate : how multilingual transformers can help detect topical hate speech
dc.contributor.author | Srikissoon, Trishanta | |
dc.contributor.author | Marivate, Vukosi | |
dc.contributor.email | vukosi.marivate@cs.up.ac.za | en_US |
dc.date.accessioned | 2024-05-30T11:03:48Z | |
dc.date.available | 2024-05-30T11:03:48Z | |
dc.date.issued | 2023 | |
dc.description.abstract | Automated hate speech detection is important to protecting people’s dignity, online experiences, and physical safety in Society 5.0. Transformers are sophisticated pre-trained language models that can be fine-tuned for multilingual hate speech detection. Many studies treat this application as a binary classification problem. Additionally, research on topical hate speech detection uses target-specific datasets containing assertions about a particular group. In this paper, we investigate multi-class hate speech detection using target-generic datasets. We assess the performance of mBERT and XLM-RoBERTa on high- and low-resource languages, with limited sample sizes and class imbalance. We find that our fine-tuned mBERT models are performant in detecting gender-targeted hate speech. Our Urdu classifier produces a 31% lift on the baseline model. We also present a pipeline for processing multilingual datasets for multi-class hate speech detection. Our approach could be used in future work on topically focused hate speech detection for other low-resource languages, particularly African languages, which remain under-explored in this domain. | en_US |
dc.description.department | Computer Science | en_US |
dc.description.librarian | am2024 | en_US |
dc.description.sdg | SDG-09: Industry, innovation and infrastructure | en_US |
dc.description.sponsorship | The ABSA Chair of Data Science, the TensorFlow Award for Machine Learning Grant. | en_US |
dc.description.uri | https://easychair.org/publications/EPiC/Computing | en_US |
dc.identifier.citation | Srikissoon, T. & Marivate, V. 2023, 'Combating hate : how multilingual transformers can help detect topical hate speech', EPiC Series in Computing, vol. 93, pp. 203-215. DOI:10.29007/1cm6. | en_US |
dc.identifier.issn | 2398-7340 (online) | |
dc.identifier.other | 10.29007/1cm6 | |
dc.identifier.uri | http://hdl.handle.net/2263/96304 | |
dc.language.iso | en | en_US |
dc.publisher | EasyChair | en_US |
dc.rights | © 2023 EasyChair. | en_US |
dc.subject | Hate speech | en_US |
dc.subject | Machine learning | en_US |
dc.subject | Natural language processing | en_US |
dc.subject | SDG-08: Decent work and economic growth | en_US |
dc.title | Combating hate : how multilingual transformers can help detect topical hate speech | en_US |
dc.type | Article | en_US |