Multimodal misinformation detection in the South African social media environment

Publisher

University of Pretoria

Abstract

The spread of computer information systems and personal communication devices, together with the global reach of the Internet and social media platforms such as Facebook and Twitter, has reshaped our lives. These platforms have revolutionised communication and information processing, and people use them to share perspectives and personal messages as text, images, and video. However, alongside this textual and visual content, deceptive material such as misinformation, disinformation, fake news, rumours, and spam misleads users with false information. The widespread dissemination of false information on social media and the Internet therefore poses serious potential hazards to critical areas such as national security, health, and supply chains, potentially leading to shortages of essential commodities. For example, misinformation during the 2020 United States presidential election led to widespread confusion and public distrust, highlighting the need for users to critically assess information before believing it. Misinformation detection (MD) on social media has consequently garnered significant attention and is a growing area of research. Unfortunately, existing methods often do not use textual and visual content together to distinguish related from unrelated information and to classify reported information as real or false, particularly in the South African social media context. Furthermore, these methods depend heavily on manually crafted features and struggle to detect subtle forms of false information. They are also time-consuming, inefficient, and need constant updates to keep up with new trends, which limits their adaptability. In this dissertation, we investigate the efficacy of misinformation detection models in the South African social media environment.
To this end, we propose MMiC, a multimodal misinformation detection model that draws on several information sources, including textual and visual content. First, we use a pre-trained, transformer-based BERT model as an encoder to learn the underlying representation of the natural-language text, and a pre-trained ResNet model to encode the visual content. Second, we combine the textual and visual representations through a fully connected layer and a softmax function to make the prediction. Throughout the investigation, the MMiC model is compared with baseline models and optimised across multiple design cycles. These cycles involve developing the base framework, selecting the optimal combination of textual and visual encoders, and comparing different methods of multimodal feature fusion. The MMiC model is assessed both in a general context and in the local South African context. The experimental results show that the MMiC model performs as well as the best current MD models (88% of the time) on the benchmark dataset (Fakeddit); that adding local samples to the training dataset improves model performance by an average of 29%; and that the MMiC model can accurately spot false information on South African social media sites (89% of the time). The results also show that cultural differences between the settings in which MD models operate affect their performance, and that using multiple forms of communication can enhance knowledge transfer across settings. Incorporating local data when training misinformation detection models is therefore crucial to their performance, and helps ensure that the models are effective and accurate in various settings. We firmly believe that MMiC has the potential to facilitate the development and implementation of a misinformation detection system to combat misinformation in South Africa.
Limitations encountered in this research include difficulty obtaining access to existing MD datasets and state-of-the-art pre-trained models. Recommendations for future research include expanding the subset of the local dataset used in this research to cover samples from all social media platforms, and investigating more complex methods for fusing the multimodal feature vector.
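
As a rough illustration of the fusion step described in the abstract, the following minimal Python sketch joins a textual feature vector and a visual feature vector and passes the result through one fully connected layer and a softmax. It is not the MMiC implementation: the class name LateFusionClassifier, the toy dimensions, the random untrained weights, and the choice of simple concatenation as the fusion method are all assumptions made for illustration. The short lists stand in for features that would come from the pre-trained BERT and ResNet models.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LateFusionClassifier:
    """Hypothetical fusion head: concatenate, fully connected layer, softmax."""

    def __init__(self, text_dim, image_dim, num_classes=2, seed=0):
        rng = random.Random(seed)
        self.text_dim = text_dim
        self.image_dim = image_dim
        fused_dim = text_dim + image_dim
        # Untrained placeholder weights; a real model would learn these.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(fused_dim)]
            for _ in range(num_classes)
        ]
        self.biases = [0.0] * num_classes

    def predict_proba(self, text_features, image_features):
        if len(text_features) != self.text_dim:
            raise ValueError("unexpected text feature length")
        if len(image_features) != self.image_dim:
            raise ValueError("unexpected image feature length")
        # Fusion by concatenation (an assumed choice, one of several options).
        fused = list(text_features) + list(image_features)
        # Fully connected layer: one weighted sum per class.
        logits = [
            sum(w * x for w, x in zip(row, fused)) + b
            for row, b in zip(self.weights, self.biases)
        ]
        return softmax(logits)


# Toy stand-ins for the encoder outputs (real encoders give far longer vectors).
text_features = [0.2, -0.1, 0.4, 0.0]
image_features = [0.5, 0.3, -0.2]

model = LateFusionClassifier(text_dim=4, image_dim=3)
probs = model.predict_proba(text_features, image_features)
print(probs)  # two probabilities, e.g. for the classes "real" and "false"
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how the two modalities flow into a single prediction. Swapping the concatenation line for another fusion method is where the alternatives mentioned in the abstract would plug in.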

Description

Dissertation (MSc (Computer Science))--University of Pretoria, 2024.

Keywords

UCTD, Sustainable Development Goals (SDGs), Multimodal, Machine learning, Misinformation, Fake news detection

Sustainable Development Goals

SDG-09: Industry, innovation and infrastructure
