Multimodal machine learning model for detecting Kiswahili hate speech on social media

Banchale Adhi Gufu; Edward Ombui; Audrey Mbogho

doi:10.21622/ACE.2026.06.1.2076

Abstract

Social media has become a primary source of information across the world, shaping public discourse, opinion formation, and everyday decision-making in real time. Discourse on social media platforms takes place in both high-resource and low-resource languages. However, existing hate speech detection research has focused primarily on high-resource languages, leaving a significant gap and a need for detection systems tailored to low-resource languages in general and Kiswahili content in particular. The situation is exacerbated by the fact that hate speech on social media is increasingly expressed through multimodal communication, shifting from purely text-based insults to more complex forms in which harmful intent is conveyed through combinations of text, images, videos, audio, and other rich media.

To address the existing gap, we created ChujaHate, which integrates socio-cultural qualitative analysis with supervised machine learning (ML) for Kiswahili text-image hate speech detection. A mixed-methods approach was used: qualitative methods incorporated socio-cultural insights into understanding hate speech, while quantitative methods facilitated model development, evaluation, and generalisation of findings. Over 115,204 publicly available Kiswahili text samples were extracted from X (formerly Twitter) and Facebook and annotated into nine classes (disability, gender, race, religion, sexual, tribe, chronic disease, not hate, offensive), and 2,607 images were annotated as hate or non-hate. Text data were pre-processed and represented using TF-IDF and Word2Vec.
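As an illustration of the TF-IDF representation mentioned above, the following is a minimal sketch in plain Python. The abstract does not state which TF-IDF variant or library was used, so the unsmoothed formula (term frequency times log of inverse document frequency), the function name `tf_idf`, and the toy tokens are all assumptions made for illustration only.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenised document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Library implementations often use smoothed variants instead.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc)
        vectors.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy corpus of already tokenised, neutral Kiswahili phrases.
corpus = [
    ["habari", "ya", "asubuhi"],
    ["habari", "ya", "jioni"],
]
vectors = tf_idf(corpus)
```

Terms that occur in every document (here "habari" and "ya") receive a weight of zero, while terms that distinguish a document ("asubuhi", "jioni") receive a positive weight. That contrast is the property that makes TF-IDF useful as a classifier input.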

During multimodal integration, late fusion was performed at the decision level by combining the unimodal softmax probability distributions produced by the text and image models to generate the final multimodal hate speech prediction. The best-performing text classification model was the Bidirectional LSTM, which achieved an F-score of 97.32%, while the ResNet50V2 image model attained an F-score of 89.77%. Furthermore, the multimodal fusion model that integrated the text (Bidirectional LSTM) and image (ResNet50V2) modalities achieved an F-score of 92.11% (precision: 91.87%, recall: 92.35%), demonstrating the effectiveness of combining modalities for improved detection. This work advances Natural Language Processing (NLP) by introducing the ChujaHate model, with implications for moderating hate speech content, supporting safer digital spaces, and reducing online harm for the Kiswahili community.
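The decision-level late fusion described above can be sketched as follows. The abstract does not specify the combination rule or how the nine text classes are aligned with the two image classes, so this sketch assumes that both models already emit probabilities over the same label set (index 0 = non-hate, index 1 = hate) and that they are combined by a weighted average. The names `late_fusion`, `predict`, `f_score`, and the `text_weight` parameter are hypothetical. The last helper simply checks that the reported F-score is consistent with the reported precision and recall under the usual harmonic-mean definition.

```python
from typing import List, Sequence


def late_fusion(text_probs: Sequence[float],
                image_probs: Sequence[float],
                text_weight: float = 0.5) -> List[float]:
    """Weighted average of two softmax distributions over the same labels.

    Assumption: equal weighting by default; the paper's rule may differ.
    """
    if len(text_probs) != len(image_probs):
        raise ValueError("both models must score the same label set")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    fused = [text_weight * t + (1.0 - text_weight) * i
             for t, i in zip(text_probs, image_probs)]
    total = sum(fused)
    # Renormalise so the fused scores again sum to one.
    return [p / total for p in fused]


def predict(probs: Sequence[float]) -> int:
    """Index of the most probable label."""
    return max(range(len(probs)), key=probs.__getitem__)


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical unimodal outputs for one text-image post.
text_probs = [0.30, 0.70]   # text model leans towards "hate"
image_probs = [0.60, 0.40]  # image model leans towards "non-hate"
fused = late_fusion(text_probs, image_probs)
label = predict(fused)

# Consistency check on the figures reported in the abstract.
reported_f = f_score(91.87, 92.35)
```

With equal weights the fused distribution is approximately [0.45, 0.55], so the post is labelled as hate even though the image model alone disagrees. The harmonic mean of the reported precision and recall is about 92.11, which matches the reported multimodal F-score.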

Received 10 April 2026

Accepted 01 June 2026

Published 08 June 2026

Keywords

Hate speech detection, Late fusion, Low-resource languages, Machine learning, Multimodal data, Kiswahili, Natural Language Processing (NLP)

Advances in Computing and Engineering

E-ISSN: 2735-5985

P-ISSN: 2735-5977

Published by:

Academy Publishing Center (APC)

Arab Academy for Science, Technology and Maritime Transport (AASTMT)

Alexandria, Egypt

ace@aast.edu
