Optimizing translation for low-resource languages : efficient fine-tuning with custom prompt engineering in large language models

dc.contributor.author Khoboko, Pitso Walter
dc.contributor.author Marivate, Vukosi
dc.contributor.author Sefara, Joseph
dc.contributor.email u21824772@tuks.co.za
dc.date.accessioned 2025-09-05T05:53:00Z
dc.date.available 2025-09-05T05:53:00Z
dc.date.issued 2025-06
dc.description DATA AVAILABILITY: Data will be made available on request.
dc.description.abstract Training large language models (LLMs) can be prohibitively expensive. However, the emergence of new Parameter-Efficient Fine-Tuning (PEFT) strategies provides a cost-effective approach to unlocking the potential of LLMs across a variety of natural language processing (NLP) tasks. In this study, we selected the Mistral 7B language model as our primary LLM due to its superior performance, which surpasses that of Llama 2 13B across multiple benchmarks. By leveraging PEFT methods, we aimed to significantly reduce the cost of fine-tuning while maintaining high levels of performance. Despite their advancements, LLMs often struggle with translation tasks for low-resource languages, particularly morphologically rich African languages. To address this, we employed customized prompt engineering techniques to enhance LLM translation capabilities for these languages. Our experimentation focused on fine-tuning the Mistral 7B model to identify the best-performing ensemble using a custom prompt strategy. The results obtained from the fine-tuned Mistral 7B model were compared against several models: Serengeti, Gemma, Google Translate, and No Language Left Behind (NLLB). Specifically, Serengeti and Gemma were fine-tuned using the same custom prompt strategy as the Mistral model, while Google Translate and NLLB, which are pre-trained to handle English-to-Zulu and English-to-Xhosa translations, were evaluated directly on the test dataset. This comparative analysis allowed us to assess the efficacy of the fine-tuned Mistral 7B model against both custom-tuned and pre-trained translation models. LLMs have traditionally struggled to produce high-quality translations, especially for low-resource languages. Our experiments revealed that the key to improving translation performance lies in using the correct prompt during fine-tuning. We used the Mistral 7B model to develop a custom prompt that significantly enhanced translation quality for English-to-Zulu and English-to-Xhosa language pairs. After fine-tuning the Mistral 7B model for 30 GPU days, we compared its performance to the NLLB model and the Google Translate API on the same test dataset. While NLLB achieved the highest scores across BLEU, G-Eval (cosine similarity), and chrF++ (F1-score), our results demonstrated that Mistral 7B, with the custom prompt, still performed competitively. Additionally, we showed that our prompt template can improve the translation accuracy of other models, such as Gemma and Serengeti, when applied to high-quality bilingual datasets. This demonstrates that our custom prompt strategy is adaptable across different model architectures and bilingual settings, and is highly effective in accelerating learning for low-resource language translation.
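As an illustration of the PEFT approach described in the abstract, the sketch below shows LoRA-style fine-tuning of Mistral 7B on an instruction-style English-to-Zulu translation prompt. It is a minimal sketch under stated assumptions: the prompt template, hyperparameters, and toy data are illustrative, not the authors' exact configuration, and the paper's custom prompt wording is not reproduced in this record.

# Sketch: LoRA (PEFT) fine-tuning of Mistral 7B for English-to-Zulu translation
# with an instruction-style prompt. Assumes the Hugging Face transformers, peft
# and datasets libraries; prompt template and hyperparameters are illustrative.
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model

BASE_MODEL = "mistralai/Mistral-7B-v0.1"

# Hypothetical custom prompt template for the En-Zul direction.
PROMPT = (
    "### Task: Translate the following English sentence into isiZulu.\n"
    "### English: {src}\n"
    "### isiZulu: {tgt}"
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)

# LoRA adapters on the attention projections keep the number of trainable
# parameters small compared with full fine-tuning.
lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# Toy parallel pair standing in for the bilingual training corpus.
pairs = [{"src": "Good morning", "tgt": "Sawubona ekuseni"}]

def to_features(example):
    # Fill the prompt template and tokenize; labels are created by the collator.
    text = PROMPT.format(**example) + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=512)

train_ds = Dataset.from_list(pairs).map(to_features, remove_columns=["src", "tgt"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mistral7b-en-zul-lora",
                           per_device_train_batch_size=1,
                           num_train_epochs=1, logging_steps=1),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()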
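The evaluation metrics named above (BLEU, chrF++, and a G-Eval-style mean cosine similarity) could be computed roughly as follows. The use of sacrebleu and the LaBSE embedding model is an assumption for illustration, not necessarily the authors' evaluation pipeline.

# Sketch: scoring translations with BLEU, chrF++ and a mean cosine-similarity
# score in the spirit of the G-Eval metric mentioned in the abstract.
# Assumes sacrebleu and sentence-transformers; LaBSE is an assumed embedder.
import sacrebleu
from sentence_transformers import SentenceTransformer, util

hypotheses = ["Sawubona ekuseni"]         # model outputs
references = ["Sawubona ekuseni enhle"]   # gold isiZulu translations

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
# word_order=2 turns chrF into chrF++ (character plus word n-grams).
chrfpp = sacrebleu.corpus_chrf(hypotheses, [references], word_order=2)

embedder = SentenceTransformer("sentence-transformers/LaBSE")
hyp_emb = embedder.encode(hypotheses, convert_to_tensor=True)
ref_emb = embedder.encode(references, convert_to_tensor=True)
mean_cosine = util.cos_sim(hyp_emb, ref_emb).diagonal().mean().item()

print(f"BLEU = {bleu.score:.2f}, chrF++ = {chrfpp.score:.2f}, "
      f"mean cosine similarity = {mean_cosine:.3f}")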
dc.description.department Computer Science
dc.description.librarian hj2025
dc.description.sdg SDG-09: Industry, innovation and infrastructure
dc.description.uri https://www.elsevier.com/locate/mlwa
dc.identifier.citation Khoboko, P.W., Marivate, V. & Sefara, J. 2025, 'Optimizing translation for low-resource languages: efficient fine-tuning with custom prompt engineering in large language models', Machine Learning with Applications, vol. 20, art. 100649, pp. 1-18, doi: 10.1016/j.mlwa.2025.100649.
dc.identifier.issn 2666-8270 (online)
dc.identifier.other 10.1016/j.mlwa.2025.100649
dc.identifier.uri http://hdl.handle.net/2263/104222
dc.language.iso en
dc.publisher Elsevier
dc.rights © 2025 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
dc.subject Large language model (LLM)
dc.subject Parameter-efficient fine-tuning (PEFT)
dc.subject Natural language processing (NLP)
dc.subject Mistral 7B
dc.subject Prompt engineering
dc.subject In-context learning (ICL)
dc.subject English-to-Zulu (En-Zul)
dc.subject English-to-Xhosa (En-Xh)
dc.subject BLEU score
dc.subject F1-score
dc.subject G-Eval (mean cosine-similarity score)
dc.title Optimizing translation for low-resource languages : efficient fine-tuning with custom prompt engineering in large language models
dc.type Article
