The automatic determination of translation equivalents in lexicography : what works and what doesn't?

dc.contributor.author: Denisova, Michaela
dc.contributor.author: De Schryver, Gilles-Maurice
dc.contributor.author: Rychly, Pavel
dc.date.accessioned: 2025-09-16T13:09:38Z
dc.date.available: 2025-09-16T13:09:38Z
dc.date.issued: 2024-12
dc.description: This paper is part of the publication: Despot, K. Š., Ostroški Anić, A., & Brač, I. (Eds.). (2024). Lexicography and Semantics. Proceedings of the XXI EURALEX International Congress. Institute for the Croatian Language.
dc.description.abstract: Cross-lingual embedding models act as facilitators of lexical knowledge transfer and offer many advantages, notably their applicability to low-resource and non-standard language pairs, making them a valuable tool for retrieving translation equivalents in lexicography. Despite their potential, these models have primarily been developed with a focus on Natural Language Processing (NLP), leading to significant issues, including flawed training and evaluation data, as well as inadequate evaluation metrics and procedures. In this paper, we introduce cross-lingual embedding models for lexicography, addressing the challenges and limitations inherent in the current NLP-focused research. We demonstrate the problematic aspects across three baseline cross-lingual embedding models and three language pairs and outline possible solutions. We show the importance of high-quality data, arguing that it matters more than algorithmic optimisation in enhancing the effectiveness of these models.
dc.description.department: African Languages
dc.description.librarian: am2025
dc.description.sdg: SDG-04: Quality Education
dc.description.uri: https://euralex.org/publications/
dc.identifier.citation: Denisova, M., De Schryver, G.-M., Rychly, P. 2024, 'The automatic determination of translation equivalents in lexicography : what works and what doesn't?', EURALEX Proceedings, pp. 305-316.
dc.identifier.issn: 2521-7100
dc.identifier.uri: http://hdl.handle.net/2263/104347
dc.language.iso: en
dc.publisher: European Association for Lexicography
dc.rights: © European Association for Lexicography. All materials here are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
dc.subject: Translation equivalent determination
dc.subject: Cross-lingual embedding models
dc.subject: Evaluation
dc.title: The automatic determination of translation equivalents in lexicography : what works and what doesn't?
dc.type: Article

Files

Original bundle

Name: Denisova_Automatic_2024.pdf
Size: 358.36 KB
Format: Adobe Portable Document Format
Description: Article

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission