Exploring the Role of Large Language Models in Translation Education: A Systematic Review

Anas M. Alkhofi

doi:10.17507/jltr.1702.28

Keywords:

ChatGPT for translators, LLMs in translation education, machine translation, translation feedback, translation assessment

Abstract

Despite the surge of research interest in generative AI and the rapid public adoption of large language models (LLMs), their role in translation remains unclear. The reliability of these systems and their limitations as machine translation tools continue to be a central concern for translation teachers and students. Systematic reviews that specifically examine LLMs in translation are still scarce. This systematic review aims to address this gap by synthesizing and interpreting recent empirical studies on the use of LLMs in translation across three areas: (1) LLMs’ translation quality, (2) LLM-generated translation feedback, and (3) the integration of LLMs into translation education. Drawing on 55 empirical studies, the findings show that LLMs—particularly GPT—consistently outperform conventional neural MT systems. For general, non-specialized texts, their output often approaches human quality, though human translators maintain a clear advantage in culturally dense, technical, or literary content. Evidence further indicates that LLMs can provide helpful and timely feedback that identifies common linguistic issues, which in turn can assist both teachers and students; however, teacher feedback remains superior in depth, contextual sensitivity, and clarity. As contemporary translation workplaces increasingly rely on MT and AI-supported tools, training students to work with LLMs has become essential for aligning classroom practice with professional expectations. At the same time, educators must balance LLM-assisted tasks with hands-on human translation to ensure that students continue to develop essential linguistic and problem-solving skills.

Author Biography

Anas M. Alkhofi, King Faisal University

Department of English, College of Arts

