Can ChatGPT Analyze Textual Data? The Case of Conceptual Metaphors in Short Stories of Language Assessment
DOI: https://doi.org/10.17507/jltr.1605.24

Keywords: ChatGPT, conceptual metaphors, textual analysis, short stories, language assessment

Abstract
ChatGPT, a modern artificial intelligence (AI) chatbot, has emerged as an unprecedented breakthrough in multiple domains traditionally dominated by humans. Its ability to engage in human-like conversations has the potential to influence the fields of linguistics and education. The fundamental functions of ChatGPT in teaching and learning have been the subject of some research, but its application in textual analysis has received scant attention. This study investigates how ChatGPT can assist in analyzing conceptual metaphors (CMs) in short stories used in language assessment. Based on the Conceptual Metaphor Theory (CMT) of Lakoff and Johnson (1980), the study identified the structural, orientational, and ontological metaphors in 22 short stories from the book Tests and Us 2, first by the AI program ChatGPT (GPT-4), then refined by the researchers and validated by linguistic experts. The results showed a total of 250 conceptual metaphors: 131 structural, 64 ontological, and 55 orientational. When validated by human specialists, GPT-4 accurately recognized conceptual metaphors in 81.2% of the cases, amounting to 203 instances. The most frequent error made by GPT-4 was classifying non-metaphoric expressions as metaphoric, followed by providing unclear explanations and classifying metaphoric expressions as non-metaphoric. Other errors included overly general or overly non-literal interpretations, mismatched categorization, and incorrect mapping of source or target domains. Our study shows that ChatGPT, despite its controversial position in academic settings, can serve as a relatively reliable tool for aiding the analysis of textual data.
References
Anh, D. T. M. (2017). An investigation into conceptual metaphors denoting life in American and Vietnamese short stories. Journal of Development Research, 1(1), 29-35. https://doi.org/10.28926/jdr.v1i1.16
Baruch, Y. (2006). Role-play teaching acting in the classroom. Management Learning, 37(1), 43–61. https://doi.org/10.1177/1350507606060980
Bednarek, M. A. (2005). Construing the world: Conceptual metaphors and event construals in news stories. Metaphorik.de, 6-32.
Boje, D. M. (2017). The storytelling organization: A study of story performance in an office-supply firm. In S. Minahan (Ed.), The aesthetic turn in management (pp. 211-231). Routledge.
Choi, S. R., & Lee, M. (2023). Transformer architecture and attention mechanisms in genome data analysis: A comprehensive review. Biology, 12(7), 1033. https://doi.org/10.3390/biology12071033
Cohen, J. D., & Adams, D. (2024). Text understanding in GPT-4 versus humans. Royal Society Open Science, 11(5), 241313. https://doi.org/10.1098/rsos.241313
de Kok, T. (2024). ChatGPT for textual analysis? How to use generative LLMs in accounting research. Management Science, Forthcoming. Available at http://dx.doi.org/10.2139/ssrn.4429658
Geng, H. (2023). A seed in a desert or oasis. In V. Nimehchisalem & H. Geng (Eds.), Tests & Us - A Collection of Real Stories (Vol. 2, pp. 15-19). Generis Publishing.
Gibbs, R. W., Jr. (2011). Evaluating conceptual metaphor theory. Discourse Processes, 48(8), 529-562. https://doi.org/10.1080/0163853X.2011.606103
Gray, D. E. (2007). Facilitating management learning developing critical reflection through reflective tools. Management Learning, 38(5), 495–517. https://doi.org/10.1177/1350507607083204
Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267-297. https://doi.org/10.1093/pan/mps028
Pragglejaz Group. (2007). MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol, 22(1), 1-39. https://doi.org/10.1080/10926480709336752
Hassan, T. A., Hollander, S., Van Lent, L., & Tahoun, A. (2019). Firm-level political risk: Measurement and effects. The Quarterly Journal of Economics, 134(4), 2135-2202. https://doi.org/10.1093/qje/qjz021
Hibbert, P. (2013). Approaching reflexivity through reflection issues for critical management education. Journal of Management Education, 37(6), 803–827. https://doi.org/10.1177/1052562912467757
Johansson Falck, M., & Okonski, L. (2023). Procedure for identifying metaphorical scenes (PIMS): The case of spatial and abstract relations. Metaphor and Symbol, 38(1), 1-22. https://doi.org/10.1080/10926488.2022.2062243
Kearney, R. (2002). On stories. Psychology Press. https://doi.org/10.4324/9780203453483
Lakoff, G., & Johnson, M. (1980). Metaphors We Live By. University of Chicago Press.
Lakoff, G., & Johnson, M. (2008). Metaphors We Live By. University of Chicago Press.
McGlone, M. S. (2007). What is the explanatory value of a conceptual metaphor? Language & Communication, 27(2), 109-126. https://doi.org/10.1016/j.langcom.2006.02.016
Nimehchisalem, V. (2023). Unforgiven. In V. Nimehchisalem & H. Geng (Eds.), Tests & Us - A Collection of Real Stories (Vol. 2, pp. 11-14). Generis Publishing.
Nimehchisalem, V., & Geng, H. (Eds.). (2023). Tests & Us - A Collection of Real Stories (Vol. 2). Generis Publishing.
Qammar, A., Wang, H., Ding, J., Naouri, A., Daneshmand, M., & Ning, H. (2023). Chatbots to ChatGPT in a cybersecurity space: Evolution, vulnerabilities, attacks, challenges, and future recommendations. arXiv preprint arXiv:2306.09255. https://doi.org/10.48550/arXiv.2306.09255
Reiss, M. V. (2023). Testing the reliability of ChatGPT for text annotation and classification: A cautionary remark. arXiv preprint arXiv:2304.11085. https://doi.org/10.48550/arXiv.2304.11085
Schultz, P. L., & Quinn, A. S. (2014). Lights, camera, action! Learning about management with student-produced video assignments. Journal of Management Education, 38(2), 234–258. https://doi.org/10.1177/1052562913488371
Tan, Y., Min, D., Li, Y., Li, W., Hu, N., Chen, Y., & Qi, G. (2023, October). Can ChatGPT replace traditional KBQA models? An in-depth analysis of the question answering performance of the GPT LLM family. In International Semantic Web Conference (pp. 348-367). Springer Nature Switzerland.
Taylor, S. S., & Statler, M. (2014). Material matters increasing emotional engagement in learning. Journal of Management Education, 38(4), 586–607. https://doi.org/10.1177/1052562913489976
Ünlü, C. (2023). Interpretutor: Using large language models for interpreter assessment. In Proceedings of the International Conference HiT-IT (pp. 78-96).
Zhu, D., Chen, J., Shen, X., Li, X., & Elhoseiny, M. (2023). MiniGPT-4: Enhancing vision-language understanding with advanced large language models. arXiv preprint arXiv:2304.10592. https://doi.org/10.48550/arXiv.2304.10592