A Computational Dialectology Approach to Mapping Bidayuhic Varieties in Tayan Hulu Using Gabmap
DOI:
https://doi.org/10.17507/jltr.1702.09
Keywords:
Bidayuhic, Austronesian language, dialectometry, Gabmap, computational dialectology
Abstract
This study examines linguistic variation in the Bidayuhic language of Tayan Hulu, West Kalimantan, Indonesia, using a computational dialectological approach implemented in Gabmap. Levenshtein distance is applied to 491 lexical items collected at six observation sites to measure lexical and phonological differences. The findings show that Bidayuhic forms a linguistic continuum in which dialectal variation is not entirely aligned with geographical boundaries. Instead, lexical and phonological differences are influenced by language contact, social mobility, and cultural interaction. The study identifies the merger of the Proto-Malayo-Polynesian (PMP) phonemes R and l into /r/, /ɣ/, and /h/, reflecting phonological innovations in Bidayuhic. It also observes ablaut in verb morphology, which distinguishes transitive from intransitive verb forms. Cluster analysis via Multidimensional Scaling (MDS) and probabilistic clustering revealed two main groups, confirming that variation is gradual rather than regionally segmented. The clustering remained stable even after probabilistic disturbances of 0.8 were added, validating the effectiveness of Gabmap in dialect classification. These results emphasize that Bidayuhic variation is shaped more by sociolinguistic interaction than by geographical factors. The study highlights the role of Gabmap in linguistic mapping and offers a methodological model for mapping local languages in Indonesia.
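The distance measure named in the abstract can be illustrated with a minimal Python sketch. It is not the study's procedure or Gabmap's implementation: the function names, the site labels, and the word forms are invented placeholders, and dividing by the length of the longer form is an assumed normalization chosen for simplicity. The sketch computes a unit-cost Levenshtein distance between two transcriptions and averages it over shared glosses to obtain one distance per pair of sites.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Assumed normalization: divide by the length of the longer form."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def site_distances(wordlists: dict) -> dict:
    """Mean normalized distance over the glosses shared by each pair of sites."""
    result = {}
    for s1, s2 in combinations(sorted(wordlists), 2):
        shared = wordlists[s1].keys() & wordlists[s2].keys()
        if not shared:
            continue  # no comparable items for this pair
        total = sum(normalized_distance(wordlists[s1][g], wordlists[s2][g])
                    for g in shared)
        result[(s1, s2)] = total / len(shared)
    return result


# Invented placeholder strings, not forms recorded in the study.
wordlists = {
    "site_A": {"gloss1": "rata", "gloss2": "batu"},
    "site_B": {"gloss1": "hata", "gloss2": "batu"},
    "site_C": {"gloss1": "ɣata", "gloss2": "bato"},
}

distances = site_distances(wordlists)
for pair, value in sorted(distances.items()):
    print(pair, round(value, 3))
```

A site-by-site matrix of this kind is the usual input to MDS and clustering; the sketch stops at the matrix and treats each Unicode code point as one segment, which real phonetic transcriptions with diacritics would not satisfy without prior tokenization.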