Aleksi Sahala

University of Helsinki

Code-switching in Sumerian Emesal texts: A computational approach

One of the great remaining mysteries in the study of the Sumerian language is the nature and origin of its only known variety, Emesal, which made a somewhat counterintuitive appearance in ancient Mesopotamian texts only after the extinction of Sumerian as a spoken language around 2000 BCE. Although it is well known that Emesal words mostly occur in liturgical texts and lamentations, it is not yet understood what conditions triggered the code-switching from Sumerian into Emesal within certain parts of these texts, or why Emesal became a part of certain Sumerian compositions in the first place. With the most comprehensive digital collection of Emesal texts now available in the Open Richly Annotated Cuneiform Corpus, we aim to analyze these texts with Natural Language Processing methods to shed light on both questions. Our team’s preliminary philological analysis shows that in at least one isolated case the code-switching between the language varieties is not spontaneous, but rather constrained by semantic differences between the Emesal and Standard Sumerian words, which may indicate that at least some Emesal words were used in restricted contexts. We aim to determine how regular the code-switching is across the entire Emesal vocabulary by using count-based word embedding models, with which we measure how predictable the choice of language variant is in certain linguistic contexts and in certain subsets of our corpus. Our preliminary results indicate a varying degree of predictability in the use of Emesal words, and we hope that further philological and statistical analysis of these results will allow us to discover and explain previously unnoticed regularities behind the code-switching.
By identifying and explaining the contexts in which code-switching between Sumerian and Emesal takes place, and especially by studying whether those contexts varied across time periods, text genres, or geographical regions, we can try to understand this long-dead language on its own terms and improve our knowledge of the origins and development of Emesal.
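
As a rough illustration of what a count-based comparison of contexts can look like, the following minimal sketch builds co-occurrence vectors, reweights them with positive pointwise mutual information (PPMI), and compares words by cosine similarity. The abstract does not specify the weighting scheme, window size, or similarity measure, so PPMI, the window of 2, and cosine are assumptions made for this example only. The tokens (`A_std`, `A_eme`, `B_std`, and the context words) are invented placeholders, not data from the corpus.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_counts(sentences, window=2):
    """Count how often each word appears within `window` tokens of another."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def ppmi_vectors(counts):
    """Turn raw co-occurrence counts into sparse PPMI vectors."""
    total = sum(sum(ctx.values()) for ctx in counts.values())
    row_sums = {w: sum(ctx.values()) for w, ctx in counts.items()}
    col_sums = Counter()
    for ctx in counts.values():
        col_sums.update(ctx)
    vectors = {}
    for w, ctx in counts.items():
        vec = {}
        for c, n in ctx.items():
            pmi = math.log2(n * total / (row_sums[w] * col_sums[c]))
            if pmi > 0:  # keep only positive associations
                vec[c] = pmi
        vectors[w] = vec
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Invented placeholder corpus: A_std / A_eme stand for a hypothetical
# Standard Sumerian word and its hypothetical Emesal counterpart.
sentences = [
    ["A_std", "temple", "build"],
    ["A_std", "temple", "king"],
    ["A_eme", "lament", "city"],
    ["A_eme", "lament", "weep"],
    ["B_std", "lament", "city"],
]

vectors = ppmi_vectors(cooccurrence_counts(sentences))
sim_pair = cosine(vectors["A_std"], vectors["A_eme"])
sim_other = cosine(vectors["B_std"], vectors["A_eme"])
print(sim_pair, sim_other)
```

In this toy setting, a low similarity between a Standard Sumerian word and its Emesal counterpart would suggest that the two forms occur in distinct contexts, so the choice between them is comparatively predictable; a high similarity would suggest that the forms are interchangeable in context.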

Bio: