This paper investigates the interdependent complexities of Central Kurdish endoclitic markers from an empirical point of view. Using a corpus-based approach, we analyze the usage and distribution of the pronominal endoclitics and the emphasis clitic =îş in Central Kurdish. To this end, we annotate a corpus of Central Kurdish with part-of-speech tags as well as dedicated tags for the target clitics. Furthermore, we analyze the behavior of these clitics at the word level using data generated by finite-state transducers.
This work documents a new and rare phenomenon and analyzes it using a data-driven corpus approach. The collected data will also be useful for further analyses in the field.
The analysis is available in this Jupyter Notebook. The tagged corpus will be released shortly.
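As a rough illustration of the kind of distributional count described above, the following minimal sketch tallies which part-of-speech categories host a given clitic tag. It is not the code from the notebook. The tag names (`CLITIC_IS`, `PRON_CLITIC`), the triple-based token format, and the placeholder tokens are all assumptions made for this example; the released corpus may use a different tagset and layout.

```python
from collections import Counter

# Hypothetical tag name for the emphasis clitic =îş (an assumption;
# the actual tagset of the corpus may differ).
CLITIC_TAG = "CLITIC_IS"


def clitic_host_distribution(tagged_tokens, clitic_tag=CLITIC_TAG):
    """Count how often `clitic_tag` occurs on tokens of each POS category.

    `tagged_tokens` is an iterable of (token, pos_tag, clitic_tags) triples,
    where `clitic_tags` is a set of clitic tag strings (assumed format).
    """
    counts = Counter()
    for _token, pos, clitics in tagged_tokens:
        if clitic_tag in clitics:
            counts[pos] += 1
    return counts


def relative_frequencies(counts):
    """Turn raw counts into proportions; returns {} for an empty counter."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pos: n / total for pos, n in counts.items()}


# Placeholder tokens only, not real corpus data.
sample = [
    ("tok1", "NOUN", {CLITIC_TAG}),
    ("tok2", "VERB", set()),
    ("tok3", "VERB", {CLITIC_TAG, "PRON_CLITIC"}),
    ("tok4", "NOUN", {CLITIC_TAG}),
]

counts = clitic_host_distribution(sample)
freqs = relative_frequencies(counts)
print(counts)
print(freqs)
```

Keeping the clitic tags in a set per token lets a single word carry both a pronominal clitic and the emphasis clitic, which seems useful when the two can co-occur on one host.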
Please consider citing this paper (bib file):
@inproceedings{ahmadi2023sle,
title = "A Corpus-based Study of Endoclitic \textit{=îş} in {Kurdish}",
author = "Ahmadi, Sina and Anastasopoulos, Antonios and Walther, Géraldine",
booktitle = "Book of abstracts of the 56th Annual Meeting of the Societas Linguistica Europaea",
month = sep,
year = "2023",
address = "Athens, Greece",
publisher = "the 56th Annual Meeting of the Societas Linguistica Europaea"
}