This site hosts the written Corpus of the Digor dialect of the Ossetic language (ODC). The total size of the ODC is 2.3 million tokens. The corpus was created in 2014 with the financial support of the Presidium of the Russian Academy of Sciences program “Corpus linguistics” (leader: Arseniy P. Vydrin).
The ODC consists mainly of issues of the Digor newspaper “Digorӕ” from 2006 to 2014. It also includes several issues of the Digor literary journal “Irӕf”, the Nart Sagas, and the following fiction and poetic texts by Digor writers of the 20th century:
All texts included in the corpus are automatically annotated in English and in Russian; the annotation consists of grammatical information and a translation. Automatic annotation covers approximately 84% of all wordforms in the corpus. The corpus uses a modified version of the search engine of the Eastern Armenian National Corpus (EANC), which allows searching by lexeme, by wordform, and by particular grammatical features. To avoid copyright infringement, the full versions of the texts are not available. Instead, a search for a particular wordform, lemma, or set of grammatical tags displays all sentences that contain matching wordforms. Each result sentence can be expanded to show up to 3 sentences before it and 3 sentences after it.
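The search behaviour described above can be illustrated with a minimal Python sketch. It is not the EANC engine or its API: the `Token` class, the `search` and `coverage` functions, the tag names, and the placeholder tokens are all hypothetical and exist only to show how matching by wordform, lemma, or tags could be combined with a context window of 3 sentences on each side.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """One annotated token (hypothetical model, not the real corpus format)."""
    wordform: str
    lemma: Optional[str] = None      # None means the analyser found no analysis
    tags: frozenset = frozenset()    # grammatical tags, e.g. {"N", "pl"}


def matches(token, wordform=None, lemma=None, tags=()):
    """Return True if the token satisfies every criterion that was given."""
    if wordform is not None and token.wordform.lower() != wordform.lower():
        return False
    if lemma is not None and token.lemma != lemma:
        return False
    return set(tags) <= token.tags   # all requested tags must be present


def search(sentences, *, wordform=None, lemma=None, tags=(), context=3):
    """Find sentences with a matching token and attach the surrounding context."""
    hits = []
    for i, sentence in enumerate(sentences):
        if any(matches(t, wordform, lemma, tags) for t in sentence):
            start = max(0, i - context)
            end = min(len(sentences), i + context + 1)
            hits.append({
                "index": i,
                "before": sentences[start:i],
                "hit": sentence,
                "after": sentences[i + 1:end],
            })
    return hits


def coverage(sentences):
    """Share of tokens that received an analysis (a lemma) from the analyser."""
    tokens = [t for s in sentences for t in s]
    if not tokens:
        return 0.0
    return sum(t.lemma is not None for t in tokens) / len(tokens)


# Made-up placeholder data: ten one-token "sentences".
corpus = [[Token(f"w{i}", lemma=f"lex{i}", tags=frozenset({"N"}))] for i in range(10)]
corpus[5] = [Token("w5", lemma="target", tags=frozenset({"N", "pl"}))]
corpus[9] = [Token("w9")]  # a token left without analysis

hits = search(corpus, lemma="target")
```

In this sketch a hit near the start or end of a text simply gets a shorter context, and `coverage` corresponds to the kind of figure quoted above as the share of annotated wordforms.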
We are grateful to the editorial boards of the newspaper “Digorӕ” and the literary journal “Irӕf” for providing us with electronic versions of their publications.
The corpus was created under the supervision of A.P. Vydrin. The Digor dictionary was adapted for the purposes of the corpus by L.V. Klimenchenko and A.P. Vydrin. The automated morphological analysis tool UniParser was developed by T.A. Arkhangelskiy. Scanning of some texts and editing of the “Digorӕ” texts were done by M.V. Darchieva.
The corpus is currently maintained and developed by A.P. Vydrin.
Comments can be sent to Arseniy Vydrin by email: senjacim@gmail.com
We would be grateful for any published texts in the Digor dialect. Texts are accepted in any text format (doc, docx, rtf, txt, odt) at the following email addresses: ossetic.studies@gmail.com and senjacom@gmail.com. We guarantee that copyright will be protected. The texts we receive will be used only for corpus purposes.