The Languages of Bulgaria
Bulgarian is the official language of the Republic of Bulgaria. It is spoken by approximately nine million native speakers (as of 2011), mainly in Bulgaria. Bulgarian belongs to the South Slavic language family and forms part of the Balkan linguistic union. Bulgarian dialects are divided into Eastern and Western groups by the “Yat border”, which marks the different reflexes of the Old Bulgarian “yat”, the thirty-second letter of the old Cyrillic alphabet.
Features of Bulgarian:
- Bulgarian is written in the Cyrillic alphabet. It was the first Slavic language with its own writing system, which dates back to the 9th century.
- As a Slavic language, it possesses a rich inflectional and derivational morphology. However, due to the mutual influence of the Balkan languages, Bulgarian has lost its noun cases (except the vocative) and has completely lost the infinitive form.
- Specific characteristics of Bulgarian pose a real challenge for natural language processing: its rather flexible word order, combined with the lack of morphological case distinctions on nouns and with subject omission.
NCC Lead Bulgaria
Dr. Svetla Koeva is Head of the Department of Computational Linguistics at the Institute for Bulgarian Language, Bulgarian Academy of Sciences. She received her Ph.D. in Structural, Applied and Mathematical Linguistics at the Institute for Bulgarian Language (Bulgarian Academy of Sciences). She has been involved in the research and development of a variety of linguistic resources for Bulgarian, for example WordNet, FrameNet, numerous corpora, spell and grammar checkers, and the Bulgarian NLP chain. She has received three national scientific awards. Her research interests are in computational linguistics, natural language processing, and problems of formal natural language description and ontologies. Her current research priorities are machine translation, corpus studies, and syntactic parsing.
Current National Initiatives
- There is a need for a large collection of data sets and resources, services and tools for spoken language.
- The National Scientific Fund supports language technology (LT) projects on the same basis as projects in all other disciplines.
- In 2020 the Bulgarian government adopted the "Concept for the Development of Artificial Intelligence in Bulgaria until 2030".
META-NET White Paper on Bulgarian
Diana Blagoeva, Svetla Koeva, and Vladko Murdarov. Българският език в дигиталната епоха - The Bulgarian Language in the Digital Age. META-NET White Paper Series: Europe's Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.
Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper
Availability of Tools and Resources for Bulgarian (as of 2012)
The following table shows the level of support for the Bulgarian language in speech technologies, machine translation, text analytics and language resources, on a five-level scale from excellent support to weak/no support.
Area | Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support |
---|---|---|---|---|---|
Speech technologies | | | | | |
Machine translation | | | | | |
Text analytics | | | | | |
Language resources | | | | | |