National Competence Centre Romania

The Languages of Romania

In addition to Romanian, the official language, the government has recognised 18 official minority languages. For every minority language, an official language policy was published to ensure the right of the minorities to speak their native language. Romanian is also the official language of Moldova (where the language is called Moldovan), of the Autonomous Province of Vojvodina in Serbia and of the autonomous Mount Athos in Greece. The four Romanian dialects (Daco-Romanian, Aromanian, Istro-Romanian, Megleno-Romanian) are listed in the UNESCO Red Book of Endangered Languages because of their small number of speakers. Romanian belongs to the Eastern branch of the Romance language family. Because of its geographic isolation from the other Romance languages, it developed in a different direction. It has retained many Latin features, such as its morpho-syntactic structure and the vocative.

Features of Romanian:

  • 60% of the basic vocabulary descends from Latin.
  • Word formation relies mostly on derivation.
  • The rich inflectional system marks nouns, pronouns and adjectives for 5 cases and 2 numbers. In addition, pronouns distinguish stressed and unstressed forms, and nouns and adjectives carry a definite or indefinite feature. Inflected forms sometimes show alternations within the root.
  • Romanian is a pro-drop language, i.e., the subject can be omitted if the verb already encodes the same information. The subject and some pronominal clitics can also be doubled.

NCC Lead Romania

Prof. Dan Tufiș is Director and Senior Researcher at the Research Institute for Artificial Intelligence of the Romanian Academy. Since 1982, his research has covered Artificial Intelligence, Natural Language Processing, Machine Translation, Language Technologies, Machine Learning, Natural Language Understanding and Generation, Knowledge Representation and the Semantic Web. He has authored or co-authored four books, edited 25 volumes, and authored or co-authored over 350 book chapters, journal articles and conference papers across this wide range of topics. He has served on the editorial boards of ten scientific journals. He also participates in many international and national projects and initiatives of the LT and AI communities, including META-NET, CLARIN and the Multilingual Web project, among others. He is a member of the EACL, ACM, EAMT and several other associations. For his scientific work, the president of Romania awarded him the “Steaua României” (similar to a knighthood).


Current National Initiatives

  • There is no LT-specific funding programme. LT research and development (mainly speech) is embedded in AI calls.
  • Two large projects are currently ongoing: “Robots and the Society: Cognitive Systems for Personal Robots and Autonomous Vehicles” (ROBIN) and “Resources and Technologies for the development of man-machine interfaces for Romanian language” (ReTeRom). A significant part of both is devoted to speech language resources (LRs) and speech processing for Romanian.

Wikipedia contributors. (2020, June 27). Romanian language. In Wikipedia, The Free Encyclopedia. Retrieved 13:00, July 1, 2020, from https://en.wikipedia.org/wiki/Romanian_language.

Events

2021
10th National ELG Workshop: Romania (national workshop, Romania, September 29)

META-NET White Paper on Romanian

Diana Trandabăț, Elena Irimia, Verginica Barbu Mititelu, Dan Cristea, and Dan Tufiș. Limba română în era digitală - The Romanian Language in the Digital Age. META-NET White Paper Series: Europe's Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.
Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper


Availability of Tools and Resources for Romanian (as of 2012)

The following table illustrates the support of the Romanian language through speech technologies, machine translation, text analytics and language resources.

Rating scale for each area: Excellent support | Good support | Moderate support | Fragmentary support | Weak/no support

  • Speech technologies
  • Machine translation
  • Text analytics
  • Language resources