National Competence Centre Slovakia

The Languages of Slovakia

Slovak has approximately 5.2 million native speakers (2015). The largest minority groups in Slovakia are Hungarians and Roma; nevertheless, Slovak is the only official language at the national level. Slovak is also spoken in certain communities in the United States and the Czech Republic. It belongs to the Western branch of the Slavic languages, together with Polish, Czech and Sorbian. It is divided into many regional dialects, which are grouped into three main classes: Western, Central and Eastern Slovak dialects. Regional variation is most pronounced in the mountainous regions of Slovakia.

Features of Slovak:

  • Slovak speakers use a modified Latin alphabet with extra diacritical marks. The diacritical marks are used to denote palatalisation, postalveolar sibilants and the length of vowels.
  • Slovak pronunciation follows a rhythmic rule: two long syllables tend not to occur next to each other.
  • The rich inflectional system has 6 cases and 4 genders, counting masculine animate and masculine inanimate separately.
  • The Slovak alphabet has more letters than the alphabet of any other European language.
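
The diacritical marks mentioned in the list above can be inspected programmatically. The following is a minimal Python sketch, not an official resource: the letter inventory in `SLOVAK_ACCENTED` and the function name `marks_by_letter` are assumptions made for illustration. It uses Unicode canonical decomposition (NFD) from the standard `unicodedata` module to group the accented letters by their combining mark.

```python
import unicodedata
from collections import defaultdict

# Assumption: lowercase Slovak letters that carry a diacritical mark.
SLOVAK_ACCENTED = "áäčďéíĺľňóôŕšťúýž"


def marks_by_letter(letters: str) -> dict[str, list[str]]:
    """Group precomposed letters by the Unicode name of their combining mark."""
    groups: dict[str, list[str]] = defaultdict(list)
    # Normalise to NFC first so each accented letter is a single code point.
    for letter in unicodedata.normalize("NFC", letters):
        # NFD splits a precomposed letter into base letter + combining mark(s).
        decomposed = unicodedata.normalize("NFD", letter)
        for ch in decomposed[1:]:
            if unicodedata.combining(ch):
                groups[unicodedata.name(ch)].append(letter)
    return dict(groups)


if __name__ == "__main__":
    for mark, letters in marks_by_letter(SLOVAK_ACCENTED).items():
        print(f"{mark}: {' '.join(letters)}")
```

In this grouping, the acute accent corresponds to length (long vowels and long ĺ, ŕ), while the caron covers the palatalised consonants and the postalveolar sibilants.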

NCC Lead Slovakia

Dr. Radovan Garabík works at the Ľudovít Štúr Institute of Linguistics, Slovak Academy of Sciences. His research interests cover corpus linguistics, natural language processing, computational linguistics, human language technology and modern digital lexicography. He is the main architect of the Slovak National Corpus project, a large representative corpus of modern Slovak, together with relevant NLP tools for Slovak, such as morphological analysis and POS tagging. He also led the development of several other significant resources: Slovak WordNet, the Slovak Multext East morphosyntactic specification, and several parallel corpora. He represented the Ľ. Štúr Institute of Linguistics in the following European projects: MONDILEX – Conceptual Modelling of Networking of Centres for High-Quality Research in Slavic Lexicography and Their Digital Resources; Slovak Online; CESAR – CEntral and South-east europeAn Resources; NETWORDS – The European Network on Word Structure; EuroMatrixPlus – "Bringing Machine Translation for European Languages to the User"; www.slovake.eu – Extending the offer of the e-learning platform for the Slovak language; lingvo.info; MARCELL – Multilingual Resources for CEF.AT in the legal domain; CURLICAT – Curated Multilingual Language Resources for CEF AT; Nexus Linguarum – European network for Web-centred linguistic data science; LITHME – Language in the Human-Machine Era. He is the principal author of several specialized dictionaries of contemporary Slovak. At the national level, he takes part in the SNK (Slovak National Corpus) project. In the area of computational linguistics, he has authored or co-authored more than 60 scientific papers and conference proceedings.


Current National Initiatives

  • There is no LT funding programme; some minor programmes oriented towards LT have been successful in the past, but mostly as parts of other actions or grants.
  • LT-oriented industry is rare; companies usually try to use existing technologies rather than develop new ones.
  • There is a lack of understanding of NLP within the industry.

Wikipedia contributors. (2020, July 1). Slovak language. In Wikipedia, The Free Encyclopedia. Retrieved 17:00, July 2, 2020, from https://en.wikipedia.org/wiki/Slovak_language.

Wikipedia contributors. (2020, July 4). Slovakia. In Wikipedia, The Free Encyclopedia. Retrieved 12:00, July 6, 2020, from https://en.wikipedia.org/wiki/Slovakia.

Events

2021
10th Regional ELG Workshop: Slovakia, Czech Republic (regional workshop), October 18

META-NET White Paper on Slovak

Mária Šimková, Radovan Garabík, Katarína Gajdošová, Michal Laclavík, Slavomír Ondrejovič, Jozef Juhár, Ján Genči, Karol Furdík, Helena Ivoríková, and Jozef Ivanecký. Slovenský jazyk v digitálnom veku - The Slovak Language in the Digital Age. META-NET White Paper Series: Europe's Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.
Full text of this META-NET White Paper (PDF)
Additional information on this META-NET White Paper


Availability of Tools and Resources for Slovak (as of 2012)

The following table illustrates the support of the Slovak language through speech technologies, machine translation, text analytics and language resources.

Each area is rated on a five-level scale: excellent, good, moderate, fragmentary, or weak/no support.

  • Speech technologies
  • Machine translation
  • Text analytics
  • Language resources