Call for Papers
With the increasing number of platforms, grids and infrastructures in the wider area of Language Technologies (LT), NLP, NLU, speech, interaction and language-centric AI, there is a growing need to share experiences, approaches and best practices, to learn and benefit from the work of others and, practically, to start collaborating towards platform interoperability. The workshop addresses all smaller and larger language grids, language-related infrastructures and platform initiatives (both general and domain-specific) as well as collaborative research projects that touch upon one or more of the topics mentioned below, both in Europe and world-wide. The objective is to assemble representatives of these initiatives and all interested parties to exchange and discuss observations, experiences, solutions and best practices as well as current and future challenges. The workshop also addresses the issue of fragmentation in the Language Technology landscape. Instead of “platform islands” that simply exist side by side, possibly even competing with each other, initiatives should discuss how their platforms can be made interoperable and how they can interact with one another to create synergies towards a productive LT platform ecosystem. The long-term vision of platform interoperability has several prerequisites: technical requirements that can be addressed, for example, through the use of common standards, and community-related aspects that need to be strengthened through open discussions and further joint development. Both aspects are covered by this workshop.
Context and Motivation
The EU project European Language Grid (ELG; 2019-2021) is creating a platform that will provide thousands of data sets and hundreds of LT services. ELG aims to promote technologies tailored to all European languages and cultures, adapted to their social and economic needs. At the same time, there are several established platforms and infrastructure-related initiatives as well as emerging new ones, at the European and national levels as well as on other continents. Some of the initiatives are more language-related and have a strong industry focus, others are mainly research-oriented. Moreover, there are digital public service initiatives or platforms in which language is only one aspect of many. With all these established and emerging initiatives, there is a risk of even stronger fragmentation in the Language Technology field, which is already highly fragmented, at least in Europe. Our approach is to bring these initiatives together to discuss ways not only of preventing further fragmentation but, crucially, of reversing it. This will only be possible if interoperability and mutual data exchange are ensured and if metadata formats and technical requirements are compatible, among various other aspects. We invite authors to submit contributions on the current situation of their platform-related projects or initiatives (including technical, governance, community, uptake, interoperability and social aspects). We especially invite all relevant international or national grid, platform or infrastructure projects to participate actively in the workshop with a contribution.
Topics of Interest
In the list of topics below, the term “Language Technology (LT)” comprises Natural Language Processing, Natural Language Understanding, all types of speech technologies, conversational agents as well as language-centric AI and general AI. The term “platform” includes, among others, notions such as infrastructures, frameworks and clouds.
- LT platforms: architectures and approaches (including commercial and non-commercial; national and international; domain-specific and general purpose; all countries, regions and continents)
- LT platform interoperability: standards, APIs, workflows, as well as the exchange of services, models, data and metadata
- Data and metadata exchange formats and harvesting (including taxonomies, ontologies and other forms of semantic descriptions of repository records)
- Operational and legal policies as well as governance structures for LT platforms (GDPR, data management, billing, business models)
- (Cloud-based) containerisation and virtualisation technologies for LT platforms
- Training, re-training and adaptation of models; connecting data sets, tools and machine learning frameworks
- LT platforms and challenges regarding the availability of CPU/GPU resources
- From (general or domain-specific) AI platforms to (general or domain-specific) LT platforms and back again
- Community-related aspects of LT platforms
We invite contributions on the topics mentioned above or any related topics of interest.