Language data and resources

Corpora, models, lexical/conceptual resources and grammars

Corpora

Structured collections of data (textual, audio, video, multimodal/multimedia, etc.), selected according to specific criteria external to the data, such as size, type of language, type of text producers, or expected audience.

Models

Resources created by running a training algorithm over training data; examples include translation models, speech models, transformers, and n-gram models.

Lexical/Conceptual resources

Resources such as terminological glossaries, word lists, semantic lexica, and ontologies, organized around lexical or conceptual units (lexical items, terms, concepts, phrases, etc.) together with supplementary information about those units (e.g., grammatical, semantic, or statistical).

Computational grammars

Resources composed of rules representing the structure of a language.
