Language data and resources

Corpora, language descriptions, and lexical/conceptual resources

Corpora

Structured collections of data (textual, audio, video, multimodal/multimedia, etc.), selected according to specific criteria external to the data, such as size, type of language, type of text producer, or expected audience.

Models

Resources created by a training process in which an algorithm learns from training data; examples include translation models, speech models, transformers, and n-gram models.

Lexical/Conceptual resources

Resources such as terminological glossaries, word lists, semantic lexica, and ontologies, organized on the basis of lexical or conceptual units (lexical items, terms, concepts, phrases, etc.) together with supplementary information about them (e.g., grammatical, semantic, or statistical information).

Computational grammars

Resources composed of rules representing the structure of a language.

