Serbian NooJ module




The Serbian NooJ module (SrpNooJ) was produced within the EU-funded CESAR project. It consists of a set of resources in each of the two alphabets in use for Serbian, Cyrillic and Latin. Each set consists of:

  • the dictionary properties definition file (metadata);
  • one text, the novel “Dva carstva” (Two Empires) by the Serbian author Branimir Ćosić, comprising 106684 tokens;
  • a sample dictionary in readable form with 35 lemmas that belong to 9 grammatical classes, with examples of multiword units and derivational morphology;
  • a sample of the morphological grammars used for lemmas from the sample dictionary: three for simple nouns, two for adjectives, two for verbs, and one for a multiunit noun;
  • a readable sample dictionary of inflected forms, automatically produced from the sample dictionary of lemmas and the sample morphological grammars;
  • a syntactic grammar for recognition of one class of named entities: full personal names with their roles or functions;
  • a full compiled dictionary (divided into three files: nouns, verbs, and other).

The full compiled dictionary comprises 85868 entries: nouns (40886), adjectives (25558), verbs (15366), and other (4058).
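The entry counts quoted for the full compiled dictionary can be cross-checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, the figures are copied from the description, and nothing here reads the actual NooJ dictionary files.

```python
# Entry counts per category, copied from the description of the
# full compiled dictionary (not read from the NooJ files themselves).
entry_counts = {
    "nouns": 40886,
    "adjectives": 25558,
    "verbs": 15366,
    "other": 4058,
}

# Sum the per-category counts to compare against the stated total.
total_entries = sum(entry_counts.values())

# Share of each category in the dictionary, as a percentage.
shares = {
    category: round(100 * count / total_entries, 1)
    for category, count in entry_counts.items()
}

print(total_entries)
for category, share in shares.items():
    print(f"{category}: {share}%")
```

The computed total agrees with the stated figure of 85868 entries.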

  • NooJ;LeXimir
