JTok Tokenizer


JTok provides a tokenizer in Java that identifies paragraphs,
sentences and tokens of an input text. Non-word tokens are further
classified into abbreviations, numbers, punctuation and clitics.
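
The three levels described above (paragraphs, sentences, classified
tokens) can be illustrated with a small, self-contained Java sketch.
This is not JTok's API or its implementation: the class TokenSketch,
the Kind enum, the regular expressions and the tiny abbreviation list
are all assumptions made up for this example. It only shows why
classifying tokens helps, for instance why a period inside an
abbreviation should not end a sentence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenSketch {

    // Hypothetical token classes, named after the categories listed above.
    enum Kind { WORD, ABBREVIATION, NUMBER, PUNCTUATION, CLITIC }

    static final class Token {
        final String text;
        final Kind kind;
        Token(String text, Kind kind) { this.text = text; this.kind = kind; }
        @Override public String toString() { return text + "/" + kind; }
    }

    // One named group per class. Order matters: abbreviations and numbers
    // are tried before plain words and single punctuation characters.
    // The abbreviation list is a toy assumption, not a real lexicon.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<abbr>(?:[A-Za-z]\\.){2,}|(?:Dr|Mr|Mrs|etc)\\.)"
        + "|(?<num>\\d+(?:[.,]\\d+)*)"
        + "|(?<clitic>'(?:s|re|ve|ll|d|m)\\b)"
        + "|(?<word>\\p{L}+)"
        + "|(?<punct>\\p{Punct})");

    // Splits a string into classified tokens; whitespace is skipped.
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            Kind kind;
            if (m.group("abbr") != null) kind = Kind.ABBREVIATION;
            else if (m.group("num") != null) kind = Kind.NUMBER;
            else if (m.group("clitic") != null) kind = Kind.CLITIC;
            else if (m.group("word") != null) kind = Kind.WORD;
            else kind = Kind.PUNCTUATION;
            tokens.add(new Token(m.group(), kind));
        }
        return tokens;
    }

    // Groups the tokens of one paragraph into sentences. A sentence ends
    // at a '.', '!' or '?' that was classified as PUNCTUATION, so the
    // period that belongs to an abbreviation does not split the sentence.
    static List<List<Token>> sentences(String paragraph) {
        List<List<Token>> result = new ArrayList<>();
        List<Token> current = new ArrayList<>();
        for (Token t : tokenize(paragraph)) {
            current.add(t);
            if (t.kind == Kind.PUNCTUATION && ".!?".contains(t.text)) {
                result.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) result.add(current);
        return result;
    }

    public static void main(String[] args) {
        String text = "Dr. Smith paid 3.50 dollars. It's cheap!\n\nSecond paragraph here.";
        // Paragraphs are separated by one or more blank lines.
        String[] paragraphs = text.trim().split("\\n\\s*\\n");
        for (int p = 0; p < paragraphs.length; p++) {
            List<List<Token>> sents = sentences(paragraphs[p]);
            for (int s = 0; s < sents.size(); s++) {
                System.out.println("paragraph " + p + ", sentence " + s + ": " + sents.get(s));
            }
        }
    }
}
```

In this sketch "Dr." is matched as a single ABBREVIATION token, so the
first paragraph yields two sentences rather than three, and "3.50" stays
one NUMBER token instead of being split at the decimal point.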

