French-Serbian Aligned Corpus




The corpus includes literary and newspaper texts with French or Serbian as the source language, together with their translations. The alignment was performed at the subsentential level. Texts were segmented and aligned automatically and then manually checked to obtain one-to-one alignment in most cases. The corpus contains 32 literary texts: 29 French originals with Serbian translations (one with two translations), 2 Serbian originals with French translations, and one English novel translated into French and Serbian. The corpus also contains all articles from the May 2001 issue of "Le Monde diplomatique". The size of the corpus is 59,425 aligned segments and 1,948,679 words (1,063,564 in the French part, 885,115 in the Serbian part). More about the content of this corpus can be found at:

  • downloading from the Web; retyping; scanning and OCR; obtaining from authors and translators
    • Corpus query processor (CQP)