English-Serbian Aligned Corpus




This corpus consists of English source texts translated into Serbian, Serbian source texts translated into English, and several aligned English and Serbian translations of literary texts originally written in French. The texts belong to various domains: fiction, general news, scientific journals, web journalism, health, law, education, and movie subtitles. The corpus also contains several Serbian translations of texts from the ‘Acquis communautaire’ corpus and from the ‘Intera’ corpus, aligned with their originals. The alignment was performed at the subsentential level. The texts were segmented and aligned automatically and then manually checked. In most cases the alignment is one-to-one. The size of the corpus is 5,078,280 words (2,672,911 in the English part and 2,405,369 in the Serbian part). More about the content of this corpus can be found at: http://www.korpus.matf.bg.ac.rs/SrpEngKor/SrpEngKor_2013_01.pdf


  • Downloading from the Web; retyping; scanning and OCR; obtaining from authors and translators
  • Corpus query processor (CQP)