Polish-Russian Parallel Corpus




A manually aligned Polish-Russian parallel corpus of 4,250,000 words. It comprises 20 Polish literary works and 1 legal text, translated into Russian, and 14 Russian literary texts, translated into Polish. The texts were processed and aligned using ABBYY Aligner, and all corrections to the segmentation and to the alignment of each translation with its original were made manually. The texts are provided as TEI P5-compliant XML files, with custom PELCRA extensions that mark complex translation equivalence types, and also in the XLIFF and TMX formats. The corpus was originally acquired from the University of Warsaw and enhanced by the University of Lodz and the Institute of Computer Science, Polish Academy of Sciences.
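Since the corpus is distributed in TMX among other formats, the aligned segments can in principle be read with any XML parser. The following is a minimal sketch that assumes only the standard TMX layout (`tu` units containing `tuv` variants, each with a `seg`); the function name `read_tmx_pairs` and the language codes `pl` and `ru` are illustrative assumptions, not something documented for this resource, and the TEI P5 files with PELCRA extensions would need separate handling.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified name of the xml:lang attribute used on <tuv> elements.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="pl", tgt_lang="ru"):
    """Return (source, target) segment pairs from a TMX path or file object.

    The default language codes are assumptions; check the codes actually
    used in the corpus files before relying on them.
    """
    pairs = []
    for tu in ET.parse(source).getroot().iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                text = "".join(seg.itertext()).strip()
                # Reduce codes such as "pl-PL" to the bare language, "pl".
                segments[lang.split("-")[0]] = text
        # Keep only units that contain both requested languages.
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs
```

Calling `read_tmx_pairs("some_text.tmx")`, where `some_text.tmx` is a placeholder for one of the corpus files, would return a list of Polish-Russian tuples; swapping the two language arguments gives the Russian-Polish direction.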


    • xmllint