PELCRA time-aligned spoken corpus of Polish (CC-BY-NC)




A subset of the PELCRA corpus of conversational Polish, time-aligned at the utterance level and released under the CC-BY-NC license. The resource contains 386,744 words in 73 transcriptions, covering over 43 hours of recordings made between 2008 and 2010. The texts are provided as TEI P5-compliant XML files with custom PELCRA extensions, and in the XLIFF format.

