Persian Speech Corpus
This single-speaker speech corpus of about 2.5 hours was developed using the same methodology as the PhD work carried out by Nawar Halabi at the University of Southampton. The corpus was recorded in Persian (Tehrani accent) by one male speaker in a professional studio, using a Blue "Blueberry" microphone with a "PreSonus Studio Channel" as preamp and compressor. It was recorded with the "Reaper" software, with some plugins used to enhance the speaker's voice. Speech synthesized from this corpus has a high-quality, natural-sounding voice.
This package includes:
- 399 .wav files containing spoken utterances.
- 399 .lab files containing the phonetic transcriptions of the utterances.
- 399 .TextGrid files containing the phoneme labels with time stamps of the boundaries where these occur in the .wav files. These files can be opened using Praat software (see http://www.fon.hum.uva.nl/praat).
- aligned.mlf, which contains the HTS-friendly alignments.
- orthographic-transcript.txt, a single text file that gathers the orthographic transcriptions, one per line, in the form "[wav_filename]" "[Orthographic Transcript]".
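As an illustration of the transcript file layout described above, the following is a minimal Python sketch that reads orthographic-transcript.txt into a dictionary keyed by wav filename. It rests on assumptions not stated in this description: that the file is UTF-8 encoded (with or without a byte order mark), and that each of the two fields is wrapped in straight double quotes and separated by whitespace. The function names are hypothetical and are not part of the package.

```python
import re
from pathlib import Path

# Assumed line shape: "<wav filename>" "<orthographic transcript>"
# The second group is greedy, so quotes inside the transcript are kept.
LINE_RE = re.compile(r'^\s*"([^"]*)"\s+"(.*)"\s*$')


def parse_transcript_lines(lines):
    """Map each wav filename to its orthographic transcript."""
    entries = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unexpected line format: {line!r}")
        wav_name, text = match.groups()
        entries[wav_name] = text
    return entries


def load_transcripts(path, encoding="utf-8-sig"):
    """Read the transcript file; the encoding is an assumption."""
    with Path(path).open(encoding=encoding) as handle:
        return parse_transcript_lines(handle)
```

If the assumptions hold, load_transcripts("orthographic-transcript.txt") should return one entry per utterance, which can be compared against the 399 .wav files listed above. Lines that do not match the assumed shape raise a ValueError rather than being silently skipped.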
Persian Speech Corpus by Nawar Halabi is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.