German Polyphone Database (SpeechDat(M)) DB1
The database consists of read speech. A prompt sheet with a unique identification number was distributed to potential callers.
The speech data was recorded over digital lines (ISDN), resulting in A-law format (8 bit) at an 8 kHz sampling rate. The data collection comprises 1000 speakers, with particular care taken to balance gender. Callers were to be between 16 and 65 years old (no controlled distribution).
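As an illustration of what the A-law (8 bit), 8 kHz format implies for anyone processing the recordings, the following is a minimal sketch of standard G.711 A-law expansion to 16-bit linear PCM. It assumes the samples are available as raw bytes, one byte per sample; the actual file layout and any headers in the distribution are not described here, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate stated in the description


def alaw_to_linear(code: int) -> int:
    """Expand one G.711 A-law byte to a signed 16-bit linear PCM value."""
    code ^= 0x55                    # undo the even-bit inversion used by A-law
    mantissa = (code & 0x0F) << 4   # 4-bit quantisation step within the segment
    segment = (code & 0x70) >> 4    # 3-bit segment (exponent)
    if segment == 0:
        value = mantissa + 8
    else:
        value = (mantissa + 0x108) << (segment - 1)
    # After the inversion, a set sign bit marks a positive sample.
    return value if code & 0x80 else -value


def decode_alaw(data: bytes) -> list[int]:
    """Decode raw A-law bytes (assumed headerless) into linear PCM samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """Duration of a raw A-law byte string: one byte per sample at 8 kHz."""
    return len(data) / SAMPLE_RATE_HZ


if __name__ == "__main__":
    demo = bytes([0xD5, 0x55, 0xAA, 0x2A])
    print(decode_alaw(demo))  # [8, -8, 32256, -32256]
```

At one byte per sample, one second of speech occupies 8000 bytes, so `duration_seconds` can serve as a quick sanity check on a recording's length once any header has been stripped.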
Callers could call from any kind of acoustic and network environment: home, business, mobile phone, phone booth, wired or cordless phone, etc. (no controlled distribution).
The regional distribution was expected to fit the following scheme: 32 speakers from each of the 16 German states. Speakers from Austria, Switzerland and other countries were not controlled. The utterances to be gathered were specified in advance and consisted of several speech sequences: sentences from different sources (local newspapers, existing corpora, law articles, etc.) to ensure good phonetic coverage; application words from a defined list of command words; digits (isolated digits, connected digits, and natural numbers); currency amounts; quantities; credit card numbers; spelled words (mainly names); time of day (spontaneous) and time phrases (prompted, word style); city of call/birth; etc.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.