Dutch SpeechDat(II) MDB-250
The Dutch SpeechDat(II) MDB-250 comprises 250 Dutch speakers (125 male, 125 female) recorded over the Dutch mobile telephone network. The database is partitioned into 5 CDs. The speech databases produced within the SpeechDat(II) project were validated by SPEX to assess their compliance with the SpeechDat format and content specifications.
Speech is stored as sequences of 8-bit, 8 kHz A-law samples. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
The following items were recorded:
- 8 application words (2 optional)
- 2 isolated digits
- 1 sequence of 10 isolated digits
- 3 connected digits: 1 telephone number (1-10 digits), 1 credit card number (1-16 digits), 1 digit PIN code (6 digits)
- 3 dates: 1 spontaneous date, 1 date, 1 relative date expression
- 1 embedded application word
- 3 spelled words: 1 forename (spontaneous), 1 city name, 1 word
- 1 currency money amount
- 1 natural number
- 6 directory assistance names: 1 forename (spontaneous), 1 city of birth, 1 most frequent city, 1 city name, 1 company name, 1 forename surname
- 2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no" question
- 9 phonetically rich sentences
- 2 time phrases: 1 time of day (spontaneous), 1 time phrase
- 4 phonetically rich words
The following age distribution has been obtained: 5 speakers are under 16, 90 are between 16 and 30, 89 between 31 and 45, 56 between 46 and 60, and 10 are over 60.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included. The lexicon was created following the guidelines in SD1.3.1 v4.3.
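The age bands reported above can be cross-checked against the stated total of 250 speakers. The short sketch below does this and derives each band's share; the variable names are illustrative only, and the counts are copied from the description.

```python
# Speaker counts per age band, as given in the description.
age_bands = {
    "under 16": 5,
    "16-30": 90,
    "31-45": 89,
    "46-60": 56,
    "over 60": 10,
}

total_speakers = sum(age_bands.values())

# Share of each band as a percentage of all speakers, rounded to one decimal.
age_shares = {
    band: round(100 * count / total_speakers, 1)
    for band, count in age_bands.items()
}

for band, share in age_shares.items():
    print(f"{band}: {age_bands[band]} speakers ({share}%)")
```

The bands sum to 250, matching the speaker count, and about 72% of the speakers fall between 16 and 45.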