Flemish/Dutch SpeechDat-Car database
The Flemish/Dutch SpeechDat-Car database contains recordings of 302 speakers (154 male, 148 female) from Flanders and the Netherlands, made in a car and over the mobile telephone network. About 1/3 of the speakers were recorded in Flemish and in Dutch as spoken in Flanders, and about 2/3 in Dutch as spoken in the Netherlands.
This database is partitioned into 162 CDs.
The speech data files are in two formats. Four of the microphones were recorded on a computer in the boot of the car; these signals are stored as uncompressed 16-bit samples at 16 kHz. The fifth microphone was connected to the GSM phone and recorded on a remote machine; these signals are stored as compressed 8-bit A-law samples at 8 kHz. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
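As an illustration of the two signal formats described above, the following minimal Python sketch decodes both kinds of sample data into linear integer samples. The A-law expansion follows the standard ITU-T G.711 decoding rule. The function names are hypothetical. The sketch also assumes headerless sample data and little-endian byte order for the 16-bit files; neither is stated in this description, so both should be checked against the documentation shipped with the database.

```python
import struct


def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit A-law code to a linear sample (G.711 rule)."""
    code ^= 0x55                      # undo the even-bit inversion
    sign = code & 0x80                # set bit means a positive sample
    exponent = (code & 0x70) >> 4     # segment number, 0..7
    mantissa = code & 0x0F            # position within the segment
    if exponent == 0:
        magnitude = (mantissa << 4) + 8
    else:
        magnitude = ((mantissa << 4) + 0x108) << (exponent - 1)
    return magnitude if sign else -magnitude


def decode_alaw(data: bytes) -> list[int]:
    """Decode the 8 kHz GSM-channel data: one A-law byte per sample."""
    return [alaw_to_linear(b) for b in data]


def decode_pcm16(data: bytes, little_endian: bool = True) -> list[int]:
    """Decode the 16 kHz in-car data: two bytes per sample.

    Byte order is an assumption; pass little_endian=False if needed.
    """
    count = len(data) // 2
    fmt = ("<" if little_endian else ">") + f"{count}h"
    return list(struct.unpack(fmt, data[: count * 2]))
```

With the samples decoded, the duration of a recording in seconds is the number of samples divided by the sampling rate (8000 for the GSM channel, 16000 for the in-car channels).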
This speech database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat-Car format and content specifications.
Each speaker uttered the following items:
- 2 voice activation keywords
- 1 sequence of 10 isolated digits
- 7 connected digits (1 sheet number, 5 digits; 1 spontaneous telephone number; 3 read telephone numbers; 1 credit card number, 14/16 digits; 1 PIN code, 6 digits)
- 3 dates (1 spontaneous date e.g. birthday, 1 prompted date, 1 relative or general date expression)
- 2 word spotting phrases using an embedded application word
- 4 isolated digits
- 7 spelled words (1 spontaneous, e.g. own forename or surname, 1 directory city name, 4 real words/names, 1 artificial name for coverage)
- 1 money amount
- 1 natural number
- 7 directory assistance names (1 spontaneous, e.g. own forename or surname, 1 city of birth/growing up, 2 most frequent cities, 2 most frequent companies/agencies, 1 "forename surname")
- 9 phonetically rich sentences
- 2 time phrases (1 spontaneous time of day, 1 word-style time phrase)
- 4 phonetically rich words
- 67 application words (13 mobile phone application words, 22 IVR function keywords, 32 car products keywords)
- 2 additional language dependent keywords
- Prompts for spontaneous speech
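The counts in the list above can be tallied with a short Python sketch. The dictionary keys are informal labels chosen here for illustration. The prompts for spontaneous speech are left out because no count is given for them.

```python
# Counted items per speaker, copied from the list above.
items_per_speaker = {
    "voice activation keywords": 2,
    "sequence of 10 isolated digits": 1,
    "connected digits": 7,
    "dates": 3,
    "word spotting phrases": 2,
    "isolated digits": 4,
    "spelled words": 7,
    "money amount": 1,
    "natural number": 1,
    "directory assistance names": 7,
    "phonetically rich sentences": 9,
    "time phrases": 2,
    "phonetically rich words": 4,
    "application words": 67,
    "additional language dependent keywords": 2,
}

# The application words break down as mobile phone + IVR + car products.
application_breakdown = [13, 22, 32]

total = sum(items_per_speaker.values())
print(total)  # 119 counted items per speaker
```

The breakdown given for the application words (13 + 22 + 32) adds up to the stated 67.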
The following age distribution has been obtained: 107 speakers are between 16 and 30, 127 speakers are between 31 and 45, 66 speakers are between 46 and 60, and 2 speakers are over 60.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.