French SpeechDat(II) FDB-1000
Base de données SpeechDat(II) FDB-1000 du Français
This French SpeechDat(II) FDB-1000 database contains recordings of 1,017 French speakers made over the fixed telephone network. The speech database was sponsored by the European Commission (CEC DGXIII) under the project LE2-4001. The database is partitioned into CD-ROMs.
Speech is stored as uncompressed sequences of 8-bit, 8 kHz A-law samples. Each prompted utterance is stored in a separate file (file extension FRA) and has an accompanying ASCII SAM label file (file extension FRO).
The database contains 48 utterances (40 mandatory and 8 optional items) for each of 1,017 different speakers; 17 speakers were added to the original 1,000 to meet the requirements of the database. The main content of the database is speech files and orthographic transcription files.
The database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat format and content specifications.
It is designed for development and assessment of French speech recognizers.
Each speaker uttered the following items:
* 5 application words
* 1 sequence of 10 isolated digits
* 4 connected digit strings: 1 sheet number (5+ digits), 1 telephone number (9-11 digits), 1 credit card number (14-16 digits), 1 PIN code (6 digits)
* 3 dates: 1 spontaneous date (e.g. birthday), 1 prompted date (word style), 1 relative and general date expression
* 2 word spotting phrases using an application word (embedded)
* 1 isolated digit
* 3 spelled words (letter sequences): 1 spontaneous, e.g. own forename, 1 spelling of directory city name, 1 real/artificial for coverage
* 1 currency money amount
* 1 natural number
* 5 directory assistance names + 1 spelled name: 1 spontaneous, e.g. own forename, 1 city of birth / growing up (spont), 1 most frequent cities (set of 500), 1 most frequent company/agency (set of 500), 1 "forename surname", 1 spelled city of birth
* 2 questions, including "fuzzy" yes/no: 1 predominantly "yes" question, 1 predominantly "no" question
* 9 phonetically rich sentences
* 2 time phrases: 1 time of day (spontaneous), 1 time phrase (word style)
* 8 phonetically rich words
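
The item list above can be checked against the stated total of 48 utterances per speaker. The sketch below tallies the per-category counts as listed and derives nominal corpus sizes. The category keys are shorthand labels chosen here for illustration, and the corpus-wide figures are upper bounds that assume every speaker produced every item, which the description does not guarantee (8 of the 48 items are optional).

```python
# Per-speaker item counts, copied from the list above.
# The dictionary keys are illustrative shorthand, not official item codes.
ITEM_COUNTS = {
    "application words": 5,
    "isolated digit sequence": 1,
    "connected digit strings": 4,
    "dates": 3,
    "word spotting phrases": 2,
    "isolated digit": 1,
    "spelled words": 3,
    "currency money amount": 1,
    "natural number": 1,
    "directory assistance names incl. spelled name": 5 + 1,
    "yes/no questions": 2,
    "phonetically rich sentences": 9,
    "time phrases": 2,
    "phonetically rich words": 8,
}

SPEAKERS = 1017
MANDATORY_ITEMS = 40

items_per_speaker = sum(ITEM_COUNTS.values())

# Nominal sizes, assuming no missing recordings (an assumption, not a stated fact).
max_utterances = items_per_speaker * SPEAKERS
mandatory_utterances = MANDATORY_ITEMS * SPEAKERS

print(f"items per speaker:      {items_per_speaker}")
print(f"nominal maximum files:  {max_utterances}")
print(f"mandatory items total:  {mandatory_utterances}")
```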