FinDialogue - a corpus of spontaneous Finnish speech
The FinDialogue corpus is a subcorpus of FinINTAS. FinDialogue consists of ten spontaneous dialogues between friends, each lasting 45-55 minutes. The corpus includes audio files (WAV) and phonetic annotation files (Praat TextGrid). FinDialogue will be made available at http://lat.csc.fi in the near future, along with FinRead.
The dialogues are numbered from D1 to D12. (The original recordings of D3 and D5 have been excluded for ethical and technical reasons, which leaves ten dialogues.)
The speakers were native Finns from the capital city region of Finland. Ten speakers were 20 to 30 years of age (D1, D2, D4, D6, D7), whereas the rest of the speakers (D8-D12) were 45 to 65 years of age. The speakers are the same as in FinRead, the other subcorpus of FinINTAS.
The recordings were made in an anechoic room for dialogues D1-D7 and in a professional recording studio for dialogues D8-D12. In both cases, the speakers sat a few meters apart, facing opposite directions and wearing headphone-microphone combos. Thus, the situation resembled a telephone conversation. Even though the anechoic room was a somewhat strange environment, the speakers usually relaxed after a few minutes and started to chat quite casually. In order to encourage ordinary conversation, the person responsible for the recording left the speakers alone and did not monitor them during the session. This person interrupted the conversation only a couple of times during each recording session, in order to check that all was well and to give the speakers a new topic to discuss (first school, then movies/films, and finally travel). However, the speakers were instructed not to stick to the given topic if they found something else to talk about, which they often did.
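For readers who want to work with the corpus programmatically, the dialogue numbering, age groups, and recording environments described above can be summarized in a small lookup table. The following Python sketch is only an illustration: the class name `Dialogue`, its field names, and the function `build_dialogues` are assumptions made here and are not part of the corpus distribution, and no file names or paths are implied.

```python
from dataclasses import dataclass
from typing import List

# D3 and D5 are excluded from the corpus, as stated in the description.
EXCLUDED = {3, 5}


@dataclass(frozen=True)
class Dialogue:
    """Hypothetical metadata record; field names are illustrative only."""
    dialogue_id: str   # "D1" ... "D12"
    min_age: int       # lower bound of the speakers' age range
    max_age: int       # upper bound of the speakers' age range
    environment: str   # where the dialogue was recorded


def build_dialogues() -> List[Dialogue]:
    """Build one record per dialogue from the facts given in the text."""
    dialogues = []
    for n in range(1, 13):
        if n in EXCLUDED:
            continue
        if n <= 7:
            # D1-D7: younger speakers, recorded in the anechoic room
            record = Dialogue(f"D{n}", 20, 30, "anechoic room")
        else:
            # D8-D12: older speakers, recorded in the recording studio
            record = Dialogue(f"D{n}", 45, 65, "recording studio")
        dialogues.append(record)
    return dialogues


DIALOGUES = build_dialogues()

if __name__ == "__main__":
    for d in DIALOGUES:
        print(d.dialogue_id, f"{d.min_age}-{d.max_age}", d.environment)
```

A table like this makes it easy to select, for example, only the studio recordings or only the younger speaker group before loading the corresponding WAV and TextGrid files.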