Hungarian Kindergarten Language Corpus
The Hungarian Kindergarten Language Corpus (HUKILC) has been compiled primarily for studies of variation in child language. It contains 62 interviews with kindergarten children from Budapest aged 4.5 to 5.5 years. Each interview is at least 20 minutes long. The children are divided into four groups by socio-economic status (SES) and sex: higher-SES males (hm), higher-SES females (hf), lower-SES males (lm), and lower-SES females (lf). The corpus is also a useful source for other fields of child language research (e.g. phonetics or developmental morphology). A morphological analyzer (Humor) and a disambiguator (PurePos) are being adapted to child language data in HUKILC; this work is still in progress. The corpus is anonymized: pseudonyms are used in place of real names.
- humor, purepos
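
The four group codes given in the description (hm, hf, lm, lf) each combine an SES letter and a sex letter. The following is a minimal sketch of how such a code could be decoded when grouping interviews for analysis. The names `Group` and `decode_group` are hypothetical and are not part of the corpus or its tools; how the codes are actually attached to the interview files is not stated here, so the sketch only handles the two-letter code itself.

```python
from typing import NamedTuple


class Group(NamedTuple):
    """Speaker group, as described in the corpus summary (hypothetical type)."""
    ses: str  # "higher" or "lower"
    sex: str  # "male" or "female"


# First letter of the code: socio-economic status.
SES = {"h": "higher", "l": "lower"}
# Second letter of the code: sex.
SEX = {"m": "male", "f": "female"}


def decode_group(code: str) -> Group:
    """Decode a two-letter group code such as 'hm' or 'lf'."""
    code = code.strip().lower()
    if len(code) != 2 or code[0] not in SES or code[1] not in SEX:
        raise ValueError(f"unknown group code: {code!r}")
    return Group(SES[code[0]], SEX[code[1]])


if __name__ == "__main__":
    for code in ("hm", "hf", "lm", "lf"):
        print(code, decode_group(code))
```

Rejecting unknown codes with an error, rather than returning a default, keeps a mistyped label from silently landing in the wrong group.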