CW Corpus
The Complex Word (CW) Corpus contains 731 sentences, each with one annotated CW. The underlying simplifications were mined from Simple Wikipedia edit histories. Each entry gives an example of a sentence requiring simplification by means of a single lexical edit. This resource is primarily designed for the evaluation of CW identification systems.
Distribution
Availability: Available - Restricted Use
Licence: CC BY-SA
Restrictions: Academic - Non Commercial Use, Attribution, Share Alike
User Nature: Academic
Attribution Details: Shardlow, M. (To appear). The CW Corpus: A new resource for evaluating the Identification of Complex Words. In Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR 2013), Sofia, Bulgaria, Association for Computational Linguistics
Distribution Access/Medium: Accessible Through Interface
Distribution rights holders:
IPR Holder
Contact Person
Monolingual text corpus
Languages: English
Linguality
Linguality type: Monolingual
Size
Character encoding: UTF-8
Domains
Modalities
Annotation
Other
Segmentation level: Sentence, Word
Format: Text
Annotation Mode: Manual
Interannotator Agreement: 97.5% Kappa
Creation
Creation mode: Manual
Original Sources
Resource Creation
Metadata
Created: 06/12/2013
Last Updated: 06/12/2013
Usage
Foreseen Use: NLP Applications
Use NLP Specific: Natural Language Understanding, Text Generation
Documentation
Tool Documentation: Online
Document Type: Other
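Evaluation sketch
The description above says the corpus is intended for evaluating CW identification systems. The following is a minimal sketch of one way such an evaluation could look, not the procedure used by the corpus author. It assumes each entry has been loaded as a token list plus the index of the annotated CW; this in-memory layout, the function names, and the two sample sentences are all hypothetical and are not taken from the corpus or its file format. Because each sentence carries only one annotated CW, the sketch reports how often that word is flagged (hit rate) together with the mean number of words flagged per sentence as a rough check on over-flagging.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical in-memory representation: (tokens, index of the annotated CW).
Entry = Tuple[List[str], int]


def evaluate(entries: List[Entry],
             identify: Callable[[List[str]], Set[int]]) -> Tuple[float, float]:
    """Return (hit rate, mean number of flagged words per sentence)."""
    if not entries:
        raise ValueError("no entries to evaluate")
    hits = 0
    flagged = 0
    for tokens, cw_index in entries:
        predicted = identify(tokens)
        hits += cw_index in predicted  # True counts as 1
        flagged += len(predicted)
    return hits / len(entries), flagged / len(entries)


def long_word_baseline(tokens: List[str]) -> Set[int]:
    """Toy identifier: flag every token longer than seven characters."""
    return {i for i, tok in enumerate(tokens) if len(tok) > 7}


# Invented sentences for illustration only; not taken from the corpus.
sample = [
    ("The committee will convene tomorrow".split(), 3),
    ("She tried to ameliorate the problem".split(), 3),
]
hit_rate, mean_flagged = evaluate(sample, long_word_baseline)
```

Any real system can be plugged in by passing a different function as identify, provided it returns the set of token indices it considers complex.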