The English Speech Corpus with Different Proficiency Levels

About

The English Speech Corpus with Different Proficiency Levels aims to provide learners, teachers, and researchers with high-quality, authentic recordings from practice tests, together with detailed annotations of the linguistic features of learners' spoken English. The corpus contains 72 sets of spontaneous speech data and 15 sets of classroom presentation data. Of the 72 spontaneous speech sets, 42 were collected from learners in mainland China and Hong Kong, and 30 were retrieved from official IELTS speaking videos.

The corpus currently contains over 300 minutes of recordings with linguistic annotations covering the four IELTS speaking criteria: fluency and coherence, lexical resource, grammatical range and accuracy, and pronunciation. These annotations can help Chinese learners of English identify the difficulties they face in speaking English. The high-quality recordings are ideally suited for learners, teachers, and researchers in Greater China and beyond.