(BASE) British Academic Spoken English Corpus

About BASE

The BASE corpus was developed by Hilary Nesi, with Paul Thompson. Natalie Snodgrass and Sarah Creer were employed as research assistants and Tim Kelly was video director for the project. Lou Burnard (Oxford University) and Adam Kilgarriff (Lexicography MasterClass Ltd) acted as consultants.

The corpus facilitates, amongst other things, investigation of:

  • The frequency and range of academic lexis
  • The meaning and use of individual words and multi-word units
  • The structure of academic lectures
  • The pace, density and delivery styles of academic lectures
  • The discourse function of intonation
  • Patterns of interaction, including turn-taking and topic selection
  • The interplay of visual and aural stimuli
  • The representation of ideas and the expression of attitudes

The lectures and seminars have been transcribed and tagged using a system devised in accordance with the TEI Guidelines. The corpus has been deposited in the Oxford Text Archive and is catalogued by the Arts and Humanities Data Service.

This Excel Spreadsheet contains information about the corpus holdings.

The BASE corpus manual is available (131 KB).

The early stages of corpus development were assisted by funding from the Universities of Warwick and Reading, BALEAP, EURALEX, and The British Academy (2000–2001, Grant reference: SG 30284).

Major funding was provided by the Arts and Humanities Research Board as part of their Resource Enhancement Scheme (2001–2005, Award Number: RE/AN6806/APN13545).
