A multilevel description of textbook linguistic complexity across disciplines: Leveraging NLP to support disciplinary literacy

In this study, linguistic complexity is measured at the clause, phrase and lexical levels across eight disciplines represented by a corpus of secondary school textbooks. Innovative natural language processing systems extract an unprecedented number of complexity measures, and discriminant function analysis identifies the features that best differentiate disciplinary writing. Results indicate that disciplines vary along different clines of complexity. The first cline tends to discriminate humanities from science subjects on features such as academic phraseology, possessive noun phrases, auxiliary verbs and clause dependents. Other clines show that subjects such as history and physics can be similarly complex in features such as the prepositional expansion of noun phrases. The paper concludes with detailed and specific pedagogical takeaways for teachers.
Source: Linguistics and Education - Category: Speech-Language Pathology - Source Type: research