Reading, writing

Paper and Digital Reading Assessments. Exploring aspects of validity using PIRLS and ePIRLS

Published: 12 March

Digital and paper-based reading tests largely measure the same reading ability. Digital tests, however, also place additional demands on students, such as navigating non-linear texts. This is shown by Elpis Grammatikopoulou in her dissertation.

Author

Elpis Grammatikopoulou

Supervisors

Docent Stefan Johansson, University of Gothenburg; Docent Rolf Strietholt, Dortmund University, Germany

Opponent

Professor Gustaf Skar, NTNU – Norwegian University of Science and Technology

Defended at

University of Gothenburg

Date of defence

2026-02-25

Abstract in English

The shift from paper-based to digital assessment formats in education raises fundamental questions about the comparability of reading scores across modes and over time. International large-scale assessments increasingly rely on digital delivery while continuing to report trends on a common scale, making it essential to establish whether paper-based and digital reading assessments support equivalent score interpretations and valid trend comparisons across cycles. This dissertation investigates the comparability and validity of paper-based and digital reading assessments, using data from PIRLS 2016 and its digital extension, ePIRLS. Drawing on Kane's argument-based framework for validation, the dissertation constructs and evaluates a validity argument focusing on three key inferences: generalisation across assessment modes, generalisation across student subgroups, and extrapolation of reading scores to later educational outcomes.

The dissertation comprises three empirical studies employing psychometric modelling, regression analyses, and longitudinal analyses linking assessment data to Swedish national register data. Results show that paper-based and digital reading assessments share a strong common core of reading comprehension, demonstrating substantial construct overlap across modes. At the same time, systematic mode-related variation is observed, indicating that the two assessment formats are not fully equivalent despite their strong association. This variation differs across contexts and student groups, indicating that full generalisation across modes and populations cannot be assumed. Longitudinal analyses further demonstrate that both paper-based and digital reading scores meaningfully predict later academic outcomes, supporting their predictive validity, while small but systematic differences in predictive strength suggest that digital reading captures additional variance of educational relevance.
Taken together, the findings show that paper-based and digital reading assessment scores are similar but not interchangeable. Validity in digital reading assessment should therefore be understood as conditional on assessment mode, student population, and context, rather than assumed a priori. These findings have important implications for international large-scale assessments and for how reading literacy is conceptualised and measured in an increasingly digital educational landscape. The dissertation concludes that ongoing empirical validation is required to ensure the quality and validity of trend interpretations in international reading assessments, particularly as assessment modes and literacy practices continue to evolve.