While international large-scale assessments (ILSAs) are now regarded by many as a regular feature of the educational assessment landscape, they remain a relatively recent phenomenon. Their origins can be traced back to the IEA pilot survey, an early investigation into student performance conducted in the 1960s by the International Association for the Evaluation of Educational Achievement (IEA). Since then, there have been significant developments (this text is partly based on Wagemaker, 2014, p. 17):
Number of organizations
The number of organizations responsible for the development, management, and conduct of ILSAs has grown from one to seven major players in the field. Their studies are presented on the gateway; they include CONFEMEN, the IEA, the IDB, the OECD, SACMEQ, UNESCO, and the World Bank.
Numbers of participating entities
Numbers of participating entities (e.g., countries, economies, regions) have increased considerably from 12 in the IEA’s First International Mathematics Study (FIMS) in 1964 up to 61 in PIRLS 2016, 64 in TIMSS 2015, and 73 in PISA 2015. At present, about 70% of the countries in the world are estimated to participate in ILSAs (Lietz, Cresswell, Rust, & Adams, 2017, p. 1).
The populations investigated have been expanded so that ILSAs cover a broad range from preprimary to postsecondary. A large number of studies investigate students enrolled in primary and/or secondary schools and some also extend the investigation to include teachers, principals, and parents of the target age group (e.g., in ICCS, ICILS, PASEC, PIRLS, PISA, SACMEQ, TERCE, and TIMSS). Other studies focus on teachers in their own right, e.g., current primary teachers and their principals (TALIS) or future primary and lower secondary mathematics teachers and their educators (TEDS-M). ILSAs undertaken with individuals outside the school context can focus, for example, on young children (PRIDI) or adults (PIAAC, STEP).
The domains under investigation, regardless of whether the study includes an assessment or not, have evolved from mathematics in FIMS to cover a broad range of areas, including mathematics or numeracy (PASEC, PIAAC, PISA, SACMEQ, TERCE, TEDS-M, TIMSS), science (PISA, TERCE, TIMSS), various aspects of reading and language (PASEC, PIAAC, PIRLS, PISA, PRIDI, SACMEQ, STEP, TERCE), civic education (ICCS), HIV/AIDS knowledge (SACMEQ), and, more recently, computer and information literacy (ICILS), problem solving (PIAAC, PISA), and financial literacy (PISA). Other studies deal with the socio-emotional and motor skills of young children (PRIDI), teachers, their teaching, and their teaching environments (TALIS), or teachers’ pedagogical content knowledge (TEDS-M).
From input toward output
A shift in focus from an input toward an output/outcomes orientation has occurred: Early reform efforts in education focused on concerns related to inputs and the challenges of ensuring equity in terms of school enrollments. However, with greater recognition of the effects of globalization and economic competitiveness, and greater concern for equity of learning outcomes (e.g., in terms of what students know and can do), ILSAs are increasingly seen as a necessary condition for monitoring and understanding the outcomes of the significant investments that all nations make in education.
Measurement methodologies have evolved considerably: ILSAs such as PISA, PIRLS, and TIMSS have their roots in the methodologies of the long-term trend assessment NAEP (National Assessment of Educational Progress) in the USA. Its methodologies have been adapted and extended to meet the challenges of these other studies which assess educational achievement beyond national boundaries, ensuring, for example, comparability of test items while operating with a growing number of participating countries and languages.
ILSAs have also made important progress in other methodological areas, such as sampling, instrument development and validation, and scaling (Lietz et al., 2017, p. 16). As far as analytical procedures are concerned, early studies employed classical item analysis for score calculation, whereas more recent studies use item response theory (IRT) to analyze cognitive outcome data.
Throughout the years, the organizations conducting ILSAs have developed technical standards describing minimum requirements, and the reports of results for each study cycle are typically accompanied by comprehensive technical documentation, which provides critical guidance for data interpretation and the implementation of secondary analyses. The provision of technical documents, labeled, for example, “Methods and Procedures” (in PIRLS and TIMSS) or “Technical Report” (e.g., in ICCS, ICILS, PIAAC, PISA, and TALIS) is one of the selection criteria for inclusion of an ILSA here.
Despite the high level of excellence that ILSAs have achieved, there is still room for improvement, including in the areas mentioned above. Future challenges include the development, expansion, and refinement of electronic and web-based instruments (eAssessments); greater participation by middle- and low-income countries; the inclusion of new assessment domains; and the development and implementation of (more) longitudinal studies with the same cohorts (Lietz et al., 2017, p. 17).
Major objectives of ILSAs, especially those undertaken in the school context, include improving education quality and equity, as well as serving the increasing demand worldwide for greater accountability for the investments made in educational provision. In general, ILSAs share common objectives that either explicitly or implicitly include one or more of the following elements:
- Provision of high-quality data to improve policymakers’ understanding of key school-based and non-school-based factors influencing teaching and learning
- Provision of high-quality data as a resource for identifying areas of concern and action and for preparing and evaluating educational reforms
- Development and improvement of the capacity of educational systems to engage in national strategies for educational monitoring and improvement (Wagemaker, 2014, p. 13)
ILSAs are typically organized as cross-national and cross-sectional studies, providing information about a population and area of interest at a specific point in time in the participating countries or regions so that the participating entities can learn from each other. However, these comparisons across countries, and often across cultures, involve considerable challenges (Lietz et al., 2017). For many ILSAs, measuring trends within countries is therefore an even more important objective; these studies are conducted at regular intervals (e.g., PISA every three years; TIMSS every four years; PASEC, PIRLS, and TALIS every five years) so that regularly participating countries can compare their own results over time and make informed decisions for improving their education systems.
International large-scale assessments have also made major contributions to education research. They have contributed to educational theory, for example, in terms of model building and testing, and have helped to build and strengthen a worldwide community of researchers in educational evaluation. The notion that educational reform and improvement, rather than assessment and testing as such, are the goals of ILSAs creates the imperative to ensure that the data gathered are readily accessible and used (Wagemaker, 2014, p. 14). This gateway supports these endeavors by facilitating the location of ILSA resources.
Dealing with differences
ILSAs share common objectives and features, but there are also differences. In addition to the above-mentioned differences in terms of the number of participating entities, domains under investigation, or target populations, the studies may vary in the following ways:
- Some ILSAs (e.g., ICCS, ICILS, PASEC, PIRLS, PISA, TIMSS) adopt a cyclical, trend approach, repeating the study at more or less regular intervals with revised and improved versions of the previous data collection instruments and partly the same and partly different participating countries. Other ILSAs (e.g., PIAAC, STEP) operate in waves or rounds, applying the same set of instruments to different groups of countries at different points in time.
- Some studies (e.g., PASEC, PIRLS, SACMEQ, TERCE, TIMSS) use a curriculum-based approach, assessing student learning after a fixed period of schooling, whereas others (e.g., PISA, PIAAC, STEP) use an age- and skills-based approach (Wagemaker, 2014, p. 14).
- The majority are conducted as international studies with participants from around the world, but there are also transnational studies with a more regional focus, such as PASEC (Francophone Africa), SACMEQ (Anglophone Africa), PRIDI (Latin America), and TERCE (Latin America and the Caribbean) (Wagemaker, 2014, p. 6).
Numerous countries participate in more than one of these ILSAs, either in parallel or alternately, considering them complementary approaches to the investigation of learning outcomes and educational provision (Wagemaker, 2014, p. 19).
For the first time, these studies are gathered on a single platform: the ILSA Gateway. As the overarching service for educational ILSAs, the gateway offers comprehensive information on all ILSAs in a standardized format while maintaining the characteristics of each individual study. These informational texts are complemented by hyperlinks, allowing for fast and easy access to documents, data, and other resources on the external study websites. The gateway services are intended to encourage the exchange of knowledge and materials, to inspire future research, and to contribute to the further development of the ILSAs themselves.
Hans Wagemaker, IEA Executive Director (1997–2014)
Nathalie Mertes, ILSA Gateway Production Manager, IEA Hamburg
Lietz, P., Cresswell, J. C., Rust, K. F., & Adams, R. J. (2017). Implementation of large-scale education assessments. In P. Lietz, J. C. Cresswell, K. F. Rust, & R. J. Adams (Eds.), Wiley series in survey methodology. Implementation of large-scale education assessments (pp. 1–25). Chichester, United Kingdom: John Wiley & Sons.
Wagemaker, H. (2014). International large-scale assessments: From research to policy. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), Statistics in the social and behavioral sciences series. Handbook of international large-scale assessment. Background, technical issues, and methods of data analysis (pp. 11–36). Boca Raton: CRC Press.