Statistical matching in practice
An application to the evaluation of the education system from PISA and TALIS
Statistical matching methods aim to integrate information collected through multiple sources, usually surveys drawn from the same target population. Unlike record linkage methods, which search for identical units, statistical matching searches for similar units in order to find statistical relations across databases.
Methods: Statistical matching is feasible provided that the independent surveys share a common block of variables. One particular solution is based on imputation methods for missing data: first, the distinct files are concatenated (i.e. their rows and columns are joined together to form a single file); next, the empty cells corresponding to non-observed values are treated as missing data and imputed from the observed data.
Results: The fundamental concepts of statistical matching are presented, and the process is illustrated with Spain's data from the PISA (2012) and TALIS (2013) educational studies. Imputations are carried out using the mice package for the free R software environment. A first validation of the results is performed.
Conclusions: Statistical matching offers high potential benefits for the social sciences, since it makes it possible to relate information from independent sources. These techniques can now be applied with relative ease thanks to the development of tools such as the R computing environment.
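The concatenate-then-impute idea described under Methods can be sketched in a few lines. The following is a minimal illustration only, and it rests on several assumptions: the toy records, the variable names (`x1`, `x2`, `y`, `z`) and the helper functions are hypothetical, and the imputation step is a simple nearest-neighbour donor rule on the common variables, not the multiple imputation by chained equations that the mice package performs.

```python
from math import sqrt

# Hypothetical toy data (not PISA/TALIS values).
# File A observes the common block (x1, x2) plus y; file B observes (x1, x2) plus z.
file_a = [
    {"x1": 1.0, "x2": 0.0, "y": 10.0},
    {"x1": 2.0, "x2": 1.0, "y": 20.0},
    {"x1": 3.0, "x2": 1.0, "y": 30.0},
]
file_b = [
    {"x1": 1.1, "x2": 0.0, "z": 5.0},
    {"x1": 2.9, "x2": 1.0, "z": 7.0},
]
COMMON = ("x1", "x2")            # variables shared by both files
ALL_VARS = ("x1", "x2", "y", "z")


def concatenate(files, variables):
    """Stack the files; a variable a record never observed becomes None."""
    stacked = []
    for name, records in files.items():
        for rec in records:
            row = {var: rec.get(var) for var in variables}
            row["source"] = name
            stacked.append(row)
    return stacked


def distance(r1, r2, common):
    """Euclidean distance computed on the common variables only."""
    return sqrt(sum((r1[v] - r2[v]) ** 2 for v in common))


def impute_nearest_donor(stacked, target, common):
    """Fill missing `target` cells with the value of the closest donor record.

    Assumption: a single nearest-neighbour donor, used here as a stand-in
    for a proper (multiple) imputation model.
    """
    donors = [r for r in stacked if r[target] is not None]
    completed = []
    for row in stacked:
        new_row = dict(row)
        if new_row[target] is None:
            donor = min(donors, key=lambda d: distance(row, d, common))
            new_row[target] = donor[target]
        completed.append(new_row)
    return completed


stacked = concatenate({"A": file_a, "B": file_b}, ALL_VARS)
completed = impute_nearest_donor(stacked, "z", COMMON)    # fill z for file A rows
completed = impute_nearest_donor(completed, "y", COMMON)  # fill y for file B rows

for row in completed:
    print(row)
```

After both passes every record carries a value for `y` and `z`, so relations between variables that were never observed together can be explored. A single deterministic donor understates the uncertainty of the imputed values, which is one reason multiple imputation is generally preferred in practice.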