Table 1

Definitions of the psychometric properties used in the selection criteria for included articles

Domain	Measurement property	Example test statistic	Definition
Reliability			The consistency of a test or measurement, that is, how consistently a measure produces similar results with repeated measures over a short period of time or across assessors at the same time point. This can also be thought of as the correlation between observed scores across replications.
	Test–retest reliability	Correlation coefficient	Correlation between scores from the same test from assessments conducted over a short time interval.
	Inter-rater reliability	Kappa; Bland-Altman analysis	The extent to which independent assessors produce similar ratings in judging the same abilities or characteristics in the same target person at the same time.
	Internal consistency reliability	Cronbach’s alpha	Degree of interrelatedness among items on the same tool, that is, how well the items work together to provide information on an underlying construct.
Validity			The degree to which the tool measures what it is supposed to measure, that is, the degree to which the tool reflects the underlying construct.
	Content/face validity		The degree to which the content of the tool is adequate for the construct being measured, that is, the extent to which a tool appears to reflect the underlying construct.
	Concurrent/criterion validity	Correlation coefficient; regression estimate	The degree to which scores on one measurement tool are related to scores obtained at about the same point in time from another tool considered the gold standard.
	Convergent validity	Correlation coefficient; regression estimate	Evidence that scores on a test or measurement are associated with theoretically related measures or variables.
	Predictive validity	Correlation coefficient; regression estimate	Evidence that a score correlates with a variable that can only be assessed at some point after the test has been administered or the measurement made, for example, evidence that scores now are correlated with scores at a future time point.
	Structural validity (dimensionality)	Exploratory factor analysis: number of factors, eigenvalues; confirmatory factor analysis: model fit statistics such as Comparative Fit Index, root mean square error of approximation	The degree to which the scores of an assessment are an adequate reflection of the dimensionality of the construct to be measured.
Invariance			The property whereby a scale or construct provides the same results across different samples, populations, settings or characteristics.
	Measurement invariance over countries	Likelihood ratio χ² statistic and p-value from freeing parameters across groups	The degree to which an assessment or construct provides the same results across separate samples in different countries.
	Measurement invariance over other groups	Likelihood ratio χ² statistic and p-value from freeing parameters across groups	The degree to which an assessment or construct provides the same results across different groups.

All definitions are based on the APA Dictionary of Psychology.7
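
As an illustration of three of the example test statistics listed in Table 1, the following minimal Python sketch computes a test-retest correlation, a kappa coefficient and Cronbach's alpha. It is not drawn from the included articles: the function names and the toy data are hypothetical, the correlation is assumed to be Pearson's r, and kappa is assumed to be Cohen's kappa for two raters with unweighted categories.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Test-retest reliability: Pearson correlation between two administrations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def cohens_kappa(rater_a, rater_b):
    """Inter-rater reliability: agreement between two raters, corrected for chance.

    Assumes two lists of categorical ratings of equal length. Undefined
    (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


def cronbach_alpha(rows):
    """Internal consistency: rows are respondents, columns are item scores."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data, for illustration only.
time1 = [1, 2, 3, 4]
time2 = [2, 4, 6, 8]
print(pearson_r(time1, time2))

ratings_a = [1, 1, 0, 0]
ratings_b = [1, 0, 1, 0]
print(cohens_kappa(ratings_a, ratings_b))

item_scores = [[1, 1], [2, 2], [3, 3]]
print(cronbach_alpha(item_scores))
```

In practice these statistics would normally be obtained from an established statistical package; the sketch is intended only to make the definitions concrete.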