Statistical test | Facet of validity reflected | Interpretation criteria: good outcome | Interpretation criteria: acceptable outcome | Interpretation criteria: poor outcome |
---|---|---|---|---|
Correlation coefficient | Strength and direction of association at individual level [8] | ≥0.50 [2] | 0.20 - 0.49 [2] | <0.20 [2] |
Paired t-test/Wilcoxon signed rank test [8,22,23,25,27,28,33,34,36,48,49,52-56,60,62,65,66,68,69,91] | Agreement at group level [8] | P > 0.05 [8] |  | P ≤ 0.05 [8] |
Percent difference [8,22,23,25,27,28,33,34,49,52-56,60,65-72,91] | Agreement at group level (size and direction of error) [8] |  | 0.0 - 10.0% [77] | >10% |
Cross-classification (tertiles/quartiles or quintiles): percentage in same tertile and percentage in opposite tertile [8,22,31,32,35-38,41,42,44-51,55-61,63-69,91] | Agreement (including chance), at individual level [8] | ≥50% in same tertile/quartile [2]; ≤10% in opposite tertile/quartile [2] |  | <50% in same tertile/quartile [2]; >10% in opposite tertile/quartile [2] |
Weighted Kappa statistics (coefficient) [8,24,26,30,40,43,54,58,59,63,64,66-69,91] | Agreement (excluding chance) at individual level [8] | ≥0.61 [2] | 0.20 - 0.60 [2] | <0.20 [2] |
Bland-Altman analysis (correlation between mean and mean difference) [6,21,33,34,37-39,43,50,53,54,61,63,69,76,92] | Presence, direction and extent of bias at group level [6,76] | P > 0.05 [6] |  | P ≤ 0.05 [6] |
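
The interpretation criteria in the table can be applied mechanically once each statistic has been computed. The following is a minimal sketch in Python, not part of the original work. All function names are hypothetical, and several choices are assumptions rather than rules stated in the table: values falling in the gaps between printed cut-offs (for example 0.49 to 0.50, or 0.60 to 0.61) are treated as acceptable, the percent difference is rated on its absolute value, and a cross-classification result that meets only one of the two criteria is rated as poor.

```python
def rate_correlation(r: float) -> str:
    """Correlation coefficient: >=0.50 good, 0.20-0.49 acceptable, <0.20 poor."""
    if r >= 0.50:
        return "good"
    if r >= 0.20:  # assumption: the gap between 0.49 and 0.50 counts as acceptable
        return "acceptable"
    return "poor"


def rate_weighted_kappa(kappa: float) -> str:
    """Weighted kappa: >=0.61 good, 0.20-0.60 acceptable, <0.20 poor."""
    if kappa >= 0.61:
        return "good"
    if kappa >= 0.20:  # assumption: the gap between 0.60 and 0.61 counts as acceptable
        return "acceptable"
    return "poor"


def rate_p_value(p: float, alpha: float = 0.05) -> str:
    """Paired t-test, Wilcoxon signed rank test, or Bland-Altman correlation:
    P > 0.05 is a good outcome, P <= 0.05 a poor outcome."""
    return "good" if p > alpha else "poor"


def rate_percent_difference(pct: float) -> str:
    """Percent difference: 0.0-10.0% acceptable, >10% poor.
    Assumption: the sign (direction of error) is ignored when rating."""
    return "acceptable" if abs(pct) <= 10.0 else "poor"


def rate_cross_classification(same_pct: float, opposite_pct: float) -> str:
    """Cross-classification: good when >=50% fall in the same category
    and <=10% in the opposite category.
    Assumption: if either criterion fails, the outcome is rated poor."""
    if same_pct >= 50.0 and opposite_pct <= 10.0:
        return "good"
    return "poor"


if __name__ == "__main__":
    # Hypothetical values for a single nutrient, for illustration only.
    print(rate_correlation(0.42))
    print(rate_weighted_kappa(0.65))
    print(rate_p_value(0.03))
    print(rate_percent_difference(-8.0))
    print(rate_cross_classification(55.0, 5.0))
```

Keeping one function per statistical test mirrors the table rows, so each threshold can be checked against its source row and adjusted independently.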