Additional material
This manuscript presents additional material to the article entitled: “Liver fibrosis diagnosis by blood test and elastography in chronic hepatitis C: agreement or combination?” and its Supplementary material that includes Tables S1 to S7.
Accuracy evaluation
Impact of fibrosis stage on accuracy
Figure S4 (Supplementary material) shows the accuracy of tests for severe fibrosis as a function of fibrosis metrics according to Metavir F stages. With scoring, the accuracy curves of the three tests had similar V shapes, with nadirs at a stage adjacent to the cut-off for severe fibrosis (F2 or F3), i.e. relatively high accuracy at the extreme F stages (0 or 4) but progressively lower accuracy approaching the cut-off stage. This can be attributed to the classical border effect of the cut-off (see discussion). With classification, the shapes differed greatly, in particular with a marked attenuation of the cut-off nadir. Furthermore, FibroMeterV2G or VCTE alone had very poor F0 accuracy, but when combined, F0 accuracy improved greatly.
Impact of concordance on classification accuracy
The rate of patients correctly classified by each classification, as a function of concordance between the FibroMeterV2G and VCTE classifications, is described in Table 4 (main text).
The handling of discordance is stated in the main text. The following question remains to be addressed: which fibrosis stages should be retained in cases of partial concordance of classifications when using only constitutive tests? We note three facts: the median F stage within a class is the most prevalent (e.g. F3 in class 3); the accuracy of either classification is significantly related to the difference in median F stages between the two tests; and this accuracy falls below 87% when the median F difference exceeds 1F. Therefore, as expected, the median F stages with the smallest difference between the two classifications should be chosen; e.g. classes 3 and 2 lead to retaining 2/3, and classes 3 and 1/2 also lead to retaining 2/3.
Border effect of cut-off
Due to the apparent overlap in F stages between the classes adjacent to the cut-off in classifications, the discriminative ability of these classes for fibrosis staging can be questioned. Therefore, we evaluated the pathological meaning of these adjacent classes, i.e. class 2 (F2±1) and class 3 (F3±1) of test classifications for severe fibrosis diagnosis, and the corresponding impact of concordance status (Table S8). Briefly, in all three tests considered globally, and as expected, the Metavir F score was significantly higher in class 3 than in class 2, with a mean difference in Metavir F score around 0.8. This difference was significantly higher in patients with concordant constitutive tests (mean difference in Metavir F score around 1.3). By contrast, in patients with discordant constitutive tests, this F difference became non-significant and negative (mean difference in Metavir F score around -0.3) in FibroMeterV2G and VCTE. However, this paradoxical negative result observed in the constitutive tests was neutralized by their combination into FibroMeterVCTE2G (mean difference in Metavir F score at 0.6). Finally, FibroMeterVCTE2G was the only test to effectively reflect differences in Metavir F score between classes 2 and 3 whatever the concordance status.
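As a worked check of the Delta F column in Table S8, the short Python sketch below recomputes it as the class 3 vs class 2 difference in concordant patients minus the same difference in discordant patients. The input values are copied from Table S8; the variable and function names are illustrative only.

```python
# Mean difference in Metavir F score between class 3 and class 2 (Table S8).
diff_concordant = {"FibroMeterV2G": 1.32, "VCTE": 1.41, "FibroMeterVCTE2G": 1.16}
diff_discordant = {"FibroMeterV2G": -0.37, "VCTE": -0.27, "FibroMeterVCTE2G": 0.64}


def delta_f(test_name):
    """Delta of F differences between concordant and discordant tests."""
    return round(diff_concordant[test_name] - diff_discordant[test_name], 2)


for name in diff_concordant:
    print(name, delta_f(name))
```

The recomputed values (1.69, 1.68 and 0.52) match the Delta F column of Table S8.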
What role does fibrosis classification have in border effect rescue? This accuracy rescue concerns only one Metavir stage across the cut-off. For example, considering the diagnosis of severe fibrosis, a patient with Metavir F2 would be correctly classified in class 2 or 3 but not in class 3/4. Conversely, a patient with Metavir F3 would be correctly classified in class 2 or 3 but not in class 1/2. Finally, the error in F score estimation by diagnostic tests was calculated in the whole population via the absolute difference in mean F scoring between the test classifications and Metavir F stage (Table S9). Briefly, the F difference grew in the following order: correctly classified patients with concordant constitutive tests (F difference 0.5), then correctly classified patients with discordant constitutive tests (0.6), then incorrectly classified patients with concordant constitutive tests (1.7), then incorrectly classified patients with discordant constitutive tests (2.0).
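The two error measures reported in Table S9 (mean of absolute differences and mean of raw differences) can be sketched as follows. This is a minimal illustration, not the authors' code: the class-to-F mapping follows the Table S9 legend for class 2 and class 1/2 and is extended by analogy to the other classes, and the sign convention (test minus Metavir) is an assumption.

```python
from statistics import mean

# Median F value per test class. "2" -> 2 and "1/2" -> 1.5 come from the
# Table S9 legend; the remaining entries are assumed by analogy.
CLASS_TO_F = {
    "0/1": 0.5, "1": 1.0, "1/2": 1.5, "2": 2.0,
    "2/3": 2.5, "3": 3.0, "3/4": 3.5, "4": 4.0,
}


def f_differences(test_classes, metavir_stages):
    """Return (mean absolute difference, mean raw difference).

    The raw difference is computed as test minus Metavir (assumed sign).
    """
    raw = [CLASS_TO_F[c] - f for c, f in zip(test_classes, metavir_stages)]
    return mean(abs(d) for d in raw), mean(raw)


# Hypothetical patients, for illustration only.
abs_diff, raw_diff = f_differences(["2", "3", "1/2", "3/4"], [3, 3, 1, 2])
print(abs_diff, raw_diff)
```

The absolute difference accumulates individual errors, whereas the raw difference lets over- and under-estimations cancel out, which is why the two columns of Table S9 can differ markedly.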
Agreement between metric, test and Metavir reference
Agreement between diagnostic tests and severe fibrosis by Metavir F was evaluated as a function of fibrosis metric. FibroMeterVCTE2G classification was the metric providing the most accurate prevalence of estimated severe fibrosis (32.5% vs 32.0% by Metavir, p=0.929) with the highest kappa index (0.573) of agreement (Table S10). All other test metrics provided estimated prevalences of severe fibrosis significantly different from Metavir F staging (p<0.001, except VCTE by scoring).
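For readers who wish to reproduce this kind of agreement analysis, the sketch below computes a kappa index and a McNemar test on paired binary diagnoses (1 = severe fibrosis, 0 = no severe fibrosis). It is an illustration under stated assumptions: the McNemar statistic is computed without continuity correction, which may differ from the variant used for Table S10, and the example data are hypothetical.

```python
import math


def cohen_kappa(test, ref):
    """Kappa index of agreement between two paired binary ratings."""
    n = len(test)
    observed = sum(t == r for t, r in zip(test, ref)) / n
    p_test, p_ref = sum(test) / n, sum(ref) / n
    expected = p_test * p_ref + (1 - p_test) * (1 - p_ref)
    return (observed - expected) / (1 - expected)


def mcnemar_p(test, ref):
    """Two-sided McNemar p-value (chi-square, 1 df, no continuity correction)."""
    b = sum(1 for t, r in zip(test, ref) if t == 1 and r == 0)
    c = sum(1 for t, r in zip(test, ref) if t == 0 and r == 1)
    if b + c == 0:
        return 1.0
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical paired diagnoses, for illustration only.
test_dx = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
metavir = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
print(cohen_kappa(test_dx, metavir), mcnemar_p(test_dx, metavir))
```

A non-significant McNemar test together with a high kappa corresponds to the pattern reported for the FibroMeterVCTE2G classification: a similar estimated prevalence and good individual agreement.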
Interrelationships between fibrosis level and concordance or accuracy
These interrelationships were evaluated for the diagnosis of severe fibrosis.
Pathology impact on concordance
Our results indicate an interaction between fibrosis degree and concordance status. Thus, the level of fibrosis in discordant patients was different from that of concordant patients (see Figure S1, first panel). We therefore further analyzed these interactions with Metavir F staging. Briefly, as expected, the Metavir F score in patients with discordant constitutive tests was significantly higher in Metavir F<3 and significantly lower in Metavir F≥3 than in patients with concordant constitutive tests (Table S11). Therefore, the correlation of independent fibrosis descriptors (blood test, VCTE and liver morphometry) with Metavir F was significantly lower in patients with discordant constitutive tests than in patients with concordant constitutive tests (Figure S1). The discordance was not due to Metavir staging (e.g. interobserver variability) since the same differences were observed with the area of porto-septal fibrosis (Figure S1) and other blood tests (data not shown). This suggests that the pathological patterns of discordant patients were different from those of concordant patients.
Regarding the correlations of diagnostic tests with Metavir staging, both FibroMeterV2G and, to a lesser extent, VCTE had lower correlations in patients with discordant constitutive tests than in patients with concordant constitutive tests (Figure S1). Finally, FibroMeterV2G was inaccurate in patients with discordant constitutive tests but very accurate in patients with concordant constitutive tests. In that setting, VCTE compensated for FibroMeterV2G’s weakness in patients with discordant constitutive tests. Indeed, VCTE had a significantly higher correlation with FibroMeterVCTE2G than with FibroMeterV2G in patients with discordant constitutive tests whereas the contrary was observed in patients with concordant constitutive tests (details not shown).
Discordance
Regarding the independent predictors of discordance between VCTE and FibroMeterV2G for severe fibrosis, the interaction between Metavir F and VCTE or FibroMeterV2G is depicted in Tables S11 and S12. Briefly, as expected, the Metavir F score in patients with discordant constitutive tests was significantly higher in Metavir F<3 and significantly lower in Metavir F≥3 than in the respective patients with concordant constitutive tests (Table S11). Therefore, the correlation of fibrosis descriptors with Metavir F was significantly lower in patients with discordant constitutive tests than in patients with concordant constitutive tests (Table S12). This was not due to interobserver variability for Metavir staging since the same differences were observed with the area of porto-septal fibrosis (Table S12) and other blood tests (data not shown). Regarding the correlations of diagnostic tests with pathological descriptors, both FibroMeterV2G and, to a lesser extent, VCTE had lower correlations in patients with discordant constitutive tests than in patients with concordant constitutive tests (Table S13).
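The comparison of correlation coefficients between discordant and concordant patients (Fisher's z-test in Table S12) can be sketched as below. This is a minimal illustration assuming two independent groups and the standard variance 1/(n-3) for the transformed coefficient; the correlation values are taken from Table S12 (FibroMeterV2G), but the group sizes are hypothetical because they are not reported in this section.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent correlation coefficients.

    Returns (z statistic, two-sided p-value).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher z-transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# rs with Metavir F: 0.067 (discordant) vs 0.719 (concordant), Table S12.
# Group sizes (150 and 400) are assumed for illustration.
z, p = fisher_z_test(0.067, 150, 0.719, 400)
print(round(z, 2), p)
```

With these assumed group sizes the difference is highly significant, in line with the p<0.001 reported in Table S12; the exact statistic depends on the true sample sizes.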
Accuracy
Accuracy by classification was evaluated as a function of fibrosis degree and test concordance between FibroMeterV2G and VCTE for severe fibrosis (Table S14). There was a significant interaction between fibrosis degree and concordance status with FibroMeterV2G and FibroMeterVCTE2G. Indeed, the odds ratios in accuracy as a function of concordance were significantly different between Metavir F<3 and Metavir F≥3 for both tests. Thus, test concordance had a significant impact on accuracy in Metavir F<3 but not in Metavir F≥3 for both tests. However, compared to discordance, concordance tended to decrease the accuracy of FibroMeterVCTE2G in Metavir F≥3 (opposite effect to FibroMeterV2G) and significantly increased accuracy in Metavir F<3 in both tests. Therefore, the overall effect of concordance was not significant with FibroMeterVCTE2G, in contrast to FibroMeterV2G. Finally, discordance decreased the accuracy of FibroMeterVCTE2G in Metavir F<3.
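The interaction analysis above rests on stratum-specific odds ratios and on a test of their homogeneity. The sketch below illustrates the idea with a Mantel-Haenszel pooled odds ratio and Woolf's homogeneity test for two strata, used here as a simpler stand-in for the Breslow-Day test named in Table S14 (it is not the same test). The patient counts are hypothetical, chosen only to approximate the FibroMeterV2G percentages of Table S14.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table (a, b = row 1; c, d = row 2). No zero cells."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over several 2x2 tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def woolf_homogeneity_p(table_1, table_2):
    """Woolf's test that two strata share the same odds ratio (1 df)."""
    log_or = [math.log(odds_ratio(*t)) for t in (table_1, table_2)]
    var = [sum(1 / x for x in t) for t in (table_1, table_2)]
    chi2 = (log_or[0] - log_or[1]) ** 2 / (var[0] + var[1])
    return math.erfc(math.sqrt(chi2 / 2))


# Each table: (concordant correct, concordant incorrect,
#              discordant correct, discordant incorrect). Hypothetical counts.
f_lt_3 = (230, 20, 62, 39)   # about 92.0% vs 61.4% correctly classified
f_ge_3 = (115, 20, 54, 14)   # about 85.2% vs 79.4% correctly classified

print(odds_ratio(*f_lt_3), odds_ratio(*f_ge_3))
print(mantel_haenszel_or([f_lt_3, f_ge_3]))
print(woolf_homogeneity_p(f_lt_3, f_ge_3))
```

A small homogeneity p-value indicates that the effect of concordance on accuracy differs between Metavir F<3 and F≥3, which is the interaction described in the text.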
Table S8. Comparison of Metavir F score (mean±SD) between fibrosis classes adjacent to the severe fibrosis cut-off in classification as a function of concordance between FibroMeterV2G and VCTE for severe fibrosis.

| Tests | Fibrosis class 2 (F2±1) | Fibrosis class 3 (F3±1) | F difference between classes | p (a) |
|---|---|---|---|---|
| All patients: | | | | |
| FibroMeterV2G | 2.03±0.97 | 2.55±1.02 | 0.52±0.12 | <0.001 |
| VCTE | 1.97±0.97 | 2.78±1.03 | 0.82±0.12 | <0.001 |
| p (b) | 0.597 | 0.071 | - | |
| FibroMeterVCTE2G | 1.81±0.89 | 2.61±0.97 | 0.80±0.11 | <0.001 |
| Concordant tests: | | | | |
| FibroMeterV2G | 1.93±0.87 | 3.25±0.82 | 1.32±0.13 | <0.001 |
| VCTE | 1.66±0.90 | 3.07±0.82 | 1.41±0.13 | <0.001 |
| p (b) | 0.021 | 0.203 | - | |
| FibroMeterVCTE2G | 1.80±0.89 | 2.97±0.94 | 1.16±0.18 | <0.001 |
| Discordant tests: | | | | |
| FibroMeterV2G | 2.53±1.26 | 2.15±0.90 | -0.37±0.24 | 0.227 |
| VCTE | 2.45±0.83 | 2.18±1.16 | -0.27±0.19 | 0.228 |
| p (b) | 0.792 | 0.891 | - | |
| FibroMeterVCTE2G | 1.83±0.89 | 2.47±0.95 | 0.64±0.14 (c) | <0.001 |
| F difference by concordance: | | | Delta F (d) | |
| FibroMeterV2G | 0.60±0.24 | -1.10±0.13 | 1.69 | - |
| p (e) | 0.061 | <0.001 | <0.001 | - |
| VCTE | 0.79±0.12 | -0.89±0.20 | 1.68 | - |
| p (e) | <0.001 | <0.001 | <0.001 | - |
| FibroMeterVCTE2G | 0.02±0.12 | -0.49±0.21 | 0.52 (c) | - |
| p (e) | 0.832 | 0.020 | <0.001 | - |

(a) Comparison between classes 2 and 3 by unpaired Student t test
(b) Comparison between FibroMeterV2G and VCTE by unpaired Student t test
(c) p <0.001 vs F difference of FibroMeterV2G or VCTE
(d) Delta of F differences between concordant and discordant tests
(e) Comparison of F differences between concordant and discordant tests by unpaired Student t test
Table S9. Differences in F scores between test classifications and Metavir F stages as a function of test concordance between FibroMeterV2G and VCTE for severe fibrosis.
Difference is expressed as mean (±SD) of absolute differences, reflecting the cumulative individual differences, and as mean (±SD) of raw differences in brackets, reflecting the difference at the population level. In test classifications, F value was the median class value, e.g. F2 in class 2 (F2±1) or 1.5 in class 1/2 (F1/2).

| Metavir vs | FibroMeterV2G | VCTE | FibroMeterVCTE2G |
|---|---|---|---|
| All patients: | | | |
| Discordant | 1.05±0.76 (-0.59±1.16) | 0.82±0.71 (0.05±1.09) | 0.81±0.61 (-0.36±0.95) |
| Concordant | 0.62±0.45 (-0.02±0.76) | 0.63±0.50 (-0.12±0.80) | 0.56±0.53 (0.06±0.77) |
| p (a) | <0.001 (<0.001) | 0.001 (0.055) | <0.001 (<0.001) |
| Correctly classified: | | | |
| Discordant | 0.61±0.45 (-0.37±0.66) | 0.52±0.42 (0.15±0.66) | 0.66±0.46 (-0.18±0.79) |
| Concordant | 0.51±0.32 (-0.03±0.60) | 0.53±0.35 (-0.11±0.62) | 0.47±0.42 (0.07±0.62) |
| p (a) | 0.018 (<0.001) | 0.909 (<0.001) | <0.001 (<0.001) |
| Incorrectly classified: | | | |
| Discordant | 1.99±0.33 (-1.06±1.73) | 1.95±0.40 (-0.32±1.99) | 2.03±0.26 (-1.82±0.94) |
| Concordant | 1.58±0.29 (0.00±1.62) | 1.79±0.44 (-0.24±1.85) | 1.75±0.39 (0.03±1.82) |
| p (a) | <0.001 (0.001) | 0.092 (0.857) | 0.002 (<0.001) |

(a) Comparison between patients with discordant and concordant constitutive tests by unpaired Student’s t test
Table S10. Comparison of severe fibrosis prevalence (%) between diagnostic tests or fibrosis metrics.

| | FibroMeterV2G | VCTE | FibroMeterVCTE2G | p (a) |
|---|---|---|---|---|
| Metavir | 32.0 | 32.0 | 32.0 | - |
| Scoring (κ) | 39.5 (0.482) | 34.9 (0.531) | 41.5 (0.561) | <0.001 |
| Classification (κ) | 41.5 (0.474) | 25.0 (0.529) | 32.5 (0.573) | <0.001 |
| Comparison (p): | | | | |
| All (b) | <0.001 | <0.001 | <0.001 | - |
| Metavir vs scoring (c) | <0.001 | 0.111 | <0.001 | - |
| Metavir vs classification (c) | <0.001 | <0.001 | 0.929 | - |
| Scoring vs classification (c) | <0.001 | <0.001 | <0.001 | - |

VCTE: vibration-controlled transient elastography; κ: kappa index (all with p<0.001) with Metavir F
(a) Comparison between prevalences according to all diagnostic tests by paired Cochran test
(b) Comparison between prevalences according to all staging methods by paired Cochran test
(c) Comparison between prevalences according to staging method pairs by paired McNemar test
Table S11. Fibrosis degree as a function of test concordance between FibroMeterV2G and VCTE for severe fibrosis, fibrosis metric and severe fibrosis by Metavir F.

| | Metavir F: All | Metavir F<3 | Metavir F≥3 | p (a) |
|---|---|---|---|---|
| Metavir F score: | | | | |
| All patients | 2.00±1.13 | 1.32±0.57 | 3.44±0.50 | <0.001 |
| Scoring: | | | | |
| Discordant | 1.99±0.99 | 1.44±0.55 | 3.25±0.44 | <0.001 |
| Concordant | 2.00±1.19 | 1.27±0.57 | 3.51±0.50 | <0.001 |
| p (b) | 0.957 | 0.004 | 0.001 | - |
| Classification: | | | | |
| Discordant | 2.16±0.98 | 1.51±0.54 | 3.35±0.44 | <0.001 |
| Concordant | 1.94±1.18 | 1.26±0.57 | 3.53±0.50 | <0.001 |
| p (b) | 0.025 | <0.001 | <0.001 | - |
| Whole fibrosis area (%): | | | | |
| All patients | 5.1±3.9 | 3.9±2.2 | 7.7±5.3 | <0.001 |
| Scoring: | | | | |
| Discordant | 5.2±3.6 | 4.4±2.2 | 7.2±5.3 | <0.001 |
| Concordant | 5.0±4.0 | 3.7±2.2 | 7.9±5.3 | <0.001 |
| p (b) | 0.501 | 0.005 | 0.510 | - |
| Classification: | | | | |
| Discordant | 5.4±3.7 | 4.5±2.3 | 6.8±5.0 | <0.001 |
| Concordant | 5.0±3.9 | 3.7±2.1 | 8.1±5.4 | <0.001 |
| p (b) | 0.287 | 0.004 | 0.166 | - |

(a) Comparison between fibrosis stages by unpaired Student t test
(b) Comparison between discordant and concordant patients by unpaired Student t test
Table S12. Relationship between fibrosis descriptors and test concordance between FibroMeterV2G and VCTE for severe fibrosis by classification metric as a function of Metavir F stages.

| Metavir F | Discordant (FibroMeterV2G vs VCTE) | Concordant (FibroMeterV2G vs VCTE) | p (a) |
|---|---|---|---|
| F0: | | | |
| Area of portal fibrosis (%) | 0.65±0.37 | 0.51±0.43 | 0.653 |
| FibroMeterV2G | 0.56±0.41 | 0.25±0.14 | 0.018 |
| VCTE | 11.9±8.6 | 5.9±2.2 | 0.009 |
| FibroMeterVCTE2G | 0.72±0.03 | 0.25±0.15 | <0.001 |
| F1: | | | |
| Area of portal fibrosis (%) | 1.28±0.76 | 1.06±0.70 | 0.077 |
| FibroMeterV2G | 0.73±0.22 | 0.36±0.18 | <0.001 |
| VCTE | 8.6±4.7 | 6.0±1.9 | <0.001 |
| FibroMeterVCTE2G | 0.67±0.16 | 0.31±0.17 | <0.001 |
| F2: | | | |
| Area of portal fibrosis (%) | 1.90±0.96 | 1.88±1.04 | 0.904 |
| FibroMeterV2G | 0.77±0.18 | 0.53±0.21 | <0.001 |
| VCTE | 9.4±5.9 | 8.0±4.0 | 0.074 |
| FibroMeterVCTE2G | 0.75±0.14 | 0.50±0.24 | <0.001 |
| F3: | | | |
| Area of portal fibrosis (%) | 3.11±1.69 | 3.36±2.18 | 0.549 |
| FibroMeterV2G | 0.78±0.19 | 0.72±0.21 | 0.130 |
| VCTE | 9.5±2.9 | 12.1±5.9 | 0.005 |
| FibroMeterVCTE2G | 0.78±0.12 | 0.75±0.23 | 0.399 |
| F4: | | | |
| Area of portal fibrosis (%) | 6.54±8.11 | 7.44±6.17 | 0.665 |
| FibroMeterV2G | 0.76±0.18 | 0.90±0.13 | <0.001 |
| VCTE | 12.2±5.4 | 23.7±13.3 | 0.001 |
| FibroMeterVCTE2G | 0.88±0.07 | 0.94±0.012 | 0.033 |
| Comparison between F stages, ANOVA (p): | | | |
| Area of portal fibrosis | <0.001 | <0.001 | - |
| FibroMeterV2G | 0.394 | <0.001 | - |
| VCTE | 0.106 | <0.001 | - |
| FibroMeterVCTE2G | <0.001 | <0.001 | - |
| Correlation by rs (rp) (b): | | | |
| Area of portal fibrosis | 0.572 (0.448) | 0.718 (0.610) | 0.013 |
| FibroMeterV2G | 0.067 (0.090) | 0.719 (0.750) | <0.001 |
| VCTE | 0.260 (0.151) | 0.665 (0.653) | <0.001 |
| FibroMeterVCTE2G | 0.398 (0.401) | 0.738 (0.778) | <0.001 |

VCTE: vibration-controlled transient elastography
(a) Comparison between discordant and concordant patients by unpaired Student t test unless otherwise stated
(b) Correlation with Metavir F by Spearman coefficient (rs), with comparison by Fisher’s z-test (p), and by Pearson coefficient (rp)
Table S13. Correlation (rs) between the three diagnostic tests evaluated and pathological patterns or other tests as a function of test concordance between FibroMeterV2G and VCTE for severe fibrosis by classification metric.

| Pattern / test | All patients: FMV2G | All patients: VCTE | All patients: FMVCTE2G | Discordant: FMV2G | Discordant: VCTE | Discordant: FMVCTE2G | Concordant: FMV2G | Concordant: VCTE | Concordant: FMVCTE2G |
|---|---|---|---|---|---|---|---|---|---|
| Metavir fibrosis stage | 0.622 | 0.609 | 0.701 | 0.067 | 0.260 | 0.398 | 0.719 | 0.665 | 0.738 |
| Area of whole fibrosis | 0.377 | 0.443 | 0.454 | -0.169 | 0.242 | 0.109 | 0.478 | 0.466 | 0.494 |
| Area of porto-septal fibrosis | 0.519 | 0.547 | 0.592 | -0.081 | 0.352 | 0.257 | 0.620 | 0.568 | 0.628 |
| FibroMeterV2G | - | - | 0.868 | - | - | 0.184 | - | - | 0.926 |
| VCTE | 0.520 | - | 0.792 | -0.479 | - | 0.508 | 0.670 | - | 0.813 |
| Area of sinusoidal fibrosis | 0.086 | 0.137 | 0.128 | -0.182 | 0.110 | -0.025 | 0.130 | 0.130 | 0.133 |
| Area of steatosis | 0.293 | 0.257 | 0.283 | 0.071 | 0.090 | -0.06 | 0.337 | 0.288 | 0.304 |
| Metavir activity grade | 0.445 | 0.226 | 0.412 | 0.181 | 0.063 | 0.034 | 0.490 | 0.376 | 0.465 |
| Serum ALT activity | 0.560 | 0.367 | 0.453 | 0.426 | -0.077 | -0.055 | 0.569 | 0.454 | 0.481 |

Column groups refer to concordance between FibroMeterV2G and VCTE.
FMV2G: FibroMeterV2G, FMVCTE2G: FibroMeterVCTE2G, VCTE: vibration-controlled transient elastography
Best correlations among tests per patient group are shown in bold
Table S14. Patients correctly classified (%) for severe fibrosis by the classification metric according to diagnostic tests as a function of test concordance between FibroMeterV2G and VCTE for severe fibrosis by classification metric and presence of severe fibrosis by Metavir staging.

| Metavir F | Concordance | FibroMeterV2G | VCTE | FibroMeterVCTE2G |
|---|---|---|---|---|
| As a function of concordance status: | | | | |
| All | Discordant | 68.1 | 79.1 | 89.0 |
| | Concordant | 89.9 | 91.5 | 92.7 |
| | p (a) | <0.001 | <0.001 | 0.118 |
| F<3 | Discordant | 61.4 | 80.7 | 83.2 |
| | Concordant | 92.0 | 93.1 | 93.0 |
| | p (a) | <0.001 | <0.001 | 0.002 |
| F≥3 | Discordant | 79.4 | 76.5 | 98.5 |
| | Concordant | 85.2 | 87.9 | 91.9 |
| | p (a) | 0.285 | 0.031 | 0.112 |
| Homogeneity (b) | - | 0.001 | 0.460 | 0.003 |
| p (c) | - | <0.001 | <0.001 | 0.128 |
| As a function of fibrosis stage: | | | | |
| All: | | | | |
| F<3 | - | 84.4 | 90.0 | 90.6 |
| F≥3 | - | 83.4 | 84.3 | 94.0 |
| p (d) | | 0.738 | 0.031 | 0.133 |
| Discordant: | | | | |
| F<3 | - | 61.4 | 80.7 | 83.2 |
| F≥3 | - | 79.4 | 76.5 | 98.5 |
| p (d) | | 0.012 | 0.497 | 0.001 |
| Concordant: | | | | |
| F<3 | - | 92.0 | 93.1 | 93.0 |
| F≥3 | - | 85.2 | 87.9 | 91.9 |
| p (d) | | 0.023 | 0.057 | 0.673 |
| Homogeneity (e) | - | 0.001 | 0.460 | 0.003 |
| p (f) | - | 0.970 | 0.085 | 0.144 |

Vertical lines denote the corresponding statistical tests
(a) Comparison between discordant and concordant tests by unpaired χ² test
(b) Homogeneity of the odds ratio according to concordance between fibrosis classes by Breslow-Day test (interaction test)
(c) Comparison between discordant and concordant tests by Mantel-Haenszel test (global adjusted test)
(d) Comparison between fibrosis classes by unpaired χ² test
(e) Homogeneity of the odds ratio according to fibrosis classes between concordance classes by Breslow-Day test (interaction test)
(f) Comparison between fibrosis classes by Mantel-Haenszel test (global adjusted test)