Overview of the diagnostic value of biochemical markers of liver fibrosis (FibroTest, HCV FibroSure) and necrosis (ActiTest) in patients with chronic hepatitis C

Background Recent studies strongly suggest that due to the limitations and risks of biopsy, as well as the improvement of the diagnostic accuracy of biochemical markers, liver biopsy should no longer be considered mandatory in patients with chronic hepatitis C. In 2001, FibroTest ActiTest (FT-AT), a panel of biochemical markers, was found to have high diagnostic value for fibrosis (FT range 0.00–1.00) and necroinflammatory histological activity (AT range 0.00–1.00). The aim was to summarize the diagnostic value of these tests from the scientific literature; to respond to frequently asked questions by performing original new analyses (including the range of diagnostic values, a comparison with other markers, the impact of genotype and viral load, and the diagnostic value in intermediate levels of injury); and to develop a system of conversion between the biochemical and biopsy estimates of liver injury.
Results A total of 16 publications were identified. An integrated database was constructed using 1,570 individual data, to which recommended analytical methods were applied. The control group consisted of 300 prospectively studied blood donors. For the diagnosis of significant fibrosis by the METAVIR scoring system, the areas under the receiver operating characteristics curves (AUROC) ranged from 0.73 to 0.87. For the diagnosis of significant histological activity, the AUROCs ranged from 0.75 to 0.86. At a cut off of 0.31, the FT negative predictive value for excluding significant fibrosis (prevalence 0.31) was 91%. At a cut off of 0.36, the ActiTest negative predictive value for excluding significant necrosis (prevalence 0.41) was 85%. In three studies there was a direct comparison in the same patients of FT versus other biochemical markers, including hyaluronic acid, the Forns index, and the APRI index. All the comparisons favored FT (P < 0.05). There were no differences between the AUROCs of FT-AT according to genotype or viral load. The AUROCs of FT-AT for consecutive stages of fibrosis and grades of necrosis were the same for both moderate and extreme stages and grades. A conversion table was constructed between the continuous FT-AT values (0.00 to 1.00) and the expected semi-quantitative fibrosis stages (F0 to F4) and necrosis grades (A0 to A3).
Conclusions Based on these results, the use of the biochemical markers of liver fibrosis (FibroTest) and necrosis (ActiTest) can be recommended as an alternative to liver biopsy for the assessment of liver injury in patients with chronic hepatitis C. In clinical practice, liver biopsy should be recommended only as a second line test, i.e., in case of high risk of error of biochemical tests.


Background
One of the major clinical problems is how to best evaluate and manage the increasing numbers of patients infected with the hepatitis C virus (HCV) [1]. Liver biopsy is still recommended in most patients [2,3]. However, numerous studies strongly suggest that due to the limitations [4][5][6] and risks of biopsy [7], as well as the improvement of the diagnostic accuracy of biochemical markers [8,9], liver biopsy should no longer be considered mandatory.
The aim of this study was to summarize the diagnostic value of FibroTest and ActiTest (FT-AT) by an overview of the scientific literature and to respond to the following frequently asked questions by performing original new analyses: 1) What is the range of the FT-AT diagnostic values across the different studies? 2) What are the evidence-based comparisons between FT-AT and other published biochemical markers? 3) Are there differences in diagnostic values according to HCV genotype or viral load? 4) Are there differences between the FT-AT diagnostic values according to stages and grades? In other words, is FT better at predicting no or minimal fibrosis (F0 vs F1) or advanced fibrosis/cirrhosis (F3 vs F4) than at predicting intermediate levels of fibrosis (F1 vs F2)? And 5) What is the conversion between FT-AT results and the corresponding fibrosis stages and necrosis grades?
A total of 16 publications [8,9,11–21,24–26] and 4 abstracts [27–30] without corresponding publications were identified.

Diagnostic value of FT-AT among published studies
For 12 groups of patients detailed in 6 publications [8,11,12,14,19,26], it was possible to assess the prevalence of significant fibrosis and the FT area under receiver operating characteristics curve (AUROC) values, as well as the sensitivity and specificity for the 4 different FT cut offs (Table 1). For the diagnosis of significant fibrosis by the METAVIR scoring system, the AUROC ranged from 0.73 to 0.87, significantly different from random diagnosis in each study (Table 1), in meta-analysis (mean difference in AUROC = 0.39, random effect model Chi-square = 529, P < 0.001) (Figure 1, upper panel), or after pooling data in the integrated database (Table 2). For the cut off of 0.31, the FibroTest negative predictive value for excluding significant fibrosis (prevalence 0.31) was 91% (Table 2).
For four groups of patients detailed in two publications [8,11], it was possible to assess the prevalence of significant necrosis and the AT AUROC values, as well as the sensitivity and specificity for 4 different AT cut offs (Table 3). For the diagnosis of significant necrosis by the METAVIR scoring system, the AUROC ranged from 0.75 to 0.86, significantly different from random diagnosis in each study (Table 3), in meta-analysis (mean difference in AUROC = 0.29, random effect model Chi-square = 556, P < 0.001), or after pooling data in the integrated database (Table 4). For the cut off of 0.36, the ActiTest negative predictive value for excluding significant necrosis (prevalence 0.41) was 85% (Table 4).
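The pooled "mean difference in AUROC" values above come from a random effect model. The text does not state which estimator was used, so the following is only a minimal sketch assuming the DerSimonian-Laird method. The function name `random_effects_pool` and all input values are hypothetical and are not taken from the studies reviewed here.

```python
import math

def random_effects_pool(effects, std_errors):
    """Pool per-study effects with a DerSimonian-Laird random effects model.

    effects    : per-study effect sizes (here: AUROC minus 0.50)
    std_errors : per-study standard errors of those effects
    Returns (pooled effect, standard error of pooled effect, tau squared).
    """
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]  # fixed-effect (inverse variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2

# Hypothetical AUROC differences from 0.50 and their standard errors.
auroc_minus_half = [0.23, 0.37, 0.30]
standard_errors = [0.03, 0.03, 0.04]
pooled, se, tau2 = random_effects_pool(auroc_minus_half, standard_errors)
print(f"pooled difference = {pooled:.3f}, SE = {se:.3f}, tau^2 = {tau2:.4f}")
```

When the studies agree closely, the between-study variance is truncated to zero and the result reduces to the fixed-effect (inverse variance) estimate.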

Comparison of FT-AT diagnostic values with other biochemical markers
In four studies there was a direct comparison in the same patients of FT versus other biochemical markers, including hyaluronic acid [12], the Forns index [16], the APRI index [17] and the GlycoCirrhoTest [26]. All the comparisons were in favor of FT (Table 1) (Figure 1, lower panel), except for the GlycoCirrhoTest, which had a similar AUROC (0.87 vs 0.89 for FT) [26].

Integrated database
A total of 1,570 subjects were included in the integrated database. Of these, 1,270 were patients with chronic hepatitis C who tested PCR positive before treatment and who had had a liver biopsy and METAVIR staging and grading performed. Of these patients, 453 were from our center [11,14], including 130 patients coinfected with HCV and HIV [14]. Eight hundred and seventeen (817) patients were from a multicentre study, with 398 patients assessed at inclusion and 419 at the end of follow-up six months after treatment; 352 were investigated twice. Three hundred (300) healthy blood donors were also included [20].

Diagnostic value of FT-AT according to HCV genotype and viral load
There was no difference between the AUROC of FT-AT for the diagnosis of significant fibrosis (F2F3F4) (Figure 2A) or significant necrosis (Figure 2B) between 4 classes of genotype (1, 2, 3 and the rarer genotypes 4, 5, 6 grouped together). There was also no difference between the AUROC of FT-AT of patients with high or low viral loads for the diagnosis of significant fibrosis (Figure 2C) or significant necrosis (Figure 2D).

Diagnostic value of FT according to the independence of authors
Among the 13 published studies of FT (detailed in Table 1), 9 studies estimated FT and 4 studies compared FT to other non-invasive tests. Among the 9 studies estimating FT, 5 were performed by the same single center (non-independent center), two were performed in totally independent centers, and two were performed in multiple centers, including the non-independent center. The AUROCs for the diagnosis of F2F3F4, compared with a random AUROC of 0.50, were all significant and similar between these 3 groups in a meta-analysis: mean difference in AUROC = 0.29 (random effect model Chi-square = 549, P < 0.001), including 0.24 for independent, 0.25 for mixed and 0.36 for dependent studies. In the Callewaert et al. [26] study the AUROC of FT for the diagnosis of F4 was 0.89.

Diagnostic value of FT-AT according to stage and grade
The AUROCs between different stage combinations are given in Table 5. Between two contiguous stages (one stage difference), the AUROCs were not significantly different and ranged from 0.63 to 0.71. Between patients with a two-stage difference, the AUROCs were not significantly different and ranged from 0.75 to 0.86. Between patients with a three-stage difference, the AUROCs were not significantly different and ranged from 0.87 to 0.95. Between patients with a four- or five-stage difference (blood donors versus F3 or F4, and F0 versus F4), the AUROCs were not significantly different and ranged from 0.95 to 0.99.
The AUROCs between different grade combinations are given in Table 6. Between two contiguous grades (one grade difference), the AUROCs were not significantly different and ranged from 0.60 to 0.70. Between patients with a two-grade difference, the AUROCs were not significantly different and ranged from 0.75 to 0.86. Between patients with a three-grade difference, the AUROCs were not significantly different and ranged from 0.87 to 0.95. Between patients with a four-grade difference (blood donors versus F3 and F0 versus F4), the AUROCs were not significantly different and ranged from 0.95 to 0.99.

Discussion
Based on the limitations of liver biopsy and the present overview of the diagnostic value of FT-AT, it seems that these non-invasive markers should be used as a first line assessment.
Liver biopsy has three major limitations: the risk of adverse events [2,3,7], sampling error [4][5][6], and inter- and intra-pathologist variability [23]. An overview of published studies summarizes the risks of liver biopsy as pain (around 30%), severe adverse events (3/1,000) and death (3/10,000) [2,3,7]. Sampling variation is the major cause of variability [4][5][6]. In a study of patients with chronic hepatitis C that included only good quality biopsies, 30 of 124 patients (24.2%) had a difference of at least one grade, and 41 of 124 patients (33.1%) had a difference of at least one stage between the right and left lobes [4]. In 18 patients (14.5%), an interpretation of cirrhosis was made in one lobe, whereas stage 3 fibrosis was found in the other [4]. Recently, Bedossa et al. [6] observed very high coefficients of variation (55%) and high discordance rates (35%) for fibrosis staging in biopsies measuring 15 mm in length. The variability significantly improved in biopsies measuring 25 mm in length but was still very high, with a 45% coefficient of variation and a 25% discordance rate; the minimal variability was reached for biopsies 40 mm in length [6].
Liver biopsy also has potential advantages. Biopsy could be of diagnostic value for other unrecognized liver disease. These events are probably rare in practice, as we observed no such case in a prospective study of 537 consecutive patients with chronic hepatitis C [9]. For FT-AT it must be realized that the same predictive values were observed for patients coinfected with HIV [14], and in patients with other causes of liver fibrosis such as chronic hepatitis B [31], alcoholic liver disease [27] or non-alcoholic steatohepatitis [27].
It is possible that biochemical markers such as those described here may provide a more accurate (quantitative and reproducible) picture of fibrogenic and necrotic events occurring within the liver than hepatic biopsy. The greater accuracies of FT-AT when assessed with biopsy specimens greater than 15 mm, versus smaller biopsies, suggest that some discordances between FT-AT and histology were due to biopsy specimen sampling error [8]. Several case reports have observed false negatives of liver biopsy versus biochemical markers [8,9,11]. The error was attributable to biopsy because there were overt clinical signs of cirrhosis such as esophageal varices, low platelet counts or a dysmorphic liver on ultrasound. In a recent prospective study we estimated that 18% of discordances between FT-AT and histology were attributable to biopsy failure (mostly due to small length) and 2% to FT-AT failure [9].
The present work allowed frequently asked questions to be answered, the first being whether the diagnostic values of FT-AT had been confirmed in all studies performed to date. A major strength of the studies pertaining to FT-AT is that they were carried out on a large number of patients with chronic hepatitis C, and the results were reproducible in different populations, including patients coinfected with HIV. There was a small variability in the AUROCs. A weakness of this study was that the group that developed these tests performed most of the published studies. However, the independent published studies found the same significant diagnostic values as the non-independent or multicentre studies. Several recent independent studies confirmed the predictive value of FT-AT [26,30].
The second question concerned the comparison of FT-AT to other tests. In their recent review, Gebo et al. [10] concluded that panels of markers might have the greatest value in predicting the absence of fibrosis or no more than minimal fibrosis on biopsy, and in predicting the presence of cirrhosis on biopsy (Evidence Grade B). They pointed out that five studies [11,32–35] used large panels of markers and achieved the greatest predictive values. Among these 5 studies were the first FT-AT study [11] and another study developed by the same group (combining age and platelets) [34]. A recent study compared FT-AT to the age and platelets index in the same patients and found that FT-AT was significantly better [15]. Three studies directly compared FT-AT to hyaluronic acid [12], the Forns index [16] and the Wai index [17] in the same patients. FT-AT had higher diagnostic values (the AUROC was significantly higher). FT was in particular more sensitive for discriminating between F1 and F2, and more linearly correlated to stages when compared to those 3 other markers [12,16,17]. An additional weakness of the Forns index is the inclusion of cholesterol, which varies greatly in patients with genotype 3 [16]. The limitations of these three comparisons [12,16,17] are that they were retrospective and were performed by the same group. These comparisons, however, had no evident sources of bias. The comparison with the Forns index [16] included all patients of the Imbert-Bismut et al. study (n = 323) [11], as the parameters belong to the routine biochemical tests. The comparison with the APRI index included 249/323 patients (77%) without any difference between included or non-included patients when all characteristics were compared [17]. The comparison with hyaluronic acid [12] included a total of 165 out of the 244 (68%) randomized patients pre-included.
The 165 included patients did not differ from the 79 non-included patients according to the main characteristics. Among the 165 patients, the fibrosis index was assessed in 461 samples and hyaluronic acid in 457 samples [12].
Table 5 (footnote). The AUROCs between all different stage combinations are given. Between two contiguous stages (one-stage difference), the AUROCs are given in bold. Between patients with a two-stage difference, the AUROCs are given in italics. Between patients with a three-stage difference, the AUROCs are given in bold and italics. Between patients with a four- or five-stage difference (blood donors versus F3 or F4, and F0 versus F4), the AUROCs are underlined. Significant differences were observed between AUROCs when there was a two-stage or more difference.
Table 6 (footnote). The AUROCs between all different grade combinations are given. Between two contiguous grades (one-grade difference), the AUROCs are given in bold. Between patients with a two-grade difference, the AUROCs are given in italics. Between patients with a three-grade difference, the AUROCs are given in bold and italics. Between patients with a four- or five-grade difference (blood donors versus F3 or F4, and F0 versus F4), the AUROCs are underlined. Significant differences were observed between AUROCs when there was a two-grade or more difference.
Recently, a study using profiles of serum protein N-glycans found that one profile had an AUROC similar to that of FT for the diagnosis of compensated cirrhosis. When combined with FT this marker had 100% specificity and 75% sensitivity for the diagnosis of compensated cirrhosis, which is not significantly different from the 92% specificity and 67% sensitivity of FT alone [26]. This study was independent and prospectively designed to take FT as the comparison test. Only 24 patients with cirrhosis were included and no details were given concerning the causes of discordance between biopsy and biochemical markers.
However, FT-AT is the only panel of markers identified by an independent overview [9] that has been compared in the same patients with most of the other proposed markers. No studies were found that compared FT-AT with a panel of extra-cellular matrix markers [31]. Compared to other panels, FT-AT also allowed an estimation to be made not only of the fibrosis stage but also of the necroinflammatory (histological) activity.
The present analysis of the integrated database demonstrated that the diagnostic value of FT-AT did not depend on HCV genotype or viral load. However, because of the small number of patients included, further studies in genotypes 4, 5 and 6 would be useful.
The present analysis also answered another frequently asked question concerning the predictive values for the intermediate stages of fibrosis. Contrary to the initial hypothesis, the diagnostic values of FT-AT for consecutive stages of fibrosis and grades of necroinflammatory activity were the same for both moderate and extreme stages and grades. Our interpretation is that the same overlap exists between all stages, which is mainly related to the sampling error of the biopsy. It is very reassuring that the medians of FT-AT are linearly associated with stages and grades (Figures 3A, 3B). The linearity of this association became even more evident as a larger number of patients were included (data not shown).
Figure 3. Conversion between FibroTest and fibrosis stages, and between ActiTest and necroinflammatory activity grades (graphs). Panel B: ActiTest.
Finally, the integrated database allowed a simple conversion system to be proposed to clinicians between liver injury as estimated by FT-AT and that as estimated by liver biopsy (Figure 4). One conventional way to express the diagnostic values of FT-AT, using cutoffs of the distribution by stages and grades, is summarized in Tables 2 and 4. The negative predictive value of FT for excluding significant fibrosis was excellent at the 0.31 cutoff (91%), as was the negative predictive value of AT for excluding significant activity at the 0.36 cutoff (85%). The positive predictive value of the 0.72 cutoff of FT for significant fibrosis was also high at 76%. This, however, may appear lower than the negative predictive value. There is a technical explanation: the prevalence of significant fibrosis was only 0.31 in this population. Given the excellent specificity (above 0.95), the positive predictive value increased rapidly in populations with more fibrosis (data not shown). We recently observed that the main reason for this was probably that most of the so-called false positives of FT were in fact false negatives of biopsy due to the small size of liver biopsy samples [5,9]. The same comments can be made concerning the positive predictive value of AT for significant necrosis, which was 77% at the 0.60 cutoff. Again, it is probable that a large proportion of so-called false positives of AT were in fact false negatives of liver biopsies that were too small. The ideal study would use biopsies measuring 40 mm in length, obtained as two samples of 20 mm each during laparoscopy. Only this very high quality biopsy can be considered a true gold standard. Obviously this type of biopsy cannot be performed routinely as first line, but it could be recommended for clinical research.
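The dependence of predictive values on prevalence discussed above follows from Bayes' rule. The sketch below illustrates it under stated assumptions: the function name `predictive_values` is hypothetical, the specificity of 0.95 matches the lower bound quoted in the text, and the sensitivity of 0.50 is an illustrative value only, not a result from the integrated database.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical operating point: the sensitivity is illustrative, not from the source.
SENSITIVITY, SPECIFICITY = 0.50, 0.95

for prevalence in (0.31, 0.50, 0.70):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is the direction of the effect described for populations with more fibrosis.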

Conclusions
Based on these results, the use of the biochemical markers of liver fibrosis (FibroTest) and necrosis (ActiTest) can be recommended as an alternative to liver biopsy for the first line assessment of liver injury in patients with chronic hepatitis C.

Analysis of the literature
We did a search for all publications and communications between February 2001 and March 2004 with the key words "FibroTest" and "ActiTest" in Medline and in the abstract books of hepatology, gastroenterology, internal medicine and infectious diseases annual meetings. Only publications or abstracts concerning FT-AT in chronic hepatitis C were included.

Diagnostic value of FT-AT among published studies
For each study we assessed the diagnostic value for the diagnosis of significant fibrosis (bridging fibrosis or stages F2, F3, F4 according to the METAVIR scoring system) and significant necroinflammatory activity (moderate or severe necrosis, grades A2 or A3 according to the METAVIR scoring system) by the area under the receiver operating characteristics curve (AUROC).
For several databases it was possible to re-analyze the individual data and we looked at the sensitivity and specificity according to different thresholds (0.10, 0.30, 0.60 and 0.80). When FT-AT was compared to other biochemical tests, we also assessed the corresponding sensitivity and specificity according to several thresholds.
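The two measures used above, the AUROC and the sensitivity and specificity at fixed thresholds, can be sketched as follows. This is a minimal illustration, not the software used in the studies: the function names `auroc` and `sens_spec`, the convention that a score at or above the threshold counts as positive, and all score values are assumptions made for the example.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the probability that a diseased subject scores higher than a
    non-diseased subject (Mann-Whitney statistic); ties count one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def sens_spec(scores_pos, scores_neg, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    sensitivity = sum(s >= threshold for s in scores_pos) / len(scores_pos)
    specificity = sum(s < threshold for s in scores_neg) / len(scores_neg)
    return sensitivity, specificity

# Hypothetical marker values (0.00 to 1.00), not patient data.
significant_fibrosis = [0.35, 0.62, 0.71, 0.85, 0.90]      # biopsy F2, F3 or F4
no_significant_fibrosis = [0.05, 0.12, 0.28, 0.40, 0.65]   # biopsy F0 or F1

print(f"AUROC = {auroc(significant_fibrosis, no_significant_fibrosis):.2f}")
for threshold in (0.10, 0.30, 0.60, 0.80):  # thresholds named in the text
    se, sp = sens_spec(significant_fibrosis, no_significant_fibrosis, threshold)
    print(f"threshold={threshold:.2f}  sensitivity={se:.2f}  specificity={sp:.2f}")
```

The pairwise form is quadratic in the number of subjects, which is acceptable for a small illustration; a rank-based computation would be preferable for a database of more than a thousand subjects.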
Figure 4. Conversion between FibroTest and fibrosis stages, and between ActiTest and necroinflammatory activity grades (panels). Conversion between FibroTest and fibrosis stages using METAVIR, Knodell and Ishak fibrosis scoring systems (upper panel). Conversion between ActiTest and activity grades using METAVIR, Knodell and Ishak necroinflammatory activity scoring systems (lower panel).