
2.4 discussion and conclusions

Obtaining reproducible and in-focus images is of utmost importance for further image analysis.

WSSs have built-in procedures to calibrate illumination settings to make the acquisition conditions reproducible. Recently, some WSS manufacturers provided scanners with new "continuous" focusing methods that enable each FOV to be focused while maintaining the scanning speed. Although these new focusing methods provide relatively good results with H&E stained slides, they seem less efficient with other stains, including IHC. In contrast, WSSs of the previous generation, such as the one in use at DIAPath, use fast-focusing procedures that estimate a reasonable focus plane from a few focusing points.

Although this method works well most of the time, we observed that approximately 25% of the slides from the DIAPath routine scanning had to be rescanned. To prevent invalid quantification results, a quality control step to assess image sharpness was introduced.

During the elaboration of this quality control step, we found that focusing problems were visible only at magnifications of 10X and above, forcing the operator to assess VS sharpness at high magnifications. Due to the size and the number of VS images to assess, this quality control step was considered a tedious task that strongly reduced the scanning throughput. To accelerate this task and alleviate the scanner operators' workload, we developed an automated tool using a supervised classification method. This method proved to be efficient and reduced the strain on the operators during the sharpness assessment, while increasing the scanning throughput of the complete analysis workflow (see Figure 11) in comparison to the manual method.
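As an illustration only, the sketch below shows how such a supervised sharpness assessment could be set up: a per-tile focus descriptor is computed and fed to a classifier trained on operator-labeled tiles. The features and classifier shown here (variance of the Laplacian, gradient energy, a random forest) are assumptions made for this example; the actual feature set and classifier of our tool are those described in the paper.

```python
import numpy as np
from scipy.ndimage import laplace
from sklearn.ensemble import RandomForestClassifier

def tile_features(gray_tile):
    """Toy per-tile focus descriptors: variance of the Laplacian and mean
    gradient energy. Illustrative substitutes for the features used in our tool."""
    tile = gray_tile.astype(float)
    lap = laplace(tile)                     # second-derivative response
    gy, gx = np.gradient(tile)              # first-derivative responses
    return [lap.var(), (gx ** 2 + gy ** 2).mean()]

def train_sharpness_classifier(tiles, labels):
    """Train a classifier on tiles labeled sharp/blurred by an operator."""
    X = np.array([tile_features(t) for t in tiles])
    return RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
```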

However, the throughput could be further increased by interfacing our method with the scanner driver, as carried out in [40], to automatically define new focusing points and scanning areas (see Figure 1C of the paper). The computation of the features used for the classification could also be accelerated with a parallel implementation.

However, our first efforts to parallelize the feature computations on a graphics processing unit (GPU) were not conclusive. Indeed, the time needed to transfer the large amount of data to the GPU outweighed any benefit of the parallelization.

Finally, even when continuous focusing methods become widespread, a quality control step will remain necessary before publishing the VSs. Our tool should thus remain relevant in this new technological context.


Figure 10: Example of a pre-scanning step result with automatic tissue detection, focal plane division (maximum 1.5 mm length horizontally or vertically) and focusing point positioning (9 per focal plane).

[Figure 11 diagram; workflow steps shown: acquisition device calibration, image acquisition, image quality control (sharpness), manual or automated ROI definition, staining characterization, image registration, staining colocalization, morphological features extraction, staining segmentation, statistical analysis on patient cohort]

Figure 11: Contribution of our sharpness assessment tool in the biomarker analysis workflow. The green box indicates the step for which our developments were made, while the gray box shows the step which benefits from these developments.

3 tissue-based biomarker analysis

3.1 conventional biomarker expression analysis

In the Introduction chapter, we have presented certain pitfalls encountered in biomarker evaluation in clinical and research applications. Quantitative image analysis is often presented as a means to provide more precise and reproducible biomarker evaluation measurements, and to relieve pathologists of the time-consuming semi-quantitative scoring. The most common quantitative features used for staining characterization estimate the proportion of labeled cells (or LI) and the staining intensity, as described in [16]. All these features rely on the segmentation of the labeled versus negative tissue parts in the observed slide. Image processing techniques, such as colorspace conversion, can be used to ease this important segmentation step. In [67], the authors proposed an efficient method to separate stains into different channels, a process which is often referred to as "color deconvolution". The interested reader will find a description of our implementation of this method in the context of this thesis in appendix B.
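For the reader who wants a concrete picture before turning to appendix B, the following minimal sketch illustrates the general principle of color deconvolution: pixel values are converted to optical densities (Beer-Lambert law) and projected onto the inverse of a stain matrix. The stain vectors below are commonly used reference values for hematoxylin and DAB, not our calibrated ones, and this sketch is not the implementation described in appendix B.

```python
import numpy as np

# Example optical-density vectors for hematoxylin and DAB (commonly used
# reference values; an actual implementation may use calibrated vectors).
STAINS = np.array([
    [0.650, 0.704, 0.286],   # hematoxylin
    [0.268, 0.570, 0.776],   # DAB
    [0.0,   0.0,   0.0],     # residual channel, filled below
])
STAINS[2] = np.cross(STAINS[0], STAINS[1])           # make the matrix invertible
STAINS /= np.linalg.norm(STAINS, axis=1, keepdims=True)

def color_deconvolution(rgb):
    """Separate an RGB image (H, W, 3, uint8) into stain concentration maps."""
    od = -np.log10(np.clip(rgb.astype(float), 1, 255) / 255.0)   # Beer-Lambert
    concentrations = od.reshape(-1, 3) @ np.linalg.inv(STAINS)
    return concentrations.reshape(rgb.shape)   # channels: hematoxylin, DAB, residual
```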

Since accurate cell segmentation is difficult and highly dependent on cell density, computing the LI from automated cell counts does not provide the most accurate estimation, in contrast to the ratio of the positive area to the reference area. The term "reference area" applies either to the whole tissue area or to the total cell nucleus area, which are used as references for staining evaluations (application dependent) [16]. The interested reader may find [16] in appendix C.
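In symbols, with $A_{+}$ the positive (labeled) area and $A_{\mathrm{ref}}$ the reference area (tissue or total nucleus area, depending on the application), this area-based labeling index can be written as:

\[
\mathrm{LI} = \frac{A_{+}}{A_{\mathrm{ref}}}.
\]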

To evaluate staining intensity, commonly used measures are based on the total integrated intensity (TII), which is the sum of all pixel intensities computed only for positive areas. To compare intensities from tissue samples of different sizes, the TII can be normalized by either (i) the positive surface area or (ii) the (tissue/nucleus) reference surface area. The first normalization gives the mean intensity of the positive area, whereas the second quantifies the mean intensity of the entire reference area, such that negative pixels are considered to have zero intensity. The first feature is usually referred to as the MI, while the second is sometimes labeled the "quick score" (QS) because it corresponds to the product of the labeling index and the mean intensity [16].
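Writing $I(p)$ for the intensity of pixel $p$ and $P$ for the set of positive pixels, these measures can be summarized as follows (a direct transcription of the definitions above, which also makes the product relation explicit):

\[
\mathrm{TII} = \sum_{p \in P} I(p), \qquad
\mathrm{MI} = \frac{\mathrm{TII}}{A_{+}}, \qquad
\mathrm{QS} = \frac{\mathrm{TII}}{A_{\mathrm{ref}}}
            = \frac{A_{+}}{A_{\mathrm{ref}}} \cdot \frac{\mathrm{TII}}{A_{+}}
            = \mathrm{LI} \cdot \mathrm{MI}.
\]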

Despite the fact that the results obtained by image analysis are more precise and reproducible than manual scoring (under the condition that protocols exist to ensure reproducible image acquisition parameters), conventional measures such as LI, MI and QS can be influenced by staining batches, which may introduce variations in staining intensity. In addition to affecting intensity-based features (such as MI and QS), these variations directly impact the segmentation of positive versus negative pixels (and thus also the LI). The segmentation is often the result of a binary thresholding procedure for which the threshold was chosen by investigating "representative" FOVs. In heterogeneous tumors, the threshold should be set cautiously (and tested on multiple FOVs exhibiting various staining patterns), as a poor FOV selection may bias the segmentation. If all tissue samples involved in a study cannot undergo all the staining steps (antigen retrieval and counterstaining) at the same time (i.e. in a single batch), a common set of tissue samples, which covers the distribution of the IHC staining expression patterns under analysis, should be included in the different batches for staining feature normalization. As illustrated in [16], tissue microarray (TMA) technology is a useful tool for such inter-batch normalization.
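As a purely illustrative sketch (the channel, masks and threshold below are placeholders, not values from this work), the thresholding step and the conventional features discussed above can be combined as follows; it also makes explicit how a batch-dependent shift of the intensities or of the threshold propagates into LI, MI and QS:

```python
import numpy as np

def conventional_features(dab, tissue_mask, threshold):
    """Compute LI, TII, MI and QS from a deconvolved DAB channel.

    `dab`: DAB concentration image; `tissue_mask`: boolean reference-area
    mask; `threshold`: application-dependent cut-off that should be
    validated on FOVs covering the range of staining patterns.
    """
    positive = (dab > threshold) & tissue_mask      # binary segmentation
    a_pos, a_ref = positive.sum(), tissue_mask.sum()
    li = a_pos / a_ref                              # labeling index
    tii = dab[positive].sum()                       # total integrated intensity
    mi = tii / a_pos if a_pos else 0.0              # mean intensity of positives
    qs = tii / a_ref                                # "quick score" = LI * MI
    return li, tii, mi, qs
```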

Figure 12: Two FOVs of Ki67-stained tissue samples with close LI values (respectively 20.2% and 20.5%), although the staining patterns are very different.

Another limitation, which will be further discussed in the next section, is that the conventional features fail to distinguish very different expression patterns. These limitations motivated us to develop a topological approach for staining characterization that is less sensitive to staining segmentation errors (and to reasonable inter-batch staining variations). Our approach aims to characterize heterogeneous IHC expression distributions for nuclear biomarkers. In a first attempt to characterize the distribution of nuclear biomarkers, we tested graph-based methods (see appendix D for details) without obtaining the expected success. This is the reason why we tried a different approach, which uses an unsupervised learning method to cluster positive nuclei into so-called "hot-spots" (HSs). We tested our method on Ki67-stained high-grade glioma sections. Ki67 is a nuclear biomarker that is expressed in the nuclei of proliferating cells; therefore, Ki67 HSs are representative of highly proliferating regions.

3.2 novel features for biomarker expression analysis

Conventional features for characterizing IHC expression patterns have been successfully used in research to assess the prognostic value of several tissue-based biomarkers on patient cohorts [42, 44, 64, 81]. To do so, the biomarker expression of groups of patients with different prognostic or theragnostic outcomes is characterized with these features and the resulting values are compared. These features cannot, however, distinguish a diffuse staining pattern from a pattern organized in distinct dense regions (see Figure 12 for an example). In recent studies on glial brain tumors, tumor heterogeneity has been investigated by focusing the LI evaluation on tissue areas harboring high densities of stained cells, or HSs [57, 86]. In these studies, the HSs were identified by pathologists using visual (and therefore "manual") examination of low magnification images [57, 86].


In the case of proliferation biomarkers, such as the widely used Ki67, the detection of dense staining areas would isolate highly proliferative regions.

In the following paper, for which supporting information is provided in appendix F, we propose a method for detecting Ki67 HSs. The main challenge is that Ki67 is a nuclear biomarker; hence, segmenting the Ki67-positive nuclei alone does not suffice to automatically segment the Ki67 HSs. After segmenting the labeled nuclei, it remains necessary to group together, or cluster, close nuclei and to discard nuclei that belong to sparsely stained regions. This task requires a method capable of identifying an unknown number of clusters, which may be highly variable in terms of shape, size, and density. During our developments we observed the absence of a strict definition of what should be considered a HS, which caused high interobserver variability during the manual annotation of the Ki67 HSs by pathologists. Additionally, this absence prevented the use of a supervised classification approach. Therefore, we developed a hybrid clustering method that identifies good HS candidates and improves inter-expert agreement.

Our method, referred to as Seedlink, improves on DBSCAN, a well-known density-based clustering method that is also able to detect isolated data points in order to exclude them from the clustering process [20]. DBSCAN, which is described in the following paper, has two parameters whose values must be adapted for each dataset. If done manually, setting the values of the DBSCAN parameters is a trial-and-error process, and optimal parameter values are difficult to find automatically, as explained further in the paper. Furthermore, DBSCAN assumes that a single set of parameter values can fit the data characteristics in any region of the data space. This latter hypothesis was not supported by our real-world data, which may exhibit strong spatial heterogeneity. Our method automatically sets the DBSCAN parameter values to identify a (too) large number of small and dense clusters (that we labeled "seeds") and then aggregates them using the single-linkage hierarchical clustering method. This additional hierarchical clustering step allowed us to treat the spatial heterogeneity encountered in our data, a critical problem that remains unsolved with the use of DBSCAN alone [73]. By preclustering our data with DBSCAN before the single-linkage step, we removed the labeled nuclei belonging to sparsely stained regions from the clustering. The removal of those nuclei, which are considered as "noise" by DBSCAN, allows the use of the single-linkage criterion. Indeed, while this criterion is able to treat spatial heterogeneity, it is also known to erroneously link clusters due to the presence of sparse points between them (i.e. the "chaining effect").
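A minimal sketch of this two-stage idea is given below, assuming nucleus centroid coordinates as input; the parameter values are placeholders, and the automatic parameter-setting and seed-aggregation details of Seedlink, described in the paper, are not reproduced here.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from scipy.cluster.hierarchy import linkage, fcluster

def seeds_then_single_linkage(points, eps, min_samples, merge_distance):
    """Two-stage clustering sketch: DBSCAN isolates small dense "seeds" and
    discards sparsely distributed nuclei as noise; single-linkage clustering
    then aggregates the remaining nuclei into hot-spot candidates."""
    # Stage 1: dense preclustering; label -1 marks noise (nuclei from
    # sparsely stained regions), which would otherwise cause chaining.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    kept = np.flatnonzero(labels >= 0)

    hotspots = np.full(len(points), -1)
    if len(kept) < 2:
        return hotspots                      # nothing left to aggregate

    # Stage 2: single-linkage agglomeration of the non-noise nuclei,
    # cut at a distance threshold to obtain the hot-spot candidates.
    tree = linkage(points[kept], method="single")
    hotspots[kept] = fcluster(tree, t=merge_distance, criterion="distance")
    return hotspots

# Hypothetical usage on nucleus centroids expressed in micrometres:
# hotspot_ids = seeds_then_single_linkage(nuclei_xy, eps=30.0,
#                                         min_samples=10, merge_distance=60.0)
```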
