• Aucun résultat trouvé

Classification of Human White Blood Cells Using Machine Learning for Stain-Free Imaging Flow Cytometry

N/A
N/A
Protected

Academic year: 2022

Partager "Classification of Human White Blood Cells Using Machine Learning for Stain-Free Imaging Flow Cytometry"

Copied!
2
0
0

Texte intégral

(1)

Classification of human white blood cells using machine learning for stain-free imaging flow

cytometry

Maxim Lippeveld1,2, Carly Knill3, Emma Ladlow3,4, Andrew Fuller3, Louise J Michaelis5,6, Yvan Saeys1,2, Andrew Filby3†, and Daniel Peralta1,2†

1 Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium

2 Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium

3 Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, UK

4 Newcastle Upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK

5 Great North Childrens Hospital, Newcastle Upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK

6 Institute of Health and Society, University of Newcastle, Newcastle upon Tyne, UK

Keywords: Stain-free imaging flow cytometry· Classification·Bio-imaging

Motivation Imaging flow cytometry (IFC) is a technology used for studying heterogeneity in cell populations, for example in blood samples. An IFC in- strument typically produces two types of images per cell: (1) bright-field and dark-field images that capture transmitted and scattered light passing through a cell, and (2) up to 10 different fluorescence intensity images that capture flu- orescence emitted by targeted stains attached to the cell surface. These stains are used to identiy specific cell types.

The images are acquired at a throughput of 5000 cells/second. This fast ac- quisition of information-rich data makes IFC technology an ideal candidate for machine learning applications. In particular, since staining techniques are ex- pensive and possibly a confounding factor, a machine learning-aided application of interest isstain-free cell type classification.

Problem Identifying the cell type based on bright-field, dark-field and fluo- rescence images is relatively easy through targeted staining followed by manual gating. In manual gating cells are hierarchically subdivided into populations of cell types based on measurements extracted from the images, such as fluo- rescence intensity, image heterogeneity, cell roundness, and other morphology features. Each population subset is defined by selecting an area on a 2D-scatter plot of two measurements.

When the analysis is limited to only stain-free data, the identification be- comes much more challenging. Then, only morphological characteristics inherent

Denotes equal contribution.

Copyright c2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

(2)

2 M. Lippeveld et al.

to the cell can be captured in the dark- and bright-field images. Manual gating is in this case no longer feasible, as this linear, bivariate method is too simplistic to uncover the subtle morphological differences between cell types.

Approach In this work we assess whether we can use machine learning classi- fiers to tackle this problem. We use two IFC datasets in which immune cells were labeled with a ground truth cell type by an expert, using the targeted staining process described earlier. The classifiers are trained to predict this ground truth using only dark- and bright-field data.

Additionally, we extensively compare the performance of deep and classical machine learning on this problem. For the former, we use two convolutional neural nets: ResNet and DeepFlow. DeepFlow is an architecture optimized for IFC data. These models are optimized with the Adam optimizer from scratch using the images (90x90 pixels) acquired by the imaging flow cytometer. For the latter models, we use a random forest and gradient boosting classifier. These models are trained on a host of features computed on the images by IDEAS, a specialized software suite for IFC data.

We apply data augmentation on minority class instances to counter class imbalance present in the data. For feature-based models instances are randomly over-sampled, and for image-based models, instance images are randomly ro- tated, shifted, and flipped horizontally or vertically.

Results We obtained results on two datasets: (1) one contained eight labels identifying several white blood cell (WBC) types, the other (2) contained three, identifying eosinophils – an immune cell type circulating in the blood – in two activation states. The datasets contain 98 013 and 190 557 instances, respectively.

On the WBC dataset the best obtained results were 0.778 and 0.703 balanced accuracy for classical machine learning and deep learning, respectively. On the eosinophil dataset this was 0.871 and 0.856 balanced accuracy, respectively.

In both datasets, the classifiers struggle with fine-grained cell type classifi- cation. For example, in the WBC dataset, T-cells can be separated from other cell types relatively well with a recall of 0.789. However, fine-grained CD4+ and CD8+ T-cell classification is much more challenging (recall of 0.609 and 0.588, respectively). The same is seen for the eosinophil dataset, where eosinophils can be separated from other cells, but active and resting eosinophils cannot. This could be a limitation of stain-free IFC: the bright- and darkfield images might not contain enough information to accurately make a fine-grained distinction.

Conclusion We conclude that classifying cell types based on only stain-free images is possible with all four classifiers, but fine-grained classification remains challenging. Noteworthy, we also find that the deep learning approaches tested in this work do not outperform the approaches based on manually engineered features.

Acknowledgements The NVIDIA GPU Grant Program supported this work.

M. Lippeveld is a PhD fellow of the Research Foundation - Flanders (1SB9419N).

Références

Documents relatifs

Herein, we use single-cell mass cytometry to dissect the effects of graphene oxide (GO) and GO functio- nalized with amino groups (GONH 2 ) on 15 immune cell populations,

In haematology patients presenting with accessible pulmon- ary nodules or lesions, we strongly recommend further investi- gation via bronchoscopy, with lavage of the affected

classified as croissants manually but not by the CNN (false negative slippers) are depicted in the left image, whereas all false positive slippers are shown in the right montage

Our contributions focus on two issues that are specific to fluorescence imaging: a) the difficulty of associating objects from one frame to another in a noisy or a poorly

This article aims to use multi-resolution remote sensing imaging technology(two kinds of resolution images: the high-resolution images, such as ASTER images, and the low

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

This is encouraging since it proves that the fluorescent probe can accurately scan the nanofiber fundamental guided mode, whose evanescent field spreads 200 nm outside the surface,

MAHMOUDI ABDELKARIM Professeur d’Anesthésie-Réanimation Inspecteur du service de santé des Forces Armées Royales En témoignage de notre grand respect et notre profonde considération