Bringing 3D and Quantitative Data in Flexible Endoscopy
Benjamin Mertens
Jury: Prof. Pascal Kockaert (President), Prof. Olivier Debeir (Secretary), Prof. Alain Delchambre (Supervisor), Dr. Charles Beumier, Prof. Guido Costamagna, Prof. Jacques Devière
Original thesis submitted for the degree of Doctor of Engineering Sciences.
Academic year 2013-2014
To Lucie, Capucine & Ariane;
the lights who structure my life.
Acknowledgements
I would like to thank a number of people without whom this work would not be what it is.

First of all, Prof. Alain Delchambre, who offered me the opportunity to do a thesis in the BEAMS department and who supervised me despite his very busy schedule.

Prof. Jacques Devière of the Erasme hospital, who, between two flights, always kindly took the time to give me advice and to show me what gastroenterology is.

Nicolas Cauche, who opened the way to this collaboration with the teams of gastroenterologists at Erasme; he always gives good advice and puts others before himself.

Prof. Pierre Lambert, for his always relevant advice and the time spent at the water fountain or between two doors discussing this and that.

Prof. Christophe Collette, always in a good mood, and for proofreading the present document.

Prof. Olivier Debeir, who always has the answer to the problems I submit to him, and for his ideas.

Prof. Pascal Kockaert, who always puts his finger on the point that escaped me, inviting me to go ever further.

EndoTools Therapeutics, and in particular Alexandre Chau and Martin Hiernaux, for the collaboration and for allowing me to build my prototypes.

Regina Sauma, for correcting my English.

All those who were there, both in the department and outside, and who made me want to come in every morning.

Finally, and above all, Ariane, my other half, for supporting me throughout this work.
Abstract
In the near future, computational power will be widely used in endoscopy rooms. It will enable the augmented reality already implemented in some types of surgery. A preliminary step towards this goal is the development of a 3D reconstruction endoscope. In addition, endoscopists suffer from a lack of quantitative data to evaluate dimensions and distances, notably for polyp size measurement. In this thesis, a contribution towards a more robust 3D reconstruction endoscopic device is proposed. The structured light technique is used and implemented with a diffractive optical element. Two patterns are developed and compared: the first is based on the spatial-neighbourhood coding strategy, the second on the direct-coding strategy. The latter is implemented on a diffractive optical element and used in an endoscopic 3D reconstruction device. It is tested under several conditions and shows excellent quantitative results, but its robustness against bad visual conditions (occlusions, liquids, specular reflections, ...) must be improved.
Based on this technology, an endoscopic ruler is developed. It addresses endoscopists' lack of a measurement system. The pattern is simplified to a single line to be more robust. Quantitative data show sub-pixel accuracy, and the device is robust in all tested cases. The system has then been validated with a gastroenterologist to measure polyps. Compared to the literature in this field, this device performs better and is more accurate.
Acronyms
3D Three-Dimensional
CT Computed Tomography
DOE Diffractive Optical Element
DSfM Deformable-Shape-from-Motion
FOV Field Of View
FPS Frames Per Second
GERD GastroEsophageal Reflux Disease
GI Gastrointestinal
GPU Graphics Processing Unit
HD High Definition
NOTES Natural Orifices Transluminal Endoscopic Surgery
POV Point Of View
SfM Shape-from-Motion
SfS Shape-from-Shading
SLAM Simultaneous-Localisation-and-Mapping
SPA Single Port Access
ToF Time-of-Flight
Contents
1 Introduction . . . . 1
1.1 Introduction . . . . 1
1.2 Structure and Reading Suggestions . . . . 2
1.2.1 Structure of the Report . . . . 2
1.2.2 Reading Suggestions . . . . 2
1.3 Contributions . . . . 3
1.4 Context and Motivations . . . . 4
1.4.1 Evolution of Endoscopy . . . . 4
1.4.2 Technical Challenge: Providing Quantitative Information . . . . 7
1.4.3 Measuring in Endoscopy: Prior Art . . . . 8
1.4.4 Discussions . . . . 9
1.5 Establishment of the Medical Requirements List . . . 10
1.6 Conclusion . . . 10
2 Key Concepts . . . 13
2.1 What is 3D? . . . 13
2.2 3D From Parallax . . . 13
2.2.1 Generalisation to 3D . . . 14
2.2.2 Epipolar Geometry . . . 15
2.2.3 Theoretical Calculation of the Error . . . 16
2.3 What is an Endoscope? . . . 18
2.4 What is a Diffractive Optical Element? . . . 18
3 State of the Art . . . 19
3.1 Introduction . . . 19
3.2 Making 3D in Endoscopy . . . 19
3.2.1 Stereoscope . . . 20
3.2.2 Structured Light . . . 23
3.2.3 Shape-from-Motion . . . 26
3.2.4 Shape-from-Shading . . . 27
3.2.5 Time-of-Flight . . . 28
3.2.6 Other Technologies . . . 29
3.3 Discussion and Conclusion . . . 29
4 Imaging: Making 3D from Structured Light . . . 33
4.1 Introduction . . . 33
4.2 Structured Light . . . 34
4.2.1 Pattern Coding Strategies . . . 34
4.2.2 Defining Sequence and the Symbols . . . 34
4.2.3 Image Processing Constraints . . . 37
4.2.4 Discussions . . . 37
4.3 Spatial Neighbourhood-based approach . . . 39
4.3.1 Proposed Pattern . . . 39
4.3.2 Results . . . 43
4.3.3 Conclusion . . . 46
4.4 Direct coding-based Approach . . . 47
4.4.1 Hypotheses . . . 47
4.4.2 Construction of the pattern . . . 49
4.4.3 Decoding the Pattern . . . 50
4.4.4 Experimental Setup . . . 51
4.4.5 Results . . . 51
4.4.6 Conclusion . . . 54
4.5 Discussion and Comparison of Both Approaches . . . 55
4.6 Implementation on a Diffractive Optical Element . . . 57
4.6.1 Constraints . . . 58
4.6.2 Construction of the Pattern . . . 59
4.7 Conclusions . . . 60
5 Calibration . . . 61
5.1 Introduction . . . 61
5.2 Description of the Hardware . . . 61
5.3 Camera Calibration . . . 62
5.3.1 From Real World to Image . . . 62
5.3.2 Image Distortion . . . 63
5.3.3 Calibration Accuracy . . . 65
5.4 Structured Light System Calibration . . . 66
5.4.1 Prior Art . . . 67
5.4.2 Calibration Approaches . . . 69
5.4.3 Results and Comparison of both Approaches . . . 74
5.5 Conclusions . . . 77
6 Diffractive Optics . . . 79
6.1 Introduction . . . 79
6.2 Diffractive Optical Element . . . 80
6.2.1 Principle of Diffractive Optics . . . 80
6.2.2 Constraints . . . 81
6.2.3 Manufacturing of the Diffractive Optical Element . . . 82
6.3 Camera . . . 82
6.3.1 Origin of the Speckle . . . 83
6.3.2 Quantification . . . 86
6.3.3 Speckle Reduction . . . 87
6.4 Beam Source . . . 88
6.4.1 Light Coherence . . . 88
6.4.2 Influence of the Beam Spot Size . . . 90
6.4.3 Influence of the Wavelength . . . 90
6.5 Conclusions . . . 91
7 Device development . . . 93
7.1 Introduction . . . 93
7.2 Mechanical Development . . . 94
7.2.1 Technical Specifications . . . 94
7.2.2 Design Solutions . . . 94
7.2.3 External Device . . . 95
7.2.4 Embedded Device . . . 97
7.3 Synchronisation System Development . . . 98
7.3.1 Video Signals . . . 98
7.3.2 Implementation . . . 100
7.3.3 Results . . . 100
7.4 Conclusions . . . 101
8 Integration and Validation Tests . . . 103
8.1 Introduction . . . 103
8.2 3D Reconstruction Endoscopic Device . . . 104
8.2.1 Integration . . . 104
8.2.2 Integration Tests . . . 104
8.2.3 Discussion . . . 107
8.3 Endoscopic Ruler . . . 108
8.3.1 Integration . . . 109
8.3.2 Integration Tests . . . 110
8.3.3 Validation . . . 115
8.3.4 Discussion . . . 116
8.4 Conclusions . . . 117
9 Discussion, Conclusion and Further Work . . . 119
9.1 Discussions and Conclusions . . . 119
9.2 Further Work . . . 120
References . . . 122
A Interesting Videos and 3D Models . . . 127
A.1 Videos . . . 127
A.2 3D Models . . . 128
B External System Calibration . . . 135
C Appendix: Additional Results . . . 139
D Bill Of Material . . . 143
D.1 Devices . . . 143
D.2 Calibration . . . 143
D.3 Test Equipment . . . 143
E Publication List . . . 145
F Patent for Endoscopic Integration of the Device . . . 147
Chapter 1
Introduction
“If I had asked people what they wanted, they would have said faster horses.”
Henry Ford
1.1 Introduction
Colorectal cancer strikes more than 140,000 people every year in the United States; about 50,000 of them will die in 2014. Over a lifetime, the risk of developing this cancer is about 5%.
Gastroenterologists have great challenges ahead of them.
The complete digestive tract is a complex and sometimes hardly accessible 9-metre-long channel. The esophagus starts beyond the oral cavity and is a 25 cm long track with a diameter of 2 cm to 3 cm [1], see Fig. 1.1. Below the esophagus is the stomach, a chamber that expands up to 4 litres. At its base is the duodenum, where gastric juices coming from the pancreas and from the gall bladder are mixed with the alimentary bolus. It is directly followed by the 6-metre-long small intestine. Next is the large intestine, with variable diameters and four sections: the cecum, the sigmoid, the rectum and the anal canal.
The esophagus, the stomach and the duodenum are easily accessible through the oral cavity: this is upper gastrointestinal (GI) endoscopy. The large intestine is accessible through the anal canal: lower GI endoscopy. The sigmoid shape of the large intestine makes it difficult to access without injuring the walls.
Diseases can appear in every section of this long digestive channel, from the esophagus to the rectum. Different equipment and patient preparations are required for treatment. There is still room for innovation in this field; the interested reader may find out more in the paper published by Valdastri [1]. The gastrointestinal channel offers natural orifices, but its length and shape make every section challenging to access. Technology is of major importance to provide gastroenterologists with all the required tools to successfully diagnose and treat patients.
1. According to the American Cancer Society: http://www.cancer.org/
[Figure: anatomy diagram with labelled parts — esophagus (25 cm to 30 cm long, 2 cm to 3 cm in diameter), stomach (expanding from 0.1 L to 4 L), duodenum, gall bladder, pancreas, small intestine (6 m long, 3 cm in diameter), large intestine (1.5 m long), appendix, rectum, anal canal.]
Fig. 1.1 Anatomy of the gastrointestinal tract; modified from http://www.tutorialsolutions.com
1.2 Structure and Reading Suggestions
In this section, the structure of the report is presented; then, reading suggestions are detailed.
1.2.1 Structure of the Report
This report is structured in several parts, built like a building, see Fig. 1.2. Section 1.4 of the present chapter details the context and motivations: the basement. It develops the need and gives the technical specifications to fulfil. Next is the state of the art, chapter 3. It determines which technology is the most suitable to answer the need. The device, schematically represented in Fig. 1.3, is made of different parts called units. Each of them (chapters 4, 5, 6, 7) is developed and unit-tested on the first floor. A pattern is built (chapter 4), calibrated (chapter 5) and projected by a laser beam diffracted by a diffractive optical element (DOE) (chapter 6). The beam is carried by an optical fibre and focused with a lens.
A camera images the scene and provides 3D reconstruction and quantitative data. Based on this, two devices are developed on the next floor: a 3D reconstruction device and a measurement device (the endoscopic ruler). They are integrated in chapter 8 and detailed in sections 8.2 and 8.3 respectively. The integration test sections check whether the technical specifications are fulfilled or not. Then, the validation section checks whether the technical specifications are well defined or not. The electronic version of this report, in appendix A, contains videos and 3D models. This appendix must be consulted on a computer (the videos do not work on tablets).
1.2.2 Reading Suggestions
Chapters and sections given in table 1.1 are essential to understand this thesis.
[Figure: structure of the report as a building. Basement: Context & Motivations (section 1.4) and State-of-the-art (chapter 3). Units development floor: Pattern (chapter 4), Calibration (chapter 5), Diffractive Optics (chapter 6), Device Development (chapter 7). Upper floors: Integration (chapter 8) with the 3D Endoscope (section 8.2) and the Endoscopic Ruler (section 8.3); Validation (section 8.3.3) at the top.]
Fig. 1.2 Structure of the report; it is built like a building, where every floor relies on the one below to stand.
[Figure: schematic of the device — camera, optical fibre, lens, DOE, projected pattern.]
Fig. 1.3 The device is made of several units: an optical fibre brings a laser beam to a lens, which focuses the beam onto the diffractive optical element (DOE). The DOE diffracts the beam and projects the pattern onto the surface of interest. The scene is imaged by a camera.
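As a rough illustration of how such a structured-light unit recovers depth: the 3D position of a pattern point can be estimated by intersecting the camera's line of sight with the known projection ray leaving the DOE. The sketch below is a minimal, hypothetical example — the geometry and numeric values are illustrative, not the calibrated parameters of the actual device described in chapter 5.

```python
import numpy as np

def triangulate(cam_ray, proj_origin, proj_ray):
    """Estimate the 3D point seen along cam_ray (from the camera at the
    origin) and lit along proj_ray (from the projector at proj_origin).

    Solves s*cam_ray - t*proj_ray = proj_origin in the least-squares
    sense, then returns the midpoint of the two closest points, which
    compensates for noise when the rays do not intersect exactly.
    """
    A = np.column_stack([cam_ray, -proj_ray])
    (s, t), *_ = np.linalg.lstsq(A, proj_origin, rcond=None)
    return (s * cam_ray + proj_origin + t * proj_ray) / 2.0
```

In the real device, the camera ray would come from the calibrated camera model (chapter 5) and the projector ray from the identified pattern element (chapter 4); the values of the baseline and rays here are placeholders.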
Chapter/section | Title                                      | Page | Why?
1.4             | Context and Motivations                    | 4    | Introduces the need
2               | Key Concepts                               | 13   | Introduces the key concepts
3.3             | Discussion of the state-of-the-art         | 29   | Summarizes the state-of-the-art
4.6             | Pattern on the Diffractive Optical Element | 57   | Details the pattern used
7.2             | Mechanical development                     | 94   | Device used
8.2             | 3D Endoscope                               | 104  | Results of the 3D device
8.3             | Endoscopic Ruler                           | 108  | Results of the endoscopic ruler

Table 1.1 Reading suggestions for this thesis
1.3 Contributions
Several contributions were developed during this research at different levels. On a conceptual level, the major contribution is the Endoscopic Ruler device: it robustly provides quantitative data to physicians.
In particular, the analysis of the need and the technology used to address the problem are novel. The 3D reconstruction device also shows better results than those in the literature; notably, it is the first 3D reconstruction device based on a diffractive optical element in flexible endoscopy.
From a technical point of view, two patterns were developed for structured light, see sections 4.3 and 4.4. The first combines two characteristics (period and duty cycle) to limit the number of lines required for identification. The second proposes a direct identification of every element in a stripe-based pattern. The integration of structured light into flexible endoscopy is also developed here.
A new calibration of the structured light system is proposed. It is based on cubic splines and shows better results.
1.4 Context and Motivations

1.4.1 Evolution of Endoscopy
Endoscopy has a relatively short history behind it, and its evolution has greatly accelerated in the last decades. The evolution of endoscopy is presented here to give the reader the required background. Figure 1.4 gives an overview of the major evolutions in endoscopy; they are detailed in the following paragraphs.
[Figure: timeline of the major evolutions in endoscopy, from Edison's globe and the first endoscopic procedure, through the first fibre-based gastroscopy (1950s), the first video endoscopy and the first polypectomy, to CCD cameras (1985), HD and spectral endoscopy, capsule endoscopy (2000s), and augmented reality (2030?).]
Fig. 1.4 Timeline giving an overview of the major evolutions of endoscopy
Technical Evolution of Endoscopy: From Candle Light to Capsule Endoscopy

The first successful operative endoscopic procedure was fully conducted in the nineteenth century (1867) by Antonin Jean Desormeaux, known as the "father of endoscopy" [2]. He improved a device developed several years earlier by Phillip Bozzini: the Lichtleiter (the light guide, in German [3]). He used concave mirrors to conduct candle light into human cavities. In 1877, Nitze miniaturised Edison's filament globe to place it at the tip of the endoscope. Clinicians quickly realised they would not have a clear view without insufflation of the observed cavities; an air canal was added several years later. The first fibre-based flexible endoscope was invented in 1954 by Hopkins; endoscopy stepped into the modern era. Three years later, in 1957, Hirschowitz performed the first fibreoptic gastroscopy, see Fig. 1.5.
Fig. 1.5 First fibreoptic gastroscopy performed in 1957 by Hirschowitz [4]
Looking into an endoscope and manipulating it at the same time is not easy. Visual comfort was improved in 1956, when the first video endoscopy was conducted in France. The endoscope was connected to a regular television camera; the patient had to be taken to a video studio for this first experiment. This improvement became widely spread with the introduction of CCD image sensors in 1985. Endoscopists could now look at the screen instead of looking into the endoscope.
More recently, standard white light has evolved into spectral endoscopy: the use of various wavelengths to enhance the visibility of some diseases. Image quality has also improved, moving to high-definition images (HD: 1920x1080 pixels). Recent developments have miniaturised cameras and light sources to fit into a pill-sized endoscope. Viewing the complete small bowel is now easily achieved by this pill and is about to become widespread. Possibilities of diagnosis have thus evolved with the emergence of new technologies. From diagnosis, endoscopy has evolved to therapy.
Medical Evolution of Endoscopy: From Diagnosis to Endoscopic Surgery

At first, endoscopes were used exclusively for diagnosis, but the field quickly evolved to therapeutic endoscopy. Since the 1960s, early endoscopic procedures have been developed and generalised; in particular, the colonoscopic polypectomy first performed by Hiromi Shinya in 1969 [3, 4]. Morgenthal et al. [3] consider this procedure the most significant development in therapeutic endoscopy. New challenges are on the way, and Pauli and Ponsky [4] have divided developments into three categories. The first is the elimination of high-volume surgery: gastrostomy, morbid obesity, gastroesophageal reflux disease (GERD). The second is to improve the efficacy of surgery: localization of gastrointestinal bleeding or of polyps and lesions. The third is to eliminate complex operations with high morbidity: treatment of variceal haemorrhage, biliary interventions and management of Barrett esophagus.
This list is not exhaustive and is frequently extended. Recent advances in Natural Orifices Transluminal Endoscopic Surgery (NOTES) are leading gastroenterology into the era of real surgery. This technique uses natural orifices to access other cavities such as the peritoneal cavity. It represents a new opportunity to achieve scar-free procedures, see Fig. 1.6. In parallel, surgery with a Single Port Access (SPA) is being developed. SPA enables surgery through one single incision, with all the equipment passed through it.
Fig. 1.6 Access to peritoneal cavities through natural orifices [5]
Pauli and Ponsky [4] conclude with this:
“Clearly, surgeons must remain involved with flexible endoscopy if they wish to remain competitive. (...)”
In recent years, endoscopy has taken a big step towards minimally invasive surgery. From a simple viewing tool, endoscopy has evolved to gastrointestinal surgery.
Current Limits of Endoscopes and Further Developments

The combination of technical and medical developments has enabled great improvements in endoscopy. Gastroenterologists now carry out complex surgical procedures through natural orifices. Many big steps forward in medical procedures were made possible thanks to technical developments. Endoscopes continually show their limits, and engineers work hard to provide clinicians with improved devices.
Current Limits
Identifying limits is not simple; three of them are mentioned here. First, complex procedures are hard to complete with one hand. Ongoing developments are trying to solve this: triangulation systems provide the endoscope with a second hand. Second, standard 2D vision is not well adapted for real surgery: evaluating the position of the tool relative to the wall is not easy. Third, measuring distances and sizes is challenging. Users face problems in evaluating the working depth or in measuring distances and sizes with 2D cameras. In particular, the lack of depth perception is easily understood with tools such as Endomina, developed by Endotools Therapeutics S.A., see Fig. 1.7.
Fig. 1.7 Endomina developed by Endotools Therapeutics SA [6]
This tool provides endoscopes with two arms. It turns the endoscope from a one-handed procedure into a two-handed procedure, enabling, for example, lift-and-cut. Augmented reality may overcome these limitations, notably by reconstructing a 3D model of the endoscopic view. This point is addressed in the next paragraph.
Towards Augmented Reality Endoscopy
A recently published paper by L. Maier-Hein et al. [7] highlights the clinical applications of bringing 3D reconstruction to laparoscopy. Two conclusions can be applied to gastroenterology: first, intra-operative registration for augmented-reality guidance, and second, biophotonics². In this field, measuring 3D shapes and sizes is important for new biophotonics modalities [7, 8]. These techniques are often limited by a small field of view, and moving tissues are then hard to analyse. A 3D reconstruction of the tissue offers a way to overcome this limitation: the field of view is virtually expanded by a registration of the observed tissues. Figure 1.8 gives an overview of the challenges identified to be solved before having a widespread real-time augmented-reality 3D endoscope. Four main categories are considered: performance, integration, reliability and clinical acceptance.
Performance is the first category engineers should focus on. Maier-Hein et al. [7] have shown that, to their knowledge, no 3D reconstruction system is robust enough to handle the bad visual conditions of endoscopy (smoke, blood, homogeneous surfaces, ...). Surface registration is still difficult, and biomechanical modelling of deformable surfaces needs to be developed [7]. Sensor fusion would enable taking advantage of different technologies [7, 8]; for example, structured light is more robust on homogeneous surfaces, whereas stereoscopy is more robust on textured objects.
Integration and reliability are underlined by Valdastri et al. [1]: devices will spread in hospitals if they are properly integrated in current endoscopic systems and reliable. In particular, the validation needs
2. Biophotonics modalities consist in using light properties to analyse tissues in vivo.
[Figure: tree of challenges towards real-time augmented-reality 3D endoscopy.
1. Performance: robustness in the gastrointestinal environment (liquids, occlusions, smoke); non-rigid surface registration; biomechanical modeling of deformable surfaces; sensor fusion (synchronisation between sensors, taking advantage of different technologies); real-time performance; high-quality visualization.
2. Integration: miniaturisation; integration in actual endoscopic systems.
3. Reliability: high diagnostic accuracy; low complication rates; standardization to ensure reproducibility and informative value.
4. Clinical acceptance: widespread acceptance if integrated into hospital information systems; short learning curve; clinical workflow integration.]
Fig. 1.8 Towards a real-time augmented-reality 3D endoscope: the technical challenges.
to be made [7] under realistic and reproducible conditions. So far, all the experiments have been made with phantoms or under hardly reproducible conditions.
Clinical acceptance [7]: in the end, the systems will be used only if physicians are ready to adopt them.
Technical Context

Computational power has provided new opportunities for image processing, notably in terms of 3D imaging. Seeing a 3D movie at the theatre has become common, and it may soon be widespread in everyone's home. On the other hand, real-time processing of pictures with Instagram on a smartphone is another good witness of computational power. Imaging seizes all the opportunities offered by computation units; this is still the early stage of developments.
Conclusions

Developments have improved many aspects of endoscopy, and this is still the early stage. In particular, computational power is about to spread into clinical rooms. Image processing is a new wave of developments: the current "acquire and see" will soon be replaced by "acquire, process and see". Computers will provide clinicians with additional data to help them focus on the patient.

Many teams of engineers are working hard on new devices to address these limitations. In the near future, multi-modal imaging will be generalised. The integration of pre-operative images (CT scan, MRI, ...) with endoscopic views will soon be a reality; it would provide the augmented-reality environment. But before it becomes widespread, technical challenges need to be solved: performance, reliability, integration and clinical acceptance.
1.4.2 Technical Challenge: Providing Quantitative Information
In the context described above, a 3D reconstruction endoscope clearly appears to be a major step. Many teams of researchers have developed such endoscopes, as described in the state of the art (see chapter 3). However, as pointed out by Maier-Hein et al. [7], no 3D reconstruction system is robust enough to be used in endoscopy rooms. There are still some steps to go through to reach the goal, as illustrated in Fig. 1.9. In addition to 3D vision, placing all points in a 3D space mainly provides two functionalities: 3D navigation, and quantitative data for size measurements and shape analysis. The first one
could also be partly brought by 3D vision, as it enables a qualitative relief and shape analysis. The second one brings absolute size information (depth, distances, sizes, ...): the quantitative analysis.

[Figure: steps from a 2D view to augmented reality — 2D view → 3D reconstruction → functionalities (measurement capabilities (depth, sizes, ...) and 3D navigation) → image registration with pre-operative images → augmented-reality endoscopy.]
Fig. 1.9 Steps before having a real-time augmented-reality 3D endoscope; 3D reconstruction functionalities can be split into two categories: navigation and quantitative data.

Research has been conducted at every step of this sequence, especially in laparoscopy (and neurosurgery in particular), which has already implemented the augmented-reality environment. In flexible endoscopy and laparoscopic surgery, deformable environments and bad visual conditions lead to robustness issues.
Is 3D Vision Required?

Laparoscopy has already integrated 3D vision systems, while open surgery has "simply" reduced incision sizes. However, in flexible endoscopy, some studies have shown [9, 10] that 3D vision has no significant advantage over a good 2D vision; moreover, 3D vision gives headaches and is not comfortable enough [10]. This is balanced by Neumann et al. [11], who have recently shown that 3D vision can help in some specific cases, such as the Barrett esophagus diagnosis procedure. Augmented reality will definitely bring more to gastroenterology than regular 3D vision could. Contrary to surgeons, gastroenterologists are used to having a 2D view. The added value of 3D vision in flexible endoscopy is therefore limited.
Figure 1.10 gives an overview of the elements that led to this conclusion.
[Figure: mind map to a 3D reconstruction endoscope.
Context: development of endoscopic surgery; computational power development; mass development of 3D devices; development of triangulation systems.
Constraints: limited space; real time; actual image must be kept identical; gastrointestinal environment.
Need: depth perception; distance evaluation; distance measurement.
Functionalities: distance measurements; 3D view; registration with pre-operative images.]
Fig. 1.10 Mind map to a 3D reconstruction endoscope
1.4.3 Measuring in Endoscopy: Prior Art
Some research teams have already tried to address the need for quantitative data, notably for polyp size measurement in colonoscopy. This measurement is challenging and of great importance to diagnose which polyps could be malignant. The Paris classification has been established [12]³ and is the gold standard set of criteria for this type of diagnosis. This classification is based on the geometric analysis (size, shape) of polyps. It
3. This classification is well detailed on the World Endoscopy Organization website: http://www.worldendo.org/paris-endo-classification.html
is evolving [13], but the criteria remain based on geometry. In colon capsule endoscopy, measuring sizes is even more challenging. Contrary to endoscopy, where distances and sizes may be evaluated by moving the endoscope or using biopsy forceps, in capsule endoscopy there is no way to get this information [14].
Summers has published a review of polyp size measurement [15]. He reveals that polyp size evaluation is still challenging. Nowadays, several methods based on pre-operative image acquisition have shown good results. However, no in-vivo method is as reliable as size measurement after a polypectomy. Gopalswamy et al. [16] have compared three in-vivo methods for measuring polyp sizes: visual estimation, the open biopsy forceps method⁴ and a linear probe with ten markings (each of 2 mm).
According to them, the last is the most efficient. Unfortunately, according to Summers [15], the open biopsy forceps method is the most frequently reported in the literature. According to Schoen et al. [17], pathologic size measurements of polyps must be preferred to the unreliable endoscopic estimations: measurements differ from the reference standard by more than 20%. In the review published by Summers et al. [15], it has been shown that colonoscopic measurements were 12% higher than measurements based on pre-operative images (CT). On the other hand, pathologic measurements were 25% smaller than those at CT colonography. Barancin et al. [18] confirmed this result: they have shown that in-vivo and CT measurements were less reliable than ex-vivo measurements.
Some teams have developed dedicated devices to answer this need. Hyun et al. [19] have combined two endoscopes to get quantitative 3D information: users had to manually select corresponding points on both images to locate them in 3D. Nakatani et al. [20] have projected four laser spots to estimate the working distance; the image is then scaled with the ratio estimated from that distance. Their main assumption is that the observed area is approximately planar (which is a rough approximation). Depth-From-Focus has been implemented by Chadebecq et al. [21]. They combined Depth-From-Focus and Depth-From-Defocus⁵ to recover the absolute scale. No additional hardware is required for this method, but the endoscopist has to move away from the surface to enable the detection. Another issue may be due to the large depth of field of endoscopic cameras, which decreases the system accuracy.
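The scaling idea behind spot-projection methods such as that of Nakatani et al. can be summarised with a pinhole camera model: once the working distance is known, an on-image length converts directly to a physical length. The sketch below is a hedged illustration of that principle, not their actual implementation; the function name and the numeric values are assumptions.

```python
def real_size_mm(size_px, depth_mm, focal_px):
    """Convert an on-image length (pixels) to a physical length (mm)
    under a pinhole model, assuming a roughly fronto-parallel object
    plane (the planarity assumption mentioned above): an object of
    size S at depth Z images to S * f / Z pixels, so the inverse
    mapping is px * Z / f."""
    return size_px * depth_mm / focal_px

# Illustrative values: a 100 px polyp at 50 mm with a 500 px focal length
# maps to a 10 mm physical size.
```

The weakness discussed in the text is visible here: any error in the estimated depth or any deviation from planarity propagates linearly into the size estimate.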
All these devices try to enable the resect-and-discard strategy, which consists in resecting and discarding small polyps without further analysis. No in-vivo measurement device seems to be efficient enough for clinicians. This is a great opportunity for research.
1.4.4 Discussions
It has been shown that 3D reconstruction in endoscopy faces robustness issues, which must be solved before it gets into endoscopy rooms. Such devices would bring both quantitative and qualitative information. The former is missing in endoscopy, in particular for size and depth evaluation; the latter is not widely recognised as relevant in endoscopy. Moreover, no measuring device seems to be efficient enough to be integrated in endoscopes.
The analysis of this section is summarised as follows:
• No 3D reconstruction device is robust enough to be widespread in clinical rooms.
• 3D reconstruction devices could bring quantitative and qualitative information.
• Qualitative information relies on the quality of the image more than on depth perception.
• Quantitative information is missing in endoscopy.
It is proposed here to split the quantitative and qualitative problems: first, research contributing to a more robust 3D reconstruction device; second, the development of a robust quantitative measurement tool.
As detailed in Chapter 8, the endoscopic ruler is, in fact, a simpler and more robust 3D reconstruction device.
4. Knowing the forceps size, the polyp size is roughly estimated.
5. These methods are described in section 3.2.
1.5 Establishment of the Medical Requirements List
Defining the specifications of a device is a complex and iterative task [22]. The following list focuses on the prototype that would validate the concept; it will have to be extended in further developments. According to Pahl et al. [22], the first step is to define the principal and complementary functionalities; the constraints and the requirements list itself follow.
Principal Functionalities

Here are described the functionalities that should be fulfilled by the device. The resolution should permit the detection of millimetre-sized abnormalities, and at minimum be below the millimetre, as detailed in the Paris Classification [12].
1. Depth evaluation of the working distance, resolution below 1 mm;
2. Distance measurements, resolution below 0.2 mm;
3. Emphasis of the wall relief to distinguish abnormalities, resolution below 0.2 mm.
Complementary Functionalities

The complementary functionalities are expected by the user without having been formally demanded. These requirements are much less important in this research, which focuses on a proof of concept.
1. Easy embodiment on/into an endoscope;
2. Real time calculation (less than 40 ms to ensure 25 images per second);
Constraints

Constraints have multiple sources: legal, geometry, non-invasiveness, performance. In this case, the main constraint is the gastrointestinal environment and all its implications: space available, visual environment, etc.
1. Legal: respect of medical device standards and of existing patents and copyrights.
2. Geometry: space available limited.
3. Non-invasiveness: the quality of the original image must be preserved: the system should not be visually invasive.
4. Robustness: robustness is a key concept and must be ensured.
Robustness, as used in this report, refers to robustness against bad visual conditions, which are mainly due to the biological environment (smoke, blood, etc.). The system is considered robust if results are good whatever the environment. In a 3D reconstruction, robustness is measured by the number of points that are well positioned compared to the total number of points (quantitatively), or simply by the quality of the visual rendering (qualitatively).
Requirements List

The requirements list addresses the need detailed in this section.
All these criteria are gathered in Table 1.2. The device may not fulfil all the requirements but should at least be compatible with all these aspects. This list must also be updated with further developments.

Requirement      Description
Size             Total diameter of the device and the endoscope below 20 mm
Geometry         No sharp edges
Environment      Gastrointestinal environment, respect of all the applicable medical device standards
Safety           Safe for the patient and the user
Resolution       The resolution of the depth and distance measurements must remain below 0.2 mm
User Interface   Easy to handle, original image preserved
Performance      Real-time calculation, must allow 25 images per second
Mounting time    Must remain below several minutes

Table 1.2 Requirements list

1.6 Conclusion

This chapter has raised the question to be solved in this thesis. More than 3D vision, endoscopists need quantitative data, for multiple reasons. However, augmented-reality endoscopes are soon going to be required and need to be developed in parallel. In this thesis, two parts are developed. First, a contribution is brought to develop a 3D reconstruction endoscope. Based on this device, a simpler but more robust measurement device is developed to answer the need for quantitative data raised here above.
Chapter 2
Key Concepts
“In the sciences, the path is more important than the goal. The sciences have no end.”
Erwin Chargaff
2.1 What is 3D?
The third dimension refers to the third axis: depth. Two different sorts of 3D are distinguished here: 3D vision and 3D reconstruction.
The first one is reconstructed by the brain. When people go to see a 3D movie at the theatre, they wear 3D glasses to perceive depth, see Fig. 2.1a. Two images are sent, one to the left and one to the right eye; depth is then automatically reconstructed by the brain.
(a) Principle of 3D vision (image from http://evolutionducinema.blogvie.com/)
(b) Principle of 3D Reconstruction (image from http://fond-ecran-colore.com/) Fig. 2.1 3D Vision versus 3D Reconstruction
The second one is the 3D reconstruction; it is a computer model, completely independent of the way it is displayed, in 2D or in 3D. For example, a Pixar movie is generated in 3D but could be displayed in 2D. In other words, 3D reconstruction is a computer model, see Fig. 2.1b. In a nutshell, 3D vision is 3D in the brain and 3D reconstruction is 3D in a computer.
2.2 3D From Parallax
There are several techniques to get 3D data. One of them is based on parallax and is very important in this thesis. An object viewed from different points of view shows a difference in its apparent position; this is the parallax principle, see Fig. 2.2. One single Point Of View (POV) is not sufficient for finding the depth coordinate: the red dot could be anywhere along the line going from POV 1 to the red dot; the other point of view (POV 2) is required to solve this problem. Coordinates (x, z) of the red dot are easily calculated given the distance d between both points of view and the angles α and β. The interest point is at the intersection of both lines. In this ideal case, the x and z positions are given by Eq. 2.1 and illustrated in Fig. 2.3.

Fig. 2.2 Illustration of the parallax: a different position of the point of view results in different views.
Fig. 2.3 Calculation of the depth: the red dot is at the intersection of the lines projected from POV 1 and POV 2. Parameters are the baseline d and the angles α and β.
x = z tan(α)
x = d + z tan(β)        (2.1)
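As a minimal illustration (a sketch, not code from the thesis), Eq. (2.1) can be solved directly for the intersection point, assuming the baseline d and both ray angles are known exactly:

```python
import math

def triangulate_2d(alpha, beta, d):
    """Intersect the two viewing rays of Eq. (2.1):
    x = z*tan(alpha) and x = d + z*tan(beta).
    Returns the (x, z) coordinates of the interest point."""
    denom = math.tan(alpha) - math.tan(beta)
    if abs(denom) < 1e-12:
        # Parallel rays: no parallax, so the depth cannot be recovered.
        raise ValueError("no parallax between the two points of view")
    z = d / denom
    x = z * math.tan(alpha)
    return x, z
```

For a point at (1, 2) seen from POV 1 at the origin and POV 2 at d = 1, the ray angles satisfy tan(α) = 1/2 and tan(β) = 0, and the function recovers (1, 2).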
2.2.1 Generalisation to 3D
The principle described in the previous section is easily generalised to 3D. The first point of view (POV1) is considered as the reference point: (0, 0, 0). In this reference frame, every line (reduced, without loss of generality, to its directing vector) going from any point to POV1 has the parametric equations (2.2). In equations (2.2) and (2.3), u (respectively v′) is the equation parameter and α_x, α_y (respectively β′_x, β′_y) provide the slope to the red ball in its reference frame. One single reference frame is required to solve this system of equations. Lines of Eq. (2.3) (the coordinate frame of POV2 is noted with a prime ′) are
Fig. 2.4 Generalisation for 3D calculation, a third equation is required for the third coordinate; it is given by the second point of view.
x = u tan(α_x)
y = u tan(α_y)
z = u                                   (2.2)

x′ = v′ tan(β′_x)
y′ = v′ tan(β′_y)
z′ = v′                                 (2.3)
rotated with a rotation matrix R and translated with the translation vector T = (x0, y0, z0) to the reference frame of POV1. Lines are defined by their origin and their directing vector; in matrix notation:
P_i = R⁻¹ (P′_i − T)                    (2.4)

Or, in the other coordinate frame, see Eq. 2.5:

P′_i = R P_i + T                        (2.5)
The system of equations (2.2), (2.3) and (2.4) gives the 3D position of the red dot. The solution is easily calculated and is given in equation (2.6).

x = z tan(α_x)
y = z tan(α_y)
z = (x0 − z0 tan(β_x)) / (tan(α_x) − tan(β_x))
z = (y0 − z0 tan(β_y)) / (tan(α_y) − tan(β_y))      (2.6)
Three unknowns must be determined: x, y, z, while the system provides four equations: two from Eq. (2.2) and two from Eq. (2.3). Equations (2.6) show that the z-coordinate is calculated in two different manners, one using the x-equations and the other using the y-equations. Geometrically, the x-based calculation is the intersection in the xz plane and the y-based solution is the intersection in the yz plane. If the lines perfectly intersect in one single point, both calculated z are the same; otherwise, two different z-coordinates are obtained. The next section explains how to deal with this issue: epipolar geometry.
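To make the redundancy concrete, here is a small sketch of Eq. (2.6) with made-up values, assuming for simplicity that R is the identity (so the β angles coincide in both frames):

```python
import math

def triangulate_3d(alpha_x, alpha_y, beta_x, beta_y, x0, y0, z0):
    """Solve Eq. (2.6), assuming R = identity.
    (x0, y0, z0) is the origin of POV2 expressed in the POV1 frame.
    Returns (x, y, z_from_x, z_from_y): the two redundant depth estimates."""
    z_x = (x0 - z0 * math.tan(beta_x)) / (math.tan(alpha_x) - math.tan(beta_x))
    z_y = (y0 - z0 * math.tan(beta_y)) / (math.tan(alpha_y) - math.tan(beta_y))
    z = 0.5 * (z_x + z_y)  # average; |z_x - z_y| measures how well the rays meet
    return z * math.tan(alpha_x), z * math.tan(alpha_y), z_x, z_y

# Point P = (1, 1, 2) observed from POV1 at the origin and POV2 at (1, 0.5, 0):
x, y, z_x, z_y = triangulate_3d(math.atan2(1, 2), math.atan2(1, 2),
                                math.atan2(0, 2), math.atan2(0.5, 2),
                                1.0, 0.5, 0.0)
```

In this noise-free case both depth estimates agree (z_x = z_y = 2) and (x, y) = (1, 1); with noisy measurements the two estimates differ.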
2.2.2 Epipolar Geometry
Here are introduced the basics of epipolar geometry. The interested reader may find out more in chapter 8 of the book by Hartley and Zisserman [23]. Explanations given here are described in the context of stereovision, but the principle is applicable to all parallax-based 3D reconstruction systems, particularly to structured light, where one camera is replaced with a projecting device. It is assumed that two cameras are used: a left (C) and a right (C′) camera. Details about structured light are provided in section 3.2.2.
Let P be an interest point (red dot) in 3D space. Given C and C′, the origins of the cameras, the epipolar plane π is the plane built by these three points, as illustrated in Fig. 2.5. The distance C–C′ is called the baseline. The epipolar line (in red) is the image of the epipolar plane. Assume now that the position of camera 2 relative to camera 1 is well known. For every interest point P, the epipolar plane is directly found
(a) The epipolar plane is defined by the camera origins C, C′ and the object point P.

(b) The epipolar line is the image of the epipolar plane π in the image plane. An object P viewed in the left camera is searched along the epipolar line to find its 3D coordinates.
Fig. 2.5 Illustration of the epipolar plane and of epipolar lines [23].
given the fixed points C, C′ and the point of interest P. This has two implications. First, for every point P imaged in camera 1, its image in camera 2 lies along a known line (the epipolar line); the 3D problem is brought back to a series of 2D problems. Second, a displacement in depth in the world coordinates results in a displacement along the epipolar line in the image plane. In practice, the point is identified in the camera 1 image; as the second camera position is known, the epipolar plane is found.
The corresponding point in camera 2 image is searched along the epipolar line.
In the example given in equation (2.6), the xz plane is the epipolar plane. Thus, the y-based equation is redundant in both equations (2.2) and (2.3). If epipolar lines are horizontal, the calculation of z with the y coordinate is not possible, as tan(α_y) − tan(β_y) = 0. If the epipolar plane does not coincide with the xz plane, the second calculation method may be used as an evaluation of the error.
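The epipolar constraint can be written compactly with the essential matrix E = [t]ₓR: any correct match p₂ of a normalized image point p₁ satisfies p₂ᵀ E p₁ = 0. The sketch below uses illustrative poses (not the thesis setup) with a purely horizontal baseline:

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

R = np.eye(3)                       # camera 2 orientation (identity here)
t = np.array([1.0, 0.0, 0.0])       # purely horizontal baseline
E = skew(t) @ R                     # essential matrix

p1 = np.array([0.2, 0.1, 1.0])      # normalized point seen in camera 1
l2 = E @ p1                         # epipolar line (a, b, c): a*u + b*v + c = 0

# The true match in camera 2 (for the world point at depth 2) lies on the line:
p2 = np.array([0.7, 0.1, 1.0])
print(abs(p2 @ l2))                 # 0.0: p2 satisfies the epipolar constraint
```

With this horizontal baseline the line comes out as v = 0.1, i.e. horizontal, matching the remark that the y-based depth equation then degenerates.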
This section has highlighted the parallax principle for 3D reconstruction. Several solutions are possible: use two cameras, move one single camera, or replace a camera with a projector. The geometry described here above was given for one single point; this process has to be applied to every point/pixel of interest, which could lead to long computation times.
2.2.3 Theoretical Calculation of the Error
Theoretical calculation of the error allows one to evaluate whether the device under- or out-performs compared to theory. It also enables making sure the main sources of error are properly mastered. In this case, the localisation of a point is limited by the pixel resolution: a high-definition image provides many more details than a low-definition image. As detailed in the previous section, a 3D space problem can be brought back to a 2D problem. Any point p in its epipolar plane is localised by (x, z), as illustrated in Fig. 2.6. The point p is imaged by two devices (typically cameras) with angles α and β. The baseline is the distance d. In the case of structured light, camera 2 is replaced with a projector. The abscissa position x is expressed from both devices as:
x = z tan(α)
x = d − z tan(β)        (2.7)
By eliminating x and proceeding with the differential, it easily comes that:

|δz / z| = |(δ tan(α) + δ tan(β)) · z / d|        (2.8)
where (see Fig. 2.6):
Fig. 2.6 Illustration of the parameters used for the positioning error of the point P at (x, z). α and β are the angles between the camera central line and the point of interest, and d is the baseline. The focal length f_x and the pixel disparity δp_x are used for the conversion into pixels.
• δz/z is the relative error at the working depth z;
• d is the baseline;
• α and β are, respectively, the angles between the observed point and the straight axis of camera 1 and of camera 2;
• δ tan(α) = δx_p / f_p, where x_p is the pixel position of the object and f_p is the focal length expressed in pixels.
In the particular case of structured light, camera 2 is replaced with a projector. If tan(β) is well known, equation 2.8 may be rewritten as in Eq. 2.9.

|δz / z| = |z δ tan(α) / d|        (2.9)

Or, with d/z = tan(α) + tan(β), it comes that:

|δz / z| = |δ tan(α) / (tan(α) + tan(β))|        (2.10)
Lateral Resolution
The lateral resolution δx may also be calculated using this model. Starting from equation 2.7, δx is easily calculated:

δx = δz tan(α) + z δ tan(α)        (2.11)

This equation gives δx, given the error on z and the error on the tangent of the angle.
Evaluation of the Error in Pixels

As detailed, the error is a function of the baseline and of the working depth. In this report, to get rid of any geometric parameter, errors are presented in pixel units.

|f_x d δz / z²| = |δp_x|        (2.12)
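As a numerical illustration of Eq. (2.12), with made-up parameters (not the prototype's actual values), inverting it gives the depth error caused by a one-pixel localisation error:

```python
f_x = 500.0   # focal length in pixels (illustrative)
d = 5.0       # baseline in mm (illustrative)
z = 50.0      # working depth in mm
dp_x = 1.0    # localisation error of one pixel

dz = dp_x * z**2 / (f_x * d)   # Eq. (2.12) solved for the depth error
print(dz)                      # 1.0 mm of depth error at 50 mm depth
```

Note the quadratic growth of the depth error with the working depth z: doubling z quadruples δz for the same pixel error.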
In some cases, especially in the validation phase, errors are given in millimetres to match the users' needs.
Lateral Resolution
Using a similar reasoning, equation 2.11 becomes:

δx = δz · p_x / f_x + z · δp_x / f_x        (2.13)

where p_x is the position of the point of interest in pixels and δp_x is the lateral resolution. The more centred the point of interest, the better the accuracy.
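Continuing with the same made-up numbers, Eq. (2.13) gives the lateral error; the first term vanishes for a point at the image centre (p_x = 0), which is why centred points are more accurate:

```python
f_x = 500.0   # focal length in pixels (illustrative)
z = 50.0      # working depth in mm
dp_x = 1.0    # one-pixel lateral localisation error
dz = 1.0      # depth error in mm (from the previous estimate)
p_x = 200.0   # pixel position of the point of interest

dx = dz * p_x / f_x + z * dp_x / f_x   # Eq. (2.13)
print(dx)                              # 0.5 mm
```

Here 0.4 mm comes from the depth error projected through the viewing angle and 0.1 mm from the pixel quantisation itself.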
2.3 What is an Endoscope?
An endoscope is a tool to view (-scope) inside a closed volume (endo-). In medical applications, two types of endoscopes exist: flexible and rigid endoscopes. The latter are used in open surgery (laparoscopy), whereas the former are used in gastroenterology. Flexible endoscopes give access to the gastrointestinal cavities through natural orifices. A flexible endoscope is typically made of a camera, an operator channel and a lighting (see Fig. 2.7). The typical diameter is about 1 cm, but it can be smaller or larger depending on the application. Nowadays, endoscopes embed high-definition cameras.
Fig. 2.7 A typical endoscope is a thin, roughly 1 m-long tube of approximately 1 cm in diameter. It is made of a camera, a lighting (conducted with an optical fibre) and an operator channel.
2.4 What is a Diffractive Optical Element?
A Diffractive Optical Element (DOE) is a small optical part usually made of plastic or glass. It diffracts a laser beam to create a pattern, see Fig. 2.8. It is popularly used to project patterns in night clubs or with laser pointers. The pattern can be made more complex by respecting some constraints; see chapter 6 for more details.
Fig. 2.8 A DOE diffracts a focused laser beam to create a pattern (parallel lines in this case). It is popularly used with laser pointers.
Chapter 3
State of the Art
“Since one cannot be universal and know everything that can be known about everything, one must know a little about everything. For it is far finer to know something about everything than to know everything about one thing; this universality is the finest.”
Pascal, Pensées

Abstract — Several technologies may be used to perform 3D reconstruction in endoscopy, mainly: stereoscopy, structured light, shape-from-motion, shape-from-shading and time-of-flight. All these technologies and their potential application to flexible endoscopy are discussed here. It comes out that structured light is the most robust and the most accurate; it is the best candidate for the application developed in this thesis.
Main Contribution — The state of the art for 3D reconstruction technologies applied to flexible endoscopy and, in particular, the proposed taxonomy: technologies are classified along two axes, space invasiveness and visual invasiveness.
3.1 Introduction
This state-of-the-art section is dedicated to defining the technology that should be used to fulfil the requirements presented in Table 1.2. A more specific state of the art is given in the following chapters according to the needs. In the following sections, each technology is developed under three paragraphs: the working principle, applications to endoscopy and a discussion; see Fig. 3.1. The last section is a global discussion of this state of the art. The design of 3D devices in endoscopy really started two decades ago, together with the development of minimally invasive surgery. Figure 3.2 illustrates the evolution of the number of publications (Fig. 3.2b) and patents (Fig. 3.2a) since 1996; there is an increasing interest in this field of research.

Fig. 3.1 Outline of the chapter: stereoscopes, structured light, shape-from-motion, shape-from-shading, time-of-flight and other technologies.
3.2 Making 3D in Endoscopy
The goal of 3D reconstruction endoscopes is to locate every observed point in 3D space. There are multiple ways to achieve this; parallax is the most common. There are two categories of devices: parallax-based, see section 2.2, and the others (gathered under monocular devices). Here are analysed all
(a) Number of patents published with the search key “3D Endoscope” on espacenet.com

(b) Number of publications with the search formula “3D AND (Endoscope OR Endoscopy)” on scopus.com

Fig. 3.2 Evolution of the number of publications and patents published since 1996
technologies potentially applicable starting with parallax-based techniques: stereoscopy and structured light.
3.2.1 Stereoscope
Principle

The word stereoscope was coined in 1838 by C. Wheatstone [24] from the words stereo- and -scope. In our context, it refers to the use of two images, left and right, to get 3D data. The first application is direct stereovision: it provides the user with two images, one for each eye. Just as humans do, the computer may use two images to reconstruct a 3D scene.
Stereoscopy is the direct application of the parallax principle detailed in section 2.2. The main challenge of stereoscopic methods is to match every pixel from the left image with the corresponding pixel from the right image, especially in the case of homogeneous surfaces, where every pixel is similar to its neighbours; this is known as the correspondence problem. It is the same issue as when one looks at a homogeneous surface: sometimes it is hard to focus properly.
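To make the correspondence problem concrete, here is a brute-force sketch (an illustration, not the thesis's method): for a pixel in the left image, scan candidate positions along the same row of the right image (the horizontal epipolar line) and keep the one with the smallest sum of squared differences (SSD).

```python
import numpy as np

def match_along_row(left, right, row, x_left, half, max_disp):
    """Return the disparity minimising the SSD between the patch around
    (row, x_left) in `left` and patches along the same row in `right`."""
    patch = left[row - half:row + half + 1,
                 x_left - half:x_left + half + 1].astype(float)
    best_ssd, best_disp = None, 0
    for disp in range(max_disp + 1):
        x = x_left - disp                  # candidate column in the right image
        if x - half < 0:
            break
        cand = right[row - half:row + half + 1,
                     x - half:x + half + 1].astype(float)
        ssd = np.sum((patch - cand) ** 2)
        if best_ssd is None or ssd < best_ssd:
            best_ssd, best_disp = ssd, disp
    return best_disp

# Synthetic check: the right image is the left image shifted 3 pixels left,
# so every interior point has a disparity of 3.
rng = np.random.default_rng(0)
left = rng.random((10, 40))
right = np.roll(left, -3, axis=1)
print(match_along_row(left, right, row=5, x_left=20, half=2, max_disp=8))  # 3
```

On a homogeneous surface many candidate patches score almost equally well, which is exactly why the correspondence problem is hard in endoscopic images.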
Application to Endoscopy

All the devices presented in this section are dedicated to laparoscopy. Even though the constraints are slightly different, most of these devices could, somehow, be used in gastroenterology.
The first way to implement stereoscopy is to place two cameras side by side in the endoscope; it directly provides 3D vision. The Belgian company BARCO has developed the vision system used in the da Vinci. This endoscope was designed for laparoscopy and is thus rigid, see Fig. 3.3a. A similar device has also been developed by Olympus [25], see Fig. 3.3b. Both of them were implemented in rigid endoscopes, where there is sufficient space available for two cameras. This works fine, as regular 3D vision raises no robustness issue. However, 3D reconstruction faces serious robustness issues, and the available space is reduced in flexible endoscopy. Authors have developed devices to solve these issues. Here are first presented the solutions proposed for the space issue, then those for the correspondence issue.
Solving the Space Issue
The major issue with a stereoscopic system is the space occupied by the second camera. There are several ways to reduce the occupied space. Hori et al. and Jurgen et al. have combined two cameras in a compact way [26, 27], and some (Clark et al., Street et al., Mc Kinley et al.) have generated the left and right images with one single camera [28, 29, 30]. As an illustration, Street et al. [29], see Fig. 3.4a, have patented a device based on the polarisation of the light to distinguish the left and right images. Both images are polarised perpendicularly using beam polarisers placed on the beam splitter (element 7 in the image). Another device, proposed by Clark et al., is based on a prism [28] that gives the light
(a) Illustration of the da Vinci endoscope (image from http://intuitivesurgical.com/)
(b) Stereoscope, patent filed by Olympus [25]. 211L and 211R are the left and right cameras.
Fig. 3.3 Stereoscopes with two side by side cameras
two directions before arriving on the CCD sensors, see Fig. 3.4b. Some devices based on two cameras were also developed. Koicho Hori et al. have developed an asymmetric disposition of the cameras in the endoscopic head to reduce the space occupied by the cameras [26], see Fig. 3.4c. The cameras are placed one behind the other; the image is sent to the rear camera using an image-guide optical fibre¹.
(a) Stereoscopic system based on the polarisation of the light [29]. Beam polarisers (8 and 9 in the image) placed on the beam splitter 7 make two images that are transmitted together, then split again at the proximal end of the endoscope.
(b) Endoscope using a prism to give the image two directions [28].
(c) Asymmetric endoscope: cameras 68 and 80 are placed one behind the other. The image is sent to the second camera using an optical fibre [26].
Fig. 3.4 Examples of devices focusing on reducing the space occupied
1