Index Terms: Nonrigid Image Registration, Biomechanical Model, Lung, Video-Assisted Thoracoscopic Surgery
Surgical resection of lung nodules via Video-Assisted Thoracoscopic Surgery (VATS) is one of the treatments available for early-stage lung cancer. In comparison to open thoracotomy, this minimally invasive procedure reduces the length of hospitalization and minimizes post-operative complications. However, at the beginning of the procedure, the insertion of surgical ports and the artificial ventilation applied only to the contralateral lung allow air to flow into the intrapleural space. This abnormal air inflow, known as a pneumothorax, induces a collapse of the lung towards the hilum and, therefore, large anatomical deformations. As a result, the intraoperative localization of small, deep or low-density nodules becomes considerably difficult.
Trial Research Team et al., 2011). Surgical resection is considered one of the best curative treatments for patients with early-stage lung cancer. Historically, lung lobectomy (i.e. the removal of entire lung lobes) through open thoracotomy was the chosen protocol. Within the last decades, clinical practice has evolved towards less invasive, better tissue-preserving techniques. For instance, minimally invasive video-assisted thoracoscopic surgery (VATS) has proven to yield equivalent clinical outcomes while improving patient care and decreasing both the length of hospitalization and post-operative complications (Falcoz et al., 2016). In parallel, interest in smaller, non-anatomical resections (wedge resections) has arisen for small nodules as a substitute for lung lobectomy. Although no consensus has been reached yet, studies suggest that the use of appropriate negative margins during wedge resections could provide patient outcomes equivalent to those of traditional lobectomies (Mohiuddin et al., 2014; Wolf et al., 2017). However, this shift from lung lobectomy to wedge resection through minimally invasive VATS has introduced new surgical challenges. For instance, the thoracic incisions made to insert surgical instruments break the pressure equilibrium in the intrapleural space and cause air to flow into the thoracic cage. This abnormal air inflow, known as a pneumothorax, induces very large tissue deformation by collapsing the lung. While this voluntarily induced pneumothorax is required to create surgical workspace, it significantly impairs the intraoperative localization of lung nodules, especially small nodules that are generally neither visible to the naked eye nor palpable through thoracoscopic instruments (Chao et al., 2018). Failing to localize lung nodules during VATS may ultimately result in unplanned surgical conversion to open thoracotomy, with a conversion rate as high as 54% reported in some studies (Suzuki et al., 1999).
Therefore, several nodule localization strategies are commonly used in clinical practice. The main approach consists in placing fiducial markers in the nodule to facilitate its intraoperative localization. This nodule marking generally requires an additional preoperative procedure, before surgery, for the placement of hookwires, microcoils, or dyes under fluoroscopy guidance (Keating and Singhal, 2016). Despite the high success rates re-
Keywords: Video-Assisted Thoracoscopic Surgery (VATS), Image Registration, Lung, Cone-Beam CT
Lung cancer remains the leading cause of cancer death worldwide for both women and men. 1,2 This high mortality is related to the late detection of the disease, at which point curative treatments are normally not available and the 5-year survival rate lies between 6% and 18%. 2,3 However, screening programs performed on patients at risk have demonstrated that survival rates might be significantly increased if diagnosis and treatment are performed at early stages. 4,5 In such scenarios, surgical resection of malignant nodules is prescribed to patients.
Cancer?”, Chest 143.5 (2013), e93S–e120S.
M. Zaman et al., “In patients undergoing video-assisted thoracoscopic surgery excision, what is the best way to locate a subcentimetre solitary pulmonary nodule in order to achieve successful excision?”, Interactive CardioVascular and Thoracic Surgery 15.2 (Aug. 2012), pp. 266–272.
A. Uneri et al., “Deformable registration of the inflated and deflated lung in cone-beam
B. Parallel Image Kernels
The main feature of Video++ is the easy definition of parallel image processing kernels. The framework makes it possible to define kernels that run up to 32 times faster (see sec. III) than the naive, non-optimized version, while being shorter to write. The pixel_wise construct spreads the execution of a kernel over all the available CPU cores. Since it splits the execution into several threads running on several cores, there must be no dependency between the computations of any two pixels.
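The independence requirement can be illustrated with a minimal Python sketch. This is not the Video++ API: the function name `pixel_wise` is only borrowed for readability, and the row-band splitting strategy is an assumption made for illustration. Each task writes to its own rows of the output, which is only safe because the kernel reads nothing but its own pixel. Note that CPython threads will not reproduce the reported speed-up; the sketch only shows how the work is partitioned.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def pixel_wise(kernel, src, workers=None):
    """Apply kernel(value) to every pixel of a 2D list, one band of rows per task.

    Hypothetical helper for illustration; it is not the Video++ construct.
    """
    height = len(src)
    workers = workers or os.cpu_count() or 1
    band = max(1, -(-height // workers))  # ceiling division: rows per task
    dst = [None] * height

    def run_band(start):
        # Each task owns a disjoint range of rows, so no locking is needed
        # as long as the kernel depends on a single pixel only.
        for row in range(start, min(start + band, height)):
            dst[row] = [kernel(value) for value in src[row]]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(run_band, range(0, height, band)))
    return dst


image = [[x + 10 * y for x in range(4)] for y in range(3)]
doubled = pixel_wise(lambda value: 2 * value, image)
print(doubled)
```

A kernel that needed the already-computed result of a neighbouring pixel (for example a running sum along a row) would break this scheme, because the order in which bands are processed is not defined.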
These different types of uses can be applied in different application domains. Video archives (e.g. TV, surveillance) can offer enhanced access to their collections through video annotations, making it possible to find specific video fragments. The Yovisto platform 2 (Waitelonis and Sack, 2012), for example, offers access to video through semantic annotations, making it possible to look for specific locations, people, or events. Sports analysis also relies greatly on video material, which can be used in a reflective way by letting athletes view their own performance, or to analyse the behaviour of adversaries on recordings of previous competitions. Many applications such as EliteSportsAnalysis or MotionView Video Analysis Software offer tools to annotate and analyse sport performances. Research on activity in domains such as ergonomics, animal behaviour, linguistics, etc. also uses annotation software, since researchers need to perform a precise analysis of video recordings. There exist a number of research tools such as Advene, Anvil or Transana, as well as commercial offers like Noldus. They all offer annotation capabilities accompanied by various
Montreal Rd, M-50, Ottawa, Canada K1A 0R6 http://iit-iti.nrc-cnrc.gc.ca
This paper presents a number of new views and techniques claimed to be very important for the problem of face recognition in video (FRiV). First, a clear differentiation is made between photographic facial data and video-acquired facial data as being two different modalities: one providing hard biometrics, the other providing softer biometrics. Second, faces which have a resolution of at least 12 pixels between the eyes are shown to be recognizable by computers just as they are by humans. As a way to deal with the low resolution and quality of each individual video frame, the paper proposes to use the neuro-associative principle employed by the human brain, according to which both memorization and recognition of data are done based on a flow of frames rather than on one frame: synaptic plasticity provides a way to memorize from a sequence, while collective decision making over time is very suitable for recognition of a sequence. As a benchmark for FRiV approaches, the paper introduces the IIT-NRC video-based database of faces, which consists of pairs of low-resolution video clips of unconstrained facial motions. The recognition rate of over 95%, which we achieve on this database, as well as the results obtained on real-time annotation of people on TV allow us to believe that the proposed framework brings us closer to the ultimate benchmark for the FRiV approaches, which is “if you are able to recognize a person, so should the computer”.
that this network was able to address flashing caused by saturated reds after only a few training steps
∙ STA: This network is largely based on the spatial temporal encoder in [12] used for optical flow generation. The first 4 layers are 32-filter convolutional layers with 3x3 kernels and ReLU activation. Next there is one layer of pooling to complete the encoder section. Following this, we insert two 32-filter convolutional LSTM layers, followed by two regular convolutions with large 15x15 kernels. After this we have a 2-filter 1x1 convolution and a 2-filter 3x3 convolution. Patraucean et al. [12] explain that they use large 15x15 kernels to "regress from the memory feature space to the space of flow vectors". We can relate our problem to that of optical flow by casting it as next-frame prediction following the beginning of a flash. Therefore, the extracted optical flow features can similarly be effective for video sanitization. Unlike [12], we did not apply a Huber loss penalty. The encoder portion of this network is the inverse of the decoder. This network was relatively wide and deep. Our primary motivation for this approach was to see whether a network of stacked layers was capable of video reconstruction in a similar fashion to optical flow prediction. A subtle difference between this network and smLSTM is that we do not use bidirectional LSTMs, so next-frame prediction is based on previous frames only.
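To give a feel for where the capacity of such a stack sits, the following sketch counts trainable parameters layer by layer using the standard formulas for convolutions and convolutional LSTMs. Several values are assumptions that the description above does not state: RGB input (3 channels), 3x3 kernels inside the convolutional LSTMs, and 32 filters in the two 15x15 layers. The layer names are hypothetical labels, and the result is an illustration, not a reproduction of the actual network.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return kernel * kernel * c_in * c_out + c_out


def conv_lstm_params(kernel, c_in, c_out):
    """Four gates, each convolving the concatenated input and hidden state."""
    return 4 * (kernel * kernel * (c_in + c_out) * c_out + c_out)


IN_CHANNELS = 3    # assumption: RGB frames
LSTM_KERNEL = 3    # assumption: kernel size of the ConvLSTM layers
WIDE_FILTERS = 32  # assumption: filter count of the 15x15 layers

layers = [
    ("conv3x3_1", conv_params(3, IN_CHANNELS, 32)),
    ("conv3x3_2", conv_params(3, 32, 32)),
    ("conv3x3_3", conv_params(3, 32, 32)),
    ("conv3x3_4", conv_params(3, 32, 32)),
    ("pool", 0),  # pooling has no trainable parameters
    ("convlstm_1", conv_lstm_params(LSTM_KERNEL, 32, 32)),
    ("convlstm_2", conv_lstm_params(LSTM_KERNEL, 32, 32)),
    ("conv15x15_1", conv_params(15, 32, WIDE_FILTERS)),
    ("conv15x15_2", conv_params(15, WIDE_FILTERS, WIDE_FILTERS)),
    ("conv1x1", conv_params(1, WIDE_FILTERS, 2)),
    ("conv3x3_out", conv_params(3, 2, 2)),
]

total = sum(count for _, count in layers)
wide = sum(count for name, count in layers if name.startswith("conv15x15"))
wide_share = wide / total

for name, count in layers:
    print(f"{name:12s} {count:8d}")
print(f"total        {total:8d}")
print(f"share of 15x15 layers: {wide_share:.1%}")
```

Under these assumptions the two 15x15 convolutions hold well over half of all parameters, which is consistent with the network being described as relatively wide.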
None of these techniques factored in the video bit rate characteristic for the calculation of the TCP-friendly transmission rate. The video bit rate tends to vary according to the complexity of the frame data; for example, an I-frame is more complex than a P-frame, as it results in more bits after compression. The same applies to scene changes and high-motion scenes in a video sequence, as they tend to incur a higher prediction error, which results in a lower compression efficiency. Thus a typical video bit rate will have occasional ‘pulses’; a smoothed transmission rate reduces these ‘pulses’ and ends up affecting the video quality.
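The effect of smoothing on such pulses can be sketched with a toy frame-size trace. All numbers here are hypothetical: a group of 12 frames with one 60 kbit I-frame followed by eleven 10 kbit P-frames, and a simple sliding-window average standing in for a smoothed transmission rate. It is not a model of any particular TCP-friendly rate control scheme.

```python
def moving_average(values, window):
    """Average of the last `window` values (fewer at the start of the trace)."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


GOP = 12           # hypothetical: one I-frame every 12 frames
I_BITS = 60_000    # hypothetical I-frame size in bits
P_BITS = 10_000    # hypothetical P-frame size in bits

frame_bits = [I_BITS if i % GOP == 0 else P_BITS for i in range(4 * GOP)]
smoothed = moving_average(frame_bits, window=GOP)

steady = smoothed[GOP - 1:]  # ignore the warm-up before the window is full
print("peak of raw trace      :", max(frame_bits))
print("peak of smoothed trace :", round(max(steady)))
```

Once the window is full, the smoothed rate stays at the group average while each I-frame still needs several times that amount. A sender limited to the smoothed rate must therefore either delay the I-frame or compress it harder, which is how smoothing ends up affecting the perceived quality.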
1.1.2 Image-guidance systems in transoral robotic surgery
A number of clinical studies in the field of image-guided transoral robotic surgery for oropharyngeal cancer have been proposed. For instance, Desai et al. demonstrated a CT image-guidance system for da Vinci ® robotic surgery, where the CT image and a pointer are registered to the patient and the trajectory of the tip of the pointer is displayed on the CT image on an external screen [Desai et al., 2008]. The main concern with this preoperative CT image-based approach is that the spatial registration between the CT image and the patient is based on bony landmarks. However, the manipulation of the patient (the tongue is retracted, as shown in Figure 1.2) may change the relationship between the tumor and the bony landmarks, thereby leading to inaccurate intraoperative guidance. A recent study by Clayburgh et al. used an intraoperative transoral US technique, prior to starting the tumor resection, to show the tumor edge and important vessels to the surgeon [Clayburgh et al., 2016]. However, the surgeon needs to extract tumor information from a group of 2D US images and mentally transform this information to the operative field. The quality of this mental work depends heavily on the surgeon’s training, experience and thorough knowledge of human anatomy on US images.
Jun Shen · Nabil Zemiti · Christophe Taoum · Guillaume Aiche · Jean-Louis Dillenseger · Philippe Rouanet · Philippe Poignet
Abstract Purpose Surgical treatments for low-rectal cancer require careful consideration due to the low location of the cancer in the rectum. Successful surgical outcomes highly depend on surgeons' ability to determine clear distal margins of rectal tumors. This is a challenge for surgeons in robot-assisted laparoscopic surgery, since tumors are often concealed in the rectum and robotic surgical instruments do not provide tactile feedback for tissue diagnosis in real time. This paper presents the development and evaluation of an intraoperative ultrasound-based augmented reality framework for surgical guidance in robot-assisted rectal surgery.
Hernia surgery, edited by Volker Schumpelick, Georg Arlt, Joachim Conze, Karsten Junge and Georg Thieme Verlag KG, Germany, Thieme Publishers Stuttgart, 2019, 329 pp
Abdominal wall and hernia surgery constitute the most frequent surgical procedures performed by abdominal and general surgeons. Despite that fact, complications are not rare and sometimes impair the quality of life of patients for decades. Recurrence can also be the consequence of inadequate surgical technique. It is, therefore, of great importance to improve the results of our abdominal wall surgery for the benefit of the Belgian population.
the bit budget constraint, the controller first performs bit allocation per frame and/or per block. Then, using appropriate rate-distortion models, optimal coding parameters such as quantization parameters are computed. Due to more complicated coding structures and the adoption of new coding tools, the statistical characteristics of transformed residues are significantly different. Thus, rate control techniques have evolved greatly with the development of video coding techniques. Different rate control methods have been implemented and tested over video encoders, some of them based on simple rate expressions such as in TM5 for MPEG-2, VM8 for MPEG-4 and TMN8 for H.263, others on more complex mathematical representations such as in H.264/AVC and HEVC. The accuracy of these models has been enhanced by introducing the so-called complexity of the source and by considering advanced video coding features.
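The two-step structure described here (bit allocation, then parameter selection from a rate model) can be sketched as follows. This is a deliberately simplified first-order model in which the bits of a frame are taken as complexity divided by quantizer step, loosely in the spirit of the simple rate expressions mentioned above; it is not the actual TM5, VM8 or TMN8 algorithm. The function names, the complexity values and the budget are hypothetical.

```python
def allocate_bits(budget, complexities):
    """Step 1: split the bit budget across frames in proportion to complexity."""
    total = sum(complexities)
    return [budget * c / total for c in complexities]


def quantizer_for(target_bits, complexity, q_min=1, q_max=31):
    """Step 2: invert the assumed model bits = complexity / q, then clamp.

    The 1..31 range is an assumption matching a 5-bit quantizer scale.
    """
    q = complexity / target_bits
    return min(q_max, max(q_min, round(q)))


# Hypothetical group of four frames: one complex frame, three simpler ones.
complexities = [1_600_000, 800_000, 800_000, 800_000]
budget = 400_000  # bits available for the whole group

targets = allocate_bits(budget, complexities)
quantizers = [quantizer_for(t, c) for t, c in zip(targets, complexities)]

for c, t, q in zip(complexities, targets, quantizers):
    print(f"complexity={c:9d}  target_bits={t:9.0f}  q={q}")
```

With proportional allocation every frame ends up with the same quantizer, which corresponds to roughly constant quality across the group; halving the budget doubles the quantizer under this model. More elaborate schemes replace the first-order expression with richer rate-distortion models.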
The application of the above-mentioned marketing strategies is contributing to the demarcation of the two branches of the industry mentioned above. These branches apply different economic models and target different audiences: a mainly urban, local and diasporic middle class for the films released in the cinema circuit, and a more popular, both urban and rural, local and economically non-uniform class of people for the straight-to-video films, especially those in local languages. This may be a simplification of the situation, but it helps to analyse the directions that the industry is taking. The two markets can easily overlap, as happened for instance with the film Jenifa, a low-budget Yoruba film which became the biggest popular Nollywood success in 2008, cutting across all social and cultural divisions. However, this interpretative scheme offers a tool to look at the transformations that the industry is currently undergoing. These transformations are also a reaction to the way in which Nollywood has been represented in relation to other cinematic traditions, and they imply the need to reconfigure the view of Nollywood cinema within the landscape of world cinema production.
tions have been well defined after bariatric surgery, the risk of cancer in this population remains ill-defined.
Methods: We report 2 cases of gastric cancer, a B-cell lymphoma of the distal stomach after gastric bypass and a GIST after vertical banded gastroplasty, which illustrate the delay in diagnosis that results from the procedure and from the negligence of upper gastrointestinal symptoms often present in this population.
Figure 3 introduces the concept of ROI, shown as squares. A ROI defines a group of pixels of interest. More information can be found in subsection 5.4.2.
On this video stream, up to 32 ROIs can be placed. A ROI is defined by a square 3 to 8 pixels wide (see figure 4), a flag that indicates its mode (inverted or not, explained below) and an X/Y position within the VGA frame. A ROI in its normal operation works in NORMAL MODE . It meant that the detection was
The goal of this thesis is to develop novel algorithms for sports video processing and analysis and to integrate them into one framework. In chapter 2, we presented the state of the art in sports video processing and analysis, as well as the structure of soccer video; we also reviewed many of the video summarization, browsing, and retrieval methods existing in the literature. In chapter 3, we introduced our framework for soccer video processing and analysis. In chapter 4, we presented an enhanced method for dominant color extraction using two color spaces. A new method for shot transition detection based on the discrete cosine transform (DCT) and a novel method for shot classification based on spatial segmentation were presented in chapter 5. These algorithms exploit the spatial reorganization of pixels and domain knowledge. In chapter 6, we investigated score box detection and text recognition in soccer video; this method is based on motion vector computation. In the last chapter, we introduced a method for generating summaries and/or highlights based on domain knowledge and a finite state machine (FSM); an audio marker is also used in this framework.
In this way, all kinds of détournements do not end in the creation of a finished work. However, it can be postulated that all the playful practices that lead to the production of a work are (to varying degrees) détournements – or, at least, deserve to be considered as such.
As an illustration, one might think, at first glance, that the let’s play 14 does not constitute a form of détournement. These videos, apparently, do not seem to be reconfigurations or transformations of the game’s elements: they appear as records and testimonies (more or less faithful) of a subjective playing experience. Yet some let’s players show a real creativity through their way of playing and through the comments they superimpose on their performances – these may sometimes transfigure the playing activity. I therefore start from the principle that, when a player records himself playing and broadcasts the video on the internet, the recording device and the presence of an audience necessarily introduce a shift, a discrepancy (which may be minimal) in the activity, by moving it from the register of play to the register of performance.