1.2 Research Aims
All of the above-mentioned challenges and issues motivated us to focus the PhD research on two aspects. On the one hand, to compress 3D images effectively and efficiently, it is important to account for the characteristics and properties of the human visual system (HVS), such as visual sensitivity. In particular, our research aims to investigate the spatial visibility threshold based on both monocular and binocular visual properties. This threshold is usually referred to as the just noticeable difference (JND), which determines the maximum distortion undetectable by human eyes. Moreover, since an accurate stereoscopic three-dimensional just noticeable difference (3D-JND) model can be applied to improve the performance of 3D image compression and quality assessment (QA), this research also aims to propose a reliable 3D-JND model based on a study and comparison of state-of-the-art 3D-JND models. On the other hand, to provide a satisfying viewing experience for 3D content, perceptual QA for stereoscopic images is crucial for evaluating and optimizing the performance of S3D processing algorithms and systems. Therefore, the purpose of this research is to propose accurate and efficient stereoscopic image quality assessment (SIQA) methodologies based on an investigation of binocular perception. Specifically, the most important step is to identify the monocular and binocular factors affecting the perceptual quality of 3D images. In addition, we need to explore and model the binocular vision properties linked to human 3D quality judgment. Finally, SIQA models will be proposed that combine the quality-related factors and account for the binocular vision properties.
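For background (this is the classic 2D formulation of Chou and Li, given here only as an illustration, not the 3D-JND model proposed in this thesis), a spatial JND threshold is commonly expressed as the dominant of a luminance-adaptation term and a texture-masking term:

```latex
% Chou & Li's spatial JND profile (background illustration):
\mathrm{JND}(x,y) \;=\; \max\bigl\{\, f_1\!\bigl(\mathrm{bg}(x,y),\, \mathrm{mg}(x,y)\bigr),\; f_2\!\bigl(\mathrm{bg}(x,y)\bigr) \,\bigr\}
```

where bg(x, y) is the average background luminance around pixel (x, y), mg(x, y) is the maximum luminance gradient across it, f1 models texture (spatial) masking and f2 models luminance adaptation. A 3D-JND model then typically modulates such a 2D threshold with binocular factors, e.g., disparity and interocular masking.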
3. SUBJECTIVE STEREO IMAGE QUALITY ASSESSMENT
In general, the design of objective quality assessment metrics needs to be validated by subjective quality assessment, which in turn requires the definition of specific test setups for subjective experiments. Methods have been proposed for 2D quality assessment, such as the double stimulus continuous quality scale (DSCQS) [36] and SAMVIQ [37]. We chose to follow the SAMVIQ protocol, whose stability allows the experiments to be conducted more reliably. More precisely, the test was performed in a controlled environment as recommended in ITU-R BT.500-11 [36], using displays with active liquid crystal shutter glasses. SAMVIQ is a methodology for subjective testing of multimedia applications using computer displays, and its application can be extended to the full-format television environment as well. The method defined by the SAMVIQ specification combines quality evaluation capability with the ability to discriminate similar levels of quality through an implicit comparison process. The approach is based on random access to the sequence files: observers can start and stop the evaluation process as they wish, rate at their own pace, modify grades, and repeat playback when needed. SAMVIQ can therefore be defined as a multi-stimulus continuous quality scale method using explicit and hidden references. It provides an absolute measure of the subjective quality of distorted sequences, which can be compared directly with the reference. Because assessors can directly compare the impaired sequences among themselves and against the reference, they can grade them accordingly, which permits a high degree of resolution in the grades given to the systems. Moreover, there is no continuous sequential presentation of items as in the DSCQS method, which reduces possible errors due to lapses of concentration and thus offers higher reliability.
Nevertheless, since each sequence can be played and assessed as many times as the observer wants, the SAMVIQ protocol is time-consuming, and only a limited number of tests can be carried out.
4.1. Stimuli and evaluated conditions
The source sequences (SRC) from NAMA3DS1 were impaired by various spatial or coding degradations. Eleven hypothetical reference conditions (HRCs), including the unprocessed reference as HRC0, were considered (see Table 2). Coding impairments were introduced using the H.264/AVC video coder (JM reference software, v18.2) and the JPEG 2000 still-image coder (Kakadu software, v7.0). In both cases, the left and right images were encoded separately with the same parameters (symmetric processing). Losses in resolution were also considered: two HRCs feature sequences downsampled by a factor of 4 and later upsampled for display. Finally, a typical post-processing step in TV applications was included: image sharpening through edge enhancement.
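The resampling filters are not specified in the text; as a minimal sketch of the reduced-resolution HRC, assuming a simple box downsample and nearest-neighbour upsample applied identically to both views:

```python
import numpy as np

def downsample_box(img, factor=4):
    """Average-pool the image over factor x factor blocks (box filter)."""
    h = img.shape[0] - img.shape[0] % factor
    w = img.shape[1] - img.shape[1] % factor
    img = img[:h, :w]
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample_nearest(img, factor=4):
    """Replicate each pixel factor x factor times for display."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

# Symmetric processing: the same degradation is applied to both views.
left = np.random.rand(64, 64)
right = np.random.rand(64, 64)
left_hrc = upsample_nearest(downsample_box(left))
right_hrc = upsample_nearest(downsample_box(right))
assert left_hrc.shape == left.shape and right_hrc.shape == right.shape
```

The actual processing chain (filter kernels, chroma handling) may well differ; this only illustrates the factor-4 resolution loss.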
3. EXPERIMENTAL PROTOCOL
In this experiment, synthesized still-image quality is evaluated in stereoscopic conditions, with several objectives. First, we aim to determine whether the ACR-HR methodology is appropriate for the assessment of different DIBR algorithms. Second, the number of participants required for a reliable subjective assessment test is questioned. Third, we investigate whether the results of the subjective assessments are consistent with the objective evaluations. Fourth, we compare the obtained results with those from the monoscopic conditions. The material comes from the same set of synthesized views as described in Section 2. The stereopairs consist of two stereo-compliant views: one view is the original acquired frame and the other is a synthesized frame. All the synthesized frames used in this experiment are exactly the same as those used in the previous study (in monoscopic viewing conditions, with still images). The ACR-HR methodology was used with 25 naive observers. The stimuli were displayed on an Acer GD245HQ screen with an NVIDIA 3D Vision controller.
The perceived video quality is of the highest importance for the adoption of a new technology from a user's point of view and, consequently, from an industry perspective. Subjective assessment is commonly used to measure users' quality of experience. Subjective tests are performed to obtain accurate and reliable quality evaluations; however, they are time-consuming and expensive. Objective quality metrics, which measure or predict video or image quality automatically from physical image characteristics, are therefore highly desirable. For 2D video, a great deal of effort has been devoted to developing objective models. There are some studies on the evaluation of stereoscopic 3D (S3D) images with existing 2D quality metrics; however, objective quality metrics for S3D videos are still not widely studied.
2.3.6 Visualization artifacts
Unfortunately, none of the technologies listed above is free of flaws, which cause visualization artifacts in addition to other drawbacks inherent to 3D acquisition. All of the view misalignment problems discussed in Section 2.2.2 can potentially occur when two mechanical projectors are used to present an image to each eye. Low image luminance and contrast are often intrinsic to stereoscopic visualization due to light losses in filter-based systems and systems with glasses. In LCD panels, contrast reduction results from backlight leakage through pixels that are turned off. This in turn influences stereoacuity and thus the ability to distinguish fine stereoscopic details [Legge and Yuanchao, 1989]. The most common problem of all stereoscopic displays is probably crosstalk, or ghosting, which results from imperfect separation of the left and right views in the stereoscopic system: part of the signal intended for the left eye leaks into the right eye, or vice versa. As a consequence, a ghost image or double contours can be perceived by viewers (see Fig. 2.13). More crosstalk is introduced as the contrast of the content and the disparity values increase [Boev et al., 2008]. In addition, small levels of crosstalk may reduce the amount of perceived depth, while high levels reduce the viewer's comfort [Watt and MacKenzie, 2013].
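A first-order way to picture this (a standard linear leakage model, not taken from the cited works) is to mix a fraction c of each view into the other; the ghost amplitude then scales with the content contrast, and its position offset with the disparity:

```python
import numpy as np

def apply_crosstalk(left, right, c=0.05):
    """Linear crosstalk: a fraction c of the unintended view leaks into
    each eye's image (symmetric leakage assumed for simplicity)."""
    ghost_left = (1.0 - c) * left + c * right
    ghost_right = (1.0 - c) * right + c * left
    return ghost_left, ghost_right

# A bright vertical bar shifted by 1 pixel between the views (disparity):
left = np.zeros((8, 8)); left[:, 3] = 1.0
right = np.zeros((8, 8)); right[:, 4] = 1.0
gl, _ = apply_crosstalk(left, right, c=0.05)
# The leaked double contour appears at the disparate column, with an
# amplitude proportional to the content contrast (here 0.05 * 1.0).
```

Real display crosstalk is generally nonlinear and view-dependent; this sketch only shows why higher contrast and larger disparity make ghosting more visible.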
by one eye is highly disturbed. The visual system indeed tends to compensate for the lower quality perceived by one eye with the quality perceived by the other. Nevertheless, eye fatigue phenomena can be observed in such cases. As a consequence, image quality assessors must take such factors into account in order to truly evaluate the user experience. This brief overview shows the wide research area for quality metric design that still awaits investigation. Nevertheless, as a first step, not all parameters can be taken into account; it is necessary first to choose a technology and focus on a limited set of factors. Such an evaluation task started recently, for example in , where the impact of compressing the depth information for 2D-plus-depth visual scene coding on autostereoscopic displays was studied, and the observers' preferences with respect to the compression method and related bit rates were investigated. Also, in  the beginning of a metric design based on subjective measures for 3D image quality assessment is presented. The aim of that work is to analyse the relevance of 2D quality metrics applied to stereo content: different metrics were evaluated for stereoscopic vision with glasses. Nevertheless, introducing 2D quality metrics in a 3D context is a real challenge when attempting to integrate depth information. We propose an attempt at such factor integration by applying a quality metric to the disparity maps between views. Several questions then arise, from the choice of the quality measure operator to the fusion of its result into the original metric. This first attempt is still limited to stereo vision with glasses; an extension to 2D-plus-depth for autostereoscopic displays is expected in future research.
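A minimal sketch of the proposed factor integration, with a hypothetical fusion rule and weighting (the quality-measure operator and the fusion are precisely the open questions raised above): compute a quality score on the disparity maps and combine it with the per-view 2D scores:

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """PSNR in dB between a reference and a distorted map/image."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(dist, float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def stereo_score(q2d_left, q2d_right, q_disp, alpha=0.7):
    """Hypothetical fusion: weight the mean 2D view quality against the
    disparity-map quality (alpha would be fitted on subjective scores)."""
    return alpha * 0.5 * (q2d_left + q2d_right) + (1.0 - alpha) * q_disp
```

Here PSNR merely stands in for the quality-measure operator, and the linear fusion with weight alpha is one of many possible choices.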
Fig. 5. Performance dependency of the proposed metric on the changing ratio (γ%)
IV. CONCLUSION
In this paper, we proposed a novel full-reference shift-compensation-based image quality assessment metric for DIBR-synthesized views (SC-IQA). A SURF + RANSAC homography approach and multi-resolution block matching are used to compensate for the global shift and to penalize geometric distortions. The experimental results show that SC-IQA significantly outperforms the state-of-the-art metrics dedicated to 3D synthesized views (3DSwIM, MP-PSNR, MW-PSNR, CT-IQA, EM-IQA) as well as the conventional 2D IQA metrics PSNR and SSIM.
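As a simplified stand-in for the SURF + RANSAC homography stage (which handles general geometric transforms), a brute-force search for the best integer translation illustrates the shift-compensation idea:

```python
import numpy as np

def estimate_global_shift(ref, syn, max_shift=4):
    """Brute-force integer-translation search minimising the SSD between
    the reference and the synthesized view. Returns (dy, dx) such that
    syn[y, x] ~ ref[y + dy, x + dx] on the overlapping region."""
    h, w = ref.shape
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Compare only the region where the two shifted grids overlap.
            r = ref[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)]
            s = syn[max(0, -dy):h + min(0, -dy), max(0, -dx):w + min(0, -dx)]
            err = np.mean((r - s) ** 2)
            if err < best_err:
                best_err, best = err, (dy, dx)
    return best
```

SC-IQA itself estimates a full homography (SURF keypoints + RANSAC) and refines per-block shifts at multiple resolutions; this sketch only captures the global-translation special case.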
DIBR techniques can be useful for filling the created disocclusions, but they induce typical distortions. These distortions differ from those commonly encountered in 2D video compression, such as blur or blocking artifacts. In fact, due to the warping process, DIBR algorithms induce geometric distortions mainly located around object boundaries. Moreover, depending on the in-painting method used by the DIBR algorithm, the types of artifacts may also differ. For these reasons, commonly used 2D image quality metrics fail to produce quality scores correlated with human perception, and there is an increasing need to define new quality metrics. Post-processing of acquired video sequences is often necessary to provide a good 3D quality of experience [23]. As explained in [23], scene depth adaptation and 3D display adaptation, among others, may require the use of view
Keywords: stereo vision, geodesic distances, 3D watershed, image segmentation, sparse disparity measurements, dense estimation
The problem of computing a depth map from a pair of rectified stereo images is undoubtedly a classic one in computer vision. When a point of the scene projects onto the two image planes, it does so with the same ordinate but different abscissas. This difference of abscissas is commonly referred to as the disparity and is inversely proportional to the depth of the point being sought. Finding point correspondences between the left and right views of the stereo pair is relatively easy across non-uniformly textured areas. However, homogeneous regions are a source of matching ambiguities, while the occlusion phenomenon makes it impossible for some pixels to have a correspondence, requiring their disparity to be estimated according to a suitable model.
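This inverse relation is the standard triangulation formula for a rectified pair; a minimal sketch with illustrative parameter values:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulation for a rectified stereo pair: Z = f * B / d.
    focal_px: focal length in pixels; baseline_m: camera baseline in
    metres; disparity_px: abscissa difference between the two views."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Illustrative values: f = 1000 px, B = 10 cm, d = 50 px -> Z = 2 m.
depth = depth_from_disparity(1000.0, 0.1, 50.0)
```

Doubling the disparity halves the estimated depth, which is why matching errors on distant (small-disparity) points translate into large depth errors.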
dataset was sampled at a scale of an entire representative medium-sized European city (Figure S1). The data is split according to the two most common urban garden types: allotment and home gardens (Table S1). Furthermore, sample plots were assigned one of three garden habitat types: vegetable beds (i.e., annual vegetable plants), flower beds and berry cultivations (i.e., perennial flowers, roses, and berry shrubs), and lawn (i.e., meadows and turf). Descriptive statistics are given in Tables S4–S6. A graphical representation of percentage deviations from the overall mean value split by garden and habitat type is given in Figure 2. In summary, this dataset provides information about a city-wide soil quality assessment of urban gardens. This data can be used for comparing soil properties among different cities or land use types. Moreover, our study may help to analyze the effect of garden management or urbanization on soil quality (see Tresch et al., 2018) or provide data for modeling of carbon dynamics in urban soils or other soil-based ecosystem services.
b IRCCyN UMR 6597 CNRS, Ecole Polytechnique de l’Universite de Nantes
rue Christian Pauc, La Chantrerie 44306 Nantes, France
Most efficient objective image or video quality metrics are based on properties and models of the Human Visual System (HVS). This paper deals with two major drawbacks related to the HVS properties used in such metrics when applied in the DWT domain: subband decomposition and the masking effect. The multi-channel behavior of the HVS can be emulated by applying a perceptual subband decomposition. Ideally, this is performed in the Fourier domain, but the computational cost is too high for many applications. A spatial transform such as the DWT is a good alternative for reducing the computational effort, but the correspondence between the perceptual subbands and the usual wavelet subbands is not straightforward. Advantages and limitations of the DWT are discussed and compared with models based on a DFT. Visual masking is a sensitive issue, and several models exist in the literature. The simplest models can only predict the visibility threshold for very simple cues, whereas for natural images one should consider more complex approaches such as entropy masking. The main issue lies in finding a revealing measure of the surround influences and an adaptation: should we use the spatial activity, the entropy, the type of texture, etc.? In this paper, different visual masking models using the DWT are discussed and compared.
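As a minimal illustration of a wavelet subband decomposition (one-level Haar, the simplest DWT; a perceptual decomposition would use more orientations and a finer radial split than the four subbands shown here):

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar DWT: returns (LL, LH, HL, HH) subbands.
    Image sides must be even."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # rows: low-pass
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # rows: high-pass
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (perfect reconstruction)."""
    a = np.empty((ll.shape[0], 2 * ll.shape[1]))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    out = np.empty((2 * a.shape[0], a.shape[1]))
    out[0::2, :], out[1::2, :] = a + d, a - d
    return out
```

A masking model would then weight distortions per subband; the mismatch discussed above is that these dyadic LL/LH/HL/HH bands do not coincide with the HVS's perceptual channels.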
ing the epipolar geometry and the orientation of the represented surface, the homography estimation between two regions is defined. Then, an Image Quality Assessment (IQA) measure is used to evaluate the similarity (or dissimilarity) between the initial area z and the warped area ˜z, from which we can deduce the planarity of z. The pairs (SP1, ˜SP1) and (SP2, ˜SP2) yield higher IQA similarity than (SP3, ˜SP3); cf. Figure 2, which shows an example of this behaviour with a planar and a non-planar region z delimited by three 2D points denoted q1, q2 and q3.
We also chose to use multiple databases because no HDR database contained enough images to train the metric. To illustrate this, we can train HDR-VDP-2 CtCp on a single database (using the particle swarm optimizer) and observe the result on the other databases. We performed this training four times using four different databases: three HDR databases, Zerman et al., Korshunov et al. and Narwaria et al. (cf. Section 2.3), and one SDR database, TID2013. TID2013 is an extension of the TID2008 database with notably more color artifacts (image color quantization with dither, chromatic aberration, etc.). For this database, we did not use the distortion called "Change of color saturation", as this defect created outliers for HDR-VDP-2 (cf. Figure 6.2). The same behavior was observed for HDR-VDP-2 CtCp after the training of the weights. This recalls the problem with the gamut mismatch artifacts of the HDdtb, as both artifacts create similar distortions (cf. Section 3.3). Those distortions are clearly visible but do not affect the image quality that much.
What I am saying is that subjective quality and objective quality are two different things; it is not the measure that is subjective or objective, it is the "quality" that is either subjective or objective. They of course differ in the mechanism, but they also differ in the constraints, the factors, and the usage. There are attributes in the signal, and there are attributes in the application and the usage of the quality, that affect the choice of objective versus subjective quality measures. This became clearer when we developed and tried to measure the quality in 3-D video and in human interaction applications. If you give a person a job to perform (e.g., tie a shoelace) in virtual reality using state-of-the-art interactive devices (e.g., data gloves) and assign an evaluator to watch the virtual action, how would the quality assigned by the evaluator differ from an objective quality measure? It is a very complex signal, and even defining a "good" shoelace tying is a very tough task to start with. Similarly, a single monocular cue such as color or texture may have some impact on the quality of a two-dimensional (2-D) video, but it can have a catastrophic impact on the quality of a 3-D video. Nevertheless, humans may not notice such implications in a 3-D video, while they notice them clearly in a 2-D video. Back to the example of "high resolution means higher quality." In our lab, we ran many experiments with several subjects over a number of displays (a couple of years ago) to understand the impact of resolution, frame rate, texture content, etc. on the perceived quality of ultrahigh-definition (UHD) videos as well as 3-D videos. High resolution gave a "good experience" for a short period of time, but ultimately the quality of the content (color, sharpness, texture, etc.) decided the perceived quality of the videos. Limited resolution means, in my opinion, that we can hide several artifacts from human eyes. This takes me
Using a smaller measurement area, with less magnification, would lower the uncertainty. Another way of lowering the uncertainty would be to increase the time between the laser pulses. The relatively high uncertainty was accepted for two reasons. The first was the desire for the largest measurement area from a single field of view. The second was because the measurement plane was across the strongest flow direction. This is the primary plane of interest for many flow measurements around a ship hull (e.g. a wake survey through the propeller plane). This required relatively short laser pulse times to ensure the same particles are within the measurement space for both image pairs, which resulted in relatively high values of uncertainty.
It is clearly seen that a decrease in the encoding depth deteriorates the measurement. Fig. 5 shows that the asymptotic value k_a for 6-bit like images is higher than, but close to, that for 8-bit images, while the results for 4-bit like images are poor (except for the largest subset). For subset sizes of 15 or 21 pixels, the error value for a large period (smallest strain gradient) clearly depends on the dynamic range of the images: the k_a values for 8-bit images are lower than those for 4-bit images. When a larger subset size is considered, the k_a values are much closer and the influence of the coding becomes negligible. As there is less information in 4-bit like images than in 8-bit images, achieving a given error requires a larger subset size, which decreases the spatial resolution.
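The text does not state exactly how the "4- and 6-bit like" images were produced; one plausible sketch is a uniform requantization of the 8-bit grey levels, which reduces the image dynamic as described:

```python
import numpy as np

def requantize(img8, bits):
    """Reduce an 8-bit image to 2**bits grey levels while keeping the
    8-bit range, producing '4-bit like' or '6-bit like' images."""
    step = 2 ** (8 - bits)
    return (img8 // step) * step

img8 = np.array([0, 15, 16, 255], dtype=np.uint8)
img4like = requantize(img8, 4)  # 16 grey levels, step of 16
img6like = requantize(img8, 6)  # 64 grey levels, step of 4
```

With fewer grey levels, each subset carries less information, which is consistent with the larger subset sizes needed for 4-bit like images above.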
The aim of this work is to characterize the software in order to facilitate its use. Different parameters controllable in a conventional use of the system (i.e., on real images captured with a CCD) have therefore been tested here. By creating well-controlled synthetic images, we are able to vary parameters such as the type of encoding (4-, 6- or 8-bit) and the saturation independently. The DIC software processing was carried out with different subset sizes in order to highlight the relationship between the image parameters and the DIC software parameters. With deformed images obtained assuming a unidirectional sinusoidal displacement with different values of amplitude and period, we created one set of images with a strong grey-level gradient (for 4- and 6-bit like images) and one set of images simulating an overexposed speckle. The results of the first set show that a
Breast cancer is a global challenge, causing over 1 million deaths in 2018 and affecting millions more. Screening mammograms to detect breast cancer in its early stages is a vital step for prevention and treatment. However, to maximize the efficacy of mammography-based screening, proper positioning and image quality are of the utmost importance: improper positioning can result in missed cancers or require return patient visits for additional imaging. Assessing quality at the first visit, before the mammogram is examined by a radiologist, is therefore a crucial step in accurate cancer detection. This study proposes multiple deep learning techniques combined with geometric evaluations to provide numerical metrics on the quality of mammographic images. The study found that a RetinaNet model detecting breast landmarks achieved high precision in the mediolateral oblique (MLO) view (92% for the muscle top and 51% for the muscle bottom) and 83% precision in detecting the nipple in both the MLO and craniocaudal views. Using these detected landmarks, we provide a report containing numerical metrics on the positioning of the breast images, for mammography technologists to use during the patient visit to avoid the pitfalls of improper positioning. This report could help technologists take proper precautions and thereby help radiologists detect breast cancer effectively.
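One common MLO positioning measure derivable from such landmarks is the posterior nipple line (PNL) length, i.e., the perpendicular distance from the nipple to the pectoral muscle line; a minimal sketch (the landmark coordinates and the metric choice are illustrative, not taken from the study):

```python
import math

def point_to_line_distance(p, a, b):
    """Perpendicular distance from point p to the infinite line through
    a and b (all 2D pixel coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    # |cross((b - a), (p - a))| / |b - a|
    num = abs((bx - ax) * (py - ay) - (by - ay) * (px - ax))
    return num / math.hypot(bx - ax, by - ay)

# Hypothetical detected landmarks (pixel coordinates, not from the study):
muscle_top, muscle_bottom = (100.0, 0.0), (20.0, 400.0)
nipple = (300.0, 250.0)
pnl_length = point_to_line_distance(nipple, muscle_top, muscle_bottom)
```

A positioning report would compare such distances and angles against clinical acceptance criteria; the actual metrics in the study may differ.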