
The question of parameters is now addressed. In § 5.3.5, it was stated that if parameter α is set to zero, the binary decision to classify a gradient value as an edge or as a surface is not required. In this case, however, the diffusive process does not sharpen the edge (as already pointed out in § 5.3.4). One solution to this problem consists in using a competitive stage. With this solution, the threshold problem is avoided, although the problem of fixing parameters is not. According to the analogy with the RC electrical circuit, the diffusive time constant (equation (5.32)) depends on parameters δ and ε. Knowledge of the velocity at which the diffusive process spreads is thus the only determining factor. Once its range has been determined, fixing the parameters δ and ε is feasible.

Psychophysical measurements have already furnished such a range of velocity values for the human visual system (Paradiso and Nakayama 1991), but the present model is too far from biological reality, and it seems too early to adapt these data to the diffusive parameters. An alternative, adopted in this thesis, is to match as closely as possible the perceptual performance obtained from subjects, for a noisy image, with that obtained from the recursive model with diffusion. Parameters δ and ε are progressively adapted to improve the match.

From this point of view the task is not to find optimum parameters but rather to mimic as much as possible human perception.
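The matching procedure can be pictured with a minimal sketch, under strong assumptions: the thesis speaks of progressive adaptation, whereas the code below swaps in a plain grid search, and `toy_model` is a purely hypothetical stand-in for the recursive model with diffusion (it is not the model of this chapter). The names `fit_parameters`, `model_merit` and `human_merit` are illustrative only.

```python
import itertools
import numpy as np

def fit_parameters(model_merit, human_merit, snrs, deltas, epsilons):
    """Return (error, delta, epsilon) minimizing the mean squared mismatch
    between the model curve and the human performance curve."""
    best = None
    for delta, eps in itertools.product(deltas, epsilons):
        predicted = np.array([model_merit(s, delta, eps) for s in snrs])
        error = float(np.mean((predicted - human_merit) ** 2))
        if best is None or error < best[0]:
            best = (error, delta, eps)
    return best

def toy_model(snr, delta, eps):
    # Hypothetical stand-in: a saturating curve, NOT the thesis model.
    return snr / (snr + 10.0 * eps / delta)

snrs = np.array([0.01, 0.05, 0.1, 0.5, 1.0])
# Synthetic "subject" data generated from the toy model itself.
human = np.array([toy_model(s, 20000, 100) for s in snrs])
best = fit_parameters(toy_model, human, snrs, [10000, 20000, 50000], [100, 300])
```

The criterion is a mismatch with human data rather than an absolute optimum, which is the point made in the paragraph above.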

5.4 Conclusions

5.4.1 A new method

A new mathematical tool has been proposed to deal formally with asynchrony: the dynamic transinformation. It demonstrated that temporal precedence in information can help to classify symbols when they are perturbed by noise. In particular, specific architectures were described and shown to be well adapted to asynchrony if a temporal integration stage is added.

5.4.2 Contour completion

The question of edge estimation has also been addressed. While the habitual strategy is to choose a threshold, this chapter suggested fixing parameters with respect to human performance rather than heuristics. Furthermore, only ad hoc methods dealing with the question of contour completion have been proposed: a vectorial method was intended to imitate cooperation, and a mechanism of competition was proposed to ensure edge sharpening. Again, the contour completion problem is far too complex to be considered in this thesis, and thus still represents an active research area (see § 7.3.3).

5.4.3 The critical parameter

In spite of the numerous parameters required along the various stages of the proposed architectures, one parameter was shown to be critical: the diffusive time constant. Obviously, other parameters are critical too. For instance, the size of the filters should have been considered to address the question of the scale of visual processing. Nevertheless, this scale problem is far too complex to be included in this thesis.

CHAPTER 5. Asynchronous Visual Processing I. Foundations

5.4.4 What next?

Applications and simulations of the different equations presented in this chapter are to be found in the next chapter. In addition to the dependency of latency on luminance, formally treated in this chapter, extension to other features will also be considered. In particular, as a consequence of the sensitivity of the isotropic and anisotropic stages to specific spatial patterns, latencies will be shown to be dependent on geometry.

Can a cube that does not last for any time at all, have a real existence? Clearly, any real body must have extension in four directions: it must have Length, Breadth, Thickness, and Duration.

H.G. Wells, The Time Machine, 1895.

CHAPTER 6

Asynchronous Visual Processing II. Applications

Asynchrony in visual processing has been postulated in the previous chapters as resulting from latencies caused by differences in luminance. Does the human visual system benefit from this asynchrony? While this chapter does not claim to answer this question, it attempts to find the origin of the discrepancy between human perceptual performance in visual tasks and the poor performance obtained from the classical (synchronous) model of the early visual system.

An example of such a discrepancy is given by the visual task which consists of segregating objects contained in an image on the basis of visual features (texture, color, luminance, etc.). This task, named segmentation and defined more precisely in section 6.1, is in most cases easily performed by biological visual systems. Conversely, segmentation has been deemed complex and in most cases unsolvable when using current models.

To test whether asynchrony may help to segment images, the effect of an increase in transinformation is first tested on a particular case of the segmentation problem: the figure-ground separation problem (FGS) for artificial images. Furthermore, the performances of both architectures, feedforward and recursive with diffusion, are compared. The figure-ground separation problem is then extended to the segmentation of real images. Whether other sources of asynchrony could be used to enhance model performance in analyzing complex images is questioned in section 6.2. Spatial information is postulated to yield varying latencies according to the selectivity of receptive fields to geometric patterns.

6.1 Segmentation

6.1.1 Introduction

The general problem of segmentation goes well beyond the scope of this thesis. For that reason, segmentation is limited here to the problem of finding region boundaries. Thus, it does not consider the problem of textures, because finding region boundaries would fragment a homogeneous textured surface into many small areas. More generally, it does not consider the problem of segregating regions not differentiated by luminance values. Even with these restrictions, segmentation is ill-posed¹: there is no ideal or "correct" segmentation, and uniqueness of the solution is often not guaranteed. For instance, the solution will change under differing lighting conditions. Segmentation is thus "restricted" to mimicking human perception as closely as possible. Since no ideal solution exists, the aim of this approach is to implement the asynchronous models of the early visual system presented in the previous chapter, which, when provided with suitable parameters, can account for perceptual experiments that synchronous models cannot.

This section deals mainly with segmentation in the case of artificial images composed of one background and one foreground (results in § 6.1.5 and § 6.1.6). Very low signal-to-noise ratios are considered to demonstrate the potential of asynchrony when working with such degraded images. Also dealt with in this section is segmentation in the case of real images (results in § 6.1.7). In both cases, the performance of the asynchronous approach is compared with that of a synchronous approach obtained by using a common architecture.

6.1.2 Figure-ground separation

The performance of the human visual system in extracting noisy figures from a noisy background is astonishingly good. Even in situations of very poor contrast, boundaries emerge clearly. Conversely, typical (synchronous) edge detectors fail to give good results for such images. In an attempt to explain the discrepancies in these performances, the architectures defined in the previous chapter are tested on artificial images composed of a background and a foreground and corrupted by white noise. Quantitative comparisons between the synchronous and asynchronous approaches for the same architectures are to be found in § 6.1.4, while the framework behind these experiments is discussed in § 6.1.3. Applications of the FGS problem are presented in § 6.1.5 and § 6.1.6. Although other noise-robust segmentation techniques have been developed, such as pyramidal and/or stochastic approaches (e.g. Chou and Brown 1987; Devijver and Dekesel 1987; Rosenfeld and Sher 1988; Hérault and Hauraud 1991; Spann 1991), they are not discussed here. Except for the pyramidal approach, which involves receptive fields of various sizes and is biologically plausible, stochastic approaches invoke mechanisms not always compatible with those found in biological systems.

6.1.3 Framework

In CHAPTER 5, two architectures have been proposed to model the primate early visual system: (i) feedforward; (ii) recursive with diffusion. During the development of these architectures, the feedforward version was meant to demonstrate the increase in performance that a temporal integration stage could supply, and its limitations very quickly became apparent. The subsequent version was thus specifically designed to overcome these limitations by adding a time-dependent spatial diffusion. Since the main effect of the spatial diffusion is to filter out small discontinuities and sharpen edges when iteratively reevaluated (see section 5.3), it makes it possible to cope even with very low signal-to-noise ratios.

1. As opposed to a well-posed problem, where a solution exists, is unique, and depends continuously on the initial data (Poggio et al. 1987).

Another aspect introduced in the previous chapter was the notion of transinformation applied to two symbols perturbed by two kinds of white noise, namely Gaussian and uniform. In the context of asynchrony, transinformation resulted in two different curves according to the kind of noise (Figure 5.3 for Gaussian white noise and Figure 5.6 for uniform white noise). Also, for a given signal-to-noise ratio (SNR), a gain was defined (equation (5.9)). For various SNR values, the gains were shown to be much higher for uniform noise than for Gaussian noise. The repercussions of these differences in dynamic transinformation are of primary interest. For that reason, images with Gaussian and uniform noise are considered. Consequently, three performance comparisons have to be carried out as a function of SNR: (i) synchronous versus asynchronous; (ii) Gaussian white noise versus uniform white noise; (iii) feedforward versus recursive with diffusion.

6.1.4 Synchronous vs. asynchronous: performances

Owing to the duality existing between regions and boundaries, segmentation performance is measured with respect to the edges separating the foreground from the background.

Two different images are considered. In the first (Im1), the image is divided into two vertical rectangles, one of which (left) is the background and the other (right) the foreground. In the second (Im2), a horizontal rectangle, centered in the image, represents the foreground, the rest of the image being the background. In both images, the foreground is defined as being brighter than the background.
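For concreteness, the two noise-free test patterns could be generated as in the following sketch. The grey levels (124 and 130) and the 128 x 128 size are taken from the figure captions of this section; the rectangle dimensions `rect_h` and `rect_w` are not given in the text and are assumptions, as are the function names.

```python
import numpy as np

BACKGROUND, FOREGROUND = 124.0, 130.0  # grey levels quoted in the figure captions

def make_im1(size=128):
    """Im1: left half is background, right half is foreground (one vertical edge)."""
    img = np.full((size, size), BACKGROUND)
    img[:, size // 2:] = FOREGROUND
    return img

def make_im2(size=128, rect_h=32, rect_w=64):
    """Im2: horizontal rectangle centered in the image.
    rect_h and rect_w are assumed values, not taken from the thesis."""
    img = np.full((size, size), BACKGROUND)
    top, left = (size - rect_h) // 2, (size - rect_w) // 2
    img[top:top + rect_h, left:left + rect_w] = FOREGROUND
    return img

im1, im2 = make_im1(), make_im2()
```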

The measure of edge-detection performance, called "merit" in the graphs below, is defined with respect to edges as follows:

Merit = [numerator garbled in the source: an expression in Ā_r and Ā_n] / Ā_r    (6.1)

where Ā is the average edge amplitude obtained by adding values along the column or the line where edges are located, and the subscripts n and r refer to the image with and without noise, respectively. In order to allow for a comparison between the synchronous and asynchronous approaches, even for very low SNR, a tolerance in the edge loci was admitted. Thus, if an edge is slightly displaced within the admitted tolerance (maximum 3 pixels, generally required by the synchronous approach), only the edge amplitude is taken into account (and not its position).
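The amplitude measurement with positional tolerance can be sketched as follows for a vertical edge. This is only one possible reading of the text: the sketch assumes that, within the tolerance window, the column with the largest summed response is taken as the edge, and it computes only the average edge amplitude, not the merit ratio itself. The function name and the `edge_map` argument are hypothetical.

```python
import numpy as np

def average_edge_amplitude(edge_map, true_col, tolerance=3):
    """Average edge response along the strongest column found within
    +/- tolerance pixels of the expected edge column (assumed reading).
    Returns (average amplitude, column actually used)."""
    lo = max(true_col - tolerance, 0)
    hi = min(true_col + tolerance, edge_map.shape[1] - 1)
    column_sums = edge_map[:, lo:hi + 1].sum(axis=0)  # add values along each column
    best = int(np.argmax(column_sums))
    return column_sums[best] / edge_map.shape[0], lo + best

# Example: an edge response displaced by 2 pixels from its expected position.
edges = np.zeros((10, 20))
edges[:, 12] = 2.0
amplitude, used_col = average_edge_amplitude(edges, true_col=10)
```

For a horizontal edge, the same function can be applied to the transposed edge map.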

Note that the signal-to-noise ratio (SNR) is defined as the square of the contrast amplitude (foreground luminance I_f minus background luminance I_b) divided by the noise variance σ²:

$$\mathrm{SNR} = \frac{(I_f - I_b)^2}{\sigma^2} \qquad (6.2)$$
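As a worked illustration of this definition, the sketch below inverts the SNR formula to obtain the noise standard deviation and adds Gaussian or uniform white noise of that variance to an image. The default grey levels (130 and 124) come from the figure captions; the function names and the use of NumPy's random generator are assumptions made for the example.

```python
import numpy as np

def noise_std_for_snr(snr, foreground=130.0, background=124.0):
    """Invert SNR = (I_f - I_b)^2 / sigma^2 to get the noise standard deviation."""
    return abs(foreground - background) / np.sqrt(snr)

def add_white_noise(image, snr, kind="gaussian", rng=None,
                    foreground=130.0, background=124.0):
    """Add zero-mean white noise whose variance matches the requested SNR."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = noise_std_for_snr(snr, foreground, background)
    if kind == "gaussian":
        noise = rng.normal(0.0, sigma, size=image.shape)
    elif kind == "uniform":
        half_width = sigma * np.sqrt(3.0)  # U(-a, a) has variance a^2 / 3
        noise = rng.uniform(-half_width, half_width, size=image.shape)
    else:
        raise ValueError("kind must be 'gaussian' or 'uniform'")
    return image + noise

sigma_01 = noise_std_for_snr(0.1)  # about 18.97 grey levels for a contrast of 6
```

With a contrast of 6 grey levels, SNR = 0.1 thus corresponds to a noise standard deviation roughly three times larger than the step to be detected.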

Since only two orientations (vertical and horizontal) have to be considered, the cooperation described in § 5.2.4 was used in both architectures. Note that the results presented in this paragraph are based on a recursive architecture with a slight modification: the output of the diffusive stage is normalized after every iteration.

Since the measure of performance is based on edges, diffusion was not applied in the synchronous case. The reason for this choice is that parameter α in equation (5.25) was set to zero. Consequently, the condition for edge sharpening was not fulfilled, and the performance defined by equation (6.1) would not increase with diffusion. Alternatively, it could be argued that with anisotropic diffusion as defined by Perona and Malik (1990), edges would sharpen and performance would increase with diffusion. In this case, however, a threshold would have to be fixed, and inevitably "wrong" edges would be sharpened. Also, given the relative edge amplitude measure defined by equation (6.1), performance would not necessarily increase. Finally, a decisive argument for not using another model for the synchronous case was the desire to compare, for a common architecture, the performance obtained from an asynchronous data flow with that obtained from a synchronous data flow.
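The threshold issue raised here can be illustrated with the standard discrete scheme of Perona and Malik (1990), given below as a minimal sketch. It is not the diffusive stage of the thesis model; the parameter names `kappa` (the gradient threshold), `lam` (the step size) and `n_iter`, as well as their default values, are illustrative assumptions.

```python
import numpy as np

def perona_malik(image, n_iter=20, kappa=10.0, lam=0.2):
    """Standard 4-neighbour Perona-Malik diffusion with exponential conductance.
    Differences well below kappa are smoothed; those well above are preserved."""
    u = image.astype(float).copy()
    for _ in range(n_iter):
        # Differences to the four neighbours (zero flux across the image border).
        dn = np.zeros_like(u); dn[1:, :] = u[:-1, :] - u[1:, :]
        ds = np.zeros_like(u); ds[:-1, :] = u[1:, :] - u[:-1, :]
        de = np.zeros_like(u); de[:, :-1] = u[:, 1:] - u[:, :-1]
        dw = np.zeros_like(u); dw[:, 1:] = u[:, :-1] - u[:, 1:]
        # Conductance g(d) = exp(-(d / kappa)^2) applied to each difference.
        flux = sum(np.exp(-(d / kappa) ** 2) * d for d in (dn, ds, de, dw))
        u += lam * flux  # lam <= 0.25 keeps the explicit scheme stable
    return u
```

Any difference between neighbouring pixels that exceeds `kappa` is treated as an edge, whether it stems from the foreground boundary or from noise. At low SNR the noise differences are of the same order as the true step, which is why fixing such a threshold inevitably preserves some "wrong" edges.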

Figure 6.1 and Figure 6.2 present the results obtained from image Im1. Figure 6.1 compares, for the feedforward model, the synchronous approach with the asynchronous approach, for uniform and Gaussian white noise. Figure 6.2 makes the same comparison for the recursive model with diffusion. Figure 6.3 presents the results obtained from the rectangle (image Im2). In this case, only the recursive model with diffusion is considered, again comparing the synchronous and asynchronous approaches (for uniform and Gaussian white noise).

For image Im1, a point in the resulting graphs represents the average of several values of merit (maximum 3, according to the variability in the results) obtained from different images with the same SNR. For image Im2, a point represents, in addition to the average defined for Im1, the average of the performances on horizontal and vertical edges. Identical conditions were always used for the asynchronous and synchronous approaches. Furthermore, the synchronous approach was tested on the same architecture as the one used for the asynchronous approach, with the sampling time set in such a way that all data arrived at the same time.

Several conclusions can be inferred from these three graphs:

(i) synchronous versus asynchronous: except for the feedforward architecture tested with Gaussian noise, the asynchronous approach always performs better than the synchronous approach;

(ii) Gaussian versus uniform: performance is higher for uniform noise. This result is in agreement with the dynamic transinformation curves (Figure 5.3 versus Figure 5.6) as well as with the differences in gain existing between both noise distributions (Figure 5.4 versus Figure 5.7);

(iii) feedforward versus recursive with diffusion: a comparison between the results shown in Figure 6.1 and Figure 6.2, for Gaussian noise, clearly demonstrates that performance differs according to whether the feedforward or the recursive model is used. With the feedforward model, both synchronous and asynchronous approaches give poor performance. With the recursive model, asynchrony in data flow increases the performance, as shown by the plateau where values of merit are stable (around 1) over a large range of SNRs. Such a plateau has been observed to be characteristic of human visual perception: performance is stable over a large range of SNRs before dropping suddenly (subjects cannot decide whether or not there is a foreground; for discriminative tasks at very low SNRs, see Burgess et al. 1981).

[Figure 6.1: plots of merit versus SNR (axis ticks: 0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5) for uniform noise (top) and Gaussian noise (bottom); legend: synchronous versus asynchronous.]

Figure 6.1: Performance comparisons for the feedforward architecture (without diffusion), asynchronous versus synchronous approach. Two cases are represented: uniform (top) and Gaussian (bottom) additive white noise. The kind of pattern used is shown at top right; background level: 124, foreground level: 130. The scale of the signal-to-noise ratio (SNR) is logarithmic. See equations (6.1) and (6.2) for the definitions of the figure of merit and SNR. Parameters are: (i) isotropic stage: A = 100, B = 90, C = 4, D = 60, E = 1, α = 1, β = 2, filter size: 21x21 pixels; (ii) anisotropic stage: AR = 0.3, f = 1, filter size: 21x21 pixels, 1 orientation (90°); (iii) temporal parameters (asynchronous case): sampling time: dt = 0.5 ms, upper simulation time: 22 ms.

[Figure 6.2: plots of merit versus SNR for uniform noise and Gaussian noise; legend: synchronous versus asynchronous.]

Figure 6.2: Performance comparisons for the recursive architecture with diffusion, asynchronous versus synchronous approaches. Two cases are represented: uniform and Gaussian additive white noise. The kind of pattern used is shown at top right; background level: 124, foreground level: 130. The scale of the signal-to-noise ratio (SNR) is logarithmic. See equations (6.1) and (6.2) for the definitions of the figure of merit and SNR. Parameters are: (i) isotropic stage: A = 100, B = 90, C = 4, D = 60, E = 1, α = 1, β = 2, filter size: 21x21 pixels; (ii) anisotropic stage: AR = 0.3, f = 1, filter size: 21x21 pixels, 1 orientation (90°); (iii) diffusive stage (synchronous case): δ = 20000, ε = 100, subsampling time: dt/25; (iv) temporal parameters (asynchronous case): sampling time: dt = 0.5 ms, upper simulation time: 22 ms.

[Figure 6.3: plots of merit (axis from 0 to 1.0) versus SNR (axis ticks: 0.005, 0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5) for uniform noise and Gaussian noise; legend: synchronous versus asynchronous.]

Figure 6.3: Performance comparisons for the recursive architecture with diffusion, asynchronous versus synchronous approaches. Two cases are represented: uniform and Gaussian additive white noise. The kind of pattern used is shown at top right; background level: 124, foreground level: 130. The scale of the signal-to-noise ratio (SNR) is logarithmic. See equations (6.1) and (6.2) for the definitions of the figure of merit and SNR. Parameters as in Figure 6.2.

From the graphs presented in Figure 6.1, Figure 6.2, and Figure 6.3, it can be deduced that the recursive architecture with diffusion is more effective than the feedforward model. Furthermore, asynchrony in data flow was shown to increase robustness to noise. Based on these two observations, the FGS problem is solved for different images which have the common characteristic of being easily segmentable by human subjects. Comparisons with the synchronous approach are included.

6.1.5 Results I. Noisy figure/ground

Image Im2, described in the previous paragraph (§ 6.1.4) and composed of a rectangle as the foreground, is now used to test the applicability of the asynchronous approach for the recursive model. Results of simulations, for two different SNRs, are shown in Figure 6.4 and Figure 6.6 for Gaussian and uniform white noise, respectively. Comparisons with the synchronous approach make clear that the conductance coefficients can better control the diffusion when visual information is treated asynchronously. For Gaussian white noise with SNR = 0.1, it must be noted that the resulting image is squarish instead of rectangular (output of the diffusive stage, bottom right of Figure 6.4). This result should be compared to the perception of subjects who, when asked to describe the shape of the foreground of this particular image, usually report seeing a square. Since the diffusive parameters of the model were fixed to mimic human performance, similar results between subjects and the model are expected.

To allow a quantitative evaluation of the quality of surfaces, and thus of the control of diffusion, the diffusion outputs are segmented into two levels: 1 for the foreground and 0 for the background. The algorithm used to accomplish this segmentation is region growing based on the similarity of neighboring pixels, where similarity is defined according to a tolerance. Results are shown in Figure 6.5 and Figure 6.7 for Gaussian and uniform noise, respectively.
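A minimal sketch of such a two-level segmentation by region growing is given below. It assumes 4-connectivity, a single seed pixel placed inside the foreground, and a tolerance expressed as a relative difference between two neighboring pixels; none of these details is specified beyond the relative tolerance, so they should be read as assumptions, as should the function name.

```python
from collections import deque
import numpy as np

def grow_region(image, seed, tolerance=0.2):
    """Return a mask with 1 for pixels reachable from `seed` through 4-neighbours
    whose values differ by at most `tolerance` (relative), and 0 elsewhere."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[seed] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        reference = max(abs(image[r, c]), 1e-12)  # guard against division-like issues at zero
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(image[nr, nc] - image[r, c]) <= tolerance * reference:
                    mask[nr, nc] = 1
                    queue.append((nr, nc))
    return mask

# Example: a normalized diffusion output with a bright rectangle on a dark background.
output = np.full((8, 8), 0.2)
output[2:6, 1:7] = 1.0
segmented = grow_region(output, seed=(3, 3), tolerance=0.2)
```

A well-controlled diffusion yields nearly flat surfaces separated by sharp edges, so the grown region stops at the boundary; a poorly controlled one lets the region leak into the background.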

[Figure 6.4: image panels for Gaussian noise at SNR = 0.5 and SNR = 0.1; rows: original, synchronous result, asynchronous result.]

Figure 6.4: Output of the diffusive stage of the recursive model with diffusion for two different SNRs. For both SNRs (0.5 and 0.1, Gaussian white noise), the asynchronous approach perfectly controls the diffusion, which means that edges are well defined. Line artifacts are due to the edge estimation stage, which is performed on a whole line or column (cooperation). Image size: 128 x 128 pixels; grey levels and parameters as in Figure 6.2, with normalization of the output of the diffusive stage. Dotted rectangles have been added by hand to indicate the position of the foreground.

[Figure 6.5: image panels for Gaussian noise, SNR = 0.5: diffusion output and segmented image.]

Figure 6.5: Segmentation. Results shown in the previous figure (diffusion output) are segmented into a foreground and a background according to the region-growing algorithm, where a tolerance in the relative values of two neighboring pixels is specified. Tolerance in neighboring values: 10% for the synchronous case (gives a better result than 20%) and 20% for the asynchronous case. Note the better results obtained with the asynchronous approach.

[Figure 6.6: image panels for uniform noise at SNR = 0.5 and SNR = 0.05.]

Figure 6.6: Output of the diffusive stage of the recursive model with diffusion for two different SNRs. For both SNRs (0.5 and 0.05, uniform white noise), the asynchronous approach perfectly controls the diffusion, except for the left side of the rectangle when an SNR of 0.05 is used. Line artifacts are due to the edge estimation stage (co-
