
5.1 Dynamic transinformation

5.1.5 Application of the dynamic transinformation

In this section, the dynamic transinformation has been defined and evaluated for two kinds of noise. From these evaluations, it was clear that transinformation reaches an optimum as time progresses. Furthermore, inside a specific range, dynamic transinformation was shown to be greater than synchronous transinformation (illustrated by the hatched surfaces in Figure 5.3 and Figure 5.6). How to use this increase in transinformation is addressed in the two following sections (sections 5.2 and 5.3). More specifically, two asynchronous visual architectures modeling the early visual system are proposed and described in detail. Owing to the nature of the various processing stages, the main functionality of these architectures is to extract region boundaries (high pass filtering). The first architecture, feedforward, is very basic but is useful to introduce the concept of temporal integration. The second, recursive with diffusion, attempts to model other functions than just the high pass filtering. Indeed, the human visual system does not see only a world made of contours (see § 2.4.4). Asynchrony will be supposed to stem exclusively from luminance differences, and latencies will be calculated as in this section, that is to say linearly.

5.2 Asynchronous model. I. Feedforward

5.2.1 Global architecture

Although very simplistic, the feedforward model in its asynchronous version can already benefit from the increase in dynamic transinformation to locate boundaries separating regions of differing luminances. This model is constituted of four main stages (Figure 5.8): (i) isotropic filtering; (ii) anisotropic filtering; (iii) cooperation; (iv) temporal integration. The last stage, temporal integration, is the major modification required to work with an asynchronous rather than synchronous data flow which, in the case considered, stems from luminance differences.

Another modification is the conversion of amplitudes into latencies according to a linear function (as in section 5.1), to create the dynamic data flow (see in Figure 5.8 the module "latencies"). Note that the luminance information is spatially represented according to a two-dimensional regular lattice of points.
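As an illustration, the linear conversion of luminance amplitudes into latencies can be sketched as follows. This is a minimal sketch: the function names are invented, and the default latency range [22 ms, 2 ms] for luminances in [0, 255] is taken from the simulations of § 5.2.6, where it is stated explicitly.

```python
import numpy as np

def luminance_to_latency(lum, lat_max=22.0, lat_min=2.0, lum_max=255.0):
    """Linearly map luminance values to response latencies (ms).

    Brighter points respond earlier: luminance 0 -> lat_max,
    luminance lum_max -> lat_min (range taken from section 5.2.6).
    """
    lum = np.asarray(lum, dtype=float)
    return lat_max - (lat_max - lat_min) * lum / lum_max

def active_mask(latencies, t):
    """Points whose signal has already appeared at time t.

    Once appeared, a signal keeps its value (sustained responses),
    so the mask only grows as t increases.
    """
    return latencies <= t
```

Evaluating `active_mask` at successive sampling times yields the temporally distributed, asynchronous data flow described in the text.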

Figure 5.8: (A): Synchronous architecture. Outputs give, for a specific orientation, values of coherence within image columns (see text). (B): Asynchronous architecture, which is similar to the one given in (A) but with two supplementary stages: latencies computation and temporal integration. Signals entering the system are thus delayed according to their amplitude and, once they have appeared, they keep their value, replacing the original value which was set to zero; this corresponds to sustained responses (see § 4.4.2). The appearance of these signals is thus temporally distributed and produces a dynamic asynchronous data flow, distinguished from synchronous inputs by hatched arrows.

5.2.2 Isotropic filtering

The first stage of processing is an isotropic filtering with a small on-center, large off-surround structure. It does not include the complementary off-center on-surround channel described in § 2.2.3. This level corresponds physiologically to the ganglion cells in the retina (for models, see for instance Richter and Ullman 1982; Wehmeier et al. 1989). It is realized by using a set of neurons, modeled by the shunting equations (§ 3.4.3), forming a lateral inhibitory structure (section 3.5). As has been seen in § 3.5.7, this level of processing demonstrates an adaptive property important for the perception of contrasts: the Weber-Fechner law. Output of this stage is half-wave rectified (see § 2.4.3 for the rectification definition). The effect of this rectification is to keep only positive response values. The equation describing this high pass isotropic filtering is (Grossberg et al. 1989):

CHAPTER 5. Asynchronous Visual Processing I. Foundations

dx_ij/dt = −A·x_ij + (B − x_ij) Σ_{p,q} K^C_{pq} I_{pq} − (x_ij + D) Σ_{p,q} K^S_{pq} I_{pq}

where X_ij is the half-wave rectification of the activity of cell x_ij at position (i, j), I are inputs, A is the decay of the membrane potential, B and D are respectively the excitatory on-center and inhibitory off-surround saturation values, K^C and K^S are respectively the excitatory and inhibitory kernel coefficients, and r_C and r_S are respectively the radii of the excitatory and inhibitory regions for which the exponential value gives 0.5. The range of values p, q is contained in the limit of the receptive field size (circular in that case). Properties of these equations have already been studied in Briffod and Burgi (1991).
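A sketch of this stage at equilibrium (dx/dt = 0, which gives x = (B·C − D·S)/(A + C + S) with C and S the center and surround drives) may clarify the structure. The kernel normalization to unit sum and all default parameter values are assumptions of this sketch, not values from the text.

```python
import numpy as np

def gaussian_kernel(radius, size):
    """2-D kernel falling to half its peak at distance `radius`,
    normalized to unit sum (the normalization is an assumption)."""
    r = np.arange(size) - size // 2
    d2 = r[:, None] ** 2 + r[None, :] ** 2
    K = 0.5 ** (d2 / radius ** 2)
    return K / K.sum()

def _correlate(I, K):
    """Plain zero-padded correlation (slow but dependency-free)."""
    kh, kw = K.shape
    P = np.pad(I, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(I, dtype=float)
    for i in range(I.shape[0]):
        for j in range(I.shape[1]):
            out[i, j] = np.sum(P[i:i + kh, j:j + kw] * K)
    return out

def shunting_equilibrium(I, A=1.0, B=1.0, D=1.0, rc=1.0, rs=3.0, size=11):
    """Equilibrium of the shunting on-center off-surround network,
    x = (B*C - D*S) / (A + C + S), followed by half-wave rectification."""
    C = _correlate(I, gaussian_kernel(rc, size))  # excitatory on-center drive
    S = _correlate(I, gaussian_kernel(rs, size))  # inhibitory off-surround drive
    x = (B * C - D * S) / (A + C + S)
    return np.maximum(x, 0.0)                     # keep only positive responses
```

Because the divisive term A + C + S grows with the total input, the equilibrium response depends on relative rather than absolute luminance, which is the adaptive (Weber-Fechner) behavior mentioned above.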

5.2.3 Anisotropic filtering

The second stage implements an anisotropic filtering by means of even Gabor filters. This level has its physiological correspondence in the primate primary visual cortex (Daugman 1985; Glezer 1985; Palmer et al. 1985). Output of this stage is "full-wave" rectified with the effect of converting negative responses into positive responses (absolute value), thus rendering this stage independent of contrast polarity. It is described by the following equations:

where Y^k_ij is the full-wave rectification of the activity of cell y^k_ij at position (i, j), X are inputs, k is the filter orientation, f and φ are respectively the frequency and phase of the side bands, and e_l and e_w are respectively the normalized receptive field elongation and width. Note the definition of the aspect ratio (AR). The range of values p, q is contained in the limit of the receptive field size (elliptic according to the aspect ratio).
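An even (cosine-phase) Gabor kernel with an elliptic envelope, followed by full-wave rectification, can be sketched as follows. The envelope scale `sigma` and the default frequency are illustrative values, not parameters from the text; the aspect ratio 0.5 matches the one used in Figures 5.10 and 5.11.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def even_gabor(size=21, theta=0.0, freq=0.1, phase=0.0, aspect_ratio=0.5, sigma=4.0):
    """Even Gabor kernel: cosine side bands under an elliptic Gaussian envelope.

    theta: filter orientation (rad); freq: side-band frequency (cycles/pixel);
    aspect_ratio: envelope width over elongation; sigma: envelope scale.
    """
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    xr = x * np.cos(theta) + y * np.sin(theta)    # axis along the orientation
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (yr / aspect_ratio) ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * freq * xr + phase)

def anisotropic_response(X, kernel):
    """Filter the isotropic-stage output and full-wave rectify (absolute
    value), rendering the stage independent of contrast polarity."""
    pad = kernel.shape[0] // 2
    P = np.pad(X, pad)
    W = sliding_window_view(P, kernel.shape)
    return np.abs(np.einsum("ijkl,kl->ij", W, kernel))
```

With phase = 0 the kernel is symmetric under point reflection, and the absolute value guarantees identical responses to an edge and to its contrast-reversed version.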


5.2.4 Cooperation

In the context of this thesis, cooperation has not been used to propose a solution to contour completion, treated theoretically for instance by Grossberg (e.g. Grossberg 1987), but rather to propose ad hoc methods which could enhance the quality of edges. The principal idea is that spatially distant features, such as the responses of various cells of the anisotropic stage for a given orientation, are supposed to be interconnected (physiologically plausible according to § 2.4.4). On the other hand, and to remain in the spirit of asynchrony, cooperation will be based on the notion of temporally distributed events (Figure 5.9A). The outcome of these two constraints is that cooperation is viewed as a nonlinear process which links spatially distant features sharing similar characteristics (in this case, orientation).

Events temporally distributed and coherency

The dependence between the frequency of a train of spikes impinging on a cell and the latency of the first spike emitted by this cell was established in § 4.1.3. To measure the similarity between various cell responses, also called "coherency", the latency of their first spike, referred to as the phase, is thus significant. If the phase difference between cells is small, then it can be deduced that the activities of those cells are similar. When designing an operator measuring the similarity between signals, it is important to take into account the signal amplitudes and not only their phases. It is wished, for instance, that the output of this operator for two in-phase signals of small amplitude gives a weaker response than for two in-phase signals of large amplitude. To express these two factors (phase and spike frequency), cell responses are represented by vectors whose length is related to spike frequency and whose angle is related to spike latency (thus both length and angle are dependent on signal amplitude). In consequence, the transformation of a train of spikes into a vector, called vector conversion (illustrated in Figure 5.9B), involves two functions: (i) one for converting frequencies into vector magnitudes; (ii) another for converting latencies into vector angles (in the range [0, π] to avoid cancellation of vectors pointing in opposite directions; this situation would correspond to inhibitory connections, which are not wanted). These two functions are chosen linear (G and H in Figure 5.5).
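The vector conversion can be sketched as follows, with G(x) = a·x + b for the magnitude and H(x) = −a'·x + b' (b' = π) for the angle, as in Figure 5.9B. The slopes a, a', the maintained-discharge term b, and the amplitude range are illustrative values chosen for this sketch.

```python
import numpy as np

def response_to_vector(amplitude, amp_max=255.0, b=0.1, b_prime=np.pi):
    """Convert a cell response into a vector (magnitude, angle).

    Magnitude grows linearly with spike frequency, G(x) = a*x + b, where b
    stands for a maintained discharge.  The angle decreases linearly with
    latency, H(x) = -a'*x + b' with b' = pi, so that all angles lie in
    [0, pi] and opposite vectors cannot cancel each other.
    """
    a = 1.0 / amp_max
    magnitude = a * amplitude + b         # G: frequency -> magnitude
    latency = 1.0 - amplitude / amp_max   # normalized latency in [0, 1]
    angle = -np.pi * latency + b_prime    # H: latency -> angle in [0, pi]
    return magnitude, angle
```

A maximal-amplitude response thus maps to the longest vector at angle π, while a zero-amplitude response maps to the short maintained-discharge vector at angle 0.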

Measuring coherency

Measuring the coherency among responses for a specific image column¹ of a specific orientation is made in three stages: (i) convert responses into vectors according to the previous rules; (ii) calculate the vectorial sum of the vectors determined in the previous step; the resulting vector gives a measure of the strength of the responses along an image column with a weak measure of the extent of spreading of the vectors; (iii) measure the dispersion of all the vectors with respect to the resulting vector calculated in the previous step. The dispersion of n vectors of length R_i and angle θ_i with respect to a resulting vector of length R_R and angle θ_R is measured by applying the following definition²:

1. An image column refers to the lattice of points which is supposed to form columns according to the horizontal and vertical orientations. Extension to n orientations is not considered (see text).

2. A similar definition is used by Rao and Schunck (1991) to measure the flow orientation coherence of vectors corresponding to spatial gradients.
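The three stages above can be sketched as follows. Since the thesis's own dispersion formula (equation 5.12) is not reproduced here, the sketch uses an illustrative dispersion measure (mean of 1 − cos of the angle differences, in the spirit of Rao and Schunck 1991); it is a stand-in, not the definition from the text.

```python
import numpy as np

def column_coherency(magnitudes, angles):
    """Coherency of the responses along one image column.

    (i) responses are given as vectors (magnitude, angle in [0, pi]);
    (ii) take the vectorial sum to obtain the resultant;
    (iii) measure the dispersion of the vectors about the resultant.
    The dispersion formula is an illustrative choice.
    """
    m = np.asarray(magnitudes, dtype=float)
    a = np.asarray(angles, dtype=float)
    rx, ry = np.sum(m * np.cos(a)), np.sum(m * np.sin(a))
    R = np.hypot(rx, ry)                      # resultant length R_R
    theta_R = np.arctan2(ry, rx)              # resultant angle theta_R
    dispersion = np.mean(1.0 - np.cos(a - theta_R))
    return R, dispersion
```

Aligned vectors give a long resultant and zero dispersion; spread angles give a shorter resultant and a strictly positive dispersion, which is the behavior the coherency detector relies on.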


An alternative to this vectorial method for measuring the coherency among neuronal responses is coincidence detection, which favors simultaneous spike arrivals by implementing a multiplicative operator, recently formalized by Bugmann (1992).

For practical reasons, the technique of cooperation (or coherency detection) just described is applicable to an image for only two orientations: horizontal and vertical. Extension to more orientations would lead to a combinatorial explosion. A second limitation is that the whole column contributes to the response. Thus, gaps existing along a column are not considered. A project aimed at finding solutions to these limitations is being carried out (Durante and Burgi 1992). This project involves the analysis of nonlinear activity of neuronal structures and, particularly, a study of the oscillatory mode.


Figure 5.9: Illustration of the correspondence between a cell response and a vector. (A): Cooperation based on a measure of coherency between a set of neurons whose activity is characterized by spike frequency and phase. The operator is described in the text. (B): Illustration of the vector conversion. There is a linear relation between spike frequency and vector magnitude as well as between latency and vector angle, described respectively by the function G(x) = a·x + b, where b is representative of a maintained discharge, and H(x) = −a'·x + b', where b' = π (a and a' are chosen according to the considered range of signal amplitudes). The example on the right shows how the vectors corresponding to three responses with different frequencies are disposed, with the resultant of the vectorial addition indicated by the vector R_R.

5.2.5 Temporal integration

In accordance with the relationship between intensity and latency, luminance differences yield a dynamic data flow of the visual information. The outputs of the last stage of processing of the feedforward architecture will thus evolve dynamically along time (the time unit is arbitrary). According to the principle that the dynamic transinformation of a source perturbed by white noise has an optimum along time, one goal is to make use of it. As has been pointed out, the main function of the feedforward architecture is to implement a high pass filtering. Thus, it detects boundaries separating regions which are differentiated by their luminance values. Also, an increase in the confidence for classifying a point of the lattice as belonging either to one region or to another may result in an increase in the confidence of boundary locations. The idea of temporally integrating the outputs of the last stage of processing aims at conserving responses which correspond to periods where dynamic transinformation was particularly high during the processing (these optimum responses are particularly visible in Figure 5.13). Temporal integration is defined as follows:

Z^k_ij(t) = Z^k_ij(t − 1) + Y^k_ij(t),    Z^k_ij(0) = 0   ∀i, ∀j, ∀k

where Y^k_ij is defined in equations (5.11) and Z^k_ij is the output of the integrative stage (orientation k and position (i, j)). Time t − 1 indicates the time of the previous processing (defined with respect to the sampling time).
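The running sum above is straightforward to sketch; the function name and the array layout (time as the leading axis) are conventions of this sketch.

```python
import numpy as np

def temporally_integrate(responses):
    """Temporal integration Z(t) = Z(t-1) + Y(t), with Z(0) = 0.

    `responses` stacks the coherency outputs Y over sampling times
    (shape (T, ...)); the running sum conserves responses from periods
    where dynamic transinformation was particularly high.
    """
    responses = np.asarray(responses, dtype=float)
    Z = np.zeros_like(responses[0])
    for Y in responses:       # one step per sampling time
        Z = Z + Y
    return Z
```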

5.2.6 Illustration of the effects of asynchrony

To demonstrate the importance of this temporal integration stage, a seemingly simple problem is considered (Burgi and Pun 1991b; Burgi 1992a). Let an image, formed by a lattice of points, be divided into a left and a right part of equal area but of differing luminance values, referred to as the background for the part of lower luminance and as the foreground for the part of higher luminance (e.g. left in Figure 5.10A). Output responses of the feedforward architecture to this image are estimations of the location of the boundary separating the background from the foreground (e.g. Figure 5.10B,C). For a given column, the higher the response, the higher the confidence in the boundary location.

While this problem is easily solvable for a noiseless image, for very low signal-to-noise ratios it becomes critical. To establish the role of the temporal integration stage, white noise is added to this image. Comparisons between the asynchronous and synchronous approaches are given in Figure 5.10 and Figure 5.11 for, respectively, Gaussian white noise of SNR 0.1 and uniform white noise of SNR 0.12. The brightness amplitude range is [0…255], corresponding, given a linear relation, to the latency range [22 ms…2 ms].

For the asynchronous simulations, sampling times were 2 ms (Figure 5.10) and 1 ms (Figure 5.11), corresponding to 11 and 22 iterations respectively. Conversely, for the synchronous simulations, the sampling time was 22 ms, corresponding to only one iteration. Results in Figure 5.10C and Figure 5.11C clearly demonstrate that boundary location is correctly performed with the asynchronous approach, as shown by the peak in estimation of the location in the middle of the image, whereas the synchronous approach gives a wrong location (the maximum response is not situated in the middle of the image).

To illustrate the effect of varying the sampling along time, different values have been chosen to process asynchronously the image with Gaussian white noise. Results shown in Figure 5.12 make clear that an increase in sampling time yields a decrease in the performance of boundary location. Note that the limit case in this increase corresponds to the synchronous processing.


[Figure 5.10, panels: (A) the step without noise and with Gaussian noise (SNR = 0.1); (B, C) synchronous processing versus asynchronous processing, both with vectorial coherence cooperation.]

Figure 5.10: Application of the feedforward model to locate the boundary separating two regions of differing luminances perturbed by Gaussian white noise. (A): Gaussian white noise of standard deviation 79 has been added on a step of height 25 (125-100), resulting in an SNR of 0.1; (B): Coherency detection in vertical image columns. For the asynchronous processing, sampling time of 2 ms; (C): Slices of (B), with numbers on the right indicating the worst peak-to-peak ratio (the highest divided by the second highest). Images 128 × 128; isotropic filter 11 × 11; anisotropic filters 21 × 21, aspect ratio 0.5. Brightness range is [0…255], corresponding to the latency range [22 ms…2 ms]. Note that because of the reproduction on paper, the edge in (A) might not be visible and, for that reason, a dashed line has been added.

[Figure 5.11, panels: (A) the noisy step (SNR = 0.12); (B, C) synchronous processing versus asynchronous processing, both with vectorial coherence cooperation.]

Figure 5.11: Application of the feedforward model to locate the boundary separating two regions of differing luminances perturbed by uniform white noise. (A): Uniform white noise of range 50 has been added on a step of height 5, resulting in an SNR of 0.12; (B): Coherency detection in vertical image columns. For the asynchronous processing, sampling time of 1 ms; (C): Slices of (B), with numbers on the right indicating the worst peak-to-peak ratio (the highest divided by the second highest). Image 128 × 128; isotropic filter 11 × 11; anisotropic filters 21 × 21, aspect ratio 0.5. Brightness range is [0…255], corresponding to the latency range [22 ms…2 ms]. Note that because of the reproduction on paper, the edge in (A) might not be visible and, for that reason, a dashed line has been added.


Figure 5.12: Illustration of the effect of varying the sampling time on the performance. Noisy step as in Figure 5.10 (Gaussian white noise and SNR = 0.1). Sampling time values are shown above the graphs. Performance is measured as the peak-to-peak ratio (the highest divided by the second highest), with the values shown on the right of the graphs. Increasing the sampling time diminishes the asynchrony in processing, the limit being the synchronous case. Filter parameters as in Figure 5.10.

In order to illustrate the notion of an optimum in processing, which is hypothesized to result from the increase in transinformation, a 3-D plot of the temporal evolution of the cooperative stage output (before the temporal integration stage), for every image column, is shown in Figure 5.13. Peaks in coherence are temporally situated in a period of time where dynamic transinformation is higher than synchronous transinformation (for a contrast of 25, it has been estimated that the maximum in transinformation occurs at x = −39.2, which, for the image shown in Figure 5.10, would correspond to t = 6.8 ms).


Figure 5.13: Illustration of the dynamic evolution of the feedforward architecture outputs (coherency as a function of image column and time), before the temporal integration stage. Image and parameters as in Figure 5.10. Coherency is calculated according to equation (5.12). The indication "middle" corresponds to the edge position. On the time axis, the indication "Opt" refers to the time where there is an optimum in response (obtained at t ≈ 7 ms), and the indication "Syn" corresponds to the response obtained in the synchronous case (t = 22 ms). Note that the temporal integration stage summates, for a given position, all the coherence responses along time (from t = 0 to t = 22 ms).

5.3 Asynchronous model. II. Recursive with diffusion

5.3.1 Introduction

Most of the processing stages of this architecture are identical to those used in the feedforward version (section 5.2). Only the new disposition of the processing stages and the adjunction of a new diffusive stage confer new properties to this architecture. Thus, the isotropic and anisotropic filtering stages (including the rectifications), the cooperation stage (when used), and the temporal stage are identical to those already described, respectively, in § 5.2.2, § 5.2.3, § 5.2.4, and § 5.2.5.

The general architecture is first described in § 5.3.2. Then, for the sake of clarity, the diffusive stage is described in four successive paragraphs. The first (§ 5.3.3) deals with the theory invoked by the diffusion equation. The second (§ 5.3.4) addresses the question of distinguishing an edge from a surface. The third (§ 5.3.5) presents some practical aspects of the diffusion equation. The fourth (§ 5.3.6) explains how edges, needed for determining the diffusive coefficients, are obtained. Finally, in § 5.3.7 a method to fix some essential parameters of the model is suggested.

5.3.2 Global architecture

The global architecture is shown in Figure 5.14, where it can be seen that the diffusive stage receives an asynchronous data flow. As will be explained in the next paragraphs, this stage requires coefficients to be controlled. Thus, these coefficients must be reevaluated continuously. This state of affairs implies the existence of a recursive loop linking the diffusive stage with the gradient calculation stage. The resulting architecture, with its dynamic evolution, can be described as follows: (i) an image with differing luminances yields an asynchronous data flow where latencies are chosen to be linearly related to luminances; (ii) the data flow enters the first isotropic filtering stage; (iii) the output of this stage is temporally integrated; (iv) the output of this stage is diffused; (v) conductance coefficients are determined dynamically to control the diffusion.
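Steps (i)-(v) can be sketched as a schematic driver loop. Every stage callable here is a hypothetical stand-in supplied by the caller, not one of the thesis's equations; the sketch only fixes the control flow: an outer loop over sampling times of the asynchronous flow and a finer inner loop for the diffusion-conductance recursion.

```python
import numpy as np

def run_recursive_loop(latencies, values, t_max, dt, inner_steps,
                       isotropic, diffuse_step, conductance_from):
    """Schematic driver for steps (i)-(v) of the recursive architecture.

    At each sampling time, points whose latency has elapsed join the
    sustained input (i); the input is isotropically filtered (ii) and
    accumulated by temporal integration (iii); the integrated signal is
    diffused over a few finer inner steps (iv), with conductance
    coefficients re-estimated from the current state (v).
    """
    Z = np.zeros_like(values)          # temporally integrated signal
    b = np.zeros_like(values)          # diffused membrane potential
    t = dt
    while t <= t_max + 1e-9:
        sustained = np.where(latencies <= t, values, 0.0)  # asynchronous flow
        Z = Z + isotropic(sustained)                       # temporal integration
        for _ in range(inner_steps):                       # finer inner sampling
            c = conductance_from(b)                        # recursive loop
            b = diffuse_step(b, Z, c)
        t += dt
    return b
```

With trivial stand-ins (identity filtering, a relaxation step toward Z), the loop already shows the asynchronous behavior: early-latency points accumulate activity while late-latency points remain silent.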

The role of the temporal integration has already been discussed in § 5.2.5. It is situated at the output of the isotropic filtering in order that the diffusive stage benefits from the increase in transinformation, resulting in a better edge estimation.

Figure 5.14: Recursive architecture with diffusion. Edge estimation comprises the anisotropic and competitive stages (and optionally the cooperation stage). Outputs of the temporal integration stage are the X_ij defined in equations (5.34); outputs of the edge estimation are the S^k_ij defined in equations (5.37). Note the stage which converts luminance values into an asynchronous flow of data. This asynchronous data flow is indicated by hatched arrows.

This asynchronous recursive architecture has two dynamic data flows. First, the data flow yielded by luminance differences. Second, the data flow formed by the loop linking the diffusive stage with the edge estimation stage. Conceptually, both data flows are indistinguishable. Practically, and for reasons of optimization, these two data flows are considered independently. Given a sampling time for the asynchronous data flow, the data flow formed by the loop is subdivided into a finer sampling time.

5.3.3 Diffusion equation

The diffusion equation used in the present context describes the propagation of neuronal activity from one neighbor to another along time. Propagation velocity is controlled by time-varying anisotropic coefficients. Membrane potential is supposed to have a decay, and neuronal activity is initiated by sustained inputs. The anisotropic diffusion equation corresponding to these assumptions is written as follows (adapted from Cohen and Grossberg 1984):

∂b(x, y, t)/∂t = −M·b(x, y, t) + div(c(x, y, t)·∇b(x, y, t)) + X(x, y, t)    (5.14)

where b(x, y, t), M, c(x, y, t), and X(x, y, t) stand respectively for the membrane potential, the decay, the anisotropic conductance coefficient, and the sustained input. The operators div and ∇ indicate, respectively, the divergence and the gradient with respect to the space variables. Note that if c(x, y, t) is a constant, this equation reduces to the isotropic heat diffusion equation.
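A discrete step of this equation can be sketched on a 1-D lattice with an explicit Euler scheme. The 1-D restriction, the zero-flux boundaries, and the default values of M and dt are choices of this sketch, not of the text; the conductance c lives on the links between neighboring cells, so setting c = 0 on a link blocks diffusion across it (an edge).

```python
import numpy as np

def diffusion_step(b, X, c, M=0.1, dt=0.1):
    """One explicit Euler step of equation (5.14) on a 1-D lattice:
    db/dt = -M*b + div(c * grad b) + X, with insulating (zero-flux)
    boundaries.  `c` has one value per link between neighboring cells."""
    grad = np.diff(b)                                     # grad b on the links
    flux = c * grad                                       # c * grad b
    div = np.diff(np.concatenate(([0.0], flux, [0.0])))   # divergence per cell
    return b + dt * (-M * b + div + X)
```

With M = 0 and X = 0 the step conserves the total activity while smoothing it, and with c = 0 everywhere the activity is left untouched, which is how the conductance coefficients control where smoothing occurs.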

Convergence of equation (5.14) is demonstrated in Appendix 12 in accordance with properties already stated in Briffod and Burgi (1991). Its capacity to enhance object contours and smooth