
6.2 Generalization to other features

6.2.2 Geometric selectivity

Two geometric feature detectors are described and used. They correspond to visual receptive fields (RFs) whose shape and response (weight profile) mimic those found in the LGN (isotropic stage) and in the visual cortex (anisotropic stage). Concentric on-center off-surround LGN cells are modeled by the difference of two Gaussians (DOG). These cells are known to enhance contrasts and to respond weakly to uniform surfaces. For the visual cortex, end-stopped cells were chosen for their ability to be excited either by line ends or by corners (von der Heydt 1987) as well as for their selectivity to curvatures (Dobbins et al. 1989). They are described by the difference of two elliptic Gabor functions (for Gabor functions, see Marcelja 1980; Daugman 1983) at the same position and with identical orientations, one with a small receptive field (small component) and the other with a large receptive field (large component).
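As an illustration of the isotropic stage, the following minimal sketch builds a difference-of-Gaussians weight profile with NumPy. The function name `dog_kernel` and the values of `size`, `sigma_center` and `sigma_surround` are assumptions chosen for illustration; they are not the WR, AR and MR parameters of the thesis, whose exact equations are given in the appendix. Each Gaussian is normalized to unit sum, which is one possible choice that makes the response to a uniform surface vanish.

```python
import numpy as np

def dog_kernel(size=15, sigma_center=1.5, sigma_surround=3.0):
    """Difference of two unit-sum Gaussians sampled on a size x size grid.

    All parameter values are illustrative assumptions, not thesis values.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2

    def gaussian(sigma):
        g = np.exp(-r2 / (2.0 * sigma**2))
        return g / g.sum()  # normalize so that the weights sum to 1

    # On-center (narrow Gaussian) minus off-surround (wide Gaussian).
    return gaussian(sigma_center) - gaussian(sigma_surround)

kernel = dog_kernel()
print(kernel.shape)         # grid size of the weight profile
print(kernel[7, 7] > 0)     # excitatory center
print(kernel[0, 0] < 0)     # inhibitory surround
print(abs(kernel.sum()))    # close to zero: weak response to uniform surfaces
```

With this normalization the weights sum to zero up to rounding, so a uniform image produces no net drive, in line with the weak response to uniform surfaces mentioned above.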

Parameters are adapted from those found in Dobbins et al. (1989). Equations describing these receptive fields can be found in Appendix 13.

To evaluate the latencies corresponding to various stimuli, the weighting functions of the RF are convolved with the image containing the stimulus. The results of the convolution are then converted into rates of firing, and latencies are computed as in § 4.1.3.

The leaky integrator model presented in § 4.1.3 was conceived with only one excitatory input. If more than one input is considered, this model is applicable only if the spike trains are in total synchrony, which corresponds to identical signal amplitudes. In such a case, it is possible to use the convolution operator and then to convert its response into a rate of firing to calculate the latency. Without this assumption (identical signal amplitudes), nonlinearity pervades, and a convolution followed by a latency conversion is no longer correct. In such circumstances an extension of the electrical circuit presented in § 4.1.3 would be necessary, with, among other things, an inhibitory channel conductance for the negative weights of the weighting profile. Furthermore, the model would contain as many inputs as there are image elements in the spatial span of the RF, and each channel (either excitatory or inhibitory) would have to integrate its own train of pulses. Such a model is described, for instance, in Wilson and Bower (1989).
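The convolve-then-convert procedure can be sketched as follows for a binary stimulus (lines at 1, background at 0), the case in which the text states that the convolution is applicable. This is a minimal sketch under stated assumptions: the conversion of § 4.1.3 is not reproduced here, so the latency is taken as the time a generic leaky integrator dV/dt = (-V + r)/tau needs to reach a threshold under a constant drive r, that is t = -tau ln(1 - threshold/r). The names `convolve2d_same` and `latency_ms`, the values of `tau_ms` and `threshold`, and the 3x3 zero-sum kernel are all hypothetical.

```python
import numpy as np

def convolve2d_same(image, kernel):
    """Plain 2-D convolution with zero padding; output has the image's shape."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def latency_ms(response, tau_ms=10.0, threshold=0.5):
    """Time for a leaky integrator driven by `response` to reach `threshold`.

    Assumed model (not the thesis circuit): V(t) = r * (1 - exp(-t / tau)).
    Cells whose drive never reaches the threshold get an infinite latency.
    """
    response = np.asarray(response, dtype=float)
    lat = np.full(response.shape, np.inf)
    active = response > threshold
    lat[active] = -tau_ms * np.log(1.0 - threshold / response[active])
    return lat

# Binary stimulus: a vertical line of value 1 on a background of 0.
image = np.zeros((9, 9))
image[:, 4] = 1.0

# Hypothetical 3x3 zero-sum center-surround weights standing in for the RF.
kernel = -np.ones((3, 3)) / 8.0
kernel[1, 1] = 1.0

response = convolve2d_same(image, kernel)
latencies = latency_ms(response)
print(latencies[4, 4])  # finite latency for a cell centered on the line
print(latencies[4, 0])  # infinite latency on the uniform background
```

In this sketch a stronger response always yields a shorter latency, and a cell that sees only uniform background never fires. Inhibitory (negative) responses are simply treated as silent, which is a simplification of the multi-channel circuit discussed above.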

6.2.3 Results for corner and curvature selectivity

Results of these evaluations confirm the existence of latencies specific to geometry. In particular, key points, such as corners and junctions (Figure 6.14), appear first, whereas diffuse light and edges evoke longer latencies and thus appear later. This fundamental result is valid for both kinds of RFs.

CHAPTER 6. Asynchronous Visual Processing 11. Applications

[Figure 6.14 panels, labeled with their latencies: lat = 5.28 ms, lat = 8.0 ms, lat = 10.44 ms]

Figure 6.14: Latencies of geometric features using the receptive field of an LGN-like cell (see equations in Appendix 13). Illustration of the temporal precedence of corners over simple lines when LGN-like cells are used. The whole image has a constant intensity: lines have value 1, whereas the background is set to 0; the convolution is therefore applicable. As expected, key points (junctions and corners) appear first. Circles show the loci of the activated cells. Filter parameters are: WR = 2.5, AR = 1.0, MR = 18.0 and size = 15. Electrical parameters are as in Figure 4.4 and Figure 4.5.

Furthermore, results obtained from end-stopped cells show that varying the radius of curvature of a line yields correspondingly varying latencies (Figure 6.15). The variation of responses (in terms of rate of firing) with the radius of curvature is extensively discussed in Dobbins et al. (1989). The supplementary step performed here relates rates of firing to latencies. It is intended to show that the curvature matched to the RF sizes has a short latency, while curvatures differing from this optimal value show steep increases in latency. Therefore, the more bent a line is, the shorter the latency of the cell, provided that the radius of curvature stays above a certain size related to the size of the components. Sharper curves are thus favored (in terms of temporal precedence). An application of this temporal selectivity to curves is shown in Figure 6.16.

Similarly to end-stopped cells, LGN cells appear to have shorter latencies for chevrons with smaller angles between the sides of the V-shape. In this case, however, the function relating latency to angle seems to be non-monotonic, and further studies are necessary.


[Figure 6.15 plot: latency versus radius of curvature; horizontal axis "Radius of curvature", ticks from 0 to 100]

Figure 6.15: Latency of end-stopped cells versus various radii of curvature of line segments. Filter parameters (see equation A2.4) are: WR = 2.5, AR = 3, MR = 4.3; large component = 25 and small component = 11 (see equations in Appendix 4). The stimulus was centered on the peak of the positive lobe of the RF. Electrical parameters are as in Figure 4.4 and Figure 4.5.

6.2.4 Other feature selectivity

In section 4.3, various visual features were shown to have specific latencies. Apart from luminance, they have not been studied in this thesis. Nevertheless, some predictions can be made for the utilization of other sources of latency, such as:

* color: similarly to luminance, spectral frequency would yield latency differences and thus result in an increase in transinformation, which would help in segmentation tasks;

* spatial frequency: precedence of low frequencies or of the global aspect of an object (psychologically valid, e.g. Navon 1977; Miller 1981; Boer and Keuss 1982; Hughes et al.);

* texture: Gabor filters are known to be texture selective (e.g. Porat and Zeevi 1989). The corresponding latencies could be used in the same way as for luminance and color to alleviate the problem of segmentation.

In a more general way, latency of visual features is linked to the receptive field selectivity.

Introducing a conversion of the output responses of the isotropic filtering into latencies in the feedforward and recursive models would thus allow, for instance, the temporal geometric selectivity to be taken into account. For higher visual features, other levels of these architectures would have to be considered, but on the whole the principle would remain the same. Nevertheless, an extensive study would be necessary to generalize the asynchronous approach to higher visual functions (see § 7.3.4).


Figure 6.16: (A) Profile of a mouse. Various radii of curvature allow testing of the latencies of the end-stopped cells.

(B) Loci and orientations of the activated cells. The whole image has a constant intensity: lines have value 1 whereas the background is set to 0, and again the convolution is applicable (see explanations in 6.2.2). We have seen before that sharper curves should be favored. Moreover, in order to increase the sensitivity of cells over lines, a directional selectivity was added using different angles in equation (A88). Angles were varied between 0° and 170° in steps of 10°. Ellipses indicate the loci of the activated end-stopped cells. The simulation was stopped at time 8.52 ms, when key points are effectively detected. Filter parameters are WR = 2.5, AR = 3, MR = 4.8; large component = 15 and small component = 7. Electrical parameters are as in Figure 4.4 and Figure 4.5.

6.3 Conclusions

6.3.1 Matching human perception

Resolution of the figure-ground separation problem for very low SNRs and for iso-average images indeed demonstrated an increase in performance when visual information is treated asynchronously. Furthermore, for this particular problem, and using the recursive model, latencies confer properties to the model which match those found in human perception. An example is the plateau which characterizes the stable performance in detecting the foreground, even for very low SNRs, until a certain critical value beyond which subjects cannot decide with certainty whether a foreground is present or not.

Whether the human visual system really benefits from the asynchrony yielded by luminances should now be tested experimentally on subjects. A methodology for testing the effect of this asynchrony is presented in the general conclusions (§ 7.3.2).

6.3.2 Asynchrony: a general concept

Generalization of the concept of asynchrony to other visual features has been shown to be possible if receptive field responses at every level of analysis are considered. Interestingly, specific geometric patterns, that is to say key points, known to be particularly important in perception for coping with occluding contours (von der Heydt and Peterhans 1989b) as well as with shape perception (Attneave 1954), were found to precede (temporally) other visual features such as lines and surfaces.

Supposing latencies in the early visual pathway do exist, they must be present at every level along the hierarchy of the visual system. Consequently, the latency specificity must likewise increase when going higher up in the levels. Optican and Richmond (1987) studied the temporal modulation of spike trains in the primate inferior temporal cortex (see § 4.3.6). They indeed found latencies specific to different two-dimensional Walsh stimuli, corroborating the belief that luminance, spatial, and other visual features may yield asynchrony in the human visual system.

There is a whole eristic at the base of heuristics, a whole dialectic of the false and the true at the origin of our judgments of experience. An attempt at synthesis always founds its success in opposition to earlier failures. The cause cannot, by its essence, be the object of an intuition. For, since the idea of the effect must be more complex than the idea of the cause, the differential of novelty that appears from cause to effect must be the object of a discursive thought, of an essentially dialectical thought. Intuition can no doubt, after the fact, bring some light; it then has the force of a rational habit; but it cannot illuminate the original inquiry. Before intuition, there is astonishment.

Gaston Bachelard, La dialectique de la durée, 1950.

CHAPTER 7 Overall conclusions

Understanding the human visual system is an ambitious enterprise which necessitates knowledge from various domains. The methodology adopted in this thesis was, on the one hand, to get acquainted with the current and available knowledge in anatomy and physiology of the early visual system of primates and, on the other hand, to apply the reductionist principle to analyze large cortical structures on the basis of elementary units. In particular, two neuron models were studied and their mathematical descriptions were illustrated by equivalent electrical circuits; this allowed a reduction from the understanding of cortical functions to the study of simple electrical components.

In order to summarize what has been accomplished in this work, to ponder some deeper questions relating to asynchrony, and to propose some intuitively founded future developments, this last chapter is divided into three sections.

7.1 Accomplishments

7.1.1 Elementary units

As pointed out in the introduction (section 1.1), the choice of the model of elementary unit, especially the accuracy with which biological mechanisms are described, directly affects the function of the neuronal structures composed of such units. Whereas for small structures it is possible to use a very accurate biological model, this possibility becomes unrealistic for the larger structures typically found in the visual cortex. Also, since the aim of this work was to introduce a new concept, asynchrony, a coarse model of the early visual system was found to be sufficient for that task. In particular, the temporal precedence of neuronal responses, a direct consequence of asynchrony, was shown to have a potent effect if adequately used.


7.1.2 Neuronal latency

Based on an RC circuit, the relationship between the frequency of a spike train and latency has been established. Analogous relationships have also been discussed and demonstrated to exist for the photoreceptors and for a model of the retina. While nonlinearity pervades these relationships, for the sake of simplicity they have been reduced to linear functions. The effects of such an oversimplification remain to be determined.

7.1.3 Architectures

The feedforward and recursive architectures both confirmed that asynchrony, stemming from differences in signal amplitudes, helps to solve a specific, yet easily generalizable, problem. The first architecture (feedforward) was shown to be restricted to extracting edges in an image and, thus, was not representative of human perception, which can sense the surface of objects. For that reason, a diffusive stage was added to this architecture to remedy this limitation (forming the second architecture). Given that visual information was considered to arrive asynchronously, this architecture required a feedback to continuously reevaluate the coefficients needed by the diffusive stage. In consequence, this architecture was referred to as recursive. The performance of this model was found to be satisfactory in the tasks considered. Moreover, its ability to explain perceptual tasks involving static images that synchronous models fail to analyze effectively represents a major improvement in the understanding of the early visual system.

7.1.4 Dynamic transinformation

When using an asynchronous model, an extension of the notion of transinformation to asynchronous signals has given a satisfactory explanation of the increase in performance of information extraction for images strongly corrupted by Gaussian and uniform noise. The dynamic transinformation has been applied to two symbols representing, for instance, a background and a foreground. In CHAPTER 6 the problem of segmentation has also been addressed, representing an extension of the figure-ground separation problem to more than two symbols. The preliminary results of such a segmentation also indicate an increase in performance. Nevertheless, to obtain more satisfactory results from complex scenes, a better edge estimation will have to be designed.

7.2 Dissertation on asynchrony

7.2.1 A useful concept

The concept of dynamic transinformation aimed at proving that a synchronous approach can compromise the performances of a task by mixing all information originally temporally structured. Perceptual performance of the human visual system can indeed be better matched by an asynchronous approach. Henceforth, it should not be ignored that a static image creates a dynamic data flow. Furthermore, if the temporal precedence applies to the early part of visual processing, it might also be questioned whether higher visual function could not benefit from it. This point is further discussed in section 7.3.

Apart from the processing of static images, the winner-take-all neuronal structure also demonstrated the importance of the temporal precedence. Thanks to this structure, it could be concluded that the dynamical properties of a neuronal structure are dependent on time differences in the arrival of input signals. In particular, when the strongest signals appear first, the solution was shown to settle faster.

The temporal precedence has, nevertheless, the disadvantage of being demanding in terms of computer resources. By contrast, the brain is not penalized by the time dimension, as its neuronal structures are wired and continuously process the data flow. The only temporal limitation of neuronal structures is the rate of temporal variations applied to their inputs, a limitation which has its origin in the leaky integrator nature of neurons.

7.2.2 Ontological and philosophical arguments

Whether two events can emerge simultaneously, or in synchrony, may be questioned on the basis that the notion of simultaneousness depends on the scale of analysis, and thus on the sampling time. The notion of synchrony contains an artificiality intended for conceptualizing a reality which often escapes our comprehension. Reintroducing asynchrony in models is thus justified only if this notion introduces new concepts. Those presented in this thesis (i.e. dynamic data flow and temporal precedence) have, I hope, been convincingly shown to be well-founded.

If asynchrony is accepted as being natural, as opposed to synchrony, its origin must be sought. Signals with differing amplitudes or strengths were demonstrated to be one cause of asynchrony. In particular, it was found that the higher the amplitude, the shorter the latency of neuronal responses. Does physics dictate this relation? In terms of energy, this assumption appears reasonable. Indeed, the notion of amplitude exists only through the flux of energy which, in turn, is dependent on time. The higher the amplitude of a signal, the higher the flux of energy and, thus, the shorter the time required to transmit a fixed amount of energy.
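The energy argument can be written out in one line. The symbols below are introduced here for illustration only and do not appear in the thesis: E is the fixed amount of energy to be transmitted, Φ the energy flux (assumed constant over the transmission and increasing with signal amplitude), and T the transmission time.

```latex
E = \int_0^{T} \Phi \, dt = \Phi \, T
\quad\Longrightarrow\quad
T = \frac{E}{\Phi},
\qquad
\Phi_1 > \Phi_2 \;\Longrightarrow\; T_1 = \frac{E}{\Phi_1} < \frac{E}{\Phi_2} = T_2 .
```

Under this constant-flux assumption, the transmission time is inversely proportional to the flux, which is the sense in which a higher amplitude implies a shorter time.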

Less obvious is the reason why neuronal structures should always benefit from such a relationship and not from the opposite one, in which the higher the amplitude, the longer the neuronal latency. This latter, artificial situation has been tested on the winner-take-all neuronal structure, resulting in a net increase in the convergence time needed to reach a stable state. The consequence of this increase is obviously a decrease in the performance of the function carried out by this structure.

Survival of species depends on their aptitude to react quickly to fast perceptual variations or to strong sensations. Both situations imply sources of considerable energy, and thus it seems logical that evolution took advantage of a faster processing for signals arriving first, that is to say, those corresponding to a greater amount of energy. It is not the habit of nature to evolve against the laws that physics dictates, and the question of temporal precedence, a result of asynchrony, seems once more to confirm this fact.

7.3 Future work

7.3.1 Introduction

In the context of this work, I see three major domains where future work is particularly indispensable. First, regarding psychological aspects, the concept of asynchrony should be tested on the human visual system. Second, regarding the dynamics of neuronal structures, a solution to the contour completion problem should be sought. Third, in computer vision, solutions for analyzing complex images have not yet been found. Some aspects of these three domains are successively discussed in § 7.3.2, § 7.3.3, and § 7.3.4.

7.3.2 Psychological experiments

The hypothesis of the existence of a visual data flow was judged to be well-founded in view of the satisfactory results obtained from the recursive model. Nevertheless, the validity of this model remains uncertain. One way of deciding this issue is to test asynchrony on the human visual system. To this end, I propose the following experiment.

The hypothesis of asynchrony and, thus, of the existence of a visual data flow, is taken as valid, and its origin is taken to lie in luminance differences. Consequently, when a subject looks, for instance, at the iso-average figure/ground image defined in CHAPTER 6, he will be able to distinguish the figure from the background only (this is a hypothesis) because the higher luminance values are perceived first by his early visual system. In this line of thought, if all regions of this image are forced to reach the early visual system in synchrony, the perception of the figure should disappear. For this experiment to be possible, I propose first to determine for every luminance value its intrinsic latency, and second to artificially delay the perception of regions according to their luminance.

To illustrate this procedure, consider the iso-average image described in Figure 6.8. In this image there are 5 regions of different luminance values. Imagine that the latencies measured for these luminances in a human subject are those found in the table in Figure 7.1. Then, the column "AT", for absolute time, would specify when a region of a given luminance should appear on the screen. The effect of this procedure would be that all visual information is received in synchrony (in this case after 40 ms). Comparing performance between normal perception (all regions of the image appearing at once on a screen) and artificial perception (regions of lower luminance appearing first, followed by those of higher luminance, according to their latency) would allow a valuable evaluation of the initial hypothesis.
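The compensation procedure described above can be sketched in a few lines. This is a hedged illustration: the function name `onset_times_ms`, the region labels and the latency values are hypothetical and are not those of Figure 7.1. The sketch assumes that the absolute onset time of a region is the longest latency in the set minus the latency of that region, so that onset time plus latency is the same for every region.

```python
def onset_times_ms(latencies_ms):
    """Absolute on-screen onset time per region (assumed rule: the longest
    latency minus the region's own latency, so all regions are perceived
    at the same instant)."""
    target = max(latencies_ms.values())
    return {region: target - lat for region, lat in latencies_ms.items()}

# Hypothetical latencies (ms) for five luminance levels, darkest (L1) to
# brightest (L5). These are NOT the measured values of Figure 7.1.
measured = {"L1": 40.0, "L2": 35.0, "L3": 30.0, "L4": 25.0, "L5": 20.0}

schedule = onset_times_ms(measured)
for region, at in sorted(schedule.items(), key=lambda kv: kv[1]):
    print(region, "appears at", at, "ms; perceived at", at + measured[region], "ms")
```

With these illustrative numbers the darkest region is displayed first and the brightest last, and every region is perceived at the same instant, which is the synchrony the experiment is meant to enforce.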

From the document: Understanding the early human visual system through modeling and temporal analysis of neuronal structures