
Bayesian modelling of visual attention in word recognition: simulating optimal viewing position


HAL Id: hal-02004341

https://hal.archives-ouvertes.fr/hal-02004341

Submitted on 1 Feb 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Bayesian modelling of visual attention in word recognition: simulating optimal viewing position

Thierry Phénix, Sylviane Valdois, Julien Diard

To cite this version:

Thierry Phénix, Sylviane Valdois, Julien Diard. Bayesian modelling of visual attention in word recognition: simulating optimal viewing position. 19th Conference of the European Society for Cognitive Psychology (ESCOP), 2015, Paphos, Cyprus. ⟨hal-02004341⟩


BRAID: Bayesian word Recognition with Attention, Interference and Dynamics

Conclusion

Lexical Knowledge, Visual Attention, Visual Perception

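$$
\begin{aligned}
& P\left(W^{0:T}\; L^{1:T}_{1:N}\; \lambda L^{1:T}_{1:N}\; A^{1:T}\; C^{1:T}_{1:N}\; P^{0:T}_{1:N}\; \lambda P^{1:T}_{1:N}\; \Delta I^{1:T}_{1:N}\; I^{1:T}_{1:N}\; S^{1:T}_{1:N}\right) \\
&= \left[ P(W^0) \prod_{n=1}^{N} P(P^0_n) \right]
\prod_{t=1}^{T} \left[
\begin{array}{l}
P(W^t \mid W^{t-1}) \prod_{n=1}^{N} \left[ P(L^t_n \mid W^t)\, P(\lambda L^t_n \mid L^t_n\, P^t_n) \right] \\
P(A^t) \prod_{n=1}^{N} \left[ P(C^t_n \mid A^t)\, P(P^t_n \mid P^{t-1}_n\, C^t_n)\, P(\lambda P^t_n \mid P^t_n\, I^t_n) \right] \\
\prod_{n=1}^{N} \left[ P(S^t_n)\, P(\Delta I^t_n)\, P(I^t_n \mid S^t_{1:N}\, \Delta I^t_n) \right]
\end{array}
\right]
\end{aligned}
$$

Mathematical formulation of the joint distribution of the model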

Distribution definitions

• Internal representation for the word “GARNI”.

• Percept built on the stimulus “GARNI” after 10 iterations.

• Recognition of stimulus “AIRE”. Massive effect of frequency due to competition with “DIRE”.

• Attentional profile with mean = 2 and standard deviation = 1.

Variables

• $S^t_n$: letter at position n in the stimulus at time t

• $I^t_n$: interferences at position n and time t

• $\Delta I^t_n$: interference weights at position n and time t

• $P^t_n$: dynamic percept at position n and time t

• $A^t$: attention over percepts at time t

• $C^t_n$: attention control of percept n at time t

• $L^t_n$: letter of internal word at position n and time t

• $W^t$: internal word representation at time t

• $\lambda L^t_n$: coherence variable between P and L at position n and time t

• $\lambda P^t_n$: coherence variable between I and P at position n and time t

Univ. Grenoble Alpes, CNRS, LPNC, F-38000, Grenoble, France

[1] Bosse, Tainturier & Valdois (2007): Developmental dyslexia: the visual attention span deficit hypothesis. Cognition 104 (2) 198-230

[2] Lobier, Dubois & Valdois (2013): The role of visual processing speed in reading speed development. PLoS ONE 8 (4)

[3] Bessière, Mazer, Ahuactzin & Mekhnacha (2013): Bayesian Programming. CRC Press, Boca Raton, Florida

[4] Montant, Nazir & Poncet (1998): Pure Alexia and the Viewing Position Effect in Printed Words. Cognitive Neuropsychology 15 (1/2) 93-140

[5] Townsend (1971): Alphabetic confusion: A test of models for individuals. Perception & Psychophysics 9 (6) 449-454

[6] Pelli, Tillman, Freeman, Su, Berger & Majaj (2007): Crowding and eccentricity determine reading rate. Journal of Vision 7 (2) 20-36

Experiment: simulating the OVP Effect

Introduction

• Word Recognition is the cornerstone of reading. It is a dynamic process that emerges from the interaction between low-level visual processing of input letters and the activation of memorized orthographic knowledge.

• Visual Attention (VA) is critical for processing multiple elements simultaneously. VA capacity constrains the VA span (the maximum number of items identified in parallel), and thus the number of letters that can be processed simultaneously when reading. This capacity is highly limited (4-5 items) [1,2].

Current word recognition models do not include visual attention as a key mechanism.

How to model the role of Visual Attention in Word Recognition?

• BRAID (Bayesian word Recognition with Attention, Interference and Dynamics). BRAID is a probabilistic word recognition model that incorporates control of attention resources, lateral interference between visual inputs (crowding effect), and temporal dynamics of information processing, in addition to bottom-up letter identification and top-down orthographic knowledge.

• OVP Effect (Optimal Viewing Position): word recognition performance varies as a function of fixation location within the word. Can BRAID simulate the OVP function?

propose to resolve the apparent contradiction between impaired letter processing and implicit reading in terms of two reading systems. The first system operates in the damaged left hemisphere and is responsible for explicit laborious identification of words. The second system operates in the right hemisphere and supports fast covert reading.

The Present Study

To describe further the nature of the reading deficit that characterises pure alexia, in the present study we investigated the reading ability of a pure alexic patient within an experimental paradigm that has been shown to elicit an idiosyncratic pattern of reading performance in normal readers. This paradigm consists of measuring recognition performance for briefly presented words while the eyes are fixating different locations in the word (the experimental technique is illustrated in Fig. 1). Under such experimental conditions, a viewing position effect is obtained for normal readers: Word recognition performance is best when the word is fixated slightly left of its centre and decreases as fixation position deviates either leftwards or rightwards from this “optimal viewing position”. Figure 2 gives a characteristic viewing position curve obtained in a word identification task for seven-letter words. The viewing position effect is observed for short as well as for long words and generalises over different alphabetic languages and reading tasks (e.g. Brysbaert & d’Ydewalle, 1988; Brysbaert, Vitu, & Schroyens, 1996; Farid & Grainger, 1996; Nazir, 1993; Nazir, Heller, & Sussman, 1992; Nazir, Jacobs, & O’Regan, in press; Nazir et al., 1991; O’Regan & Jacobs, 1992; O’Regan, Lévy-Schoen, Pynte, & Brugaillère, 1984). A mathematical model, which provides a good description and quantification of the prototypical shape of the viewing position curve (Nazir et al., 1991), served to interpret the deviating reading performance of the patient. The model is described next.

A Model to Account for the Viewing Position Effect

Given the strong acuity drop-off in parafoveal vision, the number of letters that benefit from high resolution differs considerably as a function ...

Fig. 1. The paradigm of the variable viewing position in words. A fixation point appears at the centre of the computer screen. After a short duration, the fixation point is replaced by a word. A brief exposure duration of the word is adopted to prevent participants from making eye movements. The word appears at different positions relative to the fixation point, such that the directly fixated part of the string can systematically be manipulated from trial to trial. Eye movements are not measured.

PURE ALEXIA AND THE VIEWING POSITION EFFECT

COGNITIVE NEUROPSYCHOLOGY, 1998, 15 (1/2) 95

Behavioral Experiment (Montant, Nazir & Poncet 1998) [4]

Computational simulation

The present data clearly show that the reading system of the patient is functioning at least partially. CP exhibited a strong frequency effect and a clear viewing position effect, although the shape of the viewing position curve was not of the classic type. Like normal readers, CP was able to process a string of letters during one single fixation, provided that he fixated towards the second half of the word.

According to the model, asymmetries in the viewing position curve are caused by differences in the ability to identify letters in the right and left visual field (Nazir et al., 1991). A shift of the optimal viewing position to the right of the centre of the word, as observed with CP, indicates that letter processing is impaired in the right visual field. Given that perimetric testing did not reveal major visual anomalies in CP’s right visual field, this impairment cannot stem from a pure visual deficit but must be related to difficulties in processing complex visual forms.

EXPERIMENT 2

To discern potential differences in CP’s general capacity to process complex visual stimuli in the right visual field and the left visual field, CP was asked to match the identity of two simultaneously presented letters. The letters consisted of either physically identical pairs (BB), nominally identical pairs (Bb), or nonidentical pairs (BJ). The stimuli were presented either in central vision or in the left or right visual field.

Fig. 6. Mean percentage of correct word identification for CP (left panel) and normal participants (right panel) as a function of word length (five- to nine-letter words) and fixation position in words (Experiment 1). For CP, each data point corresponds to 50 measures.


Word recognition task with variable viewing position:

After looking at a fixation point for 1 s, a word is presented to the participant for a short duration, which prevents any eye movement, and is shifted horizontally according to the desired viewing position.

Word length factor:

Word length varies from 5 to 9 letters, with a total of 250 words (50 per length).

Viewing Position factor:

Each word was divided into 5 equally wide zones. The center of each zone served as the initial fixation position.
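The zone layout described above can be sketched in a few lines. This is only an illustration of the stated design (5 equally wide zones, fixation at each zone center); the function name `fixation_positions` and the choice of measuring positions in letter units from the left edge of the word are assumptions, not details given in the source.

```python
def fixation_positions(word_length, n_zones=5):
    """Return the centre of each of n_zones equal-width zones, in letter units.

    Positions are measured from the left edge of the word (an assumption),
    so 0.5 is the centre of the first letter and word_length - 0.5 the
    centre of the last letter.
    """
    zone_width = word_length / n_zones
    return [(k + 0.5) * zone_width for k in range(n_zones)]


# Word lengths used in the experiment: 5 to 9 letters.
for length in range(5, 10):
    print(length, [round(p, 2) for p in fixation_positions(length)])
```

For 5-letter words each zone coincides with one letter; for longer words the zone centers fall between letter centers.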

• The viewing position effect was significant for all lengths (p < .001).

• Performance was optimal when words were fixated slightly left of their center (p < .001).

• Performance was highly sensitive to word frequency [t(248) = 5.27, p < .001].

BRAID is the first word recognition model designed with structured probabilistic modeling [3]. Letter identification is a temporally dynamic process, building up a percept distribution by accumulation of sensory evidence. It relies on a letter confusion matrix [5] and a temporal decay parameter. The weighted fusion of letter neighbors’ distributions allows flexibility in letter position coding and further accounts for crowding. Attention modulates letter processing and enhances letter identification under attentional focus. Acquired orthographic knowledge (lexical database of 36,000 words) serves as a top-down influence during letter identification.

• BRAID integrates noisy position coding, which allows recognizing a word even if some of its letters are transposed or illegible. The same mechanism accounts for crowding effects [6] (outer letters are better recognized within strings).

• BRAID integrates word frequency as a prior. From a dynamic point of view, frequency can be seen as the resting state of the word distribution: it is the starting point of the dynamic recognition process, but also the value to which the distribution returns if the input stimulus is removed.

• BRAID integrates an attentional component, computed as a distribution over the percepts, that controls the amount of attention allocated to each percept. In the dynamic process, attention prevents decay of information on letter identity. Typically, the distribution is Normal: its mean corresponds to the viewing position and its standard deviation delimits the VA span.
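As a rough illustration of the attentional component, the sketch below discretises a Normal distribution over letter positions and uses it to gate the decay of a letter percept. The source only states that attention is Normal and that it prevents decay; the function names, the decay-toward-uniform rule and the `leak` parameter are assumptions made for this example, not the model's actual equations.

```python
import math


def attention_profile(n_letters, viewing_position, span_sd):
    """Discretised Normal over letter positions 1..n_letters, normalised to sum to 1."""
    weights = [
        math.exp(-0.5 * ((n - viewing_position) / span_sd) ** 2)
        for n in range(1, n_letters + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def decay_percept(percept, attention, leak):
    """Move a letter distribution toward uniform; attention reduces the leak.

    percept   : probabilities over the alphabet for one letter position
    attention : attention weight in [0, 1] for that position
    leak      : hypothetical decay rate in [0, 1] applied when attention is 0
    """
    uniform = 1.0 / len(percept)
    effective_leak = leak * (1.0 - attention)
    return [(1.0 - effective_leak) * p + effective_leak * uniform for p in percept]


# Profile with mean = 2 and standard deviation = 1 over a 5-letter string.
profile = attention_profile(n_letters=5, viewing_position=2, span_sd=1)
print([round(a, 3) for a in profile])
```

With this gating rule, a position receiving full attention keeps its percept unchanged, while an unattended position drifts back toward a uniform distribution over letters.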

Confusion Matrix (on the left):

Letter identification is based on a confusion matrix from Townsend [5]. However, this matrix was obtained with an exposure duration of around 50 ms. We modified the matrix using a Laplace succession law to simulate the results at shorter durations.
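One plausible reading of this modification is additive (Laplace) smoothing of each row of the confusion matrix, sketched below. The exact formula used by the authors is not given in the source, so the function `laplace_smooth`, the `pseudo_count` parameter and the toy row are assumptions for illustration only.

```python
def laplace_smooth(confusion_row, pseudo_count):
    """Apply additive (Laplace) smoothing to one row of a confusion matrix.

    confusion_row : response probabilities (or counts) for one stimulus letter
    pseudo_count  : hypothetical smoothing parameter; 0 only normalises the row,
                    larger values flatten it toward a uniform distribution
    """
    k = len(confusion_row)
    total = sum(confusion_row) + pseudo_count * k
    return [(x + pseudo_count) / total for x in confusion_row]


row = [0.8, 0.1, 0.1]  # toy 3-letter alphabet, not Townsend's data
print(laplace_smooth(row, 0.0))
print(laplace_smooth(row, 0.5))
```

A larger pseudo-count makes each letter less discriminable, which is one way to mimic the weaker sensory evidence available at shorter exposure durations.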

Parameters evaluation (on the right):

Each cube represents the level of identification as a function of the decay (leak), crowding and Laplace succession law parameters. Red corresponds to at least 95% identification.

Preliminary results

• On the left, percentage of recognition as a function of viewing position (OVP curve) for the word “PLANETE”.

• On the right, dynamics of recognition for each viewing position. The OVP curve should vary depending on the duration considered.

• A more systematic study is underway.

• We demonstrate the potential of BRAID to simulate the OVP effect. It is the first word recognition model able to do so.

• Why can BRAID simulate OVP effects? Because BRAID is the first word recognition model that incorporates attention as a key component. Attention allows focusing on different letter positions within the letter string.

• BRAID provides new insights into OVP effects by simulating the temporal dynamics of word recognition. BRAID predicts variations of OVP effects depending on the dynamics of the system, with potentially different effects at different points in time.

$P(W^0)$: Represents the resting state of the internal word activation. Typically, it is based on word frequency.

$P(W^t \mid W^{t-1})$: Represents the transition of internal word activation from time t-1 to t. We implement a memory decay to the resting state.

$P(I^t_n \mid S^t_{1:N}\, \Delta I^t_n)$: Represents the crowding effect. We implement only direct neighbor interferences. The result is weighted by $P(\Delta I^t_n)$.

$P(\lambda P^t_n \mid P^t_n\, I^t_n)$: Guarantees that $P^t_n$ and $I^t_n$ have the same value during the computation.

$P(S^t_n)$: Represents prior knowledge about the sensory stimulus. Typically, we use a uniform distribution.

$P(P^t_n \mid P^{t-1}_n\, C^t_n)$: Represents the transition of percept n from time t-1 to t. Decay here is controlled by attention, $P(C^t_n \mid A^t)$. Allocating attentional resources blocks decay.

$P(\lambda L^t_n \mid L^t_n\, P^t_n)$: Guarantees that $L^t_n$ and $P^t_n$ have the same value during the computation.

$P(A^t)$: Represents the distribution of attention over percepts. We use a Gaussian distribution centered on the viewing position. Its standard deviation models the Visual Attention Span.

$P(L^t_n \mid W^t)$: Represents orthographic knowledge.
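To make the word-level terms concrete, here is a minimal sketch of one time step that combines a decay toward the resting state with a Bayesian update from letter evidence. It is a toy under stated assumptions: the mixture form of the decay, the `leak` parameter, the function name and all numbers are hypothetical, and the lexicon is reduced to the two competitors mentioned in the figure caption.

```python
def step_word_distribution(previous, resting, letter_likelihood, leak):
    """One time step of a toy word-level update.

    previous          : dict word -> probability at time t-1
    resting           : dict word -> resting-state probability (e.g. relative frequency)
    letter_likelihood : dict word -> likelihood of the current percepts given the word
    leak              : hypothetical decay rate in [0, 1] toward the resting state
    """
    # Transition: decay toward the resting state (assumed mixture form).
    predicted = {w: (1.0 - leak) * previous[w] + leak * resting[w] for w in previous}
    # Observation: weight each word by how well it explains the percepts, then normalise.
    unnormalised = {w: predicted[w] * letter_likelihood[w] for w in predicted}
    total = sum(unnormalised.values())
    return {w: v / total for w, v in unnormalised.items()}


# Toy two-word lexicon with made-up numbers (not taken from the poster).
resting = {"AIRE": 0.1, "DIRE": 0.9}
likelihood = {"AIRE": 0.6, "DIRE": 0.4}
state = dict(resting)
for _ in range(10):
    state = step_word_distribution(state, resting, likelihood, leak=0.1)
print({w: round(p, 3) for w, p in state.items()})
```

In this toy, the stimulus word gains probability at each step, but the decay term keeps pulling the distribution back toward the frequency-based resting state, so a high-frequency competitor slows recognition.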

