WARNING. Access to the contents of this doctoral thesis is conditional on acceptance of the terms of use set by the following Creative Commons license: https://creativecommons.org/licenses/?lang=en (Catalan version: http://cat.creativecommons.org/?page_id=184; Spanish version: http://es.creativecommons.org/blog/licencias/)
Genís Prat Ortega
Attractor dynamics in perceptual decision making: from theoretical predictions to psychophysical experiments
Doctoral thesis
Doctoral Programme in Mathematics, Universitat Autònoma de Barcelona
Directors: Jaime de la Rocha, Alex Roxin
Tutor: Tomás Alarcón
September 2019
Acknowledgements
I would like to thank my supervisors, Jaime and Alex, for their support throughout these years, and Klaus, who has been involved in my thesis since the first day.
I would also like to thank Tobias and Niklas for helping me carry out the experiments in chapter 4.
To all my colleagues at IDIBAPS, CRM and UKE, who have made my life much easier.
Also to my parents and my sister, who are always, always there and never fail!
Table of contents
Summary
1. Introduction
Canonical models of perceptual decision making
Psychophysical kernel
Psychophysical Kernel for the canonical models
Drift diffusion model with absorbing bounds: the standard cognitive model
Biophysical models of perceptual decision making
One dimensional neurobiological models of perceptual decision making
Classical view of perceptual decision making
Thesis goals
2. Flexible categorization in perceptual decision making
Summary
Results
Changing the stimulus statistics reveals the dynamics of the decision variable
Decision accuracy in models of evidence integration
Consistency in models of evidence integration
Spiking network
Testing model predictions in a perceptual decision task
Discussion
Methods
Model simulations
Psychophysical kernel
Normalized psychophysical kernel area and primacy-recency index
Accuracy for the DWM
Asymptotic analysis
Psychophysical data and model fitting
Network of integrate-and-fire neurons
3. The double well model in tasks involving working memory and decision making
Summary
Results
Impact of the winner take all dynamics in the accumulation of evidence
Impact of temporal gaps in models with winner take all dynamics
Double well model combines stimulus evidence from two pulses
Alternative model with urgency and undecided state
Discussion
Methods
Double well model fitting
Limitations of the double well model fitting
Alternative model and comparison with the double well
4. Modifying the magnitude of stimulus fluctuations to identify different neural mechanisms
Summary
Results
The “single dot” task
The accuracy of subjects does not show a systematic non-monotonic relation with the strength of stimulus fluctuations
Idiosyncratic behaviour in the dot tasks
Testing the model based analysis with synthetic data
Extended neurobiological model to explain idiosyncratic behaviour
Preliminary results of the extended neurobiological model
Discussion
Methods
Psychophysical Experiment
Primacy-recency index
Model fitting to the psychophysical data
Model fitting to synthetic data
5. Conclusions
Supplementary figures
Bibliography
Summary
From time to time, humans and animals must respond to a stimulus that can be ambiguous. In the old days, before the introduction of the video assistant referee, football referees had a very hard life. One of the most famous plays in football history is the so-called “La mano de dios”, in which Maradona used his hand to score a goal in the quarter-final match of the 1986 World Cup. Based on what he saw, the referee incorrectly decided that Maradona had not touched the ball with his hand, and Argentina ended up winning the World Cup. Decisions based on external stimuli (in this case visual) are what we call perceptual decisions.
In this thesis, we studied how the brain makes perceptual decisions in experimental settings where subjects have to make a categorical decision about a certain feature of the presented stimulus. We studied the case where the stimulus is presented for a certain time controlled by the experimenter. During the stimulus presentation, the subjects have to accumulate evidence and when the stimulus ends, they need to choose between two possible alternatives. These experiments are typically called 2-alternative forced choice tasks (2AFC).
From a computational point of view the accumulation of stimulus evidence in 2AFC tasks has been studied intensively in the last decades. Canonical approaches to model this cognitive function are based on diffusion processes that assume bounded or unbounded perfect stimulus evidence accumulation. However, the relationship of such models with the underlying neural circuitry is unclear. In this thesis, we study the accumulation process in neurobiological models with attractor dynamics. Such models can actually be reduced to a nonlinear diffusion process, which in the case of binary categorizations can be described by a double well potential (DW). Despite the fact that the canonical and the neurobiological models rely on different mechanisms, they can account for various behavioral aspects such as performance or reaction time.
The first aim of this thesis was to derive behaviourally testable predictions of attractor dynamics during a 2AFC task and compare them with models that assume other kinds of dynamics (e.g. perfect integration). We found two signatures of attractor dynamics that can be tested in behavioural experiments. Specifically, we found that: 1) the DW model had different integration regimes, from transient (primacy) to leaky (recency), as the magnitude of the stimulus fluctuations (σS) or the stimulus duration (T) increased and 2) the DW model had a non-monotonic relation between the accuracy and the magnitude of the stimulus fluctuations.
The second and final aim of this thesis was to test, qualitatively and quantitatively, the existence of attractor dynamics. To this end, we designed an experiment in which we systematically modified the magnitude of the stimulus fluctuations. Qualitatively, we could not identify obvious signatures of attractor dynamics that would allow us to distinguish between different models. However, we quantitatively assessed the attractor dynamics and other plausible neural mechanisms by developing a model-based analysis. Preliminary results suggest that attractor dynamics can be important to explain the behavioural results in at least a fraction of subjects.
1. Introduction
I would like to start my thesis with an insight: humans make decisions. Some of them are important, whereas others are less so. For example, today while I was writing this paragraph I made a critical decision for my life: I decided that it was time to clean up the blankets that had been on a chair in my bedroom for the last three months. It is funny because today (September 6, 2019) was the first day during these three months that it was actually a little bit chilly at night. But do not blame me, humans do not make optimal decisions (Barth, Cordes, and Patalano 2018), or maybe they do (Brunton, Botvinick, and Brody 2013), who knows? Well, I should probably stop procrastinating and start writing the introduction.
Spain has not had a government since the last elections in April. Today, two parties that in principle want to implement similar progressive policies are negotiating to form it. It seems, however, that they will not reach an agreement before the deadline to call for new elections. If this finally happens, many former voters of the two parties are going to vote for the party that they consider not responsible for this disaster. Thus, they will have to decide, based on very little and manipulated information, which of the two parties is more responsible for not reaching an agreement. Although studying this type of decision is a very interesting problem, it is not well suited for studying how brain circuits carry out these computations, because voters can base their decision on multiple sources of information which are hard to control experimentally. In contrast, in perceptual decision making, subjects must base their decisions on sensory information that is completely controlled by the experimenter. A famous experiment is the motion discrimination task, where subjects must decide if a cloud of dots is moving towards the right or the left. Because experimenters can control all the details of the stimuli, the subject’s responses, etc., we can try to infer how the brain is using these stimuli to make choices.
The canonical approaches to model perceptual decision making describe the dynamics of the decision process with a simple heuristic linear equation. In contrast, biophysical attractor models are complex and high dimensional, but they explain the dynamics of the decision process based on realistic neural circuits. Here, we explain that attractor biophysical models can be reduced to a one dimensional nonlinear equation. The main goal of the thesis is to find experimentally testable signatures of non-linear attractor dynamics in perceptual decision making and test them in a psychophysical experiment. To analyze the psychophysical experiments, we are extending the attractor models to include different plausible neural mechanisms and we are developing a fitting procedure to fit all these mechanisms by maximum likelihood.
In this introduction, we first summarize the canonical and biophysical models for perceptual decision making. Then, we explain what a psychophysical kernel is and why it is important for understanding how the brain uses the stimulus to make decisions. Finally, we discuss why the findings of our work are important.
Canonical models of perceptual decision making
In perceptual decision making, the standard approaches to model the dynamics of sensory evidence accumulation during two alternative forced choice tasks (2AFC) are one-dimensional drift diffusion processes. Specifically, the diffusion process describes the dynamics of the decision variable (X). This decision variable represents the moment-by-moment evidence for one alternative over the other. The mean sensory evidence is generally modeled as a constant drift in this integration, which also has a noise term reflecting fluctuations in the sensory input and internal noise in the decision process. We define the canonical models as three drift diffusion models with different boundary mechanisms: 1) The drift diffusion model with absorbing boundaries (DDMA) integrates the stimulus evidence until the decision variable reaches an absorbing boundary. 2) The drift diffusion model with reflecting bounds (DDMR) integrates the stimulus evidence with an upper limit imposed on the accumulated evidence for one of the choices. 3) The perfect integrator (PI) perfectly integrates the stimulus evidence without bounds. In the three models, the choice is given by the sign of the decision variable at the end of the trial (figure 1 a). The decision variable X therefore obeys the simple stochastic equation
$$\tau \frac{dX}{dt} = \mu + \sigma\,\xi(t) \qquad (1)$$
where μ is the drift and σ is the standard deviation of the noise, modeled as a Gaussian stochastic process ξ with zero mean, unit variance and no temporal correlations (white noise). For the purpose of this thesis, we separate the noise into two sources: 1) the internal noise (σI), produced by the intrinsic variability of neurons and by inputs to the decision circuit from other brain regions, and 2) the stimulus fluctuations (σS), which are proportional to the variance of the stimulus and thus are controlled by the experimenters. We write the total noise as the sum of a stimulus-dependent term and internal noise: $\sigma\,\xi(t) = \sigma_S\,\xi_S(t) + \sigma_I\,\xi_I(t)$.
Figure 1 | Canonical models of perceptual decision making
a) Example of single traces of the decision variable for the canonical models. Top: in the perfect integration model the decision variable is simply the integral of the stimulus. Middle: The drift diffusion model with absorbing barrier (DDMA) perfectly integrates the evidence until it commits to a decision when the decision variable reaches one of the bounds. Bottom: The drift diffusion model with reflecting barrier perfectly integrates the evidence during the entire trial, but the reflecting boundaries introduce an upper limit to the accumulated evidence towards one of the choices. b) Psychophysical kernels for the three diffusion models. Without bounds the model weights all the evidence equally. With bounds the model shows primacy for absorbing and recency for reflecting boundaries. c) Psychophysical kernel diagram for a toy model that perfectly integrates the evidence of the first half of the trial but omits the second half. Top: In red (blue), stimuli that give rise to a left (right) choice. Note that in the first half of the trial, the blue stimuli tend to be positive whereas the red stimuli tend to be negative. This structure is lost in the second half. Middle: The distributions of the left and right stimuli during the first half are separable (area under the ROC curve 0.64), whereas during the second half both distributions are equal.
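Equation 1, together with the three boundary mechanisms described above, can be sketched numerically with a simple Euler-Maruyama scheme. The snippet below is only an illustration under assumed values: the function name `simulate_canonical`, the parameter values (drift, noise, bound, time step) and the reflection rule are choices made here for the example, not the simulation settings used in the thesis.

```python
import numpy as np

def simulate_canonical(model, mu=0.5, sigma=1.0, tau=1.0, T=1.0, dt=1e-3,
                       bound=0.5, n_trials=4000, seed=0):
    """Euler-Maruyama integration of tau * dX/dt = mu + sigma * xi(t).

    model: "PI" (no bounds), "DDMA" (absorbing) or "DDMR" (reflecting).
    All default parameter values are illustrative assumptions.
    Returns the decision variable of every trial at time T.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    x = np.zeros(n_trials)
    absorbed = np.zeros(n_trials, dtype=bool)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_trials)
        step = (mu * dt + sigma * np.sqrt(dt) * noise) / tau
        if model == "PI":
            x = x + step
        elif model == "DDMA":
            # Trials that already hit a bound stay frozen (commitment).
            x = np.where(absorbed, x, x + step)
            absorbed |= np.abs(x) >= bound
        elif model == "DDMR":
            # Mirror any excursion beyond a bound back into [-bound, bound].
            x = x + step
            x = np.where(x > bound, 2 * bound - x, x)
            x = np.where(x < -bound, -2 * bound - x, x)
        else:
            raise ValueError(f"unknown model: {model}")
    return x

final = {m: simulate_canonical(m) for m in ("PI", "DDMA", "DDMR")}
# The choice is the sign of X at the end of the trial; mu > 0 means "right" is correct.
accuracy = {m: float(np.mean(x > 0)) for m, x in final.items()}
print(accuracy)
```

With a positive drift, all three models choose the correct alternative more often than chance, but the bounded models discard part of the evidence and are therefore expected to be less accurate than the perfect integrator for the same stimulus.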
The dynamics in equation 1 can be recast as the motion of a particle in a potential φ(X):
$$\tau \frac{dX}{dt} = -\frac{d\varphi}{dX} + \sigma\,\xi(t) \qquad (2)$$
where φ(X) = −μX (figure 1 a). The conceptual advantage of using a potential relies on the fact that, ignoring the noise term, the dynamics of the decision variable always lead the potential to decrease in time. Therefore, starting from an initial position on the potential, the decision variable will always “roll downward”, with only the noise fluctuations causing possible motion upward along the potential. For the canonical models described in equation 2, the potential is a linear ramp. In the case of absorbing boundaries, it has two vertical cliffs and for reflecting boundaries two vertical walls (figure 1 b insets). It is worth noting that the form of the potential depends both on the properties of the stimulus and on the intrinsic dynamics of the integration process itself. Specifically, the steepness of the linear ramp μ is just the mean evidence in favor of one of the two possible choices; it is therefore an exogenous parameter. On the other hand, the potential boundaries are a property of the intrinsic dynamics of the model and hence do not depend, in principle, on the stimulus. To explain the effect of the bounds on the dynamics of the decision process, we first need to define what a psychophysical kernel is.
Psychophysical kernel
In perceptual decision tasks, subjects make choices based on stimulus evidence. One of the goals in systems neuroscience is to reveal how the brain uses the stimulus evidence to guide our behaviour. A useful analytical tool for this purpose is the psychophysical kernel (PK) (Neri and Heeger 2002). In tasks where the stimulus duration is controlled by the experimenters, the PK measures the average time-course of the impact of stimulus fluctuations on choice. Classically, the PK has been computed by averaging the stimuli that give rise to a right and a left choice separately and subtracting them (Roozbeh Kiani, Hanks, and Shadlen 2008; Okazawa et al. 2018). We are going to compute the PK as the temporal evolution of the separability between the stimulus distributions that give rise to left and right choices. This method has the advantage that the magnitude of the PK is interpretable and comparable between different experiments (see methods, figure 2). To illustrate how the PK works, we build a toy model that perfectly integrates the evidence of the first half of the stimulus but omits the second half. For this simple example, the PK is able to recover the integration dynamics of the model (figure 1 c).
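The PK definition above, applied to the toy model, can be sketched as follows. This is a minimal illustration rather than the analysis code of the thesis: the helper names `auc` and `psychophysical_kernel`, the number of trials and frames, and the use of zero-mean unit-variance Gaussian stimuli are assumptions made for the example. The area under the ROC curve is computed here through the equivalent Mann-Whitney rank statistic.

```python
import numpy as np

def auc(pos, neg):
    """Area under the ROC curve, i.e. P(pos > neg), from the rank-sum statistic."""
    values = np.concatenate([pos, neg])
    ranks = np.empty(len(values))
    ranks[np.argsort(values)] = np.arange(1, len(values) + 1)
    u = ranks[:len(pos)].sum() - len(pos) * (len(pos) + 1) / 2
    return u / (len(pos) * len(neg))

def psychophysical_kernel(stimuli, right):
    """Separability of right- and left-choice stimuli in every frame.

    stimuli: array (n_trials, n_frames); right: boolean array, True = right choice.
    """
    return np.array([auc(stimuli[right, t], stimuli[~right, t])
                     for t in range(stimuli.shape[1])])

rng = np.random.default_rng(0)
n_trials, n_frames = 20000, 10            # illustrative sizes
stimuli = rng.standard_normal((n_trials, n_frames))

# Toy model: perfectly integrate the first half of the trial, ignore the second half.
right = stimuli[:, : n_frames // 2].sum(axis=1) > 0

pk = psychophysical_kernel(stimuli, right)
print(np.round(pk, 2))
```

A value of 0.5 means that the stimulus in that frame carries no information about the choice, so the kernel of the toy model should sit clearly above 0.5 in the first half of the trial and close to 0.5 in the second half.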
Figure 2 | Psychophysical kernel method
First, the stimuli are categorized by the model or by a subject; then we compute the distributions of left and right stimuli in each frame. The psychophysical kernel is the temporal evolution of the separability between the right and left stimuli. We compute this separability with the area under the ROC curve.
In psychophysical experiments where the duration of the stimulus is controlled by the experimenters, three qualitative classes of PK have mainly been reported: primacy (Roozbeh Kiani, Hanks, and Shadlen 2008; Nienborg and Cumming 2009) or recency (Wyart, Myers, and Summerfield 2015; Cheadle et al. 2014), when the subjects tend to give more weight to early or late evidence respectively, and flat PKs, when the subjects weight the stimulus equally over time (Brunton, Botvinick, and Brody 2013). The PK is a powerful analytical tool and it can be useful for different analyses. For instance, it has been used to study the temporal evolution of the impact of stimulus fluctuations on confidence judgements (Zylberberg, Barttfeld, and Sigman 2012). Other studies have used the PK to motivate their hypotheses. For example, in (Wimmer et al. 2015), the authors propose top-down signals as a possible mechanism to explain a primacy PK together with constant choice probabilities in sensory areas. In another study, PKs from high and low confidence trials have been used to compare different models (Kawaguchi et al. 2018). Finally, in a more recent study, PKs were used to show that rats can optimally discount information in a dynamic environment where old information may no longer be relevant to the state of the world (Piet, El Hady, and Brody 2018).
All these examples show the potential of the PK to reveal the dynamics of the decision making process, but the limitations of the PK have also been pointed out in (Okazawa et al. 2018). For example, the PK reflects both sensory and decision processes, and they cannot be distinguished with the PK alone. Thus, a PK showing primacy can be explained by adaptation in sensory neurons (Yates et al. 2017) or by decision related mechanisms such as the accumulation to bound (Yates et al. 2017; Wimmer et al. 2015; Roozbeh Kiani, Hanks, and Shadlen 2008).
Here, we are going to use the PK to understand how the statistical properties of the stimulus and the model parameters can change the dynamics of the decision making process. We start by explaining the impact of the decision boundaries in the canonical models on the PK.
Psychophysical Kernel for the canonical models
In the canonical models described above, the differences in the intrinsic dynamics introduced by imposing absorbing or reflecting bounds, or removing the bounds altogether, strongly influence the way the stimulus impacts the upcoming decision. Using the PK, we can unequivocally characterize the integration dynamics resulting from each of the canonical models: 1) the perfect integrator, for which there are no bounds (figure 1 a, top), 2) the DDM with absorbing boundaries (figure 1 a, middle) and 3) the DDM with reflecting boundaries (figure 1 a, bottom). In the case of the perfect integrator, the stimulus fluctuations are weighted equally during the entire duration of the trial, leading to a flat PK (figure 1 b, top). When there are absorbing boundaries, fluctuations late in the trial are unlikely to affect the decision, and therefore the PK shows a “primacy” effect (figure 1 b, middle). On the other hand, a “recency” effect is seen when the boundaries are reflecting, because early fluctuations are largely forgotten once a boundary is reached (figure 1 b, bottom).
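The three kernel shapes described above can be reproduced with a small frame-based simulation. The sketch below is an illustration under assumptions chosen here: stimuli are zero-mean Gaussian frames with no internal noise, the bound is set to 2, and the reflecting boundary is approximated by clamping the decision variable to the bound after every frame. The helper names `auc` and `model_choices` are hypothetical.

```python
import numpy as np

def auc(pos, neg):
    """Area under the ROC curve, i.e. P(pos > neg), from the rank-sum statistic."""
    values = np.concatenate([pos, neg])
    ranks = np.empty(len(values))
    ranks[np.argsort(values)] = np.arange(1, len(values) + 1)
    u = ranks[:len(pos)].sum() - len(pos) * (len(pos) + 1) / 2
    return u / (len(pos) * len(neg))

def model_choices(stimuli, model, bound=2.0):
    """Accumulate the frames of every trial and return True for a right choice."""
    n_trials, n_frames = stimuli.shape
    x = np.zeros(n_trials)
    committed = np.zeros(n_trials, dtype=bool)
    for t in range(n_frames):
        if model == "DDMA":
            # Absorbing bound: stop integrating once the bound has been reached.
            x = np.where(committed, x, x + stimuli[:, t])
            committed |= np.abs(x) >= bound
        elif model == "DDMR":
            # Reflecting bound, approximated by clamping the accumulated evidence.
            x = np.clip(x + stimuli[:, t], -bound, bound)
        else:
            # Perfect integrator: no bounds.
            x = x + stimuli[:, t]
    return x > 0

rng = np.random.default_rng(1)
stimuli = rng.standard_normal((20000, 20))   # illustrative sizes

kernels = {}
for model in ("PI", "DDMA", "DDMR"):
    right = model_choices(stimuli, model)
    kernels[model] = np.array([auc(stimuli[right, t], stimuli[~right, t])
                               for t in range(stimuli.shape[1])])
    print(model, "first frame:", round(kernels[model][0], 2),
          "last frame:", round(kernels[model][-1], 2))
```

Comparing the first and last frames of each kernel gives a quick check of the qualitative pattern: similar values for the perfect integrator, a larger first-frame value for absorbing bounds, and a larger last-frame value for reflecting bounds.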
Drift diffusion model with absorbing bounds: the standard cognitive model
Canonical models such as the DDM have proven very successful in fitting key features of animal behavior in perceptual decision making tasks. The most common diffusion model is the version with absorbing bounds (DDMA). The bounds provide a mechanism to commit to a decision, which is useful to model reaction time tasks. In this type of task, the stimulus is present until subjects make a choice. The reaction time is the time from the stimulus onset until the decision is made.
The DDMA can be viewed as an implementation of the sequential probability ratio test (Wald and Wolfowitz 1948). This test minimizes the number of random variables that need to be observed to distinguish the true generative distribution between two possible options with a certain level of accuracy controlled by the bounds. For each observed random variable, the decision variable is updated with the difference between the log-likelihoods of the two alternatives until the decision variable reaches the upper or the lower bound. In the DDMA, the decision variable is updated at each time step with the instantaneous evidence from the stimulus until the decision variable reaches one of the bounds. The two processes are equivalent if the stimulus is the difference between the log-likelihoods of the left and right choices.
Thus, in the DDMA, larger bounds produce more accurate but slower decisions. This property makes the DDMA a suitable model to study the speed-accuracy trade-off. It has been shown that under time pressure, subjects can trade accuracy for speed, something that can be accounted for in the DDM by decreasing the bounds (Duncan Luce 1991; Smith and Ratcliff 2004).
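The sequential probability ratio test and the resulting speed-accuracy trade-off can be illustrated with a short simulation. The example assumes two Gaussian hypotheses with equal variance and means `mu0` and `mu1`; these distributions, the function names `sprt` and `speed_accuracy`, and all numerical values are assumptions made for illustration and do not come from the thesis.

```python
import numpy as np

def sprt(samples, mu0, mu1, sigma, bound):
    """Sequential probability ratio test between N(mu0, sigma) and N(mu1, sigma).

    The decision variable is the accumulated log-likelihood ratio; the test
    stops when it reaches +bound (choose H1) or -bound (choose H0).
    """
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # log N(x; mu1, sigma) - log N(x; mu0, sigma)
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2
        if llr >= bound:
            return "H1", n
        if llr <= -bound:
            return "H0", n
    return "undecided", len(samples)

def speed_accuracy(bound, n_trials=2000, seed=0):
    """Accuracy and mean number of samples when H1 is the true distribution."""
    rng = np.random.default_rng(seed)
    correct, n_used = [], []
    for _ in range(n_trials):
        samples = rng.normal(0.25, 1.0, size=1000)   # generated under H1
        choice, n = sprt(samples, mu0=-0.25, mu1=0.25, sigma=1.0, bound=bound)
        correct.append(choice == "H1")
        n_used.append(n)
    return float(np.mean(correct)), float(np.mean(n_used))

results = {bound: speed_accuracy(bound) for bound in (1.0, 3.0)}
for bound, (acc, n_mean) in results.items():
    print(f"bound={bound}: accuracy={acc:.2f}, mean samples={n_mean:.1f}")
```

Raising the bound makes the test wait for more samples before committing, which is the same trade-off between accuracy and decision time that the bounds of the DDMA control.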
One of the great advantages of the DDM is that, due to its simplicity (e.g. the small number of parameters), it can easily be fitted to reproduce the responses of a subject (Ratcliff and Rouder 1998; Ratcliff and Tuerlinckx 2002). In order to correctly fit the accuracy and the reaction time distributions (for reaction time tasks) for error and correct trials with the DDMA, we need to add two extra parameters. The first one is a non-decision time that models the stimulus encoding time. It is an additive lag parameter that is added to the diffusion reaction time. The second one is a bias in the starting point, which is the value of the decision variable before the stimulus onset. In addition, we also need to introduce trial-to-trial variability in three different parameters: 1) the drift rate (μ), which produces longer reaction times for error trials (Ratcliff 1978), 2) the starting point, which produces faster reaction times in error trials (Ratcliff and Rouder 1998), and 3) the non-decision time, which helps to fit the distributions of reaction times (Ratcliff and Tuerlinckx 2002). An urgency signal (e.g. collapsing bounds or a gain parameter) has also been proposed as a possible mechanism to account for the reaction time distributions (Drugowitsch et al. 2012; Thura et al. 2012), but see also (Hawkins et al. 2015).
Although the absorbing bounds are a mechanism originally proposed to capture the moment in which subjects commit to a decision in reaction time tasks, the DDMA has also been applied to fixed duration tasks (Roozbeh Kiani, Hanks, and Shadlen 2008). Consistent with the DDMA, in a motion discrimination task in which the stimulus duration (T) is variable and monkeys have to wait for stimulus offset to respond, the accuracy increases with the square root of T for short trials but plateaus when the trials are long enough (figure 4). For short trials, the evidence is perfectly integrated because the decision variable never reaches the bounds. In this case, the accuracy increases with the square root of T because the mean accumulated evidence increases with T, whereas the standard deviation increases with the square root of T. For long stimuli, the decision variable reaches the bounds easily and the accumulation of evidence finishes before the stimulus does. Consequently, the accuracy tends to plateau (Roozbeh Kiani, Hanks, and Shadlen 2008).
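The square-root argument for short trials can be written out explicitly from equation 1. The derivation below is a sketch under assumptions stated here: the decision variable starts at X(0) = 0, the drift μ is positive (so a positive X(T) is the correct choice), the bounds are never reached, and W(T) and Φ denote a standard Wiener process and the standard normal cumulative distribution function, neither of which is named in the text.

```latex
% Integrating equation 1 over a stimulus of duration T, with X(0) = 0 and no bounds:
X(T) = \frac{\mu T}{\tau} + \frac{\sigma}{\tau}\, W(T), \qquad W(T) \sim \mathcal{N}(0, T).

% The mean grows linearly with T and the standard deviation with \sqrt{T}:
\mathbb{E}[X(T)] = \frac{\mu T}{\tau}, \qquad
\mathrm{SD}[X(T)] = \frac{\sigma \sqrt{T}}{\tau}.

% The choice is correct when X(T) > 0, so the accuracy depends on T only through \sqrt{T}:
P(\text{correct}) = \Phi\!\left(\frac{\mathbb{E}[X(T)]}{\mathrm{SD}[X(T)]}\right)
                  = \Phi\!\left(\frac{\mu \sqrt{T}}{\sigma}\right).
```

Under these assumptions the time constant τ cancels out, and the signal-to-noise ratio μ√T/σ is the only quantity that determines the accuracy of the perfect integrator.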
The popularity of the DDMA has been growing in the last two decades, not only because it can fit the accuracy and the mean reaction time but also because the activity of neurons in decision areas resembles a diffusion process with absorbing bounds. Various studies (Roitman and Shadlen 2002; Roozbeh Kiani, Hanks, and Shadlen 2008) have shown that the firing rate of neurons in several associative brain areas ramps up with the stimulus presentation and plateaus after 400−500 ms. Similar to the bound in the DDMA, the saturation firing rate is independent of the stimulus strength. Nevertheless, the DDMA is still a cognitive model and it cannot explain from a mechanistic point of view how the ramping and the plateaus of the firing rate are produced.
Despite this limitation, many studies have used the DDMA to perform model based analyses and to try to relate the model parameters to different mechanisms. For instance, several experiments have shown that the drift, the non-decision time or the bounds can change with age (Ratcliff, Thapar, and McKoon 2004; Thapar, Ratcliff, and McKoon 2003; Starns and Ratcliff 2010). Different mechanisms of bias, such as an initial offset or an asymmetric drift rate, have also been compared under the DDMA framework (Gold et al. 2008; Urai et al. 2019; Mulder et al. 2012). Finally, confidence judgements (Roozbeh Kiani, Corthell, and Shadlen 2014) or changes of mind (Resulaj et al. 2009) have been studied with different versions of the DDMA. For all these reasons, the drift diffusion model with absorbing barriers is the standard cognitive model in perceptual decision making: it is simple, can be easily fit to data and can be related to neuronal activity. However, the relation between the DDMA and the neural mechanisms remains at best heuristic, and biophysical models are needed to uncover the neural mechanisms underlying perceptual decision making.
Biophysical models of perceptual decision making
During the last two decades many studies have addressed the question of how a circuit made of neurons with a time constant of tens of milliseconds can give rise to cognitive processes such as working memory or decision making, which operate on a time scale of seconds (Amit and Brunel 1997; Compte et al. 2000; Brunel and Wang 2001; Wang 2002). In these studies, it has been proposed that this could be solved by strong recurrent connections mediated by NMDA receptors. For illustration purposes, let us study a simple linear network model with a single neural population described by its averaged firing rate r. Without recurrent connections, the firing rate decays with the synaptic time constant and its dynamics are governed by:
$$\tau_{syn} \frac{dr}{dt} = -r \qquad (3)$$
When we introduce recurrent connections the firing rate evolves according to the equation
$$\tau_{syn} \frac{dr}{dt} = -(1 - w_{rec})\, r \qquad (4)$$
The effective time scale of the network can be estimated by τnet = τsyn/(1 − wrec), where wrec is the strength of the recurrent connections. If the strength of the recurrent connections is wrec = 1, then all possible firing rates are stable and the system becomes a line attractor. In the case where wrec < 1, the network evolves towards an attractor state of r = 0 with the effective time constant τnet. If we consider NMDA receptors with long synaptic time constants (τsyn = 100 ms) and wrec = 0.9, the effective time scale is τnet = 1 s (Wang 2008; Goldman, Compte, and Wang 2009).
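The effect of the recurrent strength on the time scale of this linear network can be checked numerically. The sketch below integrates the rate equation with recurrent connections using a forward Euler scheme and measures the time the rate takes to decay to 1/e of its initial value; the function names, the initial rate and the time step are illustrative assumptions, while τsyn = 100 ms and wrec = 0.9 are the values quoted in the text.

```python
import numpy as np

def simulate_rate(w_rec, tau_syn=0.1, r0=1.0, T=3.0, dt=1e-4):
    """Forward Euler integration of tau_syn * dr/dt = -(1 - w_rec) * r."""
    n_steps = int(round(T / dt))
    r = np.empty(n_steps + 1)
    r[0] = r0
    for k in range(n_steps):
        r[k + 1] = r[k] + dt * (-(1.0 - w_rec) * r[k]) / tau_syn
    t = np.arange(n_steps + 1) * dt
    return t, r

def decay_time(t, r):
    """First time at which the rate falls to 1/e of its initial value (inf if never)."""
    below = np.nonzero(r <= r[0] / np.e)[0]
    return float(t[below[0]]) if below.size else np.inf

measured = {}
for w_rec in (0.0, 0.9, 1.0):
    t, r = simulate_rate(w_rec)
    measured[w_rec] = decay_time(t, r)
    print(f"w_rec={w_rec}: decay time = {measured[w_rec]} s")
```

The measured decay times can be compared with τnet = τsyn/(1 − wrec): without recurrence the rate decays with the synaptic time constant, with wrec = 0.9 the decay is ten times slower, and with wrec = 1 the rate does not decay at all, as expected for a line attractor.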
A popular biophysical model for perceptual decision making with slow attractor dynamics was published in the early 2000s (Wang 2002). Wang’s attractor network consists of two pools of excitatory spiking neurons and one pool of inhibitory spiking neurons. Each of the excitatory pools is selective to one of the possible choices and has strong recurrent connections (figure 3 a). These two populations compete through the third pool of untuned inhibitory neurons. Before the stimulus presentation, the firing rates of the two excitatory populations are similar. With the stimulus presentation, the two excitatory populations receive an input proportional to the evidence for the left or right choice according to their selectivity. The firing rate of both populations ramps up until one of the populations wins the competition.
Figure 3 | Standard biophysical model for perceptual decision making
a) The standard biophysical model for perceptual decision making consists of two populations of excitatory neurons with strong recurrent activity and an untuned inhibitory population. Each of the excitatory populations represents a possible choice and they compete through the inhibitory population. b) Example of a single trial in the biophysical network. Bottom: Each of the excitatory populations receives an input proportional to the instantaneous stimulus evidence for the choice that it represents. Middle: When the stimulus is presented, the two excitatory populations start increasing their firing rate until one of them wins the competition and shuts down the other one. Top: Raster plot of each population. c) Stable firing rates for the excitatory populations when we slowly increase (squares) or decrease (crosses) the mean input. Note that there are symmetric stable firing rates where the blue population wins the competition in the winner-take-all regime (not shown).
When the stimulus ends, the network is able to maintain the decision in working memory (figure 3 b). To illustrate these dynamics, it is convenient to plot the rate of the stable states by slowly increasing and decreasing the mean input to the network (figure 3 c). When we compute the stable states by increasing the inputs (squares in figure 3 c), the undecided stable state, where the firing rates of both populations are similar, becomes unstable for strong enough inputs. In contrast, when we compute the stable states by decreasing the inputs (crosses in figure 3 c), the decision states disappear. Critically, they do not disappear for the same value of the input. As occurs in a system showing hysteresis, there is a multistable region where the decision and the undecided states are both stable. Without the stimulus, Wang’s attractor network is in the multistable region (green cross in figure 3 c). With the stimulus, the network crosses the bifurcation to the winner-take-all region and evolves towards one of the decision states. When the stimulus is removed, the network returns to the original input but remains in the decision state. Thanks to recurrent connections mediated by NMDA receptors with a long synaptic time scale, Wang’s attractor network has a long effective time scale that allows for the integration of stimuli during hundreds of milliseconds. The integration of the stimulus ends when the network reaches one of the attractors. This is consistent with the experimental results where the accuracy of two monkeys performing a direction discrimination task does not increase indefinitely with the stimulus duration (figure 4 a) (Roozbeh Kiani, Hanks, and Shadlen 2008; Roitman and Shadlen 2002). Wang’s model is also compatible with the empirical result that reaction times are longer for error than for correct trials (Roxin and Ledberg 2008; Roitman and Shadlen 2002) and with the primacy psychophysical kernels seen in motion discrimination tasks (Roozbeh Kiani, Hanks, and Shadlen 2008; Nienborg and Cumming 2009). The advantage of the biophysical models is that we can study the neural mechanisms underlying the behavioural results. Similar to neurons in the lateral intraparietal cortex (LIP), the two populations of excitatory selective neurons first ramp up together when the stimulus is presented, until one of the populations wins the competition and shuts down the other one (Roozbeh Kiani, Hanks, and Shadlen 2008). In addition, as predicted by Wang’s model, the activity of neurons in LIP is also correlated with the decision during a delay between the stimulus offset and the decision (Roozbeh Kiani, Hanks, and Shadlen 2008). In sum, the standard biophysical model is able to explain different behavioural and electrophysiological results.
Wang’s network model has had a great impact on the study of the neural mechanisms underlying perceptual decision making. In (Wimmer et al. 2015), the authors added a sensory level to the standard attractor network to show that constant choice probabilities in sensory areas can be explained by top-down inputs from decision areas. The model has also been useful for studying the role of different brain areas in cognitive functions such as decision making or working memory. For instance, in (Jaramillo, Mejias, and Wang 2019) the authors found that the pulvino-cortical pathway can control the effective connectivity within and across cortical regions and modulate attention, decision making and working memory. The coordination between different cortical areas during working memory and decision making has been studied in (Murray, Jaramillo, and Wang 2017). The authors found that prefrontal cortex should represent the decision variable more categorically than posterior
Figure 4 | Monkeys can perfectly integrate evidence for short but not for long trials.
a) Discrimination threshold versus stimulus duration for monkeys performing the dots task in a variable stimulus duration paradigm. The accuracy of the monkeys increases as in a perfect integration model for short trials and tends to plateau for longer trials. Figure from (Roozbeh Kiani, Hanks, and Shadlen 2008). b) Shape of the potential φ(X) associated with the double well attractor model (black) and the DDM with absorbing bounds (gray). The result in (a) can be explained by both the double well model and the drift diffusion model with absorbing bounds. Close to the bifurcation where the undecided state becomes unstable, the potentials of both models are similar.
parietal cortex. In recent years, it has been reported that subjects tend to repeat or alternate choices during a sequence of perceptual decisions (Akaishi et al. 2014; Braun, Urai, and Donner 2018; Fritsche, Mostert, and de Lange 2017). One possible explanation for the tendency to repeat is a slow decay of the decision state during the inter-trial interval. In other words, when the next trial starts, the firing rate of the population associated with the previous choice could still be above that of the other population. Consistent with the model predictions, it was found that the repetition bias increases or decreases when the left dorsolateral prefrontal cortex is depolarized or hyperpolarized, respectively, with transcranial direct current stimulation (Bonaiuto, de Berker, and Bestmann 2016). Different versions of the biophysical attractor model have been able to explain the neural mechanisms underlying many different behavioural and neurophysiological findings.
Line attractors (w_rec = 1) were first used in tasks that require storing parametric values in working memory, such as spatial positions on a circle (Compte et al. 2000) or frequencies (Machens, Romo, and Brody 2005). They can also be applied to tasks where a stimulus needs to be integrated. We illustrate this using the same simple linear network model from equation 3:
$$\tau_{syn} \frac{dr}{dt} = -(1 - w_{rec})\, r + I(t) \qquad (5)$$
With w_rec = 1, the mean firing rate of the population is simply the integral of the stimulus:
$$r = \tau_{syn}^{-1} \int I(t)\, dt \qquad (6)$$
In that case, the model is a neuronal implementation of the perfect integrator model described in the previous section. In the context of decision making, similar biophysical line attractor models with circular symmetry (ring attractors; Compte et al. 2000) have been used in tasks with multiple choices. In (Furman and Wang 2008), different bumps of activity compete through common inhibition to model a direction discrimination task with multiple alternatives. In a more recent study (Wei and Wang 2015), the authors used a ring attractor to model a task in which the subjects could choose between the standard left and right choices or a choice that gave a small but certain reward (Roozbeh Kiani and Shadlen 2009). After the stimulus, the authors added another bump between the two possible choices; this bump represents the small but certain reward choice and competes with the other bumps.
A complication of this type of model is the need for fine tuning of the parameters. Because the landscape must be completely flat, small asymmetries (~1%) in the connections can produce a landscape with only a small number of attractors (Brody, Romo, and Kepecs 2003;
Seung et al. 2000; Renart, Song, and Wang 2003). Proposed solutions to this problem include adding a fast negative feedback to stabilize the memory (Lim and Goldman 2013) or using bistable neurons (Koulakov et al. 2002), which change the landscape from flat to a series of small attractors. However, this mechanism disrupts the perfect integration of evidence because stimuli below a threshold value fail to perturb the small attractor and thus are not integrated. Another limitation to perfectly integrating the stimulus evidence is internal noise. In the case of a completely flat potential, all possible states are stable. Consequently, any internal noise (e.g. inputs from other brain regions or intrinsic noise of the neurons) would also be integrated by the model. This would degrade the stored values during the delay in working memory tasks.
Indeed, spatial memories do degrade over time (Funahashi, Bruce, and Goldman-Rakic 1989), and this degradation has been understood as a signature of line attractors (Wimmer et al. 2014). Biophysical line attractors provide a framework to model the perfect accumulation of evidence and to store parametric values in working memory.
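The behaviour of the linear network of equation 5 can be illustrated with a short numerical sketch. It integrates the equation with a simple Euler scheme for a perfectly tuned network (w_rec = 1) and for a network with 1% mistuning (w_rec = 0.99), and optionally adds internal noise. The function name `simulate_rate`, the parameter values (tau_syn = 0.1 s, a 0.5 s input pulse, noise_sd = 0.05) and the way the noise enters the equation are illustrative assumptions, not values taken from the models cited above.

```python
import numpy as np

def simulate_rate(w_rec, stimulus, tau_syn=0.1, dt=0.001, noise_sd=0.0, seed=0):
    """Euler integration of tau_syn * dr/dt = -(1 - w_rec) * r + I(t).

    noise_sd adds white internal noise (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    r = np.zeros(len(stimulus) + 1)
    for k, inp in enumerate(stimulus):
        drift = (-(1.0 - w_rec) * r[k] + inp) / tau_syn
        noise = noise_sd * np.sqrt(dt) * rng.standard_normal() / tau_syn
        r[k + 1] = r[k] + dt * drift + noise
    return r

dt = 0.001
tau_syn = 0.1
# 0.5 s pulse of unit input followed by a 1.5 s delay without input
stimulus = np.concatenate([np.ones(500), np.zeros(1500)])

perfect = simulate_rate(1.0, stimulus, tau_syn=tau_syn, dt=dt)
leaky = simulate_rate(0.99, stimulus, tau_syn=tau_syn, dt=dt)    # 1% mistuning
noisy = simulate_rate(1.0, stimulus, tau_syn=tau_syn, dt=dt, noise_sd=0.05)

# Equation 6: with w_rec = 1 the rate equals the integral of the input / tau_syn
integral = stimulus.sum() * dt / tau_syn

print("integral of the stimulus / tau_syn:", integral)
print("perfect integrator at the end of the delay:", perfect[-1])
print("leaky network at the end of the delay:", leaky[-1])
print("noisy perfect integrator at the end of the delay:", noisy[-1])
```

With w_rec = 1 the rate at the end of the delay matches equation 6, whereas the mistuned network slowly forgets the stored value and the noisy network drifts away from it, which is the fine-tuning and noise-accumulation problem described above.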
In contrast to the canonical models, biophysical attractor network models are high dimensional and rather complicated. Line attractors are basically perfect integrators and can easily be modelled by a one dimensional linear equation. In the next section, we explain how the biophysical attractor network model for decision making can be reduced to a simple one dimensional nonlinear equation.
One dimensional neurobiological models of perceptual decision making
Canonical models for perceptual decision making are simple and can fit several aspects of experimental data, such as performance and mean reaction times. Biophysical models provide a framework to study the neural mechanisms underlying perceptual decisions but they are rather complicated and high dimensional and hence difficult to fit. In what follows, we consider a one dimensional diffusion process for the decision variable (X) which can be formally derived from higher-dimensional neuronal models, and yet which is still described by the diffusion of a particle in a potential (equation 2) (Roxin and Ledberg 2008). In particular, inhibition-mediated winner-take-all attractor dynamics observed in network models of 2AFC contributes to the potential φ, and one finds
$$\varphi(X) = -\mu X - \frac{c_2}{2} X^2 + \frac{c_4}{4} X^4 + \frac{c_6}{6} X^6 \qquad (7)$$
where μ is proportional to the difference in inputs to the two selective excitatory populations in the biophysical network model, and thus represents the strength of the stimulus evidence.
The parameter c2 is proportional to the difference between the mean input to both populations and the input at the bifurcation. Finally, the parameters c4 and c6 are non-trivial functions of the network parameters. The temporal evolution of the stable states with the stimulus is the same as in the biophysical attractor network model for decision making. The parameter c2 controls the stable states of the network (figure 5b). Before the stimulus presentation, the decision variable remains in the undecided state because it is stable (c2 < 0). With the presentation of the stimulus, the system crosses the bifurcation (c2 > 0), the undecided state becomes unstable and the decision variable evolves towards one of the two decision states. Finally, after the stimulus, the decision states remain stable, representing a memory state that can hold the information about the decision in the absence of any external input.
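The role of c2 can be checked numerically by locating the stationary points of the potential in equation 7, i.e. the real roots of dφ/dX = −μ − c2 X + c4 X³ + c6 X⁵, and classifying them by the sign of the curvature. The sketch below is only illustrative: the helper `fixed_points` is hypothetical, and the values c4 = −1 and c6 = 1 are assumptions chosen so that the decision states coexist with the undecided state for c2 < 0; they are not taken from the network model.

```python
import numpy as np

def fixed_points(mu, c2, c4, c6):
    """Real roots of dphi/dX = -mu - c2*X + c4*X**3 + c6*X**5 and their stability."""
    # np.roots expects the coefficients from the highest to the lowest power
    roots = np.roots([c6, 0.0, c4, 0.0, -c2, -mu])
    real = np.sort(roots[np.abs(roots.imag) < 1e-9].real)
    # A stationary point is stable when the curvature d2phi/dX2 is positive
    curvature = -c2 + 3.0 * c4 * real**2 + 5.0 * c6 * real**4
    return [(float(x), bool(k > 0)) for x, k in zip(real, curvature)]

# Illustrative (assumed) coefficients; mu = 0 means no net evidence
before = fixed_points(mu=0.0, c2=-0.1, c4=-1.0, c6=1.0)   # without stimulus
during = fixed_points(mu=0.0, c2=0.1, c4=-1.0, c6=1.0)    # with stimulus

for label, points in (("c2 < 0", before), ("c2 > 0", during)):
    print(label)
    for x, stable in points:
        print(f"  X = {x:+.3f}  {'stable' if stable else 'unstable'}")
```

For these assumed coefficients, the undecided state X = 0 is stable and coexists with two stable decision states when c2 < 0, and it becomes unstable when c2 > 0, leaving only the two decision states.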
Figure 5 | Bifurcation diagram of the models with winner-take-all dynamics.
a and b) Before the stimulus, the system is in the undecided state (black solid dot). When the stimulus is presented, the undecided state becomes unstable (white dot) and the only two possible stable states are the decision states (red and turquoise dots). After stimulus offset, depending on the network parameters, either the decision states still exist (b) or the only attractor of the network is again the undecided state (a). Insets along the x-axis illustrate the landscape of the equivalent potential of the system with (right) and without (left) stimulus.
When studying the dynamics of perceptual decisions without the need for working memory (chapter 1 and 3), we will use the simpler case where the decision states are unstable after the stimulus:
$$\varphi(X) = -\mu X - \frac{c_2}{2} X^2 + \frac{c_4}{4} X^4 \qquad (8)$$
with c2, c4 > 0 (figure 5a). In chapter 2 we will investigate the case in which the perceptual categorization needs to be maintained in the absence of any input. For this, we will use the full model given by equation 7. Whether the biophysical network dynamics are correctly described by equation 7 or 8 depends on the specific choice of network parameters (Roxin and Ledberg 2008). The resulting potential has two local minima corresponding to patterns of neuronal activity which represent the two possible choices. We call this model the double well model (DWM).
The models represented by a potential of the form given in equations 7 and 8 have the advantage that they are simple, can be fitted to data and the relation between the
parameters of the potential φ(X) and the neural mechanisms in the corresponding biophysical model is well characterized (Roxin and Ledberg 2008) .
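To make the double well model concrete, the following sketch simulates the decision variable as an overdamped particle diffusing in the potential of equation 8, using the Euler-Maruyama method. The form of the stochastic equation (τ dX/dt = −dφ/dX + σ ξ(t)), the function name `simulate_dwm` and all parameter values are assumptions made for illustration; the precise form used in the thesis is the one given in equation 2.

```python
import numpy as np

def simulate_dwm(mu, c2=1.0, c4=1.0, sigma=0.2, tau=0.2, T=1.0, dt=0.001,
                 n_trials=2000, seed=0):
    """Euler-Maruyama simulation of tau * dX/dt = -dphi/dX + sigma * xi(t),
    with phi(X) = -mu*X - c2*X**2/2 + c4*X**4/4 (equation 8).

    Returns the decision variable of every trial at time T.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_trials)                    # trials start at the undecided state
    for _ in range(int(round(T / dt))):
        force = mu + c2 * x - c4 * x**3       # -dphi/dX
        noise = sigma * np.sqrt(dt) * rng.standard_normal(n_trials)
        x = x + (dt * force + noise) / tau
    return x

x_end = simulate_dwm(mu=0.2)                  # evidence favours the positive well
accuracy = np.mean(x_end > 0)                 # choice = well occupied at time T
x_null = simulate_dwm(mu=0.0, seed=1)         # no net evidence

print("fraction of correct choices (mu = 0.2):", accuracy)
print("fraction of positive choices (mu = 0):", np.mean(x_null > 0))
print("mean |X| at the end of the trial:", np.mean(np.abs(x_end)))
```

Because c2 > 0, the undecided state X = 0 is unstable and each trial falls into one of the two wells; the evidence μ tilts the potential and biases which well is reached, so the fraction of correct choices should exceed chance for μ > 0.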
Classical view of perceptual decision making
The classical view of perceptual decision making is that decision areas integrate the decision evidence provided by the stimulus represented in sensory areas up to a decision threshold.
This view is based on a series of motion discrimination experiments in monkeys (Roitman and Shadlen 2002; Roozbeh Kiani, Hanks, and Shadlen 2008; Nienborg and Cumming 2009; Britten et al. 1992; Shadlen and Newsome 2001). On the one hand, these studies found that the firing rate of neurons in sensory areas (middle temporal area, MT) was proportional to the strength of the stimulus evidence (i.e. the motion direction). On the other hand, they found that the firing rate in decision areas (lateral intraparietal cortex, LIP) ramped up with the stimulus presentation, as if these neurons were integrating the stimulus evidence from the sensory areas. In addition, they found that the slope of the ramping activity depended on the stimulus strength, but that the activity then saturated at a firing rate that was independent of the stimulus strength. This feature was indicative of the presence of absorbing bounds.
New experiments in humans, rats and monkeys have, however, challenged this view. In a landmark study, Brunton, Botvinick, and Brody (2013) used model-based behavioral analysis to show that humans and rats can perfectly integrate evidence without bounds. The experimenters used the so-called clicks task, in which two streams of Poisson auditory or visual clicks are played at different rates. The subjects had to choose the stream with the higher rate.
Another experiment using the same task has, however, recently challenged this result (Keung, Hagen, and Wilson 2019). This time the experimenters found different sources of suboptimality, including psychophysical kernels with different temporal profiles across human subjects. Consistent with perfect integration of evidence without bounds, a recent study has shown that humans can perfectly integrate the contrast of a sequence of gratings even when the stimulus is interrupted by long delays (up to 8 s) (Waskom and Kiani 2018). Although monotonically increasing psychophysical kernels showing a recency effect cannot be explained by a model with absorbing bounds, they have also been reported in direction (Cheadle et al. 2014; Wyart, Myers, and Summerfield 2015) and brightness discrimination tasks (Bronfman, Brezis, and Usher 2016). More recent studies using motion discrimination experiments have also challenged the classical framework by showing that the primacy psychophysical kernels could be more related to
sensory adaptation than to a bound mechanism (Yates et al. 2017). Other experiments have shown that the activity of neurons in decision areas is related not only to the integral of the stimulus but also to other mechanisms such as an urgency signal (Park et al. 2014; Thura et al. 2012). In summary, it seems that the classical framework of how the brain makes decisions is too simple to account for all the behavioural and electrophysiological findings in the literature. Thus, other mechanisms need to be studied to complete the classical model for decision making.
Thesis goals
During the summer of 2017, I attended the cognitive computational neuroscience summer school in Shanghai. There, we had a series of lectures about perceptual decision making, and Mike Shadlen and Xiao-Jing Wang, among others, presented their work. Surprisingly, both used the result showing the dependence of the monkeys' accuracy on the duration of the dots stimulus (see figure 4a) to support their preferred models, namely the drift diffusion model with absorbing bounds (DDMA) and the double well model (DWM), respectively. As we already explained, the DDMA and the DWM can perfectly integrate evidence for short but not for long stimulus durations. Both models can also fit the mean reaction time for error and correct trials. This is explained by the fact that in a certain parameter regime their dynamics can be similar (figure 4b) (Bogacz et al. 2006). However, the shapes of the potentials associated with these two models are in general different: while the potential of the DWM has non-linear terms, that of the DDMA is linear, with the only non-linearity represented by the decision bounds. Yet theoretical predictions that could be used to test these nonlinearities in the potentials have remained elusive.
As already mentioned, the canonical models are widely used because, in contrast to biophysical models, they are simple and easy to fit to data. In a recent review of sequential sampling models, Forstmann and co-authors wrote “Unfortunately, the Wang model is relatively complex, and at this point it is not possible to use it to fit data” (Forstmann, Ratcliff, and Wagenmakers 2016). The one-dimensional model described by equation 8 has been qualitatively fitted in reaction time tasks (Roxin and Ledberg 2008), but it is unclear how to fit these models using a more rigorous maximum-likelihood approach.
The goals of this thesis are:
● Study the non-linear dynamics of the DWM and find specific signatures of attractor dynamics that are experimentally testable. Ideally, these predictions should be able to qualitatively distinguish between the DWM and the canonical models.
● Design an experiment to qualitatively and quantitatively test the attractor dynamics in perceptual decision making. We aim to experimentally test the qualitative signatures of attractor dynamics. In addition, we want to develop a maximum-likelihood procedure to fit the DWM. The procedure should be general enough to incorporate extra mechanisms that have been shown to potentially play a critical role in perceptual decisions, such as different types of biases, adaptation or an urgency signal. Then, using a model-based approach, we aim to quantitatively differentiate between these distinct mechanisms.
2. Flexible categorization in perceptual decision making
Summary
Canonical approaches to modelling perceptual decision making are based on diffusion processes that assume bounded or unbounded perfect integration of the stimulus. Here we study the integration process in neurobiological models with winner-take-all dynamics that can be reduced to a diffusion process. The key difference between these models and the canonical ones is the shape of the potential landscape along which the decision dynamics evolve. Whereas the models with winner-take-all dynamics have a nonlinear potential, the canonical models have a linear potential, which in principle allows for perfect integration of the evidence. To study the implications of the winner-take-all dynamics for the integration of evidence, we use stimuli with different magnitudes of fluctuations. This led us to discover a new integration regime not present in canonical diffusion models, which we call flexible categorization. In this regime, fluctuations late in the trial robustly generate decision reversals by overcoming the internal attractor dynamics when the initial choice is incorrect.
One signature of winner-take-all dynamics is a non-monotonic dependence of the accuracy and the response consistency on the stimulus fluctuations. Another is a transition in the temporal weighting of the stimulus impact on the upcoming choice from primacy to recency as a function of the trial duration or the magnitude of the stimulus fluctuations. We found evidence for such a transition in data from a series of psychophysical experiments where subjects made decisions about the average brightness level of two disks for different stimulus durations T = 1, 2, 3 and 5 s.
Results
Changing the stimulus statistics reveals the dynamics of the decision variable
In typical settings, the stimulus statistics are under the control of the experimenter and can reveal the dynamics of the decision variable. We thus investigated how the psychophysical kernel (PK) for the three DDMs depends on the stimulus fluctuations (σS) while keeping the magnitude of the internal noise (σi) constant (figure 6). In order to quantify changes in the PK, we calculated both its normalized area, which is a measure of the overall impact of stimulus fluctuations on the upcoming decision (figure 6d; Methods), and a primacy-recency index, which ranges from -1 (extreme recency) to 1 (extreme primacy). For very weak stimulus fluctuations all three models are equivalent because the bounds are never reached; in particular, the PK is flat, the primacy-recency index is therefore zero (figure 6e) and the normalized area is small because the dynamics are driven by internal noise. As stimulus fluctuations increase, the PK of the DDM with absorbing bounds becomes more markedly primacy-laden, leading to an increasing primacy-recency index. In a similar vein, the primacy-recency index for the DDM with reflecting bounds decreases, indicating enhanced recency. The primacy-recency index for the perfect integrator always remains zero because the PK is always flat, and its normalized PK area increases monotonically with the ratio σS/σi. For the DDMs with bounds, the bounds are reached more often as σS increases, which impairs the integration of the stimulus, and consequently the normalized PK area decreases. In sum, the dynamics of evidence accumulation in the DDMs remain qualitatively the same when changing the strength of stimulus fluctuations.
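The qualitative shapes of the PKs described above can be reproduced with a minimal discrete-time sketch of the three DDMs. Several elements are assumptions made for illustration and may differ from the Methods: the stimulus is a sequence of 10 zero-mean Gaussian frames, the PK is estimated as a choice-triggered difference of the stimulus, and `early_late_index` is a simplified stand-in for the primacy-recency index, not the definition used in the thesis. The bound height and the noise amplitudes are likewise arbitrary.

```python
import numpy as np

def simulate_choices(stim, bound_type, bound=1.0, sigma_i=0.1, seed=1):
    """Accumulate stimulus frames plus internal noise; return choices in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n_trials, n_frames = stim.shape
    x = np.zeros(n_trials)
    absorbed = np.zeros(n_trials, dtype=bool)
    for k in range(n_frames):
        step = stim[:, k] + sigma_i * rng.standard_normal(n_trials)
        if bound_type == "absorbing":
            x = np.where(absorbed, x, x + step)    # absorbed trials ignore later frames
            absorbed |= np.abs(x) >= bound
        elif bound_type == "reflecting":
            x = np.clip(x + step, -bound, bound)   # the bound caps the decision variable
        else:                                      # "none": perfect integrator
            x = x + step
    return np.where(x >= 0, 1, -1)

def kernel(stim, choices):
    """Choice-triggered difference of the stimulus, one value per frame."""
    return stim[choices == 1].mean(axis=0) - stim[choices == -1].mean(axis=0)

def early_late_index(pk):
    """(first-half area - second-half area) / total area; > 0 suggests primacy."""
    half = len(pk) // 2
    return (pk[:half].sum() - pk[half:].sum()) / pk.sum()

rng = np.random.default_rng(0)
stim = 0.5 * rng.standard_normal((20000, 10))      # sigma_S = 0.5, zero mean evidence

indices = {}
for bound_type in ("none", "absorbing", "reflecting"):
    pk = kernel(stim, simulate_choices(stim, bound_type))
    indices[bound_type] = early_late_index(pk)
    print(bound_type, "index:", round(float(indices[bound_type]), 2))
```

Under these assumptions, the index should be close to zero for the perfect integrator, positive (primacy) for absorbing bounds and negative (recency) for reflecting bounds, in line with the description of figure 6.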
In contrast, the double well model (DWM) has a much richer dynamical repertoire as a function of stimulus fluctuation strength than the canonical models. Specifically, the presence of two potential wells allows for the possibility of sudden transitions between states. That is, the model can account for “changes of mind” ( figure 7 a) (Resulaj et al. 2009; Roozbeh Kiani et al. 2014) by virtue of stochastic transitions between distinct attracting states. For a fixed stimulus duration, such transitions become more likely as the strength of stimulus fluctuations is increased. The same effect can be achieved by extending the trial duration.
For a fixed strength of stimulus fluctuations the rate of transitions remains constant but because the trials are longer the transitions become more likely. The transition dynamics in the DWM play a major role in shaping the PK ( figure 7 b,c,d). For weak stimulus fluctuations