Dorsal Raphe Serotonergic Neurons Control Intertemporal Choice under Trade-off

Citation

Xu, Sangyu et al. “Dorsal Raphe Serotonergic Neurons Control Intertemporal Choice Under Trade-Off.” Current Biology 27, 20 (October 2017): 3111–3119 © 2017 Elsevier Ltd

As Published

http://dx.doi.org/10.1016/J.CUB.2017.09.008

Publisher

Elsevier

Version

Author's final manuscript

Citable link

http://hdl.handle.net/1721.1/118961

Terms of Use

Creative Commons Attribution-NonCommercial-NoDerivs License

Sangyu Xu1,4,†, Gishnu Das1, Emily Hueske1, and Susumu Tonegawa1,2,3,†,‡

1 RIKEN-MIT Center for Neural Circuit Genetics at the Picower Institute for Learning and Memory, Department of Biology and Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

2 Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

3 Brain Science Institute, RIKEN, Wako, Saitama, 351-0198 Japan

4 Agency for Science, Technology and Research, Singapore 138632

SUMMARY

Appropriate choice about delayed reward is fundamental to the survival of animals. While animals tend to prefer immediate reward, delaying gratification is often advantageous. The dorsal raphe (DR) serotonergic neurons have long been implicated in the processing of delayed reward, but it has been unclear whether or when their activity causally directs choice. Here we transiently augmented or reduced the activity of DR serotonergic neurons while mice decided between differently delayed rewards as they performed a novel odor-guided intertemporal choice task. We found that these manipulations, precisely targeted at the decision point, were sufficient to bidirectionally influence impulsive choice. The manipulation specifically affected choices with a more difficult trade-off. Similar effects were observed when we manipulated the serotonergic projections to the nucleus accumbens (NAc). We propose that DR serotonergic neurons preempt reward delays at the decision point and play a critical role in suppressing impulsive choice by regulating decision trade-off.

Keywords

Serotonin; raphe; intertemporal choice; decision trade-off; impulsivity; accumbens; delay discounting

†Correspondence to: xusangyu@gmail.com; tonegawa@mit.edu.

‡Lead Contact

Author Contributions:

S. T. and S. X. conceived the study; S. X., G. D. and E. H. designed the experiments. S. X. conducted the experiments. G. D. provided

HHMI Author Manuscript

INTRODUCTION

Intertemporal choice is a decision about future outcomes that are differently delayed.

Impulsive choice, the preference for more immediate but smaller rewards, is associated with a number of maladaptive behaviors [1–4] and psychiatric disorders [5–8].

Serotonin has been well studied in intertemporal choice paradigms with lesion and pharmacological methods, and is thought to promote the choice for larger, more delayed reward [9–11]. Due to the low temporal resolution of such techniques, it has been difficult to pinpoint how serotonin performs this function. We reasoned that since many intertemporal decisions are carried out before action or outcome materializes, and are often prompted by conditioned stimuli (CS), neuronal activity causally driving decision-making should occur at the cue. Indeed, recordings of optically tagged DR serotonergic neurons in a Pavlovian task showed that some of them fired phasically to reward-predicting cues [12]. We therefore hypothesized that this phasic activity of serotonergic neurons could be used to direct choice at the decision point with the presentation of a CS.

To test our hypothesis, we needed a behavioral task that isolated a clearly defined decision point [13]. To this end, we devised a novel odor-guided intertemporal choice (OGIC) task that randomizes reward delay contingencies trial-by-trial, in the fashion of existing odor-based rodent decision tasks [14–16]. Randomizing reward contingencies ensures that a decision has to be made on every trial, and the use of odor cues allows the isolation of a decision period for temporally precise neural manipulations. We then systematically tested mouse subjects on decisions between a large reward and a small one that were differently delayed, while optogenetically activating or suppressing DR serotonergic neurons during the decision period in a random subset of the trials. Our findings suggest that serotonergic neuronal activity suppresses impulsive choice at the decision point under trade-off conditions, possibly via action in the nucleus accumbens (NAc), another structure implicated in impulsive choice [17–20]. We propose that serotonergic neurons play a crucial role in resolving the trade-off between reward delay and reward size under decision conflict.

RESULTS

Mice Can Use Odor Cues to Choose Sooner or Larger Rewards

We conducted the task in a custom-designed operant chamber with a center odor port and two side reward ports. Subjects initiated each trial by nosepoking into the odor port and sampled a mixture of two odors. After cue sampling, mice could respond by choosing to nosepoke in either reward port (Figure 1A). After a reward delay, a water reward was delivered in the chosen side port. A predetermined trial duration was set to the same length regardless of the chosen reward option so that subjects could not perform more trials over a given period of time by choosing the less delayed reward. All possible combinations of left and right reward delays were offered pseudorandomly in an interleaved fashion within each session. Sert-Cre mice [21] were water-restricted and shaped to associate the intensity of odor 1 with left reward delay and that of odor 2 with right reward delay (Figure 1B). The performance of an example subject after training is shown in Figure 1C. During training, both left reward size (SL) and right reward size (SR) were 1 drop (SL1 sessions), and after training the left reward size became 2 drops while the right reward size remained at 1 (SL2 sessions). Motor aspects of response to the two odors are shown in Figure S1. Subjects chose the left reward more often during SL2 sessions (Figure 1F, paired t-test, $t_{9} = 6.090$, p<0.001). Psychometric curves of a representative group of subjects show the proportion of left choice as a function of left and right reward delays (10 mice, 64 SL1 sessions, 10882 trials; 79 SL2 sessions, 10269 trials). We fitted the proportion of left choice with a multiple regression model to express left choice as a function of the left and right reward delays offered. We reasoned that since the goal of the task was to compare reward delays to make a choice, the effects of the two reward delays very likely depended on each other, and we therefore included an interaction term between the left and right reward delays. This addition also improved the model fit (see Methods and Table S1 for model selection). Our model indeed had a significant interaction term on the population level (Wilcoxon signed-rank test, p<0.01 for SL1 sessions, and p<0.01 for SL2 sessions). Between SL1 and SL2 sessions, the only model coefficient that changed significantly was that of the interaction between left and right delay, $\beta_{\text{Left Delay} \times \text{Right Delay}^{-1}}$ (Figure 1J, Wilcoxon signed-rank test, p<0.01), indicating reduced dependency between the effects of the two reward delays.
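The structure of the choice model described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis code: the linear link, the function names, and the coefficient values are assumptions made for the example, while the inverse right delay and the 0.5 s offset follow the description in the Methods.

```python
OFFSET_S = 0.5  # added to both delays so the inverse is defined at a 0 s right delay


def regressors(left_delay_s, right_delay_s):
    """Return (left delay, inverse right delay, interaction) after the offset."""
    left = left_delay_s + OFFSET_S
    inv_right = 1.0 / (right_delay_s + OFFSET_S)
    return left, inv_right, left * inv_right


def predicted_left_choice(beta, left_delay_s, right_delay_s):
    """Linear predictor b0 + b1*L + b2*(1/R) + b3*L*(1/R) (assumed linear link)."""
    b0, b_left, b_inv_right, b_interaction = beta
    left, inv_right, interaction = regressors(left_delay_s, right_delay_s)
    return b0 + b_left * left + b_inv_right * inv_right + b_interaction * interaction


# Hypothetical coefficients, chosen only to illustrate the interaction term.
beta = (0.5, -0.04, 0.2, -0.03)

# With an interaction term, the effect of lengthening the left delay from 2 s
# to 4 s depends on which right delay is on offer.
slope_short_right = predicted_left_choice(beta, 4, 0) - predicted_left_choice(beta, 2, 0)
slope_long_right = predicted_left_choice(beta, 4, 8) - predicted_left_choice(beta, 2, 8)
```

In this toy setting the same 2 s increase in left delay lowers the predicted left choice more when the right reward is immediate than when it is delayed by 8 s, which is the kind of dependency the interaction coefficient captures.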

Optogenetic Manipulations of DR Serotonergic Neurons Bidirectionally Regulated Impulsive Choice

We manipulated DR serotonergic neurons at the decision point in the OGIC task to investigate the role of transient serotonergic activity in intertemporal choice. In all manipulation experiments, we tested a smaller range of right reward delays (0, 2 and 8 seconds). The left reward was always twice as large as the right reward (SL=2×SR).

First, we hypothesized that transiently silencing serotonergic neurons would encourage impulsive choice. SERT-Cre mice trained on the OGIC task as described were injected in the DR nucleus with AAV9-ef1α-DIO-eArch3.0-eYFP (DR-Arch, Figure 2A and C), or a control virus containing only eYFP (DR-Ctrl), and implanted with a single optical fiber over DR. Light was delivered on an interleaved subset of trials. A constant pulse of green light (532 nm, 5 mW) was triggered by odor port entry and terminated at reward port entry (Figure 2B) to ensure that light delivery spanned the entire epoch in which a decision could be made or changed. Similar light delivery was effective in suppressing spontaneous firing in DR in an in vivo anesthetized preparation (Figure S2A and B). Green light reduced choice of the larger reward in the DR-Arch but not the DR-Ctrl group (Figure 2K, two-way ANOVA with repeated measures, significant interaction between group and light, $F_{1,19} = 11.72$, p<0.01; n=10 DR-Arch, 10 DR-Ctrl). The proportions of left choice for the DR-Arch and DR-Ctrl groups under green light manipulation were fitted with the choice model described in the previous section, with an additional term for light treatment, shown in Figure 2D, E, I and J. We found a significant positive interaction between green light and the delay interaction term (Figure 2M, Wilcoxon signed-rank test, p<0.05). Green light did not interact significantly with any other model parameter (Table S2, Wilcoxon signed-rank test, p>0.05). We conclude that green light specifically increased the interaction between left and right reward delays in favor of choosing the smaller reward on the right.

Next, we hypothesized that augmenting serotonergic activity at the same time point might have the opposite effect. We injected similarly trained and implanted SERT-Cre mice with AAVrh8-CBA-DIO-ChR2-eYFP virus (DR-ChR2, Figure 2F and H). A transient bout of blue light reliably triggered spiking in the ChR2-expressing serotonergic neurons (Figure S2C–E). In the task, blue light increased preference for the larger reward in the DR-ChR2 group but not the DR-Ctrl group (Figure 2L, two-way ANOVA with repeated measures, significant interaction between group and light, $F_{1,19} = 29.95$, p<0.01; n=10 DR-ChR2, 10 DR-Ctrl). There was a significant negative interaction between blue light and specifically the delay interaction term (Figure 2M, Wilcoxon signed-rank test, p<0.05) in favor of choosing the larger reward. Blue light did not interact significantly with any other model parameter (Table S2, Wilcoxon signed-rank test, p>0.05).

Taken together, these results indicate that manipulating the predictive activity of DR serotonergic neurons bidirectionally regulated impulsive choice by altering how the effects of both reward delays on offer affected each other.

Nucleus Accumbens is a Site of Action for Serotonergic Neurons in Regulating Impulsive Choice

NAc is a region with extensive neuromodulatory innervation and interaction [20,22,23]. Serotonergic axons have been shown to innervate both the core and the shell regions of NAc [24]. As has been previously reported [25], we observed eYFP-containing projections in NAc, more prominently in the shell region, in Sert-Cre mice expressing the ChR2-eYFP construct in DR (Figure 3A and B).

To investigate the downstream target of DR neurons responsible for the effects we observed, we manipulated opsin-expressing serotonergic axons in NAc. Sert-Cre mice trained on the OGIC task were injected with AAVs carrying either a ChR2-eYFP construct (DR-NAC-ChR2), an eArch3.0-eYFP construct (DR-NAC-Arch), or a control eYFP construct (DR-NAC-Ctrl), and implanted bilaterally with optical fibers over NAc.

Green light reduced the proportion of left choice in the DR-NAC-Arch group but not in the control group (Figure 3E, two-way ANOVA with repeated measures, significant interaction between group and light, $F_{1,19} = 13.19$, p<0.01; n=10 DR-NAC-Arch, 10 DR-NAC-Ctrl).

Blue light increased this preference in the DR-NAC-ChR2 group but not in the control group (Figure 3F, two-way ANOVA with repeated measures, significant interaction between group and light, $F_{1,19} = 15.17$, p<0.01; n=10 DR-NAC-ChR2, 10 DR-NAC-Ctrl). Similar to what we observed in DR cell body experiments, green light in the DR-NAC-Arch group had a positive interaction with the delay interaction term (Figure 3G, Wilcoxon signed-rank test, p<0.05), whereas blue light in the DR-NAC-ChR2 group had a negative interaction with the delay interaction term (Figure 3G, Wilcoxon signed-rank test, p<0.01). Light did not interact significantly with any other model parameter (Table S2, Wilcoxon signed-rank test, p>0.05). These results suggest that NAc likely acts as a target site that mediates the action of serotonergic neurons in suppressing impulsive choice by altering how much the effects of the two reward delays on offer affected each other.

DR Serotonergic Manipulation Effects Were Modulated by the Degree of Trade-off

We consistently found optogenetic manipulations of serotonergic activity, either at the cell bodies or at their NAc-projecting terminals, to be effective in bidirectionally altering the interaction between the effects of left and right reward delays. In the OGIC task, difficulty arising from having to discriminate between odor pairs (perceptual difficulty, see Methods) covaried with the dilemma in choosing between two reward delays (choice difficulty, see Methods). We therefore could not immediately exclude the possibility that light manipulations were affecting the subjects’ discrimination between similarly concentrated odors. We found, though, that light did not alter the effects of the left delay or right delay alone, indicating that the subjects could respond normally to each odor by itself. Because interaction terms are difficult to interpret, we visualized the manipulation effects using our choice model with heatmaps (Figure 4A–D). We found that the valleys of the Arch manipulation effects and the peaks of the ChR2 manipulation effects did not coincide with the perceptual boundary, where left and right reward delays are equal, but were skewed in the direction of longer left reward delay. Since the left reward was larger, the subjects were willing to tolerate longer delays for it. Therefore, the region where we saw the greatest manipulation effects was where the decision trade-off was more difficult. Given the model prediction, we looked in our raw choice data for a relationship between choice difficulty and the light effect (Figure 4E–H). To quantify the relationship, we expressed light effect as a function of both choice difficulty and perceptual difficulty (see Methods). The coefficient for choice difficulty, after any effect of perceptual difficulty was accounted for, was still significantly lower than zero for the DR-Arch and NAC-Arch groups (Figure 4J and L, Wilcoxon signed-rank test, p<0.05) and significantly higher than zero for the DR-ChR2 and NAC-ChR2 groups (Figure 4J and L, Wilcoxon signed-rank test, p<0.05 and p<0.01 respectively). Choice difficulty did not affect light effects in control groups (Figure 4J and L, Wilcoxon signed-rank test, p>0.05).

Taken together, we show that optogenetic manipulations of serotonergic neurons and their NAc-projecting axons are most effective in altering the subjects’ choice when the trade-off was most difficult.

DISCUSSION

Serotonergic Activity at the Decision Point Modulates Impulsive Choice

We report here that transient inhibition of serotonergic activity at the decision point promotes impulsive choice, whereas transient activation has the opposite effect. This demonstrates a role for serotonergic activity at the decision point in shaping choice. Previous work has shown that serotonergic activation during waiting for reward increased waiting [26,27], suggesting that serotonin signals the availability of delayed reward and increases persistence in waiting for that reward. In contrast to the subjects in a waiting paradigm, the subjects in the present experiments had not already been waiting at the onset of activation or inhibition, and therefore the effects we observed were not simply due to extending an existing behavioral state. We showed that manipulations of serotonergic neurons are capable of recruiting patient behavior programs, such as a patient choice in our experiments, before waiting even begins. Our results therefore demonstrate an additional and distinct mechanism by which serotonin suppresses impulsive choice and buttress the theory that serotonin supports patience. We did not observe altered behavioral inhibition, since the subjects did not spend more time sampling in the odor port or traveling to the reward ports during light stimulation (Figure S2F and G). Our results provide strong evidence that serotonergic neurons support patience primarily through suppressing impulsive choice, as opposed to impulsive action [28].

Serotonergic neurons have also been suggested to encode unconditioned rewarding stimuli in naïve mice [25,29]. After training, serotonergic neurons have been shown to fire phasically at CS in proportion to the value predicted by the CS [12,30–33] in Pavlovian tasks. Our results provided causal evidence that this reward value-predictive signal could be used to direct behavior. The manipulation effects in the present experiments are specific to the reward delay interaction term (Figure 2M and 3G), which was also the term that was different between SL1 and SL2 sessions in baseline behavior (Figure 1J). It is possible that serotonergic neuronal suppression caused the subjects to undervalue the left reward size and activation caused them to overvalue it. This bidirectional shift in preferences was only captured because our task constituted only free choices with equally valid responses on both sides. From this perspective, serotonergic neurons could act as a decision variable for controlling intertemporal preferences.

Serotonergic Neurons Are Engaged in Difficult Delay-size Trade-off

We observed the greatest effect of serotonergic manipulations when the subjects had to wait longer for the larger reward, but not vastly longer (Figure 4). We propose that serotonergic neuronal control of choice is specific to difficult trade-off conditions. This idea may explain the seemingly inconsistent results of serotonergic manipulations in the rodent delay discounting literature. Chemical lesions of the serotonergic system have been shown to increase impulsivity in some delay discounting experiments [10, 34], but not in others [35]. Mobini et al. and Wogar et al. used an adjusting-delay procedure [36], whereas Winstanley et al. varied delay contingencies systematically. Since adjusting-delay procedures search for the indifference delay for the larger reward, the procedure samples preference near the indifference delay, i.e. conditions with greater delay-size trade-off. On the other hand, when delays are systematically varied, the region of greatest trade-off may contribute only narrowly to the sampled data sets. In our study, we may have identified conditions for which serotonergic neurons are important, and those for which they are not, because we exhaustively sampled large sets of delay combinations in fine grain, distinguishing between high trade-off and low trade-off conditions.

Our proposal for serotonin function in trade-off emphasizes the integrative nature of serotonin encoding. To process trade-off, a neural structure must have information about both reward and cost. Others have proposed similar theories for serotonin’s role in combining positive and negative valences. For example, serotonin was proposed to encode “beneficialness” [37], i.e. a function of reward magnitude multiplied by reward probability, minus cost. Given the diverse observations of serotonin effects in both reward and punishment processing, the more integrative theories may have greater explanatory power. Of course, positive and negative valences are not necessarily integrated in parallel as in the case of positive and negative outcomes of a choice. A possible alternative interpretation of our data is that the need for trade-off constitutes a precondition for serotonergic neurons to be engaged. Decision conflict between wanting to choose a larger reward and a sooner one could induce anxiety and lead to impulsive behavior. Interestingly, it has been shown that optogenetic activation was most effective in averting wait errors during the late phase of waiting, when mice were more likely to give up [26]. It is possible that serotonergic neurons are activated in response to this negative mood to maintain a representation of positive predictions of future reward, thereby resolving decision conflicts appropriately. Testing these theories will certainly require a combination of manipulations and monitoring of neuronal activities during behavior paradigms explicitly designed to probe the interaction between positive and negative valences.

DR-NAc Circuits Can Mediate Serotonergic Control of Intertemporal Choice

We found that augmenting and inhibiting opsin-expressing serotonergic axonal projections in NAc mimicked the effects of manipulating DR serotonergic cell bodies. NAc has been found to be involved in choice impulsivity [17], and the dense and diverse neuromodulatory input to the region is thought to contribute [20]. More specifically, ablation of the core region of NAc (NAcC) has been shown to reduce responding to the larger reward in delay discounting tasks [18,19], whereas ablation of the shell region (NAcSh) has not [19]. We observed more prominent presence of opsin-expressing serotonergic projections in NAcSh, but also in NAcC to a smaller extent. Due to the difficulty of restricting the spread of light in optogenetic manipulations, we cannot definitively attribute the effects of DR-NAc manipulations to either sub-region, although it is more likely that DR-NAcSh projections are mediating the observed effects. The latter possibility appears inconsistent with the prevailing belief that NAcC is the major component responsible for processing delayed reward. The difference between chronic and acute neuronal manipulations is often discussed, especially in the era of optogenetics [38]. It is possible that NAcC is the main component processing delayed reward, but NAcSh is a site of dynamic neuromodulatory control upstream of NAcC. A future goal would be to elucidate the functional relationship between the structures. We propose that NAc is a projection target that mediates the action of serotonergic neurons in guiding intertemporal choice. Since we cannot rule out the possibility that terminal stimulation caused antidromic activation of DR cell bodies, and since serotonergic neurons project widely, it is possible that other areas could be affected by stimulation of DR projections in NAc. Further work is required to clarify the relationship between DR serotonergic neurons and their target areas.

Neurochemical Considerations

Optogenetic activation of serotonergic neurons has been well documented to cause serotonin release in projection targets [26,39]. Serotonergic neurons have also been reported to signal reward via the co-release of both serotonin and glutamate [25,29,40]. It is possible that some of the effects we observed are mediated by glutamate transmission. Although we do not observe reward effects of optogenetic stimulation of DR, we cannot exclude this possibility. We therefore draw our conclusions about serotonergic neurons as labeled by the genetic marker Sert.

In conclusion, we developed a novel odor-guided choice task that combines genetic tools of great temporal and cellular specificity with well-controlled psychophysical characterization of intertemporal decision-making. We show that transient manipulations of DR serotonergic activity at the decision point causally regulated impulsive choice during difficult delay-size trade-off. We propose that predictive reward-value encoding signals in serotonergic neurons underlie appropriate choice between differently delayed reward options under decision trade-off.

STAR Methods

Experimental Model and Subject Details

Subjects were male adult mice heterozygous for the transgene SERT-Cre (Tg(Slc6a4-cre)ET33Gsat; Gensat) in the C57BL/6J background. They were singly housed and water-restricted in accordance with guidelines from the Committee on Animal Care (CAC) at the Massachusetts Institute of Technology. Behavior experiments were carried out during the dark phase of the 12 hour dark/12 hour light cycle.

Methods Details

Viral Construction—The AAV-CBA-DIO-ChR2-eYFP plasmid was constructed by inserting the ChR2-eYFP gene fragment, which was obtained from a template, pLenti-CaMKIIa-ChR2-eYFP (courtesy of Dr. Karl Deisseroth at Stanford University [40]) into a linearized and modified AAV vector containing the chicken β-actin promoter, using the double-floxed inverted construct strategy. Restriction digests were made according to standard protocol, and ligations were made using Takara DNA ligation kit version 2.1. The construct was amplified using EndoFree Plasmid QIAGEN maxi prep kit. Recombinant AAV vectors were serotyped with AAVrh8 coat proteins and were packaged by the viral vector core at the Gene Therapy center and Vector Core at the University of Massachusetts Medical School. The final viral concentration was $1 \times 10^{13}$ genome copies mL$^{-1}$.

Stereotactic Injections and Optical Fiber Implantations—Stereotactic viral injections and optical fiber implantations were performed in accordance with CAC guidelines at MIT. Mice were anaesthetized using 250 mg/kg avertin and mounted onto a stereotactic setup. A small craniotomy was made over DR. 1.5 µL of virus was injected by using a glass micropipette attached to a 10 µL Hamilton microsyringe through a microelectrode holder filled with mineral oil. A microsyringe pump was used to control the speed of the injection. For Arch experiments, mice were injected in the DR nucleus (AP: −4.65 mm, DV: −3 mm, ML: 0 mm) with AAV9-ef1α-DIO-eArch3.0-eYFP (UNC Vector Core), or a control virus containing AAV9-ef1α-DIO-eYFP (UNC Vector Core), diluted to a titer of $1 \times 10^{11}$ particles/mL, in a pulled micropipette needle attached to a microinjector. For ChR2 experiments, mice were injected in the DR nucleus with AAVrh8-CBA-DIO-ChR2-eYFP diluted to a titer of $1 \times 10^{12}$ particles/mL.

For DR cell body experiments, a single optical fiber (diameter 200 µm, numerical aperture 0.22) was implanted over the DR nucleus in the same session as viral injections, targeted for DR (AP: −4.65 mm, DV: −2.7 mm, ML: 0 mm), and secured with dental cement fitted with the top segment of a black microcentrifuge tube. For DR-NAc projection manipulation experiments, bilateral optical fibers were implanted above the nucleus accumbens (AP: +1.2 mm, DV: −4 mm, ML: ±0.7 mm). For further light shielding, any dental cement not covered by the microcentrifuge tube was painted over with black nail polish. Mice were allowed to recover over a period of at least two weeks before being water-restricted again.

For electrophysiological validation, viruses were injected and allowed to express for 3–4 weeks before recordings were performed.

In vivo Electrophysiological Recordings—DR-ChR2 mice were anaesthetized via an i.p. injection (100 mL/kg) of a mixture of ketamine (100 mg/mL) / xylazine (20 mg/mL) and mounted on a stereotactic instrument. DR-Arch mice were anaesthetized with an i.p. injection of 1 g/kg of urethane dissolved in saline. A small craniotomy was made above DR, and an optrode consisting of a 0.5 MΩ tungsten electrode and an optical fiber with a diameter of 200 µm was lowered slowly into the DR nucleus. The tip of the electrode extended beyond the end of the fiber tip by 200 µm. The optical fiber was connected either to a blue laser (473 nm, at 10 mW at fiber tip) or a green laser (532 nm, at 5 mW at fiber tip) during the recording. The lasers were controlled by a signal generator. For ChR2 experiments, the blue laser was pulsed at 5 ms width and 10 Hz frequency. For Arch experiments, a constant pulse of green light was used. Multiunit or single unit activity was recorded via an Axon Digidata 1440A acquisition system running Clampex 10.2 software. Data were analyzed using MATLAB.

Histology and Immunohistochemistry—Mice were perfused with 4% paraformaldehyde (PFA) in phosphate buffered saline (PBS). Brains were then post-fixed with the same solution overnight and sliced with a vibratome into 50 µm sections. Sections were then incubated in primary antibodies in blocking solution (PBS containing 0.2% triton and 1% BSA) overnight at room temperature, washed in PBS 3 times for 15 min each, and then incubated in secondary antibodies in blocking solution for 2 hours at room temperature. Primary antibodies used were rabbit anti-TPH (Thermo Fisher, 1:1000) and chicken anti-GFP (Abcam, 1:1000). Secondary antibodies used were AlexaFluor 488- or 568-conjugated secondary antibodies (Invitrogen, 1:1000). Sections were then washed again in PBS 3 times, mounted onto glass slides and allowed to dry completely. Vectashield mounting medium containing DAPI was then applied to the brain sections and coverslips were added.

Behavior Chamber—The behavior task took place in an operant chamber. The chamber was equipped with 3 ports into which the subjects could nosepoke. The center port delivered the odor and the two side ports delivered water reward. The center port was connected to an odor manifold fitted with filters containing odors diluted with mineral oil. Air flow to the manifold was controlled by an olfactometer (Island Motion, Tappan, NY, USA). A stream of air flowing through the odor filters joined the main carrier flow and was delivered to the subject. The side ports were fitted with water tubing connected to solenoid valves. When the subjects poked into the ports, infrared beams were broken and the poke events were logged by Matlab (The MathWorks Inc., Natick, MA, USA).

Behavior Task—Subjects were water restricted for a week (administered 1.2 ml of water in a single session per day), and then gradually shaped to associate odor intensity with reward delays. On day one, subjects were placed into the chamber with the shaping protocol in place. Trials were self-initiated. Subjects were required to poke in the center port and collect a reward from either side port. Total air flow was held constant at 1000mL/min throughout the session. Odor delivery started as soon as the subjects made a center poke in. The intensity of odor 1 (caproic acid, 1:10 dilution in mineral oil) was linearly correlated with delay to the left reward. The intensity of odor 2 (hexanol, 1:10 dilution in mineral oil) was linearly correlated with delay to the right reward. The subjects were required to sample for a minimum of 400 milliseconds at the center port before the trial was able to proceed. The subjects were then required to commit a choice in one of the side ports to initiate the reward delay. The LEDs on the ports turned on when a center poke was made and were turned off when a valid side poke was made. If a center poke was not long enough to qualify the trial, the trial was aborted and the LEDs turned off. During the reward delay, the subjects were free to leave the port and move about the chamber. Once the reward delay lapsed, the water valve released a drop of water (~3 µL) with a click, and the subject could collect the water reward. The trials lasted 6 seconds longer than the longer reward delay so that the length of the trial was not contingent upon the choice. During training, both reward ports released 1 drop of water each. Subjects were typically trained on the task for 3 to 4 months before they consistently chose the less delayed reward. Once subjects had associated the odor intensities with reward delays, the left reward port released 2 separate drops of water (a large reward), while the right reward remained at 1 drop (a small reward).
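The timing rules in this paragraph can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and the trial duration is assumed to be measured from the moment the choice is committed, which the text does not specify.

```python
TRIAL_PADDING_S = 6.0  # trials last 6 s longer than the longer of the two reward delays
MIN_SAMPLING_S = 0.4   # minimum odor sampling time at the center port (400 ms)


def trial_duration_s(left_delay_s, right_delay_s):
    """Trial length depends only on the delays on offer, not on the choice made."""
    return max(left_delay_s, right_delay_s) + TRIAL_PADDING_S


def is_valid_center_poke(entry_s, exit_s):
    """A center poke qualifies the trial only if odor sampling lasted long enough."""
    return (exit_s - entry_s) >= MIN_SAMPLING_S


def time_after_reward_s(chosen_delay_s, left_delay_s, right_delay_s):
    """Time left in the trial once the chosen reward has been delivered
    (assumes the trial clock starts when the choice is committed)."""
    return trial_duration_s(left_delay_s, right_delay_s) - chosen_delay_s
```

Under this assumption, a trial offering delays of 2 s and 8 s lasts 14 s whichever side is chosen, so picking the sooner reward leaves a longer remainder instead of shortening the trial.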

Implanted mice performed the OGIC task with laser delivery through patch cords attached to the optical implants by ceramic sleeves. The patch cords were attached to a rotary joint to allow rotation and to let the mice move freely within the operant chamber. Between 10% and 20% of the trials were pseudorandomly selected to be light-on trials, in which the laser was turned on when a valid center poke was made, concurrent with the odor onset, and turned off when a valid side poke was made, committing the choice. The lasers (CNI, Jilin, China; Optoengine, Utah, USA) were triggered via a TTL pulse issued from the state machine that also controlled the behavior apparatus. For Arch and green light control experiments, a constant pulse of 5 mW 532 nm light was used. For ChR2 and blue light control experiments, a train of 473 nm light pulses was used at 10 Hz with a 5 ms pulse width; power was measured to be 10 mW at constant output.
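As a rough illustration of the stimulation parameters, the sketch below draws a pseudorandom subset of light-on trials and computes the duty cycle of the blue-light pulse train. The selection procedure actually used is not described in the text, so sampling a fixed fraction without replacement is an assumption, as are the function names and the 15% example fraction.

```python
import random


def assign_light_trials(n_trials, fraction=0.15, seed=0):
    """Flag a pseudorandom subset of trials as light-on.

    Assumption: a fixed fraction (between 0.10 and 0.20) is drawn without
    replacement; the original selection scheme is not specified.
    """
    rng = random.Random(seed)
    n_on = round(n_trials * fraction)
    on_indices = set(rng.sample(range(n_trials), n_on))
    return [i in on_indices for i in range(n_trials)]


def duty_cycle(rate_hz, pulse_width_s):
    """Fraction of time the light is on during a pulse train."""
    return rate_hz * pulse_width_s


def mean_power_mw(constant_power_mw, rate_hz, pulse_width_s):
    """Time-averaged power of a pulse train, given power at constant output."""
    return constant_power_mw * duty_cycle(rate_hz, pulse_width_s)


if __name__ == "__main__":
    flags = assign_light_trials(200, fraction=0.15, seed=1)
    print(sum(flags), "light-on trials out of", len(flags))
    # 10 Hz train of 5 ms pulses, 10 mW measured at constant output
    print(duty_cycle(10.0, 0.005), mean_power_mw(10.0, 10.0, 0.005))
```

With a 10 Hz train of 5 ms pulses the light is on for about 5% of the stimulation period, so the time-averaged power is correspondingly lower than the 10 mW measured at constant output.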

Quantification and Statistical Analyses

Preference Data Extraction—Port entry timestamps were logged using computers. For each subject, trial-by-trial data arrays were constructed using Matlab. Data from all sessions performed by the subject were pooled. The preferences were tabulated for each combination of left and right delays. Sampling time was calculated by subtracting the center poke entry time from the center poke exit time. Transit time was calculated by subtracting the center port exit time from the side port entry time. The preference data were first sorted by relative reward delays: whether the left reward delay was longer than, equal to or shorter than the right reward delay. Average preference for each case was calculated and compared between stimulation conditions.
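The extraction steps above can be sketched as follows. The original analysis was done in Matlab; this Python version is only an illustration, and the dictionary keys (`left_delay`, `right_delay`, `chose_left`, `light_on`) are hypothetical names for the per-trial fields.

```python
from collections import defaultdict


def sampling_time(center_in_s, center_out_s):
    """Time spent in the center port: exit time minus entry time."""
    return center_out_s - center_in_s


def transit_time(center_out_s, side_in_s):
    """Time from leaving the center port to entering a side port."""
    return side_in_s - center_out_s


def relative_delay_case(left_delay_s, right_delay_s):
    """Sort a trial by whether the left delay is longer, equal or shorter."""
    if left_delay_s > right_delay_s:
        return "left_longer"
    if left_delay_s == right_delay_s:
        return "equal"
    return "left_shorter"


def mean_left_preference(trials):
    """Proportion of left choices per (relative delay case, light condition)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [left choices, trials]
    for t in trials:
        key = (relative_delay_case(t["left_delay"], t["right_delay"]),
               t["light_on"])
        counts[key][0] += int(t["chose_left"])
        counts[key][1] += 1
    return {key: left / n for key, (left, n) in counts.items()}


if __name__ == "__main__":
    demo = [
        {"left_delay": 8, "right_delay": 2, "chose_left": True, "light_on": False},
        {"left_delay": 8, "right_delay": 2, "chose_left": False, "light_on": False},
        {"left_delay": 2, "right_delay": 2, "chose_left": True, "light_on": True},
    ]
    print(mean_left_preference(demo))
```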

Choice Model for Psychometric Curves—We used a multiple regression model to describe the choice. We took the inverse of the right reward delay because it fitted the data better. Because the right reward delay contained 0-second values, we added a small delay of 0.5 seconds, which corresponds roughly to the average transit time, to both left and right reward delays before taking the inverse. Different link functions were compared, as were models with and without an interaction between left delay and right delay$^{-1}$. Model selection was based on goodness of fit assessed using the Akaike Information Criterion normalized by the number of trials. We used the function glmfit in Matlab to fit a generalized linear model for the proportion of left choice with a combination of a constant, $\beta_{\text{Left Delay}} \times \text{Left Delay}$, $\beta_{\text{Right Delay}^{-1}} \times \text{Right Delay}^{-1}$ and $\beta_{\text{Left Delay} * \text{Right Delay}^{-1}} \times \text{Left Delay} \times \text{Right Delay}^{-1}$. To assess the influence of light activation and inhibition, we added a dummy variable for the presence of light, which took the value 0 for light off and 1 for light on and could interact with any of the existing independent variables. Population means and SEMs for light interaction coefficients were compared to zero.
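To make the structure of the choice model concrete, the sketch below builds the regressors for one trial and evaluates a predicted proportion of left choice. It does not reproduce the glmfit call. Two things are assumptions: the logit link (the text says several link functions were compared without naming the one selected) and letting the light dummy interact with every term (the text says only that it could). The coefficient values in the example are made up.

```python
import math

DELAY_OFFSET_S = 0.5  # added to both delays so the inverse is defined at 0 s


def regressors(left_delay_s, right_delay_s, light_on):
    """Return [1, L, 1/R, L/R] followed by the same four terms times light."""
    left = left_delay_s + DELAY_OFFSET_S
    inv_right = 1.0 / (right_delay_s + DELAY_OFFSET_S)
    base = [1.0, left, inv_right, left * inv_right]
    light = 1.0 if light_on else 0.0
    return base + [light * x for x in base]


def predict_left(beta, left_delay_s, right_delay_s, light_on):
    """Predicted proportion of left choice under an assumed logit link."""
    x = regressors(left_delay_s, right_delay_s, light_on)
    if len(beta) != len(x):
        raise ValueError("expected %d coefficients" % len(x))
    eta = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))


def normalized_aic(log_likelihood, n_params, n_trials):
    """Akaike Information Criterion divided by the number of trials."""
    return (2.0 * n_params - 2.0 * log_likelihood) / n_trials


if __name__ == "__main__":
    beta = [1.0, -0.2, -0.5, 0.05, 0.0, 0.0, 0.0, -0.1]  # illustrative only
    p_off = predict_left(beta, 14, 2, light_on=False)
    p_on = predict_left(beta, 14, 2, light_on=True)
    print(p_off, p_on, p_on - p_off)
```

The difference `p_on - p_off` is the kind of predicted light effect that the trade-off analysis maps across delay combinations.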

Trade-off Analysis—Heatmaps were generated by representing the light effect predicted by the model with color intensity (predicted proportion of left choice with light on – predicted proportion of left choice with light off). Choice difficulty was calculated as 0.5 − |preference during light-off trials − 0.5|, i.e. the distance from the subject's mean choice to the closer of the two bounds, 0 or 1. Perceptual difficulty was calculated as 1 − |intensity of odor 1 − intensity of odor 2|/140, where 140 mL/min is the maximum total odor intensity possible. To compare the influence of choice difficulty against that of perceptual difficulty on the manipulation effect, a multiple linear regression was performed for individual subjects, where $\text{effect} = \beta_0 + \beta_{\text{choice difficulty}} \times \text{choice difficulty} + \beta_{\text{perceptual difficulty}} \times \text{perceptual difficulty}$. Population means and SEMs for regression coefficients were compared to zero.
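The two difficulty measures and the light effect defined above translate directly into code. The following is a minimal sketch with hypothetical function names; the per-subject regression itself is not shown.

```python
MAX_TOTAL_ODOR_ML_MIN = 140.0  # maximum total odor intensity possible


def light_effect(p_left_light_on, p_left_light_off):
    """Predicted proportion of left choice, light on minus light off."""
    return p_left_light_on - p_left_light_off


def choice_difficulty(pref_light_off):
    """Distance from the mean light-off choice to the closer bound (0 or 1)."""
    return 0.5 - abs(pref_light_off - 0.5)


def perceptual_difficulty(odor1_ml_min, odor2_ml_min):
    """1 minus the absolute intensity difference, normalized by 140 mL/min."""
    return 1.0 - abs(odor1_ml_min - odor2_ml_min) / MAX_TOTAL_ODOR_ML_MIN


if __name__ == "__main__":
    # An indifferent subject (preference 0.5) faces the hardest choice.
    print(choice_difficulty(0.5), choice_difficulty(0.9))
    # Equal odor intensities are the hardest to tell apart.
    print(perceptual_difficulty(70, 70), perceptual_difficulty(70, 10))
```

Choice difficulty ranges from 0 (the subject always picks the same side) to 0.5 (indifference), while perceptual difficulty ranges from 0 (maximally different odor intensities) to 1 (identical intensities).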

Statistics—Data from control and test groups with a light-on and light-off design were analyzed in R using a two-way ANOVA with a mixed model, with light as a within-subject factor and group as a between-subject factor. Model coefficients sometimes violated the assumption of normality of residuals as assessed by the Shapiro-Wilk test, and were therefore compared using Wilcoxon signed-rank tests between conditions or against zero, wherever appropriate. The null hypothesis was rejected at p < 0.05. Where Bonferroni correction for multiple comparisons was applied, the p-value was multiplied by m, where m is the number of hypotheses. All data were visualized as population means with SEM, and the fit line represents the population mean of the linear fit lines of individual subjects.
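The Bonferroni adjustment and the mean ± SEM summary described above are simple enough to sketch. The original analysis was run in R, so this Python version is illustrative only. Capping the adjusted p-value at 1 is a common convention added here, not something the text states.

```python
import math
import statistics


def bonferroni_adjust(p_values):
    """Multiply each p-value by m, the number of hypotheses.

    Assumption: adjusted values are capped at 1.0 (a common convention).
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def mean_sem(values):
    """Population mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


if __name__ == "__main__":
    adjusted = bonferroni_adjust([0.01, 0.04, 0.5])
    print(adjusted, [p < 0.05 for p in adjusted])
    print(mean_sem([0.42, 0.55, 0.61, 0.48]))
```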

Supplementary Material

Refer to Web version on PubMed Central for supplementary material.

Acknowledgments

We thank C.W. Lovett for assistance with histology and colony management; C. A. Twiss for assistance with behavior training; W. Yu for assistance with histology and T. O’Connor for help with the behavior setup. We are grateful to N. Uchida for advice on the experiments, K. Deisseroth for viral constructs and M. S. Fee and E. K. Miller for comments on the study. We thank A. Bari for reading the manuscript and K. J. Bustamante for helpful input for the initial design of the task. We thank members of the Tonegawa lab for helpful discussions. This work was supported by a National Science Scholarship from the Agency for Science, Technology and Research (to S.X.), the RIKEN Brain Science Institute, the Howard Hughes Medical Institute, and the JPB Foundation (to S.T.).

References

1. Shoda Y, Mischel W. Predicting adolescent cognitive and self-regulatory competencies from preschool delay of gratification: Identifying diagnostic conditions. Dev. Psychol. 1990; 26:978–986.

2. Bickel WK, Koffarnus MN, Moody L, Wilson AG. The behavioral- and neuro-economic process of temporal discounting: A candidate behavioral marker of addiction. Neuropharmacology. 2014; 76(Pt B):518–527. [PubMed: 23806805]

3. Reynolds B. A review of delay-discounting research with humans: relations to drug use and gambling. Behav. Pharmacol. 2006; 17:651–667. [PubMed: 17110792]

4. Story GW, Vlaev I, Seymour B, Darzi A, Dolan RJ. Does temporal discounting explain unhealthy behavior? A systematic review and reinforcement learning perspective. Front. Behav. Neurosci. 2014; 8:76. [PubMed: 24659960]

5. Barkley RA, Edwards G, Laneri M, Fletcher K, Metevia L. Executive functioning, temporal discounting, and sense of time in adolescents with attention deficit hyperactivity disorder (ADHD) and oppositional defiant disorder (ODD). J. Abnorm. Child Psychol. 2001; 29:541–556. [PubMed: 11761287]

6. Heerey EA, Robinson BM, McMahon RP, Gold JM. Delay discounting in schizophrenia. Cognit. Neuropsychiatry. 2007; 12:213–221. [PubMed: 17453902]

7. Takahashi T, Oono H, Inoue T, Boku S, Kako Y, Kitaichi Y, Kusumi I, Masui T, Nakagawa S, Suzuki K, et al. Depressive patients are more impulsive and inconsistent in intertemporal choice behavior for monetary gain and loss than healthy subjects--an analysis based on Tsallis’ statistics. Neuro Endocrinol. Lett. 2008; 29:351–358. [PubMed: 18580849]

8. Ahn W-Y, Rass O, Fridberg DJ, Bishara AJ, Forsyth JK, Breier A, Busemeyer JR, Hetrick WP, Bolbecker AR, O’Donnell BF. Temporal discounting of rewards in patients with bipolar disorder and schizophrenia. J. Abnorm. Psychol. 2011; 120:911–921. [PubMed: 21875166]

9. Wogar AM, Bradshaw CM, Szabadi E. Effect of lesions of the ascending 5-hydroxytryptaminergic pathways on choice between delayed reinforcers. Psychopharmacology (Berl.). 1993:1–5.

10. Mobini S, Chiang T, Ho M, Bradshaw C, Szabadi E. Effects of central 5-hydroxytryptamine depletion on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology (Berl.). 2000; 152:390–397. [PubMed: 11140331]

11. Schweighofer N, Bertin M, Shishida K, Okamoto Y, Tanaka SC, Yamawaki S, Doya K. Low-serotonin levels increase delayed reward discounting in humans. J Neurosci. 2008; 28:4528–4532. [PubMed: 18434531]

12. Cohen JY, Amoroso MW, Uchida N. Serotonergic neurons signal reward and punishment on multiple timescales. eLife. 2015; 4:e06346.

13. Kim S, Hwang J, Lee D. Prefrontal Coding of Temporally Discounted Values during Intertemporal Choice. Neuron. 2008; 59:522.

14. Kepecs A, Uchida N, Zariwala H, Mainen ZF. Neural correlates, computation and behavioural impact of decision confidence. Nature. 2008; 455:227–231. [PubMed: 18690210]

15. Roesch M, Calu D, Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat. Neurosci. 2007; 10:1615–1624. [PubMed: 18026098]

16. Uchida N, Mainen ZF. Speed and accuracy of olfactory discrimination in the rat. Nat Neurosci. 2003; 6:1224–1229. [PubMed: 14566341]

17. Acheson A, Farrar AM, Patak M, Hausknecht KA, Kieres AK, Choi S, de Wit H, Richards JB. Nucleus accumbens lesions decrease sensitivity to rapid changes in the delay to reinforcement. Behav Brain Res. 2006; 173:217–228. [PubMed: 16884790]

18. Cardinal R, Pennicott D, Lakmali C. Impulsive choice induced in rats by lesions of the nucleus accumbens core. Science. 2001; 292:2499–2501. [PubMed: 11375482]

19. Pothuizen HJ, Jongen-Rêlo J, Feldon AL, Yee BK. Double dissociation of the effects of selective nucleus accumbens core and shell lesions on impulsive-choice behaviour and salience learning in rats. Eur. J. Neurosci. 2005; 22:2605–2616. [PubMed: 16307603]

20. Winstanley CA, Theobald DEH, Dalley JW, Robbins TW. Interactions between serotonin and dopamine in the control of impulsive choice in rats: therapeutic implications for impulse control disorders. Neuropsychopharmacology. 2005; 30:669–682. [PubMed: 15688093]

21. Gerfen CR, Paletzki R, Heintz N. Cre-Recombinase Driver Lines to Study the Functional Organization of Cerebral Cortical and Basal Ganglia Circuits. Neuron. 2013; 80:1368–1383. [PubMed: 24360541]

22. Dalley JW, Fryer TD, Brichard L, Robinson ESJ, Theobald DEH, Laane K, Pena Y, Murphy ER, Shah Y, Probst K, et al. Nucleus accumbens D2/3 receptors predict trait impulsivity and cocaine reinforcement. Science. 2007; 315:1267–1270. [PubMed: 17332411]

23. Dalley JW, Everitt BJ, Robbins TW. Impulsivity, compulsivity, and top-down cognitive control. Neuron. 2011; 69:680–694. [PubMed: 21338879]

24. Brown P, Molliver ME. Dual Serotonin (5-HT) Projections to the Nucleus Accumbens Core and Shell: Relation of the 5-HT Transporter to Amphetamine-Induced Neurotoxicity. J. Neurosci. 2000; 20:1952–1963. [PubMed: 10684896]

25. Liu Z, Zhou J, Li Y, Hu F, Lu Y, Ma M, Feng Q, Zhang J, Wang D, Zeng J, et al. Dorsal raphe neurons signal reward through 5-HT and glutamate. Neuron. 2014; 81:1360–1374. [PubMed: 24656254]

26. Miyazaki KW, Miyazaki K, Tanaka KF, Yamanaka A, Takahashi A, Tabuchi S, Doya K. Optogenetic Activation of Dorsal Raphe Serotonin Neurons Enhances Patience for Future Rewards. Curr. Biol. 2014; 24:2033–2040. [PubMed: 25155504]

27. Fonseca MS, Murakami M, Mainen ZF. Activation of Dorsal Raphe Serotonergic Neurons Promotes Waiting but Is Not Reinforcing. Curr. Biol. 2015; 25:306–315. [PubMed: 25601545]

28. Bari A, Robbins TW. Inhibition and impulsivity: Behavioral and neural basis of response control. Prog. Neurobiol. 2013; 108:44–79. [PubMed: 23856628]

29. Qi J, Zhang S, Wang HL, Wang H, de Jesus Aceves Buendia J, Hoffman AF, Lupica CR, Seal RP, Morales M. A glutamatergic reward input from the dorsal raphe to ventral tegmental area dopamine neurons. Nat Commun. 2014; 5:5390. [PubMed: 25388237]

30. Matias S, Lottem E, Dugué GP, Mainen ZF. Activity patterns of serotonin neurons underlying cognitive flexibility. eLife. 2017; 6:e20552. [PubMed: 28322190]

31. Bromberg-Martin ES, Hikosaka O, Nakamura K. Coding of Task Reward Value in the Dorsal Raphe Nucleus. J. Neurosci. 2010; 30:6262–6272. [PubMed: 20445052]

32. Nakamura K, Matsumoto M, Hikosaka O. Reward-dependent modulation of neuronal activity in the primate dorsal raphe nucleus. J. Neurosci. 2008; 28:5331–5343. [PubMed: 18480289]

33. Ranade SP, Mainen ZF. Transient Firing of Dorsal Raphe Neurons Encodes Diverse and Specific Sensory, Motor, and Reward Events. J. Neurophysiol. 2009; 102:3026–3037. [PubMed: 19710375]

34. Wogar MA, Bradshaw CM, Szabadi E. Effect of lesions of the ascending 5-hydroxytryptaminergic pathways on choice between delayed reinforcers. Psychopharmacology (Berl.). 1993; 111:239–243. [PubMed: 7870958]

35. Winstanley CA, Dalley JW, Theobald DEH, Robbins TW. Fractionating impulsivity: contrasting effects of central 5-HT depletion on different measures of impulsive behavior. Neuropsychopharmacol. Off. Publ. Am. Coll. Neuropsychopharmacol. 2004; 29:1331–1343.

36. Mazur, JE. An adjusting procedure for studying delayed reinforcement. Hillsdale, NJ: Erlbaum; 1987.

37. Luo M, Li Y, Zhong W. Do dorsal raphe 5-HT neurons encode “beneficialness”? Neurobiol. Learn. Mem. 2016; 135:40–49. [PubMed: 27544850]

38. Sudhof TC. Reproducibility: Experimental mismatch in neural circuits. Nature. 2015; 528:338–339. [PubMed: 26649825]

39. Marcinkiewcz CA, Mazzone CM, D’Agostino G, Halladay LR, Hardaway JA, DiBerto JF, Navarro M, Burnham N, Cristiano C, Dorrier CE, et al. Serotonin engages an anxiety and fear-promoting circuit in the extended amygdala. Nature. 2016; 537:97–101. [PubMed: 27556938]

40. McDevitt RA, Tiran-Cappello A, Shen H, Balderas I, Britt JP, M MRA, Chung SL, Richie CT, Harvey BK, Bonci A. Serotonergic versus Nonserotonergic Dorsal Raphe Projection Neurons: Differential Participation in Reward Circuitry. Cell Rep. 2014; 8:1857–1869. [PubMed: 25242321]

41. Mattis J, Tye KM, Ferenczi EA, Ramakrishnan C, O’Shea DJ, Prakash R, Gunaydin LA, Hyun M, Fenno LE, Gradinaru V, et al. Principles for applying optogenetic tools derived from direct comparative analysis of microbial opsins. Nat Meth. 2012; 9:159–172.

Figure 1. Odor-Guided Intertemporal Choice Task

(A) Task structure. The subject initiates the task by sampling odors in the center odor port, then makes a choice by poking in the left or right reward port, waits the prescribed amount of time anywhere in the chamber and receives a water reward on the chosen side. (B) Each odor cue was used at six different intensities, corresponding to six different reward delay lengths. O1: odor 1 intensity; O2: odor 2 intensity; DL: delay of left reward; DR: delay of right reward. In this example trial, 70 mL/min of odor 1 is mixed with 10 mL/min of odor 2, indicating a delay of 14 seconds to the left reward and a delay of 2 seconds to the right reward. (C) Raw data from a single example subject performing the OGIC task with equal left and right reward sizes (925 trials, 5 sessions). RD0-20: trials in which right reward delay was 0–20 seconds. (D) Psychometric curves fitted to choice data in SL1 sessions (10 mice, 64 sessions, 10882 trials). (E) Psychometric curves fitted to choice data in SL2 sessions (10 mice, 79 sessions, 10269 trials). (F) The subjects chose the left option more often during SL2 sessions (paired t-test, t9 = 6.090, ***p<0.001). (G) The constant in the choice model was not changed during SL2 sessions (Wilcoxon signed-rank test, p>0.05). (H) The coefficient for left reward delay was not changed during SL2 sessions (Wilcoxon signed-rank test, p>0.05). (I) The coefficient for right reward delay was not changed during SL2 sessions (Wilcoxon signed-rank test, p>0.05). (J) The coefficient for the interaction term was significantly reduced during SL2 sessions (Wilcoxon signed-rank test, **p<0.01). Data are represented as mean ± SEM. See also Figure S1 and Table S1.

Figure 2. Optogenetic manipulations of DR serotonergic neurons

(A, F) Schematics for virus injection and optical implant. (B, G) Time points for light delivery in behavior. Center: center poke TTL; Side: side poke TTL. (C) Expression of AAV9-ef1α-DIO-eArch3.0-eYFP in the DR nucleus of a Sert-Cre mouse. Red: tryptophan hydroxylase; Green: Arch. Scale bar: 100 µm. (D) Psychometric curves for proportion of left choice in DR-Ctrl group subjected to green light (10 mice, 20473 trials, 132 sessions). Empty symbols and dotted line: light-off trials. Filled symbols and solid line: light-on trials. RD0-8: trials in which right reward delay was 0–8 seconds. (E) Psychometric curves for proportion of left choice in DR-Arch group (10 mice, 20785 trials, 137 sessions). (H) Expression of AAVrh8-CBA-DIO-ChR2-eYFP in the DR nucleus of a Sert-Cre mouse. Red: tryptophan hydroxylase; Green: ChR2. (I) Psychometric curves for proportion of left choice in DR-Ctrl group subjected to blue light (10 mice, 20712 trials, 140 sessions). (J) Psychometric curves for proportion of left choice in DR-ChR2 group (10 mice, 20455 trials, 147 sessions). (K) Green light reduced the proportion of left choice in DR-Arch group (post-hoc paired t-test, **p<0.01), but not in DR-Ctrl group (post-hoc paired t-test, p>0.05). (L) Blue light increased the proportion of left choice in DR-ChR2 group (post-hoc paired t-test, **p<0.01) but not in DR-Ctrl group (post-hoc paired t-test, p>0.05). (M) There was a positive interaction between green light and the delay interaction term in DR-Arch group (Wilcoxon signed-rank test, *p<0.05) whereas there was a negative interaction between blue light and the delay interaction term in DR-ChR2 group (Wilcoxon signed-rank test, *p<0.05). Data are represented as mean ± SEM. See also Figure S2, Table S2.

Figure 3. Optogenetic Manipulation of DR Serotonergic Projections in the NAc

(A) Schematic for viral injections. (B) ChR2-expressing serotonergic projections in a coronal section of nucleus accumbens. Scale bar: 200 µm. (C) Schematics for Arch and ChR2 virus injection and optical implant, respectively. (D) Psychometric curves fitted to raw data for DR-NAC-Ctrl group subjected to green light (10 mice, 20534 trials, 115 sessions), DR-NAC-Arch group (10 mice, 20329 trials, 142 sessions), DR-NAC-Ctrl group subjected to blue light (10 mice, 20606 trials, 130 sessions) and DR-NAC-ChR2 group (10 mice, 20506 trials, 150 sessions). Empty symbols and dotted line: light-off trials. Filled symbols and solid line: light-on trials. (E) Green light reduced the proportion of left choice in DR-NAC-Arch group (post-hoc paired t-test, **p<0.01) but not in DR-NAC-Ctrl group (post-hoc paired t-test, p>0.05). (F) Blue light increased the proportion of left choice in DR-NAC-ChR2 group (post-hoc paired t-test, **p<0.01) but not in DR-NAC-Ctrl group (post-hoc paired t-test, p>0.05). (G) There was a positive interaction between green light and the delay interaction term in DR-NAC-Arch group (Wilcoxon signed-rank test, *p<0.05) whereas there was a negative interaction between blue light and the delay interaction term in DR-NAC-ChR2 group (Wilcoxon signed-rank test, *p<0.05). Data are represented as mean ± SEM. See also Table S2.

Figure 4. Trade-off analysis of manipulation effects

(A – D) Heatmaps of predicted light manipulation effects (predicted differences between proportion left choice during light-on conditions and proportion left choice during light-off conditions). White solid line: perceptual boundary where odor 1 is at the same intensity as odor 2. Red dotted line: the valleys and peaks of the predicted manipulation effects. Colorbar: color-coding for levels of effect (unit: proportion of choice). (E – H) Light manipulation effects plotted against choice difficulty (the distance between average proportion of left choice and either 0 or 1, whichever is closest) and perceptual difficulty (difference between the intensity of the two odor cues normalized by total odor intensity possible). (I – L) Multiple regression coefficients for choice difficulty and perceptual difficulty accounting for the light effect. Choice difficulty negatively modulates green light effects in DR-Arch and DR-NAC-Arch groups. Choice difficulty positively modulates blue