CCCP19: Russian-French Symposium on Neuroeconomics

The HSE will host the first seminar on neuroeconomics, organized jointly with École normale supérieure (Paris). Neuroeconomics studies the decision-making process: it uses economic approaches to study the brain and findings from neuroscience to refine economic models of decision-making.

23 - 24 September 2019
Language: English
Location: Myasnitskaya ul. 20
09/23/2019: room 102
09/24/2019: room 124

Organizing committee: Vasilisa Skvortsova (ENS); Boris Gutkin (HSE / ENS), Anna Shestakova (HSE), Vasily Klucharev (HSE).

We are pleased to announce the 2019 edition of the CCCP conferences/symposia, focusing this year on neuroeconomics. The symposium is organized in collaboration between the Center for Cognition and Decision Making (CCDM), Higher School of Economics, and the Laboratory for Cognitive and Computational Neuroscience (LNC2) at the Institute for Cognitive Neuroscience, central campus of HSE.

Those interested in attending must register: REGISTRATION

Final program: pdf; abstracts (see below)

Preliminary program:

23/09, Myasnitskaya 20, room 102

Plenary talks

10:00 - 10:45: Valentin Wyart, ENS Paris. Human learning and decision-making under uncertainty: computational models, neural mechanisms, psychiatric dysfunctions
10:45 - 11:30: Vasily Klucharev, HSE Moscow. Neurodynamics and Cognitive Dissonance
11:30 - 12:00: Coffee break
12:00 - 12:45: Ksenia Panidi, HSE Moscow. Distinct roles of DLPFC and PPC in reward value and reward probability

PhD students' presentations

12:45 - 13:05: Oxana Zinchenko, HSE. Neural correlates of prosocial motivation and self-maximization
13:10 - 13:30: Mario Martinez Saito, HSE. Neural Underpinnings of Exploitation of Common Goods
13:30 - 13:50: Anush Ghambaryan, HSE. Neurocomputational model of human choices in uncertain and changing environments

14:00 - 15:30: Lunch
15:30 - 16:30: Visit to CCDM/IoCN HSE

24/09, Myasnitskaya 20, room 124

10:00 - 10:45: Stefano Palminteri, ENS Paris. The construction and deconstruction of irrational preferences through range-adapting reinforcement learning
10:45 - 11:30: Anna Shestakova, Alexei Gorin, HSE Moscow. Neural plasticity in the auditory sensory cortex elicited by monetary outcomes
11:30 - 12:00: Coffee break
12:00 - 12:45: Vasilisa Skvortsova, ENS Paris. Computational noise in reward-guided learning drives behavioral variability in volatile environments
12:45 - 13:30: Charles Findling, ENS Paris. Soft introduction to MCMC/SMC/particle filtering methods in modelling human behavior: from exact to noisy inference computations
13:30 - 14:30: Lunch

 

Abstracts

Valentin Wyart

Inserm, Ecole Normale Supérieure, PSL University, Paris, France

Human learning and decision-making under uncertainty: computational models, neural mechanisms, psychiatric dysfunctions
  
Efficient learning and decision-making in uncertain environments constitute a difficult challenge for human and machine intelligence. Even for the simplest binary decisions, they require inferring properties of one’s environment on the basis of imperfect evidence. This inference process has been studied across a wide range of vastly different paradigms, from categorizing ambiguous stimuli (perceptual decisions) to choosing among stochastic reward sources (reward-guided decisions). In this talk, I will present the research framework developed in my group for characterizing the similarities and differences between perceptual and reward-guided decisions – something which had previously proven difficult for both theoretical and experimental reasons. First, I will show how the accuracy of perceptual and reward-guided decisions is bounded neither by sensory variability nor by choice stochasticity, but by the limited precision of inference. Neuroimaging, pupillometric and causal pharmacological evidence suggests an important, yet previously unsuspected, role for the noradrenergic system in inference precision. Then, I will explain how perceptual and reward-guided decisions diverge not only in their input, but also in the degree of control conferred to decision-makers in the sampling of uncertain environments. To illustrate the neurobiological relevance of this difference, I will show that interacting with uncertain environments (a distinctive feature of reward-guided decisions) stabilizes inference by shaping relational codes in the medial temporal lobe. I will end my talk by stressing the fragility of inference in psychiatric diseases: the emergence of circular reasoning in ketamine models of psychosis, and the impaired learning from conflicting action outcomes in obsessive-compulsive disorder.

Stefano Palminteri

 Inserm, Ecole Normale Supérieure, PSL University, Paris, France

 The construction and deconstruction of irrational preferences through range-adapting reinforcement learning

A wealth of evidence in behavioral economics and affective neuroscience suggests that option values are highly dependent on the context in which the options are presented. Building on an analogy with perceptual psychophysics and neuroscience, option valuation seems to be affected by both the spatial (i.e., what is the value of the simultaneously presented options?) and temporal (i.e., which options were presented in the recent past?) contexts. In a series of recent papers, we demonstrated that contextual adjustments also occur in reinforcement learning. However, the exact algorithmic implementation of context-dependence, and how this process is affected by modulating feedback information, remain unclear. To fill these gaps, we implemented 4 new variants of an instrumental learning task where we orthogonally manipulated outcome magnitude and feedback information, resulting in systematic variations in reward ranges. In a first phase of the task (learning test), participants had to determine by trial and error the most favorable option (in terms of received points) in 4 fixed pairs of options. In a second phase (transfer test), the original pairs were remodeled to investigate the choice preference between options extrapolated from their original context. We ran 5 experiments (one in the laboratory, N=40; and four online, N=400). In all experiments, subjects learned above chance level. Of note, lab results were qualitatively well replicated in the corresponding online experiment. We replicate results found in previous studies indicating partial range adaptation in the learning test and context-induced suboptimal preferences in the transfer test. We found that increasing feedback information (by showing both the obtained and the forgone outcome: complete feedback) in the learning test increases the context-induced suboptimal preferences, as measured at the first trial of the transfer test, compared to the partial feedback case (showing the obtained outcome only).
Further analysis of trial-by-trial dynamics during the transfer test showed that, while complete feedback redresses suboptimal preferences, partial feedback does the opposite. To complement the choice rate analysis, we developed a computational model that implements normalization by tracking the range of each decision context and adapting the perceived reward accordingly. Model simulations show that this model best explains subjects’ behavior, capturing both the partial adaptation during the learning test and the context-induced suboptimal preferences in the transfer test. Model comparison indicates that the new RANGE model performs better than a simple Q-learning model and a previously proposed descriptive model featuring normalization as a weighted average of absolute and relative outcomes. Finally, we analyzed the payoff of the two models and found that subjects would have been better off (financially speaking) if they had behaved as Q-learners (instead of following the RANGE model). To conclude, we provide definitive evidence of context-dependent reinforcement learning in humans and concomitantly propose a more satisfactory computational model to explain these processes. Between-task comparison indicates that increasing feedback information has somewhat counter-intuitive results, since it decreases optimization in the transfer test. Strikingly, we demonstrate a clear instance of irrational economic behavior arising from too much (instead of not enough) computation.
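The idea of range normalization described above can be illustrated with a toy simulation. The sketch below is not the authors' RANGE model: it assumes deterministic outcomes, complete feedback, a simple delta-rule update and a min-max rescaling by the running range of each context, and the function name `learn_values` and all option names are hypothetical.

```python
def learn_values(contexts, range_adapt, alpha=0.5, n_trials=50):
    """Learn option values with a delta rule under complete feedback.

    contexts: list of dicts mapping option name -> deterministic outcome (assumption).
    range_adapt: if True, rescale each outcome by the running range of its context.
    """
    q = {}
    for options in contexts:
        lo, hi = float("inf"), float("-inf")  # running outcome range of this context
        for name in options:
            q[name] = 0.0
        for _ in range(n_trials):
            # Complete feedback: every outcome in the context is observed on each trial.
            lo = min(lo, *options.values())
            hi = max(hi, *options.values())
            for name, outcome in options.items():
                if range_adapt and hi > lo:
                    reward = (outcome - lo) / (hi - lo)  # min-max normalization (assumption)
                else:
                    reward = outcome
                q[name] += alpha * (reward - q[name])
    return q


# Learning test: one high-magnitude and one low-magnitude context (hypothetical values).
contexts = [
    {"big_high": 10.0, "big_low": 5.0},
    {"small_high": 1.0, "small_low": 0.5},
]
q_abs = learn_values(contexts, range_adapt=False)
q_rng = learn_values(contexts, range_adapt=True)

# Transfer test: worse option of the big context vs. better option of the small context.
prefers_abs = max(["big_low", "small_high"], key=q_abs.get)
prefers_rng = max(["big_low", "small_high"], key=q_rng.get)
print(prefers_abs, prefers_rng)
```

In this toy setting the absolute-value learner prefers `big_low` (worth 5 points), whereas the range-adapting learner prefers `small_high` (worth 1 point), which is the kind of context-induced suboptimal preference the transfer test is designed to reveal.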


Vasily Klucharev

 Professor, Director, Institute for Cognitive Neuroscience, National Research University Higher School of Economics, Russian Federation

  Neurodynamics and Cognitive Dissonance

 Although cognitive dissonance was first suggested more than half a century ago, this phenomenon has only recently moved into the field of neuroscience.

Here, we report the results of a series of studies that further explored the neural mechanisms of cognitive dissonance. We extended this area of research by introducing a number of novel approaches. First, we explored the neural foundation of cognitive dissonance during the decisional stage of the free-choice paradigm, whereas other studies focused on the post-decisional stages (Izuma et al 2010, Mengarelli 2015). Second, we integrated evidence from resting-state cortical activity and evoked potentials, thus framing the cognitive dissonance phenomenon within the performance monitoring mechanisms. Finally, we employed non-invasive brain modulatory techniques such as transcranial Direct Current Stimulation (tDCS) to modulate the pMFC and thus demonstrate the early role of this area in the generation and reduction of cognitive dissonance. Taken together, our results support our initial hypothesis that cognitive dissonance shares neural structures and dynamics with performance monitoring mechanisms. This activity is reflected in both resting-state and choice-related neural activation of the prefrontal cortex as a part of the general performance-monitoring circuitry. We also demonstrated the causal role of the pMFC in cognitive dissonance generation and the subsequent behavioral adjustment.

 

Ksenia Panidi

 Assistant Professor, Department of Economics – National Research University Higher School of Economics

 Distinct roles of DLPFC and PPC in reward value and reward probability

Recent economic theories of choice under risk postulate that the observed risk-taking behavior in the monetary domain may result from the value of money (i.e. how much a person values one additional dollar, or decreasing marginal utility of money) as well as from a specific perception of probabilities (i.e. probability weighting). However, existing neuroeconomic studies of risk taking usually focus on the degree of observed risk taking per se, without disentangling its individual components. Recent neuroeconomic research suggests that the dorsolateral prefrontal cortex (DLPFC) and the posterior parietal cortex (PPC) may both play a role in risky decision-making. In two separate experiments, we employ transcranial magnetic stimulation to explore the effects of decreased DLPFC and PPC excitability on distinct components of risky choice. We employed a within-subject design with cTBS stimulation and a random lottery pair design to parametrically estimate components of risk preferences. The results suggest that the DLPFC and PPC may play distinct roles in reward value and reward probability perception. In particular, we observe that a decrease in DLPFC excitability leads to a decrease in the coefficient of risk aversion (increased marginal utility of money), while no change in probability perception is observed. At the same time, a decrease in PPC excitability leads to more linear (less distorted) probability weighting, while it has no effect on reward value.
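The two components of risky choice mentioned above can be made concrete with a small sketch. The abstract does not state which functional forms were estimated, so the power utility with risk-aversion coefficient `r` and the one-parameter Prelec weighting function with parameter `gamma` used here are assumptions chosen for illustration, and all function names are hypothetical.

```python
import math


def utility(amount, r):
    """Power utility x**(1 - r) for r < 1 (assumed form); r = 0 is risk neutral."""
    return amount ** (1.0 - r)


def weight(p, gamma):
    """Prelec one-parameter probability weighting (assumed form); gamma = 1 is linear."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** gamma))


def lottery_value(amount, p, r, gamma):
    """Subjective value of winning `amount` with probability p, and nothing otherwise."""
    return weight(p, gamma) * utility(amount, r)


# With gamma = 1 probabilities are not distorted; with gamma < 1 small
# probabilities are overweighted and large ones underweighted.
for gamma in (1.0, 0.5):
    print(gamma, round(weight(0.1, gamma), 3), round(weight(0.9, gamma), 3))
```

Under these assumptions, a lower `r` corresponds to a lower coefficient of risk aversion, and a `gamma` closer to 1 corresponds to more linear probability weighting, so the two reported effects would map onto two separate parameters.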

 Anna Shestakova

PhD, researcher, Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, National Research University Higher School of Economics, Russian Federation

  Neural plasticity in the auditory sensory cortex elicited by monetary outcomes

Traditional decision-making theory assumes that individuals’ choices are driven by values associated with prospective outcomes. Interestingly, popular neurobiological models of decision-making acknowledge the key role of learning in reward-based decisions but indirectly assume that the primary sensory inputs to dopaminergic (decision-making) networks are stationary and independent from previous decisions. However, many cognitive studies have demonstrated experience-induced plasticity in the primary sensory cortices. This suggests that repeated decisions could modulate sensory processing, which, in turn, can modulate follow-up decisions. In a series of studies, we tested the hypothesis that repeated associations of a stimulus with a monetary outcome may evoke plasticity in sensory processing. Furthermore, we tested the link between the neural activity underlying value-based learning and plastic changes in the sensory cortices. The plasticity of auditory processing is often reflected in the mismatch negativity (MMN) component of auditory ERPs. The MMN is an electrophysiological signature of a pre-attentive process that detects alterations in a regular sound sequence. Recent evidence suggests that, in addition to deviance detection, the MMN can also be implicated in predictive coding mechanisms. To study sensory plasticity, we presented incentive cues as deviants during oddball sessions recorded before and after training in the two MID task sessions. For gains, we found that after a two-day training in the MID task, regardless of their magnitude and probability, incentive cues evoked a larger P3a, indicating the enhancement of involuntary attention to stimuli that predict rewards. At the individual level, the training-induced change of mismatch-related negativity was correlated with the amplitude of the feedback-related negativity (FRN) recorded during the first MID task session. No expected value (EV)-related changes of the MMN were revealed.
For losses, we found that two sessions of the MID task evoked a significant enhancement of the MMN for incentive cues predicting larger monetary losses, specifically when monetary cue discrimination was essential for maximizing monetary outcomes.

Our results suggest that the MID task evokes plastic changes in the auditory system associated with better passive discrimination of incentive cues and with enhanced involuntary attention switching towards the cues. Moreover, the task-induced sensory plasticity correlated with the learning-related neural activity recorded during the MID task. Our results confirm that the sensory processing of incentive cues is dynamically modulated by previous monetary outcomes and that this modulation is reward-specific. The next logical step would be to use active neuroimaging approaches such as brain stimulation in order to establish a causal relationship between the medial prefrontal function mediating performance monitoring, as signaled by the FRN, and sensory memory, as indexed by the MMN.

 
Charles Findling

Post-doctoral fellow, Ecole Normale Supérieure, PSL University, Paris, France

  Soft introduction to MCMC/SMC/particle filtering methods in modelling human behavior: from exact to noisy inference computations.    

In this tutorial, I will give you the intuitions behind statistical methods such as Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC), which are now gaining attention in the neuroscientific and psychological communities. First, these techniques give access to estimates of the marginal likelihoods of your models, which are valuable when dealing with model comparisons. Second, they make it possible to model behavioral quantities such as precision/learning noise in the inference process of subjects and to explore their properties.
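To give a flavor of how an SMC method yields a marginal likelihood estimate, here is a minimal bootstrap particle filter. It is only a sketch under stated assumptions, not material from the tutorial: the latent state follows a Gaussian random walk, observations are the state plus Gaussian noise, and the function names and parameter values are hypothetical.

```python
import math
import random


def gauss_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def particle_filter_loglik(observations, n_particles=500, drift_sd=0.1,
                           obs_sd=0.5, seed=0):
    """Bootstrap particle filter estimate of the log marginal likelihood."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # prior samples
    loglik = 0.0
    for y in observations:
        # 1. Propagate each particle through the (assumed) random-walk dynamics.
        particles = [p + rng.gauss(0.0, drift_sd) for p in particles]
        # 2. Weight particles by the likelihood of the current observation.
        logw = [gauss_logpdf(y, p, obs_sd) for p in particles]
        top = max(logw)  # subtract the max for numerical stability
        w = [math.exp(lw - top) for lw in logw]
        # 3. The mean weight estimates p(y_t | y_1..t-1); accumulate its log.
        loglik += top + math.log(sum(w) / n_particles)
        # 4. Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


near = particle_filter_loglik([0.1, -0.2, 0.05, 0.0, 0.15])
far = particle_filter_loglik([10.0, 10.0, 10.0, 10.0, 10.0])
print(near > far)
```

Data that are plausible under the model receive a higher estimated log marginal likelihood than data far from the prior, which is the quantity one would compare across candidate models.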

 

Vasilisa Skvortsova

Post-doctoral fellow, Ecole Normale Supérieure, PSL University, Paris, France

 Computational noise in reward-guided learning drives behavioral variability in volatile environments

When learning the value of actions in volatile environments, humans often make seemingly irrational decisions which fail to maximize expected value. Prominent theories describe these ‘non-greedy’ decisions as the result of a compromise between choosing a currently well-valued action vs. exploring more uncertain, possibly better-valued actions – known as the ‘exploration-exploitation’ trade-off. However, we have recently shown that the accuracy of human decisions based on multiple ambiguous cues is bounded not by variability in the choice process, but rather by computational noise arising from the underlying inference process. We thus reasoned that non-greedy decisions may be caused to some extent by the same kind of noise, in this case variability in the learning of action values. We derived a theoretical formulation of reinforcement learning (RL) which allows for random noise in its core computations. In a series of behavioral, functional neuroimaging and pupillometric experiments, tested over a total of 90 human participants in a canonical restless bandit task, we quantified the fraction of non-greedy decisions driven by learning noise and identified its neurophysiological substrates. At the behavioral level, we show that more than half of non-greedy decisions are triggered by learning noise. By describing the consistency of human decisions across repetitions of the same sequence of rewards in terms of a ‘bias-variance’ trade-off, we rule out the possibility that learning noise is due to a misspecification of our RL framework. At the neurophysiological level, the trial-to-trial variability of sequential learning steps and their impact on behavior could be predicted both by BOLD responses in the dorsal anterior cingulate cortex (dACC) and by phasic pupillary dilation – suggestive of neuromodulatory fluctuations driven by the locus coeruleus-norepinephrine (LC-NE) system. 
Together, these findings indicate that most of the behavioral variability observed in volatile environments is due to the limited computational precision of reward-guided learning.
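The notion that noise in value updates alone can produce apparently non-greedy choices can be sketched in a few lines. This is not the authors' formulation: it assumes a two-armed bandit with fixed reward probabilities (rather than a restless bandit), a purely greedy choice rule, and update noise whose standard deviation scales with the size of the prediction error; `run_noisy_learner` and its parameters are hypothetical.

```python
import random


def run_noisy_learner(reward_probs, n_trials=200, alpha=0.3,
                      noise_scale=0.0, seed=0):
    """Greedy two-armed bandit learner with noisy value updates.

    Returns the fraction of trials on which the choice differs from the one a
    noise-free learner would make given the same choice/outcome history.
    """
    rng = random.Random(seed)
    q_noisy = [0.5, 0.5]  # values actually used to choose
    q_exact = [0.5, 0.5]  # noise-free values fed the same history
    deviations = 0
    for _ in range(n_trials):
        choice = 0 if q_noisy[0] >= q_noisy[1] else 1
        exact_choice = 0 if q_exact[0] >= q_exact[1] else 1
        deviations += choice != exact_choice
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        # Noise-free delta-rule update.
        q_exact[choice] += alpha * (reward - q_exact[choice])
        # Noisy update: noise s.d. proportional to |prediction error| (assumption).
        delta = reward - q_noisy[choice]
        noise = rng.gauss(0.0, noise_scale * abs(delta)) if noise_scale > 0 else 0.0
        q_noisy[choice] += alpha * delta + noise
    return deviations / n_trials


print(run_noisy_learner([0.7, 0.3], noise_scale=0.0))
print(run_noisy_learner([0.7, 0.3], noise_scale=0.5))
```

With `noise_scale=0.0` the two learners coincide and the fraction is 0.0; any deviations observed with a positive `noise_scale` arise purely from learning noise, since the choice rule itself contains no randomness.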

 

Mario Martinez Saito

PhD student, Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, National Research University Higher School of Economics, Russian Federation

  Neural Underpinnings of Exploitation of Common Goods

Why do people often exhaust unregulated common natural resources but successfully sustain similar private resources? To answer this question, the present work combines neurobiological, economic, and cognitive modeling approaches. Using functional magnetic resonance imaging, we showed that sharp depletion of a common (shared) and a private resource deactivated the ventral striatum, which is involved in the valuation of outcomes. Across individuals, the observed inhibition of the ventral striatum negatively correlated with attempts to preserve the common resource, but the opposite pattern was observed when individuals dealt with their private resource. The results indicate that the basic neural value signals differentially modulate people's behavior in response to the depletion of common versus private resources. The computational modeling of the results suggests that the overharvesting of common resources is facilitated by social comparison. Overall, the results could explain some aspects of people’s tendency to overexploit unregulated common natural resources.

 

Anush Ghambaryan

PhD student, Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, National Research University Higher School of Economics, Russian Federation

  Neurocomputational model of human choices in uncertain and changing environments

The behavioral economics approach suggests that, to derive choices in value-based decision-making, humans integrate computational components (utilities and reward probabilities) according to expected utility theory (Von Neumann & Morgenstern, 1947), but make computations based on a distorted representation of these components. This explains the systematic deviation of human choices from optimality (Kahneman & Tversky, 1979) but may yield models that are not indicative of the underlying neurocomputational mechanisms. An alternative neuroeconomics view (model MIX) was recently proposed by Rouault, Drugowitsch, and Koechlin (2019). The model MIX suggests that the decision variable (log-odds of two options) varies as the sum of the difference between the normalized utilities of the two options (derived through reinforcement learning and value updates) and the difference between their state beliefs (derived through multi-stage reward probability updates), controlled by a weighting parameter responsible for variations in the level of reliance on state beliefs rather than on normalized utilities. The experimental task used in the study is a one-armed bandit task, where each bandit proposes varying rewards with uncertain probability. The experiment was designed such that participants could learn reward probabilities over trials but had to adjust as these probabilities changed throughout the experiment.

Normalization of utilities in the model has not been tested experimentally. We replicated the original study in low and high reward parity conditions and estimated the MIX model parameters for those conditions separately. The parameter weighting normalized utilities and state beliefs did not differ between the two conditions, but risky choice frequencies differed significantly. Here, risky choice frequency is the percentage of choices of options with a relatively high normalized reinforcement learning value in trials where such options are characterized by a relatively low state belief. The obtained result suggests that the newly developed model MIX fits human choices equally well in both experimental conditions, but none of its parameters, on average, indicates differences in risky choices arising from the difference in the absolute level of proposed rewards.

 

Oksana Zinchenko

PhD student, Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, National Research University Higher School of Economics, Russian Federation

 Neural correlates of prosocial motivation and self-maximization

  Many neuroeconomics studies have suggested that social norms play a crucial role in economic decision-making (Fehr and Fischbacher, 2004; Melis and Semmann, 2010). Previous neuroimaging studies have suggested that activity in the dorsolateral prefrontal cortex (DLPFC) during the dictator game, and other games, is associated with the control of selfish impulses (Wout et al., 2005; Knoch et al., 2006; Knoch and Fehr, 2007; Knoch et al., 2009) and the ability to strategically adapt behavior (Spitzer et al., 2007; Ruff et al., 2013). Knoch et al. (2006) showed that inhibitory repetitive transcranial magnetic stimulation (rTMS) of the rDLPFC led to a stronger maximization of the budget in the ultimatum game. On the other hand, inhibitory rTMS of the rDLPFC led to more generous behavior in the dictator game (Christov-Moore et al., 2017). The present study aims to address inconsistencies in the findings regarding rDLPFC involvement in control of particular motivations during decision-making.

We hypothesized (Hypothesis I) that inhibitory cTBS of the rDLPFC could increase voluntary transfers in both the (a) dictator and (b) generosity games. Alternatively (alternative Hypothesis II) inhibitory cTBS of the rDLPFC will liberate selfish motives and, consequently, (a) decrease voluntary transfers in the dictator game, while (b) not affecting voluntary transfers in the generosity game, in which the dictator’s budget is fixed. Our preliminary data partially support Hypothesis II, suggesting that the DLPFC plays a crucial role in controlling selfish motives and strategic social behavior.