Separate Brain Systems Process the Consequences of Our Decisions

To avoid repeating the same mistakes and learn to make good choices, our brain needs to correctly evaluate the consequences of our decisions.

Exactly how the brain implements this learning process during value-based decision making, however, remains unclear.

A study from a team of neuroscientists at the Institute of Neuroscience and Psychology at the University of Glasgow has shed some light.

Imagine picking wild berries in a forest when suddenly a swarm of bees flies out from behind a bush. In a split second, your motor system has already reacted to flee the swarm. This automatic response – acting before thinking – constitutes a powerful survival mechanism to avoid imminent danger.

In turn, a separate, more deliberate process of learning to avoid similar situations in the future will also occur, rendering future berry-picking attempts unappealing. This more deliberate, “thinking” process will assist in re-evaluating an outcome and adjusting how rewarding similar choices will be in the future.

“To date the biological validity and neural underpinnings of these separate value systems remain unclear,” said Dr Marios Philiastides, who led the work published in the journal Nature Communications.

To understand the neuronal basis of these systems, Dr Philiastides’ team devised a novel, state-of-the-art brain imaging procedure.

Specifically, volunteers were connected to an EEG machine (to measure the brain’s electrical activity) while simultaneously being scanned in an MRI machine.

An EEG machine records brain activity with high temporal precision (“when” things are happening in the brain) while functional MRI provides information on the location of this activity (“where” things are happening in the brain). To date, “when” and “where” questions have largely been studied separately, using each technique in isolation.

Dr Philiastides’ lab is among the pioneering groups that have successfully combined the two techniques to answer both questions simultaneously.

The ability to use EEG, which detects tiny electrical signals on the scalp, in an MRI machine, which generates large electromagnetic interference, hinges largely on the team’s ability to remove the ‘noise’ produced by the scanner.

During these measurements participants were shown a series of pairs of symbols and asked to choose the one they believed was more profitable (the one which earned them more points).

They performed this task through trial and error, using the outcome of each choice as a learning signal to guide later decisions. Picking the correct symbol rewarded them with points and increased the sum of money paid to them for taking part in the study, whereas picking the other symbol did not.

To make the learning process more challenging and to keep participants engaged with the task, even the correct symbol incurred a penalty on 30% of occasions.
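For readers who want to see how such a task behaves, here is a minimal Python sketch of one simulated participant. It is an illustration only, not the study's analysis or the authors' model. The 30% penalty rate comes from the description above; everything else is an assumption made for the example: the symbol labels "A" and "B", the +1/-1 scoring, the rule that the other symbol never pays out, the learning rate, the exploration rate, and the simple "nudge the value toward the last outcome" learner.

```python
import random

SYMBOLS = ("A", "B")  # hypothetical labels for the two symbols in a pair


def outcome(choice, correct, penalty_prob, rng):
    """Return +1 for points or -1 for a penalty on a single trial."""
    if choice != correct:
        return -1  # assumption: the other symbol never pays out
    # Even the correct symbol is penalised on a fraction of trials.
    return -1 if rng.random() < penalty_prob else 1


def simulate(n_trials=200, correct="A", penalty_prob=0.3,
             learning_rate=0.1, explore_prob=0.1, seed=0):
    """Simulate one participant; return (choices, learned values)."""
    rng = random.Random(seed)
    values = {s: 0.0 for s in SYMBOLS}  # start with no preference
    choices = []
    for _ in range(n_trials):
        if rng.random() < explore_prob:
            choice = rng.choice(SYMBOLS)  # occasional random exploration
        else:
            # Otherwise pick the symbol that has looked best so far.
            choice = max(SYMBOLS, key=lambda s: values[s])
        result = outcome(choice, correct, penalty_prob, rng)
        # Nudge the chosen symbol's value toward the outcome just observed.
        values[choice] += learning_rate * (result - values[choice])
        choices.append(choice)
    return choices, values


if __name__ == "__main__":
    choices, values = simulate()
    share = choices.count("A") / len(choices)
    print(f"chose the correct symbol on {share:.0%} of trials")
    print({s: round(v, 2) for s, v in values.items()})
```

Raising `penalty_prob` makes the correct symbol harder to identify, which is the sense in which the occasional penalty makes learning more challenging.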

The results showed two separate (in time and space) but interacting value systems associated with reward-guided learning in the human brain.

Image: a man wearing an EEG cap. Image is for illustrative purposes only. Credit: Chris Hope.

The data suggest that an early system responds only to negative outcomes, initiating a fast, automatic alertness response. Only after this initial response does a slower system take over to promote either avoidance or approach learning, following negative and positive outcomes, respectively.

Critically, when negative outcomes occur, the early system down-regulates the late system so that the brain can learn to avoid repeating the same mistake and to readjust how rewarding similar choices would “feel” in the future.

The presence of these separate value systems suggests that different neurotransmitter pathways might modulate each system and facilitate their interaction, said Elsa Fouragnan, the first author of the paper.

Dr Philiastides added: “Our research opens up new avenues for the investigation of the neural system underlying normal as well as maladaptive decision making in humans.” Crucially, the findings have the potential to offer an improved understanding of how everyday responses to rewarding or stressful events can affect our capacity to make optimal decisions. In addition, the work can facilitate the study of how mental disorders associated with impairments in engaging with aversive outcomes (such as chronic stress, obsessive-compulsive disorder, post-traumatic stress disorder and depression) affect learning and strategic planning.

About this neuroscience research

Source: Stuart Forsyth – University of Glasgow
Image Credit: The image is credited to Chris Hope and is licensed CC BY 2.0
Original Research: Full open access research for “Two spatiotemporally distinct value systems shape reward-based learning in the human brain” by Elsa Fouragnan, Chris Retzler, Karen Mullinger and Marios G. Philiastides in Nature Communications. Published online September 8 2015 doi:10.1038/ncomms9107


Abstract

Two spatiotemporally distinct value systems shape reward-based learning in the human brain

Avoiding repeated mistakes and learning to reinforce rewarding decisions is critical for human survival and adaptive actions. Yet, the neural underpinnings of the value systems that encode different decision-outcomes remain elusive. Here coupling single-trial electroencephalography with simultaneously acquired functional magnetic resonance imaging, we uncover the spatiotemporal dynamics of two separate but interacting value systems encoding decision-outcomes. Consistent with a role in regulating alertness and switching behaviours, an early system is activated only by negative outcomes and engages arousal-related and motor-preparatory brain structures. Consistent with a role in reward-based learning, a later system differentially suppresses or activates regions of the human reward network in response to negative and positive outcomes, respectively. Following negative outcomes, the early system interacts and downregulates the late system, through a thalamic interaction with the ventral striatum. Critically, the strength of this coupling predicts participants’ switching behaviour and avoidance learning, directly implicating the thalamostriatal pathway in reward-based learning.

