Researchers Solve Longtime Puzzle About How We Learn

More than a century ago, Pavlov figured out that dogs fed after hearing a bell eventually began to salivate when they heard the ring. A Johns Hopkins University-led research team has now figured out a key aspect of why.

In the current issue of the journal Neuron, neuroscientist Alfredo Kirkwood settles a long-running debate in neurology: Precisely what happens in the brain when we learn? In other words, neurologically speaking, how did Pavlov’s dogs learn to associate a ringing bell with the delayed reward that followed? For decades, scientists have had a working theory, but Kirkwood’s team is now the first to prove it.

“If you’re trying to train a dog to sit, the initial neural stimulus, the command, is gone almost instantly; it lasts as long as the word ‘Sit,’” said Kirkwood, a professor with the university’s Zanvyl Krieger Mind/Brain Institute. “Before the reward comes, the dog’s brain has already turned to other things. The mystery was, ‘How does the brain link an action that’s over in a fraction of a second with a reward that doesn’t come until much later?’”

The working theory, which Kirkwood’s team has validated, is that invisible “eligibility traces” effectively tag the synapses activated by a stimulus, so that the change can be cemented as true learning when a reward eventually arrives.

In the case of a dog learning to sit, when the dog gets a treat or a reward, neuromodulators like dopamine flood the dog’s brain with “good feelings.” Though the brain has long since processed the “Sit” command, eligibility traces respond to the neuromodulators, prompting a lasting synaptic change: learning.

Image: Isolated cells in the visual cortex of a mouse. Credit: Alfredo Kirkwood (JHU).

The team was able to prove the theory by isolating cells in the visual cortex of a mouse. When they stimulated the axon of one cell with an electrical impulse, they sparked a response in another cell. By doing this repeatedly, they mimicked the synaptic response between two cells as they process a stimulus and create an eligibility trace. When the researchers later flooded the cells with neuromodulators, simulating the arrival of a delayed reward, the response between the cells strengthened or weakened, showing the cells had “learned” and were able to do so because of the eligibility trace.
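The logic of that experiment can be illustrated with a toy simulation. This is a minimal sketch and not the team's model: the exponential decay, the time constant `tau`, the learning rate, and the function name `simulate_trial` are all assumptions chosen for illustration, and the sketch shows only strengthening, not weakening.

```python
import math

def simulate_trial(reward_delay, tau=2.0, learning_rate=0.5, weight=1.0):
    """Toy eligibility-trace model (all parameter values are hypothetical).

    A brief stimulus at t = 0 sets a silent synaptic tag to 1.0. The tag
    decays exponentially with time constant `tau` (seconds). When a reward
    signal arrives after `reward_delay` seconds, the synaptic weight
    changes in proportion to whatever is left of the tag.
    """
    trace_at_reward = math.exp(-reward_delay / tau)
    return weight + learning_rate * trace_at_reward

# The stimulus alone never changes the weight; only the later reward
# converts the remaining trace into a lasting change.
for delay in (0.5, 2.0, 10.0):
    print(f"reward after {delay:>4} s -> weight {simulate_trial(delay):.3f}")
```

In this sketch, a reward that arrives soon after the stimulus produces a larger change than one that arrives after the trace has mostly faded, which is the sense in which the trace makes a synapse "eligible" for learning.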

“This is the basis of how we learn things through reward,” Kirkwood said, “a fundamental aspect of learning.”

Beyond deepening our understanding of the mechanics of learning, these findings could enhance teaching methods and lead to treatments for cognitive problems.

About this learning research

The research team included Johns Hopkins postdoctoral fellow Su Hong, Johns Hopkins graduate student Xiaoxiu Tie, and former Johns Hopkins research associate Kaiwen He, along with Marco Huertas and Harel Shouval, neurobiology researchers at the University of Texas at Houston, and Johannes W. Hell, a professor of pharmacology at the University of California, Davis.

Funding: The research was supported by the Science of Learning Institute and by National Institutes of Health grants R01MH093665 and R01EY012124.

Source: Jill Rosen – Johns Hopkins University
Image Credit: The image is credited to Alfredo Kirkwood (JHU)
Original Research: Abstract for “Distinct Eligibility Traces for LTP and LTD in Cortical Synapses” by Kaiwen He, Marco Huertas, Su Z. Hong, XiaoXiu Tie, Johannes W. Hell, Harel Shouval, and Alfredo Kirkwood in Neuron. Published online September 18 2015 doi:10.1016/j.neuron.2015.09.037

Abstract

Distinct Eligibility Traces for LTP and LTD in Cortical Synapses

Highlights
• Hebbian conditioning induces eligibility traces for LTP and LTD in cortical synapses
• β2ARs and 5-HT2CRs convert the traces into LTP and LTD, respectively
• Anchoring of β2ARs and 5-HT2C is key for trace conversion
• Temporal properties of the LTP/D traces allow reward-timing prediction

Summary
In reward-based learning, synaptic modifications depend on a brief stimulus and a temporally delayed reward, which poses the question of how synaptic activity patterns associate with a delayed reward. A theoretical solution to this so-called distal reward problem has been the notion of activity-generated “synaptic eligibility traces,” silent and transient synaptic tags that can be converted into long-term changes in synaptic strength by reward-linked neuromodulators. Here we report the first experimental demonstration of eligibility traces in cortical synapses. We demonstrate the Hebbian induction of distinct traces for LTP and LTD and their subsequent timing-dependent transformation into lasting changes by specific monoaminergic receptors anchored to postsynaptic proteins. Notably, the temporal properties of these transient traces allow stable learning in a recurrent neural network that accurately predicts the timing of the reward, further validating the induction and transformation of eligibility traces for LTP and LTD as a plausible synaptic substrate for reward-based learning.
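The idea of two distinct traces, each converted by its own receptor, can be sketched in a few lines of code. This is a hedged illustration only: the pairing of β2ARs with LTP and 5-HT2CRs with LTD comes from the Highlights, but the class `Synapse`, the decay constants in `TAU`, the `gain` value, and the exponential decay itself are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical decay time constants in seconds; the abstract gives no numbers.
TAU = {"LTP": 4.0, "LTD": 2.0}
# Receptor-to-trace pairing as stated in the Highlights.
RECEPTOR_TARGET = {"beta2AR": "LTP", "5-HT2CR": "LTD"}
SIGN = {"LTP": 1.0, "LTD": -1.0}

@dataclass
class Synapse:
    weight: float = 1.0
    tagged_at: Optional[float] = None  # time of the last Hebbian conditioning

    def condition(self, t):
        """Hebbian conditioning tags the synapse with both traces at time t."""
        self.tagged_at = t

    def trace(self, kind, t):
        """Remaining size of the LTP or LTD trace at time t (0 if untagged)."""
        if self.tagged_at is None or t < self.tagged_at:
            return 0.0
        return math.exp(-(t - self.tagged_at) / TAU[kind])

    def modulate(self, receptor, t, gain=0.3):
        """Activating a receptor converts its own trace into a weight change."""
        kind = RECEPTOR_TARGET[receptor]
        self.weight += SIGN[kind] * gain * self.trace(kind, t)
        return self.weight

potentiated = Synapse()
potentiated.condition(t=0.0)
potentiated.modulate("beta2AR", t=1.0)   # weight rises above 1.0

depressed = Synapse()
depressed.condition(t=0.0)
depressed.modulate("5-HT2CR", t=1.0)     # weight falls below 1.0
```

A synapse that was never conditioned carries no trace, so activating either receptor leaves its weight unchanged in this sketch.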
