Summary: Positive reinforcement by reward is important for learning-based behaviors, but it masks the expression of underlying knowledge and true animal intelligence.
Source: Johns Hopkins University
Rewards are necessary for learning, but may actually mask true knowledge, finds a new Johns Hopkins University study with rodents and ferrets.
The findings, published May 14 in Nature Communications, show a distinction between knowledge and performance, and provide insight into how environment can affect the two.
“Most learning research focuses on how humans and other animals learn ‘content’ or knowledge. Here, we suggest that there are two parallel learning processes: one for content and one for context, or environment. If we can separate how these two pathways work, perhaps we can find ways to improve performance,” says Kishore Kuchibhotla, an assistant professor in The Johns Hopkins University’s department of psychological and brain sciences and the study’s lead author.
After only a few days of training, mice performed the tone task at chance levels (~50%) in the presence of the lick tube, licking to both the target and foil tones without discriminating. Credit: JHU.
While researchers have known that the presence of reinforcement, or reward, can change how animals behave, it’s been unclear exactly how rewards affect learning versus performance.
An example of the difference between learning and performance, Kuchibhotla explains, is the difference between a student studying and knowing the answers at home, and a student demonstrating that knowledge on a test at school.
On the same early day of learning, researchers played the same ‘target’ and ‘foil’ tones without a lick tube present. Surprisingly, mice licked to the target tone and not to the foil tone with greater than 90% accuracy. Credit: JHU.
“What we know at any given time can be different than what we show; the ability to access that knowledge in the right environment is what we’re interested in,” he says.
To investigate what animals know in hopes of better understanding learning, Kuchibhotla and the research team trained mice, rats and ferrets on a series of tasks, and measured how accurately they performed the tasks with and without rewards.
Researchers trained mice to lick a tube for a reward during a ‘target’ tone and not to lick during a ‘foil’ tone. After two weeks of training, expert mice performed at high accuracy in the presence of the lick tube. Credit: JHU.
For the first experiment, the team trained mice to lick for water through a lick tube after hearing one tone, and to not lick after hearing a different, unrewarded tone. It takes mice two weeks to learn this in the presence of the water reward. At a time point early in learning, around days 3-5, the mice performed the task at chance levels (about 50%) when the lick tube/reward was present. When the team removed the lick tube entirely on these early days, however, the mice performed the task at more than 90% accuracy. The mice, therefore, seemed to understand the task many days before they expressed knowledge in the presence of a reward.
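To see why licking to both tones amounts to chance performance, the accuracy measure can be worked through in a few lines of Python. This is a minimal sketch under stated assumptions: it assumes accuracy is the fraction of trials with the correct response and that target and foil trials are equally frequent, and the trial records are hypothetical, not data from the study.

```python
def accuracy(trials):
    """Fraction of trials with the correct response.

    Each trial is a (tone, licked) pair. Licking is correct on 'target'
    trials and withholding a lick is correct on 'foil' trials.
    """
    correct = sum((tone == "target") == licked for tone, licked in trials)
    return correct / len(trials)


# Hypothetical session: the animal licks on every trial, target and foil alike.
indiscriminate = [("target", True)] * 10 + [("foil", True)] * 10

# Hypothetical session: the animal licks on every target and on 1 of 10 foils.
discriminating = (
    [("target", True)] * 10 + [("foil", True)] * 1 + [("foil", False)] * 9
)

print(accuracy(indiscriminate))  # 0.5
print(accuracy(discriminating))  # 0.95
```

An animal that licks on every trial gets every target trial right and every foil trial wrong, which is why indiscriminate licking scores about 50% however much the animal knows.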
To confirm this finding with other tasks and animals, the team also:
had mice press a lever for water when they heard a certain tone;
prompted rats to look for food in a cup if they heard a tone, but not if a light appeared before the tone;
had rats press a lever for sugar water when a light was presented before a tone;
had rats push a lever for sugar water when they heard a certain tone; and
prompted ferrets to differentiate between two different sounds for water.
In all experiments, the animals performed better when rewards weren’t available.
“Rewards, it seems, help improve learning incrementally, but can mask the knowledge animals have actually attained, particularly early in learning,” says Kuchibhotla. Furthermore, the finding that all animals’ performance improved across the board without rewards suggests that variability in learning rates may be due to differences in the animals’ sensitivity to reward context rather than differences in intelligence.
The dissociation between learning and performance, the researchers suggest, may someday help us isolate the root causes of poor performance. While the study involved only rodents and ferrets, Kuchibhotla says it may be possible to someday help animals and humans alike better access content when they need it if the right mechanisms within the brain can be identified and manipulated.
For humans, this could help those with Alzheimer’s disease maintain lucidity for longer periods of time, and it could improve testing environments for schoolchildren.
Funding: This study was funded by the National Institute on Deafness and Other Communication Disorders (DC009635, DC012557, DC05014), a Hirsch/Weill-Caulier Career Award, a Howard Hughes Medical Institute Faculty Scholarship, the Programme Emergences of City of Paris, Agence Nationale de la Recherche (ANR-17-ERC2-0005, ANR-16-CE37-0016), program “Investissements d’Avenir” (ANR-10-LABX-0087 IEC, ANR-11-IDEX-0001-02), PSL Research University, and the National Institutes of Health training program in computational neuroscience (R90DA043849).
Other authors on this study include Sarah Elnozahy, Kelly Fogelson and Peter C. Holland of The Johns Hopkins University; Tom Hindmarsh Sten, Eleni S. Papadoyannis and Robert C. Froemke of New York University School of Medicine; and Rupesh Kumar, Yves Boubenec and Srdjan Ostojic of École Normale Supérieure-PSL Research University.
About this neuroscience research article
Source: Johns Hopkins University
Media Contacts: Chanapa Tantibanchachai – Johns Hopkins University
Image Source: The image is in the public domain.
Dissociating task acquisition from expression during learning reveals latent knowledge
Performance on cognitive tasks during learning is used to measure knowledge, yet it remains controversial since such testing is susceptible to contextual factors. To what extent does performance during learning depend on the testing context, rather than underlying knowledge? We trained mice, rats and ferrets on a range of tasks to examine how testing context impacts the acquisition of knowledge versus its expression. We interleaved reinforced trials with probe trials in which we omitted reinforcement. Across tasks, each animal species performed remarkably better in probe trials during learning, and inter-animal variability was strikingly reduced. Reinforcement feedback is thus critical for learning-related behavioral improvements but, paradoxically, masks the expression of underlying knowledge. We capture these results with a network model in which learning occurs during reinforced trials while context modulates only the read-out parameters. Probing learning by omitting reinforcement thus uncovers latent knowledge and identifies context, not “smartness,” as the major source of individual variability.
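The idea that context changes only the read-out, and not what has been learned, can be illustrated with a toy calculation. The sketch below is not the authors' network model and does not simulate learning. It assumes fixed, hypothetical learned weights, a sigmoid read-out, and a made-up context bias that raises the overall tendency to lick when reinforcement is available. Every name and number in it is an illustrative assumption.

```python
import math


def lick_probability(weights, tone, context_bias, gain=4.0):
    """Sigmoid read-out of the learned drive plus a context-dependent bias."""
    drive = weights[tone]
    return 1.0 / (1.0 + math.exp(-(gain * drive + context_bias)))


def expected_accuracy(weights, context_bias):
    """Average of P(lick | target) and P(no lick | foil)."""
    hit = lick_probability(weights, "target", context_bias)
    false_alarm = lick_probability(weights, "foil", context_bias)
    return 0.5 * (hit + (1.0 - false_alarm))


# Hypothetical weights early in training: the two tones are already separated.
weights = {"target": 1.0, "foil": -1.0}

# Assumed biases: a strong urge to lick when reward is available, none in probes.
reinforced = expected_accuracy(weights, context_bias=8.0)
probe = expected_accuracy(weights, context_bias=0.0)

print(f"reinforced context: {reinforced:.2f}")
print(f"probe context:      {probe:.2f}")
```

With the same weights in both contexts, the reinforced read-out licks to nearly everything and scores close to chance, while the probe read-out scores well above 90%. Only the bias term differs between the two calls, which is the sense in which context can hide knowledge without changing it.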