Summary: Pure free exploration may not be the most efficient way to learn or to solve a complex problem, a new study reports.
Source: Champalimaud Centre for the Unknown
Who hasn’t felt the temptation to fling a lengthy manual into the trash, or just drive on instead of asking for directions? After all, following instructions is often tiresome, and we can just figure it out on our own… Or can we?
A study published today in the scientific journal Nature Human Behaviour challenges prevalent theories about our capacity to solve complex problems and how certain mental disorders influence it.
“Patients who suffer from obsessive-compulsive disorder (OCD) are thought to have a problem with developing sophisticated problem-solving strategies,” said the study’s senior author, Albino Oliveira-Maia, head of the Neuropsychiatry Unit at the Champalimaud Foundation in Portugal. “However, our novel experimental approach provides strong evidence against this theory.”
Two ways to solve one problem
Oliveira-Maia’s team made this finding when investigating how healthy subjects and patients with OCD differ in the way they solve problems. “In general, people use a combination of two complementary strategies, known as the model-free and the model-based approaches,” Oliveira-Maia explained. “While healthy individuals use both strategies flexibly, patients with OCD tend to exercise the model-free approach.”
The model-free strategy is relatively simple and works well in stable environments. For example, imagine the following scenario: You have breakfast outside every morning on your way to work. There are two coffee shops on your route: The Bean and Aroma. Since you have to be at work early, with time, you learn that Aroma usually gets your breakfast staple—fresh croissants—delivered before the other shop does. So, following the model-free approach, you would typically go to Aroma first, and only when it doesn’t have croissants, you would head over to The Bean.
However, the model-free approach won’t work very well if the croissant supplier employed two delivery people who followed opposite routes. On weeks when the first delivery person is on duty, The Bean would get the croissants earlier. But if the second delivery person is working that week, then Aroma would receive them first.
If you were able to discover the model—that the availability of croissants depends on which delivery person is working that week—you would save yourself unnecessary trips. So even if The Bean has had croissants bright and early for weeks, on the first Monday that it doesn’t, you would immediately know that this week Aroma is the safer choice.
“Even though the model-based strategy is more computationally heavy, especially while you are still working out what’s going on, it’s more effective for optimizing your actions in complex circumstances such as the one in this example,” said Oliveira-Maia.
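The difference between the two strategies in the coffee-shop example can be made concrete with a small simulation. The Python sketch below is only an illustration, not the task or the analysis used in the study: the function names, the learning rate `alpha`, the five-day week and the weekly schedule are all assumptions invented for the example. The model-free customer simply keeps a running score for each shop and visits the higher-scoring one first, while the model-based customer assumes that a single failed visit means the other delivery person is on duty.

```python
SHOPS = ("Aroma", "The Bean")


def other(shop):
    """Return the shop that is not `shop`."""
    return SHOPS[1] if shop == SHOPS[0] else SHOPS[0]


def daily_schedule(weekly_early_shop, days_per_week=5):
    """Expand a per-week list of 'early' shops into a per-day list."""
    return [shop for shop in weekly_early_shop for _ in range(days_per_week)]


def wasted_trips_model_free(schedule, alpha=0.2):
    """Hypothetical model-free customer: reinforce whatever paid off before."""
    value = {shop: 0.0 for shop in SHOPS}
    wasted = 0
    for early_shop in schedule:
        # Visit the shop with the higher learned value first
        # (on a tie, max() returns the first shop listed in SHOPS).
        first = max(SHOPS, key=value.get)
        visited = [first] if first == early_shop else [first, other(first)]
        for shop in visited:
            reward = 1.0 if shop == early_shop else 0.0
            # Nudge the value of each visited shop toward the reward it gave.
            value[shop] += alpha * (reward - value[shop])
        if first != early_shop:
            wasted += 1
    return wasted


def wasted_trips_model_based(schedule, initial_belief="Aroma"):
    """Hypothetical model-based customer: one failure means the courier changed."""
    belief = initial_belief
    wasted = 0
    for early_shop in schedule:
        if belief != early_shop:
            wasted += 1
            belief = other(belief)  # switch immediately for the rest of the week
    return wasted


# Made-up schedule: which shop gets croissants first in each of eight weeks.
weeks = ["The Bean"] * 3 + ["Aroma"] * 2 + ["The Bean", "Aroma", "Aroma"]
schedule = daily_schedule(weeks)
print("model-free wasted trips: ", wasted_trips_model_free(schedule))
print("model-based wasted trips:", wasted_trips_model_based(schedule))
```

On this invented schedule the model-based customer wastes at most one trip each time the delivery person changes, whereas the model-free customer keeps returning to the old favourite until its score has decayed below the other shop's.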
Switching up the rules of the game
According to Oliveira-Maia, scientific studies that assess these strategies routinely apply a puzzle called the two-step task, which is similar to the second, more complex, scenario.
“These studies have shown that healthy subjects use a mixture of the simpler model-free strategy with the more complex model-based strategy when solving these types of tasks. In contrast, patients with OCD tend to stick with the less efficient strategy. The proposed reason is that patients with OCD are extremely habitual, and so they tend to repeat actions even if they don’t serve a useful purpose,” Oliveira-Maia explained.
Though this conclusion seems straightforward and consistent, there’s a catch. Since the tasks used in these studies are usually very complex, test subjects always receive a full explanation of the model before they begin. However, no one had ever rigorously tested the effect of these preemptive instructions—particularly their absence—on the subjects’ problem-solving strategy.
To find out how people would do with just free experimentation, Oliveira-Maia’s team partnered up with Thomas Akam, a neuroscientist currently at Oxford University who had recently developed a two-step task for mice.
“Since you cannot verbally instruct mice, Thomas created a task simple enough for the animals to decipher the model through trial and error. In his research article, published in the journal Neuron a little over a year ago, Thomas showed that mice were indeed able to crack the puzzle. So we decided to adapt this task for humans and test whether subjects would naturally adopt a model-based strategy, as is generally assumed,” recounted former doctoral student Pedro Castro-Rodrigues, the first co-author of the study.
The results of the experiment caught the researchers by surprise. “Even with extensive experience with the task, only a small minority of the 200-subject group developed a model-based strategy. This is striking given the relative simplicity of the task and suggests that humans are surprisingly poor at learning causal models from experience alone,” Castro-Rodrigues remarked.
OCD patients match up to healthy subjects
At the end of the third session, the researchers split subjects into two groups. One group received the full description of how the puzzle works, while the other did not. Then, the researchers ran a fourth and final session to test the effect of receiving instructions on the subjects’ problem-solving approach.
The difference between the two groups was clear: almost all the subjects from the “explanation” group—both healthy volunteers and OCD patients—adopted a model-based strategy. On the other hand, most test subjects of the other group carried on with the model-free approach.
“These results were fascinating,” said Ana Maia, a doctoral student who participated in the study. “Not only did they reveal that explanation plays more of a role than previously thought, but also that, given the right set of conditions, patients with OCD are in fact as capable of optimally solving a two-step task as healthy individuals.”
What is the reason for the discrepancy in results between this study and previous ones? According to the authors, there are several possible explanations. The first is that the task was relatively simple, and so were the instructions.
“Since classic two-step tasks tend to be very intricate, the explanations are also very complex. So you can imagine that a person who is acutely ill and distressed will have a harder time processing this type of information,” explained Oliveira-Maia.
Another intriguing hypothesis is that beginning with free experimentation makes a difference. Is it possible that the three unguided sessions effectively prepared the patients for the explanation?
“We didn’t directly test this question in this study, but there are some hints that it may have been the case. If future studies support this hypothesis, they might even lead to the development of novel psychotherapeutic and behavioral treatments for patients with OCD and perhaps other mental health disorders as well,” Castro-Rodrigues suggested.
The team is continuing its exploration into this topic via several avenues. “In this project, we’ve also collected imaging data of subjects performing the task inside an MRI scanner. So our most immediate follow-up would be to search for the neural correlates associated with the transition from one strategy to the other after receiving an explanation,” said Castro-Rodrigues.
“Pedro’s work is partly inscribed in a larger endeavor of the lab—the Neurocomp project,” added co-author Bernardo Barahona-Corrêa, a psychiatrist at the Champalimaud Foundation.
“This project, which I am leading with Albino, will investigate many aspects of OCD, focusing particularly on a brain region called the orbitofrontal cortex. We believe this region is critical for both the core manifestations of this disorder and for the acquisition of model-based action-control in tasks such as the one we used in this experiment.”
“Ultimately, these results highlight the importance of explicit explanations in learning,” Oliveira-Maia pointed out. “It seems that pure free exploration may not be the most efficient route to gaining new knowledge. In fact, I’ve started talking with my kids about this,” he added playfully, “telling them to be sure to pay attention to their teachers.”
Explicit knowledge of task structure is a primary determinant of human model-based action
Explicit information obtained through instruction profoundly shapes human choice behaviour. However, this has been studied in computationally simple tasks, and it is unknown how model-based and model-free systems, respectively generating goal-directed and habitual actions, are affected by the absence or presence of instructions.
We assessed behaviour in a variant of a computationally more complex decision-making task, before and after providing information about task structure, both in healthy volunteers and in individuals suffering from obsessive-compulsive or other disorders. Initial behaviour was model-free, with rewards directly reinforcing preceding actions.
Model-based control, employing predictions of states resulting from each action, emerged with experience in a minority of participants, and less in those with obsessive-compulsive disorder. Providing task structure information strongly increased model-based control, similarly across all groups.
Thus, in humans, explicit task structural knowledge is a primary determinant of model-based reinforcement learning and is most readily acquired from instruction rather than experience.
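The abstract's contrast between rewards "directly reinforcing preceding actions" and control that uses "predictions of states resulting from each action" can be illustrated with a toy calculation. The Python sketch below is a generic illustration under stated assumptions, not the task variant or the model fitted in the study: the action names, the state names, the 0.8/0.2 transition probabilities and the learning rate are all hypothetical. It considers a single trial in which the learner chooses `left`, unexpectedly lands in state `Y` (the unlikely outcome of that action) and is rewarded.

```python
# Assumed transition structure: each first-step action usually leads to one state.
TRANSITIONS = {
    "left": {"X": 0.8, "Y": 0.2},
    "right": {"X": 0.2, "Y": 0.8},
}


def model_free_update(action_values, action, reward, alpha=0.5):
    """Reinforce the chosen action directly, ignoring which state it led to."""
    updated = dict(action_values)
    updated[action] += alpha * (reward - updated[action])
    return updated


def state_value_update(state_values, state, reward, alpha=0.5):
    """Learn how rewarding the reached state is."""
    updated = dict(state_values)
    updated[state] += alpha * (reward - updated[state])
    return updated


def model_based_values(state_values):
    """Value each action by the states it is predicted to lead to."""
    return {
        action: sum(p * state_values[state] for state, p in probs.items())
        for action, probs in TRANSITIONS.items()
    }


# One trial: chose "left", reached the unlikely state "Y", received a reward.
mf = model_free_update({"left": 0.0, "right": 0.0}, action="left", reward=1.0)
mb = model_based_values(
    state_value_update({"X": 0.0, "Y": 0.0}, state="Y", reward=1.0)
)

print("model-free prefers: ", max(mf, key=mf.get))
print("model-based prefers:", max(mb, key=mb.get))
```

After this one trial the model-free learner favours repeating `left`, the action that preceded the reward, while the model-based learner favours `right`, the action most likely to reach the rewarded state again.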