[Image: a digital brain in a classroom. Credit: Neuroscience News]

Kindergarten for AI: Basic Skills Boost Complex Learning in RNNs

Summary: AI systems, like humans and animals, learn better when they first master simple tasks. Researchers showed that recurrent neural networks (RNNs) trained with a “kindergarten curriculum” approach, starting with easy tasks, learn complex ones faster.

Inspired by experiments with rats, which learned to combine basic sensory cues to retrieve water, scientists applied the same logic to train RNNs on decision-making tasks. Compared to traditional training, this method led to faster and more robust learning, suggesting a new framework for AI development.

Key Facts:

  • Curriculum Boost: RNNs trained on simple tasks first learn complex tasks faster.
  • Animal-Inspired Design: Rats demonstrated the ability to combine basic learned behaviors to solve advanced challenges.
  • Human-Like Learning: The findings support a developmentally inspired approach to AI training, mirroring human cognitive growth.

Source: NYU

We need to learn our letters before we can learn to read and our numbers before we can learn how to add and subtract.

The same principles are true with AI, a team of New York University scientists has shown through laboratory experiments and computational modeling.

In their work, published in the journal Nature Machine Intelligence, researchers found that when recurrent neural networks (RNNs) are first trained on simple cognitive tasks, they are better equipped to handle more difficult and complex ones later on. 

The paper’s authors labeled this form of training “kindergarten curriculum learning,” as it centers on first instilling an understanding of basic tasks and then combining knowledge of these tasks to carry out more challenging ones.

“From very early on in life, we develop a set of basic skills like maintaining balance or playing with a ball,” explains Cristina Savin, an associate professor in NYU’s Center for Neural Science and Center for Data Science.

“With experience, these basic skills can be combined to support complex behavior—for instance, juggling several balls while riding a bicycle.

“Our work adopts these same principles in enhancing the capabilities of RNNs, which first learn a series of easy tasks, store this knowledge, and then apply a combination of these learned tasks to successfully complete more sophisticated ones.” 

RNNs—neural networks that are designed to process sequential information based on stored knowledge—are particularly useful in speech recognition and language translation.
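The state-carrying computation described above can be sketched with a toy single-unit RNN. This is purely illustrative (the weights and the one-unit design are invented for the example; the study's networks are far larger):

```python
import math

# Toy single-unit vanilla RNN (illustrative only; not the paper's model).
# The hidden state h is the network's "stored knowledge": each step mixes
# the current input with what the network remembers from earlier steps.
def rnn_forward(xs, w_xh=1.0, w_hh=0.5, b=0.0):
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_xh * x + w_hh * h + b)  # new state depends on old state
        states.append(h)
    return states

states = rnn_forward([1.0, 0.0, 0.0])
```

Even though the inputs at steps 2 and 3 are zero, `states[1]` and `states[2]` are nonzero: the input from step 1 persists through the recurrent connection, which is what lets RNNs process sequences rather than isolated inputs.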

However, when it comes to complex cognitive tasks, training RNNs with existing methods can prove difficult and fall short of capturing crucial aspects of animal and human behavior that AI systems aim to replicate. 

To address this, the study’s authors—who also included David Hocker, a postdoctoral researcher in NYU’s Center for Data Science, and Christine Constantinople, a professor in NYU’s Center for Data Science—first conducted a series of experiments with laboratory rats. 

The animals were trained to seek out a water source in a box with several compartmentalized ports.

However, in order to know when and where the water would be available, the rats needed to learn that delivery of the water was associated with certain sounds and the illumination of the port’s lights—and that the water was not delivered immediately after these cues.

In order to reach the water, then, the animals needed to develop basic knowledge of multiple phenomena (e.g., that sounds precede water delivery, and that they must wait after the visual and audio cues before trying to access the water) and then learn to combine these simple tasks in order to complete a goal (water retrieval).

These results pointed to principles of how the animals applied knowledge of simple tasks in undertaking more complex ones. 

The scientists took these findings to train RNNs in a similar fashion—but, instead of water retrieval, the RNNs managed a wagering task that required these networks to build upon basic decision making in order to maximize the payoff over time.

They then compared this kindergarten curriculum learning approach to existing RNN-training methods. 
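The ordering itself can be sketched in a few lines. This is a hedged illustration of the curriculum idea only: the task names and trainer interface below are invented for the example, not taken from the paper:

```python
# Sketch of curriculum ordering: pretrain on simple subtasks, then the
# complex task, with the same model (and thus whatever it has learned)
# carried through every stage. Task names are hypothetical.
def curriculum_train(model, simple_tasks, complex_task, train_step):
    for task in simple_tasks:        # "kindergarten" phase: easy tasks first
        train_step(model, task)
    train_step(model, complex_task)  # subtask knowledge is reused here
    return model

# Toy usage: the "model" just records the order in which tasks shaped it.
model = {"trained_on": []}
record = lambda m, task: m["trained_on"].append(task)
curriculum_train(model, ["detect_cue", "wait_after_cue"], "maximize_payoff", record)
```

The key design choice is that nothing is reset between stages, so representations built on the easy tasks are available when the complex task begins; a baseline trained only on the complex task starts from scratch.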

Overall, the team’s results showed that the RNNs trained on the kindergarten model learned faster than those trained on current methods. 

“AI agents first need to go through kindergarten to later be able to better learn complex tasks,” observes Savin.

“Overall, these results point to ways to improve learning in AI systems and call for developing a more holistic understanding of how past experiences influence learning of new skills.”

Funding: This research was funded by grants from the National Institute of Mental Health (1R01MH125571-01, 1K01MH132043-01A1) and conducted using research computing resources of the Empire AI consortium, with support from the State of New York, the Simons Foundation, and the Secunda Family Foundation.

About this AI and learning research news

Author: James Devitt
Source: NYU
Contact: James Devitt – NYU
Image: The image is credited to Neuroscience News

Original Research: Closed access.
“Compositional pretraining improves computational efficiency and matches animal behaviour on complex tasks” by Cristina Savin et al. Nature Machine Intelligence


Abstract

Compositional pretraining improves computational efficiency and matches animal behaviour on complex tasks

Recurrent neural networks (RNNs) are ubiquitously used in neuroscience to capture both neural dynamics and behaviours of living systems.

However, when it comes to complex cognitive tasks, training RNNs with traditional methods can prove difficult and fall short of capturing crucial aspects of animal behaviour.

Here we propose a principled approach for identifying and incorporating compositional tasks as part of RNN training.

Taking as the target a temporal wagering task previously studied in rats, we design a pretraining curriculum of simpler cognitive tasks that reflect relevant subcomputations, which we term ‘kindergarten curriculum learning’.

We show that this pretraining substantially improves learning efficacy and is critical for RNNs to adopt similar strategies as rats, including long-timescale inference of latent states, which conventional pretraining approaches fail to capture.

Mechanistically, our pretraining supports the development of slow dynamical systems features needed for implementing both inference and value-based decision making.

Overall, our approach helps endow RNNs with relevant inductive biases, which is important when modelling complex behaviours that rely on multiple cognitive functions.
