Summary: You don’t need years of piano lessons to “get” music. A new study reveals that the human brain is naturally wired to pick up on complex musical context just by being exposed to it in daily life.
By testing how both musicians and nonmusicians responded to scrambled segments of Tchaikovsky’s Album for the Young, researchers found that everyone uses relatively long “chunks” of musical information—up to 16 seconds—to predict what happens next. This innate ability to track tonal structure is what allows us to feel suspense in a thriller or joy in an upbeat melody, regardless of our formal training.
Key Facts
- Innate Expertise: Nonmusicians and trained musicians perform almost identically when predicting melodies and remembering musical sequences.
- 16-Second Window: The brain doesn’t just listen to the last note; it integrates a full 16-second “tonal context” to make sense of the music.
- Scrambling Science: When music is scrambled at short intervals (e.g., every bar), it loses its “meaning” to the brain, even if the individual notes sound fine.
- Emotional Priming: This deep contextual processing is what allows music to prime our emotions, helping us anticipate the “mood” of a film or a social setting.
- Exposure is Enough: The study suggests that “everyday walking around” and passive exposure to music are enough for our brains to learn sophisticated tonal hierarchies.
Source: University of Rochester
Context is everything. It helps us predict what may happen next and what the people around us may do or say. Music can add to this context. In films, music allows us to anticipate what’s coming and drives emotional experiences, such as building suspense or creating a heartwarming moment.
“Music in general, especially when we’re listening attentively, seems to help modulate our emotions,” said Riesa Cassano-Coleman, a graduate student in Elise Piazza’s lab at the University of Rochester, in an interview with the Observer. “During a film, if you hear scary music, it makes it scary. If you take away the music, you’re like, ‘Oh, that’s not that bad.’”
A concert can give you chills or bring you to tears. Heavy metal playing in the background might lead you to talk to someone differently than classical music. An upbeat song might get you excited to pedal faster, lift more, run longer.
Within the music itself, the tonal context—a framework that ranks certain pitches and chords as more central than others—helps guide predictions about which pitch or chord is likely to follow. This process of forming and evaluating predictions is thought to be the basis of musical emotion.
“Tonal context is just kind of how the music makes sense,” said Cassano-Coleman. “If you’re in the key of C, then the home note C is the one that makes the most sense within the environment. The different patterns of the different notes and the different rhythms all kind of fit in that environment.”
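One way to picture such a tonal hierarchy: in prior music-cognition work, Krumhansl and Kessler measured how well listeners felt each of the twelve pitch classes “fits” a given key, and the resulting profile puts C and G at the top of C major. The minimal sketch below is my own illustration using those published values, not anything from this study; it simply sorts the pitches by stability.

```python
# Krumhansl-Kessler probe-tone profile for C major (higher = fits the key better).
# Values are empirical listener ratings from prior work, not data from this study.
KK_C_MAJOR = {
    "C": 6.35, "C#": 2.23, "D": 3.48, "D#": 2.33, "E": 4.38, "F": 4.09,
    "F#": 2.52, "G": 5.19, "G#": 2.39, "A": 3.66, "A#": 2.29, "B": 2.88,
}

# Rank the twelve pitch classes from most to least stable in C major.
ranked = sorted(KK_C_MAJOR, key=KK_C_MAJOR.get, reverse=True)
print(ranked[:5])  # ['C', 'G', 'E', 'F', 'A'] -- the home note C tops the hierarchy
```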
Previous research had not established whether formal training is needed to pick up on this tonal context, or how much context people actually use to make sense of music and feel its emotion.
For example, a listener relying on short context might hear 16 seconds of music but use only the last note to predict the next one. Cassano-Coleman suspected that people use more than that, mentally forming a sequence: given 16 seconds of music, they would draw on the whole segment to predict what comes next.
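To make the contrast concrete, here is a toy sketch, entirely my own illustration rather than the study’s model: one predictor conditions only on the final note, the other lets every note in the segment inform the guess.

```python
from collections import Counter

melody = ["C", "E", "G", "E", "C", "E", "G", "C"]  # toy stand-in for a 16-second segment

def predict_short(notes):
    """Short context: condition only on the final note, using transition
    counts gathered within the segment (a first-order Markov guess)."""
    followers = Counter(b for a, b in zip(notes, notes[1:]) if a == notes[-1])
    return followers.most_common(1)[0][0] if followers else None

def predict_long(notes):
    """Long context (deliberately crude stand-in): let every note in the
    segment vote, so the guess reflects the whole segment, not just the end."""
    return Counter(notes).most_common(1)[0][0]

print(predict_short(melody), predict_long(melody))
```

Real tonal prediction is far richer than either function, but the difference in how much of the segment each one consults is the point.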
To test this, Cassano-Coleman and colleagues scrambled music from Tchaikovsky’s Album for the Young at different timescales (eight bars of intact music, scrambled every two bars, or scrambled every bar).
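A rough sketch of what scrambling at different timescales could look like, assuming the piece has already been split into a list of bars (the study’s actual stimuli also controlled for acoustic cues, which this toy version ignores):

```python
import random

def scramble(bars, chunk_size, seed=0):
    """Group consecutive bars into chunks of `chunk_size`, then shuffle the
    chunk order. Bars inside a chunk stay in their original sequence."""
    chunks = [bars[i:i + chunk_size] for i in range(0, len(bars), chunk_size)]
    random.Random(seed).shuffle(chunks)
    return [bar for chunk in chunks for bar in chunk]

bars = list(range(16))          # stand-in for 16 bars of a Tchaikovsky piece
intact = scramble(bars, 8)      # longest timescale: 8-bar stretches stay coherent
by_two = scramble(bars, 2)      # scrambled every two bars
by_one = scramble(bars, 1)      # scrambled every bar: each bar still sounds fine,
                                # but the long-range tonal context is destroyed
```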
When diced up and scrambled, “It kind of sounds weird,” said Cassano-Coleman. “Acoustically, it sounds smooth, it sounds fine, but it just sounds like it doesn’t make sense.”
Cassano-Coleman’s intuition about the sound is informed by her background as a musician. To see whether formal training is needed to pick up on this tonal context, she tested both musicians’ and nonmusicians’ abilities across four tasks with the scrambled music: remembering chunks of music, predicting the next notes, segmenting music into meaningful parts, and identifying the timescale at which a segment had been scrambled. If people used longer stretches of context while doing the tasks, the segments scrambled at shorter timescales should be harder.
Indeed, the results, published in a 2025 Psychological Science paper, showed that people were more accurate when given more context.
“They do use that whole segment, the whole 16 seconds,” said Cassano-Coleman. “When we disrupt that, that disrupts the processing.”
Surprisingly, musicians and nonmusicians performed similarly overall. The one clear exception: musicians were better able to identify the degree of scrambling in the fourth task, suggesting that their explicit knowledge of tonal structure helped them there.
“We were thinking that formal training would give musicians an advantage, but really it seems like our everyday walking around, just listening and being exposed to music, is enough for our brains to learn,” said Cassano-Coleman.
“In the case of memory, you have that context, that kind of structure that you can go back and say, ‘Oh, I heard this little snippet in there somewhere.’ Or with prediction, it gives you enough context to then predict what’s coming next.”
That prediction helps us know not only what’s coming next in the music, but what emotions to prime for. Major chords are brighter and happier than minor chords, which are darker and more serious. Fast music, with many notes jam-packed into each measure, can create anxiety, while slower music that changes less often can be calming.
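In pitch terms, the major/minor difference comes down to a single interval: the third sits four semitones above the root in a major triad and only three in a minor one. A quick sketch, again my own illustration:

```python
def triad(root_midi, quality):
    """Build a three-note chord from a MIDI root. The fifth is 7 semitones up
    in both cases; only the third differs (4 semitones = major, 3 = minor)."""
    third = 4 if quality == "major" else 3
    return [root_midi, root_midi + third, root_midi + 7]

print(triad(60, "major"))  # [60, 64, 67] -> C, E, G  (brighter, "happier")
print(triad(60, "minor"))  # [60, 63, 67] -> C, Eb, G (darker, more serious)
```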
“Our brains can use the information in the music that’s in front of us in really cool ways,” Cassano-Coleman continued. “Even when we aren’t specifically trained to play music, we still pick up enough of it just walking around, listening.”
Key Questions Answered:
Q: Do you need musical training to understand music?
A: Not at all. Your brain has been “training” every time you hear music in a store, a movie, or on the radio. This study shows your brain understands the “grammar” of music nearly as well as a trained musician’s.
Q: Why does film music make a scene feel scary or exciting?
A: Because your brain uses the musical context to predict what’s coming next. The music sets a “tonal framework” that primes your emotions for anxiety or relief before the visual action even happens.
Q: What happened when the music was scrambled?
A: It sounded “weird” to everyone. Even though the sounds were smooth, the brain couldn’t form a sequence or predict the next note, which essentially “broke” the emotional connection to the music.
About this music and neuroscience research news
Author: Hannah Brown
Source: APS
Contact: Hannah Brown – APS
Image: The image is credited to Neuroscience News
Original Research: Open access.
“Listeners Systematically Integrate Hierarchical Tonal Context, Regardless of Musical Training” by Cassano-Coleman, R. Y., Izen, S. C., & Piazza, E. A. Psychological Science.
DOI: 10.1177/09567976251400331
Abstract
Listeners Systematically Integrate Hierarchical Tonal Context, Regardless of Musical Training
Context drives our interpretations of music as surprising, frightening, or awe-inspiring. However, it remains unclear how formal musical training affects our ability to hierarchically integrate complex tonal information to efficiently predict, remember, and segment music.
We scrambled naturalistic music at multiple timescales to manipulate coherent tonal context while controlling for multiple acoustic cues. Memory (Experiment 1; n = 108, age range = 19–41 years) and prediction (Experiment 2; n = 108, age range = 20–41 years) improved with more intact context for both musicians and nonmusicians.
Listeners’ event boundaries were influenced by the amount of tonal context but also reflected nested phrase structure, and musicians were more sensitive to longer-timescale “hyperphrase” structure (Experiment 3; n = 95, age range = 20–42 years) and could better identify the amount of scrambling (Experiment 4; n = 108, age range = 19–41 years).
These results indicate that listeners integrate tonal context across complex phrases to efficiently encode, predict, and segment naturalistic music and that in general, training has surprisingly little impact on this integration.

