How to Figure Out What You Don't Know

Summary: Testing multiple computational models of the nervous system, researchers found that a model’s ability to make accurate predictions about data does not guarantee that it captures the underlying logic of the biological system it represents.

Source: CSHL

Increasingly, biologists are turning to computational modeling to make sense of complex systems. In neuroscience, researchers are adapting the kinds of algorithms used to forecast the weather or filter spam from your email to seek insight into how the brain’s neural networks process information.

New research from Cold Spring Harbor Laboratory Assistant Professor Tatiana Engel offers crucial guidance to biologists using such models. Testing various computational models of the nervous system, she and postdoctoral researcher Mikhail Genkin have found that a model’s ability to make good predictions about data does not mean it reflects the underlying logic of the biological system it represents. Relying on such models without carefully evaluating their validity could lead to wrong conclusions about how the actual system works, they say.

The work, published October 26, 2020 in Nature Machine Intelligence, concerns a type of machine learning known as flexible modeling, which gives users the freedom to explore a wide range of possibilities without formulating specific hypotheses beforehand. Engel’s lab has turned to such models to investigate how signaling in the brain gives rise to decision-making.

When it comes to forecasting the weather or predicting trends in the stock market, any model that makes good predictions is valuable. But Engel says that for biologists, the goals are different:

“Because we are interested in scientific interpretation and actually discover hypotheses from the data, we not only need to fit the model to the data, but we need to analyze or understand the model which we get, right? So we want to look, as I said, we want to look into model structure and the model mechanism to make inference that this is maybe how the brain works.”

Image: These colored maps each have different shapes. Each shape represents a different hypothetical way to answer a complicated question that lacks a simple yes or no answer. Using machine learning, researchers can test a hypothesis many times to find the best answers, rather than stopping at an incomplete answer that might have limited value in only a few special circumstances. Image is credited to Mikhail Genkin/Engel lab.

It’s possible to make good predictions using wrong assumptions, Engel said, pointing to the ancient model of the solar system that accurately predicted the movements of celestial bodies while positing that those bodies revolved around the Earth, not the Sun. That made it important to ask how far particular models of neural networks can be trusted.

By building and comparing several models of neural signaling, Engel and Genkin found that good predictive power does not necessarily indicate that a model is a good representation of real neural networks. Many of the models they tested predicted the data equally well yet disagreed about the underlying mechanism. The best models, they found, were instead those whose features stayed most consistent across multiple datasets.

This approach won’t necessarily work for all situations, however, and biologists may need alternative methods of evaluating their models. Most importantly, Genkin said, “We shouldn’t take anything for granted. We should check every assumption we have.”

About this artificial intelligence research news

Contact: Sara Roncero-Menendez – CSHL

Original Research: Closed access.
“Moving beyond generalization to accurate interpretation of flexible models” by Mikhail Genkin & Tatiana A. Engel. Nature Machine Intelligence

Abstract

Moving beyond generalization to accurate interpretation of flexible models

Machine learning optimizes flexible models to predict data. In scientific applications, there is a rising interest in interpreting these flexible models to derive hypotheses from data. However, it is unknown whether good data prediction guarantees the accurate interpretation of flexible models. Here, we test this connection using a flexible, yet intrinsically interpretable framework for modelling neural dynamics. We find that many models discovered during optimization predict data equally well, yet they fail to match the correct hypothesis. We develop an alternative approach that identifies models with correct interpretation by comparing model features across data samples to separate true features from noise. We illustrate our findings using recordings of spiking activity from the visual cortex of monkeys performing a fixation task. Our results reveal that good predictions cannot substitute for accurate interpretation of flexible models and offer a principled approach to identify models with correct interpretation.
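
The idea of comparing model features across data samples can be illustrated with a toy sketch. The example below is not the authors' framework for neural dynamics. It assumes a much simpler stand-in: polynomial curves of increasing flexibility fitted to noisy samples of a made-up function. The function `true_fn`, the helper `draw_sample`, the noise level, and the 10% tolerance are all hypothetical choices for illustration. For each level of flexibility, the sketch reports how well a model predicts held-out data and how much two models fitted to independent samples disagree with each other.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical ground truth for the toy problem (an assumption).
    return np.sin(np.pi * x)

def draw_sample(n=80, noise=0.3):
    # Draw one noisy data sample from the ground truth.
    x = rng.uniform(-1.0, 1.0, n)
    return x, true_fn(x) + rng.normal(0.0, noise, n)

# Two independent training samples and one validation sample.
xa, ya = draw_sample()
xb, yb = draw_sample()
xv, yv = draw_sample()

grid = np.linspace(-0.9, 0.9, 200)  # points where fitted curves are compared
degrees = list(range(1, 13))        # model flexibility: polynomial degree
val_err, disagreement = [], []

for d in degrees:
    fit_a = Polynomial.fit(xa, ya, d)
    fit_b = Polynomial.fit(xb, yb, d)
    # Predictive power: error of the sample-A model on held-out data.
    val_err.append(np.mean((fit_a(xv) - yv) ** 2))
    # Feature consistency: how far the two fitted curves differ.
    disagreement.append(np.mean((fit_a(grid) - fit_b(grid)) ** 2))

val_err = np.array(val_err)
disagreement = np.array(disagreement)

# Among models that predict about equally well (within 10% of the best),
# pick the one whose shape is most reproducible across samples.
candidates = np.flatnonzero(val_err <= 1.1 * val_err.min())
chosen = degrees[candidates[np.argmin(disagreement[candidates])]]

for d, v, g in zip(degrees, val_err, disagreement):
    print(f"degree {d:2d}  validation MSE {v:.3f}  disagreement {g:.4f}")
print("selected degree:", chosen)
```

In a sketch like this, the validation error alone may not separate several candidate models, which is the situation the abstract describes. The disagreement column adds a second criterion: features that appear in one sample but not in the other are more likely to be noise than structure.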