In Plain Sight: Study Compares Performance of Humans Versus Neural Networks in Visual Searches

Summary: Researchers report humans often miss objects in plain sight, especially if the size is inconsistent with the rest of the scene.

Source: UC Santa Barbara.

Before you read on, look for toothbrushes in the photo below. Find them? Both of them? If you’re like the vast majority of people, you homed in on the one near the sink, but probably took a moment or two before seeing the other, much larger one hanging on the wall. Although the larger one is far more visible and not out of context, your brain, for a while at least, excluded that enormous blue toothbrush from your visual search.

As it turns out, size matters. When we search through scenes for a particular object, we often miss even giant targets when their size is inconsistent with the rest of the scene. That’s according to scientists at UC Santa Barbara, where this curious phenomenon is being investigated in an effort to better understand how humans conduct visual searches.

These new findings, by researchers in the Department of Psychological & Brain Sciences, are published in the journal Current Biology.

“When something appears at the wrong scale, you will miss it more often because your brain automatically ignores it,” said UCSB professor Miguel Eckstein, who specializes in computational human vision, visual attention and search. The researchers used computer-generated scenes of ordinary objects in which 14 targets appeared, varying in color, viewing angle and size, and mixed them with “target-absent” scenes. They asked 60 viewers to search for the objects (e.g., toothbrush, parking meter, computer mouse) while eye-tracking software monitored the paths of their gaze. People tended to miss the target more often when it was mis-scaled, even when their gaze was directed at the incorrectly sized object.

Computer vision, by contrast, does not have this issue, the scientists reported.

“The idea is when you first see a scene, your brain rapidly processes it within a few hundred milliseconds or less, and then you use that information to guide your search towards likely locations where the object typically appears,” Eckstein said. “Also, you focus your attention on objects that are actually at the size that is consistent with the object that you’re looking for.” The most advanced computer vision systems, deep neural networks, search across entire scenes and rely on the visual properties of the object itself, while humans also use the relationships between objects and their context within the scene to guide their eyes.

This trend may seem like a deficiency on the part of humans, but the tables were turned when human subjects and a deep neural network were asked to verify the presence of different target objects in real-world scenes that may or may not have contained them. In that round, the deep neural network reported a much higher percentage of false positives. That is, it confirmed the presence of, say, a cellphone in a scene where there were computer keyboards because of their similarity in shape, despite the fact that keyboards are several times larger than a cellphone and, in the photo, are much larger than the nearby hands that would be holding them.

“No human would do that,” added former graduate student Katie Koehler, now working at Riot Games. “Just based on the size, your brain would automatically discard it.” This mechanism, according to the researchers, is in fact a useful strategy that human brains use to process scenes rapidly, eliminate distractors and reduce false positives. While this blindness to objects of inconsistent size may be an unwanted byproduct of the human brain’s search strategy, such scenarios are rare in the real world. With repeated exposure to the unusual scenario, human observers will eventually adapt their visual searches to accommodate it.

[Image: a bathroom scene.] Find both toothbrushes in the scene. NeuroscienceNews.com image is credited to the researchers.

“The findings might suggest ways to improve computer vision by implementing some of the tricks the brain utilizes to reduce false positives,” said former postdoctoral researcher Emre Akbas, now an assistant professor of computer engineering at Middle East Technical University in Turkey, who was responsible for the computer vision components of the project.
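As a rough illustration of the kind of size check described above, the following Python sketch rejects a detection whose size, relative to a nearby reference object such as a hand, is far from what would be expected. It is not the researchers' method. The names Detection, size_consistent and filter_detections, the typical lengths in the table, and the tolerance factor are all hypothetical, and the sketch assumes the candidate and the reference object sit at a similar distance from the camera.

```python
from dataclasses import dataclass

# Hypothetical typical real-world lengths in centimeters (illustrative values only).
TYPICAL_LENGTH_CM = {"cellphone": 15.0, "keyboard": 45.0, "hand": 18.0}


@dataclass
class Detection:
    label: str        # what the detector believes the object is
    length_px: float  # longest side of the bounding box, in pixels
    score: float      # detector confidence between 0 and 1


def size_consistent(candidate, reference, tolerance=2.0):
    """Check the candidate's size against a reference object in the same scene.

    Returns True when the observed size ratio is within a factor of
    `tolerance` of the ratio expected from the typical real-world lengths.
    """
    expected = TYPICAL_LENGTH_CM[candidate.label] / TYPICAL_LENGTH_CM[reference.label]
    observed = candidate.length_px / reference.length_px
    return expected / tolerance <= observed <= expected * tolerance


def filter_detections(candidates, reference, tolerance=2.0):
    """Keep only the candidates whose size is plausible next to the reference."""
    return [c for c in candidates if size_consistent(c, reference, tolerance)]


# A hand, a real phone, and a keyboard that a detector mislabeled as a phone.
hand = Detection("hand", 180.0, 0.90)
real_phone = Detection("cellphone", 150.0, 0.80)
mislabeled_keyboard = Detection("cellphone", 450.0, 0.85)

kept = filter_detections([real_phone, mislabeled_keyboard], hand)
print([d.length_px for d in kept])
```

In this toy scene the keyboard-sized "cellphone" is two and a half times the length of the hand, well outside the expected range, so only the phone-sized detection is kept. A filter like this would also discard a genuinely giant target, which mirrors the trade-off the study reports in human observers.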

According to Eckstein, some people on the autism spectrum might not miss the large targets in a scene. He is contemplating a study on that topic in the future.

“There are some theories that suggest that people with autism spectrum disorder focus more on local scene information and less on global structure,” he said. “So there is a possibility that people with autism spectrum disorder might miss the mis-scaled objects less often, but we won’t know that until we do the study.”

In the more immediate future, research will look into the brain activity that occurs when we view mis-scaled objects.

“Many studies have identified brain regions that process scenes and objects, and now researchers are trying to understand which particular properties of scenes and objects are represented in these regions,” said postdoctoral researcher Lauren Welbourne, whose current research concentrates on how objects are represented in the cortex, and how scene context influences the perception of objects. “And so what we’re trying to do is find out how these brain areas respond to objects that are either correctly or incorrectly scaled within a scene. This may help us determine which regions are responsible for making it more difficult for us to find objects if they are mis-scaled.”

About this neuroscience research article

Source: Sonia Fernandez – UC Santa Barbara
Image Source: NeuroscienceNews.com image is credited to the researchers.
Original Research: Abstract for “Humans, but Not Deep Neural Networks, Often Miss Giant Targets in Scenes” by Miguel P. Eckstein, Kathryn Koehler, Lauren E. Welbourne, and Emre Akbas in Current Biology. Published online September 7, 2017. doi:10.1016/j.cub.2017.07.068

Cite This NeuroscienceNews.com Article

MLA: UC Santa Barbara. “In Plain Sight: Study Compares Performance of Humans Versus Neural Networks in Visual Searches.” NeuroscienceNews. NeuroscienceNews, 26 September 2017. <https://neurosciencenews.com/human-ann-vision-7590/>.

APA: UC Santa Barbara (2017, September 26). In Plain Sight: Study Compares Performance of Humans Versus Neural Networks in Visual Searches. NeuroscienceNews. Retrieved September 26, 2017 from https://neurosciencenews.com/human-ann-vision-7590/

Chicago: UC Santa Barbara. “In Plain Sight: Study Compares Performance of Humans Versus Neural Networks in Visual Searches.” https://neurosciencenews.com/human-ann-vision-7590/ (accessed September 26, 2017).

Abstract

Humans, but Not Deep Neural Networks, Often Miss Giant Targets in Scenes

Highlights
• Humans often miss giant targets in scenes during visual search
• Giant targets are more often missed even when humans foveate them
• Deep neural networks do not exhibit such a deficit with giant targets
• Missing giant targets is a functional brain strategy to discount distractors

Summary
Even with great advances in machine vision, animals are still unmatched in their ability to visually search complex scenes. Animals from bees to birds to humans learn about the statistical relations in visual environments to guide and aid their search for targets. Here, we investigate a novel manner in which humans utilize rapidly acquired information about scenes by guiding search toward likely target sizes. We show that humans often miss targets when their size is inconsistent with the rest of the scene, even when the targets were made larger and more salient and observers fixated the target. In contrast, we show that state-of-the-art deep neural networks do not exhibit such deficits in finding mis-scaled targets but, unlike humans, can be fooled by target-shaped distractors that are inconsistent with the expected target’s size within the scene. Thus, it is not a human deficiency to miss targets when they are inconsistent in size with the scene; instead, it is a byproduct of a useful strategy that the brain has implemented to rapidly discount potential distractors.
