Human Eye Beats Machine in Archaeological Color Identification Test

Summary: The X-Rite Capsure, a handheld device marketed to archaeologists for matching colors, is not as consistent or accurate as the human eye in determining color.

Source: Florida Museum of Natural History

A ruler and scale can tell archaeologists the size and weight of a fragment of pottery – but identifying its precise color can depend on individual perception. So, when a handheld color-matching gadget came on the market, scientists hoped it offered a consistent way of determining color, free of human bias.

But a new study by archaeologists at the Florida Museum of Natural History found that the tool, known as the X-Rite Capsure, often misread colors readily distinguished by the human eye.

When tested against a book of color chips, the machine failed to produce correct color scores in 37.5% of cases, even though its software system included the same set of chips. In an analysis of fired clay bricks, the Capsure matched archaeologists’ color scores only 35% of the time, a rate that dropped to about 5% when it read sediment colors in the field. Researchers also found the machine was prone to reading color chips as more yellow than they were, and sediment and clay as more red than they were.

“I think that we were surprised by how much we disagreed with the instrument. We had the expectation that it would kind of act as the moderator and resolve conflicts,” said Lindsay Bloch, collection manager of the Florida Museum’s Ceramic Technology Lab and lead study author. “Instead, the device would often have an entirely different answer that we all agreed was wrong.”

Identifying subtle differences in color can help archaeologists compare the composition of soil and the origins of artifacts, such as pottery and beads, to understand how people lived and interacted in the past. Color can also reveal whether materials have been exposed to fire, indicating how communities used surrounding natural resources.

Today, the Munsell color system, created by Albert Munsell in 1905 and later adopted by the U.S. Department of Agriculture for soil research, is the archaeological standard for identifying colors. Researchers use a binder of 436 unique color chips to determine a Munsell color score for artifacts, sediment and objects such as bones, shell and rocks. These scores enable archaeologists around the world to compare colors across sites and time periods. But the process of assigning scores can vary based on lighting conditions, the quality of a sample and the perspective of the researcher.

This study is the first to test and record the accuracy of the X-Rite Capsure, a device made by the same company that owns the color authority Pantone. Although marketed to archaeologists, the device was originally designed for interior designers and cosmetologists, not research, Bloch said.

“I think the main takeaway was just sort of surprise that it’s something that is marketed for our field, specifically for archaeologists, but hasn’t been made for us and the kind of data we need to collect,” she added. “When you read the manual, it says you should always verify that the color the machine tells you looks right with your eyes, which seems to negate the use of the instrument.”

In an experiment designed with the help of University of Florida undergraduate researchers Claudette Lopez and Emily Kracht, the team tested the Capsure’s readings of the three elements of Munsell’s system: a color’s general family, or hue; intensity, also known as chroma; and lightness, also called value.

The team first tested the Capsure on all 436 Munsell soil color chips, rating its reading as correct if it matched the exact score on a chip three out of five times. It correctly scored 274 chips. Of its errant readings, about 75% were misidentifications of hue. The Capsure was consistent, though often wrong, producing the same reading five times for 89% of the chips.
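
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the study's own code: the function names, the sample Munsell scores and the reading of "three out of five" as "at least three of five" are all assumptions made for the example.

```python
# Illustrative sketch only; names and sample readings are hypothetical.

def is_correct(readings, reference, min_matches=3):
    """Return True if at least `min_matches` readings equal the chip's score.

    Assumes "three out of five times" means "at least three of five".
    """
    return sum(r == reference for r in readings) >= min_matches


def is_consistent(readings):
    """Return True if every repeated reading is identical."""
    return len(set(readings)) == 1


# A made-up chip and two made-up sets of five device readings.
chip = "10YR 5/4"
mostly_right = ["10YR 5/4", "10YR 5/4", "10YR 5/4", "2.5Y 5/4", "10YR 5/4"]
always_wrong = ["2.5Y 5/4"] * 5

print(is_correct(mostly_right, chip), is_consistent(mostly_right))
print(is_correct(always_wrong, chip), is_consistent(always_wrong))
```

The second set of readings shows how a device can be consistent, returning the same score five times, while still being wrong about the chip.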

Florida Museum of Natural History archaeologists tested how accurately and consistently the X-Rite Capsure, right, could score the color of chips, pieces of fired clay and sediment in the field. Credit: Lindsay Bloch/Florida Museum

To determine how well the machine performed in a typical laboratory setting, the team tested its color readings of 140 pottery briquettes that had been assigned Munsell scores by Lopez. The Capsure matched the archaeologist’s scores in 35% of cases, again tending to misread hue. It proved consistent in this second test as well, yielding the same score across all trials for more than 70% of the briquettes.

In the most challenging color-identification conditions – outdoors, where lighting and texture can vary – the machine matched archaeologists’ scores of sediment samples only about 5% of the time, often rating a shade darker or lighter. For one sample, the Capsure reported colors from five different families, even though archaeologists agreed the sediment was a single hue. Bloch said the discrepancy was likely due to moisture, sand and shells, which don’t usually interfere with human observations.

Unlike some other methods of identifying color, the Capsure is a device the size of a remote control that can provide a reading in seconds. Bloch said the tool’s simple design and accessibility lend it to other scientific applications, but that the team’s results point to a need for further scrutiny of how archaeologists record color.

“This new tool has really forced us to see that color is subjective and that, even with a supposedly objective instrument, it may be much more complicated than we’ve been led to believe,” she said. “We need to pay really close attention and record how we’re describing color in order to make good data. Ultimately, if we’re putting bad color data in, we’re going to get bad data out.”

Bloch said she would give the Capsure three out of five stars for being easy to use and offering helpful ways to store data.

“The ding is for the quality of data because it’s still kind of unknown. At this point, I think that our team would say the subjective eye is better.”

Co-authors of the study are Lopez and Kracht, Jacob Hosen of Purdue University and the Florida Museum’s Michelle LeFebvre, Rachel Woodcock and William Keegan.

Funding: The research was funded by the University of Florida Caribbean Archaeology Fund.

Contact: Natalie van Hoose – Florida Museum of Natural History

Original Research: Open access.
“Is It Better to Be Objectively Wrong or Subjectively Right?” by Lindsay Bloch et al. Advances in Archaeological Practice

Abstract

For many years, archaeologists have relied on Munsell Soil Color Charts (MSCC) as tools for standardizing the recording of soil and sediment colors in the field and artifacts such as pottery in the lab. Users have identified multiple potential sources of discrepancy in results, such as differences in inter-operator perception, light source, or moisture content of samples. In recent years, researchers have developed inexpensive digital methods for color identification, but these typically cannot be done in real time. Now, a field-ready digital color-matching instrument is marketed to archaeologists as a replacement for MSCC, but the accuracy and overall suitability of this device for archaeological research has not been demonstrated. Through three separate field and laboratory trials, we found systematic mismatches in the results obtained via device, including variable accuracy against standardized MSCC chips, which should represent ideal samples. At the same time, the instrument was consistent in its readings. This leads us to question whether using the “subjective” human eye or the “objective” digital eye is preferable for data recording of color. We discuss how project goals and limitations should be considered when deciding which color-recording method to employ in field and laboratory settings, and we identify optimal procedures.