Summary: A new deep learning algorithm, based on automatic speech recognition, can accurately identify features in an infant’s cry and differentiate between normal and abnormal cry signals.
Source: Chinese Association of Automation
Every parent knows the frustration of responding to a baby’s cries, wondering if it is hungry, wet, tired, in need of a hug, or perhaps even in pain. A group of researchers in the USA has devised a new artificial intelligence method that can identify and distinguish between normal cry signals and abnormal ones, such as those resulting from an underlying illness. The method, based on a cry language recognition algorithm, promises to be useful to parents at home as well as in healthcare settings, where doctors could use it to interpret the cries of sick children.
The research was published in the May issue of IEEE/CAA Journal of Automatica Sinica (JAS), a joint publication of the IEEE and the Chinese Association of Automation.
Experienced health care workers and seasoned parents can distinguish among a baby’s many needs fairly accurately based on the crying sounds it makes. While each baby’s cry is unique, cries that arise from the same cause share common features. Identifying the hidden patterns in the cry signal has been a major challenge, and artificial intelligence applications have now been shown to be an appropriate solution in this context.
The new research uses a specific algorithm based on automatic speech recognition to detect and recognize the features of infant cries. To analyze and classify those signals, the team used compressed sensing as a way to process large amounts of data more efficiently. Compressed sensing is a process that reconstructs a signal from sparse data and is especially useful when sounds are recorded in noisy environments, which is where babies typically cry. In this study, the researchers designed a new cry language recognition algorithm that can distinguish the meanings of both normal and abnormal cry signals in a noisy environment. The algorithm is independent of the individual crier, meaning that it can be applied more broadly in practical scenarios to recognize and classify various cry features and to better understand why babies are crying and how urgent the cries are.
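To make the idea of reconstructing a signal from sparse data concrete, here is a minimal Python sketch of sparse recovery using orthogonal matching pursuit. It is a generic textbook illustration, not the researchers' algorithm: the function name `omp`, the random measurement matrix, and all sizes are assumptions chosen for the example, and the signal is synthetic rather than a recorded cry.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: estimate a k-sparse x from y = A @ x."""
    residual = y.copy()
    support = []              # indices of columns selected so far
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit of y on the selected columns.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

# Synthetic example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, m, k = 256, 100, 4         # signal length, measurements, nonzero entries
x = np.zeros(n)
positions = rng.choice(n, size=k, replace=False)
x[positions] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
y = A @ x                                  # m measurements, far fewer than n
x_hat = omp(A, y, k)
print("reconstruction error:", np.linalg.norm(x - x_hat))
```

The sketch takes far fewer measurements than the signal has samples and still looks for the few nonzero entries that explain them, which is the property that makes compressed sensing attractive for large or noisy recordings.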
“Like a special language, there are lots of health-related information in various cry sounds. The differences between sound signals actually carry the information. These differences are represented by different features of the cry signals. To recognize and leverage the information, we have to extract the features and then obtain the information in it,” says Lichuan Liu, corresponding author, Associate Professor of Electrical Engineering, and Director of the Digital Signal Processing Laboratory, whose group conducted the research.
The researchers hope that the findings of their study could be applicable to several other medical care circumstances in which decision making relies heavily on experience. “The ultimate goals are healthier babies and less pressure on parents and care givers,” says Liu. “We are looking into collaborations with hospitals and medical research centers, to obtain more data and requirement scenario input, and hopefully we could have some products for clinical practice,” she adds.
Infant Cry Language Analysis and Recognition: An Experimental Approach
Recently, lots of research has been directed towards natural language processing. However, the baby’s cry, which serves as the primary means of communication for infants, has not yet been extensively explored, because it is not a language that can be easily understood. Since cry signals carry information about a baby’s wellbeing and can be understood by experienced parents and experts to an extent, recognition and analysis of an infant’s cry is not only possible, but also has profound medical and societal applications. In this paper, we obtain and analyze audio features of infant cry signals in time and frequency domains. Based on the related features, we can classify given cry signals to specific cry meanings for cry language recognition. Features extracted from audio feature space include linear predictive coding (LPC), linear predictive cepstral coefficients (LPCC), Bark frequency cepstral coefficients (BFCC), and Mel frequency cepstral coefficients (MFCC). A compressed sensing technique was used for classification, and practical data were used to design and verify the proposed approaches. Experiments show that the proposed infant cry recognition approaches offer accurate and promising results.
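As an illustration of two of the features named in the abstract, the following Python sketch computes LPC coefficients with the autocorrelation method and converts them to LPCC with the standard recursion for an all-pole model. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `lpc` and `lpcc`, the model order, and the synthetic test signal (standing in for a cry recording) are all hypothetical, and BFCC and MFCC are not shown.

```python
import numpy as np

def lpc(frame, order):
    """LPC via the autocorrelation method.

    Returns a[0..order-1] such that x[n] is approximated by
    sum_k a[k-1] * x[n-k] for k = 1..order.
    """
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([frame[: n - lag] @ frame[lag:] for lag in range(order + 1)])
    # Solve the Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def lpcc(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model (gain term omitted)."""
    p = len(a)
    c = np.zeros(n_ceps + 1)  # c[0] is left unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

# Synthetic second-order autoregressive signal (an assumption for illustration).
rng = np.random.default_rng(0)
noise = rng.normal(size=20000)
signal = np.zeros_like(noise)
for i in range(2, len(signal)):
    signal[i] = 1.3 * signal[i - 1] - 0.6 * signal[i - 2] + noise[i]

a = lpc(signal, order=2)
print("LPC: ", a)
print("LPCC:", lpcc(a, n_ceps=4))
```

In practice such features would be computed on short, windowed frames of a recording and passed to a classifier; the whole-signal call here only keeps the example short.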