New software developed by researchers detects a person’s ability to understand or share feelings in therapy sessions.
“And how does that make you feel?”
Empathy is the foundation of therapeutic intervention. But how can you know if your therapist is or will be empathetic? Technology developed by researchers from USC, the University of Washington and the University of Utah can tell you.
Leveraging developments in automatic speech recognition, natural language processing and machine learning, researchers developed software to detect “high-empathy” or “low-empathy” speech by analyzing more than 1,000 therapist-patient sessions. The researchers designed a machine-learning algorithm that takes speech as its input to automatically generate an empathy score for each session.
The methodology is documented in a forthcoming article titled “Rate My Therapist: Automated Detection of Empathy in Drug and Alcohol Counseling via Speech and Language Processing.” According to the authors, it’s the first study of its kind to record therapy sessions and automatically determine the quality of a therapy session based on a single characteristic. The study appears in the December issue of PLOS ONE.
Some things don’t change
Currently, there are very few ways to assess the quality of a therapy session. In fact, according to the researchers, the methods for evaluating therapy have remained unchanged for 70 years. Methods requiring third-party human evaluators are time-consuming and affect the privacy of each session.
Instead, imagine a natural language processing app like Siri listening in for the right phrases and vocal qualities. The researchers are building on an emerging field in engineering and computer science called behavioral signal processing, which “utilizes computational methods to assist in human decision-making about behavioral phenomena.”
The authors taught their algorithm to recognize empathy via data from training sessions for therapists, specifically looking at therapeutic interactions with individuals coping with addiction and alcoholism. Using automatic speech recognition and machine learning-based models, the algorithm then automatically identified select phrases that would indicate whether a therapist demonstrated high or low empathy.
Key phrases such as “it sounds like,” “do you think” and “what I’m hearing” indicated high empathy, while phrases such as “next question,” “you need to” and “during the past” were perceived as low empathy by the computational model.
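To make the idea concrete, here is a minimal sketch in Python of how phrase cues like these could be turned into a rough score. It is not the researchers’ model, which learns its cues from transcribed sessions; the hand-written phrase lists (copied from the examples above), the function names and the scoring formula are all assumptions made for illustration.

```python
# Phrase lists copied from the article's examples. The study's model learns
# such cues from data; hard-coding them here is an illustrative assumption.
HIGH_EMPATHY = ("it sounds like", "do you think", "what i'm hearing")
LOW_EMPATHY = ("next question", "you need to", "during the past")


def normalize(text):
    """Lowercase, straighten curly apostrophes and collapse whitespace."""
    return " ".join(text.lower().replace("\u2019", "'").split())


def count_phrases(text, phrases):
    """Count how many times any of the given phrases occurs in the text."""
    norm = normalize(text)
    return sum(norm.count(phrase) for phrase in phrases)


def empathy_score(transcript):
    """Hypothetical score in [-1, 1]: +1 if only high-empathy cues appear,
    -1 if only low-empathy cues appear, 0.0 if no cue appears at all."""
    high = count_phrases(transcript, HIGH_EMPATHY)
    low = count_phrases(transcript, LOW_EMPATHY)
    total = high + low
    if total == 0:
        return 0.0
    return (high - low) / total
```

A transcript containing only phrases from the first list scores 1.0, one containing only phrases from the second scores -1.0, and a transcript with none of the cues scores 0.0. A real system would weigh many more phrases and would have to cope with speech recognition errors.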
“Technological advances in human behavioral signal processing and informatics promise to not only scale up and provide cost savings through automation of processes that are typically manual, but enable new insights by offering tools for discovery,” said Shrikanth Narayanan, the study’s senior author and professor at the USC Viterbi School of Engineering. “This particular study gets at a hidden mental state and what this shows is that computers can be trained to detect constructs like empathy using observational data.”
Narayanan’s team in the Signal Analysis and Interpretation Lab at USC Viterbi continues to develop more advanced models, giving the algorithm the capacity to analyze diction, tone of voice, the musicality of one’s speech (prosody) and how the cadence of one speaker in a conversation is echoed by the other (for example, when a person talks fast and the listener mirrors the rhythm by responding quickly).
In the near term, the researchers are hoping to use this tool to train aspiring therapists.
“Being able to assess the quality of psychotherapy is critical to ensuring that patients receive quality treatment,” said David Atkins, a University of Washington research professor of psychiatry.
“The sort of technology our team of engineers and psychologists is developing may offer one way to help providers get immediate feedback on what they are doing — and ultimately improve the effectiveness of mental health care,” said Zac Imel, a University of Utah professor of educational psychology and the study’s corresponding author.
In the long run, the team hopes to create software that provides real-time feedback or rates a therapy session on the spot. In addition, the researchers want to incorporate additional elements into their algorithm, including acoustic channels and the frequency with which a therapist or patient speaks.
Bo Xiao and Panayiotis Georgiou of USC Viterbi also contributed to the research.
Source: Amy Blumenthal – USC
Image Source: The image is adapted from the USC press release.
Original Research: Full open access research for “‘Rate My Therapist’: Automated Detection of Empathy in Drug and Alcohol Counseling via Speech and Language Processing” by Bo Xiao, Zac E. Imel, Panayiotis G. Georgiou, David C. Atkins, and Shrikanth S. Narayanan in PLOS ONE. Published online December 2, 2015. doi:10.1371/journal.pone.0143055
“Rate My Therapist”: Automated Detection of Empathy in Drug and Alcohol Counseling via Speech and Language Processing
The technology for evaluating patient-provider interactions in psychotherapy–observational coding–has not changed in 70 years. It is labor-intensive, error prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally-derived empathy ratings were evaluated against human ratings for each provider. Computationally-derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (a weighted average of sensitivity and specificity) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies.
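For readers unfamiliar with the two figures quoted in the abstract, the sketch below shows how a correlation and an F-score can be computed when machine ratings are compared with human ratings. It uses the common textbook definitions: Pearson correlation, and the F-score as the harmonic mean of precision and recall (recall is the same quantity as sensitivity). The function names are hypothetical, and the study’s own evaluation procedure may differ in detail.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def f_score(human, machine, positive="high"):
    """F-score (harmonic mean of precision and recall) for one class.

    `human` and `machine` are equal-length lists of labels such as
    "high" and "low"; `positive` names the class being scored.
    """
    pairs = list(zip(human, machine))
    tp = sum(1 for h, m in pairs if h == positive and m == positive)
    fp = sum(1 for h, m in pairs if h != positive and m == positive)
    fn = sum(1 for h, m in pairs if h == positive and m != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Calling `pearson` on the human and computer empathy scores for a set of sessions gives a value between -1 and 1, and calling `f_score` on the corresponding high/low labels gives a value between 0 and 1. Both are higher when the machine agrees more closely with the human raters.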