Author: Ajith Raam D

World Mental Health Day on 10th October cast a long-overdue spotlight on one of the most neglected areas of public health. Nearly a billion people have a mental disorder, and a suicide occurs every 40 seconds. In developing countries, under 25% of people with mental, substance use, or neurological disorders receive treatment [1]. COVID-19 has worsened the crisis; with healthcare services disrupted, the hidden pandemic of mental ill-health remains largely unaddressed.

In this article, we share some perspectives on the role ML can play and an example of a real-life AI solution we built at Tiger Analytics to address a specific mental-health-related problem.

ML is Already a Part of Physical Healthcare

Algorithms process Magnetic Resonance Imaging (MRI) scans. Clinical notes are parsed to pinpoint the onset of illnesses earlier than physicians can discern them. Cardiovascular disease and diabetes, two of the leading causes of death worldwide, are diagnosed using neural networks, decision trees, and support vector machines. Clinical trials are monitored and assessed remotely to maintain physical distancing protocols.

These are ‘invasive’ approaches whose objective is to automate what can be, and usually is, done by humans, but at greater speed and scale. In the field of mental health, ML can be applied in non-invasive, more humanistic ways that nudge physicians towards better treatment strategies.

Clinical Trials of Mental Health Drugs

In clinical trials of mental health drugs, physicians and patients engage in detailed discussions of the patients’ mental state at each treatment stage. The efficacy of these drugs is determined using a combination of certain biomarkers, body vitals, and mental state as determined by the patient’s interaction with the physician.

The problem with this approach is that a key input for determining drug efficacy is the set of responses given by a person who is going through mental health issues. To avoid errors, these interviews are recorded, and multiple experts listen to the long recordings to evaluate the quality of each interview and the conclusions drawn from it.

Two concerns arise: first, time and budget allow only a sample of interviews to be evaluated, which means there is an increased risk of fallacious conclusions regarding drug efficacy; and second, patients may not express all they are feeling in words. A multitude of emotions may be missed or misinterpreted, generating incorrect evaluation scores.

The Problem the Tiger Analytics Team Tackled

Working with a pharmaceutical company, Tiger Analytics used speech analytics to identify ‘good’ interviews, i.e., ones that meet quality standards for inclusion in clinical trials, minimizing the number of interviews that were excluded after evaluation, and saving time and expense.

As a data scientist, you face several typical questions when working on a problem such as this: What types of signal processing can you use to extract audio features? What non-audio features would be useful? How do you remove background noise from the interviews? How do you look for patterns in language? How do you account for reviewers’ biases, which are inevitable in subjective events like interviews?

Below we walk you through the process the Tiger Analytics team used to develop the solution.

Step 1: Pre-processing

We removed background noise from the digital audio files and split them into alternating sections of speech and silence. We grouped the speech sections into clusters, each cluster representing one speaker. We created a full transcript of the interview to enable language processing.
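
To make the speech/silence split concrete, here is a minimal sketch using a fixed energy threshold on short frames. It is an illustration only, not the method the team used: the function names, the 20 ms frame size, and the 0.02 threshold are all assumptions, and a production pipeline would typically add noise reduction and a more robust voice activity detector.

```python
import math

def frame_rms(samples, frame_len):
    """Root-mean-square energy of each non-overlapping frame."""
    rms_values = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        rms_values.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return rms_values

def split_speech_silence(samples, sample_rate, frame_ms=20, threshold=0.02):
    """Return alternating (label, start_sec, end_sec) runs.

    frame_ms and threshold are illustrative defaults (assumptions),
    not values taken from the article.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    segments = []
    for idx, rms in enumerate(frame_rms(samples, frame_len)):
        label = "speech" if rms >= threshold else "silence"
        start = idx * frame_len / sample_rate
        end = (idx + 1) * frame_len / sample_rate
        if segments and segments[-1][0] == label:
            # Extend the current run instead of starting a new one.
            segments[-1] = (label, segments[-1][1], end)
        else:
            segments.append((label, start, end))
    return segments

# Synthetic demo: 0.5 s of silence, 1 s of a 220 Hz tone, 0.5 s of silence.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]
signal = [0.0] * (sr // 2) + tone + [0.0] * (sr // 2)
segments = split_speech_silence(signal, sr)
```

On the synthetic signal, the result is three runs: silence, speech, silence. Grouping the speech runs by speaker would be a separate clustering step on per-segment voice features and is not shown here.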

Step 2: Feature extraction

We extracted several hundred features of the audio, from direct aspects like interview duration and voice amplitude to the more abstract speech rates, frequency-wise energy content, and Mel-frequency cepstral coefficients (MFCCs). We used NLP to extract several features from the interview transcript. These captured the unique personal characteristics of individual speakers.
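
As a rough illustration of what a few of these audio features look like in code, the sketch below computes duration, peak amplitude, RMS energy, zero-crossing rate, and the signal power at a handful of probe frequencies (via the Goertzel recurrence). The function names and the probe frequencies are assumptions made for this example. MFCCs are left out because they are normally computed with a dedicated audio library.

```python
import math

def goertzel_power(samples, sample_rate, freq_hz):
    """Signal power at a single frequency using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def basic_audio_features(samples, sample_rate, probe_freqs=(220, 1000, 3000)):
    """Small feature dictionary; probe_freqs is an illustrative assumption."""
    n = len(samples)
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    features = {
        "duration_sec": n / sample_rate,
        "peak_amplitude": max(abs(x) for x in samples),
        "rms_energy": math.sqrt(sum(x * x for x in samples) / n),
        "zero_crossing_rate": crossings / (n - 1),
    }
    for f in probe_freqs:
        features[f"power_{f}hz"] = goertzel_power(samples, sample_rate, f)
    return features

# Synthetic demo: one second of a 220 Hz tone sampled at 8 kHz.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]
features = basic_audio_features(tone, sr)
```

For the pure tone, nearly all of the measured power sits at the 220 Hz probe, which is the kind of frequency-wise energy pattern a model can learn from.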

Beyond this, we captured contextual features such as interview length, the tone of the interviewer, any gender-related patterns, the interview load on the physician, and the time of day, among many others.

Step 3: Prediction

We constructed an Interview Quality Score (IQS) representing the combination of several qualitative and quantitative aspects of each interview. We ensembled boosted trees, support vector machines, and random forests to segregate high-quality interviews from those with issues.
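
The article does not say how the three model families were combined. One common option is soft voting, where each model's predicted probability of a ‘good’ interview is averaged. The sketch below assumes that approach; the function name, the equal weights, and the probability values are all made up for illustration.

```python
def soft_vote(model_probs, weights=None):
    """Weighted average of per-model probabilities for each interview.

    model_probs maps a model name to a list of P(good) values,
    one per interview, in the same order for every model.
    """
    names = list(model_probs)
    if weights is None:
        weights = {name: 1.0 for name in names}  # equal weights (assumption)
    total_weight = sum(weights[name] for name in names)
    n_items = len(model_probs[names[0]])
    return [
        sum(weights[name] * model_probs[name][i] for name in names) / total_weight
        for i in range(n_items)
    ]

# Hypothetical outputs of three already-trained models for three interviews.
model_probs = {
    "boosted_trees": [0.90, 0.20, 0.55],
    "svm": [0.80, 0.30, 0.45],
    "random_forest": [0.85, 0.10, 0.50],
}
ensemble_scores = soft_vote(model_probs)
```

Here the first interview scores high across all models, the second scores low, and the third lands in the middle, where the models offer no clear signal.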

This model confidently pre-screened about 75% of the interviews as good or bad and flagged the remainder as uncertain. Reviewers could now work faster and more productively, focusing only on the interviews where the model was not confident. Overall prediction accuracy improved 2.5x, with some segments returning over 90% accuracy.
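
This kind of pre-screening can be sketched as a simple confidence-based triage: scores near 1 are accepted as good, scores near 0 are marked bad, and everything in between goes to a human reviewer. The thresholds, function names, and toy scores below are assumptions chosen for illustration and are not taken from the project.

```python
def triage(scores, low=0.25, high=0.75):
    """Route each interview by model confidence.

    low and high are illustrative cut-offs (assumptions).
    """
    decisions = []
    for score in scores:
        if score >= high:
            decisions.append("good")
        elif score <= low:
            decisions.append("bad")
        else:
            decisions.append("needs_review")
    return decisions

def auto_screened_fraction(decisions):
    """Share of interviews the model handled without human review."""
    return sum(d != "needs_review" for d in decisions) / len(decisions)

# Toy scores, made up for this example.
scores = [0.92, 0.08, 0.55, 0.81, 0.40]
decisions = triage(scores)
coverage = auto_screened_fraction(decisions)
```

Widening the gap between the two thresholds sends more interviews to reviewers, while narrowing it increases automation at the cost of more model errors slipping through.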

ML Models ‘Hear’ What’s Left Unsaid

The analyses provided independent insights regarding pauses, paralinguistics (tone of voice, loudness, inflection, pitch), speech disfluency (fillers like ‘er’, ‘um’), and physician performance during such interviews.
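
Two of these signals, pauses and filler words, are simple enough to sketch directly. The example below assumes a plain-text transcript and a list of (label, start, end) segments in seconds. The filler list beyond ‘er’ and ‘um’, the 0.5-second minimum pause, and the sample inputs are all hypothetical.

```python
import re

# 'er' and 'um' come from the article; 'uh' and 'erm' are added assumptions.
FILLERS = {"er", "um", "uh", "erm"}

def disfluency_rate(transcript):
    """Fraction of spoken words that are fillers."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0
    return sum(w in FILLERS for w in words) / len(words)

def pause_stats(segments, min_pause_sec=0.5):
    """Summarize silences of at least min_pause_sec (an assumed cut-off)."""
    pauses = [
        end - start
        for label, start, end in segments
        if label == "silence" and end - start >= min_pause_sec
    ]
    return {
        "pause_count": len(pauses),
        "longest_pause_sec": max(pauses, default=0.0),
        "total_pause_sec": sum(pauses),
    }

# Made-up inputs for illustration.
transcript = "Um, I have been, er, sleeping better this week."
segments = [
    ("speech", 0.0, 2.0),
    ("silence", 2.0, 3.2),
    ("speech", 3.2, 5.0),
    ("silence", 5.0, 5.3),
    ("speech", 5.3, 7.0),
]
rate = disfluency_rate(transcript)
stats = pause_stats(segments)
```

In this toy case, two of the nine words are fillers, and only the 1.2-second silence is long enough to count as a pause.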

These models have wider applicability beyond clinical trials. Physicians can use model insights to guide treatment and therapy, leading to better mental health outcomes for their patients, whether in clinical trials or practice, addressing one of the critical public health challenges of our time.

[1] World Health Organization, United for Global Mental Health, and the World Federation for Mental Health, joint news release, 27 August 2020.

This article was first published in Analytics India Magazine.




