
AI-Driven Detection of Loose Associations in Clinical High-Risk for Psychosis: A Machine Learning Algorithm Using Language Model Metrics

Enrique Gutiérrez, PhD (1,2,*), Carlos Quesada, PhD (1), Emily DeFraites, MD, MPH (2,3,4), Danielle J. Harper, PhD (2,5,6), and Amar D. Mandavia, MA, PhD (2,7)
1 Polytechnic University of Madrid
2 MIT linQ
3 Greater Los Angeles VA Healthcare System
4 School of Medicine, University of California Los Angeles
5 Massachusetts General Hospital
6 Harvard Medical School
7 Boston VA Healthcare System

Background: This work introduces a novel approach for detecting Loose Associations (LA) in individuals at clinical high risk for psychosis, using surprisal metrics derived from the probability distributions returned by pretrained Large Language Models (LLMs). An instance of LA is an abrupt shift in conversation to a distant topic without preparing the listener. LA are known to occur consistently in populations at risk for psychosis.

Methods: We use surprisal, a measure of how unpredictable a word is in its context, computed from LLMs such as Llama, GPT, or Gemma, to develop an algorithm capable of identifying instances of LA. Our method generates a surprisal fingerprint for each utterance; these fingerprints are then used as features in an extreme gradient boosting (XGBoost) classifier.
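
As an illustration only, the surprisal of a token is conventionally defined as the negative log of its conditional probability, -log2 P(token | preceding context). The sketch below is not the authors' implementation: the abstract does not specify how a surprisal fingerprint is built, so the `fingerprint` function and its summary statistics are hypothetical, and the token probabilities are hand-picked stand-ins for values an LLM would return.

```python
import math
from statistics import mean, pstdev


def surprisal(probabilities):
    """Convert per-token conditional probabilities into surprisal in bits."""
    return [-math.log2(p) for p in probabilities]


def fingerprint(surprisals):
    """Hypothetical fixed-length summary of a variable-length surprisal sequence.

    The abstract does not define the fingerprint; these statistics are
    assumptions chosen only to show the idea of turning an utterance
    into classifier features.
    """
    # Absolute change in surprisal between consecutive tokens.
    jumps = [abs(b - a) for a, b in zip(surprisals, surprisals[1:])]
    return {
        "mean": mean(surprisals),
        "std": pstdev(surprisals),
        "max": max(surprisals),
        "max_jump": max(jumps) if jumps else 0.0,
    }


# Made-up probabilities (assumption): in a real pipeline these would be
# the probabilities a pretrained LLM assigns to each observed token.
coherent = [0.5, 0.25, 0.5, 0.25]
abrupt_shift = [0.5, 0.25, 0.5, 1 / 128]  # last token is highly unexpected

print(fingerprint(surprisal(coherent)))
print(fingerprint(surprisal(abrupt_shift)))
```

In this toy setting the abrupt topic shift shows up as a larger maximum surprisal and a larger jump between consecutive tokens. Feature vectors of this kind could then be passed to a gradient boosting classifier such as XGBoost, which is the model family the abstract names.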

Results: Our approach achieves an accuracy of around 80% in distinguishing utterances that exhibit LA from those that do not, underscoring its potential as a rapid and scalable diagnostic aid. To address the lack of available LA data, we also use LLMs to generate a synthetic database of utterances that simulate the speech patterns of individuals with and without LA. This use of synthetic data enriches the training dataset and highlights the versatility of LLMs in psychiatric research.

Conclusions: Our findings indicate that surprisal metrics, when combined with machine learning techniques, offer a promising avenue for enhancing the specificity of psychosis risk assessment tools, ultimately facilitating timely and personalized intervention strategies.