Detection of Suicidal Ideation in Clinical Interviews for Depression Using Natural Language Processing and Machine Learning: Cross-Sectional Study

Author:

Li Tim M H, Chen Jie, Law Framenia O C, Li Chun-Tung, Chan Ngan Yin, Chan Joey W Y, Chau Steven W H, Liu Yaping, Li Shirley Xin, Zhang Jihui, Leung Kwong-Sak, Wing Yun-Kwok

Abstract

Background: Assessing patients’ suicide risk is challenging, especially among those who deny suicidal ideation. Primary care providers have poor agreement in screening suicide risk. Patients’ speech may provide more objective, language-based clues about their underlying suicidal ideation. Text analysis to detect suicide risk in depression is lacking in the literature.

Objective: This study aimed to determine whether suicidal ideation can be detected via language features in clinical interviews for depression using natural language processing (NLP) and machine learning (ML).

Methods: This cross-sectional study recruited 305 participants between October 2020 and May 2022 (mean age 53.0, SD 11.77 years; female: n=176, 57%), of whom 197 had lifetime depression and 108 were healthy. This study was part of ongoing research on characterizing depression with a case-control design. In this study, 236 participants were nonsuicidal, while 56 and 13 had low and high suicide risks, respectively. The structured interview guide for the Hamilton Depression Rating Scale (HAMD) was adopted to assess suicide risk and depression severity. Suicide risk was clinician rated based on a suicide-related question (H11). The interviews were transcribed, and the words in participants’ verbal responses were translated into psychologically meaningful categories using Linguistic Inquiry and Word Count (LIWC).

Results: Ordinal logistic regression revealed significant suicide-related language features in participants’ responses to the HAMD questions. Increased use of anger words when talking about work and activities posed the highest suicide risk (odds ratio [OR] 2.91, 95% CI 1.22-8.55; P=.02). Random forest models demonstrated that text analysis of the direct responses to H11 was effective in identifying individuals with high suicide risk (AUC 0.76-0.89; P<.001) and detecting suicide risk in general, including both low and high suicide risk (AUC 0.83-0.92; P<.001). More importantly, suicide risk can be detected with satisfactory performance even without patients’ disclosure of suicidal ideation. Based on the response to the question on hypochondriasis, ML models were trained to identify individuals with high suicide risk (AUC 0.76; P<.001).

Conclusions: This study examined the prospect of using NLP and ML to analyze the texts from clinical interviews for suicidality detection, which has the potential to provide more accurate and specific markers for suicidal ideation detection. The findings may pave the way for developing high-performance assessment of suicide risk for automated detection, including online chatbot-based interviews for universal screening.
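The two quantitative ideas in the abstract, mapping words to psychologically meaningful categories and scoring a classifier by AUC, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the LIWC dictionary is proprietary, so `TOY_LEXICON` is a hypothetical stand-in with a handful of English words, and the function names `category_proportions` and `auc` are assumptions made for this illustration. The AUC is computed with the standard rank-based definition rather than from a trained random forest.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; the real LIWC
# dictionary is proprietary and much larger.
TOY_LEXICON = {
    "anger": {"angry", "hate", "annoyed", "furious"},
    "sad": {"sad", "hopeless", "cry", "lonely"},
}


def category_proportions(text, lexicon=TOY_LEXICON):
    """Return the percentage of words in `text` that fall in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter()
    for word in words:
        for category, vocab in lexicon.items():
            if word in vocab:
                counts[category] += 1
    return {c: (100.0 * counts[c] / total if total else 0.0) for c in lexicon}


def auc(scores, labels):
    """Rank-based AUC: the probability that a positive case receives a
    higher score than a negative case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Under these assumptions, `category_proportions("I hate my job and I feel sad")` counts one anger word and one sadness word out of eight, and `auc` gives the share of positive-negative pairs that a set of model scores ranks correctly. An AUC of 0.76 therefore means that a randomly chosen high-risk participant outscores a randomly chosen comparison participant about 76% of the time.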

Publisher

JMIR Publications Inc.

Subject

Health Information Management, Health Informatics
