
A new intelligent diagnosis method of depression based on audio signals

Authors: 辛逸男, 吳鵬飛, 劉欣陽, 劉志寬
Affiliation: School of Biomedical Engineering, South-Central Minzu University (Wuhan 430074)
Corresponding author: 張莉. E-mail: [email protected]
Keywords: intelligent diagnosis of depression; short-term features; feature combination; long-term features; random forest algorithm
Classification number: R318.04
Year·Volume·Issue (Pages): 2023·42·1 (38-44)
Abstract:

Objective: To propose a new machine learning diagnosis method based on audio signals, in order to realize clinical intelligent diagnosis of depression. Methods: The audio signals of depressed patients and healthy controls were selected as the signal source. The feature set combined short-term and long-term audio features: after the short-term features were discretized, new long-term features were generated by independent-combination and co-occurrence methods, and the random forest and extreme gradient boosting algorithms were used for classification and evaluation. Results: Compared with short-term features, long-term features, and deep learning approaches, the combined features achieved absolute improvements in F1 score of 21%, 14%, and 14%, and absolute improvements in non-depression sensitivity of 36%, 29%, and 7%. Conclusions: The feature combination method can classify depression levels well based on audio segments.
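The pipeline described in the abstract — discretize frame-level (short-term) audio features, build per-recording long-term features from how bins co-occur, then classify with a random forest — can be sketched as below. This is a minimal illustration, not the authors' exact implementation: the bin count, the equal-width discretization, and the synthetic data are all assumptions made for the example.

```python
# Hedged sketch of the feature-combination idea: short-term frame features
# are binned, co-occurring bin pairs are counted per recording to form a
# long-term feature vector, and a random forest classifies the result.
# N_BINS and the equal-width binning are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

N_BINS = 4  # assumed discretization granularity


def discretize(frames):
    """Map each frame-level feature value to an integer bin (equal-width)."""
    lo, hi = frames.min(axis=0), frames.max(axis=0)
    width = np.where(hi > lo, (hi - lo) / N_BINS, 1.0)
    return np.minimum(((frames - lo) / width).astype(int), N_BINS - 1)


def cooccurrence_features(frames):
    """Count how often bin i of feature a coincides with bin j of feature b
    in the same frame, yielding one fixed-length vector per recording."""
    bins = discretize(frames)
    n_feat = bins.shape[1]
    counts = np.zeros((n_feat, n_feat, N_BINS, N_BINS))
    for row in bins:
        for a in range(n_feat):
            for b in range(a + 1, n_feat):
                counts[a, b, row[a], row[b]] += 1
    return counts.ravel() / len(bins)  # normalize by frame count


rng = np.random.default_rng(0)
# Synthetic stand-in for real speech data: 40 recordings,
# each with 50 frames of 3 short-term features.
recordings = rng.normal(size=(40, 50, 3))
labels = rng.integers(0, 2, size=40)  # 0 = non-depressed, 1 = depressed

X = np.array([cooccurrence_features(r) for r in recordings])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.score(X, labels))
```

In practice the frame-level features would come from an audio feature extractor rather than random numbers, and the extreme gradient boosting model mentioned in the abstract could be swapped in for the random forest with the same `X`.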


Address: Editorial Office of Beijing Biomedical Engineering, Beijing Anzhen Hospital, Andingmenwai, Beijing
Tel: 010-64456508  Fax: 010-64456661
E-mail: [email protected]