Beijing Biomedical Engineering

Quantitative evaluation of internal validation methods for computer-aided diagnosis models

Authors: Chen Jieqing, Yang Qiuying, Chen Hui

Affiliation: School of Biomedical Engineering, Capital Medical University (Beijing 100069)

Keywords: computer-aided diagnosis; classifier; logistic regression; validation; pancreatic cancer

Classification numbers: R318; R576; TP311


Abstract:

Objective To quantitatively compare four commonly used internal validation methods and provide a reference for choosing a validation method when evaluating the performance of a computer-aided diagnosis model. Methods A logistic regression model was used for a pancreatic cancer diagnosis task on a large dataset (n=415) and a small dataset (n=76). Four internal validation methods were applied: hold-out, k-fold cross-validation, leave-one-out, and the 0.632 bootstrap. The stability, bias, and computational efficiency of each method were assessed using accuracy, sensitivity, specificity, and area under the ROC curve (AUC). Results On the large and small datasets, the 0.632 bootstrap gave standard errors of accuracy, sensitivity, specificity, and AUC of 0.012, 0.014, 0.010, 0.010 and 0.013, 0.014, 0.010, 0.011, respectively, all smaller than those of the other validation methods. The other methods overestimated or underestimated model performance to varying degrees. Conclusions Considering simplicity and effectiveness, k-fold cross-validation is sufficient to achieve the best internal validation results on a large dataset, whereas the 0.632 bootstrap is recommended for a small one.
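The two methods recommended in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the value of k, the number of bootstrap replicates, or how the estimates were combined, so `k=5`, `n_boot=100`, and the per-replicate averaging of out-of-bag accuracy are assumptions. To stay dependency-free, a hypothetical nearest-centroid classifier (`fit_centroids`, `predict`) stands in for the logistic regression model, and `make_toy_data` generates synthetic points rather than pancreatic cancer data.

```python
import random
from statistics import mean


def make_toy_data(n_per_class=20, seed=1):
    """Synthetic two-class data (assumption: NOT the paper's dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Hypothetical stand-in classifier: mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return mean(1.0 if predict(model, x) == t else 0.0 for x, t in zip(X, y))


def kfold_accuracy(X, y, k=5, seed=0):
    """Mean held-out accuracy over k disjoint folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = []
    for i in range(k):
        fold = idx[i::k]  # every k-th shuffled index forms one fold
        held = set(fold)
        train = [j for j in idx if j not in held]
        model = fit_centroids([X[j] for j in train], [y[j] for j in train])
        scores.append(accuracy(model, [X[j] for j in fold], [y[j] for j in fold]))
    return mean(scores)


def bootstrap_632_accuracy(X, y, n_boot=100, seed=0, w=0.632):
    """0.632 bootstrap: (1 - w) * apparent accuracy + w * out-of-bag accuracy."""
    rng = random.Random(seed)
    n = len(X)
    # Apparent accuracy: train and test on the full dataset (optimistic).
    apparent = accuracy(fit_centroids(X, y), X, y)
    oob_scores = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]  # draw n with replacement
        drawn = set(sample)
        oob = [j for j in range(n) if j not in drawn]  # samples never drawn
        if not oob:
            continue  # no out-of-bag samples in this replicate
        model = fit_centroids([X[j] for j in sample], [y[j] for j in sample])
        oob_scores.append(accuracy(model, [X[j] for j in oob], [y[j] for j in oob]))
    return (1 - w) * apparent + w * mean(oob_scores)


if __name__ == "__main__":
    X, y = make_toy_data()
    print("5-fold CV accuracy:", kfold_accuracy(X, y))
    print("0.632 bootstrap accuracy:", bootstrap_632_accuracy(X, y))
```

The weight 0.632 approximates 1 - 1/e, the expected fraction of distinct samples that appear in a bootstrap replicate. Blending the optimistic apparent accuracy with the pessimistic out-of-bag accuracy is what makes the estimator attractive for small datasets, consistent with the recommendation in the conclusions.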

