Application of the SMOTE algorithm to unbalanced data

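Authors: Sun Tao, Wu Haifeng, Liang Zhigang, He Wen, Zhang Lei, Lü Pingxin, Guo Xiuhua
Affiliation: School of Public Health and Family Medicine, Capital Medical University (Beijing 100069)
Keywords: SMOTE; unbalanced data; clinical data
Year·Volume·Issue (pages): 2012·31·5 (528-530)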
Abstract:

Objective: Clinical data are often unbalanced, that is, the numbers of positive and negative cases are unequal, and analysing such data without preprocessing can bias the results. Many methods exist for handling unbalanced data, but most have drawbacks such as overfitting or loss of data. Methods: This paper introduces the principle of the SMOTE algorithm and its implementation in the R language, and applies SMOTE to a real clinical data set as a worked example. Results: The ratio of benign to malignant cases was 1/3 in the original data and 1 after processing with SMOTE. Conclusions: The SMOTE algorithm can effectively correct the imbalance in unbalanced data.
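The abstract only states the outcome (a ratio of 1/3 raised to 1), so the following is a minimal sketch of how SMOTE is generally described to work: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. It is written in Python rather than the R used in the paper, it is not the authors' implementation, and the function name `smote`, its parameters, and the toy two-feature data are all assumptions made for illustration. It also assumes at least two minority samples and numeric features.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours (Euclidean)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for i in range(n_new):
        idx = i % len(minority)    # cycle through the minority samples in turn
        base = minority[idx]
        # Neighbours are searched within the minority class only, excluding base.
        others = [p for j, p in enumerate(minority) if j != idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()         # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data (not the clinical data from the paper):
# 4 benign (minority) and 12 malignant cases, i.e. a ratio of 1/3.
data_rng = random.Random(1)
benign = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(4)]
malignant = [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(12)]

# Generate just enough synthetic benign cases to match the malignant count.
new_benign = smote(benign, n_new=len(malignant) - len(benign), k=3)
balanced_benign = benign + new_benign
print(len(balanced_benign) / len(malignant))  # 1.0
```

Because every synthetic point is an interpolation rather than a copy, this avoids the exact duplication of plain random oversampling, and no majority-class cases are discarded as they would be with undersampling.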


Address: Editorial Office of Beijing Biomedical Engineering, Anzhen Hospital, Andingmenwai, Beijing
Tel: 010-64456508  Fax: 010-64456661