Keywords:
Diabetes
Machine learning
Grid search and cross-validation
Chebyshev distance
GCC-KNN
Abstract:
Diabetes is a common chronic disease worldwide, and timely screening and treatment are essential. Against the current backdrop of digitalization, combining machine learning with medical safety is of considerable significance. This paper studies the genetic classification and identification of potential diabetes patients among people undergoing hospital examinations. Because the features used to classify diabetes patients are numerous and dense, and because the main requirement of this work is recognition accuracy, the KNN algorithm is taken as the basis and improved, and the resulting GCC-KNN model is used for classification. The optimal value of K is determined with a grid search optimization algorithm, and several distance metrics are compared, with the Chebyshev distance selected as the best distance for this model. This yields a preliminary genetic classification of potential diabetes patients in the hospital examination population. In the comparative experiments, the GCC-KNN model achieved higher accuracy than every other model it was compared against on this classification task.
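The selection procedure summarized in the abstract (a grid search over K, scored by cross-validation, with several distance metrics compared) can be illustrated with a minimal sketch. This is plain KNN in standard-library Python, not the GCC-KNN model itself, whose details the abstract does not give. The synthetic two-class data, the candidate K values, the set of metrics, the fold count, and all function names are assumptions made for illustration only.

```python
import random
from collections import Counter
from itertools import product


def chebyshev(a, b):
    # Chebyshev distance: the largest absolute difference over all features.
    return max(abs(x - y) for x, y in zip(a, b))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Candidate metrics to compare (an assumed set, not taken from the paper).
METRICS = {"euclidean": euclidean, "manhattan": manhattan, "chebyshev": chebyshev}


def knn_predict(train_X, train_y, x, k, dist):
    # Majority vote among the k training points closest to x.
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, k, dist, folds=5):
    # k-fold cross-validation; fold f holds out indices f, f+folds, f+2*folds, ...
    n = len(X)
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, n, folds))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k, dist) == y[i] for i in test_idx
        )
    return correct / n


def grid_search(X, y, k_values, metric_names, folds=5):
    # Try every (k, metric) pair and keep the one with the best CV accuracy.
    best = None
    for k, name in product(k_values, metric_names):
        score = cv_accuracy(X, y, k, METRICS[name], folds)
        if best is None or score > best[0]:
            best = (score, k, name)
    return best


def make_demo_data(n_per_class=30, n_features=4, seed=0):
    # Synthetic stand-in data: two Gaussian blobs, NOT the paper's dataset.
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0) for _ in range(n_features)])
            y.append(label)
    order = list(range(len(X)))
    rng.shuffle(order)
    return [X[i] for i in order], [y[i] for i in order]


if __name__ == "__main__":
    X, y = make_demo_data()
    score, k, name = grid_search(X, y, [1, 3, 5, 7], list(METRICS))
    print(f"best CV accuracy={score:.3f} with k={k}, metric={name}")
```

On real data, which metric wins depends on the feature distribution. The abstract reports that the Chebyshev distance was best for the authors' model, and this toy example is not expected to reproduce that result.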