• 基于 Logistic 回归与随机森林的慢性阻塞性肺疾病治疗成功率影响因素排序及预测模型构建
  • Construction of a predictive model and ranking of influencing factors for the success rate of chronic obstructive pulmonary disease treatment based on logistic regression and random forest
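  • Yang Lingmei (杨灵梅). Construction of a predictive model and ranking of influencing factors for the success rate of chronic obstructive pulmonary disease treatment based on logistic regression and random forest [J]. Journal of Internal Intensive Medicine (内科急危重症杂志), 2026, 32(2): 167-171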
    DOI:10.11768/nkjwzzzz20260213
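    Chinese keywords:  chronic obstructive pulmonary disease  chronic health status  respiratory disease  blood pH value  random forest model
    English keywords:
    Funding: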
    Author  Affiliation  E-mail
    Yang Lingmei (杨灵梅)  349694173@qq.com 
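    Chinese abstract (translated):
          Abstract Objective: To rank the importance of factors influencing the treatment success rate of chronic obstructive pulmonary disease (COPD) and to construct a predictive model, using logistic regression and the random forest algorithm. Methods: This was a retrospective study. Hospitalized COPD patients were enrolled and divided into a success group and a failure group according to whether treatment succeeded. Logistic regression was used to analyze the factors influencing COPD treatment success, and a random forest prediction model was constructed; patients were randomly divided into a training set and a validation set at a 7:3 ratio. In the training set, the random forest algorithm was used to build the prediction model for COPD treatment success, and variables were ranked by their predictive contribution to explore the influencing factors. The validation set was used to test the model and evaluate its performance. Results: A total of 146 COPD patients were included: 103 (70.55%) in the success group and 43 (29.45%) in the failure group. Patients in the success group were younger, and the proportions of patients with other comorbidities, a smoking history, and a history of severe childhood respiratory disease, the Acute Physiology and Chronic Health Evaluation II (APACHE II) score on the first day of admission, and the blood pH value 4 h after treatment were all lower than in the failure group (all P<0.05). Logistic regression analysis showed that age, other comorbidities, smoking history, APACHE II score on the first day of admission, history of severe childhood respiratory disease, and blood pH value 4 h after treatment were independent factors influencing the COPD treatment success rate (OR>1, P<0.05). The importance ranking, in order, was age, smoking history, other comorbidities, APACHE II score on the first day of admission, history of severe childhood respiratory disease, and blood pH value 4 h after treatment. The random forest model had an error rate of 9.67% on the training set and 1.37% on the validation set. Conclusion: Age, other comorbidities, smoking history, APACHE II score on the first day of admission, history of severe childhood respiratory disease, and blood pH value 4 h after treatment are independent factors influencing the COPD treatment success rate; the importance ranking, in order, is age, smoking history, other comorbidities, APACHE II score on the first day of admission, history of severe childhood respiratory disease, and blood pH value 4 h after treatment.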
    English abstract:
          Abstract Objective: To rank the factors influencing the success rate of chronic obstructive pulmonary disease (COPD) treatment and to construct a predictive model using logistic regression and random forest algorithms. Methods: In this retrospective study, hospitalized COPD patients were selected as study subjects and divided into a success group and a failure group according to treatment outcome. Logistic regression was used to analyze the factors influencing COPD treatment success, and a random forest predictive model was developed. Patients were randomly assigned to training and validation sets at a 7:3 ratio. The random forest algorithm was applied to the training set to build a predictive model for COPD treatment success, and variables were ranked by their predictive contribution to identify key influencing factors. The model was tested on the validation set, and its predictive performance was evaluated. Results: A total of 146 COPD patients were included, with 103 (70.55%) in the success group and 43 (29.45%) in the failure group. Patients in the success group were younger and, compared with the failure group, had lower proportions of comorbidities, smoking history, and severe childhood respiratory disease, lower Acute Physiology and Chronic Health Evaluation II (APACHE II) scores on admission day 1, and lower blood pH values 4 h after treatment (all P < 0.05). Logistic regression analysis identified age, comorbidities, smoking history, APACHE II score on admission day 1, history of severe childhood respiratory disease, and blood pH value 4 h after treatment as independent factors influencing COPD treatment success (OR > 1, P < 0.05). The importance ranking of these factors was as follows: age, smoking history, comorbidities, APACHE II score on admission day 1, history of severe childhood respiratory disease, and blood pH value 4 h after treatment. The random forest model had a training error rate of 9.67% and a validation error rate of 1.37%. 
Conclusion: Age, comorbidities, smoking history, APACHE II score on admission day 1, history of severe childhood respiratory disease, and blood pH value 4 h after treatment are independent factors influencing COPD treatment success. In order of importance, these factors rank as follows: age, smoking history, comorbidities, APACHE II score on admission day 1, history of severe childhood respiratory disease, and blood pH value 4 h after treatment.
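
The workflow described in the Methods (a random 7:3 split, a random forest fitted on the training set, variables ranked by predictive contribution, and error rates reported on both sets) can be illustrated with a minimal, standard-library-only Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the patient data are synthetic, only three hypothetical predictors (`age`, `smoking`, `apache_ii`) are used, the trees are shallow, and permutation importance stands in for whatever importance measure the study actually used.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return (impurity, feature, threshold) of the best split, or None."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, rng, max_depth=3, n_try=2):
    """Grow one tree; each node considers a random subset of n_try features."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority  # leaf: predicted class
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return majority
    _, f, t = split
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in li], [labels[i] for i in li], rng, max_depth - 1, n_try),
            build_tree([rows[i] for i in ri], [labels[i] for i in ri], rng, max_depth - 1, n_try))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are ints
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Fit n_trees trees, each on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def error_rate(forest, rows, labels):
    wrong = sum(predict_forest(forest, r) != y for r, y in zip(rows, labels))
    return wrong / len(rows)

def permutation_importance(forest, rows, labels, feature, rng):
    """Increase in error rate after shuffling one feature column."""
    col = [r[feature] for r in rows]
    rng.shuffle(col)
    shuffled = [r[:feature] + (v,) + r[feature + 1:] for r, v in zip(rows, col)]
    return error_rate(forest, shuffled, labels) - error_rate(forest, rows, labels)

def make_synthetic_patients(n=146, seed=0):
    """Synthetic stand-in data (NOT the study data); label 1 = treatment success."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        age, smoker, apache = rng.randint(45, 90), rng.randint(0, 1), rng.randint(5, 30)
        risk = (age - 45) / 45 + 0.5 * smoker + (apache - 5) / 25
        rows.append((age, smoker, apache))
        labels.append(1 if risk + rng.gauss(0, 0.2) < 1.2 else 0)
    return rows, labels

rows, labels = make_synthetic_patients()

# Random 7:3 split into training and validation sets.
idx = list(range(len(rows)))
random.Random(0).shuffle(idx)
cut = len(idx) * 7 // 10
train_x, train_y = [rows[i] for i in idx[:cut]], [labels[i] for i in idx[:cut]]
valid_x, valid_y = [rows[i] for i in idx[cut:]], [labels[i] for i in idx[cut:]]

forest = fit_forest(train_x, train_y)
train_err = error_rate(forest, train_x, train_y)
valid_err = error_rate(forest, valid_x, valid_y)

names = ["age", "smoking", "apache_ii"]  # hypothetical predictor names
rng = random.Random(1)
importance = {n: permutation_importance(forest, valid_x, valid_y, i, rng)
              for i, n in enumerate(names)}
ranking = sorted(names, key=lambda n: importance[n], reverse=True)

print(f"training error {train_err:.2%}, validation error {valid_err:.2%}")
print("importance ranking:", ranking)
```

With 146 patients, the 7:3 split in this sketch yields 102 training and 44 validation cases. The error rates and ranking it prints describe only the synthetic data and say nothing about the study's reported results.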