基于糖尿病足多阶段病程管理的AI智能体构建与验证

Construction and validation of an AI agent for multistage disease-course management in diabetic foot

  • 摘要:
    目的 探讨工具驱动型AI智能体在糖尿病足患者多阶段病程临床决策中的应用价值。
    方法 以大语言模型Qwen3-max为推理引擎,整合ReAct框架、检索增强生成(RAG)技术及多模态数据处理工具,构建适配糖尿病足多次复诊场景的AI智能体;基于34例糖尿病足患者(累计140次就诊)的回顾性数据进行验证,对比该智能体与原生大语言模型Qwen3-max及医学大语言模型Baichuan-M1-14B的性能差异。
    结果 AI智能体的临床实用性评分达(8.29±0.91)分,高于Qwen3-max[(7.56±0.70)分,t = 4.19,P < 0.001]和Baichuan-M1-14B[(7.82±0.67)分,t = 3.67,P < 0.001];且病程次数越多优势越明显,高病程组(≥7次)AI智能体评分达(9.50±0.58)分,较Qwen3-max[(7.50±0.58)分]和Baichuan-M1-14B[(8.50±0.58)分]分别高出2.00分和1.60分。AI智能体通过RAG技术使诊断分期正确率提高到94.1%,ReAct框架使幻觉发生率降至8.7%,关键指标自动识别准确率达95.7%,且将病程数据整合时间较传统人工方式缩短了88.3%。
    结论 本研究构建的AI智能体实现了糖尿病足多次病程的自动化、标准化分析与管理,有效降低了原生大语言模型的“幻觉”风险,可为糖尿病足早期预警、病情监测及个体化治疗提供较好的技术支撑。

     

    Abstract:
    Objective To investigate the application value of a tool-driven artificial intelligence (AI) agent in clinical decision-making across multiple stages of the disease course in patients with diabetic foot.
    Methods An AI agent suited to repeated follow-up visits in diabetic foot was constructed with the large language model (LLM) Qwen3-max as the reasoning engine, integrating the ReAct framework, retrieval-augmented generation (RAG), and multimodal data-processing tools. The agent was validated on retrospective data from 34 patients with diabetic foot (140 visits in total), and its performance was compared with that of the native LLM Qwen3-max and the medical LLM Baichuan-M1-14B.
    Results The clinical practicality score of the AI agent was 8.29±0.91, significantly higher than that of Qwen3-max (7.56±0.70, t = 4.19, P < 0.001) and Baichuan-M1-14B (7.82±0.67, t = 3.67, P < 0.001). The advantage became more pronounced as the number of disease-course visits increased: in the high disease-course group (≥7 visits), the AI agent scored 9.50±0.58, exceeding Qwen3-max (7.50±0.58) and Baichuan-M1-14B (8.50±0.58) by 2.00 and 1.60 points, respectively. With RAG, the accuracy of diagnostic staging rose to 94.1%; with the ReAct framework, the hallucination rate fell to 8.7%. The agent identified key indicators automatically with an accuracy of 95.7% and shortened the time required to integrate disease-course data by 88.3% compared with traditional manual methods.
    Conclusions The AI agent constructed in this study achieved automated and standardized analysis and management of multiple disease-course episodes in diabetic foot, effectively reducing the “hallucination” risk of native LLMs, and can provide solid technical support for early warning, condition monitoring, and individualized treatment of diabetic foot.

     
