Abstract:
Objective To investigate the application value of a tool-driven artificial intelligence (AI) agent in clinical decision-making across multiple stages of the disease course in patients with diabetic foot.
Methods Using the large language model (LLM) Qwen3-max as the reasoning engine, an AI agent suited to repeated follow-up of diabetic foot was constructed by integrating the ReAct framework, retrieval-augmented generation (RAG), and multimodal data-processing tools. The agent was validated on retrospective data from 34 patients with diabetic foot (140 visits in total). Its performance was compared with that of the native LLM Qwen3-max and the medical LLM Baichuan-M1-14B.
Results The clinical practicality score of the AI agent was 8.29±0.91, significantly higher than that of Qwen3-max (7.56±0.70, t = 4.19, P < 0.001) and Baichuan-M1-14B (7.82±0.67, t = 3.67, P < 0.001). The advantage became more pronounced as the number of visits per patient increased. In the group with the longest disease course (≥7 visits), the AI agent scored 9.50±0.58, exceeding Qwen3-max (7.50±0.58) and Baichuan-M1-14B (8.50±0.58) by 2.00 and 1.00 points, respectively. The AI agent increased the accuracy of diagnosis and staging to 94.1% through RAG, reduced the incidence of hallucination to 8.7% via the ReAct framework, recognized key indicators automatically with an accuracy of 95.7%, and shortened the time required to integrate disease-course data by 88.3% compared with traditional manual methods.
Conclusions The AI agent constructed in this study automates and standardizes the analysis and management of multiple disease-course episodes in diabetic foot and effectively reduces the “hallucination” risk of native LLMs. It can provide technical support for early warning, condition monitoring, and individualized treatment of diabetic foot.
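
The Methods combine a ReAct-style reasoning loop with retrieval and data-processing tools. As a rough illustration of that pattern only, and not the implementation used in this study, the sketch below alternates thought, action, and observation until a final answer is produced. Everything in it is an assumption made for illustration: the tool names `retrieve` and `summarize_visits`, the placeholder knowledge base, the toy visit record, and `scripted_policy`, a hard-coded stand-in for the LLM so that the example runs without any model API.

```python
from typing import Callable, Dict, List, Tuple

# Toy knowledge base standing in for a RAG index.
# Placeholder text only; this is not clinical guidance.
KNOWLEDGE_BASE: List[str] = [
    "placeholder passage on ulcer grading criteria",
    "placeholder passage on infection assessment",
    "placeholder passage on glucose monitoring",
]

# Toy multi-visit record for one hypothetical patient.
VISITS: Dict[str, List[Dict[str, float]]] = {
    "patient-001": [
        {"visit": 1, "ulcer_area_cm2": 4.0},
        {"visit": 2, "ulcer_area_cm2": 3.1},
        {"visit": 3, "ulcer_area_cm2": 2.2},
    ]
}


def retrieve(query: str) -> str:
    """Return the passage sharing the most words with the query (naive retrieval)."""
    words = set(query.lower().split())
    return max(KNOWLEDGE_BASE, key=lambda p: len(words & set(p.split())))


def summarize_visits(patient_id: str) -> str:
    """Condense a patient's visits into a one-line trend description."""
    areas = [v["ulcer_area_cm2"] for v in VISITS[patient_id]]
    return f"{len(areas)} visits, ulcer area {areas[0]} -> {areas[-1]} cm2"


# Registry mapping action names to tool functions.
TOOLS: Dict[str, Callable[[str], str]] = {
    "retrieve": retrieve,
    "summarize_visits": summarize_visits,
}

Step = Tuple[str, str, str]  # (thought, action, action_input)


def scripted_policy(history: List[str]) -> Step:
    """Hypothetical stand-in for the LLM: chooses the next step from the
    number of observations gathered so far."""
    observations = [h for h in history if h.startswith("Observation:")]
    if len(observations) == 0:
        return ("I need the visit history first.", "summarize_visits", "patient-001")
    if len(observations) == 1:
        return ("I should ground the assessment in a reference passage.",
                "retrieve", "ulcer grading")
    return ("I have enough evidence.", "finish",
            "Answer based on: " + " | ".join(observations))


def react_loop(policy: Callable[[List[str]], Step],
               max_steps: int = 5) -> Tuple[str, List[str]]:
    """Alternate thought -> action -> observation until the policy finishes
    or the step budget runs out."""
    history: List[str] = []
    for _ in range(max_steps):
        thought, action, arg = policy(history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            return arg, history
        history.append(f"Action: {action}[{arg}]")
        history.append(f"Observation: {TOOLS[action](arg)}")
    return "No answer within step budget.", history


if __name__ == "__main__":
    answer, trace = react_loop(scripted_policy)
    print("\n".join(trace))
    print(answer)
```

In this sketch the final answer is assembled only from tool observations, which is one plausible reading of how grounding each step in retrieved or computed evidence could limit hallucination. A real agent would replace `scripted_policy` with a call to the LLM and the toy tools with the actual retrieval and multimodal processing components.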