Scalable Medication Extraction and Discontinuation Identification from Electronic Health Records Using Large Language Models.
| Authors | |
| Keywords | |
| Abstract | OBJECTIVE: Identifying medication discontinuations in electronic health records (EHRs) is vital for patient safety but is often hindered by information being buried in unstructured notes. This study aims to evaluate the capabilities of advanced open-sourced and proprietary large language models (LLMs) in extracting medications and classifying their medication status from EHR notes, focusing on their scalability for medication information extraction without human annotation. STUDY DESIGN AND SETTING: We collected three EHR datasets from diverse sources to build the evaluation benchmark: one publicly available dataset (Re-CASI), one we annotated based on public MIMIC notes (MIV-Med), and one internally annotated on clinical notes from Mass General Brigham (MGB-Med). We evaluated 12 advanced LLMs, including general-domain open-sourced models (e.g., Llama-3.1-70B-Instruct, Qwen2.5-72B-Instruct), medical-specific models (e.g., MeLLaMA-70B-chat), and a proprietary model (GPT-4o). We explored multiple LLM prompting strategies, including zero-shot, 5-shot, and Chain-of-Thought (CoT) approaches. Performance on medication extraction, medication status classification, and their joint task (extraction then classification) was systematically compared across all experiments. RESULTS: LLMs showed promising performance on medication extraction, while discontinuation classification and the joint task were more challenging. GPT-4o consistently achieved the highest average F1 scores in all tasks under the zero-shot setting: 94.0% for medication extraction, 78.1% for discontinuation classification, and 72.7% for the joint task. Open-sourced models followed closely, with Llama-3.1-70B-Instruct achieving the highest performance in medication status classification on the MIV-Med dataset (68.7%) and in the joint task on both the Re-CASI (76.2%) and MIV-Med (60.2%) datasets. Medical-specific LLMs demonstrated lower performance compared to advanced general-domain LLMs. Few-shot learning generally improved performance, while CoT reasoning showed inconsistent gains. Notably, open-sourced models occasionally surpassed GPT-4o, underscoring their potential in privacy-sensitive clinical research. CONCLUSION: LLMs demonstrate strong potential for medication extraction and discontinuation identification in EHR notes, with open-sourced models offering scalable alternatives to proprietary systems and few-shot learning further improving LLMs' capability. PLAIN LANGUAGE SUMMARY: Stopping a medicine can affect safety and treatment decisions, yet this detail is often buried in long electronic health record notes. We evaluated whether large language models, which read and summarize text, can automatically find medication names and decide whether each medicine is still being taken, has been stopped, or neither. We tested 12 models, including open-source options suitable for secure hospital use, on three collections of clinical notes and compared three simple instruction styles: giving no examples, showing a few examples, and asking for step-by-step reasoning. All models produced usable results. The strongest systems scored about 94 for finding medication names and about 78 for deciding continued or stopped status, on a standard 0-to-100 measure that balances completeness and correctness. Showing a few examples usually helped more than step-by-step prompts, and several open-source models performed close to a leading proprietary system. These tools could help hospitals and researchers monitor medications at scale to support drug-safety studies, adherence tracking, and clinical decision support, with local validation and safeguards before clinical use. (A minimal prompting sketch follows the record below.) |
| Year of Publication | 2025 |
| Journal | Journal of clinical epidemiology |
| Pages | 112049 |
| Date Published | 11/2025 |
| ISSN | 1878-5921 |
| DOI | 10.1016/j.jclinepi.2025.112049 |
| PubMed ID | 41232578 |
| Links | |
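
The sketch below illustrates the kind of zero-shot prompting setup the abstract describes: one LLM call per clinical note that extracts medication names and labels each as continued, discontinued, or neither. The prompt wording, the JSON output format, and the use of the OpenAI Python client are illustrative assumptions, not the authors' published prompts or pipeline.

```python
# Hypothetical zero-shot sketch of the joint task described in the abstract
# (medication extraction followed by status classification). Prompt text,
# label names, and output schema are assumptions for illustration only.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ZERO_SHOT_PROMPT = (
    "You are a clinical NLP assistant. From the clinical note below, list every "
    "medication mentioned and classify its status as 'continued', 'discontinued', "
    "or 'neither'. Respond only with a JSON array of objects with keys "
    "'medication' and 'status'.\n\nNote:\n{note}"
)


def extract_medications(note: str, model: str = "gpt-4o") -> list[dict]:
    """Run the joint extraction + status classification task on a single note."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic output is preferable for evaluation
        messages=[{"role": "user", "content": ZERO_SHOT_PROMPT.format(note=note)}],
    )
    # The model is asked to return JSON; fall back to an empty list if parsing fails.
    try:
        return json.loads(response.choices[0].message.content)
    except json.JSONDecodeError:
        return []


if __name__ == "__main__":
    example_note = "Patient advised to stop lisinopril; continue metformin 500 mg daily."
    print(extract_medications(example_note))
```

A 5-shot variant would prepend a few annotated note/answer pairs to the prompt, and a CoT variant would ask the model to reason step by step before emitting the JSON; swapping `model` for a locally hosted open-source model (e.g., Llama-3.1-70B-Instruct behind an OpenAI-compatible endpoint) is one way such a pipeline could be kept inside a privacy-sensitive environment.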