| Item type |
Journal Article(1) |
| Release date |
2025-09-30 |
| Title |
|
|
Title |
Performance Improvement of a Natural Language Processing Tool for Extracting Patient Narratives Related to Medical States From Japanese Pharmaceutical Care Records by Increasing the Amount of Training Data: Natural Language Processing Analysis and Validation Study |
| Language |
|
|
Language |
eng |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
natural language processing |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
NLP |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
named entity recognition |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
NER |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
deep learning |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
pharmaceutical care record |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
electronic medical record |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
EMR |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Japanese |
| Resource type |
|
|
Resource type |
journal article |
| Access rights |
|
|
Access rights |
open access |
| Authors |
Ohno, Yukiko
Aomori, Tohru
Nishiyama, Tomohiro
Kato, Riri
Fujiki, Reina
Ishikawa, Haruki
Kiyomiya, Keisuke
Isawa, Minae
Mochizuki, Mayumi
荒牧, 英治
Ohtani, Hisakazu
|
| Abstract |
|
|
Description type |
Abstract |
|
Description |
Background: Patients' oral expressions serve as valuable sources of clinical information to improve pharmacotherapy. Natural language processing (NLP) is a useful approach for analyzing unstructured text data, such as patient narratives. However, few studies have focused on using NLP for narratives in the Japanese language. Objective: We aimed to develop a high-performance NLP system for extracting clinical information from patient narratives by examining the performance progression with a gradual increase in the amount of training data. Methods: We used subjective texts from the pharmaceutical care records of Keio University Hospital from April 1, 2018, to March 31, 2019, comprising 12,004 records from 6559 cases. After preprocessing, we annotated diseases and symptoms within the texts. We then trained and evaluated a deep learning model (bidirectional encoder representations from transformers combined with a conditional random field [BERT-CRF]) through 10-fold cross-validation. The annotated data were divided into 10 subsets, and the amount of training data was progressively increased over 10 steps. We also analyzed the causes of errors. Finally, we applied the developed system to the analysis of case report texts to evaluate its usability for texts from other sources. Results: The F1-score of the system improved from 0.67 to 0.82 as the amount of training data increased from 1200 to 12,004 records. The F1-score reached 0.78 with 3600 records and was largely similar thereafter. As performance improved, errors from incorrect extractions decreased significantly, which resulted in an increase in precision. For case reports, the F1-score also increased from 0.34 to 0.41 as the training dataset expanded from 1200 to 12,004 records. Performance was lower for extracting symptoms from case report texts compared with pharmaceutical care records, suggesting that this system is more specialized for analyzing subjective data from pharmaceutical care records. 
Conclusions: We successfully developed a high-performance system specialized in analyzing subjective data from pharmaceutical care records by training a large dataset, with near-complete saturation of system performance with about 3600 training records. This system will be useful for monitoring symptoms, offering benefits for both clinical practice and research. |
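The stepwise evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exact-match span scoring, the `Span` tuple format, and the helper names `split_into_subsets`, `cumulative_training_sizes`, and `span_f1` are all assumptions made for illustration, and the abstract does not state how the 10 subsets were combined with the 10-fold cross-validation.

```python
from typing import List, Set, Tuple

# Hypothetical span format: (start offset, end offset, entity label).
Span = Tuple[int, int, str]


def split_into_subsets(n_records: int, n_subsets: int = 10) -> List[range]:
    """Split record indices 0..n_records-1 into near-equal contiguous subsets."""
    base, extra = divmod(n_records, n_subsets)
    subsets: List[range] = []
    start = 0
    for i in range(n_subsets):
        # The first `extra` subsets receive one additional record.
        size = base + (1 if i < extra else 0)
        subsets.append(range(start, start + size))
        start += size
    return subsets


def cumulative_training_sizes(n_records: int, n_subsets: int = 10) -> List[int]:
    """Training-set size at each step when one more subset is added per step."""
    sizes: List[int] = []
    total = 0
    for subset in split_into_subsets(n_records, n_subsets):
        total += len(subset)
        sizes.append(total)
    return sizes


def span_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 under exact span-and-label matching (assumed)."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # 12,004 records, as reported in the abstract, grown over 10 steps.
    print(cumulative_training_sizes(12004))

    # Toy annotations (invented for illustration, not study data).
    gold = {(0, 2, "symptom"), (5, 8, "disease")}
    pred = {(0, 2, "symptom"), (9, 11, "symptom")}
    print(span_f1(gold, pred))
```

Under this scoring, an incorrect extraction adds to the predictions without adding a true positive, so removing such errors raises precision. This matches the abstract's observation that precision rose as incorrect extractions decreased.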
| Bibliographic information |
en : JMIR Medical Informatics
Volume 13,
Number of pages 17,
Issue date 2025-03-04
|
| Publisher |
|
|
Publisher |
JMIR Publications |
| ISSN |
|
|
Source identifier type |
EISSN |
|
Source identifier |
2291-9694 |
| Publisher version DOI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
DOI |
|
|
Related identifier |
https://doi.org/10.2196/68863 |
| Publisher version URI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
URI |
|
|
Related identifier |
https://medinform.jmir.org/2025/1/e68863 |
| Rights |
|
|
Rights resource |
https://creativecommons.org/licenses/by/4.0/ |
|
Rights information |
©Yukiko Ohno, Tohru Aomori, Tomohiro Nishiyama, Riri Kato, Reina Fujiki, Haruki Ishikawa, Keisuke Kiyomiya, Minae Isawa, Mayumi Mochizuki, Eiji Aramaki, Hisakazu Ohtani. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 04.03.2025. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included. |
| Author version flag |
|
|
Publication type |
NA |
| Funding information |
|
|
|
Funder name |
Japan Science and Technology Agency (JST) |
|
|
Award number |
JPMJSP2123 |
|
|
Award title |
JST SPRING |