| Item type |
Journal Article(1) |
| Release date |
2025-11-18 |
| Title |
|
|
Title |
Predictive Model for Extended-Spectrum β-Lactamase–Producing Bacterial Infections Using Natural Language Processing Technique and Open Data in Intensive Care Unit Environment: Retrospective Observational Study |
| Language |
|
|
Language |
eng |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
predictive modeling |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
MIMIC-3 dataset |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
natural language processing |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
NLP |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
QuickUMLS |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
named entity recognition |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
ESBL-producing bacterial infections |
| Resource type |
|
|
Resource type |
journal article |
| Access rights |
|
|
Access rights |
open access |
| Authors |
Ito, Genta
矢田, 竣太郎
若宮, 翔子
荒牧, 英治
|
| Abstract |
|
|
Description type |
Abstract |
|
Description |
Background: Machine learning has advanced medical event prediction, but most models rely on private data. The public MIMIC-3 (Medical Information Mart for Intensive Care III) data set, which contains detailed data on over 40,000 intensive care unit patients, stands out because it can support better models that combine structured and textual data.
Objective: This study aimed to build and test a machine learning model using the MIMIC-3 data set to determine the effectiveness of information extracted from electronic medical record text using a named entity recognition tool, specifically QuickUMLS, for predicting important medical events. Using the prediction of extended-spectrum β-lactamase (ESBL)-producing bacterial infections as an example, this study shows how open data sources and simple technology can be useful for making clinically meaningful predictions.
Methods: The MIMIC-3 data set, including demographics, vital signs, laboratory results, and textual data such as discharge summaries, was used. This study specifically targeted patients diagnosed with Klebsiella pneumoniae or Escherichia coli infection. Predictions were based on ESBL-producing bacterial standards and the minimum inhibitory concentration criteria. Both the structured data and extracted patient histories were used as predictors. Two models, an L1-regularized logistic regression model and a LightGBM model, were evaluated using the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (PR-AUC).
Results: Of 46,520 MIMIC-3 patients, 4046 had bacterial cultures indicating the presence of K pneumoniae or E coli. After excluding patients who lacked discharge summary text, 3614 patients remained. The L1-penalized model, with variables from only the structured data, achieved a ROC-AUC of 0.646 and a PR-AUC of 0.307. The LightGBM model, combining structured and textual data, achieved a ROC-AUC of 0.707 and a PR-AUC of 0.369. Key contributors to the LightGBM model included patient age, duration since hospital admission, and specific medical history such as diabetes. The structured data-based model showed improved performance compared to the reference models. Performance was further improved when textual medical history was included. Compared to other models predicting drug-resistant bacteria, the results of this study ranked in the middle. Some misidentifications, potentially due to the limitations of QuickUMLS, may have affected the accuracy of the model.
Conclusions: This study successfully developed a predictive model for ESBL-producing bacterial infections using the MIMIC-3 data set, yielding results consistent with existing literature. This model stands out for its transparency and reliance on open data and open named entity recognition technology. The performance of the model was enhanced using textual information. With advancements in natural language processing tools such as BERT and GPT, the extraction of medical data from text holds substantial potential for future model optimization. |
| Bibliographic information |
en : JMIR Formative Research
Volume 8,
Number of pages 9,
Issue date 2024-07-10
|
| Publisher |
|
|
Publisher |
JMIR Publications |
| ISSN |
|
|
Source identifier type |
EISSN |
|
Source identifier |
2561-326X |
| Publisher version DOI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
DOI |
|
|
Related identifier |
https://doi.org/10.2196/54044 |
| Publisher version URI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
URI |
|
|
Related identifier |
https://formative.jmir.org/2024/1/e54044 |
| Rights |
|
|
Rights information resource |
https://creativecommons.org/licenses/by/4.0/ |
|
Rights information |
©Genta Ito, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki. Originally published in JMIR Formative Research (https://formative.jmir.org), 10.07.2024. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included. |
| Author version flag |
|
|
Publication type |
NA |
| Funding information |
|
|
|
Funder name |
Japan Science and Technology Agency (JST) |
|
|
Award number |
JPMJCR22N1 |
|
|
Award number URI |
https://projectdb.jst.go.jp/grant/JST-PROJECT-22717060/ |
|
|
Award title |
Data-driven drug discovery through advances in real-world text processing (リアルワールドテキスト処理の深化によるデータ駆動型探薬)