Is Boundary Annotation Necessary? Evaluating Boundary-Free Approaches to Improve Clinical Named Entity Annotation Efficiency: Case Study
http://hdl.handle.net/10061/0002000775
| Item Type | Journal Article(1) | |||||||
|---|---|---|---|---|---|---|---|---|
| Release Date | 2025-02-14 | |||||||
| Title | ||||||||
| Title | Is Boundary Annotation Necessary? Evaluating Boundary-Free Approaches to Improve Clinical Named Entity Annotation Efficiency: Case Study | |||||||
| Language | ||||||||
| Language | eng | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | natural language processing | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | named entity recognition | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | information extraction | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | text annotation | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | entity boundaries | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | lenient annotation | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | case reports | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | annotation | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | case study | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | medical case report | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | efficiency | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | model | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | model performance | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | dataset | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | Japan | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | Japanese | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | entity | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | clinical domain | |||||||
| Keyword | ||||||||
| Subject Scheme | Other | |||||||
| Subject | clinical | |||||||
| Resource Type | ||||||||
| Resource Type | journal article | |||||||
| Access Rights | ||||||||
| Access Rights | open access | |||||||
| Authors | Herman Bernardim Andrade, Gabriel; Yada, Shuntaro (矢田, 竣太郎); Aramaki, Eiji (荒牧, 英治) | |||||||
| Abstract | ||||||||
| Description Type | Abstract | |||||||
| Description | Background: Named entity recognition (NER) is a fundamental task in natural language processing. However, it is typically preceded by named entity annotation, which poses several challenges, especially in the clinical domain. For instance, determining entity boundaries is one of the most common sources of disagreements between annotators due to questions such as whether modifiers or peripheral words should be annotated. If unresolved, these can induce inconsistency in the produced corpora, yet, on the other hand, strict guidelines or adjudication sessions can further prolong an already slow and convoluted process. Objective: The aim of this study is to address these challenges by evaluating 2 novel annotation methodologies, lenient span and point annotation, aiming to mitigate the difficulty of precisely determining entity boundaries. Methods: We evaluate their effects through an annotation case study on a Japanese medical case report data set. We compare annotation time, annotator agreement, and the quality of the produced labeling and assess the impact on the performance of an NER system trained on the annotated corpus. Results: We saw significant improvements in the labeling process efficiency, with up to a 25% reduction in overall annotation time and even a 10% improvement in annotator agreement compared to the traditional boundary-strict approach. However, even the best-achieved NER model presented some drop in performance compared to the traditional annotation methodology. Conclusions: Our findings demonstrate a balance between annotation speed and model performance. Although disregarding boundary information affects model performance to some extent, this is counterbalanced by significant reductions in the annotator’s workload and notable improvements in the speed of the annotation process. These benefits may prove valuable in various applications, offering an attractive compromise for developers and researchers. | |||||||
| Bibliographic Information | en : JMIR Medical Informatics, vol. 12, issued 2024-07-02 | |||||||
| Publisher | ||||||||
| Publisher | JMIR Publications | |||||||
| ISSN | ||||||||
| Source Identifier Type | EISSN | |||||||
| Source Identifier | 2291-9694 | |||||||
| Publisher Version DOI | ||||||||
| Relation Type | isReplacedBy | |||||||
| Identifier Type | DOI | |||||||
| Related Identifier | https://doi.org/10.2196/59680 | |||||||
| Publisher Version URI | ||||||||
| Relation Type | isReplacedBy | |||||||
| Identifier Type | URI | |||||||
| Related Identifier | https://medinform.jmir.org/2024/1/e59680/ | |||||||
| Rights | ||||||||
| Rights Resource | https://creativecommons.org/licenses/by/4.0/ | |||||||
| Rights Information | © Gabriel Herman Bernardim Andrade, Shuntaro Yada, Eiji Aramaki. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 02.07.2024. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included. | |||||||
| Author Version Flag | ||||||||
| Version Type | NA | |||||||