LATTE: Lattice ATTentive Encoding for Character-based Word Segmentation

http://hdl.handle.net/10061/0002000593
Item type: Journal Article
Repository release date: 2024-10-17
Title: LATTE: Lattice ATTentive Encoding for Character-based Word Segmentation
Language: eng
Keywords (subject scheme: Other): Word Segmentation; Representation Learning
Resource type: journal article
Access rights: open access
Authors:
  • Chay-intr, Thodsaporn
  • Kamigaito, Hidetaka (ja: 上垣外, 英剛; ja-Kana: カミガイト, ヒデタカ; WEKO 35596)
  • Funakoshi, Kotaro
  • Okumura, Manabu
Abstract: A character sequence admits one or more segmentation alternatives. This segmentation ambiguity can weaken performance in word segmentation, and handling it properly reduces ambiguous decisions on word boundaries. Previous works have achieved remarkable segmentation performance and alleviated the ambiguity problem by incorporating a lattice, which can capture segmentation alternatives, along with graph-based and pre-trained models. However, the multi-granularity information in a lattice, including character-level and word-level information, may not be attentively exploited when the lattice is encoded with such models. To strengthen multi-granularity representations in a lattice, we propose the Lattice ATTentive Encoding (LATTE) method for character-based word segmentation. Our model employs the lattice structure to handle segmentation alternatives and uses graph neural networks together with an attention mechanism to attentively extract multi-granularity representations from the lattice, which complement the character representations. Our experimental results demonstrate improvements in segmentation performance on the BCCWJ, CTB6, and BEST2010 datasets in three languages: Japanese, Chinese, and Thai.
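To make the notion of segmentation alternatives concrete, the following is a minimal sketch of a word lattice over a character sequence. It is not the LATTE model: it contains no graph neural network or attention mechanism, and the lexicon, the example string, and the function names are all hypothetical choices made for illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon and input; a real system would use a large
# dictionary or candidate words derived from training data.
LEXICON: Set[str] = {"no", "now", "where", "here", "nowhere"}
TEXT = "nowhere"


def build_lattice(text: str, lexicon: Set[str],
                  max_len: int = 8) -> Dict[int, List[Tuple[int, str]]]:
    """Map each start offset to the (end offset, word) edges that begin there."""
    lattice: Dict[int, List[Tuple[int, str]]] = {i: [] for i in range(len(text))}
    for start in range(len(text)):
        last_end = min(len(text), start + max_len)
        for end in range(start + 1, last_end + 1):
            word = text[start:end]
            if word in lexicon:
                lattice[start].append((end, word))
    return lattice


def enumerate_segmentations(text: str,
                            lattice: Dict[int, List[Tuple[int, str]]]) -> List[List[str]]:
    """Collect every path through the lattice that covers the whole text."""
    results: List[List[str]] = []

    def walk(pos: int, path: List[str]) -> None:
        if pos == len(text):
            results.append(list(path))
            return
        for end, word in lattice[pos]:
            path.append(word)
            walk(end, path)
            path.pop()

    walk(0, [])
    return results


LATTICE = build_lattice(TEXT, LEXICON)
SEGMENTATIONS = enumerate_segmentations(TEXT, LATTICE)

for segmentation in SEGMENTATIONS:
    print(" | ".join(segmentation))
```

Each complete path through the lattice is one segmentation alternative, so a single character sequence can yield several competing word sequences. A segmenter has to choose among them, which is the ambiguity the abstract refers to.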
Bibliographic information: 自然言語処理 (Journal of Natural Language Processing), vol. 30, no. 2, pp. 456-488, issued 2023-06-15
Publisher: 一般社団法人言語処理学会 (The Association for Natural Language Processing)
ISSN (EISSN): 2185-8314
Publisher version DOI (relation type: isReplacedBy): https://doi.org/10.5715/jnlp.30.456
Publisher version URI (relation type: isReplacedBy): https://www.jstage.jst.go.jp/article/jnlp/30/2/30_456/_article/-char/ja/
Rights: © 2023 The Association for Natural Language Processing. Licensed under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
Author version flag (publication type): NA