| Item Type |
Conference Paper(1) |
| Release Date |
2025-07-23 |
| Title |
|
|
Title |
Disentangling Pretrained Representation to Leverage Low-Resource Languages in Multilingual Machine Translation |
| Language |
|
|
Language |
eng |
| Keyword |
|
|
Subject Scheme |
Other |
|
Subject |
Multilinguality |
| Keyword |
|
|
Subject Scheme |
Other |
|
Subject |
Machine Translation |
| Keyword |
|
|
Subject Scheme |
Other |
|
Subject |
Less-Resourced/Endangered Languages |
| Keyword |
|
|
Subject Scheme |
Other |
|
Subject |
Neural language representation models |
| Resource Type |
|
|
Resource Type |
conference paper |
| Access Rights |
|
|
Access Rights |
open access |
| Authors |
Hudi, Frederikus
Qu, Zhi
上垣外, 英剛
渡辺, 太郎
|
| Abstract |
|
|
Description Type |
Abstract |
|
Description |
Multilingual neural machine translation aims to encapsulate multiple languages into a single model. However, it requires an enormous dataset, leaving the low-resource language (LRL) underdeveloped. As LRLs may benefit from shared knowledge of multilingual representation, we aspire to find effective ways to integrate unseen languages in a pre-trained model. Nevertheless, the intricacy of shared representation among languages hinders its full utilisation. To resolve this problem, we employed target language prediction and a central language-aware layer to improve representation in integrating LRLs. Focusing on improving LRLs in the linguistically diverse country of Indonesia, we evaluated five languages using a parallel corpus of 1,000 instances each, with experimental results measured by BLEU showing zero-shot improvement of 7.4 from the baseline score of 7.1 to a score of 15.5 at best. Further analysis showed that the gains in performance are attributed more to the disentanglement of multilingual representation in the encoder with the shift of the target language-specific representation in the decoder. |
| Bibliographic Information |
en : Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
p. 4978-4989,
Number of pages 12,
Issue date 2024-05
|
| Conference Information |
|
|
|
Conference Name |
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) |
|
|
Start Year |
2024 |
|
|
Start Month |
05 |
|
|
Start Day |
20 |
|
|
End Year |
2024 |
|
|
End Month |
05 |
|
|
End Day |
25 |
|
|
Conference Period |
2024-05-20 - 2024-05-25 |
|
|
Venue |
Torino, Italia |
|
Country |
ITA |
| Publisher |
|
|
Publisher |
ELRA and ICCL |
| Publisher Version URI |
|
|
Relation Type |
isReplacedBy |
|
|
Identifier Type |
URI |
|
|
Related Identifier |
https://aclanthology.org/2024.lrec-main.446/ |
| Rights |
|
|
Rights Information Resource |
https://creativecommons.org/licenses/by-nc/4.0/ |
|
Rights Information |
© 2024 ELRA Language Resource Association: CC BY-NC 4.0 |
| Author Version Flag |
|
|
Publication Type |
NA |