| Item type |
Journal Article(1) |
| Release date |
2025-12-16 |
| Title |
|
|
Title |
Neural End-To-End Speech Translation Leveraged by ASR Posterior Distribution |
| Language |
|
|
Language |
eng |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
end-to-end speech translation |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
spoken language translation |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
multi-task learning |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
knowledge distillation |
| Resource type |
|
|
Resource type |
journal article |
| Access rights |
|
|
Access rights |
open access |
| Authors |
Ko, Yuka
Sudoh, Katsuhito
Sakti, Sakriani
Nakamura, Satoshi (中村, 哲)
|
| Abstract |
|
|
Description type |
Abstract |
|
Description |
End-to-end speech translation (ST) directly renders source language speech to the target language without intermediate automatic speech recognition (ASR) output as in a cascade approach. End-to-end ST avoids error propagation from intermediate ASR results. Although recent attempts have applied multi-task learning using an auxiliary task of ASR to improve ST performance, they use cross-entropy loss to one-hot references in the ASR task, and the trained ST models do not consider possible ASR confusion. In this study, we propose a novel multi-task learning framework for end-to-end ST that leverages an ASR-based loss, called ASR posterior-based loss (ASR-PBL), computed against posterior distributions obtained using a pre-trained ASR model. The ASR-PBL method, which enables an ST model to reflect possible ASR confusion among competing hypotheses with similar pronunciations, can be applied to one of the strong multi-task ST baseline models with Hybrid CTC/Attention ASR task loss. In our experiments on the Fisher Spanish-to-English corpus, the proposed method demonstrated better BLEU results than the baseline that used standard CE loss. |
| Bibliographic information |
en : IEICE TRANSACTIONS on Information
Vol. E107-D,
No. 10,
p. 1322-1331,
Number of pages: 10,
Date of issue: 2024-10-01
|
| Publisher |
|
|
Publisher |
IEICE |
| ISSN |
|
|
Source identifier type |
EISSN |
|
Source identifier |
1745-1361 |
| Publisher version DOI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
DOI |
|
|
Related identifier |
https://doi.org/10.1587/transinf.2023EDP7249 |
| Publisher version URI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
URI |
|
|
Related identifier |
https://globals.ieice.org/en_transactions/information/10.1587/transinf.2023EDP7249/_f |
| Rights |
|
|
Rights information |
Copyright © 2024 The Institute of Electronics, Information and Communication Engineers. |
| Author version flag |
|
|
Version type |
NA |
| Funding information |
|
|
|
Funder name |
Japan Science and Technology Agency (JST) |
|
|
Award number |
JPMJSP2140 |
|
|
Award title |
JST SPRING |
| Funding information |
|
|
|
Funder name |
Japan Society for the Promotion of Science (JSPS) |
|
|
Award number |
JP21H05054 |
|
|
Award number URI |
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-21H05054/ |
|
|
Award title |
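Research on multi-dimensional automatic interpretation systems and evaluation methods, and their applied development (多元自動通訳システムと評価法に関する研究とその応用展開) |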
| Funding information |
|
|
|
Funder name |
Japan Society for the Promotion of Science (JSPS) |
|
|
Award number |
JP21H03467 |
|
|
Award number URI |
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-23K21681/ |
|
|
Award title |
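Development of low-resource multilingual Machine Speech Chain technology that overcomes language barriers (言語の壁を超える低資源多言語Machine Speech Chain技術の構築)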