| Item type |
Journal Article(1) |
| Release date |
2025-12-16 |
| Title |
|
|
Title |
Applying Syntax-Prosody Mapping Hypothesis and Boundary-Driven Theory to Neural Sequence-to-Sequence Speech Synthesis |
| Language |
|
|
Language |
eng |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Syntactics |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Speech synthesis |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Linguistics |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Vegetation |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Symbols |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Vectors |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Training data |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Text to speech |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Rain |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Production |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
Downstep in Japanese |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
initial lowering |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
phonological hierarchy |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
rhythmic boost |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
syntax-prosody mapping hypothesis |
| Keyword |
|
|
Subject scheme |
Other |
|
Subject |
text-to-speech synthesis |
| Resource type |
|
|
Resource type |
journal article |
| Access rights |
|
|
Access rights |
open access |
| Authors |
Furukawa, Kei
Kishiyama, Takeshi
Nakamura, Satoshi (中村, 哲)
Sakti, Sakriani
|
| Abstract |
|
|
Description type |
Abstract |
|
Description |
This study presents a novel approach to Japanese speech synthesis by applying the syntax-prosody mapping hypothesis and the boundary-driven theory, both from linguistics. Focusing on the phonological phenomena of initial lowering and rhythmic boost, our research introduces the Recursive Phonological Model, which significantly outperforms traditional methods in both objective and subjective evaluation experiments. This study proposes new objective evaluation criteria for Japanese speech synthesis. These criteria offer a more rigorous and linguistically grounded methodology for assessing the quality of synthesized speech. The Recursive Phonological Model accurately captures both the presence and absence of initial lowering, a common phenomenon in Japanese speech. This is the first model to successfully reflect such syntactic variations through intonation, demonstrating its advanced ability to handle complex phonological patterns. Additionally, the model demonstrates a unique proficiency in reproducing the rhythmic boost phenomenon, despite rhythmic boost being absent in the training data. This ability underscores the importance of learning phonological boundaries in speech synthesis. Our approach not only yields more natural-sounding speech but also enriches the field by incorporating complex linguistic theories in the computational process. This research thus marks a significant advance in the naturalness and linguistic accuracy of speech synthesis, with broader implications for computational linguistics and artificial intelligence. |
| Bibliographic information |
en : IEEE Access
Volume 12,
p. 160896-160917,
Number of pages 22,
Issue date 2024-10-28
|
| Publisher |
|
|
Publisher |
IEEE |
| ISSN |
|
|
Source identifier type |
EISSN |
|
Source identifier |
2169-3536 |
| Publisher version DOI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
DOI |
|
|
Related identifier |
https://doi.org/10.1109/ACCESS.2024.3487053 |
| Publisher version URI |
|
|
Relation type |
isReplacedBy |
|
|
Identifier type |
URI |
|
|
Related identifier |
https://ieeexplore.ieee.org/abstract/document/10736967 |
| Rights |
|
|
Rights information resource |
https://creativecommons.org/licenses/by/4.0/ |
|
Rights information |
© 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ |
| Author version flag |
|
|
Publication type |
NA |
| Funding information |
|
|
|
Funder name |
Japan Science and Technology Agency (JST) |
|
|
Award number |
JPMJSP2140 |
|
|
Award title |
JST SPRING |
| Funding information |
|
|
|
Funder name |
Japan Society for the Promotion of Science (JSPS) |
|
|
Award number |
JP21H05054 |
|
|
Award number URI |
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-21H05054/ |
|
|
Award title |
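Research on multi-dimensional automatic interpretation systems and their evaluation methods, and its applied development (多元自動通訳システムと評価法に関する研究とその応用展開) |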
| Funding information |
|
|
|
Funder name |
Japan Society for the Promotion of Science (JSPS) |
|
|
Award number |
JP23K21681 |
|
|
Award number URI |
https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-23K21681/ |
|
|
Award title |
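Building low-resource multilingual Machine Speech Chain technology that transcends language barriers (言語の壁を超える低資源多言語Machine Speech Chain技術の構築) |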