Toward fast meeting transcription: NAIST system for CHiME-8 NOTSOFAR-1 task and its analysis

http://hdl.handle.net/10061/0002001304
Item type Journal Article(1)
Publication date 2025-12-24
Title
Title Toward fast meeting transcription: NAIST system for CHiME-8 NOTSOFAR-1 task and its analysis
Language
Language eng
Keywords
Subject scheme Other
Subject CHiME-8
Keywords
Subject scheme Other
Subject Meeting transcription
Keywords
Subject scheme Other
Subject Multi-talker speech recognition
Resource type
Resource type journal article
Access rights
Access rights open access
Authors
Hirano, Yuta
Nguyen, Mau
Azuma, Kakeru
Saragih, Jan Meyer
Sakti, Sakriani
Abstract
Description type Abstract
Description This paper reports on the NAIST system submitted to the CHiME-8 challenge's NOTSOFAR-1 (Natural Office Talkers in Settings of Far-field Audio Recordings) task, including results and analyses from several additional experiments. While fast processing is crucial for real-world applications, the CHiME-7 challenge focused solely on reducing error rate, neglecting practical aspects of system performance such as inference speed. This research therefore aims to develop a practical system that improves recognition accuracy while also reducing inference time. To this end, we propose enhancing the baseline module architecture by modifying both the continuous speech separation (CSS) and automatic speech recognition (ASR) modules. Specifically, the ASR module is built on a WavLM large feature extractor and a Zipformer transducer. Furthermore, we apply dereverberation using block-wise weighted prediction error (WPE) as preprocessing for the speech separation module. Compared to the baseline system, the proposed system achieved a relative reduction in time-constrained minimum-permutation word error rate (tcpWER) of 11.6% for single-channel tracks and 18.7% for multi-channel tracks. Moreover, the proposed system runs up to six times faster than the baseline system while achieving better tcpWER results. Based on findings from our system development, we also report how system performance changes with the amount of training data for the ASR model, as well as how the maximum word-length setting in the transducer-based ASR module affects the subsequent diarization system.
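The abstract states its gains as a relative reduction in tcpWER and as a speed-up factor. The following minimal Python sketch shows how such figures are conventionally computed. The function names and the input values are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the authors' exact evaluation procedure is not described in this record.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Return the relative reduction of an error rate as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline


def speedup_factor(baseline_seconds: float, proposed_seconds: float) -> float:
    """Return how many times faster the proposed system is than the baseline."""
    if proposed_seconds <= 0:
        raise ValueError("proposed_seconds must be positive")
    return baseline_seconds / proposed_seconds


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT results from the paper.
    example_baseline_wer = 40.0  # error rate in percent
    example_proposed_wer = 35.0  # error rate in percent
    print(f"{relative_reduction(example_baseline_wer, example_proposed_wer):.1%}")

    # Hypothetical processing times for the same recording, in seconds.
    print(f"{speedup_factor(60.0, 10.0):.1f}x")
```

A relative reduction is measured against the baseline value, so it differs from the absolute difference in percentage points: in the placeholder example, an absolute drop of 5 points corresponds to a relative reduction of 12.5%.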
Bibliographic information Computer Speech & Language
Volume 95, pp. 1-13, 13 pages, issue date 2025-07-16
Publisher
Publisher Elsevier
ISSN
Source identifier type EISSN
Source identifier 0885-2308
Publisher version DOI
Relation type isReplacedBy
Identifier type DOI
Related identifier https://doi.org/10.1016/j.csl.2025.101836
Publisher version URI
Relation type isReplacedBy
Identifier type URI
Related identifier https://www.sciencedirect.com/science/article/pii/S0885230825000610
Rights
Rights resource https://creativecommons.org/licenses/by-nc/4.0/
Rights statement © 2025 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Author version flag
Publication type NA
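Funding information
Funder name Japan Society for the Promotion of Science (JSPS)
Award number JP21H05054
Award URI https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-21H05054/
Award title 多元自動通訳システムと評価法に関する研究とその応用展開 (Research on multi-dimensional automatic interpretation systems and evaluation methods, and their applications)
Funding information
Funder name Japan Society for the Promotion of Science (JSPS)
Award number JP23K21681
Award URI https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-23K21681/
Award title 言語の壁を超える低資源多言語Machine Speech Chain技術の構築 (Building low-resource multilingual Machine Speech Chain technology that transcends language barriers)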