Utterance-based Selective Training for Cost-Effective Task-Adaptation of Acoustic Models

Cincarek, Tobias; Toda, Tomoki; Saruwatari, Hiroshi; Kiyohiro, Shikano

http://hdl.handle.net/10061/8260

Full text: SRIV_2006_71.pdf (191.5 kB), https://naist.repo.nii.ac.jp/record/4828/files/SRIV_2006_71.pdf

Item type

Conference Paper

Publication date

2012-08-22

Title

Utterance-based Selective Training for Cost-Effective Task-Adaptation of Acoustic Models

Language

eng

Resource type

conference paper

Access rights

open access

Authors

Cincarek, Tobias
Toda, Tomoki (WEKO 341, e-Rad 90403328)
Saruwatari, Hiroshi
Kiyohiro, Shikano

Abstract

The construction of acoustic models for speech recognition systems is a very costly and time-consuming process, since their robust training requires large amounts of transcribed speech data. This paper describes an approach for cost-effective construction of task-adapted acoustic models. Existing speech data(bases) are employed to set up a large training data pool. Apart from that, only a small amount of task-specific speech data is required. Based on an algorithm for utterance-based selective training of acoustic models, training utterances are selected from the training data pool so that the likelihood of the acoustic model given the task-specific speech data is maximized. The proposed method is evaluated for acoustic models with context-independent and context-dependent phonetic units. Results are reported for building an infant (preschool children) acoustic model with speech from elementary school children and an elderly acoustic model with adult speech. The proposed approach is already effective if only 20 task-specific utterances are available. A relative improvement in word accuracy of up to 10% is achieved over conventional acoustic model construction and up to 2.8% over MAP and MLLR adaptation with task-specific data. The gap in performance to a high-cost acoustic model can be reduced by up to 76%.
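The selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a single one-dimensional Gaussian in place of a real acoustic model, and a simple greedy deletion search. It starts from the full pool and removes an utterance whenever retraining without it raises the likelihood of the task-specific data. All function names and the example data are hypothetical.

```python
import math


def fit_gaussian(frames):
    """Maximum-likelihood mean and variance (variance floored for stability)."""
    n = len(frames)
    mean = sum(frames) / n
    var = sum((x - mean) ** 2 for x in frames) / n
    return mean, max(var, 1e-3)


def log_likelihood(frames, mean, var):
    """Log-likelihood of the frames under a 1-D Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in frames
    )


def train_and_score(pool, selected, task_data):
    """Train the toy model on the selected utterances, score the task data."""
    frames = [x for i in selected for x in pool[i]]
    mean, var = fit_gaussian(frames)
    return log_likelihood(task_data, mean, var)


def select_utterances(pool, task_data):
    """Greedy deletion: drop an utterance if that raises the task likelihood."""
    selected = set(range(len(pool)))
    best = train_and_score(pool, selected, task_data)
    improved = True
    while improved:
        improved = False
        for i in sorted(selected):  # snapshot of the current selection
            if len(selected) == 1:
                break  # never empty the training set
            trial = selected - {i}
            score = train_and_score(pool, trial, task_data)
            if score > best:
                selected, best, improved = trial, score, True
    return sorted(selected), best


# Hypothetical data: utterances 1 and 3 resemble the task speech, 0 and 2 do not.
pool = [[0.0, 0.4], [4.6, 5.0], [-0.4, 0.0], [5.4, 5.0]]
task_data = [4.6, 5.0, 5.4, 5.0]
selected, score = select_utterances(pool, task_data)
print(selected, round(score, 3))
```

In this sketch every candidate deletion requires retraining the model, which is only feasible because the model is trivial; a practical system would need a cheaper way to evaluate each deletion.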

Bibliographic information

pp. 71-76, issued 2006-05

Conference information

Conference name

SRIV 2006: ITRW on Speech Recognition and Intrinsic Variation

Conference dates

May 20, 2006

Venue

Toulouse

Country

France (FRA)

Author version flag

Publication type

VoR (Version of Record)
