Alleviating parameter-tuning burden in reinforcement learning for large-scale process control

http://hdl.handle.net/10061/0002000130
7d10760a-9e82-421a-9d10-8350a6a520ba
Item type Journal Article(1)
Release date 2024-02-16
Title
Title Alleviating parameter-tuning burden in reinforcement learning for large-scale process control
Language
Language eng
Keyword
Subject scheme Other
Subject Reinforcement learning
Keyword
Subject scheme Other
Subject Process control
Keyword
Subject scheme Other
Subject Vinyl acetate monomer
Keyword
Subject scheme Other
Subject Monotonic improvement
Resource type
Resource type journal article
Access rights
Access rights open access
Authors
Zhu, Lingwei
Takami, Go
Kawahara, Mizuo
Kanokogi, Hiroaki
Matsubara, Takamitsu (ja: 松原, 崇充; ja-Kana: マツバラ, タカミツ; WEKO 181; e-Rad_Researcher 20508056)
Abstract
Description type Abstract
Description Modern process controllers require high-quality models and remedial system re-identification when performance degrades. Reinforcement learning (RL) is a promising replacement for these laborious manual procedures. In realistic scenarios, however, time is limited, so algorithms that learn robustly with reduced human-agent interaction or self-exploration (e.g., parameter tuning) are desired. In practice, a large portion of the time needed to set up an RL algorithm so that it works properly is spent on such trial-and-error interactions. To reduce this interaction time, we propose a principled framework that ensures monotonic policy improvement even with underperforming parameters, making the RL process more robust to parameter settings. We incorporate key ingredients such as random features and a factorial policy into the monotonic improvement mechanism so that learning proceeds cautiously in large-scale process control problems. On challenging control problems in a simulated vinyl acetate monomer process, we demonstrate that the proposed method robustly learns a meaningful policy within a short, fixed learning horizon under various parameter configurations that simulate these interactions, whereas the compared method performs well only within a narrow range of parameters.
Bibliographic information en : Computers & Chemical Engineering

Volume 158, Issue date 2022-01-06
Publisher
Publisher Elsevier
ISSN
Source identifier type EISSN
Source identifier 1873-4375
Publisher version DOI
Relation type isReplacedBy
Identifier type DOI
Related identifier https://doi.org/10.1016/j.compchemeng.2022.107658
Publisher version URI
Relation type isReplacedBy
Identifier type URI
Related identifier https://www.sciencedirect.com/science/article/pii/S0098135422000035
Rights
Rights information resource http://creativecommons.org/licenses/by-nc-nd/4.0/
Rights information © 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Author version flag
Publication type NA
Versions

Ver.1 2024-02-16 06:13:40.498053