Alleviating parameter-tuning burden in reinforcement learning for large-scale process control
http://hdl.handle.net/10061/0002000130
| Item type | Journal Article(1) |
|---|---|
| Release date | 2024-02-16 |
| Title | Alleviating parameter-tuning burden in reinforcement learning for large-scale process control |
| Language | eng |
| Keyword (subject scheme: Other) | Reinforcement learning |
| Keyword (subject scheme: Other) | Process control |
| Keyword (subject scheme: Other) | Vinyl acetate monomer |
| Keyword (subject scheme: Other) | Monotonic improvement |
| Resource type | journal article |
| Access rights | open access |
| Authors | Zhu, Lingwei; Takami, Go; Kawahara, Mizuo; Kanokogi, Hiroaki; Matsubara, Takamitsu (松原, 崇充) |
| Abstract | Modern process controllers require high-quality models and remedial system re-identification when performance degrades. Reinforcement Learning (RL) can be a promising replacement for those laborious manual procedures. However, time is limited in realistic scenarios, so algorithms that can learn robustly with reduced human-agent interaction or self-exploration (e.g., parameter tuning) are desired. In practice, a large portion of the time needed to get an RL algorithm working properly is spent on such trial-and-error interactions. To reduce the interaction time, we propose a principled framework that ensures monotonic policy improvement even with underperforming parameters, making the RL process more robust to parameter settings. We incorporate key ingredients such as random features and a factorial policy into the monotonic improvement mechanism to learn cautiously in large-scale process control problems. On challenging control problems in a simulated vinyl acetate monomer process, we demonstrate that the proposed method robustly learns meaningful policies within a short, fixed learning horizon under various parameter configurations that simulate these interactions, whereas the compared method performs well only within a narrow range of parameters. |
| Bibliographic information | en: Computers & Chemical Engineering, vol. 158, issue date 2022-01-06 |
| Publisher | Elsevier |
| ISSN (identifier type: EISSN) | 1873-4375 |
| Publisher version DOI (relation type: isReplacedBy; identifier type: DOI) | https://doi.org/10.1016/j.compchemeng.2022.107658 |
| Publisher version URI (relation type: isReplacedBy; identifier type: URI) | https://www.sciencedirect.com/science/article/pii/S0098135422000035 |
| Rights resource | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
| Rights | © 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/) |
| Author version flag (version type) | NA |