Related Researcher

Yoon, Sung Whan (윤성환)
Machine Intelligence and Information Learning Lab.

Full metadata record

DC Field Value Language
dc.citation.conferencePlace SI -
dc.citation.conferencePlace Singapore EXPO -
dc.citation.title International Conference on Learning Representations -
dc.contributor.author Lee, Hyun Kyu -
dc.contributor.author Yoon, Sung Whan -
dc.date.accessioned 2025-02-13T12:05:06Z -
dc.date.available 2025-02-13T12:05:06Z -
dc.date.created 2025-02-12 -
dc.date.issued 2025-04-24 -
dc.description.abstract Investigating flat minima on loss surfaces in parameter space is well-documented in the supervised learning context, highlighting its advantages for model generalization. However, limited attention has been paid to the reinforcement learning (RL) context, where the impact of flatter reward landscapes in policy parameter space remains largely unexplored. Beyond merely extrapolating from supervised learning, which suggests a link between flat reward landscapes and enhanced generalization, we aim to formally connect the flatness of the reward surface to the robustness of RL models. In policy models where a deep neural network determines actions, flatter reward landscapes in response to parameter perturbations lead to consistent rewards even when actions are perturbed. Moreover, robustness to actions further contributes to robustness against other variations, such as changes in state transition probabilities and reward functions. We extensively simulate various RL environments, confirming the consistent benefits of flatter reward landscapes in enhancing the robustness of RL under diverse conditions, including action selection, transition dynamics, and reward functions. -
dc.identifier.bibliographicCitation International Conference on Learning Representations -
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/86223 -
dc.identifier.url https://openreview.net/forum?id=4OaO3GjP7k -
dc.language English -
dc.publisher International Conference on Learning Representations -
dc.title Flat Reward in Policy Parameter Space Implies Robust Reinforcement Learning -
dc.type Conference Paper -
dc.date.conferenceDate 2025-04-24 -

