Related Researcher

Oh, Tae Hoon (오태훈)


Full metadata record

DC Field Value
dc.citation.endPage 178
dc.citation.startPage 166
dc.citation.title JOURNAL OF PROCESS CONTROL
dc.citation.volume 87
dc.contributor.author Kim, Jong Woo
dc.contributor.author Park, Byung Jun
dc.contributor.author Yoo, Haeun
dc.contributor.author Oh, Tae Hoon
dc.contributor.author Lee, Jay H.
dc.contributor.author Lee, Jong Min
dc.date.accessioned 2024-03-13T10:05:13Z
dc.date.available 2024-03-13T10:05:13Z
dc.date.created 2024-03-13
dc.date.issued 2020-03
dc.description.abstract The Hamilton-Jacobi-Bellman (HJB) equation can be solved to obtain optimal closed-loop control policies for general nonlinear systems. As it is seldom possible to solve the HJB equation exactly for nonlinear systems, either analytically or numerically, methods to build approximate solutions through simulation-based learning have been studied under various names such as neurodynamic programming (NDP) and approximate dynamic programming (ADP). The aspect of learning connects these methods to reinforcement learning (RL), which also tries to learn optimal decision policies through trial-and-error learning. This study develops a model-based RL method, which iteratively learns the solution to the HJB equation and its associated equations. We focus particularly on the control-affine system with a quadratic objective function and the finite horizon optimal control (FHOC) problem with time-varying reference trajectories. The HJB solutions for such systems involve time-varying value, costate, and policy functions subject to boundary conditions. To represent the time-varying HJB solution in high-dimensional state space in a general and efficient way, deep neural networks (DNNs) are employed. It is shown that the use of DNNs, compared to shallow neural networks (SNNs), can significantly improve the performance of a learned policy in the presence of an uncertain initial state and state noise. Examples involving a batch chemical reactor and a one-dimensional diffusion-convection-reaction system are used to demonstrate this and other key aspects of the method. (C) 2020 Elsevier Ltd. All rights reserved.
dc.identifier.bibliographicCitation JOURNAL OF PROCESS CONTROL, v.87, pp.166 - 178
dc.identifier.doi 10.1016/j.jprocont.2020.02.003
dc.identifier.issn 0959-1524
dc.identifier.scopusid 2-s2.0-85079376230
dc.identifier.uri https://scholarworks.unist.ac.kr/handle/201301/81581
dc.identifier.wosid 000518872200014
dc.language English
dc.publisher ELSEVIER SCI LTD
dc.title A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system
dc.type Article
dc.description.isOpenAccess FALSE
dc.relation.journalWebOfScienceCategory Automation & Control Systems; Engineering, Chemical
dc.relation.journalResearchArea Automation & Control Systems; Engineering
dc.type.docType Article
dc.description.journalRegisteredClass scie
dc.description.journalRegisteredClass scopus
dc.subject.keywordAuthor Reinforcement learning
dc.subject.keywordAuthor Approximate dynamic programming
dc.subject.keywordAuthor Deep neural networks
dc.subject.keywordAuthor Globalized dual heuristic programming
dc.subject.keywordAuthor Finite horizon optimal control problem
dc.subject.keywordAuthor Hamilton-Jacobi-Bellman equation
dc.subject.keywordPlus APPROXIMATE OPTIMAL-CONTROL
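
For reference, the problem class named in the abstract (control-affine dynamics, a quadratic objective, and a finite horizon with time-varying reference trajectories) corresponds to the standard HJB formulation below. This is a generic, hedged restatement rather than the paper's own notation; f, g, Q, R, \phi, and x_ref are assumed placeholders.

\dot{x} = f(x) + g(x)\,u, \qquad
J = \phi\big(x(t_f)\big) + \int_{t_0}^{t_f} \Big[ \big(x - x_{\mathrm{ref}}(t)\big)^{\top} Q \big(x - x_{\mathrm{ref}}(t)\big) + u^{\top} R\,u \Big]\,dt

-\frac{\partial V}{\partial t}(x,t) = \min_{u} \Big[ \ell(x,u,t) + \nabla_x V(x,t)^{\top} \big(f(x) + g(x)\,u\big) \Big], \qquad V(x,t_f) = \phi(x)

u^{*}(x,t) = -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla_x V(x,t)

Here \ell denotes the quadratic stage cost in the integral above; the time-varying value function V and its boundary condition at t_f correspond to the "time-varying value, costate, and policy functions subject to boundary conditions" mentioned in the abstract.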
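
A minimal sketch, assuming a PyTorch implementation, of how a DNN can represent the time-varying value function V(x, t) and how the minimizing control follows from its state gradient for a control-affine system with a quadratic input cost. ValueNet, greedy_policy, g, and R_inv are hypothetical names introduced only for illustration; the paper's globalized dual heuristic programming scheme (see the author keywords) also learns costate and policy networks, which are not shown here.

# Illustrative sketch only, not the authors' code.
import torch
import torch.nn as nn

class ValueNet(nn.Module):
    """DNN approximation of the time-varying value function V(x, t)."""
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (batch, state_dim), t: (batch, 1) -> V(x, t): (batch, 1)
        return self.net(torch.cat([x, t], dim=-1))

def greedy_policy(value_net, x, t, g, R_inv):
    """Minimizing control for control-affine dynamics and a quadratic input cost:
    u*(x, t) = -1/2 * R^{-1} g(x)^T dV/dx(x, t).
    g(x) is assumed to return a (batch, n, m) input matrix; R_inv is the (m, m) inverse of R."""
    x = x.detach().requires_grad_(True)
    V = value_net(x, t).sum()                                # scalar, so autograd returns per-sample dV/dx
    Vx = torch.autograd.grad(V, x, create_graph=True)[0]     # (batch, n)
    u = -0.5 * (g(x).transpose(-1, -2) @ Vx.unsqueeze(-1))   # (batch, m, 1)
    return (R_inv @ u).squeeze(-1)                           # (batch, m)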


Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.