Convergence analysis of the deep neural networks based globalized dual heuristic programming

Kim, Jong Woo; Oh, Tae Hoon; Son, Sang Hwan; Jeong, Dong Hwi; Lee, Jong Min

doi:10.1016/j.automatica.2020.109222

Scholarworks@UNIST

Full metadata record

DC Field	Value	Language
dc.citation.startPage	109222	-
dc.citation.title	AUTOMATICA	-
dc.citation.volume	122	-
dc.contributor.author	Kim, Jong Woo	-
dc.contributor.author	Oh, Tae Hoon	-
dc.contributor.author	Son, Sang Hwan	-
dc.contributor.author	Jeong, Dong Hwi	-
dc.contributor.author	Lee, Jong Min	-
dc.date.accessioned	2024-03-13T10:05:13Z	-
dc.date.available	2024-03-13T10:05:13Z	-
dc.date.created	2024-03-13	-
dc.date.issued	2020-12	-
dc.description.abstract	The globalized dual heuristic programming (GDHP) algorithm is a special form of the approximate dynamic programming (ADP) method that solves the Hamilton-Jacobi-Bellman (HJB) equation when the system takes a control-affine form and is subject to a quadratic cost function. This study incorporates deep neural networks (DNNs) as function approximators to inherit their ability to express high-dimensional function spaces. An elementwise error bound for the costate function sequence is newly derived, and the convergence property is presented. In the approximated function space, a uniformly ultimate boundedness (UUB) condition for the weights of general multi-layer NNs is obtained. It is also proved that, under the gradient descent method for solving the moving-target regression problem, the UUB bound gradually converges to a value that contains only the approximation reconstruction error. The proposed method is demonstrated on continuous reactor control, with the aim of obtaining a control policy for multiple initial states, which justifies the necessity of the DNN structure in such cases. (c) 2020 Elsevier Ltd. All rights reserved.	-
dc.identifier.bibliographicCitation	AUTOMATICA, v.122, pp.109222	-
dc.identifier.doi	10.1016/j.automatica.2020.109222	-
dc.identifier.issn	0005-1098	-
dc.identifier.scopusid	2-s2.0-85089817575	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/81579	-
dc.identifier.wosid	000598166900008	-
dc.language	English	-
dc.publisher	PERGAMON-ELSEVIER SCIENCE LTD	-
dc.title	Convergence analysis of the deep neural networks based globalized dual heuristic programming	-
dc.type	Article	-
dc.description.isOpenAccess	FALSE	-
dc.relation.journalWebOfScienceCategory	Automation & Control Systems; Engineering, Electrical & Electronic	-
dc.relation.journalResearchArea	Automation & Control Systems; Engineering	-
dc.type.docType	Article	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordAuthor	Approximate dynamic programming	-
dc.subject.keywordAuthor	Reinforcement learning	-
dc.subject.keywordAuthor	Deep neural networks	-
dc.subject.keywordAuthor	Lyapunov stability	-
dc.subject.keywordAuthor	Nonlinear control	-

ScholarWorks@UNIST was established as an OAK Project for the National Library of Korea.