Policy Gradient Adaptive Dynamic Programming for Data-Based Optimal Control
Oct 30, 2017

Title: Policy Gradient Adaptive Dynamic Programming for Data-Based Optimal Control

 Authors: Luo, BA; Liu, DR; Wu, HN; Wang, D; Lewis, FL

 Author Full Names: Luo, Biao; Liu, Derong; Wu, Huai-Ning; Wang, Ding; Lewis, Frank L.

 Source: IEEE TRANSACTIONS ON CYBERNETICS, 47 (10): 3341-3354; SI; DOI 10.1109/TCYB.2016.2623859; OCT 2017

 Language: English

 Abstract: The model-free optimal control problem of general discrete-time nonlinear systems is considered in this paper, and a data-based policy gradient adaptive dynamic programming (PGADP) algorithm is developed to design an adaptive optimal controller. By using offline and online data rather than a mathematical system model, the PGADP algorithm improves the control policy with a gradient descent scheme. The convergence of the PGADP algorithm is proved by demonstrating that the constructed Q-function sequence converges to the optimal Q-function. Based on the PGADP algorithm, an adaptive control method is developed with an actor-critic structure and the method of weighted residuals. Its convergence properties are analyzed, showing that the approximate Q-function converges to its optimum. Computer simulation results demonstrate the effectiveness of the PGADP-based adaptive control method.
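
 Illustrative sketch: the abstract's idea of improving a policy by gradient descent on a Q-function learned from data can be illustrated on a toy problem. The code below is not the authors' PGADP algorithm; it is a minimal sketch under stated assumptions: a hypothetical scalar linear plant (the paper treats general nonlinear systems), a quadratic Q-function basis, a linear policy u = -k*x, and an ordinary least-squares critic in place of the method of weighted residuals. All names and constants (A, B, Q_COST, R_COST, pgadp_sketch, the step size) are invented for illustration.

```python
import random

# Hypothetical scalar plant x' = A*x + B*u. It is used ONLY to generate data;
# the learner never reads A or B directly (assumption for illustration).
A, B = 0.9, 0.5
Q_COST, R_COST = 1.0, 1.0  # stage cost r = Q_COST*x^2 + R_COST*u^2


def features(x, u):
    """Quadratic basis so that Q(x, u) = w0*x^2 + w1*x*u + w2*u^2."""
    return [x * x, x * u, u * u]


def solve3(M, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def collect_data(n, seed=0):
    """Offline data set of (x, u, cost, next state) with random states and inputs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((x, u, Q_COST * x * x + R_COST * u * u, A * x + B * u))
    return data


def evaluate_policy(k, data):
    """Critic: least-squares fit of w from the Bellman equation
    w.phi(x, u) - w.phi(x', -k*x') = r over the data set."""
    M = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, u, r, xn in data:
        phi = features(x, u)
        phi_next = features(xn, -k * xn)
        d = [p - q for p, q in zip(phi, phi_next)]
        for i in range(3):
            b[i] += d[i] * r
            for j in range(3):
                M[i][j] += d[i] * d[j]
    return solve3(M, b)


def pgadp_sketch(iters=200, step=0.1):
    """Alternate critic fitting and a gradient step on the policy gain k."""
    data = collect_data(50)
    k = 0.0  # initial policy u = -k*x; stabilizing here because |A| < 1
    for _ in range(iters):
        w = evaluate_policy(k, data)
        # dQ/du at u = -k*x equals (w1 - 2*w2*k)*x, so descending Q in u
        # changes the gain by +step*(w1 - 2*w2*k).
        k += step * (w[1] - 2.0 * w[2] * k)
    return k
```

 Calling pgadp_sketch() returns the learned feedback gain. In this toy setting the fixed point of the update, k = w1 / (2*w2), is the greedy gain for the fitted Q-function, so the result can be checked against the gain obtained from the scalar Riccati equation for the same A, B, Q_COST and R_COST.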

 ISSN: 2168-2267

 eISSN: 2168-2275

 IDS Number: FF9BM

 Unique ID: WOS:000409311800032
