Title: Policy Gradient Adaptive Dynamic Programming for Data-Based Optimal Control
|
| Authors: Luo, B; Liu, DR; Wu, HN; Wang, D; Lewis, FL
|
| Author Full Names: Luo, Biao; Liu, Derong; Wu, Huai-Ning; Wang, Ding; Lewis, Frank L.
|
| Source: IEEE TRANSACTIONS ON CYBERNETICS, 47 (10):3341-3354; SI 10.1109/TCYB.2016.2623859 OCT 2017
|
| Language: English
|
| Abstract: The model-free optimal control problem of general discrete-time nonlinear systems is considered in this paper, and a data-based policy gradient adaptive dynamic programming (PGADP) algorithm is developed to design an adaptive optimal controller. By using offline and online data rather than the mathematical system model, the PGADP algorithm improves the control policy with a gradient descent scheme. The convergence of the PGADP algorithm is proved by demonstrating that the constructed Q-function sequence converges to the optimal Q-function. Based on the PGADP algorithm, the adaptive control method is developed with an actor-critic structure and the method of weighted residuals. Its convergence properties are analyzed, showing that the approximate Q-function converges to its optimum. Computer simulation results demonstrate the effectiveness of the PGADP-based adaptive control method.
|
| ISSN: 2168-2267
|
| eISSN: 2168-2275
|
| IDS Number: FF9BM
|
| Unique ID: WOS:000409311800032