

Convergence Proof of Approximate Policy Iteration for Undiscounted Optimal Control of Discrete-Time Systems
Jan 04, 2016

Title: Convergence Proof of Approximate Policy Iteration for Undiscounted Optimal Control of Discrete-Time Systems

Authors: Zhu, YH; Zhao, DB; He, HB; Ji, JH

Author Full Names: Zhu, Yuanheng; Zhao, Dongbin; He, Haibo; Ji, Junhong

Source: COGNITIVE COMPUTATION, 7 (6):763-771; SI 10.1007/s12559-015-9350-z DEC 2015

ISSN: 1866-9956

eISSN: 1866-9964

Unique ID: WOS:000366329200012

Abstract:

Approximate policy iteration (API) is studied in this paper to solve undiscounted optimal control problems. A discrete-time system with a continuous state space and a finite action set is considered. Because an approximation technique is used for the continuous state space, approximation errors arise in the calculation and disturb the convergence of the original policy iteration. In our research, we analyze and prove the convergence of API for undiscounted optimal control. We use an iterative method to implement approximate policy evaluation and demonstrate that the error between the approximate and exact value functions is bounded. Then, with the finite action set, the greedy policy in policy improvement is generated directly. Our main theorem proves that if a sufficiently accurate approximator is used, API converges to the optimal policy. For implementation, we introduce a fuzzy approximator and verify its performance on the puddle world problem.
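To illustrate the scheme the abstract describes, here is a minimal sketch of approximate policy iteration on a hypothetical toy problem (not the paper's puddle world): a 1-D continuous state, a finite action set, undiscounted per-step cost, iterative least-squares policy evaluation, and direct greedy policy improvement. The triangular membership features stand in loosely for a fuzzy approximator; all names and parameters are assumptions for this sketch.

```python
import numpy as np

# Hypothetical toy problem: state x in [0, 1); the goal region x >= 1 is
# absorbing with zero cost. Finite action set: two step sizes, each step
# incurring an undiscounted stage cost of 1.
ACTIONS = [0.05, 0.2]

def step(x, a):
    nx = x + a
    return min(nx, 1.0), 1.0, nx >= 1.0  # next state, stage cost, goal reached

# Fuzzy-style approximator: normalized triangular membership functions
# on a uniform grid of centers (a stand-in for the paper's approximator).
CENTERS = np.linspace(0.0, 1.0, 21)
WIDTH = CENTERS[1] - CENTERS[0]

def features(x):
    phi = np.maximum(0.0, 1.0 - np.abs(x - CENTERS) / WIDTH)
    return phi / phi.sum()

def v(w, x):
    return 0.0 if x >= 1.0 else features(x) @ w

SAMPLES = np.linspace(0.0, 0.99, 200)

def evaluate(policy, sweeps=60):
    """Iterative approximate policy evaluation: repeatedly fit
    V(x) ~ c(x, a) + V(f(x, a)) by least squares over sample states."""
    w = np.zeros_like(CENTERS)
    Phi = np.array([features(x) for x in SAMPLES])
    for _ in range(sweeps):
        targets = []
        for x in SAMPLES:
            nx, c, done = step(x, policy(x))
            targets.append(c + (0.0 if done else features(nx) @ w))
        w, *_ = np.linalg.lstsq(Phi, np.array(targets), rcond=None)
    return w

def greedy(w):
    # With a finite action set, the improved policy is generated directly
    # by minimizing the one-step cost plus the approximate cost-to-go.
    def policy(x):
        return min(ACTIONS, key=lambda a: step(x, a)[1] + v(w, step(x, a)[0]))
    return policy

# Approximate policy iteration: alternate evaluation and improvement.
policy = lambda x: ACTIONS[0]
for _ in range(5):
    w = evaluate(policy)
    policy = greedy(w)

print(policy(0.5))          # greedy action chosen at x = 0.5
print(round(v(w, 0.0), 2))  # approximate total cost-to-go from x = 0
```

Because the approximator only interpolates the exact value function (a step function here), the learned values carry a bounded approximation error, which is exactly the effect the paper's convergence analysis has to control.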

