Author:
Zhu Lingwei, Takami Go, Kawahara Mizuo, Kanokogi Hiroaki, Matsubara Takamitsu
Subject
Computer Science Applications, General Chemical Engineering
Reference: 77 articles.
1. A fast and reliable policy improvement algorithm;Abbasi-Yadkori,2016
2. Constrained policy optimization;Achiam,2017
3. Constrained Markov decision processes;Altman,1999
4. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path;Antos;Mach. Learn.,2008
5. Dynamic policy programming;Azar;The Journal of Machine Learning Research (JMLR),2012
Cited by
8 articles.