scientific article
From MaRDI portal
Publication:3175278
DOI: 10.13232/j.cnki.jnju.2017.06.007
zbMath: 1399.68136
MaRDI QID: Q3175278
Yushuang Wu, Jingwen Ma, Xingguo Chen, Xiao-yu Chen
Publication date: 18 July 2018
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Keywords: reinforcement learning, oblique projection, temporal difference learning, linear function approximation, off-policy