scientific article
zbMath 1270.90097, MaRDI QID Q2844160
Masami Kurano, Masayuki Horiguchi, Masami Yasuda, Tetsuichiro Iki
Publication date: 28 August 2013
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Markov decision processes; learning algorithm; adaptive policy; average case; communicating case; reward-penalty type; unknown transition matrix
Learning and adaptive systems in artificial intelligence (68T05)
Markov chains (discrete-time Markov processes on discrete state spaces) (60J10)
Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
Markov and semi-Markov decision processes (90C40)