Diverse randomized value functions: a provably pessimistic approach for offline reinforcement learning
Publication:6595316
DOI: 10.1016/j.ins.2024.121146; MaRDI QID: Q6595316
Zhen Wang, Hongyi Guo, Xudong Yu, Chang-Hong Wang, Chenjia Bai
Publication date: 30 August 2024
Published in: Information Sciences
Keywords: diversification, pessimism, distributional shift, offline reinforcement learning, randomized value functions
Computer science (68-XX); Game theory, economics, finance, and other social and behavioral sciences (91-XX)