The following pages link to Csaba Szepesvári (Q399885):
Displaying 43 items.
- Alignment based kernel learning with a continuous set of base kernels (Q399887)
- Model selection in reinforcement learning (Q415618)
- Regularized least-squares regression: learning from a sequence (Q645620)
- Models of active learning in group-structured state spaces (Q963065)
- Active learning in heteroscedastic noise (Q982644)
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (Q1009248)
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (Q1017665)
- Module-based reinforcement learning: Experiments with a real robot (Q1267736)
- Convergence results for single-step on-policy reinforcement-learning algorithms (Q1568533)
- Toward a classification of finite partial-monitoring games (Q1939263)
- Training parsers by inverse reinforcement learning (Q1959536)
- A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, variance reduction, and variational bounds (Q2290691)
- Mixing time estimation in reversible Markov chains from a single sample path (Q2330466)
- Approximate geometry representations and sensory fusion (Q2563859)
- An asynchronous stochastic approximation theorem and some applications (Q2707252)
- Efficient approximate planning in continuous space Markovian decision problems (Q2758704)
- Regularized policy iteration with nonparametric function spaces (Q2834459)
- On Learning the Optimal Waiting Time (Q2938733)
- Online Markov Decision Processes Under Bandit Feedback (Q2983230)
- (Q3096132)
- Partial Monitoring with Side Information (Q3164828)
- Tuning Bandit Algorithms in Stochastic Environments (Q3520056)
- Active Learning in Multi-armed Bandits (Q3529929)
- Active Learning of Group-Structured Environments (Q3529932)
- Algorithms for Reinforcement Learning (Q3588852)
- (Q4258651)
- Robust control using inverse dynamics neurocontrollers (Q4377237)
- An integrated architecture for motion-control and path-planning (Q4393202)
- (Q4515253)
- A Linearly Relaxed Approximate Linear Program for Markov Decision Processes (Q4567182)
- (Q4637078)
- A Modular Analysis of Adaptive (Non-)Convex Optimization: Optimism, Composite Objectives, and Variational Bounds (Q4645676)
- Stochastic Optimization in a Cumulative Prospect Theory Framework (Q4682345)
- Toward a Classification of Finite Partial-Monitoring Games (Q4930701)
- (Q4969242)
- (Q5053235)
- Bandit Algorithms (Q5109247)
- Partial Monitoring—Classification, Regret Bounds, and Algorithms (Q5247607)
- Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path (Q5307594)
- (Q5396654)
- Improved Rates for the Stochastic Continuum-Armed Bandit Problem (Q5434068)
- Machine Learning: ECML 2004 (Q5450744)
- Computer Vision - ECCV 2004 (Q5712900)