The following pages link to Benjamin Van Roy (Q399882):
Displaying 46 items.
- Learning a factor model via regularized PCA (Q399883)
- (Q643268) (redirect page)
- Industry dynamics: foundations for models with an infinite number of firms (Q643269)
- On regression-based stopping times (Q708889)
- A short proof of optimality for the MIN cache replacement algorithm (Q845965)
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning (Q859737)
- Decentralized decision-making in a large team with local information. (Q1399517)
- (Q1586802) (redirect page)
- On the existence of fixed points for approximate value iteration and temporal-difference learning (Q1586803)
- On average versus discounted reward temporal-difference learning (Q1604814)
- Average cost temporal-difference learning (Q1805802)
- Feature-based methods for large scale dynamic programming (Q1911341)
- Adaptive execution: exploration and learning of price impact (Q2795866)
- An information-theoretic analysis of Thompson sampling (Q2810878)
- Resource allocation via message passing (Q2899114)
- Directed Principal Component Analysis (Q2931712)
- Control of Diffusions via Linear Programming (Q3001282)
- Computational Methods for Oblivious Equilibrium (Q3098318)
- Dynamic Pricing with a Prior on Market Response (Q3100446)
- Investment and Market Structure in Industries with Congestion (Q3100487)
- Manipulation Robustness of Collaborative Filtering (Q3117324)
- A Nonparametric Approach to Multiproduct Pricing (Q3391963)
- Consensus Propagation (Q3548109)
- Markov Perfect Industry Dynamics With Many Firms (Q3548509)
- (Q3590801)
- Capacity of the Trapdoor Channel With Feedback (Q3604464)
- The Linear Programming Approach to Approximate Dynamic Programming (Q3637395)
- An analysis of temporal-difference learning with function approximation (Q4362297)
- Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives (Q4506926)
- An analysis of belief propagation on the turbo decoding graph with Gaussian densities (Q4544518)
- (Q4547446)
- A Tutorial on Thompson Sampling (Q4556183)
- Learning to Optimize via Information-Directed Sampling (Q4969321)
- Convergence of Min-Sum Message Passing for Quadratic Optimization (Q4975868)
- (Q5201298)
- (Q5214215)
- Learning to Optimize via Posterior Sampling (Q5247618)
- Universal Reinforcement Learning (Q5281503)
- Convergence of Min-Sum Message-Passing for Convex Optimization (Q5281554)
- Algorithms and Models for the Web-Graph (Q5311180)
- Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization (Q5359119)
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation (Q5387976)
- A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees (Q5387999)
- (Q5477860)
- On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming (Q5704184)
- Satisficing in Time-Sensitive Bandit Learning (Q5870357)