scientific article

zbMath 0659.62086, MaRDI QID Q3809068

Publication date: 1985

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

bandit problems; sequential decision; annotated bibliography; Minimax approach; Continuous-time bandits; independent Bernoulli arms; uniform discounting

Mathematics Subject Classification ID

Dynamic programming (90C39); Research exposition (monographs, survey articles) pertaining to statistics (62-02); Stochastic games, stochastic differential games (91A15); Markov and semi-Markov decision processes (90C40); Sequential statistical design (62L05)

Related Items

Two-Armed Bandit Strategies that Discount Past and Future, A linear response bandit problem, Optimistic Gittins Indices, Covariate models for bernoulli bandits, Optimal assignment of sellers in a store with a random number of clients via the Armed Bandit model, Optimal allocations in sequential tests involving two populations with covariates, Multistage decision problems, Daisee: Adaptive importance sampling by balancing exploration and exploitation, Topp-Leone distribution with an application to binomial sampling, A central limit theorem, loss aversion and multi-armed bandits, Bayesian bandits in clinical trials, Unnamed Item, ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS, Recent advances in reinforcement learning in finance, A combinatorial multi-armed bandit approach to correlation clustering, Poissonian two-armed bandit: a new approach, Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards, GROUP SEQUENTIAL TESTS WITH OUTCOME-DEPENDENT TREATMENT ASSIGNMENT, A confirmation of a conjecture on Feldman’s two-armed bandit problem, Learning the distribution with largest mean: two bandit frameworks, Unnamed Item, Per-Round Knapsack-Constrained Linear Submodular Bandits, Unnamed Item, BANDIT STRATEGIES EVALUATED IN THE CONTEXT OF CLINICAL TRIALS IN RARE LIFE-THREATENING DISEASES, Parametric continuity in dynamic programming problems, Electing monetary policymakers according to inflation performance, Learning in Combinatorial Optimization: What and How to Explore, An Approximation Approach for Response-Adaptive Clinical Trial Design, Unnamed Item, Parametric continuity in dynamic programming problems, The interest sensitivity of wealth in the life cycle model, A Two-Armed Bandit Problem with possibility of no Information, The lob-pass problem, Ethics, data-dependent designs, and the strategy of clinical trials: time to start learning-as-we-go?,
Dynamic Pricing with a Poisson Bandit Model, A modification of the stochastic ruler method for discrete stochastic optimization, Adaptive Incentive-Compatible Sponsored Search Auction, Minimum principles in motor control, Infinite Arms Bandit: Optimality via Confidence Bounds, Modifications in the discount sequence for bandit processes, An asymptotically optimal heuristic for general nonstationary finite-horizon restless multi-armed, multi-action bandits, Randomized allocation with arm elimination in a bandit problem with covariates, On the Worth of Perfect Information in Bandits with Random Discounting, An optimal investment and consumption model with stochastic returns, Signaling Games, Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability, When to Abandon a Research Project and Search for a New One, Contribution of Milton Sobel in Selection Problem Following Ethical Allocation, Study of Optimal Adaptive Rule in Testing Composite Hypothesis, Generalized Bandit Problems, Optimizing a Unimodal Response Function for Binary Variables, Optimal investment and consumption with stochastic dividends, Discussion on “A Hybrid Selection and Testing Procedure with Curtailment for Comparative Clinical Trials” by Elena M. Buzaianu and Pinyuen Chen, Gittins Index for Simple Family of Markov Bandit Processes with Switching Cost and No Discounting, Bayesian Uncertainty Directed Trial Designs, Dynamic allocation policies for the finite horizon one armed bandit problem, Comparing the relative efficiency of three sequential procedures, Parameter Estimation: The Proper Way to Use Bayesian Posterior Processes with Brownian Noise, Batched bandit problems,
Bayesian economists ... Bayesian agents. An alternative approach to optimal learning, On the generic nonconvergence of Bayesian actions and beliefs, Woodroofe's one-armed bandit problem revisited, Equilibrium learning in simple contests, The optimal sequential information acquisition structure: a rational utility-maximizing perspective, Infomax strategies for an optimal balance between exploration and exploitation, Celebrating 70: an interview with Don Berry, Optimal selection of obsolescence mitigation strategies using a restless bandit model, Bandit and covariate processes, with finite or non-denumerable set of arms, Response-adaptive designs for clinical trials: simultaneous learning from multiple patients, Optimal policies to obtain the most join results, Approximating the operating characteristics of Bayesian uncertainty directed trial designs, Keeping your options open, Open problems in universal induction & intelligence, Wisdom of crowds versus groupthink: learning in groups and in isolation, Bandit bounds from stochastic variability extrema, Learning dynamic algorithm portfolios, A value function arising in the economics of information,
Will truth out? -- An advisor's quest to appear competent, PALO: a probabilistic hill-climbing algorithm, On determining the importance of attributes with a stopping problem, On a theorem of Kelley, Two-armed bandit problem for parallel data processing systems, Learning dynamics with private and public signals, The multi-armed bandit problem: an efficient nonparametric solution, An index-based deterministic convergent optimal algorithm for constrained multi-armed bandit problems, Experimentation and competition, One-armed bandit process with a covariate, General time consistent discounting, Probably approximately optimal satisficing strategies, Optimal experimental design for a class of bandit problems, Gaussian two-armed bandit and optimization of batch data processing, Permissive planning: Extending classical planning to uncertain task domains, An application of Edgeworth expansion in Bayesian inferences: Optimal sample sizes in clinical trials, Customization of J. Bather's UCB strategy for a Gaussian multiarmed bandit, Adaptive selection of query execution strategies by learning automata, Optimal Bayesian strategies for the infinite-armed Bernoulli bandit, On the equivalence of optimal recommendation sets and myopically optimal query sets, Design issues for generalized linear models: a review, Branching bandits: A sequential search process with correlated pay-offs,
Modeling item-item similarities for personalized recommendations on Yahoo! front page, Nonstochastic bandits: Countable decision set, unbounded costs and reactive environments, The role of externalities and information aggregation in market collapse, Ambiguity aversion in multi-armed bandit problems, Some results on two-stage clinical trials, Computational exploration of the biological basis of Black-Scholes expected utility function, Optimal strategies for a class of sequential control problems with precedence relations, Randomized prediction of individual sequences, The K-armed bandit problem with multiple priors, A program for sequential allocation of three Bernoulli populations, Online linear optimization and adaptive routing, Clinical Trials with Exponential Survival Times, A generalized Gittins index for a Markov chain and its recursive calculation, Stochastic dominance under Bayesian learning, A comparative study of ad hoc techniques and evolutionary methods for multi-armed bandit problems, Bayesian learning in normal form games, On Bayesian index policies for sequential resource allocation, Linear learning in changing environments, Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems, Is the FDA too conservative or too aggressive?: a Bayesian decision analysis of clinical trial design, Sequential design of computer experiments for the estimation of a probability of failure, Exploration and correlation, Evaluation of asymptotic approximations for a two-stage Bernoulli bandit, Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards, Bayesian statistics and the efficiency and ethics of clinical trials, Bayesian parameter estimation in the Expectancy Valence model of the Iowa gambling task, A note on infinite-armed Bernoulli bandit problems with generalized beta prior distributions, An optimal strategy for sequential classification on partially ordered sets, A randomly reinforced urn, Hold or roll: reaching the goal in jeopardy race games,
On the optimal amount of experimentation in sequential decision problems, The disorder problem for compound Poisson processes with exponential jumps, Sensitivity of the Gittins index in the continuous time two-armed bandit problem, Foregone with the wind: Indirect payoff information and its implications for choice, Simulation-based sequential Bayesian design, Simulation of adaptive response: A model of drug interdiction, Delegation and the regulation of risk, Utility-based on-line exploration for repeated navigation in an embedded graph, A model of experimentation with information externalities, Risk aversion in expected intertemporal discounted utilities bandit problems, Gittins' theorem under uncertainty, Theoretical tools for understanding and aiding dynamic decision making, A Bayesian analysis of human decision-making on bandit problems, Two-armed bandit problem and batch version of the mirror descent algorithm, Dismemberment and design for controlling the replication variance of regret for the multi-armed bandit, An asymptotically optimal strategy for constrained multi-armed bandit problems, Economists' models of learning, Multi-armed bandits in discrete and continuous time, Small-sample performance of Bernoulli two-armed bandit Bayesian strategies, Two armed bandits with change point in one arm, Which one should I imitate?, Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates, The pure exploration problem with general reward functions depending on full distributions, Choosing among alternative discrete investment projects under uncertainty, Nonparametric bandit methods, Adaptive approaches to stochastic programming, Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges, One-armed bandit problem for parallel data processing systems, Asymptotic properties of bandit processes with geometric responses.