- Format
- Inbunden (Hardback)
- Language
- English
- Number of pages
- 404
- Publication date
- 2012-04-27
- Edition
- 1
- Publisher
- John Wiley & Sons Inc
- Contributors
- Powell/Ryzhov
- Illustrations
- Illustrations
- Dimensions
- 236 x 160 x 28 mm
- Weight
- 676 g
- Number of components
- 1
- Components
- 52:B&W 6.14 x 9.21in or 234 x 156mm (Royal 8vo) Case Laminate on White w/Gloss Lam
- ISBN
- 9780470596692
More books by the authors
-
Approximate Dynamic Programming
Warren B Powell
Understanding approximate dynamic programming (ADP) in large industrial settings helps develop practical and high-quality solutions to problems that involve making decisions in the presence of uncertainty. With a focus on modeling and algorithms i...
-
Handbook of Learning and Approximate Dynamic Programming
Jennie Si, Andrew G Barto, Warren B Powell, Don Wunsch
- A complete resource to Approximate Dynamic Programming (ADP), including on-line simulation code
- Provides a tutorial that readers can use to start implementing the learning algorithms provided in the book
- Includes ideas, directions, and recent ...
Reviews in the media
He concludes, "This book collects a number of interesting ideas in optimal learning, allows for connections to be made across disciplines, and is a welcome addition to my bookshelf." (INFORMS Journal on Computing, 1 October 2012)
Other information
WARREN B. POWELL, PhD, is Professor of Operations Research and Financial Engineering at Princeton University, where he is founder and Director of CASTLE Laboratory, a research unit that works with industrial partners to test new ideas found in operations research. The recipient of the 2004 INFORMS Fellow Award, Dr. Powell is the author of Approximate Dynamic Programming: Solving the Curses of Dimensionality, Second Edition (Wiley). ILYA O. RYZHOV, PhD, is Assistant Professor in the Department of Decision, Operations, and Information Technologies at the Robert H. Smith School of Business at the University of Maryland. He has made fundamental contributions bridging ranking and selection with multi-armed bandits, and optimal learning with mathematical programming.
Table of contents
Preface xv
Acknowledgments xix
1 The challenges of learning 1
  1.1 Learning the best path 2
  1.2 Areas of application 4
  1.3 Major problem classes 12
  1.4 The different types of learning 13
  1.5 Learning from different communities 16
  1.6 Information collection using decision trees 18
    1.6.1 A basic decision tree 18
    1.6.2 Decision tree for offline learning 20
    1.6.3 Decision tree for online learning 21
    1.6.4 Discussion 25
  1.7 Website and downloadable software 26
  1.8 Goals of this book 26
  Problems 28
2 Adaptive learning 31
  2.1 The frequentist view 32
  2.2 The Bayesian view 33
    2.2.1 The updating equations for independent beliefs 34
    2.2.2 The expected value of information 36
    2.2.3 Updating for correlated normal priors 38
    2.2.4 Bayesian updating with an uninformative prior 41
  2.3 Updating for non-Gaussian priors 42
    2.3.1 The gamma-exponential model 43
    2.3.2 The gamma-Poisson model 44
    2.3.3 The Pareto-uniform model 45
    2.3.4 Models for learning probabilities* 46
    2.3.5 Learning an unknown variance* 49
  2.4 Monte Carlo simulation 51
  2.5 Why does it work?* 54
    2.5.1 Derivation of ~- 54
    2.5.2 Derivation of Bayesian updating equations for independent beliefs 55
  2.6 Bibliographic notes 57
  Problems 57
3 The economics of information 61
  3.1 An elementary information problem 61
  3.2 The marginal value of information 65
  3.3 An information acquisition problem 68
  3.4 Bibliographic notes 70
  Problems 70
4 Ranking and selection 71
  4.1 The model 72
  4.2 Measurement policies 75
    4.2.1 Deterministic vs. sequential policies 75
    4.2.2 Optimal sequential policies 76
    4.2.3 Heuristic policies 77
  4.3 Evaluating policies 81
  4.4 More advanced topics* 83
    4.4.1 An alternative representation of the probability space 83
    4.4.2 Equivalence of using true means and sample estimates 84
  4.5 Bibliographic notes 85
  Problems 85
5 The knowledge gradient 89
  5.1 The knowledge gradient for independent beliefs 90
    5.1.1 Computation 91
    5.1.2 Some properties of the knowledge gradient 93
    5.1.3 The four distributions of learning 94
  5.2 The value of information and the S-curve effect 95
  5.3 Knowledge gradient for correlated beliefs 98
  5.4 The knowledge gradient for some non-Gaussian distributions 103
    5.4.1 The gamma-exponential model 104
    5.4.2 The gamma-Poisson model 107
    5.4.3 The Pareto-uniform model 108
    5.4.4 The beta-Bernoulli model 109
    5.4.5 Discussion 111
  5.5 Relatives of the knowledge gradient 112
    5.5.1 Expected improvement 113
    5.5.2 Linear loss* 114
  5.6 Other issues 116
    5.6.1 Anticipatory vs. experiential learning 117
    5.6.2 The problem of priors 118
    5.6.3 Discussion 120
  5.7 Why does it work?* 121
    5.7.1 Derivation of the knowledge gradient formula 121
  5.8 Bibliographic notes 125
  Problems 126
6 Bandit problems 139
  6.1 The theory and practice of Gittins indices 141
    6.1.1 Gittins indices in the beta-Bernoulli model 142
    6.1.2 Gittins indices in the normal-normal model 145
    6.1.3 Approximating Gittins indices 147
  6.2 Variations of bandit problems 148
  6.3 Upper confidence bounding 149
  6.4 The knowledge gradient for bandit problems 151
    6.4.1 The basic idea 151
    6.4.2 Some experimental comparisons 153
    6.4.3 Non-normal models 156
  6.5 Bibliographic notes 157
  Problems 157
7 Elements of a learning problem 163
  7.1 The states of our system 164
  7.2 Types of decisions 166
  7.3 Exogenous information 167
  7.4 Transition functions 168
  7.5 Objective functions 168
    7.5.1 Designing versus controlling 168
    7.5.2 Measurement costs 170
    7.5.3 Objectives 170
  7.6 Evaluating policies 175
  7.7 Discussion 177
  7.8 Bibliographic notes 178
  Problems 178
8 Linear belief models 181
  8.1 Applications 182
    8.1.1 Maximizing ad c