The Resource Algorithms for reinforcement learning, Csaba Szepesvári, (electronic book)

Label
Algorithms for reinforcement learning
Title
Algorithms for reinforcement learning
Statement of responsibility
Csaba Szepesvári
Creator
Subject
Language
eng
Summary
Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, and note a large number of state-of-the-art algorithms, followed by a discussion of their theoretical properties and limitations.
Member of
Cataloging source
CaBNvSL
http://library.link/vocab/creatorName
Szepesvári, Csaba.
Illustrations
illustrations
Index
no index present
Literary form
non fiction
Nature of contents
  • dictionaries
  • abstracts summaries
  • bibliography
Series statement
  • Synthesis digital library of engineering and computer science
  • Synthesis lectures on artificial intelligence and machine learning
Series volume
9
http://library.link/vocab/subjectName
Reinforcement learning
Target audience
  • adult
  • specialized
Label
Algorithms for reinforcement learning, Csaba Szepesvári, (electronic book)
Instantiates
Publication
Antecedent source
file reproduced from original
Bibliography note
Includes bibliographical references (p. 73-88)
Color
multicolored
Contents
  • 1. Markov decision processes -- Preliminaries -- Markov decision processes -- Value functions -- Dynamic programming algorithms for solving MDPs
  • 2. Value prediction problems -- Temporal difference learning in finite state spaces -- Tabular TD(0) -- Every-visit Monte-Carlo -- TD(λ): unifying Monte-Carlo and TD(0) -- Algorithms for large state spaces -- TD(λ) with function approximation -- Gradient temporal difference learning -- Least-squares methods -- The choice of the function space
  • 3. Control -- A catalog of learning problems -- Closed-loop interactive learning -- Online learning in bandits -- Active learning in bandits -- Active learning in Markov decision processes -- Online learning in Markov decision processes -- Direct methods -- Q-learning in finite MDPs -- Q-learning with function approximation -- Actor-critic methods -- Implementing a critic -- Implementing an actor
  • 4. For further exploration -- Further reading -- Applications -- Software
  • A. The theory of discounted Markovian decision processes -- A.1. Contractions and Banach's fixed-point theorem -- A.2. Application to MDPs -- Bibliography -- Author's biography
Dimensions
unknown
Extent
1 electronic text (xii, 89 p. : ill.)
File format
multiple file formats
Form of item
electronic
Isbn
9781608454938
Level of compression
unknown
Other physical details
digital file
Quality assurance targets
unknown
Reformatting quality
access
Specific material designation
remote
System details
System requirements: Adobe Acrobat Reader
