Prediction and Control with Function Approximation
University of Alberta and Alberta Machine Intelligence Institute via Coursera
Overview
In this course, you will learn how to solve problems with large, high-dimensional, and potentially infinite state spaces. You will see that estimating value functions can be cast as a supervised learning problem---function approximation---allowing you to build agents that carefully balance generalization and discrimination in order to maximize reward. We will begin this journey by investigating how our policy evaluation or prediction methods like Monte Carlo and TD can be extended to the function approximation setting. You will learn about feature construction techniques for RL, and representation learning via neural networks and backprop. We conclude this course with a deep dive into policy gradient methods: a way to learn policies directly without learning a value function. In this course you will solve two continuous-state control tasks and investigate the benefits of policy gradient methods in a continuous-action environment.
Prerequisites: This course strongly builds on the fundamentals of Courses 1 and 2, and learners should have completed these before starting this course. Learners should also be comfortable with probabilities & expectations, basic linear algebra, basic calculus, Python 3 (at least one year of experience), and implementing algorithms from pseudocode.
By the end of this course, you will be able to:
-Understand how to use supervised learning approaches to approximate value functions
-Understand objectives for prediction (value estimation) under function approximation
-Implement TD with function approximation (state aggregation), on an environment with an infinite state space (continuous state space)
-Understand fixed basis and neural network approaches to feature construction
-Implement TD with neural network function approximation in a continuous state environment
-Understand new difficulties in exploration when moving to function approximation
-Contrast discounted problem formulations for control versus an average reward problem formulation
-Implement expected Sarsa and Q-learning with function approximation on a continuous state control task
-Understand objectives for directly estimating policies (policy gradient objectives)
-Implement a policy gradient method (called Actor-Critic) on a discrete state environment
Syllabus
- Welcome to the Course!
- Welcome to the third course in the Reinforcement Learning Specialization: Prediction and Control with Function Approximation, brought to you by the University of Alberta, Onlea, and Coursera. In this pre-course module, you'll be introduced to your instructors, and get a flavour of what the course has in store for you. Make sure to introduce yourself to your classmates in the "Meet and Greet" section!
- On-policy Prediction with Approximation
- This week you will learn how to estimate a value function for a given policy, when the number of states is much larger than the memory available to the agent. You will learn how to specify a parametric form of the value function, how to specify an objective function, and how a form of gradient descent can be used to estimate values from interaction with the world. (A rough illustrative sketch of such an update appears after the syllabus.)
- Constructing Features for Prediction
- The features used to construct the agent’s value estimates are perhaps the most crucial part of a successful learning system. In this module we discuss two basic strategies for constructing features: (1) fixed basis functions that form an exhaustive partition of the input, and (2) adapting the features while the agent interacts with the world via Neural Networks and Backpropagation. In this week’s graded assessment you will solve a simple but infinite state prediction task with a Neural Network and TD learning. (The first sketch after the syllabus shows one such fixed basis, tile coding, feeding a TD update.)
- Control with Approximation
- This week, you will see that the concepts and tools introduced in modules two and three allow straightforward extension of classic TD control methods to the function approximation setting. In particular, you will learn how to find the optimal policy in infinite-state MDPs by simply combining semi-gradient TD methods with generalized policy iteration, yielding classic control methods like Q-learning and Sarsa. We conclude with a discussion of a new problem formulation for RL---average reward---which will undoubtedly be used in many applications of RL in the future. (A rough sketch of a semi-gradient Q-learning update follows the syllabus.)
- Policy Gradient
- Every algorithm you have learned about so far estimates a value function as an intermediate step towards the goal of finding an optimal policy. An alternative strategy is to directly learn the parameters of the policy. This week you will learn about these policy gradient methods, and their advantages over value-function-based methods. You will also learn how policy gradient methods can be used to find the optimal policy in tasks with both continuous state and action spaces. (A sketch of a simple one-step actor-critic update follows the syllabus.)
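To make the prediction and feature-construction modules more concrete, here is a minimal sketch (not part of the course materials) of semi-gradient TD(0) with a linear value function over tile-coded features. The environment interface (env_reset/env_step), the step size, and the tiling settings are illustrative assumptions.

```python
import numpy as np

def tile_features(state, low=0.0, high=1.0, num_tilings=4, tiles_per_tiling=8):
    """Binary features from several offset tilings over a 1-D continuous state."""
    x = np.zeros(num_tilings * tiles_per_tiling)
    scaled = (state - low) / (high - low)         # normalize to [0, 1)
    width = 1.0 / tiles_per_tiling
    for t in range(num_tilings):
        offset = t * width / num_tilings          # each tiling is shifted slightly
        idx = int((scaled + offset) / width) % tiles_per_tiling
        x[t * tiles_per_tiling + idx] = 1.0       # one active tile per tiling
    return x

def semi_gradient_td0(env_reset, env_step, num_episodes=50, alpha=0.1, gamma=1.0):
    """Estimate v_hat(s, w) = w . x(s); env_step(s) is assumed to return (s_next, r, done)."""
    w = np.zeros(len(tile_features(0.0)))
    for _ in range(num_episodes):
        s, done = env_reset(), False
        while not done:
            s_next, r, done = env_step(s)
            x = tile_features(s)
            v_next = 0.0 if done else w @ tile_features(s_next)
            # Semi-gradient TD(0): move w toward the one-step bootstrap target.
            w += alpha * (r + gamma * v_next - w @ x) * x
            s = s_next
    return w
```

Because nearby states activate overlapping tiles, each update generalizes to neighbouring states, while the exhaustive partition in each tiling keeps distant states discriminated.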
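For the control module, here is a similarly hedged sketch of a single semi-gradient Q-learning update with linear action values; the feature vectors, step size, and discount are illustrative assumptions rather than the course's assignment code.

```python
import numpy as np

def q_learning_update(w, x, action, reward, x_next, done, alpha=0.1, gamma=0.99):
    """One semi-gradient Q-learning step.

    w      : (num_actions, num_features) weights, q_hat(s, a) = w[a] . x(s)
    x      : feature vector for the current state
    x_next : feature vector for the next state
    """
    q_sa = w[action] @ x
    target = reward if done else reward + gamma * np.max(w @ x_next)
    # The gradient of q_hat(s, a) with respect to w[action] is just x.
    w[action] += alpha * (target - q_sa) * x
    return w
```

Replacing the max over next-action values with an expectation under the current policy turns this into expected Sarsa, the other control method covered in this module.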
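Finally, for the policy gradient module, a minimal sketch of a one-step actor-critic update with a linear softmax policy and a linear critic; the step sizes, discount, and feature representation are again illustrative assumptions.

```python
import numpy as np

def softmax(prefs):
    z = prefs - np.max(prefs)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def actor_critic_step(theta, w, x, action, reward, x_next, done,
                      alpha_theta=0.01, alpha_w=0.1, gamma=0.99):
    """One-step actor-critic.

    theta : (num_actions, num_features) policy parameters (action preferences)
    w     : (num_features,) critic weights, v_hat(s) = w . x(s)
    """
    v_next = 0.0 if done else w @ x_next
    delta = reward + gamma * v_next - w @ x      # TD error, used as the advantage
    w += alpha_w * delta * x                     # critic: semi-gradient TD(0)
    pi = softmax(theta @ x)                      # current action probabilities
    grad_log_pi = -np.outer(pi, x)               # d log pi(a|s) / d theta for a linear softmax
    grad_log_pi[action] += x
    theta += alpha_theta * delta * grad_log_pi   # actor: follow the policy gradient
    return theta, w
```

For continuous actions, a Gaussian policy whose mean and standard deviation are parameterized functions of the state plays the analogous role to the softmax over discrete actions.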
Taught by
Martha White and Adam White
Reviews
4.7 rating, based on 22 Class Central reviews
4.8 rating at Coursera based on 820 ratings
-
The community support for the course, and in fact for the entire specialization, is practically nonexistent. More than 50% of the course content simply reiterates what has been presented in the reference book, and the explanations are even more shallow than those in the book.
-
It was an excellent course, and I enjoyed the learning a lot. The concepts were very helpful and useful. It would be better if the libraries used were open-sourced, or if students were taught to build them themselves, so that they could apply what they learned to their real-world problems.
-
As with the two prior classes in this specialization, I really appreciated this third class on prediction and control with function approximation. The lectures really help clarify the material that is presented in the book, and the programming assignments and quizzes challenge you to understand the equations and how the updates are calculated. I actually figured out the value of subtracting off the baseline from softmax in a very real way during the quiz when I was calculating values like e^(-44) and e^(-42) instead of "1" and e^(-2). Examples make it real.... Thanks so much to Martha and Adam for the effort they put into presenting the material in a clear way.
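As a small illustration of the point made in this review (the preference values here are made up for the example), subtracting the maximum preference before exponentiating leaves the softmax probabilities unchanged while keeping the exponentials in a numerically friendly range:

```python
import numpy as np

prefs = np.array([-44.0, -42.0])
naive = np.exp(prefs) / np.exp(prefs).sum()   # works here, but uses tiny values like e^(-44)
shifted = np.exp(prefs - prefs.max())         # the same preferences become e^(-2) and e^(0) = 1
stable = shifted / shifted.sum()
print(naive, stable)                          # both print the same probabilities
```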
-
The course is really fine. I suggest further improving the tile coding section; also, in assignment 4, the computation of delta is quite confusing.
-
I really enjoyed this class. A mind blowing tour of the main algorithms used for continuous online use cases. Very clearly articulated lectures. Big congrats to Martha and Adam!
-
BEFORE this course: I’ve done a number of Coursera courses before. Whilst they are good, the level of learning tends to be superficial. THIS course the third of four courses. These are the best courses I’ve taken and I now feel I have learnt a ve…
-
Very good course for learning Reinforcement Learning. The instructors are very good and the approach to teaching is best suited to understanding the subject properly.
-
Excellent instructors!
And the textbook Reinforcement Learning: An Introduction is a masterpiece in itself.
-
The course is very concise and to the point. It covers all the necessary aspects and tries its best to stay in sync with the reinforcement learning book by Sutton. The instructors are well experienced and know how to present an idea in an easy but elegant manner. The weekly quizzes and projects are challenging and will surely test the reader's understanding of the course. If you are new to reinforcement learning, I would really recommend this course along with its two preceding courses in the Reinforcement Learning Specialization by the University of Alberta. All in all, you will have a great time learning from this course.
-
Almost perfect, except for two minor objections:
1/ the learning content is quite unbalanced across the 4 weeks. The initial weeks of the course are well sized, whereas week #3 and week #4 feel a touch light. It feels like the instructors rushed to make the course available online and didn't have time to put as much content as they wished into the last weeks of the course.
2/ there are too many typos in some notebooks (specifically the week #3 notebook). It gives the impression it was made in a rush and nobody read over it again. Besides, there currently seems to be some issue with this assignment.
-
Definitely a course to take to learn the ropes of RL. For this course, it is critical to follow the math. 4 stars instead of 5 only because the math could be made easier to follow with some extra effort from the tutors. But if you're strong in math, you should be fine. The math itself is not difficult, but the notation is challenging and the terminology is a bit tough to keep in your head.
-
The instructors do a great job summarizing and being concise while following Sutton & Barto's RL introduction book.
The programming exercises, done via jupyter notebooks, really help to consolidate the theoretical knowledge acquired during the lessons and by reading the book.
Highly recommended course for anyone interested in getting a practical introduction to RL algorithms.
-
This course is very rich in both mathematical and practical concepts, and it actually provides you with powerful tools to understand and use Reinforcement Learning. So far, it is the most interesting course in this specialization. Lectures are very clear and they often explain more deeply some concepts you find in the textbook. Quizzes are challenging and well constructed.
-
I really enjoyed this third course of the specialisation.
The content and explanations are very helpful in building your intuition around quite complex concepts of RL with approximation. Quizzes and programming exercises are challenging enough to help you grasp the necessary concepts and get hands-on experience. Looking forward to the next course in the specialisation.
-
I really enjoyed taking this course and learned a lot. The Reinforcement Learning Specialization (https://www.coursera.org/specializations/reinforcement-learning) is a great introduction to reinforcement learning. This course is the third one in the specialization. All programming assignments are in Python.
-
This course covers a wide variety of topics and dives a good amount into each of them.
I wish the instructors would cover some of the topics and the math in a little more detail, and some of the content seems a tiny bit rushed, but otherwise, a brilliant course overall.
-
Really engaging and interesting course. Amazingly talented instructors and equally amazing content. A must for those who are learning reinforcement learning or those who want to expand their knowledge in the field.
-
Amazing course with amazing, intuitive visualizations. It is clear that the instructors have spent a lot of time and effort in trying to make the course as visually descriptive as possible.
-
Nice course that is part of a series with 3 more courses. All of them together cover a wide area of RL. Beginners in maths can easily follow. It's good to know some Python before you start (very basic level).
-
The lectures are really dense. You have to slow down and watch multiple times in conjunction with the book to really get it.
Very abstract but the coding assignments are helpful.