Reinforcement Learning  

(Part of COMPGI13/COMPM050, along with Kernel Methods by Arthur Gretton)

This page refers to the previous 2016 course at UCL.  

See the UCL Moodle for slides from the current course.

Meeting Time: Thursdays 9:15 am in Roberts 421 (@ University College London)


Joseph Modayil  Course website

Hado van Hasselt  Course website

Teaching Assistant: 

Zbigniew Wojna

Assignment: Easy21 (posted: Feb 26, due: April 6)

Reference Text: Reinforcement Learning (draft 2nd edition) by Sutton and Barto 

Reference Text: Algorithms for Reinforcement Learning by Csaba Szepesvari 

Lecture Slides (with thanks to Dave Silver)

Lecture 1: Introduction to Reinforcement Learning  (slides: Jan 14)
Lecture 2: Exploration and Exploitation in Bandits  (slides: Feb 10)
Lecture 3: Markov Decision Processes  (slides: Jan 27)
Lecture 4: Planning by Dynamic Programming  (revised slides: Feb 6)
Lecture 5: Model-Free Prediction  (slides: Feb 11)
Lecture 6: Model-Free Control  (slides: Feb 25)
Lecture 7: Value Function Approximation  (slides: Mar 11)
Lecture 8: Policy Gradient Methods  (slides: Mar 10)
Lecture 9: Integrating Learning and Planning  (slides: Mar 16)
Lecture 10: Case Study: RL in Classic Games  (regular lecture in class at 9:15 am; guest lecture by David Silver at 1 pm in Roberts G06)