## Deep Reinforcement Learning: Model Based Reinforcement Learning

**Published:**

**Published:**

In this post, we review the basic policy gradient algorithm for deep reinforcement learning and the actor-critic algorithm. Most of the content is derived from CS 285 at UC Berkeley.

**Published:**

In this post, we continue our discussion of mirror descent. We present a variant of mirror descent: lazy mirror descent, also known as Nesterov’s dual averaging.

**Published:**

In this post, we describe a new geometry-dependent algorithm that relies on a different set of assumptions. The algorithm is called conditional gradient descent, also known as Frank-Wolfe.

**Published:**

In this post, we introduce the mirror descent algorithm for solving convex optimization problems.

**Published:**

In this post, we continue our analysis of gradient descent. Unlike the previous post, we do not assume that the function is smooth. We assume only that the function is convex and Lipschitz continuous with some constant.

**Published:**

In this post, we review the most basic and most intuitive optimization method: gradient descent.

**Published:**

Recently, I found an interesting course taught by Prof. Yin Tat Lee at UW. The course is called "Theory of Optimization and Continuous Algorithms", and the lecture notes are available on the course homepage (uw-cse535-winter19). As a great fan of optimization theory and algorithm design, I plan to follow this course and write a series of blog posts to record my study of it. Most of the material in this series will follow the lecture notes of the course and an interesting optimization book, Convex Optimization: Algorithms and Complexity by Sebastien Bubeck. Since this is the first post about the course, I present the preliminaries of optimization theory and some basic knowledge about convex optimization, including some basic properties of convex functions.