# Sitemap

A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.

## Theory of Optimization: More on Mirror Descent

Published:

In this post, we will continue our discussion of mirror descent. We will present a variant of mirror descent: lazy mirror descent, also known as Nesterov’s dual averaging. Read more

## Theory of Optimization: Frank-Wolfe Algorithm

Published:

In this post, we describe a geometry-dependent algorithm that relies on a different set of assumptions. The algorithm is called conditional gradient descent, also known as Frank-Wolfe. Read more

## Theory of Optimization: Mirror Descent

Published:

In this post, we will introduce the Mirror Descent algorithm, which solves convex optimization problems. Read more

## Theory of Optimization: Projected (Sub)Gradient Descent

Published:

In this post, we will continue our analysis of gradient descent. In the previous lecture, we assumed that every function has an $L$-Lipschitz gradient. For general $L$-smooth functions, gradient descent finds a first-order $\epsilon$-critical point in $O(\frac{1}{\epsilon^2})$ iterations. When the function is convex, we show that $O(\frac{1}{\epsilon})$ iterations suffice to obtain a solution that differs from the optimum by at most $\epsilon$. When the function is both strongly convex and smooth, we show that the number of iterations can be reduced to $O(\mathrm{poly}(\log \frac{1}{\epsilon}))$. However, the previous post assumed that the function is smooth, which implies that it is differentiable everywhere. In this post, we will instead assume that the function is convex but not necessarily smooth. Moreover, while the previous post focused on the unconstrained case, this post will also introduce the analysis of constrained minimization. Read more
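To make the constrained, non-smooth setting concrete, here is a minimal sketch of projected subgradient descent. The function and helper names (`subgrad`, `project`, `projected_subgradient_descent`) are illustrative, not taken from the post; returning the average of the iterates is the standard choice for subgradient methods, since the last iterate need not converge.

```python
import numpy as np

def projected_subgradient_descent(subgrad, project, x0, step, n_iters):
    """Projected subgradient descent for constrained convex minimization.

    subgrad(x): returns a subgradient of f at x
    project(x): Euclidean projection onto the constraint set
    step:       constant step size h > 0
    Returns the average of the iterates.
    """
    x = x0.copy()
    iterates = []
    for _ in range(n_iters):
        g = subgrad(x)
        x = project(x - step * g)   # subgradient step, then project back
        iterates.append(x)
    return np.mean(iterates, axis=0)

# Toy example: minimize f(x) = |x - 2| over [0, 1]; the optimum is x = 1.
subgrad = lambda x: np.sign(x - 2.0)          # a subgradient of |x - 2|
project = lambda x: np.clip(x, 0.0, 1.0)      # projection onto [0, 1]
x_bar = projected_subgradient_descent(subgrad, project,
                                      np.array([0.0]), 0.1, 200)
```

The averaged iterate `x_bar` approaches the constrained optimum $x = 1$; $f$ here is convex but not differentiable at $x = 2$, which is exactly the regime the post studies.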

## Theory of Optimization: Gradient Descent

Published:

In this post, we will review the most basic and most intuitive optimization method – gradient descent.

The gradient descent algorithm works as follows: it requires an initial point $x^{(0)}\in\mathbb R^n$ and a step size $h > 0$, and then repeatedly executes the update $x^{(t+1)} = x^{(t)} - h\nabla f(x^{(t)})$ until $\|\nabla f(x^{(t)})\| \le \epsilon$. In the remainder of this post, we will assume that the gradient of $f$ is $L$-Lipschitz; we then say that $f$ has an $L$-Lipschitz gradient, or that $f$ is $L$-smooth. Read more
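The loop just described can be sketched in a few lines. This is a minimal illustration, not the post's code; the names (`gradient_descent`, `grad`) and the quadratic test function are my own choices.

```python
import numpy as np

def gradient_descent(grad, x0, h, eps=1e-6, max_iters=10_000):
    """Repeat x <- x - h * grad(x) until ||grad(x)|| <= eps."""
    x = x0.copy()
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) <= eps:   # stopping criterion from the post
            break
        x = x - h * g                  # gradient step with step size h
    return x

# Example: minimize f(x) = ||x||^2 / 2, whose gradient is x (so f is 1-smooth).
x_min = gradient_descent(grad=lambda x: x, x0=np.array([4.0, -3.0]), h=0.5)
```

With $h = 1/2 \le 1/L$ the iterates contract toward the minimizer at the origin, matching the smoothness assumption under which the post analyzes the method.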

## Theory of Optimization: Preliminaries and Basic Properties

Published:

Recently, I found an interesting course taught by Prof. Yin Tat Lee at UW. The course is called `Theory of Optimization and Continuous Algorithms’, and the lecture notes are available on the course homepage (uw-cse535-winter19). As a great fan of optimization theory and algorithm design, I plan to follow this course and write a series of blog posts to record my study of it. Most of the material in this series will follow the lecture notes of the course, along with an interesting optimization book, Convex Optimization: Algorithms and Complexity by Sebastien Bubeck. Since this is the first blog post about this course, I will present the preliminaries of optimization theory and some basic knowledge about convex optimization, including some basic properties of convex functions. Read more

## Portfolio item number 1

Short description of portfolio item number 1

## Portfolio item number 2

Short description of portfolio item number 2

## An FPTAS for Stochastic Unbounded Min-Knapsack Problem

Published in International Frontiers of Algorithmics Workshop (FAW 2019), 2019

## Stochastic One-Sided Full-Information Bandit

Published in The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2019), 2019

## Gradient Method for Continuous Influence Maximization with Budget-Saving Considerations

Published in AAAI Conference on Artificial Intelligence (AAAI 2020), 2019

## Online Second Price Auction with Semi-bandit Feedback Under the Non-Stationary Setting

Published in AAAI Conference on Artificial Intelligence (AAAI 2020), 2019

## Mildly Overparametrized Neural Nets can Memorize Training Data Efficiently

Published in Preprint, 2019

## Oral presentation at ECML/PKDD 2019

Published:

In this talk, I presented my work with Prof. Wei Chen @MSRA on our paper Stochastic One-Sided Full-Information Bandit. The paper can be downloaded here. Read more

## Oral presentation at AAAI 2020

Published:

In this talk, I presented my work with Prof. Wei Chen @MSRA on our paper Online Second Price Auction with Semi-bandit Feedback Under the Non-Stationary Setting. Because of the virus outbreak in China, I cannot attend the AAAI main conference in person, and I will give my oral presentation remotely. The paper can be downloaded here. The slides are available here. Read more

## Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

## Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.