Deep Reinforcement Learning: Model Based Reinforcement Learning
Overview of methods:
- Planning
  - Continuous actions
    - Shooting: iLQR, DDP
    - Collocation: direct collocation, STOMP
  - Discrete actions: heuristic search, MCTS
- Background planning
  - Simulate environment: DYNA, MVE, MBPO
  - Assist learning algorithm: policy backprop, SVG, Dreamer
Optimal Control and Planning
What if we knew the transition dynamics?
Often we do know the dynamics:
- Games (e.g., Go)
- Easily modeled systems (e.g., navigating a car)
- Simulated environments (e.g., simulated robots, video games)
Often we learn the dynamics:
- System identification - fit unknown parameters of a known model
- Learning - fit a general purpose model to observed transition data
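The two ways of learning the dynamics can be contrasted on a toy problem. The sketch below is illustrative only: the point-mass system, the constants `DT` and `TRUE_MASS`, and all function names are assumptions made for the example, not part of the original material. System identification fits the single unknown parameter of a known physical model, while the "learning" variant fits a generic linear model with no physics built in.

```python
import numpy as np

DT = 0.1          # integration step (assumed for this example)
TRUE_MASS = 2.0   # hidden parameter, used only to generate data

def true_step(state, action):
    """Ground-truth point-mass dynamics; state = (position, velocity)."""
    pos, vel = state
    return np.array([pos + DT * vel, vel + DT * action / TRUE_MASS])

def collect_transitions(n, rng):
    """Sample random states and actions and record the resulting next states."""
    states = rng.uniform(-1.0, 1.0, size=(n, 2))
    actions = rng.uniform(-1.0, 1.0, size=n)
    next_states = np.array([true_step(s, a) for s, a in zip(states, actions)])
    return states, actions, next_states

def identify_mass(states, actions, next_states):
    """System identification: the model form is known, only the mass is fit.

    From vel' = vel + DT * a / m, the velocity change is linear in 1/m,
    so a one-parameter least-squares fit recovers it.
    """
    dv = next_states[:, 1] - states[:, 1]
    x = DT * actions
    inv_mass = (x @ dv) / (x @ x)
    return 1.0 / inv_mass

def fit_linear_model(states, actions, next_states):
    """Learning: fit a generic model s' ~ [s, a, 1] @ W from transition data."""
    features = np.column_stack([states, actions, np.ones(len(actions))])
    weights, *_ = np.linalg.lstsq(features, next_states, rcond=None)
    return weights

rng = np.random.default_rng(0)
S, A, S_next = collect_transitions(200, rng)
mass_hat = identify_mass(S, A, S_next)   # estimate of the unknown mass
W = fit_linear_model(S, A, S_next)       # (4, 2) weight matrix
```

Because the toy system is linear and the data are noise-free, both fits should recover the true dynamics almost exactly. With noisy data or nonlinear dynamics, the general-purpose model would typically need more data or a more expressive function class, such as a neural network.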
Model-based reinforcement learning
Model-based reinforcement learning: learn the transition dynamics, then figure out how to choose actions.
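As a rough illustration of this two-step recipe, the sketch below first fits a dynamics model from random transitions, then chooses actions by random shooting, replanning at every step. The 1-D environment `env_step`, the quadratic cost, and all names and hyperparameters are hypothetical choices for the example. Random shooting is only one simple option for the planning step.

```python
import numpy as np

def env_step(state, action):
    """Hypothetical 1-D environment, unknown to the agent: x' = x + 0.5 * a."""
    return state + 0.5 * action

def learn_dynamics(rng, n=100):
    """Step 1: fit x' ~ w_x * x + w_a * a from random transitions."""
    x = rng.uniform(-2.0, 2.0, size=n)
    a = rng.uniform(-1.0, 1.0, size=n)
    x_next = env_step(x, a)
    w, *_ = np.linalg.lstsq(np.column_stack([x, a]), x_next, rcond=None)
    return lambda state, action: w[0] * state + w[1] * action

def plan_random_shooting(model, state, goal, rng, horizon=5, n_candidates=256):
    """Step 2: sample action sequences, roll them out in the learned model,
    and return the first action of the lowest-cost sequence."""
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    x = np.full(n_candidates, state, dtype=float)
    cost = np.zeros(n_candidates)
    for t in range(horizon):
        x = model(x, candidates[:, t])   # predicted next states
        cost += (x - goal) ** 2          # squared distance to the goal
    return candidates[np.argmin(cost), 0]

rng = np.random.default_rng(0)
model = learn_dynamics(rng)
state, goal = 0.0, 1.0
for _ in range(20):
    # Replan at every step and execute only the first planned action.
    action = plan_random_shooting(model, state, goal, rng)
    state = env_step(state, action)
```

After the loop, `state` should sit close to `goal`. Because the planner only ever queries the learned `model`, any error in that model carries directly into the chosen actions.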