10. Inverse RL (IRL)

1. The Reward Problem

Designing a perfect Reward Function is hard. (The "King Midas" problem). If you reward "Speed", the car crashes.

2. Learning from Humans

Instead of guessing the reward, learn it from expert demonstrations. Observing a human driver entails inferring what they are trying to optimize (Safety + Speed).

Report an issue or suggest an improvement

Mark as Completed

Previous Next Lesson