Introduction
“World modeling” has become a woefully overloaded term; people use it to describe maps of the world and engines. In this post, we use the current popular usage of the term: learned dynamics models that predict a future state given the current state and an action . Specifically, we will implement some tiny world models step-by-step to develop some intuition.
Why do we even care about world models? Having a good world model is an insanely powerful thing. Everything in robotics (or embodied intelligence, as the cool kids call it) depends on having a good understanding of the world around us; predicting future events well is one of the best ways to prove a good understanding.
we often need bespoke sensors to measure the state of the system at a given point in time. For example, the angle of a pendulum is typically measured using an encoder in an inverted pendulum balancing task, while a potentiometer and an encoder provide observability in a ball-and-beam system. The
Further Reading
- Yann LeCun’s definition of world model
- What Is a World Model? | NVIDIA Glossary
- World Models | Rohit Bandaru
Dataset
The robot dataset is generated using the robot_data.py script.
We
Variational Autoencoders
Autoencoders.
A cheap way to do efficient feature extraction on images is to use a convolutional architecture. Thus, we add some convolution blocks in the decoder and some de-convolution blocks in the decoder.