Domain Randomization: Brute-Forcing the Reality Gap
Using extreme variation in synthetic data to bridge the sim-to-real gap: theory, practice, and hard-won lessons.
The reality gap keeps haunting us. Models trained on beautiful synthetic data fail on ugly real sensor images. Domain randomization is our main weapon.
The Core Idea
If you can't make synthetic data perfectly match reality, make it match everything:
"If the model has seen enough variation in simulation, reality is just another variation."
Randomize:
- Textures (including unrealistic ones)
- Lighting (extreme conditions)
- Noise (more than real sensors)
- Geometry (within plausible bounds)
- Camera parameters (beyond spec)
The model learns features robust to all these variations, including the specific variation called "real data."
What We Randomize
Visual Properties
- Textures: Random colors, procedural patterns, photo textures
- Lighting: Direction, intensity, color, number of sources
- Shadows: Hard/soft, direction, intensity
- Backgrounds: Uniform colors to complex scenes
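In practice we draw these visual factors from per-factor distributions each time a frame is rendered. A minimal sketch of such a sampler is below; the specific ranges, parameter names, and the wide (deliberately unrealistic) intensity bounds are illustrative, not our tuned production values.

```python
import random

def sample_visual_params(rng=None):
    """Sample one draw of visual randomization parameters for a render.

    All ranges and category names here are illustrative assumptions."""
    rng = rng or random.Random()
    n_lights = rng.randint(1, 4)  # randomize the number of light sources
    return {
        "texture_mode": rng.choice(["flat_color", "procedural", "photo"]),
        "background": rng.choice(["uniform", "gradient", "scene"]),
        "lights": [
            {
                # Random direction vector (unnormalized is fine for a sampler sketch)
                "direction": [rng.uniform(-1, 1) for _ in range(3)],
                "intensity": rng.uniform(0.2, 3.0),        # wider than realistic, on purpose
                "color_temp_k": rng.uniform(2500, 10000),  # warm tungsten to very cool
                "shadow_softness": rng.uniform(0.0, 1.0),  # 0 = hard, 1 = fully soft
            }
            for _ in range(n_lights)
        ],
    }
```

A renderer-facing wrapper would map these abstract parameters onto the engine's actual material and light APIs.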
Sensor Properties
- Noise: Gaussian, Poisson, salt-and-pepper, beyond realistic levels
- Blur: Motion, defocus, varying amounts
- Exposure: Under and over-exposed
- Compression artifacts: JPEG-like degradation
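Unlike visual randomization, sensor randomization can be applied as a post-process on the rendered image. The sketch below shows exposure, Gaussian noise, and salt-and-pepper corruption on a float image; the magnitudes are illustrative, intentionally pushed past realistic sensor levels as described above.

```python
import numpy as np

def randomize_sensor(img, rng):
    """Apply sensor-style corruption to a float image in [0, 1].

    Magnitudes are illustrative assumptions, deliberately beyond
    realistic sensor behavior."""
    out = img.copy()
    # Exposure: random gain spanning under- to over-exposure
    out *= rng.uniform(0.4, 1.8)
    # Gaussian read noise, with sigma itself randomized per image
    out += rng.normal(0.0, rng.uniform(0.0, 0.15), size=out.shape)
    # Salt-and-pepper: flip a random fraction of pixels to 0 or 1
    frac = rng.uniform(0.0, 0.02)
    mask = rng.random(out.shape) < frac
    out[mask] = rng.integers(0, 2, size=int(mask.sum())).astype(out.dtype)
    return np.clip(out, 0.0, 1.0)
```

Blur and JPEG-style degradation would slot into the same function; they are omitted here to keep the sketch short.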
Geometric Properties
- Object scale: ±20% from nominal
- Object position: Jitter and displacement
- Viewpoint: Broader range than expected
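Geometric factors follow the same sampling pattern. A sketch, using the ±20% scale bound from the list above; the position and viewpoint ranges are illustrative assumptions.

```python
import random

def sample_geometry(rng):
    """Sample geometric jitter around the nominal pose.

    Scale matches the +/-20% bound above; other ranges are assumptions."""
    return {
        "scale": rng.uniform(0.8, 1.2),  # +/-20% from nominal
        # Small translation jitter in the image/world plane (units: meters)
        "offset_xy": (rng.uniform(-0.05, 0.05), rng.uniform(-0.05, 0.05)),
        # Viewpoint drawn broader than the range we expect at deployment
        "camera_yaw_deg": rng.uniform(-60, 60),
        "camera_pitch_deg": rng.uniform(-30, 45),
    }
```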
Hand-Specific
- Skin tone: Full spectrum including unrealistic colors
- Hand shape: Scale, finger lengths, joint angles
- Accessories: Rings, watches, sleeves
Randomization Magnitude
The key insight: more randomization isn't always better.
- Too little: the model overfits to the synthetic domain
- Too much: the model can't learn meaningful features and sees only noise
- Sweet spot: enough variation to be robust, not so much that the signal is lost
We tune randomization magnitude per-factor using validation on real data.
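One simple way to do this per-factor tuning is a greedy sweep: vary one factor's magnitude while holding the rest fixed, and keep whichever level gives the lowest error on real validation data. The sketch below assumes a hypothetical `train_and_eval` callback that trains with the given magnitudes and returns real-data validation error; both the function and the magnitude levels are assumptions for illustration.

```python
def tune_magnitudes(train_and_eval, factors, levels=(0.0, 0.5, 1.0, 1.5, 2.0)):
    """Greedy per-factor magnitude sweep.

    train_and_eval is a hypothetical callback: it trains a model with the
    given {factor: magnitude} dict and returns validation error on real data.
    """
    best = {f: 1.0 for f in factors}  # start every factor at nominal magnitude
    for f in factors:
        scores = {}
        for lvl in levels:
            trial = dict(best, **{f: lvl})  # vary one factor, hold the rest
            scores[lvl] = train_and_eval(trial)
        best[f] = min(scores, key=scores.get)  # lowest real-data error wins
    return best
```

A greedy sweep ignores interactions between factors, but each full sweep costs only `len(factors) * len(levels)` training runs, which is what makes it practical.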
Curriculum Strategy
Some factors are better introduced gradually:
- Start with realistic rendering
- Add noise factors
- Add geometric variation
- Add extreme texture randomization
This curriculum helps the model learn basic features before confronting extreme variation.
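The staged schedule above can be expressed as a simple lookup from training epoch to the set of active randomization factors. The stage length and factor names below are illustrative assumptions, not our actual schedule.

```python
def curriculum_stage(epoch, stage_len=10):
    """Return the randomization factors active at a given epoch.

    Stage boundaries (every 10 epochs here) and names are illustrative."""
    stages = [
        ["realistic_render"],
        ["realistic_render", "sensor_noise"],
        ["realistic_render", "sensor_noise", "geometric"],
        ["realistic_render", "sensor_noise", "geometric", "extreme_textures"],
    ]
    # Advance one stage per stage_len epochs, then stay at the final stage
    return stages[min(epoch // stage_len, len(stages) - 1)]
```

The training loop would query this each epoch and enable only the returned factors in the data pipeline.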
Results
Hand keypoint detection:
- Trained on synthetic only (no randomization): 45mm error on real data
- Trained on synthetic with randomization: 12mm error on real data
- Trained on real data: 8mm error on real data
Domain randomization closed about 89% of the gap (33mm of the 37mm separating synthetic-only from real-trained error). The remainder comes from:
- Distribution mismatch in poses
- Subtle artifacts not captured by randomization
- Implicit regularization effects present only in real data
Failure Modes
Domain randomization doesn't help with:
- Systematic biases in synthetic data (e.g., always-centered objects)
- Missing factors of variation (e.g., motion blur patterns we didn't model)
- Out-of-distribution inputs unlike anything in the randomization range
Continuous refinement is needed. As we discover failure cases, we add new randomization factors.