A Survey · 2026

A Survey on Predictive Safety
in Embodied AI

Kleyton da Costa¹,² · Adriano Koshiyama² · Dimitrios Kanoulas¹ · Philip Treleaven¹
¹ University College London · ² Holistic AI

Abstract

Embodied AI systems operate in the physical world, where failures cause irreversible harm, yet safety research remains siloed across robotics, autonomous driving, and foundation-model communities. We survey 236 papers (2017–2026) through a three-axis taxonomy of safety aspects, embodied system types, and model types — including the emerging class of world action models (WAM) that unify vision-language-action and world-model architectures. We organise the field under the lens of predictive safety: the principle that an agent should evaluate candidate actions by simulating their consequences before execution. Combining topic modelling, co-occurrence analysis, and a gap-score metric that quantifies under-explored areas relative to expected coverage, we map where effort concentrates and where blind spots persist. Manipulation under robustness, alignment, and constraint satisfaction is well studied, whereas robustness in simulation environments and multi-agent systems, control barrier methods for autonomous driving, and the transfer of safe-RL formulations from RL to vision-language-action models remain substantially neglected. Risk-weighted scoring elevates autonomous driving to the top of the priority list. We distil the results into a quantitative roadmap of research priorities to close the most consequential safety gaps in embodied AI.

Papers: 236 · Period: 2017–2026 · Axes: 10 × 7 × 5 · Sources: Scopus, IEEE, arXiv

The corpus, paper by paper

In the corpus plot, each point is one paper (236 papers, 2017–2026); colour encodes the dominant architecture, method, or paper type, and columns stack by year.

Categories are assigned by keyword matching against title and abstract, mirroring the paper's analysis pipeline. When a paper matches several, the dominant tag is selected by priority (architectures > methods > paper type).

1. Overview

The transition from digital to physical agency fundamentally changes the safety landscape. When a language model hallucinates, the consequence is correctable misinformation; when a robotic manipulator miscalculates a grasp force, the consequence may be a shattered object or an injured person. Physical actions are largely irreversible, unfold in open-ended environments that resist exhaustive specification, and affect individuals who never consented to interact with the system.

A unifying thread across the most promising approaches is predictive safety: an agent should evaluate the safety of candidate actions by simulating their consequences before committing to execution, rather than reacting to failures after they occur. Predictive safety draws on world models to generate imagined trajectories, on control barrier functions to certify that those trajectories remain within safe regions, and on risk assessment to rank actions by their expected harm under uncertainty.

A model capable of imagining the future is also capable of screening unsafe futures before they are enacted.
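The predict-then-screen loop can be made concrete with a minimal sketch. Everything here is hypothetical scaffolding (the `Candidate` container, the `in_safe_set` predicate, the harm scores); it shows the control flow the paradigm implies, not any specific system from the survey.

```python
# Predictive-safety action filter: roll each candidate action forward
# through a (learned) world model, reject any imagined trajectory that
# leaves the safe set, and rank survivors by expected harm.
from dataclasses import dataclass
from typing import Callable, Sequence

State = tuple[float, ...]

@dataclass
class Candidate:
    action: str
    trajectory: Sequence[State]  # imagined rollout from the world model
    expected_harm: float         # risk estimate under uncertainty

def screen_actions(
    candidates: Sequence[Candidate],
    in_safe_set: Callable[[State], bool],
) -> list[Candidate]:
    """Keep only candidates whose entire imagined trajectory stays in
    the safe set, ordered from least to most expected harm."""
    safe = [c for c in candidates
            if all(in_safe_set(s) for s in c.trajectory)]
    return sorted(safe, key=lambda c: c.expected_harm)
```

The key design choice is that unsafe futures are discarded before execution: the safety check runs on imagined states, so no real-world action is taken to discover a failure.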

2. Three-axis taxonomy

We cross-reference safety aspects, embodied system types, and model paradigms. The first two define the 10 × 7 cell matrix that drives the co-occurrence and gap-score analyses; the third is tagged per-paper to capture the underlying computational paradigm.

i. Safety aspects 10
  • Safe reinforcement learning
  • Robustness
  • Alignment
  • Constraint satisfaction
  • Human–robot safety
  • Collision avoidance
  • Formal verification
  • Control barrier functions
  • Risk assessment
  • Sim-to-real safety
ii. Embodied systems 7
  • Manipulation
  • Navigation
  • Autonomous driving
  • Drones / UAV
  • Humanoid & legged
  • Multi-agent systems
  • Simulation environments
iii. Model paradigms 5
  • Reinforcement learning
  • Embodied reasoning
  • Vision-language-action (VLA)
  • World models (WM)
  • World action models (WAM)

WAMs unify VLA and world-model architectures into a single network that jointly predicts future states and the actions that produce them — the locus where predictive safety can be enforced natively.

3. Where the field looks

Research effort is non-randomly concentrated across the 10 × 7 matrix (χ² = 68.9; permutation p = 0.021). Manipulation under robustness, constraint satisfaction, and alignment is densely studied; whole rows and columns elsewhere remain near-empty.

Figure 1. Safety aspect × embodied system co-occurrence (10 × 7). Each cell shows the number of papers tagged with both labels; intensity scales with count. Empty cells highlight under-explored intersections.
Figure 2. Embodied system × model paradigm (7 × 5). Reinforcement learning dominates across platforms; VLA concentrates in manipulation (n = 30); the WAM column is nascent across the board.

4. Risk-weighted research gaps

The gap-score metric G measures a cell's under-coverage relative to its expected count under the independence null. Weighting by real-world consequence (autonomous driving R = 0.95; manipulation R = 0.78; simulation R = 0.30) yields the risk-weighted gap RG = R × G, which sharpens the priority ordering toward high-consequence platforms. Autonomous driving sweeps four of the top five.
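A minimal sketch of the two metrics follows. The survey does not print G's exact normalisation, so this assumes the plausible form G = max(0, (E − O) / E), with E the expected cell count under independence and O the observed count; RG = R × G is consistent with the Robustness × Simulation example in Figure 3 (0.76 × 0.30 ≈ 0.23).

```python
# Gap score G and risk-weighted gap RG over the aspect x system matrix.
# Assumes all row and column margins of `counts` are nonzero.
import numpy as np

def gap_scores(counts: np.ndarray) -> np.ndarray:
    """G = max(0, (E - O) / E): 1 means a fully empty cell that the
    independence null expected to be populated; 0 means no under-coverage."""
    expected = np.outer(counts.sum(1), counts.sum(0)) / counts.sum()
    return np.clip((expected - counts) / expected, 0.0, None)

def risk_weighted(gaps: np.ndarray, risk: np.ndarray) -> np.ndarray:
    """RG = R * G, with one risk weight per embodied-system column
    (e.g. autonomous driving 0.95, simulation 0.30)."""
    return gaps * risk[np.newaxis, :]
```

Clipping at zero means over-covered cells (O > E) simply score 0 rather than producing a negative "gap", so the metric only flags under-exploration.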

Figure 3. Top-10 risk-weighted gaps. Filled bars show RG; the trailing tick marks the unweighted G for the same cell. The collapse of Robustness × Simulation (G = 0.76 → RG = 0.23) is a consequence of simulation's low risk weight.
 #  Aspect × System                               RG    Why it persists
 1  Control barrier × Autonomous driving          0.59  CBF synthesis assumes control-affine dynamics; multi-agent traffic violates this.
 2  Alignment × Autonomous driving                0.53  Alignment research developed around text LLMs has limited transfer to spatial reasoning.
 3  Human–robot safety × Autonomous driving       0.53  HRI safety frameworks were built for deterministic robots, not stochastic learned controllers.
 4  Constraint satisfaction × Autonomous driving  0.47  Safe-RL constraint formulations rarely transfer to large-scale, mixed-traffic deployment.
 5  Formal verification × Manipulation            0.41  Industrial robots scale into less-controlled settings without verification frameworks for learned controllers.

Sensitivity. Gap rankings are stable across denominator variants (ρ > 0.99), bootstrap resamples, and keyword-dropout tests (τ = 0.71, Jaccard = 0.84). A 1,000-trial perturbation of the risk weights (each scaled by a uniform ±20% multiplier) preserves at least four of the baseline top-five priorities in 99.8% of trials.
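The risk-weight perturbation check can be sketched as follows, under the assumed setup that each platform's weight is scaled by an independent uniform multiplier in [0.8, 1.2] and the resulting top five is compared against the baseline.

```python
# Stability of the top-5 risk-weighted ranking under +/-20% uniform
# perturbation of the per-platform risk weights.
import numpy as np

def top_k(scores: np.ndarray, k: int = 5) -> set[int]:
    """Indices (into the flattened matrix) of the k highest scores."""
    return set(np.argsort(scores.ravel())[::-1][:k])

def stability(gaps: np.ndarray, risk: np.ndarray,
              trials: int = 1000, seed: int = 0) -> float:
    """Fraction of trials in which at least four of the baseline
    top-five cells survive the risk-weight perturbation."""
    rng = np.random.default_rng(seed)
    baseline = top_k(gaps * risk)  # risk broadcasts over system columns
    keep = 0
    for _ in range(trials):
        perturbed = risk * rng.uniform(0.8, 1.2, size=risk.shape)
        keep += len(top_k(gaps * perturbed) & baseline) >= 4
    return keep / trials
```

When the leading gap scores dominate the rest by more than the ±20% wiggle room, the top five cannot be displaced and the stability fraction is 1.0, matching the near-total stability the survey reports.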

5. Future directions

Three frontiers stand out for advancing the predictive-safety paradigm:

  1. Causal world models. Current world models learn correlational dynamics that may not faithfully capture intervention outcomes. Causal structure is essential for counterfactual safety analysis — what would have happened if the robot had not braked?
  2. Predictive safety as a training objective. Rather than treating safety as a post-hoc filter, future architectures should optimise for predicted harm reduction during training itself, e.g. via safety-potential reward shaping.
  3. Sim-to-real safety transfer. Recent work shows that pessimistic domain randomisation can provably transfer safety guarantees — not just policies — from simulation to real hardware, opening a principled path to zero-shot safe deployment.

6. Cite this work

BibTeX
@article{dacosta2026predictivesafety,
  title   = {A Survey on Predictive Safety in Embodied AI},
  author  = {da Costa, Kleyton and Koshiyama, Adriano and
             Kanoulas, Dimitrios and Treleaven, Philip},
  year    = {2026},
  url     = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6562019}
}