Large language models (LLMs) excel in structured domains like math and coding, but struggle in the noisy, dynamic conditions of the real world, where agents must navigate complex environments and social interactions. Existing simulators, whether game-like, domain-specific, or socially focused, lack realism, open-endedness, or natural language interfaces, which limits their applicability to LLM-based agents. To address these gaps, we introduce SimWorld: a platform for developing and evaluating LLM and vision-language model (VLM) agents in rich, dynamic, and interactive settings.
Existing embodied simulators typically focus on indoor scenes. Urban simulators exist, but they either lack realism or are limited to autonomous driving. Critically, most of them do not let users flexibly generate new scenes or define new embodied AI tasks. In contrast, SimWorld provides a user-friendly Python API and diverse 3D assets that let users procedurally generate realistic, dynamic, city-scale environments to support a variety of embodied AI research tasks. The simulator can also be connected to LLMs to drive the behavior of different types of agents (humans, vehicles, and robots) in these environments.
Simulator | Open-ended / Procedural | Open-ended / Lang-Ctrl | Physical / Social Realism | Action / Abstraction | Action / Open-Vocab | Agent Type | Physics Engine |
---|---|---|---|---|---|---|---|
Minedojo | ✅ | ❌ | ⭐️ | Low-level | ❌ | Humanoid | Minecraft |
Mindcraft | ✅ | ❌ | ⭐️ | High-level | ❌ | Humanoid | Minecraft |
MetaUrban | ✅ | ❌ | ⭐️⭐️ | Low-level | ❌ | Vehicle | PyBullet |
EmbodiedCity | ❌ | ❌ | ⭐️⭐️⭐️ | Low-level | ❌ | Drone/Vehicle | Unreal Engine |
CARLA | ❌ | ❌ | ⭐️⭐️⭐️ | Low-level | ❌ | Vehicle | Unreal Engine |
GRUtopia | ❌ | ❌ | ⭐️⭐️ | Low-level | ❌ | Humanoid/Robot | Isaac Sim |
OmniGibson | ❌ | ❌ | ⭐️⭐️ | High-/Low-level | ❌ | Robot | Omniverse |
AI2-THOR | ✅ | ❌ | ⭐️⭐️ | Low-level | ❌ | Robot | Unity |
Habitat 3.0 | ❌ | ❌ | ⭐️⭐️ | Low-level | ❌ | Humanoid/Robot | Bullet |
SimWorld | ✅ | ✅ | ⭐️⭐️⭐️ | High-/Low-level | ✅ | Humanoid/Robot/Vehicle | Unreal Engine |
SimWorld supports diverse, dynamic simulated scenes. Building on procedural generation, we developed a set of tools that let users easily control the layout and properties of a scene, including building types, the road network, roadside objects, and vegetation.
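To make the idea of procedural scene generation concrete, here is a minimal, self-contained sketch. It is not the SimWorld API: every name in it (`generate_city`, `Block`, `BUILDING_TYPES`, `vegetation_density`, and so on) is hypothetical and chosen only for illustration. It builds a grid road network and assigns each city block a building type, a roadside object, and optionally a tree.

```python
import random
from dataclasses import dataclass, field

# Hypothetical categories for illustration; not taken from SimWorld.
BUILDING_TYPES = ("residential", "office", "shop")
ROADSIDE_OBJECTS = ("bench", "street_lamp", "mailbox")


@dataclass
class Block:
    row: int
    col: int
    building_type: str
    roadside_objects: list = field(default_factory=list)


def generate_city(rows, cols, vegetation_density=0.5, seed=None):
    """Return (roads, blocks) for a rows x cols grid of city blocks."""
    rng = random.Random(seed)  # seeded so a layout can be reproduced

    # Intersections sit on the grid corners; each road segment joins
    # two horizontally or vertically adjacent intersections.
    roads = []
    for r in range(rows + 1):
        for c in range(cols + 1):
            if c < cols:
                roads.append(((r, c), (r, c + 1)))
            if r < rows:
                roads.append(((r, c), (r + 1, c)))

    # Each block gets one roadside object, a tree with probability
    # vegetation_density, and a randomly chosen building type.
    blocks = []
    for r in range(rows):
        for c in range(cols):
            objects = [rng.choice(ROADSIDE_OBJECTS)]
            if rng.random() < vegetation_density:
                objects.append("tree")
            blocks.append(Block(r, c, rng.choice(BUILDING_TYPES), objects))
    return roads, blocks


roads, blocks = generate_city(rows=2, cols=3, seed=0)
print(len(roads), "road segments,", len(blocks), "blocks")
```

Exposing the random seed and a small number of parameters such as the vegetation density is one common way to keep generated layouts both varied and reproducible.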
SimWorld also supports controlling the environment on the fly: through the Python API, users can change the weather, the time of day, and the properties of objects and buildings while the simulation is running.
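As a rough illustration of runtime environment control, the sketch below keeps weather and time of day in a small state object that can be modified between simulation ticks. The class and method names (`EnvironmentState`, `set_weather`, `advance_time`) and the list of weather options are assumptions made for this example, not SimWorld's actual interface.

```python
from dataclasses import dataclass

# Assumed weather options, for illustration only.
WEATHER_OPTIONS = {"clear", "rain", "fog", "snow"}


@dataclass
class EnvironmentState:
    weather: str = "clear"
    time_of_day: float = 12.0  # hours, kept within [0, 24)

    def set_weather(self, weather):
        if weather not in WEATHER_OPTIONS:
            raise ValueError(f"unknown weather: {weather!r}")
        self.weather = weather

    def advance_time(self, hours):
        # Wrap around midnight so the clock stays within [0, 24).
        self.time_of_day = (self.time_of_day + hours) % 24.0

    def is_night(self):
        return self.time_of_day < 6.0 or self.time_of_day >= 18.0


env = EnvironmentState()
for tick in range(4):
    if tick == 2:
        env.set_weather("rain")  # change the weather mid-run
    env.advance_time(0.5)        # each tick advances the clock by 30 minutes
print(env.weather, env.time_of_day)
```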
SimWorld supports a range of agent actions, including navigation, social behavior, object manipulation, and interaction with the environment. Users can also control agents' actions through the Python API.
SimWorld can also simulate embodied reasoning and physical interaction. In one such task, agents must reach a designated goal while avoiding both static obstacles (e.g., trees, benches) and dynamic ones (e.g., pedestrians). This requires multimodal perception and real-time decision-making under uncertainty and unexpected events.
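The structure of such a task can be sketched on a toy grid world. The example below is a simplified stand-in, not SimWorld code: it swaps the 3D scene for a 2D grid, replaces perception with direct access to obstacle positions, and uses breadth-first search with replanning at every tick in place of an LLM-driven policy. All names (`shortest_path`, `navigate`, `pedestrian_positions`) are hypothetical.

```python
from collections import deque


def shortest_path(start, goal, blocked, width, height):
    """Breadth-first search on a 4-connected grid; returns a cell list or None."""
    queue = deque([start])
    parent = {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in blocked and nxt not in parent):
                parent[nxt] = cell
                queue.append(nxt)
    return None


def navigate(start, goal, static_obstacles, pedestrian_positions,
             width, height, max_steps=100):
    """Move one cell per tick, replanning around the pedestrians seen that tick.

    pedestrian_positions maps a tick index to the set of occupied cells.
    Returns the visited cells, or None if the goal is not reached in time.
    """
    position, trajectory = start, [start]
    for tick in range(max_steps):
        if position == goal:
            return trajectory
        blocked = set(static_obstacles) | set(pedestrian_positions(tick))
        path = shortest_path(position, goal, blocked, width, height)
        if path is not None and len(path) > 1:
            position = path[1]  # take the first step of the fresh plan
        # If no path exists right now, wait in place for this tick.
        trajectory.append(position)
    return trajectory if position == goal else None


# A wall (e.g., a row of benches) with a single gap at (2, 4).
wall = {(2, y) for y in range(4)}


def pedestrians(tick):
    # A pedestrian stands in the gap for the first 8 ticks, then leaves.
    return {(2, 4)} if tick < 8 else set()


route = navigate((0, 0), (4, 0), wall, pedestrians, width=5, height=5)
print(route)
```

Replanning from scratch each tick is wasteful but keeps the sketch short; it reflects the point made above that the agent has to react to obstacles whose positions are only known at decision time.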
@misc{simworld,
title={SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds},
author={Zhuang*, Yan and Ren*, Jiawei and Ye*, Xiaokang and He, Xuhong and Gao, Zijun and Wu, Ryan and Dogra, Mrinaal and Zhang, Cassie and Kim, Kai and Wolfinger, Bertt and Ma, Ziqiao and Shu$^{\dagger}$, Tianmin and Hu$^{\dagger}$, Zhiting and Qin$^{\dagger}$, Lianhui},
note={Preprint (under review)},
year={2025}
}