SimWorld: A World Simulator for
Scaling Photorealistic Multi-Agent Interactions

We introduce SimWorld, a novel Unreal Engine-based simulator designed to generate unlimited, diverse urban environments for embodied AI tasks.

Existing embodied simulators typically focus on indoor scenes. Urban simulators do exist, but they either lack realism or are limited to autonomous driving. Critically, most of them do not allow users to flexibly generate new scenes or define new embodied AI tasks. In contrast, SimWorld provides a user-friendly Python API and diverse 3D assets that enable users to procedurally generate realistic, dynamic, city-scale environments supporting a wide range of embodied AI research tasks. Our simulator can also be connected with large language models (LLMs) to drive the behavior of different types of agents (humans, vehicles, and robots) in the environments. The simulator features infinite procedural scene generation, multi-agent interactions, and dynamic environmental elements, with photorealistic rendering and physics simulation powered by Unreal Engine 5.

Infinite World Generation

SimWorld generates infinite worlds with diverse shapes, scales, and distributions using procedural generation. Users can easily control and edit the world layout through both code and a user-friendly UI. The generated layouts serve as a foundation for sampling corresponding assets in Unreal Engine, creating interactive and immersive scenes.
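The idea of seeded, code-controlled layout generation can be illustrated with a minimal sketch. This is not SimWorld's actual API; the function and field names below are hypothetical, and a real layout would also encode the road network and asset references for Unreal Engine.

```python
import random

def generate_layout(seed, rows=4, cols=4,
                    building_types=("residential", "office", "shop")):
    """Generate a reproducible grid of city blocks from a random seed.

    Illustrative sketch only: the real generator would also produce
    road networks, roadside objects, and asset identifiers.
    """
    rng = random.Random(seed)  # seeding makes the world reproducible
    layout = []
    for r in range(rows):
        row = []
        for c in range(cols):
            row.append({
                "block": (r, c),
                "building": rng.choice(building_types),
                "height_m": rng.randint(10, 120),  # sampled building height
            })
        layout.append(row)
    return layout

# The same seed always yields the same world layout, while different
# seeds give an effectively unlimited supply of distinct cities.
assert generate_layout(seed=42) == generate_layout(seed=42)
```

Determinism from a single seed is what lets a generated layout be edited through code or a UI and then re-instantiated as the same interactive scene.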

Various and Dynamic Simulated Scenes

SimWorld supports varied and dynamic simulated scenes. Building on the procedural generation, we developed a set of tools that let users easily control the layout and properties of simulated scenes, including building types, the road network, roadside objects, vegetation, and more.

SimWorld also supports on-the-fly environmental control: users can change the weather, the time of day, and the properties of objects and buildings at runtime through the Python API.
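Runtime control of this kind might look like the following sketch. The class and method names (`EnvironmentController`, `set_weather`, `set_time_of_day`) are assumptions for illustration, not SimWorld's documented API.

```python
class EnvironmentController:
    """Hypothetical handle for changing environment state mid-simulation."""

    WEATHERS = {"clear", "rain", "snow", "fog"}

    def __init__(self):
        self.weather = "clear"
        self.time_of_day = 12.0  # hours, in [0, 24)

    def set_weather(self, weather):
        if weather not in self.WEATHERS:
            raise ValueError(f"unknown weather: {weather}")
        self.weather = weather

    def set_time_of_day(self, hour):
        self.time_of_day = hour % 24.0  # wraps past midnight

env = EnvironmentController()
env.set_weather("rain")        # takes effect without restarting the sim
env.set_time_of_day(26.5)      # wraps to 2.5, i.e. 2:30 AM
```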

Traffic System

SimWorld features a dynamic traffic system that simulates real-world traffic flow and pedestrian behavior to make the virtual environments more immersive and interactive. Users can control traffic lights, speed limits, and pedestrian movements. The traffic agents are powered by a custom-designed system, which supports both rule-based and LLM-driven control mechanisms.
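A rule-based traffic agent can be as simple as a finite-state machine. The sketch below shows one plausible shape for a rule-based traffic-light controller; the phase names and durations are illustrative assumptions, not values from SimWorld.

```python
# Phase cycle for one intersection: (state, duration in seconds).
PHASES = [("green", 30), ("yellow", 5), ("red", 35)]

class TrafficLight:
    """Rule-based traffic light cycling green -> yellow -> red."""

    def __init__(self):
        self.index = 0    # current phase index into PHASES
        self.elapsed = 0  # seconds spent in the current phase

    @property
    def state(self):
        return PHASES[self.index][0]

    def tick(self, dt=1):
        """Advance the light by dt seconds, rolling over phases as needed."""
        self.elapsed += dt
        while self.elapsed >= PHASES[self.index][1]:
            self.elapsed -= PHASES[self.index][1]
            self.index = (self.index + 1) % len(PHASES)

light = TrafficLight()
for _ in range(32):
    light.tick()
# 30 s of green have elapsed, so the light is 2 s into its yellow phase.
```

An LLM-driven controller would replace the fixed phase table with a policy queried at decision points, while exposing the same `tick`-style interface to the simulation loop.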

Various Agent Actions

SimWorld supports various agent actions, including navigation, social behavior, object manipulation, and interaction with the environment. Users can also control agents' actions directly through the Python API.

Language-Driven Agent Navigation

SimWorld supports language-driven agent navigation. Our system parses natural language instructions and converts them into the corresponding agent actions. Building on this, the system can power a large variety of embodied AI tasks, including navigation, object manipulation, and interaction with the environment.
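As a minimal stand-in for the instruction-to-action step (which in SimWorld can be handled by an LLM), here is a rule-based parser mapping short commands to a hypothetical action schema. The patterns and the `{"action", "target"}` dictionary format are illustrative assumptions.

```python
import re

# Keyword patterns standing in for LLM-based instruction parsing.
ACTION_PATTERNS = [
    (r"go to (?:the )?(?P<target>\w+)", "navigate"),
    (r"pick up (?:the )?(?P<target>\w+)", "pick_up"),
    (r"talk to (?:the )?(?P<target>\w+)", "interact"),
]

def parse_instruction(text):
    """Map a natural-language instruction to an (action, target) record."""
    text = text.lower().strip()
    for pattern, action in ACTION_PATTERNS:
        m = re.search(pattern, text)
        if m:
            return {"action": action, "target": m.group("target")}
    return {"action": "unknown", "target": None}

print(parse_instruction("Go to the bakery"))
# {'action': 'navigate', 'target': 'bakery'}
```

An LLM replaces the pattern table with free-form understanding, but the output contract is the same: a structured action the simulator can execute.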

Multi-Agent Interaction

SimWorld also supports multi-agent interaction, capable of managing 50+ agents performing actions simultaneously within the simulated environment on a single machine. This scalability enables the system to handle large-scale multi-agent AI tasks, such as collaboration and competition. Additionally, it has the potential to support complex tasks like economic simulation, city modeling, and other advanced scenarios.
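Stepping many agents within one simulation tick can be sketched as a fixed-timestep update loop. The `Agent` class and random-walk policy below are placeholders for SimWorld's rule-based or LLM-driven behaviors, not its actual implementation.

```python
import random

class Agent:
    """Placeholder agent with a random-walk policy."""

    def __init__(self, agent_id, rng):
        self.agent_id = agent_id
        self.pos = [0.0, 0.0]
        self.rng = rng

    def step(self, dt):
        # A real policy (rule-based or LLM-driven) would go here.
        self.pos[0] += self.rng.uniform(-1, 1) * dt
        self.pos[1] += self.rng.uniform(-1, 1) * dt

def simulate(n_agents=60, steps=100, dt=0.1, seed=0):
    """Run n_agents for a number of fixed-size timesteps."""
    rng = random.Random(seed)
    agents = [Agent(i, rng) for i in range(n_agents)]
    for _ in range(steps):
        for agent in agents:  # every agent acts within the same tick
            agent.step(dt)
    return agents

agents = simulate()  # 60 agents, comfortably above the 50-agent mark
```

Keeping all agents inside one tick loop is what makes simultaneous action, and therefore collaboration and competition, well-defined at this scale.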

SimWorld Team

Yan Zhuang*, Jiawei Ren*, Xiaokang Ye*

Xuhong He, Zijun Gao, Ryan Wu, Mrinaal Dogra, Cassie Zhang, Kai Kim, Bertt Wolfinger

Ziqiao Ma, Tianmin Shu†, Zhiting Hu†, Lianhui Qin†

Maitrix.org

UCSD

JHU

@misc{simworld,
    title={SimWorld: A World Simulator for Scaling Photorealistic Multi-Agent Interactions},
    author={Zhuang*, Yan and Ren*, Jiawei and Ye*, Xiaokang and He, Xuhong and Gao, Zijun and Wu, Ryan and Dogra, Mrinaal and Zhang, Cassie and Kim, Kai and Wolfinger, Bertt and Ma, Ziqiao and Shu$^{\dagger}$, Tianmin and Hu$^{\dagger}$, Zhiting and Qin$^{\dagger}$, Lianhui},
    booktitle={Demonstration Track, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2025}
}