LangSuit⋅E is a systematic and simulation-free testbed for evaluating embodied capabilities of large language models (LLMs) across different tasks in embodied textual worlds. The highlighted features include:
- Simulation-Free Embodied Environments: The testbed provides a general simulation-free textual world that supports most embodied tasks, including navigation, manipulation, and communication. The environment is built on Gymnasium and follows its design patterns.
- Embodied Observations and Actions: All agents' observations are designed to be embodied, with customizable `max_view_distance`, `max_manipulate_distance`, `focal_length`, etc.
- Customizable Embodied Agents: The agents in LangSuit⋅E are fully customizable w.r.t. their action spaces and communicative capabilities, i.e., one can easily adapt the communication and acting strategy from one task to another.
- Multi-agent Cooperation: The testbed supports planning, acting, and communication among multiple agents, where each agent can be customized to have a different configuration.
- Human-agent Communication: Besides communication between agents, the testbed supports communication and cooperation between humans and agents.
- Full support for the LangChain library: The LangSuit⋅E testbed supports API language models, open-source language models, tool usage, Chain-of-Thought (CoT) strategies, etc.
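To make the embodied-observation idea concrete, here is a minimal sketch of a text world whose observations and actions are limited by `max_view_distance` and `max_manipulate_distance`. It is not LangSuit⋅E's actual API: the class names `AgentConfig` and `ToyTextWorld`, the action strings, and all numeric values are hypothetical. Only the `reset()`/`step()` return shapes follow the Gymnasium convention, and the sketch deliberately avoids importing Gymnasium so that it runs on the standard library alone.

```python
import math
from dataclasses import dataclass


@dataclass
class AgentConfig:
    # Parameter names come from the feature list above; the values are made up.
    max_view_distance: float = 2.0
    max_manipulate_distance: float = 1.0


class ToyTextWorld:
    """Hypothetical text world with Gymnasium-style reset()/step() signatures."""

    MOVES = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}

    def __init__(self, objects, config):
        self._initial_objects = dict(objects)  # object name -> (x, y)
        self.config = config
        self.reset()

    def _distance(self, name):
        return math.dist(self.agent_pos, self.objects[name])

    def _observe(self):
        # Only objects within max_view_distance appear in the text observation.
        visible = sorted(
            name for name in self.objects
            if self._distance(name) <= self.config.max_view_distance
        )
        if not visible:
            return "You see nothing nearby."
        return "You see: " + ", ".join(visible) + "."

    def reset(self, seed=None, options=None):
        self.objects = dict(self._initial_objects)
        self.agent_pos = (0.0, 0.0)
        self.inventory = []
        return self._observe(), {}

    def step(self, action):
        verb, _, arg = action.partition(" ")
        reward = 0.0
        if verb == "move":
            dx, dy = self.MOVES.get(arg, (0, 0))
            self.agent_pos = (self.agent_pos[0] + dx, self.agent_pos[1] + dy)
        elif verb == "pick" and arg in self.objects:
            # Picking up succeeds only within max_manipulate_distance.
            if self._distance(arg) <= self.config.max_manipulate_distance:
                del self.objects[arg]
                self.inventory.append(arg)
                reward = 1.0
        terminated = not self.objects  # episode ends when nothing is left
        info = {"inventory": list(self.inventory)}
        return self._observe(), reward, terminated, False, info


world = ToyTextWorld({"apple": (1.0, 0.0), "mug": (3.0, 0.0)}, AgentConfig())
observation, info = world.reset()
```

In this toy setup the agent initially sees only the apple, because the mug lies beyond the view distance. Moving east brings the mug into view, and each object can only be picked up once it is within the manipulation distance.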
We form a benchmark by adapting existing annotations from simulated embodied engines, a by-product benefit of pursuing a general textual embodied world. The table below showcases 6 representative embodied tasks, which vary in the number of rooms, the number of agents, and the agents' action spaces (whether they can communicate with each other or ask humans).
Task | Simulator | # of Scenes | # of Tasks | # of Actions | Multi-Room | Multi-Agent | Communicative |
---|---|---|---|---|---|---|---|
BabyAI | Mini Grid | 105 | 500 | 6 | ✓ | ✗ | ✗ |
Rearrange | AI2Thor | 120 | 6,000 | 8 | ✗ | ✗ | ✗ |
IQA | AI2Thor | 30 | 1,920 | 5 | ✗ | ✗ | ✓ |
ALFred | AI2Thor | 120 | 8,055 | 12 | ✗ | ✗ | ✗ |
TEACh | AI2Thor | 120 | 3,215 | 13 | ✗ | ✓ | ✓ |
CWAH | Virtual Home | 2 | 50 | 6 | ✓ | ✓ | ✓ |
@misc{langsuite2023,
author = {Zilong Zheng and Mengmeng Wang and Zixia Jia and Baichen Tong and Jiasheng Gu and Song-Chun Zhu},
title = {LangSuit⋅E: Controlling, Planning, and Interacting with Large Language Models in Embodied Text Environments},
year = {2023},
publisher = {GitHub},
url = {https://github.com/bigai-nlco/langsuite}
}
For any questions and issues, please contact [email protected].