🏘️ LangSuit⋅E

Controlling, Planning, and Interacting with Large Language Models in Embodied Text Environments

License: MIT Documentation

LangSuit⋅E is a systematic and simulation-free testbed for evaluating embodied capabilities of large language models (LLMs) across different tasks in embodied textual worlds. The highlighted features include:

  • Simulation-Free Embodied Environments: The testbed provides a general simulation-free textual world that supports most embodied tasks, including navigation, manipulation, and communication. The environment is based on Gymnasium and inherits its design patterns.
  • Embodied Observations and Actions: All agents' observations are designed to be embodied, with customizable max_view_distance, max_manipulate_distance, focal_length, etc.
  • Customizable Embodied Agents: The agents in LangSuit⋅E are fully customizable with respect to their action spaces and communicative capabilities, i.e., one can easily adapt the communication and acting strategy from one task to another.
  • Multi-agent Cooperation: The testbed supports planning, acting, and communication among multiple agents, where each agent can be customized to have a different configuration.
  • Human-agent Communication: Besides communication between agents, the testbed supports communication and cooperation between humans and agents.
  • Full support for the LangChain library: The LangSuit⋅E testbed supports API language models, open-source language models, tool use, Chain-of-Thought (CoT) strategies, etc.
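
To make the Gymnasium-style design and the distance-limited observations above more concrete, here is a minimal, self-contained sketch in plain Python. It is not the LangSuit⋅E API: the class names `AgentConfig` and `ToyTextWorld`, the action strings, and the 1-D corridor layout are all hypothetical and chosen only for illustration. Only the parameter names `max_view_distance` and `max_manipulate_distance` and the `reset`/`step` return convention (borrowed from Gymnasium) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class AgentConfig:
    # Hypothetical config object; field names follow the feature list above.
    max_view_distance: int = 2
    max_manipulate_distance: int = 1


class ToyTextWorld:
    """Hypothetical 1-D corridor with named objects at integer positions."""

    def __init__(self, objects, config, length=10):
        self.initial_objects = dict(objects)
        self.config = config
        self.length = length

    def _observe(self):
        # Embodied observation: only objects within viewing range are described.
        visible = sorted(
            name
            for name, pos in self.objects.items()
            if abs(pos - self.agent_pos) <= self.config.max_view_distance
        )
        seen = ", ".join(visible) if visible else "nothing"
        return f"You are at position {self.agent_pos}. You see: {seen}."

    def reset(self):
        # Gymnasium convention: reset returns (observation, info).
        self.objects = dict(self.initial_objects)
        self.agent_pos = 0
        self.inventory = []
        return self._observe(), {}

    def step(self, action):
        # Gymnasium convention: step returns
        # (observation, reward, terminated, truncated, info).
        verb, _, arg = action.partition(" ")
        reward, terminated = 0.0, False
        if verb == "move_ahead":
            self.agent_pos = min(self.agent_pos + 1, self.length - 1)
        elif verb == "move_back":
            self.agent_pos = max(self.agent_pos - 1, 0)
        elif verb == "pick_up" and arg in self.objects:
            distance = abs(self.objects[arg] - self.agent_pos)
            # Manipulation only succeeds within reach.
            if distance <= self.config.max_manipulate_distance:
                del self.objects[arg]
                self.inventory.append(arg)
                reward, terminated = 1.0, True
        return self._observe(), reward, terminated, False, {}


env = ToyTextWorld({"apple": 3, "mug": 7}, AgentConfig())
obs, info = env.reset()
print(obs)
obs, reward, terminated, truncated, info = env.step("move_ahead")
print(obs)
```

In this sketch the observation is a plain string, so it can be passed to a language model as-is, and the same world can be given to agents with different `AgentConfig` values to mimic per-agent customization.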

📦 Benchmark and Dataset

We form a benchmark by adapting existing annotations from simulated embodied engines, a by-product of pursuing a general textual embodied world. The table below showcases 6 representative embodied tasks, which vary in the number of rooms, the number of agents, and the agents' action spaces (whether they can communicate with each other or ask humans).

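| Task | Simulator | # of Scenes | # of Tasks | # of Actions | Multi-Room | Multi-Agent | Communicative |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BabyAI | Mini Grid | 105 | 500 | 6 | | | |
| Rearrange | AI2Thor | 120 | 6,000 | 8 | | | |
| IQA | AI2Thor | 30 | 1,920 | 5 | | | |
| ALFred | AI2Thor | 120 | 8,055 | 12 | | | |
| TEACh | AI2Thor | 120 | 3,215 | 13 | | | |
| CWAH | Virtual Home | 2 | 50 | 6 | | | |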

Citation

@misc{langsuite2023,
  author    = {Zilong Zheng and Mengmeng Wang and Zixia Jia and Baichen Tong and Jiasheng Gu and Song-Chun Zhu},
  title     = {LangSuit⋅E: Controlling, Planning, and Interacting with Large Language Models in Embodied Text Environments},
  year      = {2023},
  publisher = {GitHub},
  url       = {https://github.com/bigai-nlco/langsuite}
}

For any questions and issues, please contact [email protected].