From 7bf679f88d7009893db449d202247652b7f31b3d Mon Sep 17 00:00:00 2001
From: Frank Xu
Date: Tue, 24 Dec 2024 23:26:07 -0500
Subject: [PATCH] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index b6b774e..10be784 100644
--- a/README.md
+++ b/README.md
@@ -27,6 +27,7 @@
 > This repository hosts the *canonical* implementation of WebArena to reproduce the results reported in the paper. The web navigation infrastructure has been significantly enhanced by [AgentLab](https://github.com/ServiceNow/AgentLab/), introducing several key features: (1) support for parallel experiments using [BrowserGym](https://github.com/ServiceNow/BrowserGym), (2) integration of popular web navigation benchmarks (e.g., VisualWebArena) within a unified framework, (3) unified leaderboard reporting, and (4) improved handling of environment edge cases. We strongly recommend using this framework for your experiments.
 
 ## News
+* [12/20/2024] Check out our new benchmark on even more consequential tasks, including terminal use and coding, [TheAgentCompany](https://the-agent-company.com/).
 * [12/21/2023] We release the recording of trajectories performed by human annotators on ~170 tasks. Check out the [resource page](./resources/README.md#12212023-human-trajectories) for more details.
 * [11/3/2023] Multiple features!
   * Uploaded newest [execution trajectories](./resources/README.md#1132023-execution-traces-from-our-experiments-v2)