errors on webarena #210
2024-10-24 23:32:05,605 - 1605292 - browsergym.experiments.loop - INFO - Running experiment GenericAgent-gpt-4o-mini-2024-07-18_on_webarena.439_26 in:
/home/toolkit/agentlab_results/2024-10-24_21-02-33_GenericAgent-gpt-4o-mini-2024-07-18_on_webarena/2024-10-24_21-02-36_GenericAgent-gpt-4o-mini-2024-07-18_on_webarena.439_26
2024-10-24 23:32:05,608 - 1605292 - httpx - DEBUG - load_ssl_context verify=True cert=None trust_env=True http2=False
2024-10-24 23:32:05,611 - 1605292 - httpx - DEBUG - load_verify_locations cafile='/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/certifi/cacert.pem'
2024-10-24 23:32:05,647 - 1605292 - browsergym.experiments.loop - DEBUG - Agent created.
2024-10-24 23:32:05,649 - 1605292 - browsergym.experiments.loop - DEBUG - Environment created.
...
...
...
action:
click('1970') # Click on the section to view the detailed list of items in the cart
2024-10-24 23:33:31,371 - 1605292 - browsergym.experiments.loop - DEBUG - Chat info sent.
2024-10-24 23:33:31,371 - 1605292 - browsergym.experiments.loop - DEBUG - Sending action to environment.
2024-10-24 23:33:31,371 - 1605292 - browsergym.core.env - DEBUG - Executing action
2024-10-24 23:33:31,804 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/cart/'>
2024-10-24 23:33:31,806 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/cart/'>
2024-10-24 23:33:31,807 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/cart/'>
2024-10-24 23:33:31,809 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/cart/'>
2024-10-24 23:33:31,810 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/cart/'>
2024-10-24 23:33:32,380 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/'>
2024-10-24 23:33:32,397 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/'>
2024-10-24 23:33:32,400 - 1605292 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/'>
2024-10-24 23:33:34,820 - 1605292 - browsergym.core.env - DEBUG - Action executed
2024-10-24 23:33:35,372 - 1605292 - browsergym.core.env - DEBUG - Active page checked
2024-10-24 23:33:35,372 - 1605292 - browsergym.core.env - DEBUG - User message done
2024-10-24 23:33:35,372 - 1605292 - browsergym.core.env - DEBUG - Initiating task validation
2024-10-24 23:33:35,374 - 1605292 - urllib3.connectionpool - DEBUG - Starting new HTTP connection (1): snowwebarena.eastus.cloudapp.azure.com:8082
2024-10-24 23:33:45,409 - 1605292 - browsergym.experiments.loop - WARNING - Exception uncaught by agent or environment in task webarena.439.
ConnectionError:
HTTPConnectionPool(host='snowwebarena.eastus.cloudapp.azure.com', port=8082): Max retries exceeded with url: /rest/default/V1/integration/admin/token (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fa6fdf79130>: Failed to resolve 'snowwebarena.eastus.cloudapp.azure.com' ([Errno -3] Temporary failure in name resolution)"))
Traceback (most recent call last):
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/socket.py", line 976, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/connectionpool.py", line 495, in _make_request
conn.request(
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/connection.py", line 441, in request
self.endheaders()
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/http/client.py", line 1331, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/http/client.py", line 1091, in _send_output
self.send(msg)
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/http/client.py", line 1035, in send
self.connect()
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/connection.py", line 279, in connect
self.sock = self._new_conn()
^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/connection.py", line 206, in _new_conn
raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPConnection object at 0x7fa6fdf79130>: Failed to resolve 'snowwebarena.eastus.cloudapp.azure.com' ([Errno -3] Temporary failure in name resolution)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='snowwebarena.eastus.cloudapp.azure.com', port=8082): Max retries exceeded with url: /rest/default/V1/integration/admin/token (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fa6fdf79130>: Failed to resolve 'snowwebarena.eastus.cloudapp.azure.com' ([Errno -3] Temporary failure in name resolution)"))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/toolkit/dev/BrowserGym/browsergym/experiments/src/browsergym/experiments/loop.py", line 246, in run
step_info.from_step(env, action, obs_preprocessor=agent.obs_preprocessor)
File "/home/toolkit/dev/BrowserGym/browsergym/experiments/src/browsergym/experiments/loop.py", line 379, in from_step
self.obs, self.reward, self.terminated, self.truncated, env_info = env.step(action)
^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/gymnasium/wrappers/time_limit.py", line 57, in step
observation, reward, terminated, truncated, info = self.env.step(action)
^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/gymnasium/wrappers/order_enforcing.py", line 56, in step
return self.env.step(action)
^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/dev/BrowserGym/browsergym/core/src/browsergym/core/env.py", line 415, in step
reward, done, user_message, task_info = self._task_validate()
^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/dev/BrowserGym/browsergym/core/src/browsergym/core/env.py", line 440, in _task_validate
reward, done, user_message, info = self.task.validate(self.page, self.chat.messages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/dev/BrowserGym/browsergym/webarena/src/browsergym/webarena/task.py", line 187, in validate
score = self.evaluator(
^^^^^^^^^^^^^^^
File "<@beartype(webarena.evaluation_harness.evaluators.EvaluatorComb.__call__) at 0x7fa70bc34360>", line 115, in __call__
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/webarena/evaluation_harness/evaluators.py", line 359, in __call__
cur_score = evaluator(trajectory, config_file, page, client)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<@beartype(webarena.evaluation_harness.evaluators.HTMLContentEvaluator.__call__) at 0x7fa70bc340e0>", line 115, in __call__
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/webarena/evaluation_harness/evaluators.py", line 266, in __call__
target_url = eval(func)
^^^^^^^^^^
File "<string>", line 1, in <module>
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/webarena/evaluation_harness/helper_functions.py", line 42, in shopping_get_latest_order_url
"Authorization": f"Bearer {shopping_get_auth_token()}",
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/webarena/evaluation_harness/helper_functions.py", line 24, in shopping_get_auth_token
response = requests.post(
^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/requests/api.py", line 115, in post
return request("post", url, data=data, json=json, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='snowwebarena.eastus.cloudapp.azure.com', port=8082): Max retries exceeded with url: /rest/default/V1/integration/admin/token (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7fa6fdf79130>: Failed to resolve 'snowwebarena.eastus.cloudapp.azure.com' ([Errno -3] Temporary failure in name resolution)"))
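The trace above ends in a "Temporary failure in name resolution" raised while `shopping_get_auth_token` calls `requests.post`, which suggests a transient DNS hiccup rather than a bug in the task itself. One possible workaround, sketched below, is to retry the failing call with exponential backoff. The helper `call_with_retries` and its parameters are hypothetical and are not part of BrowserGym or webarena; the sketch relies only on the fact that both `socket.gaierror` and `requests.exceptions.ConnectionError` derive from `OSError`.

```python
import socket
import time


def call_with_retries(fn, attempts=3, base_delay=1.0, retry_on=(OSError,), sleep=time.sleep):
    """Call fn(); on a listed exception, back off and retry.

    Hypothetical helper for illustration only. The exception from the
    final attempt is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    # Simulated flaky call: fails twice with a DNS error, then succeeds.
    state = {"calls": 0}

    def flaky_lookup():
        state["calls"] += 1
        if state["calls"] < 3:
            raise socket.gaierror(-3, "Temporary failure in name resolution")
        return "ok"

    print(call_with_retries(flaky_lookup, base_delay=0.1))
```

In a real run, the wrapped callable would be whatever performs the HTTP request; whether retrying is appropriate there depends on the request being safe to repeat.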
2024-10-24 21:53:05,632 - 1532216 - browsergym.experiments.loop - INFO - Running experiment GenericAgent-gpt-4o-mini-2024-07-18_on_webarena.162_21 in:
/home/toolkit/agentlab_results/2024-10-24_21-02-33_GenericAgent-gpt-4o-mini-2024-07-18_on_webarena/2024-10-24_21-02-35_GenericAgent-gpt-4o-mini-2024-07-18_on_webarena.162_21
2024-10-24 21:53:05,633 - 1532216 - httpx - DEBUG - load_ssl_context verify=True cert=None trust_env=True http2=False
2024-10-24 21:53:05,634 - 1532216 - httpx - DEBUG - load_verify_locations cafile='/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/certifi/cacert.pem'
2024-10-24 21:53:05,661 - 1532216 - browsergym.experiments.loop - DEBUG - Agent created.
2024-10-24 21:53:05,663 - 1532216 - browsergym.experiments.loop - DEBUG - Environment created.
2024-10-24 21:53:06,882 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='about:blank'>
2024-10-24 21:53:07,268 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebare
...
...
...
action:
click('1339')
2024-10-24 21:53:59,213 - 1532216 - browsergym.experiments.loop - DEBUG - Chat info sent.
2024-10-24 21:53:59,213 - 1532216 - browsergym.experiments.loop - DEBUG - Sending action to environment.
2024-10-24 21:53:59,213 - 1532216 - browsergym.core.env - DEBUG - Executing action
2024-10-24 21:53:59,307 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/#payment'>
2024-10-24 21:53:59,309 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/#payment'>
2024-10-24 21:53:59,309 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/#payment'>
2024-10-24 21:53:59,310 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/#payment'>
2024-10-24 21:53:59,311 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/#payment'>
2024-10-24 21:54:00,324 - 1532216 - browsergym.core.env - DEBUG - Action executed
2024-10-24 21:54:00,887 - 1532216 - browsergym.core.env - DEBUG - Active page checked
2024-10-24 21:54:00,887 - 1532216 - browsergym.core.env - DEBUG - User message done
2024-10-24 21:54:00,887 - 1532216 - browsergym.core.env - DEBUG - Initiating task validation
2024-10-24 21:54:00,887 - 1532216 - browsergym.core.env - DEBUG - Task validation done
2024-10-24 21:54:00,888 - 1532216 - browsergym.core.observation - DEBUG - Marking frame ''
2024-10-24 21:54:01,509 - 1532216 - browsergym.core.observation - WARNING - Failed to extract BrowserGym data from ARIA string: 'Order Summary Cart Subtotal $11.99 Shipping Flat Rate - Fixed $5.00 Order Total $16.99 1Item in Cart \ue622 Ship To: \ue606 edit EmmaLopez 101 S San Mateo Dr San Mateo, California94010 United States 6505551212 Shipping Method: \ue606 edit Flat Rate - Fixed'
2024-10-24 21:54:02,026 - 1532216 - browsergym.experiments.loop - WARNING - Exception uncaught by agent or environment in task webarena.162.
Error:
Execution context was destroyed, most likely because of a navigation
Traceback (most recent call last):
File "/home/toolkit/dev/BrowserGym/browsergym/experiments/src/browsergym/experiments/loop.py", line 246, in run
step_info.from_step(env, action, obs_preprocessor=agent.obs_preprocessor)
File "/home/toolkit/dev/BrowserGym/browsergym/experiments/src/browsergym/experiments/loop.py", line 379, in from_step
self.obs, self.reward, self.terminated, self.truncated, env_info = env.step(action)
^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/gymnasium/wrappers/time_limit.py", line 57, in step
observation, reward, terminated, truncated, info = self.env.step(action)
^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/gymnasium/wrappers/order_enforcing.py", line 56, in step
return self.env.step(action)
^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/dev/BrowserGym/browsergym/core/src/browsergym/core/env.py", line 424, in step
obs = self._get_obs()
^^^^^^^^^^^^^^^
File "/home/toolkit/dev/BrowserGym/browsergym/core/src/browsergym/core/env.py", line 552, in _get_obs
"open_pages_titles": [page.title() for page in self.context.pages],
^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/playwright/sync_api/_generated.py", line 10070, in title
return mapping.from_maybe_impl(self._sync(self._impl_obj.title()))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/playwright/_impl/_sync_base.py", line 109, in _sync
return task.result()
^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/playwright/_impl/_page.py", line 663, in title
return await self._main_frame.title()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/playwright/_impl/_frame.py", line 769, in title
return await self._channel.send("title")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/playwright/_impl/_connection.py", line 61, in send
return await self._connection.wrap_api_call(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/playwright/_impl/_connection.py", line 490, in wrap_api_call
return await cb()
^^^^^^^^^^
File "/home/toolkit/micromamba/envs/ui-assist/lib/python3.12/site-packages/playwright/_impl/_connection.py", line 99, in inner_send
result = next(iter(done)).result()
^^^^^^^^^^^^^^^^^^^^^^^^^
playwright._impl._api_types.Error: Execution context was destroyed, most likely because of a navigation
2024-10-24 21:54:02,055 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/onepage/success/'>
2024-10-24 21:54:02,056 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/onepage/success/'>
2024-10-24 21:54:02,057 - 1532216 - browsergym.core.env - DEBUG - _activate_page_from_js(page) called, page=<Page url='http://snowwebarena.eastus.cloudapp.azure.com:8082/checkout/onepage/success/'>
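In the second trace, `page.title()` raises "Execution context was destroyed, most likely because of a navigation", and the log lines right after it show the page landing on `/checkout/onepage/success/`. That is consistent with the observation being read while the checkout redirect was still in flight. A minimal sketch of one way to tolerate this, assuming it is acceptable to wait briefly and read again, is shown below. The helper `read_when_stable` is hypothetical and is not how BrowserGym handles this case; it matches on the error message only so that the example stays free of a Playwright dependency.

```python
import time

# Substring taken from the error message in the trace above.
NAVIGATION_MARKER = "Execution context was destroyed"


def read_when_stable(read, attempts=5, delay=0.5, sleep=time.sleep):
    """Call read(); retry only if it failed because the page navigated.

    Hypothetical helper for illustration only. Any other exception, or a
    navigation error on the final attempt, is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return read()
        except Exception as exc:
            if NAVIGATION_MARKER not in str(exc) or attempt == attempts:
                raise
            sleep(delay)  # give the navigation time to settle
```

With a Playwright page object this could be called as `read_when_stable(lambda: page.title())`.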
I am facing this issue while trying to run webarena on a random task:
python3 demo_agent/run_demo.py --model_name="gpt-4o-mini" --task_name="webarena.4"
order: None
This should be solved with #256
These look like stochastic errors. Maybe try a different version of playwright / chromium? You should be able to update playwright now, thanks to #257. I'll close this one; please open another issue if you still encounter these errors.
I will be posting the errors I get on webarena here.