BrowserGym: a Gym Environment for Web Task Automation

[Setup] [Usage] [Demo] [Citation]

This package provides browsergym, a gym environment for web task automation in the Chromium browser.

Video (4x4.grid.mp4): example of a GPT-4V agent executing open-ended tasks (top row, interactive chat), as well as WebArena and WorkArena tasks (bottom row).

BrowserGym includes the following benchmarks by default: MiniWoB, WebArena, VisualWebArena, and WorkArena.

Designing new web benchmarks with BrowserGym is straightforward: define a task class that inherits from the AbstractBrowserTask class.
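As a rough illustration of the inheritance pattern, here is a self-contained sketch. It does not import browsergym: `TaskBaseSketch` is a hypothetical stand-in for AbstractBrowserTask, and the method names `setup` and `validate`, their signatures, and the `OpenUrlTask` example are all assumptions made for illustration. Check the class definition in the installed package for the actual interface.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class TaskBaseSketch(ABC):
    """Hypothetical stand-in for AbstractBrowserTask; NOT the real class."""

    def __init__(self, seed: int) -> None:
        self.seed = seed

    @abstractmethod
    def setup(self, page) -> str:
        """Prepare the page and return the goal text (assumed contract)."""

    @abstractmethod
    def validate(self, page) -> tuple[float, bool]:
        """Return (reward, done) for the current page (assumed contract)."""


class OpenUrlTask(TaskBaseSketch):
    """Hypothetical task: done once the page URL contains a target string."""

    def __init__(self, seed: int, target: str) -> None:
        super().__init__(seed)
        self.target = target

    def setup(self, page) -> str:
        return f"Navigate to a page whose URL contains '{self.target}'."

    def validate(self, page) -> tuple[float, bool]:
        reached = self.target in page.url
        return (1.0 if reached else 0.0), reached


if __name__ == "__main__":
    # A fake page object with only a `url` attribute, standing in for a browser page.
    fake_page = SimpleNamespace(url="https://example.org/docs")
    task = OpenUrlTask(seed=0, target="example.org")
    print(task.setup(fake_page))
    print(task.validate(fake_page))
```

The point of the sketch is the split of responsibilities: one method describes and prepares the task, another scores the current browser state.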

Setup

To install browsergym, either install a single benchmark package (browsergym-miniwob, browsergym-webarena, browsergym-visualwebarena, or browsergym-workarena), or install browsergym, which includes all of them by default.

pip install browsergym

Then set up Playwright, which is a required step, by running:

playwright install chromium

Finally, each benchmark has its own setup, which requires additional steps.
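To see which of the packages named above ended up in the current environment, a small check using only the Python standard library might look like the following. The helper name `installed_versions` is made up for this sketch and is not part of browsergym. The distribution names are taken from the install instructions above.

```python
from importlib import metadata

# Distribution names as listed in the install instructions above.
BROWSERGYM_PACKAGES = (
    "browsergym",
    "browsergym-miniwob",
    "browsergym-webarena",
    "browsergym-visualwebarena",
    "browsergym-workarena",
)


def installed_versions(packages=BROWSERGYM_PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports what pip has installed. It does not verify the Playwright browser download or any benchmark-specific setup.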

Development setup

To install browsergym locally for development, use the following commands:

git clone https://github.com/ServiceNow/BrowserGym.git
cd BrowserGym
make install

Usage

Open-ended task example

Boilerplate code to run an agent on an interactive, open-ended task:

import gymnasium as gym
import browsergym.core  # register the openended task as a gym environment

env = gym.make(
    "browsergym/openended",
    task_kwargs={"start_url": "https://www.google.com/"},  # starting URL
    wait_for_user_message=True,  # wait for a user message after each agent message sent to the chat
)
obs, info = env.reset()
done =