A secure cloud Linux computer powered by E2B Desktop Sandbox and controlled by open-source LLMs.
Desktop.Use.+.Streaming.mp4
- Uses E2B for secure Desktop Sandbox
- Supports Meta Llama, OS-Atlas/ShowUI and any LLM you want to integrate!
- Operates the computer via the keyboard, mouse, and shell commands
- Live streams the display of the sandbox on the client computer
- User can pause and prompt the agent at any time
- Uses Ubuntu, but designed to work with any operating system
The details of the design are laid out in this article: How I taught an AI to use a computer
Open Computer Use is designed to easily support new LLMs. The LLM and provider combinations are are defined in models.py. Following the comments in this file, one can easily add any LLM and provider that adheres to the OpenAI API specification.
The list of tested models and providers currently includes:
Type | Model | Providers |
---|---|---|
Vision | Llama 3.2 | Fireworks, OpenRouter, Llama API |
Vision | Gemini 2.0 Flash | |
Action | Llama 3.3 | Fireworks, Llama API |
Action | DeepSeek | DeepSeek |
Action | Gemini 2.0 Flash | |
Grounding | OS-Atlas | HuggingFace Spaces |
Grounding | ShowUI | HuggingFace Spaces |
The following lines of code in models.py define the default LLMs and providers:
grounding_model = OSAtlasProvider()
vision_model = FireworksProvider("llama3.2")
action_model = FireworksProvider("llama3.3")
If you add a new model or provider, please make a PR to this repository!
- Python 3.10 or later
- git
- E2B API key
- Fireworks API key
In your terminal:
brew install poetry ffmpeg
In your terminal:
git clone https://github.com/e2b-dev/open-computer-use/
Enter the project directory:
cd open-computer-use
Create a .env
file in open-computer-use
and set the following:
# Get your API key here - https://e2b.dev/
E2B_API_KEY="your-e2b-api-key"
FIREWORKS_API_KEY="your-fireworks-api-key"
Run the following command to start the agent:
poetry install
poetry run start
The agent will start and prompt you for its first instruction.