A lightweight Julia package for browser automation using the Chrome DevTools Protocol (CDP). It is inspired by Python's Playwright but provides only the essential functionality needed to get started with browser automation in Julia.
Warning
This package is experimental and was developed with the help of Cognition's Devin. While it's great for supervised browser automation, never leave AI agents unsupervised when controlling your browser!
- Lightweight & Simple: Focused on essential browser automation features
- Existing Browser Sessions: Connect to already open Chrome windows (keep your login sessions!)
- AI-Friendly: Perfect for supervised browser automation with LLMs
- Modern Protocol: Uses Chrome DevTools Protocol (CDP) for reliable communication
- Browser Automation
  - Simple CDP-based browser control
  - Connect to existing Chrome/Chromium sessions
  - Automatic browser cleanup with try-finally blocks
  - Verbose mode for debugging
- Page Operations
  - Easy page navigation with goto()
  - Screenshot capture with screenshot()
  - Full HTML content access with content()
  - Versatile JavaScript evaluation with evaluate()
  - Paragraph and text extraction
- DOM Interaction
  - Form input field manipulation
  - Radio button and checkbox handling
  - Text area content management
  - Multiple element selection and verification
- Input Control
  - Mouse movement and click simulation
  - Double-click support
  - Keyboard input and key press events
  - Modifier key combinations (Control, Alt, Shift)
  - Element position detection
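The package is not registered yet. Until it is, install it from a local clone in development mode (Pkg.develop(path="."), as described in the development steps). Once it is registered, it can be added by name: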
using Pkg
Pkg.add("ChromeDevToolsLite")
- Chrome/Chromium browser started with remote debugging enabled
- Julia 1.10 or higher
See Chrome Setup Guide at the end of this README for detailed instructions for your operating system.
using ChromeDevToolsLite
# Connect to the browser (assumes Chrome is running with remote debugging on port 9222)
client = connect_browser()
try
    # Navigate to a page and wait for load
    goto(client, "https://example.com")
    # Get the source content of the page
    source = content(client)
    # Wait for specific elements to be visible
    wait_for_visible(client, "h1") # Wait for main heading
    # Get the current page
    page = get_page(client)
    page_info = get_page_info(page)
    # Find and interact with elements
    button = query_selector(client, "button")
    wait_for_visible(client, button) # Ensure button is visible
    # Move mouse and click
    move_mouse(client, button) # Move to element
    click(client)
    # Type text with keyboard
    input = query_selector(client, "input")
    type_text(input, "Hello World!")
    press_key(client, "Enter")
    # Take a screenshot (returns a base64-encoded string; optionally saves it to a file)
    screenshot(client; save_path="screenshot.png")
finally
    close(client)
end
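Since the screenshot call is described as returning a base64-encoded string, the following minimal sketch shows one way to turn such a string back into image bytes. It assumes the payload is a PNG (suggested by the .png file name above, but not confirmed here); decode_png is a hypothetical helper, not part of ChromeDevToolsLite.

```julia
using Base64

# Every PNG file starts with this fixed 8-byte signature.
const PNG_SIGNATURE = UInt8[0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]

"""Decode a base64 string and check that the bytes start with the PNG signature."""
function decode_png(b64::AbstractString)
    bytes = base64decode(b64)
    (length(bytes) >= 8 && bytes[1:8] == PNG_SIGNATURE) ||
        error("decoded data does not start with the PNG signature")
    return bytes
end

# Hypothetical usage with a connected client (assumes a PNG payload):
#   bytes = decode_png(screenshot(client))
#   write("page.png", bytes)
```

Checking the signature before writing the file is a cheap way to catch an empty or truncated response early.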
To be updated...
using ChromeDevToolsLite
using PromptingTools
using PromptingTools: pprint
client = connect_browser()
try
    # Navigate and wait for page load
    goto(client, "https://example.com")
    # Get the screenshot
    screenshot(client; save_path="screenshot.png")
    # Ask an LLM about the page (ideally, define tools for computer use)
    msg = aitools("What's on this page?"; image_path="screenshot.png")
    pprint(msg)
finally
    close(client)
end
- Browser Connection
# Ensure Chrome is running with debugging port
if !ensure_browser_available()
    error("Chrome not available. Start it with: chromium --remote-debugging-port=9222")
end
# Connect with verbose mode for debugging
client = connect_browser(verbose=true)
- Page Navigation and Content
# Use try-catch for navigation issues
try
    goto(client, "https://example.com")
    page_content = content(client)
catch e
    println("Navigation failed: ", e)
end
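When navigation fails only intermittently, wrapping the call in a small retry loop can be more useful than printing the error once. The sketch below is a generic pattern rather than a package feature: with_retries, attempts, and delay are hypothetical names introduced for illustration.

```julia
"""Call `f` up to `attempts` times, doubling the pause after each failure."""
function with_retries(f; attempts::Integer=3, delay::Real=1.0)
    attempts >= 1 || throw(ArgumentError("attempts must be at least 1"))
    for attempt in 1:attempts
        try
            return f()
        catch err
            # Out of attempts: propagate the last error to the caller.
            attempt == attempts && rethrow()
            @warn "Attempt failed, retrying" attempt exception = err
            sleep(delay * 2^(attempt - 1))
        end
    end
end

# Hypothetical usage with a connected client:
#   with_retries(attempts=3, delay=0.5) do
#       goto(client, "https://example.com")
#   end
```

Rethrowing on the final attempt keeps the original error visible instead of silently returning nothing.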
- JavaScript Evaluation
# Handle JavaScript evaluation safely
try
    # Find and click a button
    evaluate(client, "document.querySelector('button').click()")
    # Get input value
    value = evaluate(client, "document.querySelector('input').value")
catch e
    println("JavaScript evaluation failed: ", e)
end
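The snippets above throw inside the page if no element matches, and hand-built selectors can break the JavaScript string when they contain quotes. As a rough sketch of one way to guard against both, the hypothetical helpers below (js_string and js_click, neither part of the package) build a null-checked expression that could then be passed to evaluate.

```julia
"""Quote a Julia string as a double-quoted JavaScript string literal."""
js_string(s::AbstractString) =
    "\"" * replace(s, "\\" => "\\\\", "\"" => "\\\"", "\n" => "\\n") * "\""

"""Build JavaScript that clicks the first match and returns whether one was found."""
function js_click(selector::AbstractString)
    return """
    (() => {
        const el = document.querySelector($(js_string(selector)));
        if (el === null) { return false; }
        el.click();
        return true;
    })()"""
end

# Hypothetical usage with a connected client:
#   clicked = evaluate(client, js_click("button[type=\"submit\"]"))
```

Returning a boolean from the page lets the Julia side distinguish "element missing" from a genuine evaluation error.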
- Resource Cleanup
# Always use try-finally for proper cleanup
client = connect_browser()
try
    goto(client, "https://example.com")
    # Your automation code here
finally
    close(client)
end
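The try-finally pattern can also be wrapped once in a do-block helper so that cleanup cannot be forgotten. This is a generic Julia idiom sketched under the assumption that connect_browser and close behave as shown above; with_cleanup is a hypothetical name, not a package function.

```julia
"""Acquire a resource, pass it to `f`, and always release it afterwards."""
function with_cleanup(f, acquire, release)
    resource = acquire()
    try
        return f(resource)
    finally
        # Runs whether `f` returned normally or threw.
        release(resource)
    end
end

# Hypothetical usage with the functions shown in this README:
#   with_cleanup(connect_browser, close) do client
#       goto(client, "https://example.com")
#   end
```

Because the function argument comes first, Julia's do-block syntax supplies the body, mirroring how open(...) do io ... end works in Base.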
For more detailed examples and solutions, see the examples/ directory.
The package includes six comprehensive example scripts in the examples/ directory that demonstrate all key features:
- Browser connection and cleanup
- Simple page navigation
- Basic operations
- Navigation and content extraction
- JavaScript evaluation
- Screenshot capture
- Content manipulation
- Finding elements on the page
- Clicking and form filling
- Element property extraction
- Complex form handling with multiple input types
- Batch form field updates
- JSON-based form state verification
- Form submission and navigation tracking
- Multi-line text handling
- Dynamic DOM manipulation and styling
- Complex JavaScript execution
- JSON-based content verification
- Visual result capture with screenshots
- Mouse movement and positioning
- Click and double-click operations
- Keyboard input simulation
- Modifier key combinations
- Element position detection
- Complex input sequences
To run an example:
julia --project=. examples/1_basic_connection.jl
- Clone the repository
- Install the package in development mode:
using Pkg; Pkg.develop(path=".")
- Start Chrome/Chromium with remote debugging:
chromium --remote-debugging-port=9222
# Or for headless testing:
chromium --remote-debugging-port=9222 --headless
- Run the examples to verify functionality:
julia --project=. examples/1_basic_connection.jl
- Find your Chrome/Chromium installation path (typically C:\Program Files\Google\Chrome\Application\chrome.exe)
- Open Command Prompt (cmd) and run:
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
Or for headless mode:
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --headless
- Open Terminal and run:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
Or for headless mode:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --headless
- Open Terminal and run:
google-chrome --remote-debugging-port=9222
# Or for Chromium
chromium --remote-debugging-port=9222
Or for headless mode:
google-chrome --remote-debugging-port=9222 --headless
# Or for Chromium
chromium --remote-debugging-port=9222 --headless
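The per-platform commands above can also be assembled from Julia. The sketch below is an illustration only: chrome_debug_cmd is a hypothetical helper, not part of ChromeDevToolsLite, and the executable locations are the typical defaults listed above, which may differ on your machine.

```julia
"""Build the command that starts Chrome with remote debugging enabled."""
function chrome_debug_cmd(; port::Integer=9222, headless::Bool=false)
    # Typical default locations (assumptions; adjust for your installation).
    exe = if Sys.iswindows()
        raw"C:\Program Files\Google\Chrome\Application\chrome.exe"
    elseif Sys.isapple()
        "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
    else
        "google-chrome"
    end
    args = ["--remote-debugging-port=$port"]
    headless && push!(args, "--headless")
    # Interpolation keeps `exe` as a single argument even if it contains spaces.
    return `$exe $args`
end

# Hypothetical usage: start Chrome in the background without blocking Julia.
#   run(chrome_debug_cmd(headless=true); wait=false)
```

Building a Cmd object rather than a shell string avoids quoting problems with paths that contain spaces.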
To verify Chrome is running in debug mode:
- Open your browser and navigate to:
http://localhost:9222
- You should see a JSON page listing available debugging targets
- In Julia, you can verify with:
using ChromeDevToolsLite
ensure_browser_available("http://localhost:9222")
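If that check fails, it can help to first confirm that anything is listening on the debugging port at all. The following sketch uses only Julia's Sockets standard library; port_is_open is a hypothetical helper, and it only tests TCP reachability, not whether the listener is actually Chrome.

```julia
using Sockets

"""Return true if a TCP connection to `host:port` can be established."""
function port_is_open(host::AbstractString, port::Integer)
    try
        sock = Sockets.connect(host, port)
        close(sock)
        return true
    catch
        # Connection refused, unknown host, and similar errors all mean "not reachable".
        return false
    end
end

# Hypothetical usage with the default debugging port from this README:
#   port_is_open("localhost", 9222) || @warn "Nothing is listening on port 9222"
```

A false result usually points to Chrome not running with the debugging flag, a different port, or a firewall rule.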
- Port Already in Use: If port 9222 is taken, try a different port (e.g., 9223)
- Permission Denied: Run the command with elevated privileges (admin/sudo). On macOS, also grant permissions to your terminal or VS Code in the Security & Privacy settings.
- Chrome Not Found: Ensure the path to Chrome executable is correct
- Chrome Already Running: A new instance cannot be started with debugging enabled while Chrome is already running. Close Chrome first, then start it again with the debugging port enabled.
- Firewall Issues: Check if your firewall is blocking the connection
- WebDriver.jl: A mature package using the Selenium WebDriver protocol. While it requires opening new browser windows (losing existing sessions), it's battle-tested and might be more suitable for production use cases.