Browser automation in OpenClaw — the full picture.
OpenClaw's browser layer is what separates it from text-only agents. It can fill forms, log in, scroll dashboards, and read pages the way a human would. It offers three modes (managed, attached, and Playwright), each suited to a different use case.
Quick answers
Can OpenClaw browse the web?
Yes. OpenClaw drives a real Chrome browser via the Chrome DevTools Protocol. It opens URLs, clicks, types, scrolls, fills forms, takes screenshots, and reads rendered pages. JavaScript runs, single-page apps work, sessions persist.
Does OpenClaw work on JavaScript-heavy sites?
Yes. The browser is real Chrome, not a scraper: it executes JavaScript, waits for network requests, handles single-page-app routing, and renders modal dialogs. Anything a human user sees, the agent sees.
Can OpenClaw bypass CAPTCHAs?
No, by design. CAPTCHAs are intentional anti-automation. Don't try. The right answer is API access where available, or a human-in-the-loop pause for the CAPTCHA step.
Is OpenClaw's browser the same as Playwright?
Related but distinct. The built-in browser uses the same CDP foundation Playwright does, but with an LLM-friendly snapshot system. The optional Playwright skill exposes Playwright's full API for cases where you need fine selector control.
Can the agent log into a site for me?
Yes. Using the attached-browser mode, the agent inherits your existing logged-in session. For automation, persist cookies in the agent's workspace so you don't re-login every session. Don't store passwords in skill code; use the credentials manager.
Capability
What the browser tool does
OpenClaw's browser is a real Chrome instance the agent can drive. It opens URLs, clicks, types, scrolls, takes screenshots, fills forms, and reads rendered pages — the same way a human would, just deterministically and at agent speed.
Three things make this materially different from a "scrape a webpage" tool:
- JavaScript runs. Single-page apps, lazy loading, modal dialogs — all work because Chrome actually renders the page.
- Sessions persist. The agent can log into a service, then operate inside it across many turns.
- The agent gets a structured page model. Not raw HTML — a snapshot with role-based affordances (button "Submit", link "Next page") so it can decide what to click without parsing CSS selectors.
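To make the third point concrete, here is a minimal sketch of what a role-based page model might look like and how a helper could resolve a target by role and name. The `Node` class and `find` function are hypothetical illustrations, not OpenClaw's actual snapshot format or API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """One element in a hypothetical role-based page snapshot."""
    role: str                      # e.g. "button", "link", "form"
    name: str = ""                 # accessible label, e.g. "Submit"
    children: List["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def find(root: Node, role: str, name: str) -> Optional[Node]:
    """Return the first node matching role and name, or None."""
    for node in root.walk():
        if node.role == role and node.name == name:
            return node
    return None


# A tiny made-up page: a nav with one link and a form with one button.
page = Node("page", "example.com", [
    Node("nav", children=[Node("link", "Next page")]),
    Node("form", "Search", [Node("button", "Submit")]),
])

target = find(page, "button", "Submit")
print(target.role, target.name)
```

The lookup needs only a role and a visible label, which is the information an LLM already has after reading the page, so no CSS selector ever enters the prompt.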
Pick one
Three modes
| Mode | What it is | Best for |
|---|---|---|
| Managed | Headless Chrome the gateway spawns | Most automation, clean sessions |
| Attached | Connects to your running Chrome | Tasks needing your existing logins |
| Playwright skill | Full Playwright API exposed as a skill | Power users, complex flows |
Default to managed
90% of automation jobs work fine with the managed browser. Use attached only when you actually need an existing logged-in session you can't easily replicate.
Default
Mode 1 — managed Chrome
Out of the box, the gateway spawns a headless Chrome and exposes browser tools to the agent. No setup needed.
> Find the latest Hacker News post about OpenClaw and summarize it.
[agent calls browser_navigate]
url: https://news.ycombinator.com/from?site=openclaw.ai
[agent calls browser_snapshot]
returns: structured listing of stories...
[agent calls browser_navigate]
url: <story URL>
[agent calls browser_get_text]
returns: rendered post content
[agent replies with summary]
Configuration knobs that matter:
{
  "browser": {
    "mode": "managed",
    "headless": true,
    "userAgent": "Mozilla/5.0 (compatible; OpenClawAgent/1.0)",
    "viewport": { "width": 1280, "height": 800 },
    "timeout": 30000,
    "blockResources": ["image", "media"]
  }
}
blockResources is the biggest lever for cost and speed: blocking images cuts page load time 40–60% and shrinks the snapshot tokens significantly.
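As a rough illustration of what a resource filter driven by this setting could do, the sketch below reads a config of the same shape and decides which requests would be dropped. The `should_block` helper and the request list are hypothetical; how OpenClaw applies `blockResources` internally is not described here.

```python
import json

# Same shape as the configuration shown in the text (trimmed).
CONFIG = json.loads("""
{
  "browser": {
    "mode": "managed",
    "headless": true,
    "timeout": 30000,
    "blockResources": ["image", "media"]
  }
}
""")


def should_block(resource_type: str, config: dict) -> bool:
    """Return True if a request of this resource type would be dropped."""
    blocked = config.get("browser", {}).get("blockResources", [])
    return resource_type.lower() in blocked


# Hypothetical request log for one page load: (resource type, URL).
requests = [
    ("document", "https://example.com/"),
    ("script", "https://example.com/app.js"),
    ("image", "https://example.com/hero.png"),
    ("media", "https://example.com/intro.mp4"),
]

allowed = [url for kind, url in requests if not should_block(kind, CONFIG)]
print(allowed)
```

The document and its script still load, so the page renders and JavaScript runs; only the heavy assets the agent rarely needs are skipped.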
When you need real logins
Mode 2 — attach to your session
Added in OpenClaw 2026.3. The agent connects to a Chrome you're already running with remote debugging on. Your logins, extensions, and tabs are all available.
# Start Chrome with remote debugging
google-chrome --remote-debugging-port=9222 --user-data-dir=$HOME/.chrome-openclaw
# Tell OpenClaw to use it
openclaw browser attach ws://localhost:9222
Use a separate profile
Don't attach OpenClaw to your daily-driver Chrome. Spin up a dedicated profile (the --user-data-dir flag) so an agent mishap can't affect your real bookmarks or saved passwords.
Power user
Mode 3 — Playwright skill
For complex flows where you need precise control — multi-context isolation, advanced waiting, custom event handling — install the Playwright skill from ClawHub.
openclaw skills install playwright
The skill exposes a richer surface: browser_evaluate (run JS in the page), browser_choose_file (file upload), browser_press (key sequences), browser_select_option (dropdowns by value). The trade-off is that your agent now has to reason about CSS selectors instead of just role-based affordances, which is sometimes worth it and often overkill.
Why this matters
Snapshots vs DOM
The default browser tool returns structured snapshots rather than raw HTML or pixel screenshots. This is the single most important design choice for LLM-driven browsing.
page: example.com/dashboard
heading: "Welcome back, Sam"
nav:
- link "Projects" → /projects
- link "Settings" → /settings
- button "Sign out"
main:
list "Recent activity":
- item "Q2 launch · updated 2h ago"
- item "Hiring · updated yesterday"
button "New project" → primary
form "Quick add":
- input "title" (required)
- select "team" [Eng, PM, Design]
- button "Add"
The agent can reason "click the 'New project' primary button" instead of "find a button with class .btn-primary inside .dashboard-actions[data-test=ka-new]", which is fragile and token-heavy. Snapshots are the killer feature.
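To show how little machinery a snapshot needs, the sketch below parses text in the form of the sample above and lists the elements an agent could act on. The line grammar is inferred from that sample and the `actionable` helper is hypothetical; neither is a documented OpenClaw format or API.

```python
import re
from typing import List, Tuple

# A shortened copy of the sample snapshot from the text.
SNAPSHOT = '''\
nav:
- link "Projects" → /projects
- link "Settings" → /settings
- button "Sign out"
main:
button "New project" → primary
form "Quick add":
- input "title" (required)
- select "team" [Eng, PM, Design]
- button "Add"
'''

# Matches: an optional list dash, an interactive role, then a quoted name.
LINE = re.compile(r'^\s*(?:-\s*)?(button|link|input|select)\s+"([^"]+)"')


def actionable(snapshot: str) -> List[Tuple[str, str]]:
    """Extract (role, name) pairs for elements the agent can act on."""
    pairs = []
    for line in snapshot.splitlines():
        match = LINE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


elements = actionable(SNAPSHOT)
buttons = [name for role, name in elements if role == "button"]
print(buttons)
```

Each actionable element comes out as a role plus a human-readable label, so an instruction such as "click the 'New project' button" maps to exactly one entry without any selector logic.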
Reality
CAPTCHA + bot detection
You will hit them. Treat them as expected.
| Detection | Frequency | Strategy |
|---|---|---|
| reCAPTCHA / Cloudflare | Common on login pages | Use stored session cookies; avoid daily logins |
| IP rate-limiting | Heavy automation | Residential proxy or slow down |
| Behavior fingerprinting | Banking, ticketing | Don't try |
| Email/SMS 2FA | Higher-stakes accounts | Manual handoff or app-specific tokens |
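For the "slow down" strategy in the rate-limiting row, one common approach is exponential backoff with jitter between retries. The sketch below is a generic illustration under assumed defaults (2 s base, 60 s cap, 5 attempts); `backoff_delays` is a hypothetical helper, not an OpenClaw setting.

```python
import random
from typing import Iterator, Optional


def backoff_delays(base: float = 2.0, cap: float = 60.0, attempts: int = 5,
                   rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield one wait time in seconds per retry.

    The ceiling doubles on each attempt up to `cap`. Each delay is drawn
    from the upper half of the ceiling, so retries always wait a meaningful
    amount but do not all fire at the same instant.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield ceiling / 2 + rng.uniform(0, ceiling / 2)


# Example: print the waits an agent would use after repeated rate-limit responses.
for attempt, delay in enumerate(backoff_delays(rng=random.Random(42)), start=1):
    print(f"retry {attempt}: wait {delay:.1f}s")
```

Passing a seeded `random.Random` makes the schedule reproducible in tests; in production the default unseeded generator is the better choice.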
Don't try to break CAPTCHAs
CAPTCHAs are intentional friction. Bypassing them is adversarial and brittle. If you need agents on a service that's CAPTCHA-protected, the right answer is API access if available, or human-in-the-loop pause-and-resume for the CAPTCHA step.
Hard-won
Production tips
- Always set a navigation timeout. 30s default is fine; without one a slow page hangs the agent.
- Block images and media when possible. 40–60% faster, much cheaper.
- Use snapshot mode, not raw DOM. Tokens matter; agents reason better on structured data.
- Persist cookies. Make every session ephemeral and you'll re-login constantly. Save browser/cookies.json to the workspace.
- Don't run browser tools at heartbeat speed. Snapshots are token-heavy; reserve browser calls for user-initiated and scheduled work.
- Snapshot before every action. Pages change between turns; stale state causes click-the-wrong-thing errors.
- Use Playwright when selectors are unavoidable. Snapshots can't always disambiguate three buttons that look identical to the role layer.
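The cookie-persistence tip can be sketched as a pair of helpers that write cookies to browser/cookies.json under the workspace and drop expired entries on load. The cookie shape (a dict with `name`, `value`, `domain`, and an `expires` Unix timestamp, with -1 meaning a session cookie) is an assumption, as are the helper names; OpenClaw's own storage format may differ.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Dict, List, Optional

Cookie = Dict[str, object]


def save_cookies(workspace: Path, cookies: List[Cookie]) -> Path:
    """Write cookies to <workspace>/browser/cookies.json and return the path."""
    path = workspace / "browser" / "cookies.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cookies, indent=2))
    return path


def load_cookies(workspace: Path, now: Optional[float] = None) -> List[Cookie]:
    """Load saved cookies, dropping any that have already expired."""
    path = workspace / "browser" / "cookies.json"
    if not path.exists():
        return []
    now = time.time() if now is None else now
    cookies = json.loads(path.read_text())
    # Assumption: a missing "expires" or a value of -1 marks a session cookie, which is kept.
    return [c for c in cookies if c.get("expires", -1) == -1 or c["expires"] > now]


# Round trip in a throwaway workspace: one live cookie, one long expired.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    save_cookies(workspace, [
        {"name": "sid", "value": "abc", "domain": "example.com",
         "expires": time.time() + 3600},
        {"name": "old", "value": "xyz", "domain": "example.com", "expires": 1},
    ])
    restored = load_cookies(workspace)

print([c["name"] for c in restored])
```

Filtering on load rather than on save means a stale file never hands the browser a dead session, which would otherwise trigger the very re-login the tip is trying to avoid.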