Codex · Claude Code · Claude · Cursor · Windsurf · Openclaw

agent-browser

Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button",

Overview

The agent-browser skill, hosted in the mxyhi/ok-skills repository, provides a command-line interface that lets AI agents interact with web pages. Agents such as Claude, Cursor, and Windsurf can use it to navigate between pages, operate UI elements such as buttons and forms, and capture screenshots for visual verification. According to the documentation in the source repository, it is well suited to automating repetitive browser-based workflows and extracting structured data from web pages. The repository currently shows 423 stars. With this skill installed, an agent can work against live websites and web applications, which supports automated testing and real-time information retrieval.

Use Cases

Automating the submission of multi-step web forms and data entry tasks.
Performing end-to-end testing of web applications by simulating user clicks and navigation.
Extracting specific data points and capturing screenshots from live websites for analysis.

Install Notes

# Review source first
open https://github.com/mxyhi/ok-skills/blob/main/agent-browser/SKILL.md

Copy or clone the skill folder into your agent skills directory after reviewing its instructions and scripts.
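
The copy step above can be sketched as a small shell helper. This is a minimal sketch, not the repository's own installer: the function name install_skill is hypothetical, and the destination path in the usage comment is an assumption, since the skills directory differs between agents. The helper refuses to copy a folder that has no SKILL.md, as a basic check that the path points at a skill.

```shell
# Hypothetical helper (not part of mxyhi/ok-skills): copy a reviewed skill
# folder into an agent's skills directory.
install_skill() {
  src="${1%/}"   # skill folder, e.g. a local clone's agent-browser/ (trailing slash stripped)
  dest="$2"      # agent skills directory; the right location depends on your agent

  # Refuse to copy anything that does not look like a skill folder.
  if [ ! -f "$src/SKILL.md" ]; then
    echo "install_skill: no SKILL.md found in $src" >&2
    return 1
  fi

  mkdir -p "$dest"
  cp -R "$src" "$dest/"
}

# Example usage (assumed paths; review SKILL.md and any scripts first):
#   git clone https://github.com/mxyhi/ok-skills.git
#   install_skill ok-skills/agent-browser "$HOME/.claude/skills"
```

Copying a single folder rather than the whole repository keeps the install limited to the skill that was actually reviewed.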

Security Notes

This skill lets an AI agent interact directly with live websites and enter data into forms. As with any browser automation tool, monitor the agent's actions to confirm that they comply with each website's terms of service, and take care to protect sensitive information during automated sessions.

Related Skills

Web Application Testing

anthropics/skills

Browser Automation

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Codex · Claude
python · frontend
150,001 stars · Source linked

Electron App Automation

mxyhi/ok-skills

Browser Automation

Automate Electron desktop apps (VS Code, Slack, Discord, Figma, Notion, Spotify, etc.) using agent-browser via Chrome DevTools Protocol. Use when the user needs to interact with an Electron app, automate a desktop app, connect to a running app, control a native app, or test an Electron application. Triggers include "au

Codex · Claude Code
design · browser
423 stars · Apache-2.0

Browser Trace

mxyhi/ok-skills

Browser Automation

Capture a full DevTools-protocol trace of any browser automation — CDP firehose, screenshots, and DOM dumps — then bisect the stream into per-page searchable buckets. Use when the user wants to debug a failed run, audit network/console/DOM activity, attach a trace to an in-progress session, or feed structured per-page

Codex · Claude Code
browser · automation
423 stars · Apache-2.0

Kimi WebBridge

mxyhi/ok-skills

Browser Automation

Control the user's real browser (with their login sessions) via a local daemon at http://127.0.0.1:10086.

Codex · Claude Code
browser · automation
423 stars · Apache-2.0

Ably — Realtime Infrastructure as a Service

TerminalSkills/skills

Browser Automation

You are an expert in Ably, the enterprise-grade realtime messaging platform. You help developers add pub/sub messaging, presence, chat, live updates, and event streaming to applications with guaranteed message ordering, exactly-once delivery, automatic reconnection, and global edge infrastructure, handling millions of m

Codex · Claude Code
typescript · react
71 stars · Apache-2.0