docs: add agent-browser MCP server guide (Playwright + @ref accessibility snapshots) by willtwilson · Pull Request #12141 · danny-avila/LibreChat

willtwilson · 2026-03-08T21:40:08Z

Summary

Adds a documentation guide for the agent-browser MCP server — a self-hostable browser automation server for LibreChat agents, powered by Vercel's agent-browser library.

Why agent-browser instead of raw Playwright/Puppeteer?

Raw Playwright wrappers expose CSS selectors and XPath to the model. These break when a SPA re-renders, require the model to infer element identity from unstructured HTML, and are fragile across site deployments.

Vercel's agent-browser library solves this by producing accessibility tree snapshots with stable @ref identifiers:

button [@e3] "Sign in"
input  [@e7] placeholder="Email address"

The LLM passes @e7 directly to fill or click — no selector guessing, no XPath, no brittle DOM traversal. This makes agent-browser substantially more reliable than Puppeteer-based MCP servers for LLM-driven web automation.
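
To make the @ref idea concrete, here is a minimal sketch of how a client could read snapshot lines like the two above and look up a ref. It assumes every line follows the `role [@eN] detail` layout of that example. `SnapshotEntry`, `parseSnapshot`, and `findRef` are hypothetical names for illustration and are not part of agent-browser's API.

```typescript
// Hypothetical shape for one parsed snapshot line; not part of agent-browser's API.
interface SnapshotEntry {
  role: string;   // e.g. "button"
  ref: string;    // e.g. "@e3"
  detail: string; // rest of the line, e.g. the label or placeholder text
}

// Parse lines shaped like: role [@eN] detail. Lines that do not match are skipped.
function parseSnapshot(snapshot: string): SnapshotEntry[] {
  const linePattern = /^\s*(\S+)\s+\[(@e\d+)\]\s*(.*)$/;
  const entries: SnapshotEntry[] = [];
  for (const line of snapshot.split("\n")) {
    const match = linePattern.exec(line);
    if (match === null) continue;
    const [, role = "", ref = "", detail = ""] = match;
    entries.push({ role, ref, detail: detail.trim() });
  }
  return entries;
}

// Return the first ref whose role matches and whose detail contains `needle`.
function findRef(
  entries: SnapshotEntry[],
  role: string,
  needle: string
): string | undefined {
  for (const entry of entries) {
    if (entry.role === role && entry.detail.indexOf(needle) !== -1) {
      return entry.ref;
    }
  }
  return undefined;
}
```

For the two example lines, `findRef(entries, "input", "Email")` returns `@e7`, which is the value a model would pass to fill. In practice the LLM reads the snapshot text directly, so a parser like this is mainly useful for tests or logging.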

What this PR adds

docs/docs/configuration/tools/agent-browser.mdx covering:

@ref system explained — why accessibility snapshots outperform CSS selectors for AI agents
Tool reference table — all 9 tools: navigate, snapshot, click, fill, get_text, press_key, screenshot, get_url, close_browser
Setup — Docker Compose and build-from-source tabs
librechat.yaml configuration — mcpServers block with autoApprove list
Critical implementation note — express.json() must NOT be used alongside SSEServerTransport (consuming the request stream causes HTTP 400 on every initialize call)
Session management pattern — how SSEServerTransport.sessionId routes POST /messages to the correct SSE connection
Zod-based tool registration — McpServer fluent API pattern
Typical agent workflow — navigate → snapshot → fill → press_key → get_text sequence
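
The session management item above can be sketched as a lookup table keyed by session ID. This is a simplified model and not the guide's actual code: `SessionTransport` is a hypothetical stand-in for the SDK's SSEServerTransport, `SessionRegistry` is an illustrative name, and the status codes returned by `route` are assumptions chosen for the example.

```typescript
// Hypothetical stand-in for a transport; the real SSEServerTransport comes from the MCP SDK.
interface SessionTransport {
  sessionId: string;
  handleMessage(rawBody: string): void;
}

// Keeps one transport per open SSE connection, keyed by its session ID.
class SessionRegistry {
  private transports: Record<string, SessionTransport> = {};

  // Call when a client opens the SSE stream.
  register(transport: SessionTransport): void {
    this.transports[transport.sessionId] = transport;
  }

  // Call when the SSE connection closes, so stale sessions are not routed to.
  unregister(sessionId: string): void {
    delete this.transports[sessionId];
  }

  // Route a POST /messages body to the transport that owns the session.
  // Returns an HTTP-style status code (illustrative values, see lead-in).
  route(sessionId: string | undefined, rawBody: string): number {
    if (sessionId === undefined || sessionId.length === 0) return 400;
    const transport = this.transports[sessionId];
    if (transport === undefined) return 404;
    transport.handleMessage(rawBody);
    return 202;
  }
}
```

In a real server, `register` would run in the SSE connection handler, `unregister` in its close handler, and `route` in the POST /messages handler with the session ID taken from the request's query string.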

Documents the agent-browser MCP server which provides Playwright-backed browser automation for LibreChat agents via the Vercel agent-browser library.

Key topics covered:
- Why @ref accessibility snapshots beat raw CSS selectors for LLM agents
- Tool reference table (navigate, snapshot, click, fill, get_text, etc.)
- Docker Compose and build-from-source setup
- librechat.yaml mcpServers configuration
- Critical: why express.json() must NOT be used with MCP SSE transport
- Session management and SSEServerTransport routing pattern
- Zod-based tool registration pattern

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds a new documentation page describing how to run and configure an “agent-browser” MCP server for LibreChat, focusing on accessibility snapshot @ref usage and SSE transport patterns.

Changes:

Adds docs/docs/configuration/tools/agent-browser.mdx with setup instructions (Docker Compose / build-from-source) and a tool reference.
Documents an SSE session management pattern and an Express middleware caveat for SSEServerTransport.
Provides example librechat.yaml configuration for registering the MCP server.
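
The Express middleware caveat comes down to a request body being readable only once. The sketch below models that with plain TypeScript rather than Express or the MCP SDK: `OneShotBody`, `jsonBodyParser`, and `transportHandlePost` are hypothetical stand-ins, and the 400 and 202 status codes are assumptions that mirror the behavior the PR describes.

```typescript
// Simplified stand-in for an HTTP request body: its contents can be read once.
class OneShotBody {
  private payload: string | null;

  constructor(payload: string) {
    this.payload = payload;
  }

  read(): string {
    const data = this.payload === null ? "" : this.payload;
    this.payload = null; // the body is now drained
    return data;
  }
}

// Models a JSON body parser middleware: drains the body and returns the parsed value.
function jsonBodyParser(body: OneShotBody): unknown {
  const raw = body.read();
  return raw.length > 0 ? JSON.parse(raw) : undefined;
}

// Models a transport that reads the raw body itself and rejects an empty one.
function transportHandlePost(body: OneShotBody): number {
  const raw = body.read();
  if (raw.length === 0) return 400; // nothing left to read
  JSON.parse(raw);
  return 202;
}

// Parser runs first, so the transport finds the body already drained.
function postWithParser(payload: string): number {
  const body = new OneShotBody(payload);
  jsonBodyParser(body);
  return transportHandlePost(body);
}

// No parser in front, so the transport reads the body itself.
function postWithoutParser(payload: string): number {
  return transportHandlePost(new OneShotBody(payload));
}
```

With a parser in front, the modeled transport sees an empty body and rejects the request; without it, the same payload is accepted.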

Copilot · 2026-03-08T21:45:47Z