docs: add agent-browser MCP server guide (Playwright + @ref accessibility snapshots) #12141
willtwilson wants to merge 2 commits into danny-avila:main
Documents the agent-browser MCP server which provides Playwright-backed browser automation for LibreChat agents via the Vercel agent-browser library.

Key topics covered:
- Why @ref accessibility snapshots beat raw CSS selectors for LLM agents
- Tool reference table (navigate, snapshot, click, fill, get_text, etc.)
- Docker Compose and build-from-source setup
- librechat.yaml mcpServers configuration
- Critical: why express.json() must NOT be used with MCP SSE transport
- Session management and SSEServerTransport routing pattern
- Zod-based tool registration pattern

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
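The session-management item above can be illustrated without any MCP dependency. The following is a minimal, library-free sketch of the routing idea only: each SSE connection is stored under its session id, and a POST to /messages is forwarded, with its body left unparsed, to the matching connection. `SessionTransport`, `FakeTransport`, `SessionRegistry`, and the status codes are hypothetical names and values chosen for illustration; they are not the SDK's API and not the code in this PR.

```typescript
// Hypothetical stand-in for an SSE transport; only the sessionId-keyed lookup matters here.
interface SessionTransport {
  readonly sessionId: string;
  // Receives the raw, unparsed request body for this session.
  handleMessage(rawBody: string): void;
}

class FakeTransport implements SessionTransport {
  readonly received: string[] = [];
  constructor(readonly sessionId: string) {}
  handleMessage(rawBody: string): void {
    this.received.push(rawBody);
  }
}

class SessionRegistry {
  private readonly sessions = new Map<string, SessionTransport>();

  // Would be called when a client opens the SSE stream.
  register(transport: SessionTransport): void {
    this.sessions.set(transport.sessionId, transport);
  }

  // Would be called when the SSE connection closes.
  unregister(sessionId: string): void {
    this.sessions.delete(sessionId);
  }

  // Would be called for POST /messages?sessionId=...; returns an HTTP-style status (assumed values).
  route(sessionId: string | undefined, rawBody: string): number {
    if (!sessionId) return 400;
    const transport = this.sessions.get(sessionId);
    if (!transport) return 404;
    transport.handleMessage(rawBody); // body is forwarded untouched
    return 202;
  }
}
```

Passing the body through as a raw string mirrors the PR's point that a body-parsing middleware should not consume the request before the transport sees it.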
Pull request overview
Adds a new documentation page describing how to run and configure an “agent-browser” MCP server for LibreChat, focusing on accessibility snapshot @ref usage and SSE transport patterns.
Changes:
- Adds docs/docs/configuration/tools/agent-browser.mdx with setup instructions (Docker Compose / build-from-source) and a tool reference.
- Documents an SSE session management pattern and an Express middleware caveat for SSEServerTransport.
- Provides example librechat.yaml configuration for registering the MCP server.
    ## Related

    - [MCP Server configuration reference](/docs/configuration/librechat_yaml/object_structure/mcp_servers)
This internal docs link appears to be broken in the current repo structure (no docs/docs/configuration/librechat_yaml/... pages exist). Update it to a valid local docs path or an external URL that exists (e.g., the configuration guide referenced in librechat.example.yaml).
    - - [MCP Server configuration reference](/docs/configuration/librechat_yaml/object_structure/mcp_servers)
    + - [MCP Server configuration reference](https://www.librechat.ai/docs/configuration/librechat_yaml/object_structure/mcp_servers)
Fixed in 0dc05d9 — replaced the broken internal path with the full librechat.ai external URL: https://www.librechat.ai/docs/configuration/librechat_yaml/object_structure/mcp_servers
| | Tool | Description | | ||
| |------|-------------| | ||
| | `navigate` | Navigate to a URL; returns the page title | |
The markdown table header uses a double leading pipe (|| ...), which will render an extra empty column in standard Markdown/Nextra. Use a single leading pipe on each row (e.g., | Tool | Description |).
Fixed in 0dc05d9 — double leading pipe removed from all three tables in the file.
    <Tabs items={['Docker Compose', 'Build from source']}>
    <Tabs.Tab>
    Add to your `docker-compose.override.yml`:
This refers to docker-compose.override.yml, but the repo’s Docker guidance/file naming uses docker-compose.override.yaml (see docker-compose.yml header). Update the filename here to match the documented convention.
    - Add to your `docker-compose.override.yml`:
    + Add to your `docker-compose.override.yaml`:
Fixed in 0dc05d9 — updated to docker-compose.override.yaml to match LibreChat's documented convention. Good spot.
    build:
      context: ./packages/mcp-servers/agent-browser
    environment:
The Docker build context points to ./packages/mcp-servers/agent-browser, but there is no packages/mcp-servers directory in this repo. Either add the referenced server package or update the path/instructions to the actual location for the agent-browser MCP server.
Fixed in 0dc05d9 — this is the most substantive change in the fix commit. The PR now includes the actual packages/mcp-servers/agent-browser/ package (server.ts, Dockerfile, package.json, tsconfig.json, README.md), so the Docker build context and source paths in the docs are accurate. The package is a generalised version with all homelab-specific config removed, optional Perplexica integration behind an env flag, and SSRF validation on the navigate tool.
    git clone https://github.com/danny-avila/LibreChat
    cd LibreChat/packages/mcp-servers/agent-browser
These build-from-source steps cd into packages/mcp-servers/agent-browser, but that path does not exist in this repository. Update the path/commands to match the actual location and build process, or include the missing package in the PR.
Fixed in 0dc05d9 — same root cause as the build context issue above. Now that the package actually exists at packages/mcp-servers/agent-browser/, all build-from-source steps are accurate.
    autoApprove:
      - navigate
      - snapshot
      - click
      - fill
      - get_text
      - press_key
      - screenshot
      - get_url
      - close_browser
autoApprove is not a recognized key in LibreChat’s mcpServers YAML schema (MCP options include fields like type, url, headers, timeout, etc.). As written, this config will fail validation or be ignored; remove autoApprove or replace it with the supported configuration mechanism for tool approvals.
    - autoApprove:
    -   - navigate
    -   - snapshot
    -   - click
    -   - fill
    -   - get_text
    -   - press_key
    -   - screenshot
    -   - get_url
    -   - close_browser
Removed in 0dc05d9. You're right — autoApprove is not a valid key in LibreChat's mcpServers YAML schema and would be silently ignored. The concept of tool pre-approval belongs in the LibreChat UI agent configuration, not in the librechat.yaml MCP server block. The misleading config line has been removed entirely.
    mcpServers:
      agent-browser:
        type: sse
        url: http://agent-browser-mcp:8932/sse
        # Adjust the URL for local/non-Docker setups:
Using an internal Docker hostname like http://agent-browser-mcp:8932/sse will typically be blocked by LibreChat’s MCP SSRF protections unless it’s explicitly allowlisted via mcpSettings.allowedDomains. Add a note/example showing the required mcpSettings configuration for Docker setups.
    - mcpServers:
    -   agent-browser:
    -     type: sse
    -     url: http://agent-browser-mcp:8932/sse
    -     # Adjust the URL for local/non-Docker setups:
    + # Allow the MCP client to call your agent-browser MCP server
    + mcpSettings:
    +   allowedDomains:
    +     # Docker internal hostname (as used in the URL below)
    +     - http://agent-browser-mcp:8932
    +     # Local/non-Docker setups
    +     - http://localhost:8932
    + mcpServers:
    +   agent-browser:
    +     type: sse
    +     url: http://agent-browser-mcp:8932/sse
    +     # For local/non-Docker setups, use:
Great addition — incorporated in 0dc05d9. Added the mcpSettings.allowedDomains block above the mcpServers config example, covering both the Docker internal hostname (agent-browser-mcp:8932) and the local dev case (localhost:8932). Without this, LibreChat's SSRF protection would silently block all MCP connections to internal Docker services.
| | Variable | Default | Description | | ||
| |----------|---------|-------------| | ||
| | `PORT` | `8932` | HTTP port the MCP server listens on | |
The environment variables table header also uses a double leading pipe (|| ...), which will render incorrectly. Use standard Markdown table syntax with a single leading pipe per row.
Fixed in 0dc05d9 — same double-pipe issue in the env vars table. All tables verified clean.
    server.tool(
      "navigate",
      "Navigate the browser to a URL. Returns the page title.",
      { url: z.string().describe("Full URL including https://") },
      async ({ url }) => {
        // ... call agent-browser BrowserManager
The navigate tool takes a url string from the MCP client and passes it directly to the browser/agent-browser without any hostname or scheme restrictions. In a typical LibreChat setup this URL is ultimately controlled by an untrusted chat user, which enables SSRF: a remote user can have the agent browse internal or link-local services (e.g. private VPC hosts, 127.0.0.1) and read the responses back through the chat. To mitigate this, validate and restrict url (e.g., disallow private IP ranges and internal hostnames or enforce an allowlist) and ensure this server is not exposed to untrusted clients without additional network or authentication controls.
    - server.tool(
    -   "navigate",
    -   "Navigate the browser to a URL. Returns the page title.",
    -   { url: z.string().describe("Full URL including https://") },
    -   async ({ url }) => {
    -     // ... call agent-browser BrowserManager
    + const isPrivateHostname = (hostname: string): boolean => {
    +   const lower = hostname.toLowerCase();
    +   return (
    +     lower === "localhost" ||
    +     lower === "ip6-localhost" ||
    +     lower.endsWith(".local") ||
    +     lower.endsWith(".internal")
    +   );
    + };
    + const isPrivateIp = (hostname: string): boolean => {
    +   // Basic checks for common private/loopback/link-local IPv4 and IPv6 literals.
    +   if (/^\d{1,3}(\.\d{1,3}){3}$/.test(hostname)) {
    +     const parts = hostname.split(".").map(Number);
    +     const [a, b] = parts;
    +     if (a === 10) return true; // 10.0.0.0/8
    +     if (a === 127) return true; // 127.0.0.0/8 (loopback)
    +     if (a === 169 && b === 254) return true; // 169.254.0.0/16 (link-local)
    +     if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
    +     if (a === 192 && b === 168) return true; // 192.168.0.0/16
    +   }
    +   if (hostname === "::1") {
    +     return true; // IPv6 loopback
    +   }
    +   return false;
    + };
    + const isAllowedUrl = (value: string): boolean => {
    +   try {
    +     const parsed = new URL(value);
    +     if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    +       return false;
    +     }
    +     const hostname = parsed.hostname;
    +     if (isPrivateHostname(hostname) || isPrivateIp(hostname)) {
    +       return false;
    +     }
    +     return true;
    +   } catch {
    +     return false;
    +   }
    + };
    + server.tool(
    +   "navigate",
    +   "Navigate the browser to a URL. Returns the page title.",
    +   {
    +     url: z
    +       .string()
    +       .url()
    +       .refine(isAllowedUrl, {
    +         message:
    +           "URL must use http/https and must not point to private, loopback, or link-local addresses.",
    +       })
    +       .describe("Full URL including https://"),
    +   },
    +   async ({ url }) => {
    +     // ... call agent-browser BrowserManager with a validated, external URL
Implemented in 0dc05d9 — this is an important security fix and I've applied the suggested isAllowedUrl() implementation in full. The navigate tool now rejects requests to private IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, ::1) and internal hostnames (.local, .internal, localhost). Added a docs note explaining this protection and noting that homelab users who intentionally need internal navigation can fork and remove the check. The SSRF validation has also been applied to the actual server.ts in the new packages/mcp-servers/agent-browser/ package.
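One detail worth checking in a literal-based filter like the one suggested above: the WHATWG `URL` API returns IPv6 hostnames with their square brackets (for example `[::1]`), so a comparison against the bare string `::1` does not match. The sketch below shows one way such a check could be tightened, under stated assumptions: all function names are hypothetical and not taken from the PR's server.ts, the range list is illustrative rather than exhaustive, and it still inspects only the literal hostname, so a public DNS name that resolves to a private address is not caught.

```typescript
// Sketch only: names are hypothetical, not the PR's implementation.

// URL.hostname keeps the brackets around IPv6 literals, so strip them first.
function bareHost(hostname: string): string {
  const lower = hostname.toLowerCase();
  return lower.startsWith("[") && lower.endsWith("]") ? lower.slice(1, -1) : lower;
}

function isBlockedIpv4(host: string): boolean {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(host)) return false;
  const [a, b] = host.split(".").map(Number);
  return (
    a === 0 || // 0.0.0.0/8
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

function isBlockedIpv6(host: string): boolean {
  if (!host.includes(":")) return false;
  return (
    host === "::1" || // loopback
    host === "::" || // unspecified
    host.startsWith("::ffff:") || // IPv4-mapped, blocked wholesale here
    /^f[cd][0-9a-f]{2}:/.test(host) || // fc00::/7 unique local
    /^fe[89ab][0-9a-f]:/.test(host) // fe80::/10 link-local
  );
}

// Returns true when the URL should be rejected; unparsable input is rejected too.
function isBlockedTarget(value: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(value);
  } catch {
    return true;
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return true;
  const host = bareHost(parsed.hostname);
  return (
    host === "localhost" ||
    host.endsWith(".local") ||
    host.endsWith(".internal") ||
    isBlockedIpv4(host) ||
    isBlockedIpv6(host)
  );
}
```

A check of this shape could back the same Zod `.refine()` call by negating its result. Resolving the hostname and validating the resulting addresses would be a separate step that this sketch does not attempt.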
    ports:
      - "8932:8932"
The docker-compose example exposes the MCP server on host port 8932 without any guidance to restrict access or add authentication. If copied as-is to production, the unauthenticated /sse and /messages endpoints can be reached from the host network, allowing anyone who can reach this port to invoke tools like navigate, click, and fill and drive a headless browser against internal sites. Recommend documenting that this service should normally be kept on the internal Docker network (omitting ports) or protected via firewalling/mTLS/authentication before being exposed externally.
    - ports:
    -   - "8932:8932"
    + # Recommended: keep this service on the internal Docker network only (no host port mapping).
    + # For local development **only**, you may expose it to localhost by uncommenting:
    + # ports:
    + #   - "127.0.0.1:8932:8932"
    + # Do **not** expose this port publicly without putting it behind strict network controls
    + # (e.g. firewall rules, mTLS, or an authenticated reverse proxy).
Addressed in 0dc05d9 — the port mapping is now commented out by default with a clear note that the service should remain on the internal Docker network. The comment explains that for local development only, users can uncomment 127.0.0.1:8932:8932, and that the port should never be exposed publicly without authentication/firewall controls. This is especially important given the navigate tool's ability to drive a headless browser.
Create packages/mcp-servers/agent-browser/ with:
- Generalised server.ts (no homelab-specific config)
- SSRF validation on navigate tool
- Optional Perplexica integration (env var toggle)
- Multi-stage Dockerfile with non-root user
- Updated docs: security warnings, correct config schema

Address review feedback:
- Fix SSRF vulnerability on navigate tool
- Remove autoApprove (not in mcpServers schema)
- Add mcpSettings.allowedDomains
- Fix broken docs links and file extensions
- Fix double-pipe table formatting
- Add Docker port exposure security guidance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Adds a documentation guide for the agent-browser MCP server, a self-hostable browser automation server for LibreChat agents, powered by Vercel's agent-browser library.

Why agent-browser instead of raw Playwright/Puppeteer?
Raw Playwright wrappers expose CSS selectors and XPath to the model. These break when a SPA re-renders, require the model to infer element identity from unstructured HTML, and are fragile across site deployments.
Vercel's agent-browser library solves this by producing accessibility tree snapshots with stable @ref identifiers. The LLM passes a ref such as @e7 directly to fill or click: no selector guessing, no XPath, no brittle DOM traversal. This makes agent-browser substantially more reliable than Puppeteer-based MCP servers for LLM-driven web automation.

What this PR adds
docs/docs/configuration/tools/agent-browser.mdx covering:
- navigate, snapshot, click, fill, get_text, press_key, screenshot, get_url, close_browser
- librechat.yaml configuration: mcpServers block with autoApprove list
- express.json() must NOT be used alongside SSEServerTransport (consuming the request stream causes HTTP 400 on every initialize call)
- SSEServerTransport.sessionId routes POST /messages to the correct SSE connection
- McpServer fluent API pattern

Related
- agent-browser npm: https://www.npmjs.com/package/agent-browser