docs: add agent-browser MCP server guide (Playwright + @ref accessibility snapshots) #12141

Open
willtwilson wants to merge 2 commits into danny-avila:main from willtwilson:docs/mcp-agent-browser-guide

Conversation

@willtwilson

Summary

Adds a documentation guide for the agent-browser MCP server — a self-hostable browser automation server for LibreChat agents, powered by Vercel's agent-browser library.

Why agent-browser instead of raw Playwright/Puppeteer?

Raw Playwright wrappers expose CSS selectors and XPath to the model. These break when a SPA re-renders, require the model to infer element identity from unstructured HTML, and are fragile across site deployments.

Vercel's agent-browser library solves this by producing accessibility tree snapshots with stable @ref identifiers:

button [@e3] "Sign in"
input  [@e7] placeholder="Email address"

The LLM passes @e7 directly to fill or click — no selector guessing, no XPath, no brittle DOM traversal. This makes agent-browser substantially more reliable than Puppeteer-based MCP servers for LLM-driven web automation.
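To make the @ref idea concrete, here is a minimal sketch of how a client could look up a ref from a snapshot. It assumes snapshot lines look exactly like the two examples above; the real agent-browser output may carry more fields, and `parseSnapshotLine` and `findRef` are hypothetical helpers, not part of the library.

```typescript
// Hypothetical parser for snapshot lines shaped like the two examples above.
// Only `role [@ref] "label"` and `role [@ref] key="value"` are handled.
interface SnapshotNode {
  role: string;
  ref: string; // e.g. "@e7"
  label: string; // quoted text or attribute value, "" if absent
}

const LINE_PATTERN = /^\s*(\w+)\s+\[(@e\d+)\]\s*(?:\w+=)?(?:"([^"]*)")?\s*$/;

function parseSnapshotLine(line: string): SnapshotNode | null {
  const match = LINE_PATTERN.exec(line);
  if (!match) return null;
  return {
    role: match[1] ?? "",
    ref: match[2] ?? "",
    label: match[3] ?? "",
  };
}

// Returns the ref of the first node with the given role whose label
// contains labelPart, or null when nothing matches.
function findRef(snapshot: string, role: string, labelPart: string): string | null {
  for (const line of snapshot.split("\n")) {
    const node = parseSnapshotLine(line);
    if (node && node.role === role && node.label.includes(labelPart)) {
      return node.ref;
    }
  }
  return null;
}

const snapshot = [
  'button [@e3] "Sign in"',
  'input  [@e7] placeholder="Email address"',
].join("\n");

console.log(findRef(snapshot, "input", "Email")); // prints @e7
```

The value returned here is what the model would pass to fill or click, with no selector involved.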

What this PR adds

docs/docs/configuration/tools/agent-browser.mdx covering:

  • @ref system explained: why accessibility snapshots outperform CSS selectors for AI agents
  • Tool reference table: all 9 tools (navigate, snapshot, click, fill, get_text, press_key, screenshot, get_url, close_browser)
  • Setup: Docker Compose and build-from-source tabs
  • librechat.yaml configuration: mcpServers block with autoApprove list
  • Critical implementation note: express.json() must NOT be used alongside SSEServerTransport (consuming the request stream causes HTTP 400 on every initialize call)
  • Session management pattern: how SSEServerTransport.sessionId routes POST /messages to the correct SSE connection
  • Zod-based tool registration: McpServer fluent API pattern
  • Typical agent workflow: navigate → snapshot → fill → press_key → get_text sequence
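The typical workflow in the last bullet can be sketched as a typed list of tool calls. The tool names come from the list above; every argument name other than `url` is an assumption made for this sketch, as is the idea that get_text takes no ref. `firstStaleRefStep` is a hypothetical helper that checks one ordering rule: a ref should only be used after a snapshot of the current page.

```typescript
// Tool names are from the PR description; argument shapes (other than
// navigate's `url`) are assumptions for illustration only.
type ToolCall =
  | { tool: "navigate"; args: { url: string } }
  | { tool: "snapshot"; args: Record<string, never> }
  | { tool: "fill"; args: { ref: string; text: string } }
  | { tool: "click"; args: { ref: string } }
  | { tool: "press_key"; args: { key: string } }
  | { tool: "get_text"; args: Record<string, never> };

// Returns the index of the first step that uses a ref before a snapshot has
// been taken on the current page, or -1 if the plan is well ordered.
function firstStaleRefStep(plan: ToolCall[]): number {
  let haveSnapshot = false;
  let index = 0;
  for (const step of plan) {
    if (step.tool === "navigate") {
      haveSnapshot = false; // assumed: refs from the previous page no longer apply
    } else if (step.tool === "snapshot") {
      haveSnapshot = true;
    } else if ("ref" in step.args && !haveSnapshot) {
      return index;
    }
    index++;
  }
  return -1;
}

const searchPlan: ToolCall[] = [
  { tool: "navigate", args: { url: "https://example.com/search" } },
  { tool: "snapshot", args: {} },
  { tool: "fill", args: { ref: "@e7", text: "accessibility tree" } },
  { tool: "press_key", args: { key: "Enter" } },
  { tool: "get_text", args: {} },
];

console.log(firstStaleRefStep(searchPlan)); // prints -1
```

A plan that clicks right after navigate, without a snapshot in between, would be flagged at the click step.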

Related

Documents the agent-browser MCP server which provides Playwright-backed
browser automation for LibreChat agents via the Vercel agent-browser library.

Key topics covered:
- Why @ref accessibility snapshots beat raw CSS selectors for LLM agents
- Tool reference table (navigate, snapshot, click, fill, get_text, etc.)
- Docker Compose and build-from-source setup
- librechat.yaml mcpServers configuration
- Critical: why express.json() must NOT be used with MCP SSE transport
- Session management and SSEServerTransport routing pattern
- Zod-based tool registration pattern

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 8, 2026 21:40
Contributor

Copilot AI left a comment

Pull request overview

Adds a new documentation page describing how to run and configure an “agent-browser” MCP server for LibreChat, focusing on accessibility snapshot @ref usage and SSE transport patterns.

Changes:

  • Adds docs/docs/configuration/tools/agent-browser.mdx with setup instructions (Docker Compose / build-from-source) and a tool reference.
  • Documents an SSE session management pattern and an Express middleware caveat for SSEServerTransport.
  • Provides example librechat.yaml configuration for registering the MCP server.
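The SSE session management pattern mentioned above can be sketched without the SDK. This is a self-contained stand-in, not the MCP SDK's API: `SessionTransport`, `SessionRegistry`, `handleMessage`, and the status codes are all hypothetical. The only detail taken from the PR is that each transport has a `sessionId` used to route POST /messages to the matching SSE connection.

```typescript
// Minimal stand-in for the part of a transport this sketch needs.
// Everything except the sessionId property is hypothetical.
interface SessionTransport {
  sessionId: string;
  handleMessage(rawBody: string): void;
}

class SessionRegistry {
  private readonly sessions = new Map<string, SessionTransport>();

  // Called when a client opens the SSE connection.
  add(transport: SessionTransport): void {
    this.sessions.set(transport.sessionId, transport);
  }

  // Called when the SSE connection closes.
  remove(sessionId: string): void {
    this.sessions.delete(sessionId);
  }

  // Called for each POST /messages request; returns an HTTP-style status.
  // The body is handed over unparsed, in line with the express.json() caveat.
  route(sessionId: string | null, rawBody: string): number {
    if (!sessionId) return 400; // no session id supplied
    const transport = this.sessions.get(sessionId);
    if (!transport) return 404; // unknown or already closed session
    transport.handleMessage(rawBody);
    return 202;
  }
}

// Usage with a fake transport that just records what it receives.
const received: string[] = [];
const registry = new SessionRegistry();
registry.add({ sessionId: "abc", handleMessage: (body) => received.push(body) });

console.log(registry.route("abc", '{"method":"initialize"}')); // prints 202
console.log(registry.route("missing", "{}")); // prints 404
```

Removing the entry on connection close matters here, since a stale entry would otherwise route messages to a dead stream.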

## Related

- [MCP Server configuration reference](/docs/configuration/librechat_yaml/object_structure/mcp_servers)
Copilot AI Mar 8, 2026

This internal docs link appears to be broken in the current repo structure (no docs/docs/configuration/librechat_yaml/... pages exist). Update it to a valid local docs path or an external URL that exists (e.g., the configuration guide referenced in librechat.example.yaml).

Suggested change
- [MCP Server configuration reference](/docs/configuration/librechat_yaml/object_structure/mcp_servers)
- [MCP Server configuration reference](https://www.librechat.ai/docs/configuration/librechat_yaml/object_structure/mcp_servers)

Author

Fixed in 0dc05d9 — replaced the broken internal path with the full librechat.ai external URL: https://www.librechat.ai/docs/configuration/librechat_yaml/object_structure/mcp_servers

Comment on lines +31 to +33
| Tool | Description |
|------|-------------|
| `navigate` | Navigate to a URL; returns the page title |
Copilot AI Mar 8, 2026

The markdown table header uses a double leading pipe (|| ...), which will render an extra empty column in standard Markdown/Nextra. Use a single leading pipe on each row (e.g., | Tool | Description |).

Author

Fixed in 0dc05d9 — double leading pipe removed from all three tables in the file.


<Tabs items={['Docker Compose', 'Build from source']}>
<Tabs.Tab>
Add to your `docker-compose.override.yml`:
Copilot AI Mar 8, 2026

This refers to docker-compose.override.yml, but the repo’s Docker guidance/file naming uses docker-compose.override.yaml (see docker-compose.yml header). Update the filename here to match the documented convention.

Suggested change
Add to your `docker-compose.override.yml`:
Add to your `docker-compose.override.yaml`:

Author

Fixed in 0dc05d9 — updated to docker-compose.override.yaml to match LibreChat's documented convention. Good spot.

Comment on lines +61 to +63
build:
  context: ./packages/mcp-servers/agent-browser
environment:
Copilot AI Mar 8, 2026

The Docker build context points to ./packages/mcp-servers/agent-browser, but there is no packages/mcp-servers directory in this repo. Either add the referenced server package or update the path/instructions to the actual location for the agent-browser MCP server.

Author

Fixed in 0dc05d9 — this is the most substantive change in the fix commit. The PR now includes the actual packages/mcp-servers/agent-browser/ package (server.ts, Dockerfile, package.json, tsconfig.json, README.md), so the Docker build context and source paths in the docs are accurate. The package is a generalised version with all homelab-specific config removed, optional Perplexica integration behind an env flag, and SSRF validation on the navigate tool.

Comment on lines +75 to +77
git clone https://github.com/danny-avila/LibreChat
cd LibreChat/packages/mcp-servers/agent-browser

Copilot AI Mar 8, 2026

These build-from-source steps cd into packages/mcp-servers/agent-browser, but that path does not exist in this repository. Update the path/commands to match the actual location and build process, or include the missing package in the PR.

Author

Fixed in 0dc05d9 — same root cause as the build context issue above. Now that the package actually exists at packages/mcp-servers/agent-browser/, all build-from-source steps are accurate.

Comment on lines +100 to +109
autoApprove:
- navigate
- snapshot
- click
- fill
- get_text
- press_key
- screenshot
- get_url
- close_browser
Copilot AI Mar 8, 2026

autoApprove is not a recognized key in LibreChat’s mcpServers YAML schema (MCP options include fields like type, url, headers, timeout, etc.). As written, this config will fail validation or be ignored; remove autoApprove or replace it with the supported configuration mechanism for tool approvals.

Suggested change
autoApprove:
- navigate
- snapshot
- click
- fill
- get_text
- press_key
- screenshot
- get_url
- close_browser

Author

Removed in 0dc05d9. You're right — autoApprove is not a valid key in LibreChat's mcpServers YAML schema and would be silently ignored. The concept of tool pre-approval belongs in the LibreChat UI agent configuration, not in the librechat.yaml MCP server block. The misleading config line has been removed entirely.

Comment on lines +94 to +98
mcpServers:
  agent-browser:
    type: sse
    url: http://agent-browser-mcp:8932/sse
    # Adjust the URL for local/non-Docker setups:
Copilot AI Mar 8, 2026

Using an internal Docker hostname like http://agent-browser-mcp:8932/sse will typically be blocked by LibreChat’s MCP SSRF protections unless it’s explicitly allowlisted via mcpSettings.allowedDomains. Add a note/example showing the required mcpSettings configuration for Docker setups.

Suggested change
mcpServers:
  agent-browser:
    type: sse
    url: http://agent-browser-mcp:8932/sse
    # Adjust the URL for local/non-Docker setups:

# Allow the MCP client to call your agent-browser MCP server
mcpSettings:
  allowedDomains:
    # Docker internal hostname (as used in the URL below)
    - http://agent-browser-mcp:8932
    # Local/non-Docker setups
    - http://localhost:8932
mcpServers:
  agent-browser:
    type: sse
    url: http://agent-browser-mcp:8932/sse
    # For local/non-Docker setups, use:

Author

Great addition — incorporated in 0dc05d9. Added the mcpSettings.allowedDomains block above the mcpServers config example, covering both the Docker internal hostname (agent-browser-mcp:8932) and the local dev case (localhost:8932). Without this, LibreChat's SSRF protection would silently block all MCP connections to internal Docker services.

Comment on lines +116 to +118
| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `8932` | HTTP port the MCP server listens on |
Copilot AI Mar 8, 2026

The environment variables table header also uses a double leading pipe (|| ...), which will render incorrectly. Use standard Markdown table syntax with a single leading pipe per row.

Author

Fixed in 0dc05d9 — same double-pipe issue in the env vars table. All tables verified clean.

Comment on lines +171 to +176
server.tool(
  "navigate",
  "Navigate the browser to a URL. Returns the page title.",
  { url: z.string().describe("Full URL including https://") },
  async ({ url }) => {
    // ... call agent-browser BrowserManager
Copilot AI Mar 8, 2026

The navigate tool takes a url string from the MCP client and passes it directly to the browser/agent-browser without any hostname or scheme restrictions. In a typical LibreChat setup this URL is ultimately controlled by an untrusted chat user, which enables SSRF: a remote user can have the agent browse internal or link-local services (e.g. private VPC hosts, 127.0.0.1) and read the responses back through the chat. To mitigate this, validate and restrict url (e.g., disallow private IP ranges and internal hostnames or enforce an allowlist) and ensure this server is not exposed to untrusted clients without additional network or authentication controls.

Suggested change
server.tool(
  "navigate",
  "Navigate the browser to a URL. Returns the page title.",
  { url: z.string().describe("Full URL including https://") },
  async ({ url }) => {
    // ... call agent-browser BrowserManager

const isPrivateHostname = (hostname: string): boolean => {
  const lower = hostname.toLowerCase();
  return (
    lower === "localhost" ||
    lower === "ip6-localhost" ||
    lower.endsWith(".local") ||
    lower.endsWith(".internal")
  );
};
const isPrivateIp = (hostname: string): boolean => {
  // Basic checks for common private/loopback/link-local IPv4 and IPv6 literals.
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(hostname)) {
    const parts = hostname.split(".").map(Number);
    const [a, b] = parts;
    if (a === 10) return true; // 10.0.0.0/8
    if (a === 127) return true; // 127.0.0.0/8 (loopback)
    if (a === 169 && b === 254) return true; // 169.254.0.0/16 (link-local)
    if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
    if (a === 192 && b === 168) return true; // 192.168.0.0/16
  }
  if (hostname === "::1") {
    return true; // IPv6 loopback
  }
  return false;
};
const isAllowedUrl = (value: string): boolean => {
  try {
    const parsed = new URL(value);
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return false;
    }
    const hostname = parsed.hostname;
    if (isPrivateHostname(hostname) || isPrivateIp(hostname)) {
      return false;
    }
    return true;
  } catch {
    return false;
  }
};
server.tool(
  "navigate",
  "Navigate the browser to a URL. Returns the page title.",
  {
    url: z
      .string()
      .url()
      .refine(isAllowedUrl, {
        message:
          "URL must use http/https and must not point to private, loopback, or link-local addresses.",
      })
      .describe("Full URL including https://"),
  },
  async ({ url }) => {
    // ... call agent-browser BrowserManager with a validated, external URL

Author

Implemented in 0dc05d9 — this is an important security fix and I've applied the suggested isAllowedUrl() implementation in full. The navigate tool now rejects requests to private IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, ::1) and internal hostnames (.local, .internal, localhost). Added a docs note explaining this protection and noting that homelab users who intentionally need internal navigation can fork and remove the check. The SSRF validation has also been applied to the actual server.ts in the new packages/mcp-servers/agent-browser/ package.
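One detail in this validation may be worth a second look. The WHATWG `URL` parser keeps the square brackets on IPv6 hostnames, so for `http://[::1]/` the `hostname` property is `[::1]`, and a comparison against the bare string `"::1"` would not match. The sketch below shows one way to normalize the hostname first. `bareHostname` and `isLoopbackLiteral` are hypothetical names, and the check covers loopback literals only, not the full private-range list.

```typescript
// Strips the square brackets that the URL parser keeps around IPv6 literals,
// so the result can be compared against plain address strings.
function bareHostname(value: string): string {
  const { hostname } = new URL(value);
  return hostname.startsWith("[") && hostname.endsWith("]")
    ? hostname.slice(1, -1)
    : hostname;
}

// Loopback check on the normalized hostname. This is a narrow illustration,
// not a replacement for the private-range checks in the suggestion.
function isLoopbackLiteral(value: string): boolean {
  const host = bareHostname(value).toLowerCase();
  return (
    host === "::1" ||
    host === "localhost" ||
    /^127(\.\d{1,3}){3}$/.test(host)
  );
}

console.log(bareHostname("http://[::1]:8932/sse")); // prints ::1
console.log(isLoopbackLiteral("https://example.com/")); // prints false
```

String checks like these only see the literal in the URL. A public DNS name that resolves to a private address would still pass, so network-level controls remain relevant.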

Comment on lines +67 to +68
ports:
- "8932:8932"
Copilot AI Mar 8, 2026

The docker-compose example exposes the MCP server on host port 8932 without any guidance to restrict access or add authentication. If copied as-is to production, the unauthenticated /sse and /messages endpoints can be reached from the host network, allowing anyone who can reach this port to invoke tools like navigate, click, and fill and drive a headless browser against internal sites. Recommend documenting that this service should normally be kept on the internal Docker network (omitting ports) or protected via firewalling/mTLS/authentication before being exposed externally.

Suggested change
ports:
- "8932:8932"
# Recommended: keep this service on the internal Docker network only (no host port mapping).
# For local development **only**, you may expose it to localhost by uncommenting:
# ports:
# - "127.0.0.1:8932:8932"
# Do **not** expose this port publicly without putting it behind strict network controls
# (e.g. firewall rules, mTLS, or an authenticated reverse proxy).

Author

Addressed in 0dc05d9 — the port mapping is now commented out by default with a clear note that the service should remain on the internal Docker network. The comment explains that for local development only, users can uncomment 127.0.0.1:8932:8932, and that the port should never be exposed publicly without authentication/firewall controls. This is especially important given the navigate tool's ability to drive a headless browser.

Create packages/mcp-servers/agent-browser/ with:
- Generalised server.ts (no homelab-specific config)
- SSRF validation on navigate tool
- Optional Perplexica integration (env var toggle)
- Multi-stage Dockerfile with non-root user
- Updated docs: security warnings, correct config schema

Address review feedback:
- Fix SSRF vulnerability on navigate tool
- Remove autoApprove (not in mcpServers schema)
- Add mcpSettings.allowedDomains
- Fix broken docs links and file extensions
- Fix double-pipe table formatting
- Add Docker port exposure security guidance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>