Skip to content

Python: Added AgentMiddleware to Copilot and Claude agents#4601

Open
dmytrostruk wants to merge 6 commits intomicrosoft:mainfrom
dmytrostruk:copilot-claude-middleware
Open

Python: Added AgentMiddleware to Copilot and Claude agents#4601
dmytrostruk wants to merge 6 commits intomicrosoft:mainfrom
dmytrostruk:copilot-claude-middleware

Conversation

@dmytrostruk
Copy link
Member

Motivation and Context

Added AgentMiddleware support to Copilot and Claude agents.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

@dmytrostruk dmytrostruk self-assigned this Mar 10, 2026
Copilot AI review requested due to automatic review settings March 10, 2026 19:05
@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Mar 10, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
TOTAL22700260188% 
report-only-changed-files is enabled. No files were changed during this commit :)

Python Unit Test Overview

Tests Skipped Failures Errors Time
4964 20 💤 0 ❌ 0 🔥 1m 19s ⏱️

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Agent Framework AgentMiddleware support to the Python GitHub Copilot and Claude agents by layering AgentMiddlewareLayer into their public agent types, with new unit tests validating middleware execution and streaming transform hooks.

Changes:

  • Wrap GitHub Copilot agent with AgentMiddlewareLayer via a new GitHubCopilotAgent class and rename the core implementation to RawGitHubCopilotAgent.
  • Add AgentMiddlewareLayer to ClaudeAgent (telemetry-enabled) and add middleware-focused tests for Claude and Copilot.
  • Refactor streaming/non-streaming run() paths in both agents to centralize stream construction.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
python/packages/github_copilot/agent_framework_github_copilot/_agent.py Introduces middleware-enabled GitHubCopilotAgent wrapper and refactors run()/streaming helpers.
python/packages/github_copilot/tests/test_github_copilot_agent.py Adds middleware execution + streaming transform hook tests for Copilot agent.
python/packages/claude/agent_framework_claude/_agent.py Adds AgentMiddlewareLayer to ClaudeAgent and refactors stream construction; updates docs/signatures.
python/packages/claude/tests/test_claude_agent.py Adds middleware execution + streaming transform hook tests for Claude agent.
Comments suppressed due to low confidence (1)

python/packages/claude/agent_framework_claude/_agent.py:739

  • ClaudeAgent inherits AgentTelemetryLayer before AgentMiddlewareLayer, so the effective run() signature comes from AgentTelemetryLayer.run() (which doesn't surface the middleware parameter in its overloads). This makes the new middleware support hard to discover/type-check for users. Consider either (a) swapping the inheritance order so AgentMiddlewareLayer.run() is the visible signature, or (b) adding explicit run() overloads on ClaudeAgent that include middleware and forward to super().run().
class ClaudeAgent(
    AgentTelemetryLayer,
    AgentMiddlewareLayer,
    RawClaudeAgent[OptionsT],
    Generic[OptionsT],
):
    """Claude Agent with OpenTelemetry instrumentation.

Copilot AI review requested due to automatic review settings March 10, 2026 19:27
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (3)

python/packages/claude/agent_framework_claude/_agent.py:628

  • For consistency with the rest of the framework (e.g., RawAgent.run(..., options=...)), consider adding an explicit kw-only options parameter to RawClaudeAgent.run() instead of only pulling it from kwargs. This makes the supported runtime options discoverable via type hints and avoids relying on kwargs.pop('options') for normal use.
    @overload
    def run(
        self,
        messages: AgentRunInputs | None = None,
        *,
        stream: Literal[False] = ...,
        session: AgentSession | None = None,
        middleware: Sequence[AgentMiddlewareTypes] | None = None,
        **kwargs: Any,
    ) -> Awaitable[AgentResponse[Any]]: ...

    @overload
    def run(
        self,
        messages: AgentRunInputs | None = None,
        *,
        stream: Literal[True],
        session: AgentSession | None = None,
        middleware: Sequence[AgentMiddlewareTypes] | None = None,
        **kwargs: Any,
    ) -> ResponseStream[AgentResponseUpdate, AgentResponse[Any]]: ...

    def run(
        self,
        messages: AgentRunInputs | None = None,
        *,
        stream: bool = False,
        session: AgentSession | None = None,
        middleware: Sequence[AgentMiddlewareTypes] | None = None,
        **kwargs: Any,
    ) -> Awaitable[AgentResponse[Any]] | ResponseStream[AgentResponseUpdate, AgentResponse[Any]]:
        """Run the agent with the given messages.

        Args:
            messages: The messages to process.

        Keyword Args:
            stream: If True, returns an async iterable of updates. If False (default),
                returns an awaitable AgentResponse.
            session: The conversation session. If session has service_session_id set,
                the agent will resume that session.
            middleware: Optional per-run AgentMiddleware applied on top of constructor middleware.
            kwargs: Additional keyword arguments including 'options' for runtime options
                (model, permission_mode can be changed per-request).

        Returns:
            When stream=True: An ResponseStream for streaming updates.
            When stream=False: An Awaitable[AgentResponse] with the complete response.
        """
        options = kwargs.pop("options", None)
        response = self._run_stream_impl(messages=messages, session=session, options=options, **kwargs)

python/packages/github_copilot/agent_framework_github_copilot/_agent.py:132

  • RawGitHubCopilotAgent is described like the primary public agent (“A GitHub Copilot Agent…”) and its examples refer to GitHubCopilotAgent, but the class name implies it is the lower-level/no-layers variant. Consider updating this docstring to explicitly state what is “raw” about it and to direct most users to GitHubCopilotAgent for middleware/telemetry support.

This issue also appears on line 300 of the same file.

class RawGitHubCopilotAgent(BaseAgent, Generic[OptionsT]):
    """A GitHub Copilot Agent.

    This agent wraps the GitHub Copilot SDK to provide Copilot agentic capabilities
    within the Agent Framework. It supports both streaming and non-streaming responses,
    custom tools, and session management.

python/packages/github_copilot/agent_framework_github_copilot/_agent.py:357

  • run() no longer has an explicit options kw-only parameter (it’s now only extracted from kwargs). This reduces IDE autocomplete and makes this agent inconsistent with the core RawAgent.run(..., options=...) API. Consider reintroducing an explicit options: OptionsT | None = None kw-only parameter (and keep the kwargs.pop('options', ...) fallback if you need backward compatibility).
    @overload
    def run(
        self,
        messages: AgentRunInputs | None = None,
        *,
        stream: Literal[False] = False,
        session: AgentSession | None = None,
        middleware: Sequence[AgentMiddlewareTypes] | None = None,
        **kwargs: Any,
    ) -> Awaitable[AgentResponse]: ...

    @overload
    def run(
        self,
        messages: AgentRunInputs | None = None,
        *,
        stream: Literal[True],
        session: AgentSession | None = None,
        middleware: Sequence[AgentMiddlewareTypes] | None = None,
        **kwargs: Any,
    ) -> ResponseStream[AgentResponseUpdate, AgentResponse]: ...

    def run(
        self,
        messages: AgentRunInputs | None = None,
        *,
        stream: bool = False,
        session: AgentSession | None = None,
        middleware: Sequence[AgentMiddlewareTypes] | None = None,
        **kwargs: Any,
    ) -> Awaitable[AgentResponse] | ResponseStream[AgentResponseUpdate, AgentResponse]:
        """Get a response from the agent.

        This method returns the final result of the agent's execution
        as a single AgentResponse object when stream=False. When stream=True,
        it returns a ResponseStream that yields AgentResponseUpdate objects.

        Args:
            messages: The message(s) to send to the agent.

        Keyword Args:
            stream: Whether to stream the response. Defaults to False.
            session: The conversation session associated with the message(s).
            middleware: Runtime middleware parameter accepted for compatibility with middleware layer routing.
            kwargs: Additional keyword arguments, including ``options`` for runtime options
                (model, timeout, etc.).

        Returns:
            When stream=False: An Awaitable[AgentResponse].
            When stream=True: A ResponseStream of AgentResponseUpdate items.

        Raises:
            AgentException: If the request fails.
        """
        options = cast(OptionsT | None, kwargs.pop("options", None))
        if stream:
            return self._run_stream_impl(messages=messages, session=session, options=options, **kwargs)
        return self._run_impl(messages=messages, session=session, options=options, **kwargs)

Copy link
Contributor

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 83%

✗ Correctness

The refactoring introduces AgentMiddlewareLayer into both ClaudeAgent and GitHubCopilotAgent MROs, which is conceptually correct. However, there is a concrete correctness bug in both raw agents: AgentMiddlewareLayer.run() deliberately forwards combined function/chat middleware via combined_kwargs["middleware"] and expects it to flow downstream (through _run_stream_impl/_run_impl) to the underlying chat client. The new middleware named parameter added to RawClaudeAgent.run() and RawGitHubCopilotAgent.run() intercepts and silently drops this value — it is never forwarded to _run_stream_impl or _run_impl. Additionally, options was removed as an explicit typed parameter from RawGitHubCopilotAgent.run()'s overloads (replaced by middleware), while AgentMiddlewareLayer._middleware_handler still passes options= explicitly; this works at runtime via **kwargs extraction but is a type-safety regression for direct callers. The tests only cover agent-level middleware and would not catch the dropped function/chat middleware.

✓ Security Reliability

The diff introduces a middleware layer (AgentMiddlewareLayer) into both ClaudeAgent and GitHubCopilotAgent via multiple inheritance. From a security and reliability standpoint the changes are largely sound, but there is one notable reliability issue: in GitHubCopilotAgent.run(), the previous overloads declared options: OptionsT | None = None as an explicit typed parameter, but the new overloads replace it with middleware and move options into **kwargs, relying on kwargs.pop('options', None) with a cast. This silent removal of a typed parameter breaks the public API contract without deprecation and can cause silent option-dropping if callers pass options as a keyword argument positionally, though this is a breaking-change / API-reliability concern rather than a security risk. No secrets are hardcoded, no unsafe deserialization is present, no injection risks are introduced, and the middleware callable path uses Any typing for call_next in tests which is acceptable in test code. The cast() used for extracting options from kwargs bypasses type safety and could silently discard a wrongly typed value without raising, which is a mild reliability concern.

✗ Test Coverage

The diff adds middleware support (AgentMiddlewareLayer) to both ClaudeAgent and GitHubCopilotAgent, and includes 6 new tests covering class-based middleware, function-based middleware, and streaming transform hooks. However, the new per-run middleware parameter added to run() in both agents is never tested—there are zero tests that pass middleware at call time rather than construction time, and none that verify combined constructor+per-run middleware ordering. Additionally, the GitHubCopilotAgent run() overloads silently dropped the explicit options parameter (moved to **kwargs), but no test verifies that runtime options still route correctly through the refactored _run_impl/_run_stream_impl path. Multiple-middleware chaining order is also not covered by any test.

✗ Design Approach

The diff adds AgentMiddlewareLayer to both ClaudeAgent and GitHubCopilotAgent by splitting each into a raw base class and a composed final class. The approach is architecturally sound for the Claude agent, but contains a breaking API change in RawGitHubCopilotAgent: the typed options parameter is removed from all three run() overloads and replaced with middleware, pushing options into untyped **kwargs. This silently removes type-checking and IDE support for options from the public API. Additionally, adding middleware explicitly to the raw agent signatures is a leaky abstraction—the middleware routing concern belongs exclusively to AgentMiddlewareLayer, and the raw classes should accept it transparently via **kwargs rather than declaring it as a named parameter.

Flagged Issues

  • In RawClaudeAgent.run() and RawGitHubCopilotAgent.run(), the explicit middleware named parameter captures — and silently drops — the combined function/chat middleware forwarded by AgentMiddlewareLayer via combined_kwargs["middleware"]. Since middleware is now an explicit named param rather than flowing through **kwargs, it is never forwarded to _run_stream_impl or _run_impl, breaking function/chat middleware for composed agents.
  • Per-run middleware parameter added to run() in both agents is entirely untested — there is no test that calls agent.run(..., middleware=[...]) and verifies the per-run middleware executes, making it impossible to confirm the feature works at all.
  • The GitHubCopilotAgent run() overloads removed the explicit options parameter (replaced by middleware), with options now consumed via kwargs.pop('options', None). No regression test verifies that runtime options (e.g., model, timeout) still reach _run_impl/_run_stream_impl correctly after this refactor.
  • In RawGitHubCopilotAgent.run(), the options parameter is removed from all three typed overload signatures and replaced with middleware. This is a breaking change: callers of GitHubCopilotAgent.run(options=...) lose static type-checking and IDE autocomplete. The options kwarg is silently buried in **kwargs and extracted via cast(OptionsT | None, kwargs.pop('options', None)), which bypasses the TypedDict-based typing that made it safe. options should remain an explicit typed parameter in all overloads, and middleware should not appear in the raw class's overloads at all—it is AgentMiddlewareLayer's concern and can flow through **kwargs without an explicit declaration.

Suggestions

  • Remove middleware from the explicit named parameters of RawClaudeAgent.run() and RawGitHubCopilotAgent.run() so it continues to flow via **kwargs to _run_stream_impl/_run_impl, consistent with how AgentMiddlewareLayer expects to forward it. If the intent is only to prevent 'unexpected keyword argument' errors, forward it explicitly instead.
  • options was a typed, explicitly documented parameter in RawGitHubCopilotAgent.run() overloads before this diff and has been silently replaced by middleware. Consider keeping options in the overloads (even alongside middleware) to avoid a breaking type-checking regression for existing callers who relied on IDE autocomplete or mypy checking for options.
  • In RawGitHubCopilotAgent.run(), options was previously an explicit, typed overload parameter and is now silently moved into **kwargs with a cast(). Consider keeping options as an explicit keyword parameter in the overloads (alongside middleware) to preserve type safety and avoid silent option-dropping for existing callers.
  • The cast(OptionsT | None, kwargs.pop('options', None)) in _agent.py (GitHub Copilot) bypasses runtime type checking. If a caller accidentally passes a wrong type, it will be silently accepted. A runtime isinstance guard or at least a comment explaining the invariant would improve reliability.
  • The new _run_stream_impl / _run_impl split on both agents leaves **kwargs forwarded without any validation. Consider documenting (or asserting) that no unexpected kwargs are passed, to avoid silently ignoring unexpected arguments.
  • Add a test that passes middleware directly to run() (e.g., agent.run('Hello', middleware=[TrackingMiddleware()])) and asserts the middleware executes.
  • Add a test that combines constructor middleware and per-run middleware and asserts both execute in the correct order (constructor first, then per-run).
  • Add a regression test for GitHubCopilotAgent that passes options via run('Hello', options={...}) and asserts the options are forwarded to the underlying session/client call, to guard against the kwarg extraction breakage.
  • Add a test with two middleware in sequence to verify chaining order — currently all tests use exactly one middleware.
  • The middleware parameter added to RawClaudeAgent.run()'s three overloads (non-streaming, streaming, and base) is a leaky abstraction. AgentMiddlewareLayer._middleware_handler() passes middleware to super().run() via **context.kwargs, so it already arrives through **kwargs in the raw class. Declaring it explicitly makes the raw class aware of a concern it doesn't implement and silently drops—this should be left as an implicit **kwargs passthrough with a brief inline comment explaining why it's intentionally ignored.
  • The new _run_stream_impl helper in RawClaudeAgent silently drops the middleware kwarg received from AgentMiddlewareLayer (via context.kwargs → super().run() → _run_stream_impl's **kwargs). Since this will confuse future maintainers, a brief comment noting that middleware is intentionally absorbed and not forwarded—because it is handled by AgentMiddlewareLayer before reaching this point—would prevent future bugs from someone trying to 'fix' the drop.

Automated review by moonbox3's agents

@@ -620,15 +625,24 @@ def run(
When stream=False: An Awaitable[AgentResponse] with the complete response.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

middleware is accepted as a named parameter in run() but never forwarded here. AgentMiddlewareLayer passes combined function/chat middleware via kwargs["middleware"], which is now intercepted by the named param and silently dropped instead of reaching _get_stream.

Suggested change
"""
response = self._run_stream_impl(messages=messages, session=session, options=options, middleware=middleware, **kwargs)

"""
options = cast(OptionsT | None, kwargs.pop("options", None))
if stream:
return self._run_stream_impl(messages=messages, session=session, options=options, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue: middleware (containing combined function/chat middleware from AgentMiddlewareLayer) is captured by the named param in run() and never forwarded to _run_stream_impl, silently discarding any function/chat middleware.

Suggested change
return self._run_stream_impl(messages=messages, session=session, options=options, **kwargs)
return self._run_stream_impl(messages=messages, session=session, options=options, middleware=middleware, **kwargs)

options = cast(OptionsT | None, kwargs.pop("options", None))
if stream:
return self._run_stream_impl(messages=messages, session=session, options=options, **kwargs)
return self._run_impl(messages=messages, session=session, options=options, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

middleware not forwarded to _run_impl either — same silent drop for the non-streaming path.

Suggested change
return self._run_impl(messages=messages, session=session, options=options, **kwargs)
return self._run_impl(messages=messages, session=session, options=options, middleware=middleware, **kwargs)

@@ -302,7 +304,7 @@ def run(
*,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

options was previously an explicitly typed parameter in this overload and is now completely absent from the signature. AgentMiddlewareLayer._middleware_handler still passes options=context.options explicitly, which works at runtime (falls into **kwargs), but this is a type-breaking change for any typed callers of GitHubCopilotAgent.run().

Raises:
AgentException: If the request fails.
"""
options = cast(OptionsT | None, kwargs.pop("options", None))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cast() silently coerces whatever is in kwargs to OptionsT | None with no runtime check. If a caller passes options with a wrong type, the error will surface much later (or not at all). Consider validating the extracted value before use.

Suggested change
options = cast(OptionsT | None, kwargs.pop("options", None))
options = kwargs.pop("options", None)
if options is not None and not isinstance(options, (dict, type(None))):
raise TypeError(f"Expected options to be a mapping or None, got {type(options).__name__}")
options = cast(OptionsT | None, options)

stream: Literal[True],
session: AgentSession | None = None,
options: OptionsT | None = None,
middleware: Sequence[AgentMiddlewareTypes] | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as the non-streaming overload: options should remain explicit and typed here; middleware should not appear in the raw class signature.

Suggested change
middleware: Sequence[AgentMiddlewareTypes] | None = None,
options: OptionsT | None = None,

stream: bool = False,
session: AgentSession | None = None,
options: OptionsT | None = None,
middleware: Sequence[AgentMiddlewareTypes] | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue on the base overload. Restoring options and removing middleware here also lets you remove the cast(OptionsT | None, kwargs.pop('options', None)) inside the body and go back to the direct typed options parameter.

Suggested change
middleware: Sequence[AgentMiddlewareTypes] | None = None,
options: OptionsT | None = None,

@@ -581,6 +582,7 @@ def run(
*,
stream: Literal[False] = ...,
session: AgentSession | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declaring middleware explicitly in RawClaudeAgent's overloads leaks AgentMiddlewareLayer's routing concern into the raw class. The parameter is accepted and immediately discarded (never forwarded to _run_stream_impl). Since it arrives through **kwargs anyway when AgentMiddlewareLayer calls super().run(), it can simply be absorbed there without an explicit declaration.

Suggested change
session: AgentSession | None = None,
**kwargs: Any,

@@ -591,6 +593,7 @@ def run(
*,
stream: Literal[True],
session: AgentSession | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same leaky abstraction as the non-streaming overload: middleware is declared but silently dropped. Remove it from this overload as well.

@@ -600,6 +603,7 @@ def run(
*,
stream: bool = False,
session: AgentSession | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue on the base overload. Dropping the explicit middleware declaration from all three overloads keeps the raw class unaware of the middleware routing mechanism.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants