[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report — March 2026 #1224

2026-03-11T03:23:27Z

github-actions[bot]
bot Mar 11, 2026

📊 Executive Summary

The gh-aw-firewall repository has an exceptionally mature agentic workflow infrastructure with 21 agentic workflows already in place — covering security, CI diagnosis, documentation, testing, release, and issue management. Compared to the Pelis Agent Factory's 100+ workflows, this repo scores 4/5 on the maturity scale and has several clear opportunities to close the remaining gap, particularly in meta-observability, issue triage, and continuous code quality improvement.

🎓 Patterns Learned from Pelis Agent Factory

From the Documentation Site

The Pelis Agent Factory emphasizes these key patterns:

Specialization over monoliths — Many focused workflows beat one large agent. Each workflow does one job.
Meta-agents are essential — Agents that monitor other agents (Metrics Collector, Audit Workflows, Workflow Health Manager) become critical infrastructure at scale.
Read-only analysts + proposal agents — Some workflows only produce reports (discussions); others propose changes via PRs. Both have value.
Guardrails enable innovation — Strict permissions, safe-outputs, and scoped toolsets make it safe to run many agents continuously.
Causality chains — Workflows create issues that downstream agents (issue-monster → Copilot) then act on. 69% merge rates observed.
Cache memory for continuity — Agents accumulate knowledge across runs to improve over time.
Skip-if-match anti-spam — Prevents duplicate PR/issue creation from repeated workflow runs.

From the Agentics Reference Repository

The githubnext/agentics repo contains these additional patterns not yet in this repo:

grumpy-reviewer — Adversarial code reviewer that finds non-obvious issues
daily-test-improver — Incremental test coverage (similar to test-coverage-improver but daily)
sub-issue-closer — Auto-closes completed sub-issues
contribution-check — Validates PRs against contribution guidelines
weekly-repo-chronicle — Weekly changelog-style summary of what changed
issue-arborist — Organizes issues into parent/sub-issue hierarchies
pr-nitpick-reviewer — Light-touch quality review on every PR

How This Repo Compares

What this repo does exceptionally well:

✅ Multi-engine smoke testing (Claude, Codex, Copilot) — unique and sophisticated
✅ Domain-specific security workflows (secret-diggers running hourly, security-guard on PRs)
✅ CI Doctor with causality chain (creates issues → issue-monster → Copilot fixes)
✅ Daily comprehensive security review + dependency monitoring
✅ Self-referential: this very workflow (Pelis Advisor) already exists

Gaps compared to Pelis Factory best practices:

❌ No issue triage / labeling agent
❌ No meta-analytics on workflow health, costs, and performance
❌ No breaking change detection for CLI flags / API
❌ No continuous code quality / simplicity PRs
❌ No auto-merge of main into long-lived PR branches
❌ No malicious code scan for recent commits

📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `build-test` | Runs build tests across 8 runtimes | PR + dispatch | ✅ Good — multi-runtime matrix |
| `ci-cd-gaps-assessment` | Assesses CI/CD pipeline coverage gaps | Daily | ✅ Useful meta-analysis |
| `ci-doctor` | Investigates CI failures, creates diagnostic issues | workflow_run | ✅ Critical — causality chain active |
| `cli-flag-consistency-checker` | Checks CLI flags vs docs | Weekly | ✅ Good — creates discussions |
| `dependency-security-monitor` | Monitors CVEs, proposes updates | Daily | ✅ Strong — creates issues + PRs |
| `doc-maintainer` | Syncs docs with code changes | Daily | ✅ Active — creates doc PRs |
| `issue-duplication-detector` | Detects duplicate issues | On issue open | ✅ Good hygiene |
| `issue-monster` | Assigns issues to Copilot coding agent | Hourly + issue open | ✅ Core orchestrator, but needs triage to feed it labelled issues |
| `pelis-agent-factory-advisor` | This workflow | Daily | ✅ Self-reflective — this run |
| `plan` | `/plan` slash command | Comment slash command | ✅ Good ChatOps primitive |
| `secret-digger-claude/codex/copilot` | Red team secret hunting in container | Hourly (3 engines) | ✅ Unique — multi-engine red teaming |
| `security-guard` | Reviews PRs for security regressions | PR + dispatch | ✅ Strong — Claude-powered |
| `security-review` | Daily comprehensive threat modeling | Daily | ✅ Thorough — includes SAST tools |
| `smoke-chroot` | Chroot mode smoke test | PR + dispatch | ✅ Specialized for chroot feature |
| `smoke-claude/codex/copilot` | Engine functionality validation | 12h + PR | ✅ Good — validates all engines |
| `test-coverage-improver` | Identifies gaps, writes tests | Weekly | ✅ Good — security-focused PRs |
| `update-release-notes` | Enhances release notes on publish | On release | ✅ Good automation |

🚀 Actionable Recommendations

P0 — Implement Immediately

🏷️ Issue Triage Agent

What: Auto-label incoming issues with appropriate labels (bug, enhancement, documentation, question, security, performance) and post a welcoming comment summarizing the issue category.

Why: The issue-monster workflow assigns issues to Copilot for automated resolution, but currently has no labeling stage upstream. Adding labels would help issue-monster prioritize work and make the issue tracker useful for humans browsing by category. Currently there are 15+ open issues with inconsistent labeling.

How: Add a new workflow triggered on issues: [opened, reopened] using the issue-triage pattern from Pelis Factory. Use safe-outputs: add-labels and add-comment.

Effort: Low — well-established pattern, ~20 lines of workflow markdown

Example:

---
description: Triage incoming issues with labels and a welcome comment
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, enhancement, documentation, question, security, performance, firewall, container, mcp]
  add-comment:
    max: 1
timeout-minutes: 5
---
# Issue Triage Agent
For each new issue: analyze content in context of the gh-aw-firewall codebase 
(a Docker/Squid network firewall for AI agents). Apply the most fitting label 
from the allowed list. Leave a brief comment explaining your categorization 
and suggesting next steps or related docs.
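
The `allowed` list under add-labels acts as a guardrail: whatever the agent proposes, only labels on that list should reach the issue. A minimal sketch of that filtering idea follows. The label names come from the example above; the function name and its normalization rules are hypothetical and are not gh-aw internals.

```python
# Hypothetical sketch of an allowlist guardrail for agent-proposed labels.
# Label names are copied from the example workflow; everything else is illustrative.
ALLOWED_LABELS = [
    "bug", "enhancement", "documentation", "question", "security",
    "performance", "firewall", "container", "mcp",
]


def filter_labels(proposed, allowed=ALLOWED_LABELS):
    """Keep proposed labels that are on the allowlist, in order, without duplicates."""
    allowed_set = {label.lower() for label in allowed}
    kept = []
    for label in proposed:
        normalized = label.strip().lower()
        if normalized in allowed_set and normalized not in kept:
            kept.append(normalized)
    return kept


# "wontfix" is dropped because it is not on the allowlist; the repeated "bug " is deduplicated.
print(filter_labels(["Bug", "wontfix", "security", "bug "]))
```

Keeping the filter outside the agent's control means a confused or manipulated agent can, at worst, pick the wrong label from a known set.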

P1 — Plan for Near-Term

🔍 Workflow Health Manager / Audit

What: A meta-workflow that audits all agentic workflow runs (costs, error rates, turn counts, recurring failures) and creates issues when workflows are underperforming or failing repeatedly.

Why: The open issues include 5+ [agentics] failure issues at the same time (secret-diggers, smoke-codex, issue-monster, and CI Doctor are all failing). A health manager would detect these patterns proactively and create consolidated diagnostic issues rather than leaving CI Doctor to open individual ones. The Pelis Factory's Audit Workflows workflow created 93 discussions and contributed to 9 issues that downstream agents then fixed.

How: Use agentic-workflows tool to fetch recent run data, analyze failure patterns, calculate cost trends. Create issues for repeated failures. Run daily.

Effort: Medium — requires interpreting workflow run data
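
The failure-pattern analysis could look roughly like the sketch below. It assumes run data has already been fetched into plain records; the keys `workflow` and `conclusion`, the thresholds, and the sample data are all hypothetical and do not reflect the output format of the agentic-workflows tool.

```python
from collections import defaultdict


def find_unhealthy(runs, min_runs=3, max_failure_rate=0.5):
    """Return {workflow: failure_rate} for workflows that fail too often.

    `runs` is a list of dicts with the (assumed) keys "workflow" and "conclusion".
    Workflows with fewer than `min_runs` runs are ignored to avoid noisy alerts.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for run in runs:
        totals[run["workflow"]] += 1
        if run["conclusion"] != "success":
            failures[run["workflow"]] += 1

    report = {}
    for name, total in totals.items():
        rate = failures[name] / total
        if total >= min_runs and rate > max_failure_rate:
            report[name] = round(rate, 2)
    return report


# Illustrative sample data, not real run history.
sample_runs = [
    {"workflow": "smoke-codex", "conclusion": "failure"},
    {"workflow": "smoke-codex", "conclusion": "failure"},
    {"workflow": "smoke-codex", "conclusion": "success"},
    {"workflow": "ci-doctor", "conclusion": "success"},
    {"workflow": "ci-doctor", "conclusion": "success"},
    {"workflow": "ci-doctor", "conclusion": "success"},
    {"workflow": "issue-monster", "conclusion": "failure"},
]
print(find_unhealthy(sample_runs))
```

A single consolidated issue per flagged workflow, rather than one issue per failed run, is what would distinguish this from the existing CI Doctor behaviour.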

⚠️ Breaking Change Checker

What: A workflow that detects potentially breaking changes in PRs — CLI flag removals/renames, type changes in public APIs, container image interface changes, domain whitelist format changes.

Why: AWF is a tool used by other teams and integrated into larger pipelines. A breaking change to --allow-domains syntax or container entrypoint could silently break downstream users. Currently there is no automated detection. Related issue: #1001 (LD_PRELOAD breaking Deno scoped permissions) is an example of a compatibility problem that wasn't caught before shipping.

How: Trigger on PR, diff src/cli.ts options, src/types.ts, containers/agent/entrypoint.sh, and action.yml. Compare against main branch to flag removals or signature changes.

Effort: Medium
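
One cheap signal for the CLI part of this check is a set difference over flag names found in the base and head versions of src/cli.ts. The sketch below assumes flags appear in the source as literal `--flag-name` strings; that assumption, the regular expression, and the sample snippets (including `--example-flag`) are illustrative rather than taken from the repository.

```python
import re

# Matches long-form flags such as --allow-domains (assumed naming convention).
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9-]*")


def extract_flags(source_text):
    """Collect every distinct long-form flag mentioned in a source file."""
    return set(FLAG_PATTERN.findall(source_text))


def removed_flags(base_source, head_source):
    """Flags present on the base branch but missing from the PR head."""
    return sorted(extract_flags(base_source) - extract_flags(head_source))


# Hypothetical before/after snippets; a rename shows up as a removal.
base = "option('--allow-domains <domains>')\noption('--example-flag')"
head = "option('--allowed-domains <domains>')\noption('--example-flag')"
print(removed_flags(base, head))
```

A rename is reported as a removal, which is the desired outcome here: downstream callers still passing the old name would break either way.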

🤔 PR Code Quality Reviewer (Grumpy Reviewer)

What: An adversarial code reviewer triggered on every PR that looks for non-obvious issues: off-by-one errors, missing error handling, inconsistent naming, logic that seems correct but has edge cases. Different from security-guard (which is security-focused).

Why: The current security-guard is excellent at security boundaries, but there's no general code quality reviewer. The grumpy-reviewer pattern in agentics deliberately takes a skeptical perspective to surface subtle bugs. Several recent PRs (test coverage PRs #1161, #1162, #1163) could benefit from code review.

How: Use githubnext/agentics/grumpy-reviewer as a template. Trigger on PR, use safe-outputs: add-comment (hiding older comments to avoid spam), and set a 10-minute timeout.

Effort: Low — template available in agentics repo

P2 — Consider for Roadmap

🧹 Continuous Simplicity / Code Cleanup PRs

What: A weekly or daily workflow that identifies overly complex code, deeply nested conditionals, functions that are too long, and proposes simplification PRs.

Why: As the codebase grows (now 24 source files in src/), complexity creeps in. The Pelis Factory's "Continuous Simplicity" workflow had 22 merged PRs out of 28 proposed (about 79% merge rate), showing high signal quality. For a security tool, simpler code = fewer bugs.

Effort: Medium — needs focused scope to avoid spurious PRs

🔀 Mergefest — Auto-Sync PR Branches with Main

What: A workflow that automatically merges the main branch into open PRs that are behind, eliminating the "please merge main" ceremony.

Why: Several current open PRs (#1079, #1150, #1163) are long-lived and will experience merge conflicts. With active development on main, keeping PRs current is manual overhead.

Effort: Low — mergefest pattern available directly from Pelis Factory: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/mergefest.md

🦠 Daily Malicious Code Scan

What: A workflow that reviews recent code commits (last 24h) for suspicious patterns — unexpected network calls, data exfiltration vectors, encoded payloads, unusual file operations.

Why: AWF is itself a security tool, making it a high-value target for supply chain attacks. A daily scan of new commits aligns with the repository's security-first mission. Pelis Factory's equivalent found real issues in production. The existing secret-digger workflows test runtime secrets but not source code injection.

Effort: Low — can be adapted from Pelis Factory: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-malicious-code-scan.md
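
To make "suspicious patterns" concrete, the sketch below scans only the added lines of a unified diff against a few regular expressions. The pattern set is a hypothetical starting point chosen for illustration; it is not the rule set used by the Pelis Factory workflow, and a real scan would rely on the agent's judgement rather than on regexes alone.

```python
import re

# Illustrative heuristics only; each would produce false positives in practice.
SUSPICIOUS_PATTERNS = {
    "pipe-to-shell": re.compile(r"(curl|wget)\b[^|\n]*\|\s*(ba)?sh\b"),
    "long-base64-blob": re.compile(r"[A-Za-z0-9+/]{80,}={0,2}"),
    "dynamic-eval": re.compile(r"\beval\s*\("),
}


def scan_added_lines(diff_text):
    """Return (line_number, pattern_name) pairs for added lines that match a pattern."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect additions; skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


sample_diff = "\n".join([
    "+++ b/scripts/setup.sh",
    "+curl -s https://example.invalid/install | sh",
    "-echo \"removed line with eval(\"",
    "+echo \"hello\"",
])
print(scan_added_lines(sample_diff))
```

Restricting the scan to added lines keeps the report focused on what the last 24 hours of commits actually introduced.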

📰 Weekly Repo Chronicle

What: A weekly summary discussion of what changed in the repository — merged PRs, opened/closed issues, workflow highlights, notable commits — presented in a readable narrative format.

Why: With 21 agentic workflows generating activity and 10+ open PRs, it's hard to keep a holistic view of what's happening. A weekly digest makes the project more transparent to contributors and stakeholders. Reference: githubnext/agentics/weekly-repo-chronicle.

Effort: Low

P3 — Future Ideas

📊 Portfolio Analyst for Workflow Cost Optimization

What: Analyzes token usage, turn counts, and costs across all agentic workflow runs. Identifies wasteful workflows (too chatty, too many turns for simple tasks) and suggests optimizations.

Why: With 21 workflows running daily/hourly (especially 3 secret-digger variants running every hour each), costs could accumulate. The Pelis Factory's Portfolio Analyst identified workflows that were "way too chatty" and created optimization opportunities.

Effort: Medium — requires access to billing/usage data
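
Before any billing data is available, the schedule alone gives a rough sense of volume. The sketch below counts scheduled runs per day for a few workflows, using the triggers listed in the inventory. It ignores event-driven runs (PRs, issue opens, dispatch) and assumes nothing about per-run cost.

```python
# Runs per day implied by each schedule label used in the inventory.
RUNS_PER_DAY = {"hourly": 24, "12h": 2, "daily": 1}

# (workflow, schedule, number of engine variants), read off the inventory table.
SCHEDULED = [
    ("secret-digger", "hourly", 3),
    ("issue-monster", "hourly", 1),
    ("smoke", "12h", 3),
    ("security-review", "daily", 1),
]


def scheduled_runs_per_day(entries):
    """Map each workflow family to its number of scheduled runs per day."""
    return {
        name: RUNS_PER_DAY[schedule] * variants
        for name, schedule, variants in entries
    }


counts = scheduled_runs_per_day(SCHEDULED)
print(counts)
print("total:", sum(counts.values()))
```

On these assumptions the three secret-digger variants account for 72 of 103 scheduled runs per day, which is why they are the natural first target for cost review.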

🌳 Issue Arborist — Issue Organization

What: Groups related issues as parent/sub-issues, creating hierarchy in the issue tracker.

Why: Open issues like #1039 (integration test gaps) and #1103 (shutdown performance) could each have multiple sub-issues tracking specific work items. The Pelis Factory's Arborist created 77 discussion reports and 18 parent issues.

Effort: Medium

🔄 Sub-Issue Closer

What: Automatically closes sub-issues when their parent issue is resolved.

Why: Pairs with Issue Arborist. Available from agentics: gh aw add-wizard githubnext/agentics/sub-issue-closer.

Effort: Low (but depends on Issue Arborist first)

📈 Maturity Assessment

| Dimension | Score | Notes |
| --- | --- | --- |
| Security Automation | 5/5 | Exceptional — secret-diggers, security-guard, security-review, dependency-monitor, container-scan |
| CI/CD Integration | 5/5 | CI Doctor + smoke tests across 3 engines is excellent |
| Documentation Automation | 4/5 | doc-maintainer + CLI checker; missing wiki/changelog |
| Issue/PR Management | 3/5 | issue-monster + duplication detection, but no triage/labeling |
| Code Quality | 3/5 | test-coverage-improver; missing continuous simplicity/refactoring |
| Meta-Observability | 2/5 | No workflow health monitoring, no cost analytics |
| Release Automation | 3/5 | update-release-notes; missing changeset/version management |
| Overall | 4/5 | Top-tier for a project this size; comparable to mature OSS projects |
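
The report does not say how the overall score is derived. As a sanity check only, and assuming an unweighted mean of the seven dimension scores, the arithmetic works out as follows.

```python
# Dimension scores copied from the maturity table; equal weighting is an assumption.
SCORES = {
    "Security Automation": 5,
    "CI/CD Integration": 5,
    "Documentation Automation": 4,
    "Issue/PR Management": 3,
    "Code Quality": 3,
    "Meta-Observability": 2,
    "Release Automation": 3,
}

average = sum(SCORES.values()) / len(SCORES)
print(round(average, 2))  # 25 / 7, about 3.57
print(round(average))     # rounds to 4, consistent with the reported 4/5
```

Under that assumption, raising Meta-Observability and Issue/PR Management by one point each would lift the mean to 27/7, roughly 3.86.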

Current Level: 4/5 — "Advanced Practitioner"

Has specialized workflows across all major categories, multi-engine testing, and self-referential meta-analysis. Runs 21 agentic workflows in production.

Target Level: 4.5/5 — "Factory-Grade"

Close the gaps in meta-observability, issue triage, and continuous quality improvement.

Gap Analysis: The 3 highest-leverage additions are:

Issue triage labeling (P0 — 1-2 hours to implement)
Workflow Health Manager (P1 — 4-6 hours)
Breaking Change Checker (P1 — 3-4 hours)

🔄 Comparison with Best Practices

What This Repo Does Well

Domain-specific security focus — The secret-digger triad (3 engines, hourly) is more sophisticated than anything in the Pelis Factory reference
Multi-engine testing — Testing Claude, Codex, and Copilot in smoke tests shows maturity in agent infrastructure validation
Causality chains — CI Doctor → issue → issue-monster → Copilot PR is a working production causality chain
Strict security model — roles: all where appropriate, skip-if-match anti-spam, scoped tool permissions
Self-aware — This Pelis Advisor workflow itself shows the team is tracking against best practices

What Could Improve

Upstream triage — Before issue-monster can do its job well, issues need labels/priority to enable smarter assignment
Meta-observability — No systematic tracking of which workflows are healthy vs. problematic (manually scanning issues today)
Continuous quality — No weekly code simplicity/cleanup PRs; test coverage improvements are good but security-only

Unique Opportunities Given the Domain (Firewall/Security)

Firewall policy regression tests — Auto-test that every new release still blocks the right domains and allows the right ones (automated integration test generation for new domain patterns)
Squid config linter — A workflow that validates generated squid.conf syntax and semantics using Squid's built-in tools in CI
Container attack surface tracker — Regular analysis of what capabilities and syscalls the agent container exposes, trending over releases
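
For the Squid config linter idea, Squid's own `squid -k parse -f <file>` mode parses a configuration file and reports problems without starting the proxy. A minimal wrapper might look like the sketch below. The function names are hypothetical, and treating a non-zero exit code as the only failure signal is a simplification, since warnings can still appear in the output of a run that exits cleanly.

```python
import shutil
import subprocess


def squid_parse_command(config_path, binary="squid"):
    """Build the command line that asks Squid to parse a config file and exit."""
    return [binary, "-k", "parse", "-f", config_path]


def lint_squid_config(config_path, binary="squid"):
    """Run the parse check and return (ok, combined_output).

    Raises FileNotFoundError if the Squid binary is not installed.
    """
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    result = subprocess.run(
        squid_parse_command(config_path, binary),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr


print(squid_parse_command("/tmp/squid.conf"))
```

In CI this would run against the squid.conf that the tool generates, so a template change that produces invalid syntax fails the build instead of failing at container start.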

Run date: 2026-03-11 | Workflows analyzed: 21 agentic, 11 standard CI | Open issues: 15 | Open PRs: 10
Cache updated: /tmp/gh-aw/cache-memory/advisor-notes.json

AI generated by Pelis Agent Factory Advisor

expires on Mar 18, 2026, 3:23 AM UTC
