📊 Executive Summary
The gh-aw-firewall repository has an exceptionally mature agentic workflow infrastructure, with 21 agentic workflows already in place covering security, CI diagnosis, documentation, testing, release, and issue management. Measured against the Pelis Agent Factory's 100+ workflows, this repo scores 4/5 on the maturity scale and has several clear opportunities to close the remaining gap, particularly in meta-observability, issue triage, and continuous code quality improvement.
🎓 Patterns Learned from Pelis Agent Factory
From the Documentation Site
The Pelis Agent Factory emphasizes these key patterns:
Specialization over monoliths — Many focused workflows beat one large agent. Each workflow does one job.
Meta-agents are essential — Agents that monitor other agents (Metrics Collector, Audit Workflows, Workflow Health Manager) become critical infrastructure at scale.
Read-only analysts + proposal agents — Some workflows only produce reports (discussions); others propose changes via PRs. Both have value.
Guardrails enable innovation — Strict permissions, safe-outputs, and scoped toolsets make it safe to run many agents continuously.
Causality chains — Workflows that create issues that downstream agents (issue-monster → Copilot) then act on. 69% merge rates observed.
Cache memory for continuity — Agents accumulate knowledge across runs to improve over time.
How This Repo Compares
What this repo does exceptionally well:
✅ Self-referential: this very workflow (Pelis Advisor) already exists
Gaps compared to Pelis Factory best practices:
❌ No issue triage / labeling agent
❌ No meta-analytics on workflow health costs and performance
❌ No breaking change detection for CLI flags / API
❌ No continuous code quality / simplicity PRs
❌ No auto-merge of main into long-lived PR branches
❌ No malicious code scan for recent commits
📋 Current Agentic Workflow Inventory
| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| build-test | Runs build tests across 8 runtimes | PR + dispatch | ✅ Good: multi-runtime matrix |
| ci-cd-gaps-assessment | Assesses CI/CD pipeline coverage gaps | Daily | ✅ Useful meta-analysis |
| ci-doctor | Investigates CI failures, creates diagnostic issues | workflow_run | ✅ Critical: causality chain active |
| cli-flag-consistency-checker | Checks CLI flags vs docs | Weekly | ✅ Good: creates discussions |
| dependency-security-monitor | Monitors CVEs, proposes updates | Daily | ✅ Strong: creates issues + PRs |
| doc-maintainer | Syncs docs with code changes | Daily | ✅ Active: creates doc PRs |
| issue-duplication-detector | Detects duplicate issues | On issue open | ✅ Good hygiene |
| issue-monster | Assigns issues to Copilot coding agent | Hourly + issue open | ✅ Core orchestrator, but needs triage to feed it labelled issues |
| pelis-agent-factory-advisor | This workflow | Daily | ✅ Self-reflective: this run |
| plan | /plan slash command | Comment slash command | ✅ Good ChatOps primitive |
| secret-digger-claude/codex/copilot | Red team secret hunting in container | Hourly (3 engines) | ✅ Unique: multi-engine red teaming |
| security-guard | Reviews PRs for security regressions | PR + dispatch | ✅ Strong: Claude-powered |
| security-review | Daily comprehensive threat modeling | Daily | ✅ Thorough: includes SAST tools |
| smoke-chroot | Chroot mode smoke test | PR + dispatch | ✅ Specialized for chroot feature |
| smoke-claude/codex/copilot | Engine functionality validation | 12h + PR | ✅ Good: validates all engines |
| test-coverage-improver | Identifies gaps, writes tests | Weekly | ✅ Good: security-focused PRs |
| update-release-notes | Enhances release notes on publish | On release | ✅ Good automation |
🚀 Actionable Recommendations
P0 — Implement Immediately
🏷️ Issue Triage Agent
What: Auto-label incoming issues with appropriate labels (bug, enhancement, documentation, question, security, performance) and post a welcoming comment summarizing the issue category.
Why: The issue-monster workflow assigns issues to Copilot for automated resolution, but currently has no labeling stage upstream. Adding labels would help issue-monster prioritize work and make the issue tracker useful for humans browsing by category. Currently there are 15+ open issues with inconsistent labeling.
How: Add a new workflow triggered on issues: [opened, reopened] using the issue-triage pattern from Pelis Factory. Use safe-outputs: add-labels and add-comment.
Effort: Low — well-established pattern, ~20 lines of workflow markdown
Example:
---
description: Triage incoming issues with labels and a welcome comment
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, enhancement, documentation, question, security, performance, firewall, container, mcp]
  add-comment:
    max: 1
timeout-minutes: 5
---
# Issue Triage Agent
For each new issue: analyze content in context of the gh-aw-firewall codebase
(a Docker/Squid network firewall for AI agents). Apply the most fitting label
from the allowed list. Leave a brief comment explaining your categorization
and suggesting next steps or related docs.
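The allow-list in the example is what keeps the agent from inventing labels. As a rough illustration of that guard (not how gh-aw implements safe-outputs; `filter_labels` and `ALLOWED_LABELS` are hypothetical names), proposed labels can be filtered like this:

```python
# Hypothetical sketch of an allow-list guard for agent-proposed labels.
# The label set mirrors the `allowed` list in the example workflow above.
ALLOWED_LABELS = frozenset({
    "bug", "enhancement", "documentation", "question", "security",
    "performance", "firewall", "container", "mcp",
})


def filter_labels(proposed, allowed=ALLOWED_LABELS):
    """Keep only allow-listed labels, lowercased, in order, without duplicates."""
    kept = []
    for label in proposed:
        normalised = label.strip().lower()
        if normalised in allowed and normalised not in kept:
            kept.append(normalised)
    return kept
```

Anything outside the list is dropped silently here; a real workflow might prefer to log rejected labels so the allow-list can be reviewed over time.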
P1 — Plan for Near-Term
🔍 Workflow Health Manager / Audit
What: A meta-workflow that audits all agentic workflow runs (costs, error rates, turn counts, recurring failures) and creates issues when workflows are underperforming or failing repeatedly.
Why: Looking at the open issues, there are 5+ [agentics] failure issues open simultaneously (secret-diggers, smoke-codex, issue-monster, and CI doctor are all failing). A health manager would detect these patterns proactively and create consolidated diagnostic issues rather than having CI Doctor open individual issues. The Pelis Factory's Audit Workflows workflow created 93 discussions and contributed to 9 issues that downstream agents then acted on.
How: Use agentic-workflows tool to fetch recent run data, analyze failure patterns, calculate cost trends. Create issues for repeated failures. Run daily.
Effort: Medium — requires interpreting workflow run data
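The "repeated failures" check could be as simple as counting each workflow's current failure streak. The sketch below assumes run data has already been fetched and reduced to `(workflow_name, conclusion)` pairs ordered oldest to newest; that record shape and the threshold of three are assumptions, not something the agentic-workflows tool is known to return.

```python
from collections import defaultdict


def find_failing_workflows(runs, min_consecutive=3):
    """Return {workflow: streak} for workflows whose latest runs all failed.

    `runs` is an iterable of (workflow_name, conclusion) tuples, oldest first.
    A workflow is reported when its most recent `min_consecutive` runs
    (or more) all have the conclusion "failure".
    """
    history = defaultdict(list)
    for name, conclusion in runs:
        history[name].append(conclusion)

    failing = {}
    for name, conclusions in history.items():
        streak = 0
        # Walk backwards from the newest run until a non-failure is found.
        for conclusion in reversed(conclusions):
            if conclusion != "failure":
                break
            streak += 1
        if streak >= min_consecutive:
            failing[name] = streak
    return failing
```

Reporting one consolidated issue per entry in the returned mapping would match the goal of replacing many individual failure issues with a single diagnostic per workflow.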
⚠️ Breaking Change Checker
What: A workflow that detects potentially breaking changes in PRs — CLI flag removals/renames, type changes in public APIs, container image interface changes, domain whitelist format changes.
Why: AWF is a tool used by other teams and integrated into larger pipelines. A breaking change to --allow-domains syntax or container entrypoint could silently break downstream users. Currently there is no automated detection. Related issue: #1001 (LD_PRELOAD breaking Deno scoped permissions) is an example of a compatibility problem that wasn't caught before shipping.
How: Trigger on PR, diff src/cli.ts options, src/types.ts, containers/agent/entrypoint.sh, and action.yml. Compare against main branch to flag removals or signature changes.
Effort: Medium
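For the CLI-flag part of this check, one cheap heuristic is to compare the set of long flags mentioned in src/cli.ts on main against the PR head. This is a minimal sketch under the assumption that flags appear literally as `--flag-name` in the source; it does not parse TypeScript, and the function names are illustrative.

```python
import re

# Matches long options such as --allow-domains (lowercase, digits, hyphens).
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9-]*")


def extract_flags(source_text):
    """Collect every long flag that appears literally in the source text."""
    return set(FLAG_PATTERN.findall(source_text))


def removed_flags(base_source, head_source):
    """Flags present on the base branch but missing from the PR head."""
    return sorted(extract_flags(base_source) - extract_flags(head_source))
```

A rename shows up as a removal plus an addition, so a flag such as `--allow-domains` disappearing from the head version would be reported even if a similarly named replacement was introduced.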
🤔 PR Code Quality Reviewer (Grumpy Reviewer)
What: An adversarial code reviewer triggered on every PR that looks for non-obvious issues: off-by-one errors, missing error handling, inconsistent naming, logic that seems correct but has edge cases. Different from security-guard (which is security-focused).
Why: The current security-guard is excellent at security boundaries, but there's no general code quality reviewer. The grumpy-reviewer pattern in agentics deliberately takes a skeptical perspective to surface subtle bugs. Several recent PRs (test coverage PRs #1161, #1162, #1163) could benefit from code review.
How: Use githubnext/agentics/grumpy-reviewer as a template. Trigger on PR, use safe-outputs: add-comment (hiding older comments to avoid spam), and set a 10-minute timeout.
Effort: Low — template available in agentics repo
P2 — Consider for Roadmap
🧹 Continuous Simplicity / Code Cleanup PRs
What: A weekly or daily workflow that identifies overly complex code, deeply nested conditionals, functions that are too long, and proposes simplification PRs.
Why: As the codebase grows (now 24 source files in src/), complexity creeps in. The Pelis Factory's "Continuous Simplicity" workflow had 22 merged PRs out of 28 proposed (roughly a 79% merge rate), showing high signal quality. For a security tool, simpler code means fewer bugs.
Effort: Medium — needs focused scope to avoid spurious PRs
🔀 Mergefest — Auto-Sync PR Branches with Main
What: A workflow that automatically merges the main branch into open PRs that are behind, eliminating the "please merge main" ceremony.
Why: Several current open PRs (#1079, #1150, #1163) are long-lived and will experience merge conflicts. With active development on main, keeping PRs current is manual overhead.
Effort: Low — mergefest pattern available directly from Pelis Factory: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/mergefest.md
🦠 Daily Malicious Code Scan
What: A workflow that reviews recent code commits (last 24h) for suspicious patterns — unexpected network calls, data exfiltration vectors, encoded payloads, unusual file operations.
Why: AWF is itself a security tool, making it a high-value target for supply chain attacks. A daily scan of new commits aligns with the repository's security-first mission. Pelis Factory's equivalent found real issues in production. The existing secret-digger workflows test runtime secrets but not source code injection.
Effort: Low — can be adapted from Pelis Factory: gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-malicious-code-scan.md
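Before an agent reads the commits, a deterministic pre-filter can flag the most obvious patterns in the added lines of a unified diff. The patterns below are illustrative assumptions, not the rule set used by the Pelis Factory workflow, and they will produce both false positives and false negatives.

```python
import re

# Illustrative patterns only; a real scan would need a broader, tuned set.
SUSPICIOUS_PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(ba)?sh\b"),
    "long-base64-blob": re.compile(r"[A-Za-z0-9+/]{80,}={0,2}"),
    "dynamic-eval": re.compile(r"\beval\s*\("),
}


def scan_diff(diff_text):
    """Return (line_number, pattern_name) for each suspicious added line.

    Only lines starting with "+" are inspected; "+++" file headers,
    removed lines, and context lines are skipped.
    """
    findings = []
    for line_no, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings
```

Findings from a filter like this are best treated as hints that direct the agent's attention, since legitimate install scripts and test fixtures can match the same patterns.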
📰 Weekly Repo Chronicle
What: A weekly summary discussion of what changed in the repository — merged PRs, opened/closed issues, workflow highlights, notable commits — presented in a readable narrative format.
Why: With 21 agentic workflows generating activity and 10+ open PRs, it's hard to keep a holistic view of what's happening. A weekly digest makes the project more transparent to contributors and stakeholders. Reference: githubnext/agentics/weekly-repo-chronicle.
Effort: Low
P3 — Future Ideas
📊 Portfolio Analyst for Workflow Cost Optimization
What: Analyzes token usage, turn counts, and costs across all agentic workflow runs. Identifies wasteful workflows (too chatty, too many turns for simple tasks) and suggests optimizations.
Why: With 21 workflows running daily/hourly (especially 3 secret-digger variants running every hour each), costs could accumulate. The Pelis Factory's Portfolio Analyst identified workflows that were "way too chatty" and created optimization opportunities.
Effort: Medium — requires access to billing/usage data
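Even without billing data, run volume can be estimated from the schedules in the inventory. The sketch below counts scheduled runs only; event-triggered runs (PRs, issue opens) are excluded, and the cadence names are simply labels chosen for this example.

```python
# Scheduled runs per day for each cadence label used in the inventory.
RUNS_PER_DAY = {"hourly": 24, "12h": 2, "daily": 1, "weekly": 1 / 7}


def scheduled_runs_per_day(workflows):
    """Map each workflow to its scheduled runs per day.

    `workflows` maps a name to (cadence, engine_variants), for example
    {"secret-digger": ("hourly", 3)} for three hourly engine variants.
    """
    return {
        name: RUNS_PER_DAY[cadence] * variants
        for name, (cadence, variants) in workflows.items()
    }
```

With the inventory's figures, the three hourly secret-digger variants account for 72 scheduled runs per day, against 6 for the three smoke-test engines on a 12-hour cadence, which is why they are the first candidates for a cost review.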
🌳 Issue Arborist — Issue Organization
What: Groups related issues as parent/sub-issues, creating hierarchy in the issue tracker.
Why: Open issues like #1039 (integration test gaps) and #1103 (shutdown performance) could each have multiple sub-issues tracking specific work items. The Pelis Factory's Arborist created 77 discussion reports and 18 parent issues.
Effort: Medium
🔄 Sub-Issue Closer
What: Automatically closes sub-issues when their parent issue is resolved.
Why: Pairs with Issue Arborist. Available from agentics: gh aw add-wizard githubnext/agentics/sub-issue-closer.
Effort: Low (but depends on Issue Arborist first)
📈 Maturity Assessment
Top-tier for a project this size; comparable to mature OSS projects
Current Level: 4/5 — "Advanced Practitioner"
Has specialized workflows across all major categories, multi-engine testing, and self-referential meta-analysis. Running dozens of workflows in production.
Target Level: 4.5/5 — "Factory-Grade"
Close the gaps in meta-observability, issue triage, and continuous quality improvement.
Gap Analysis: The 3 highest-leverage additions are:
Issue triage labeling (P0 — 1-2 hours to implement)
Workflow Health Manager (P1 — 4-6 hours)
Breaking Change Checker (P1 — 3-4 hours)
🔄 Comparison with Best Practices
What This Repo Does Well
Domain-specific security focus — The secret-digger triad (3 engines, hourly) is more sophisticated than anything in the Pelis Factory reference
Multi-engine testing — Testing Claude, Codex, and Copilot in smoke tests shows maturity in agent infrastructure validation
Causality chains — CI Doctor → issue → issue-monster → Copilot PR is a working production causality chain
Strict security model — roles: all where appropriate, skip-if-match anti-spam, scoped tool permissions
Self-aware — This Pelis Advisor workflow itself shows the team is tracking against best practices
What Could Improve
Upstream triage — Before issue-monster can do its job well, issues need labels/priority to enable smarter assignment
Meta-observability — No systematic tracking of which workflows are healthy vs. problematic (manually scanning issues today)
Continuous quality — No weekly code simplicity/cleanup PRs; test coverage improvements are good but security-only
Unique Opportunities Given the Domain (Firewall/Security)
Firewall policy regression tests — Auto-test that every new release still blocks the right domains and allows the right ones (automated integration test generation for new domain patterns)
Squid config linter — A workflow that validates generated squid.conf syntax and semantics using Squid's built-in tools in CI
Container attack surface tracker — Regular analysis of what capabilities and syscalls the agent container exposes, trending over releases
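A firewall policy regression test boils down to a table of hosts with expected allow/block decisions. The sketch below models the decision with simple suffix matching (a host is allowed if it equals an allowed domain or is a subdomain of one); that matching rule is an assumption for illustration and may differ from how AWF and Squid actually evaluate --allow-domains. A real test would replace `is_allowed` with a request through the firewall.

```python
def is_allowed(host, allowed_domains):
    """Assumed rule: exact match or subdomain of an allowed domain."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def check_policy(allowed_domains, cases):
    """Return the (host, expected) cases whose decision differs from expected.

    `cases` is a list of (host, expected_allowed) pairs; an empty result
    means the policy behaves as the table says it should.
    """
    allowed = [d.lower() for d in allowed_domains]
    return [
        (host, expected)
        for host, expected in cases
        if is_allowed(host, allowed) != expected
    ]
```

Look-alike hosts such as `evilgithub.com` and `github.com.evil.test` are worth keeping in the table permanently, since they are the cases a naive substring match would wrongly allow.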
Run date: 2026-03-11 | Workflows analyzed: 21 agentic, 11 standard CI | Open issues: 15 | Open PRs: 10
Cache updated: /tmp/gh-aw/cache-memory/advisor-notes.json
From the Agentics Reference Repository
The githubnext/agentics repo contains these additional patterns not yet in this repo:
grumpy-reviewer: Adversarial code reviewer that finds non-obvious issues
daily-test-improver: Incremental test coverage (similar to test-coverage-improver but daily)
sub-issue-closer: Auto-closes completed sub-issues
contribution-check: Validates PRs against contribution guidelines
weekly-repo-chronicle: Weekly changelog-style summary of what changed
issue-arborist: Organizes issues into parent/sub-issue hierarchies
pr-nitpick-reviewer: Light-touch quality review on every PR