[Pelis Agent Factory Advisor] Pelis Agent Factory Advisor - Agentic Workflow Maturity Report (2026-03-06) #1166
🔮 The ancient spirits stir; the smoke test agent was here, and the omens are recorded in the repository’s winds.
📊 Executive Summary
gh-aw-firewall has grown into one of the most agentically mature repositories in the Pelis ecosystem, operating 28 agentic workflows covering CI/CD, security scanning, smoke testing, documentation, and code quality. The repository is well past the "starter" phase and entering the "factory optimization" phase: the next opportunities are meta-level observability, continuous code quality agents, issue organization, and security-specific automations that uniquely leverage this repo's domain (firewall/egress control for AI agents).
🎓 Patterns Learned from Pelis Agent Factory
Key Patterns from the Documentation Site
Key Patterns from the githubnext/agentics Reference Repository
The agentics repo contains 38 reference workflows including: daily-test-improver, ci-coach, issue-arborist, sub-issue-closer, grumpy-reviewer, pr-nitpick-reviewer, daily-file-diet, duplicate-code-detector, daily-perf-improver, weekly-issue-summary, glossary-maintainer, repo-ask (Q&A bot), and more.
How This Repo Compares
Well-covered: Security scanning (3 secret diggers, security-guard, security-review, dependency-security-monitor), smoke tests (4 variants), CI investigation (ci-doctor), documentation (doc-maintainer), test coverage (test-coverage-improver).
Missing: Issue organization, continuous code quality, meta-observability, release automation, performance improvement agents.
📋 Current Agentic Workflow Inventory
build-test-{bun,cpp,deno,dotnet,go,java,node,rust}
ci-doctor
ci-cd-gaps-assessment
cli-flag-consistency-checker
dependency-security-monitor
doc-maintainer
issue-duplication-detector
issue-monster
pelis-agent-factory-advisor
plan (/plan slash command)
secret-digger-{claude,codex,copilot}
security-guard
security-review
smoke-{claude,codex,copilot,chroot}
test-coverage-improver
update-release-notes
🚀 Actionable Recommendations
P0 — Implement Immediately
P0.1: Issue Triage Agent 🏷️
What: Auto-label incoming issues with categories: bug, feature, documentation, question, security, performance, firewall-domain, integration-test.
Why: The issue tracker currently has dozens of open issues, including many unlabeled ones. Without triage, issue-monster has to work harder to prioritize. This is the #1 "hello world" of agentic workflows per Pelis guidance and provides immediate value.
How: Trigger on issues: [opened, reopened]. Analyze issue content against the AWF domain (proxy errors, domain blocking, container issues, CI failures). Apply one label and leave an explanatory comment.
Effort: Low (1-2 hours)
Example config:
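Separately from the workflow config, the labelling decision in the How step can be illustrated with a small Python sketch. In the proposed workflow an agent would read the issue and choose the label; the keyword matcher below is only a deterministic stand-in for that step. The label names come from this report, while the keyword lists, the `pick_label` function, and the fallback to `question` are assumptions made for illustration.

```python
# Labels are taken from the report; the keywords per label are illustrative guesses.
LABEL_KEYWORDS = {
    "security": ["vulnerability", "cve", "exploit", "bypass"],
    "firewall-domain": ["proxy", "squid", "blocked domain", "allow-domains", "dns"],
    "integration-test": ["ci failure", "smoke test", "build-test"],
    "performance": ["slow", "latency", "timeout"],
    "documentation": ["docs", "readme", "typo"],
    "bug": ["error", "crash", "fails", "regression"],
    "feature": ["feature request", "would be nice", "support for"],
}
DEFAULT_LABEL = "question"  # assumed fallback when nothing matches


def pick_label(title, body):
    """Return the single label whose keywords occur most often in the issue text."""
    text = f"{title}\n{body}".lower()
    scores = {
        label: sum(text.count(keyword) for keyword in keywords)
        for label, keywords in LABEL_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the earlier dict entry
    return best if scores[best] > 0 else DEFAULT_LABEL


proxy_label = pick_label("Squid proxy returns 403", "The proxy blocks an allowed domain")
vague_label = pick_label("How do I install this?", "")
print(proxy_label, vague_label)
```

Returning exactly one label mirrors the "apply one label" rule above; the explanatory comment could list the matched keywords.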
P1 — Plan for Near-Term
P1.1: Workflow Health Manager 🏥
What: A meta-agent that monitors the health of all other agentic workflows in this repository — tracking failure rates, turn counts, cost trends, and missed runs.
Why: With 28 workflows, you're beyond what you can manually monitor. Currently there are many simultaneous failures (build-test-bun through build-test-rust all failed recently, issues #1122-#1133). A health manager would aggregate these patterns, identify systemic issues (e.g., "all build-test workflows fail after npm changes"), and create actionable issues. Per Pelis: Workflow Health Manager created 40 issues with a 34-PR causal chain.
How: Daily schedule, uses the agentic-workflows tool to audit recent runs. Detects patterns: workflows with >50% failure rate over 7 days, cost outliers, workflows with no recent runs.
Effort: Medium
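Two of the detection rules above (failure rate over 50% in 7 days, and no recent runs) can be sketched in Python as follows. This is a minimal illustration, not the tool's implementation: the `audit_runs` function, the tuple layout of a run record, and the sample data are all hypothetical, and cost outliers are left out because the report does not say how cost is measured.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def audit_runs(runs, known_workflows, now, window_days=7, threshold=0.5):
    """Return (failing, stale) for the audit window.

    runs: iterable of (workflow_name, finished_at, conclusion) tuples.
    failing: workflow -> failure rate, only for rates above the threshold.
    stale: known workflows with no run inside the window.
    """
    cutoff = now - timedelta(days=window_days)
    totals = defaultdict(int)
    failures = defaultdict(int)
    for name, finished_at, conclusion in runs:
        if finished_at < cutoff:
            continue  # outside the audit window
        totals[name] += 1
        if conclusion == "failure":
            failures[name] += 1
    failing = {}
    for name, total in totals.items():
        rate = failures[name] / total
        if rate > threshold:
            failing[name] = rate
    stale = sorted(set(known_workflows) - set(totals))
    return failing, stale


# Hypothetical sample data; real input would come from the workflow run history.
now = datetime(2026, 3, 6, tzinfo=timezone.utc)
runs = [
    ("build-test-bun", now - timedelta(days=1), "failure"),
    ("build-test-bun", now - timedelta(days=2), "failure"),
    ("build-test-bun", now - timedelta(days=3), "success"),
    ("ci-doctor", now - timedelta(days=1), "success"),
    ("ci-doctor", now - timedelta(days=10), "failure"),
]
failing, stale = audit_runs(runs, ["build-test-bun", "ci-doctor", "doc-maintainer"], now)
print(failing, stale)
```

Passing the list of known workflows explicitly is what makes "no recent runs" detectable, since a workflow that never ran leaves no records to count.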
P1.2: Breaking Change Checker ⚠️
What: Automatically detect when PRs introduce backward-incompatible changes to AWF's public interface: CLI flags, Docker API, env variable contracts, exit codes, config file formats.
Why: AWF is used as a dependency in other workflows. Breaking changes to --allow-domains, --dns-servers, --image-tag, or WrapperConfig types could silently break users. This is especially important for a security tool where unexpected behavior could expose agents to unrestricted network access.
How: Trigger on PRs. Analyze src/cli.ts, src/types.ts, and action.yml for interface changes. Cross-reference with CHANGELOG.md and semver expectations. Create alert issues for breaking changes.
Effort: Medium
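One narrow slice of this check, detecting removed or renamed CLI flags, can be sketched in Python by comparing the long flags that appear in the old and new text of a file. The flag names `--allow-domains`, `--dns-servers`, and `--image-tag` come from this report; the regex, the helper names, and the sample source snippets (including the renamed `--dns` flag) are hypothetical and do not reflect how src/cli.ts is actually written.

```python
import re

# Matches long options such as --allow-domains; a deliberately simple heuristic.
FLAG_RE = re.compile(r"--[a-z][a-z0-9-]*")


def extract_flags(source):
    """Collect every long flag mentioned anywhere in the source text."""
    return set(FLAG_RE.findall(source))


def diff_flags(old_source, new_source):
    """Return (removed, added) flags; removed flags are candidate breaking changes."""
    old_flags = extract_flags(old_source)
    new_flags = extract_flags(new_source)
    return sorted(old_flags - new_flags), sorted(new_flags - old_flags)


# Hypothetical before/after snippets standing in for the base and head of a PR.
old_cli = """
option('--allow-domains <list>')
option('--dns-servers <list>')
option('--image-tag <tag>')
"""
new_cli = """
option('--allow-domains <list>')
option('--dns <list>')
option('--image-tag <tag>')
"""
removed, added = diff_flags(old_cli, new_cli)
print(removed, added)
```

A text-level diff like this would miss changes to flag semantics or defaults, which is why the recommendation also calls for cross-referencing CHANGELOG.md and semver expectations.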
P1.3: Daily Domain Allowlist Regression Tester 🔒
What: A firewall-specific workflow that validates AWF's core promise every day: that blocked domains are actually blocked and allowed domains are actually allowed.
Why: This repo IS a security tool. A regression in domain blocking (e.g., a bug that makes .* match everything) could silently compromise all users. The existing smoke tests verify the tool runs, but don't systematically test the domain filtering logic against a known-good matrix of allowed/blocked domains.
How: Run awf with a known domain whitelist, attempt connections to both allowed and blocked domains, assert the correct outcomes. Use the awf logs stats command to verify the firewall log shows the right allow/deny decisions.
Effort: Medium
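The assertion step can be sketched in Python as a comparison between a known-good matrix and the outcomes actually observed. The sketch makes no assumption about how awf is invoked: the `probe` callable is a hypothetical placeholder for "attempt a connection through the firewall and report whether it succeeded", and the domains in the matrix and both fake probes are illustrative.

```python
def find_regressions(expected, probe):
    """Compare observed outcomes with the known-good matrix.

    expected: dict mapping domain -> True (should be allowed) or False (should be blocked).
    probe: callable taking a domain and returning True if the connection went through.
    Returns a list of (domain, observed_outcome) for every mismatch.
    """
    regressions = []
    for domain, should_allow in sorted(expected.items()):
        allowed = probe(domain)
        if allowed != should_allow:
            regressions.append((domain, "allowed" if allowed else "blocked"))
    return regressions


# Illustrative matrix; a real one would list the domains the project cares about.
EXPECTED = {"github.com": True, "api.github.com": True, "example.com": False}


def healthy_probe(domain):
    # Stand-in for a correctly working firewall.
    return domain == "github.com" or domain.endswith(".github.com")


def match_everything_probe(domain):
    # Stand-in for the regression described above, where a pattern matches everything.
    return True


ok = find_regressions(EXPECTED, healthy_probe)
broken = find_regressions(EXPECTED, match_everything_probe)
print(ok, broken)
```

Keeping the probe injectable means the same matrix check could be run against live connections and against parsed firewall log decisions.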
P2 — Consider for Roadmap
P2.1: Code Simplifier / Daily Code Quality Agent 🧹
What: Daily agent that analyzes recently-modified TypeScript/Shell files and proposes simplifications without changing functionality.
Why: This repo has active development on complex areas (iptables rules, docker-compose generation, proxy configuration). The Pelis Code Simplifier achieved an 83% merge rate. Given this repo's complexity (1700+ line docker-manager.ts), there's likely ongoing simplification value.
How: Use githubnext/agentics/code-simplifier as a starting template. Focus on files under src/, containers/agent/, and containers/squid/ that changed in the last 3 days.
Effort: Low (remix from template)
P2.2: Metrics Collector / Portfolio Analyst 📊
What: Weekly analytics on the entire agent ecosystem — tracking which workflows produce the most value (PRs merged, issues resolved), which cost the most, and where efficiency gains are possible.
Why: 28 workflows × multiple runs/day = significant cost. Portfolio Analyst identified "chatty" agents in Pelis factory. With secret-diggers running hourly across 3 engines, understanding their value-to-cost ratio is important.
How: Weekly schedule. Aggregate workflow run data using the agentic-workflows tool. Report on turn counts, durations, outcomes. Create a discussion with metrics.
Effort: Medium
P2.3: Release Changeset Automation 📦
What: When commits land on main, automatically draft a changeset entry that categorizes the change (feat/fix/chore) and suggests the appropriate semver bump.
Why: update-release-notes only runs after a release is published. Proactive changelog management means releases don't require manual changelog curation. The Pelis Changeset Generator achieved 78% merge rate.
How: Trigger on push to main. Analyze commit messages (conventional commits format is already enforced). Generate changelog entry draft and open a PR.
Effort: Medium
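The categorization step can be sketched in Python using the Conventional Commits mapping to semver: `feat` suggests a minor bump, `fix` a patch bump, and a `!` marker or a `BREAKING CHANGE:` footer a major bump. Treating every other type (such as `chore`) as "no bump" is an assumption of this sketch, as are the function names and sample messages.

```python
import re

# type, optional (scope), optional "!" marker, then ": description"
HEADER_RE = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: .+")
ORDER = [None, "patch", "minor", "major"]  # weakest to strongest bump


def bump_for(message):
    """Suggest a semver bump for one commit message, or None if it implies none."""
    header = message.splitlines()[0] if message else ""
    match = HEADER_RE.match(header)
    if match is None:
        return None  # not a conventional commit header
    if match.group("bang") or "BREAKING CHANGE:" in message:
        return "major"
    return {"feat": "minor", "fix": "patch"}.get(match.group("type"))


def overall_bump(messages):
    """The strongest bump implied by any commit in the batch."""
    return max((bump_for(m) for m in messages), key=ORDER.index, default=None)


# Hypothetical commit messages for illustration.
commits = [
    "fix: handle empty allowlist",
    "feat(cli): add a new option",
    "chore: bump dependencies",
]
suggested = overall_bump(commits)
print(suggested)
```

Taking the strongest bump across a batch matches how a single release covers several commits.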
P2.4: Issue Arborist 🌳
What: Automatically organize related issues into parent/sub-issue hierarchies.
Why: Currently there are many related open issues (multiple build-test failures, multiple agentics failures) that are scattered. Grouping them improves tracking. The Pelis Issue Arborist created 18 parent issues organizing 77+ related issues.
How: Weekly schedule. Find issues sharing labels or keywords. Create a parent issue with sub-issue links. This pairs well with issue-monster, which can then dispatch sub-issues to Copilot.
Effort: Medium
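The grouping step can be sketched in Python for the shared-label case. The `propose_parents` function, the `min_children` cutoff, the ignore list, and the issue numbers and labels below are all hypothetical; keyword-based grouping and the actual creation of parent issues are out of scope for this sketch.

```python
from collections import defaultdict


def propose_parents(issues, min_children=2, ignore=frozenset()):
    """Group issue numbers by shared label.

    issues: iterable of (issue_number, labels) pairs.
    Returns label -> sorted issue numbers, for labels shared by at least
    min_children issues. Labels in `ignore` never form a group.
    """
    groups = defaultdict(list)
    for number, labels in issues:
        for label in labels:
            if label not in ignore:
                groups[label].append(number)
    return {
        label: sorted(numbers)
        for label, numbers in groups.items()
        if len(numbers) >= min_children
    }


# Hypothetical open issues: (number, labels).
open_issues = [
    (101, ["build-test", "bug"]),
    (102, ["build-test", "bug"]),
    (103, ["build-test", "bug"]),
    (104, ["documentation"]),
]
parents = propose_parents(open_issues, ignore=frozenset({"bug"}))
print(parents)
```

Ignoring very broad labels such as `bug` keeps the proposed parents specific enough to be useful for tracking.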
P2.5: CI Coach 🏋️
What: Analyze CI pipeline efficiency and suggest optimizations: parallelization, caching improvements, unnecessary steps.
Why: The build-test workflows for 8 language ecosystems are running on every PR. CI Coach has achieved 100% merge rate in the Pelis factory (9/9 PRs merged). Simple wins like better npm caching or consolidating test runs could reduce CI minutes significantly.
How: Weekly schedule. Analyze workflow YAML for inefficiencies: missing cache steps, sequential jobs that could be parallel, redundant checkout steps.
Effort: Low (remix from githubnext/agentics/ci-coach)
P3 — Future Ideas
P3.1: Weekly Issue Summary Digest 📰
What: Weekly summary of issue activity — what was opened, closed, stalled, and what trends are emerging.
Why: With issue-monster running hourly and multiple agents creating issues, a human-readable weekly summary helps maintainers stay oriented without reading every issue. Available as template: githubnext/agentics/weekly-issue-summary.
Effort: Low (remix from template)
P3.2: Grumpy Code Reviewer 😤
What: An opinionated PR reviewer that focuses on code quality, not security (security-guard already covers security). Comments on code style, naming conventions, TypeScript best practices, shell script quality.
Why: Security Guard reviews for security issues, but there's no quality-focused reviewer. Given this repo enforces conventional commits and has strong TypeScript conventions (in AGENTS.md), an agent that enforces these on PRs would complement the existing security-guard.
Effort: Low (remix from githubnext/agentics/grumpy-reviewer)
P3.3: Mergefest — Auto-merge Main into PR Branches 🔀
What: Automatically merge main branch into open PR branches when they fall behind.
Why: Eliminates the "please merge main" back-and-forth, especially for long-running PRs or when many quick fixes land on main after PRs are opened.
Effort: Low (available from Pelis factory)
P3.4: Container Security Image Scanner 🐳
What: Daily scan of the GHCR-published Docker images (ghcr.io/github/gh-aw-firewall/squid:latest and agent:latest) for known CVEs.
Why: AWF publishes Docker images to GHCR. These images contain Squid proxy, Ubuntu packages, Node.js, and iptables: a significant attack surface. Container image scanning is especially important for a security tool. The dependency-security-monitor covers npm deps but not the container OS packages.
Effort: Medium (requires container registry access and trivy or grype)
📈 Maturity Assessment
Current Level: 3.5/5 — "Established Factory" — A rich collection of specialized agents covering CI, security, and documentation. Operating at scale with multiple engines (Claude, Codex, Copilot). Key missing layer: meta-observability and issue organization.
Target Level: 4.5/5 — "Optimized Factory" — Add observability layer, close issue management gaps, leverage the unique security domain for domain-specific agents.
Gap to Close:
🔄 Comparison with Best Practices
What This Repo Does Exceptionally Well ✅
shared/ directory with mcp-pagination.md, secret-audit.md, reporting.md shows sophisticated workflow composition
What Could Be Improved 🔧
Unique Opportunities for a Security/Firewall Repo 🔐
This report was generated by the Pelis Agent Factory Advisor workflow on 2026-03-06. Previous report: #1136 (2026-03-03). Cache memory updated at /tmp/gh-aw/cache-memory/advisor-notes.md.