feat(bug-fix): add label-driven bug-fix agentic workflow #3258
Open
BenBtg wants to merge 11 commits into
Add a `bug-fix` gh-aw workflow as stage 2 of the assess -> fix -> test bug pipeline, mirroring the existing `bug-assess` stage. It triggers when a maintainer applies the `bug-fix` label, recovers the slug and remediation contract from the prior bug-assess assessment comment, applies the fix, and opens a draft pull request plus a summary comment for human review.

The workflow is intentionally decoupled from Spec Kit specifics: it consumes the assessment from the issue comment rather than any `.specify/` files, so it is portable to other repositories running the matching bug-assess stage.

- .github/workflows/bug-fix.md authored and compiled to bug-fix.lock.yml
- Label-gated trigger (github.event.label.name == 'bug-fix')
- Draft PR via create-pull-request safe-output; scoped permissions
- Untrusted-input / URL-safety guardrails consistent with bug-assess
- Maintainer remains the gatekeeper; no unattended automation

Refs #3238

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds a new label-driven bug-fix agentic workflow (stage 2 of the assess → fix → test pipeline) that consumes the prior bug-assess issue comment as its contract, applies the proposed remediation, and opens a draft PR plus a single summary comment for maintainer review.
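The flow described above (label gate first, then contract recovery from the latest assessment comment) can be sketched in Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: the `action` and `label.name` fields follow the shape of GitHub's `issues` webhook payload, but the `BUG_SLUG: <slug>` line format and the function names `should_run` and `recover_slug` are hypothetical, since the real bug-assess comment layout is not shown in this PR.

```python
import re
from typing import Optional

TRIGGER_LABEL = "bug-fix"

# Hypothetical comment format: one line of the form "BUG_SLUG: some-slug".
SLUG_LINE = re.compile(r"^BUG_SLUG:\s*([a-z0-9][a-z0-9-]*)\s*$", re.MULTILINE)


def should_run(event: dict) -> bool:
    """Mirror the job condition github.event.label.name == 'bug-fix'."""
    label = event.get("label") or {}
    return event.get("action") == "labeled" and label.get("name") == TRIGGER_LABEL


def recover_slug(comment_bodies: list[str]) -> Optional[str]:
    """Return the slug from the most recent comment that carries one.

    comment_bodies is assumed to be ordered oldest to newest.
    """
    for body in reversed(comment_bodies):
        match = SLUG_LINE.search(body)
        if match:
            return match.group(1)
    return None  # no assessment found: the caller should not guess a fix
```

Returning `None` rather than a default slug keeps the "never guesses a fix" behavior explicit: the caller has to handle the missing-assessment case.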
Changes:
- Added `.github/workflows/bug-fix.md` defining the bug-fix stage behavior, guardrails, and safe-outputs (draft PR + comment + status labels).
- Added compiled workflow `.github/workflows/bug-fix.lock.yml` generated via `gh aw compile` (v0.79.8) for execution in GitHub Actions.
| File | Description |
|---|---|
| .github/workflows/bug-fix.md | New authored gh-aw markdown workflow that parses the latest bug-assess comment and produces a draft PR + issue comment. |
| .github/workflows/bug-fix.lock.yml | Generated/locked GitHub Actions workflow corresponding to bug-fix.md (pins actions/images, expands guardrails/runtime wiring). |
Review details
- Files reviewed: 1/2 changed files
- Comments generated: 1
- Review effort level: Low
Address Copilot review feedback on PR #3258:

- Trim tools.bash to the inspect set plus a small test-runner set (pytest, npm, go, cargo, dotnet), dropping package-manager/build tools (pip, npx, pnpm, yarn, mvn, gradle, make, bundle, rake, ruby, node) to reduce blast radius under prompt injection.
- Set create-pull-request.protected-files.policy: blocked so edits to sensitive files (dependency manifests, README/CHANGELOG/SECURITY, etc.) block PR creation, matching the stronger contract used by the other PR-creating workflows in this repo.

Refs #3238

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
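The two guardrails in this commit (a command allowlist and a protected-files block) can be illustrated with a rough Python sketch. The test-runner names come from the commit message; everything else is an assumption for illustration: the members of the "inspect set" are not listed in this PR, the glob patterns are invented stand-ins for the real protected-files list, and gh-aw enforces these rules through its own configuration rather than through code like this.

```python
import shlex
from fnmatch import fnmatch

# Test runners named in the commit message.
TEST_RUNNERS = {"pytest", "npm", "go", "cargo", "dotnet"}
# Assumed "inspect set": the commit does not list its members.
INSPECT_TOOLS = {"cat", "ls", "grep", "head", "tail", "git"}
ALLOWED = TEST_RUNNERS | INSPECT_TOOLS

# Illustrative patterns only; the real protected-files list is defined by gh-aw.
PROTECTED_PATTERNS = ["README*", "CHANGELOG*", "SECURITY*", "package.json", "requirements*.txt"]


def is_command_allowed(command: str) -> bool:
    """Allow a command only if its first token is on the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(tokens) and tokens[0] in ALLOWED


def blocked_files(changed_paths: list[str]) -> list[str]:
    """Return the changed paths whose basename matches a protected pattern."""
    return [
        path
        for path in changed_paths
        if any(fnmatch(path.rsplit("/", 1)[-1], pattern) for pattern in PROTECTED_PATTERNS)
    ]
```

A first-token check like this is deliberately simple and is not a sandbox: a chained command such as `pytest && pip install x` would pass it, so a real enforcement layer needs to reject or parse shell operators as well.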
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The Copilot autofix commits edited bug-fix.md (verdict phrasing, Assisted-by trailer) but did not recompile the lock, leaving body_hash stale. Since the workflow runs with strict integrity, the runtime-imported bug-fix.md must match the lock's recorded body_hash. Recompiled with gh-aw v0.79.8 (checkout pin kept at v7.0.0 to match sibling locks); the only change is the body_hash.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
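The staleness problem described here is a plain content-hash mismatch, which a short Python sketch can make concrete. The use of SHA-256 over the UTF-8 body and the helper names `body_hash` and `lock_is_stale` are assumptions; the PR does not say how gh-aw actually computes `body_hash`.

```python
import hashlib


def body_hash(markdown_body: str) -> str:
    """SHA-256 hex digest of the body (assumed scheme; gh-aw's may differ)."""
    return hashlib.sha256(markdown_body.encode("utf-8")).hexdigest()


def lock_is_stale(markdown_body: str, recorded_hash: str) -> bool:
    """True when the authored body no longer matches the hash stored in the lock."""
    return body_hash(markdown_body) != recorded_hash
```

Under this model, any edit to the authored markdown, even a one-word phrasing change from an autofix commit, changes the digest, so the lock has to be regenerated in the same change.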
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…eference

Address two Copilot review findings:

- add-labels.max: the authored frontmatter said max:1 but the committed lock enforced max:2 (stale from an earlier frontmatter), and Step 8 said 'max 2 labels total'. The workflow only ever applies ONE status label per run (fix-proposed | needs-reproduction | fix-blocked | needs-assessment), so 1 is the correct, tightest contract. Recompiled so the lock now enforces max:1, and reworded Step 8 to 'exactly one status label per run'.
- bug-test label: Step 7 hard-coded applying a 'bug-test' label that does not exist in this repo. Since the workflow is portable, reworded to present the stage-3 bug-test workflow as the planned next stage 'if the repository has it configured' rather than assuming it exists.

Recompiled with gh-aw v0.79.8; checkout pins kept at v7.0.0 to match sibling locks. No compile drift.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…lock

A prior autofix flipped the authored frontmatter add-labels.max back to 2, re-introducing the mismatch: source said 2, the compiled lock enforced 1, and Step 8 prose says 'exactly one status label per run'. The workflow only ever applies a single status label per run (needs-assessment | needs-reproduction | fix-proposed | fix-blocked), so 1 is the correct, tightest contract and matches the compiled lock. Set the frontmatter to max:1 so source, lock, and prose all agree (also avoids the lock staleness guard failing on a frontmatter mismatch).

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
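The contract this commit settles on (one status label per run, with source and lock agreeing on the cap) can be sketched in Python as follows. The four label names come from the commit message; the function names and the idea of validating a requested label list in code are illustrative assumptions, since gh-aw enforces `add-labels.max` itself.

```python
STATUS_LABELS = ("needs-assessment", "needs-reproduction", "fix-proposed", "fix-blocked")


def validate_status_labels(requested: list[str], max_labels: int = 1) -> list[str]:
    """Reject unknown labels and any request that exceeds the cap."""
    unknown = [label for label in requested if label not in STATUS_LABELS]
    if unknown:
        raise ValueError(f"not a status label: {unknown}")
    if len(requested) > max_labels:
        raise ValueError(f"{len(requested)} labels requested, cap is {max_labels}")
    return list(requested)


def caps_agree(source_max: int, lock_max: int) -> bool:
    """True when the authored frontmatter and the compiled lock enforce the same cap."""
    return source_max == lock_max
```

With the cap at 1, a run that tried to apply both `fix-proposed` and `fix-blocked` would be rejected, and `caps_agree(2, 1)` captures the source/lock mismatch that the prior autofix re-introduced.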
Summary

Adds a `bug-fix` agentic workflow (gh-aw) as stage 2 of the assess → fix → test bug pipeline, mirroring the existing `bug-assess` stage. When a maintainer applies the `bug-fix` label to an issue, the workflow recovers the slug and remediation contract from the prior `bug-assess` assessment comment, applies the fix, and opens a draft pull request plus a summary comment for human review.

The workflow is intentionally decoupled from Spec Kit specifics: it consumes the assessment from the issue comment rather than any `.specify/` files or specify-CLI tooling, so it can be lifted into any repository running the matching `bug-assess` stage.

What's included

- `.github/workflows/bug-fix.md`: authored markdown workflow, compiled to `.github/workflows/bug-fix.lock.yml` with `gh aw compile` (v0.79.8, matching the existing locks).
- Label-gated trigger on `bug-fix` (`github.event.label.name == 'bug-fix'`).
- `create-pull-request` safe-output opens a draft PR (`[bug-fix]` prefix, labels `bug-fix`/`automated`) plus a single summary comment linking the PR. Uses `Refs #` (not `Closes`) since the test stage and human review still follow.
- Recovers `BUG_SLUG` and the remediation/files/tests contract from the bug-assess comment.
- … `bug-assess` first (`needs-assessment` label); never guesses a fix.
- Untrusted-input / URL-safety guardrails consistent with `bug-assess`, minimal-change scope, no destructive commands. Maintainer remains the gatekeeper; no unattended automation.

Acceptance criteria

- `bug-fix` markdown workflow added under `.github/workflows/` and compiled to its `.lock.yml`
- Triggers on the `bug-fix` label; gated so it only runs for that label

Validation

- `gh aw compile bug-fix` → 0 errors, 0 warnings
- `markdownlint-cli2` on `bug-fix.md` → 0 errors
- `checkout` action pin aligned to v7.0.0 in the lock body to match sibling locks

Deployment note (action required before live use)
The `bug-fix` trigger label must exist so the `issues: labeled` event can fire, same as `bug-assess`. It has been created on this repo (`#d93f0b`, description "Trigger the bug-fix agentic workflow"). For reference, the equivalent command is: `gh label create bug-fix --repo github/spec-kit --color d93f0b --description "Trigger the bug-fix agentic workflow"`

The status labels the workflow applies (`needs-assessment`, `needs-reproduction`, `fix-proposed`, `fix-blocked`) auto-create on first use via GitHub's add-labels API, so only the trigger label needs pre-creating.

Testing notes
Static + structural validation (done):

- `gh aw compile bug-fix` → 0 errors / 0 warnings; `markdownlint-cli2` clean.
- `on:` trigger block is byte-identical to the already-deployed `bug-assess.lock.yml` and `add-community-extension.lock.yml` (issue-only, `types: [labeled]`, label filtering via job condition). Same 7-job structure and permission isolation (read-only agent job; writes via the isolated `safe_outputs` job).
- `gh aw trial` install dry-run confirmed the workflow compiles, installs, and becomes ACTIVE in a host repo.

On `gh aw trial` (why no automated end-to-end run is included): `gh aw trial --clone-repo` cannot drive an issue-only, label-gated workflow. In clone-repo mode it invokes the `workflow_dispatch` API, which returns HTTP 422 because this workflow (like its siblings) declares no `workflow_dispatch` trigger. Adding `workflow_dispatch` purely for trial-ability was rejected to preserve consistency with `bug-assess` and to keep the trigger surface tight. The authentic way to exercise it is the live label flow in a sandbox repo: create an issue, post a `bug-assess`-style assessment comment carrying the slug, then apply the `bug-fix` label.

If validating in a sandbox, useful workarounds observed during trial install: pass an absolute workflow path; enable host-repo Actions PR-approval permission; use a GitHub-noreply git email (avoids GH007 on push); and `--disable-security-scanner` (the defensive "ignore previous instructions" text in the Untrusted-Input guardrail trips a false-positive prompt-injection flag).

Refs #3238
This PR was authored by GitHub Copilot (model: Claude Opus 4.8) on behalf of @BenBtg.
Fixes: #3238