feat(bug-fix): add label-driven bug-fix agentic workflow #3258

Open
BenBtg wants to merge 11 commits into main from benbtg-feat-3238-bug-fix-workflow


BenBtg (Contributor) commented Jun 30, 2026

Summary

Adds a bug-fix agentic workflow (gh-aw) as stage 2 of the assess → fix → test bug pipeline, mirroring the existing bug-assess stage. When a maintainer applies the bug-fix label to an issue, the workflow recovers the slug and remediation contract from the prior bug-assess assessment comment, applies the fix, and opens a draft pull request plus a summary comment for human review.

The workflow is intentionally decoupled from Spec Kit specifics: it consumes the assessment from the issue comment rather than any .specify/ files or specify-CLI tooling, so it can be lifted into any repository running the matching bug-assess stage.
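
As a rough illustration of recovering the slug from the issue comments alone, the sketch below scans comment bodies from newest to oldest. The `BUG_SLUG: <slug>` marker line and the `recover_slug` helper are assumptions made for this example; the actual layout of the bug-assess comment is not shown in this PR.

```python
import re
from typing import Optional, Sequence

# Hypothetical marker format: the real bug-assess comment layout is not shown
# in this PR, so "BUG_SLUG: <slug>" on its own line is an assumption.
SLUG_PATTERN = re.compile(r"^BUG_SLUG:[ \t]*([a-z0-9][a-z0-9-]*)[ \t]*$", re.MULTILINE)


def recover_slug(comments: Sequence[str]) -> Optional[str]:
    """Return the slug from the most recent comment that carries one, else None."""
    # Comments are assumed to be ordered oldest first, so scan backwards.
    for body in reversed(comments):
        match = SLUG_PATTERN.search(body)
        if match:
            return match.group(1)
    return None
```

A `None` result corresponds to the case where no prior assessment exists, in which the workflow should stop rather than guess.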

What's included

  • .github/workflows/bug-fix.md — authored markdown workflow, compiled to .github/workflows/bug-fix.lock.yml with gh aw compile (v0.79.8, matching the existing locks).
  • Label-gated trigger: only runs when the added label is bug-fix (github.event.label.name == 'bug-fix').
  • Delivery: create-pull-request safe-output opens a draft PR ([bug-fix] prefix, labels bug-fix/automated) plus a single summary comment linking the PR. Uses Refs # (not Closes) since the test stage and human review still follow.
  • Slug/contract reuse: recovers BUG_SLUG and the remediation/files/tests contract from the bug-assess comment.
  • No prior assessment → stops and comments asking a maintainer to run bug-assess first (needs-assessment label); never guesses a fix.
  • Safety: scoped permissions (agent job is read-only; PR creation is isolated), untrusted-input / URL-safety guardrails consistent with bug-assess, minimal-change scope, no destructive commands. Maintainer remains the gatekeeper — no unattended automation.
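
The label gate in the list above can be pictured as a small predicate over the `issues` webhook payload. This is only a sketch of the condition `github.event.label.name == 'bug-fix'`; the `should_run` name is hypothetical, and the real gate is evaluated as a job condition in the compiled workflow rather than in Python.

```python
def should_run(event: dict) -> bool:
    """Return True only for an issues 'labeled' event whose added label is 'bug-fix'."""
    if event.get("action") != "labeled":
        return False
    # The labeled event carries the label that was just added under "label".
    label = event.get("label") or {}
    return label.get("name") == "bug-fix"
```

Any other label, or any other issue action, leaves the workflow idle.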

Acceptance criteria

  • bug-fix markdown workflow added under .github/workflows/ and compiled to its .lock.yml
  • Triggered by applying the bug-fix label; gated so it only runs for that label
  • Generates a fix using shared bug logic, decoupled from Spec Kit specifics
  • Maintainer remains the gatekeeper; draft PR, no unattended automation
  • Output (draft PR + comment) posted back to the issue

Validation

  • gh aw compile bug-fix → 0 errors, 0 warnings
  • markdownlint-cli2 on bug-fix.md → 0 errors
  • checkout action pin aligned to v7.0.0 in the lock body to match sibling locks

Deployment note (action required before live use)

The bug-fix trigger label must exist so the issues: labeled event can fire, the same requirement as for bug-assess. It has been created on this repo (color #d93f0b, description "Trigger the bug-fix agentic workflow"). For reference, the equivalent command is:

gh label create bug-fix --repo github/spec-kit \
  --color d93f0b --description "Trigger the bug-fix agentic workflow"

The status labels the workflow applies (needs-assessment, needs-reproduction, fix-proposed, fix-blocked) auto-create on first use via GitHub's add-labels API, so only the trigger label needs pre-creating.

Testing notes

Static + structural validation (done):

  • gh aw compile bug-fix → 0 errors / 0 warnings; markdownlint-cli2 clean.
  • The compiled on: trigger block is byte-identical to the already-deployed bug-assess.lock.yml and add-community-extension.lock.yml (issue-only, types: [labeled], label filtering via job condition). Same 7-job structure and permission isolation (read-only agent job; writes via the isolated safe_outputs job).
  • A gh aw trial install dry-run confirmed the workflow compiles, installs, and becomes ACTIVE in a host repo.

On gh aw trial (why no automated end-to-end run is included): gh aw trial --clone-repo cannot drive an issue-only, label-gated workflow — in clone-repo mode it invokes the workflow_dispatch API, which returns HTTP 422 because this workflow (like its siblings) declares no workflow_dispatch trigger. Adding workflow_dispatch purely for trial-ability was rejected to preserve consistency with bug-assess and to keep the trigger surface tight. The authentic way to exercise it is the live label flow in a sandbox repo: create an issue, post a bug-assess-style assessment comment carrying the slug, then apply the bug-fix label.

If validating in a sandbox, these workarounds proved useful during the trial install: pass an absolute workflow path; enable the host repo's Actions PR-approval permission; use a GitHub-noreply git email (avoids GH007 on push); and pass --disable-security-scanner (the defensive "ignore previous instructions" text in the Untrusted-Input guardrail trips a false-positive prompt-injection flag).

Refs #3238


This PR was authored by GitHub Copilot (model: Claude Opus 4.8) on behalf of @BenBtg.

Fixes: #3238

Add a `bug-fix` gh-aw workflow as stage 2 of the assess -> fix -> test
bug pipeline, mirroring the existing `bug-assess` stage. It triggers when
a maintainer applies the `bug-fix` label, recovers the slug and remediation
contract from the prior bug-assess assessment comment, applies the fix, and
opens a draft pull request plus a summary comment for human review.

The workflow is intentionally decoupled from Spec Kit specifics: it consumes
the assessment from the issue comment rather than any `.specify/` files, so it
is portable to other repositories running the matching bug-assess stage.

- .github/workflows/bug-fix.md authored and compiled to bug-fix.lock.yml
- Label-gated trigger (github.event.label.name == 'bug-fix')
- Draft PR via create-pull-request safe-output; scoped permissions
- Untrusted-input / URL-safety guardrails consistent with bug-assess
- Maintainer remains the gatekeeper; no unattended automation

Refs #3238

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 30, 2026 14:09

Copilot AI left a comment


Pull request overview

Adds a new label-driven bug-fix agentic workflow (stage 2 of the assess → fix → test pipeline) that consumes the prior bug-assess issue comment as its contract, applies the proposed remediation, and opens a draft PR plus a single summary comment for maintainer review.

Changes:

  • Added .github/workflows/bug-fix.md defining the bug-fix stage behavior, guardrails, and safe-outputs (draft PR + comment + status labels).
  • Added compiled workflow .github/workflows/bug-fix.lock.yml generated via gh aw compile (v0.79.8) for execution in GitHub Actions.
Summary per file:

File | Description
.github/workflows/bug-fix.md | New authored gh-aw markdown workflow that parses the latest bug-assess comment and produces a draft PR + issue comment.
.github/workflows/bug-fix.lock.yml | Generated/locked GitHub Actions workflow corresponding to bug-fix.md (pins actions/images, expands guardrails/runtime wiring).

Review details

  • Files reviewed: 1/2 changed files
  • Comments generated: 1
  • Review effort level: Low

Comment thread .github/workflows/bug-fix.md Outdated

Copilot AI left a comment


Review details

  • Files reviewed: 1/2 changed files
  • Comments generated: 1
  • Review effort level: Low

Comment thread .github/workflows/bug-fix.md
Address Copilot review feedback on PR #3258:

- Trim tools.bash to the inspect set plus a small test-runner set
  (pytest, npm, go, cargo, dotnet), dropping package-manager/build
  tools (pip, npx, pnpm, yarn, mvn, gradle, make, bundle, rake, ruby,
  node) to reduce blast radius under prompt injection.
- Set create-pull-request.protected-files.policy: blocked so edits to
  sensitive files (dependency manifests, README/CHANGELOG/SECURITY,
  etc.) block PR creation, matching the stronger contract used by the
  other PR-creating workflows in this repo.

Refs #3238

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
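
To make the effect of a blocked protected-files policy concrete, here is a minimal sketch of the decision it implies: any change to a protected path stops PR creation. The pattern list and both function names are hypothetical; the commit names the categories (dependency manifests, README/CHANGELOG/SECURITY) but not the exact list gh-aw enforces.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Hypothetical patterns chosen for illustration; not the list gh-aw uses.
PROTECTED_PATTERNS = ("README*", "CHANGELOG*", "SECURITY*", "package.json", "pyproject.toml")


def protected_hits(changed_files: Iterable[str]) -> List[str]:
    """Return the changed paths whose file name matches a protected pattern."""
    hits = []
    for path in changed_files:
        name = path.rsplit("/", 1)[-1]  # compare on the file name only
        if any(fnmatchcase(name, pattern) for pattern in PROTECTED_PATTERNS):
            hits.append(path)
    return hits


def may_create_pr(changed_files: Iterable[str]) -> bool:
    """Under a 'blocked' policy, a single protected hit stops PR creation."""
    return not protected_hits(changed_files)
```
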
BenBtg marked this pull request as ready for review June 30, 2026 18:48
BenBtg requested a review from mnriem as a code owner June 30, 2026 18:48
Copilot AI review requested due to automatic review settings June 30, 2026 18:48

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

BenBtg requested a review from Copilot June 30, 2026 22:25
BenBtg self-assigned this Jun 30, 2026

Copilot AI left a comment


Review details

  • Files reviewed: 1/2 changed files
  • Comments generated: 3
  • Review effort level: Low

Comment thread .github/workflows/bug-fix.md
Comment thread .github/workflows/bug-fix.md Outdated
Comment thread .github/workflows/bug-fix.md Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 30, 2026 22:43
BenBtg and others added 2 commits June 30, 2026 23:43
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The Copilot autofix commits edited bug-fix.md (verdict phrasing, Assisted-by
trailer) but did not recompile the lock, leaving body_hash stale. Since the
workflow runs with strict integrity, the runtime-imported bug-fix.md must match
the lock's recorded body_hash. Recompiled with gh-aw v0.79.8 (checkout pin kept
at v7.0.0 to match sibling locks); the only change is the body_hash.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
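
The staleness problem described in the commit message above can be sketched as a hash comparison. This assumes a plain SHA-256 over the UTF-8 body text, which is an illustrative choice; the hashing scheme gh-aw actually uses for body_hash is not documented in this PR, and both function names are hypothetical.

```python
import hashlib


def body_hash(markdown_body: str) -> str:
    """SHA-256 hex digest of the body text (assumed scheme, not confirmed as gh-aw's)."""
    return hashlib.sha256(markdown_body.encode("utf-8")).hexdigest()


def lock_is_stale(markdown_body: str, recorded_hash: str) -> bool:
    """True when the source body no longer matches the hash recorded at compile time."""
    return body_hash(markdown_body) != recorded_hash
```

Under this model, any edit to bug-fix.md without a recompile leaves the recorded hash behind the source, which is the condition the recompile corrected.
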

Copilot AI left a comment


Review details

  • Files reviewed: 1/2 changed files
  • Comments generated: 0 new
  • Review effort level: Low

Copilot AI left a comment


Review details

  • Files reviewed: 1/2 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment thread .github/workflows/bug-fix.md Outdated
Comment thread .github/workflows/bug-fix.md Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 30, 2026 23:45
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment


Review details

  • Files reviewed: 1/2 changed files
  • Comments generated: 2
  • Review effort level: Low

Comment thread .github/workflows/bug-fix.md
Comment thread .github/workflows/bug-fix.md Outdated
BenBtg and others added 2 commits July 1, 2026 00:57
…eference

Address two Copilot review findings:

- add-labels.max: the authored frontmatter said max:1 but the committed lock
  enforced max:2 (stale from an earlier frontmatter), and Step 8 said 'max 2
  labels total'. The workflow only ever applies ONE status label per run
  (fix-proposed | needs-reproduction | fix-blocked | needs-assessment), so 1 is
  the correct, tightest contract. Recompiled so the lock now enforces max:1, and
  reworded Step 8 to 'exactly one status label per run'.
- bug-test label: Step 7 hard-coded applying a 'bug-test' label that does not
  exist in this repo. Since the workflow is portable, reworded to present the
  stage-3 bug-test workflow as the planned next stage 'if the repository has it
  configured' rather than assuming it exists.

Recompiled with gh-aw v0.79.8; checkout pins kept at v7.0.0 to match sibling
locks. No compile drift.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
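
The one-status-label contract from the commit message above could be checked along these lines. The label names come from the commit text; the `validate_status_labels` helper is hypothetical and only models the rule, not how the safe-output enforces add-labels.max.

```python
from typing import Iterable, List

# Status labels named in the commit message.
STATUS_LABELS = frozenset(
    {"needs-assessment", "needs-reproduction", "fix-proposed", "fix-blocked"}
)


def validate_status_labels(labels: Iterable[str]) -> List[str]:
    """Return the labels if exactly one known status label is requested, else raise."""
    requested = list(labels)
    if len(requested) != 1:
        raise ValueError(f"expected exactly one status label, got {len(requested)}")
    if requested[0] not in STATUS_LABELS:
        raise ValueError(f"unknown status label: {requested[0]}")
    return requested
```
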
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings July 1, 2026 00:59

Copilot AI left a comment


Review details

  • Files reviewed: 1/2 changed files
  • Comments generated: 1
  • Review effort level: Low

Comment thread .github/workflows/bug-fix.md Outdated
…lock

A prior autofix flipped the authored frontmatter add-labels.max back to 2,
re-introducing the mismatch: source said 2, the compiled lock enforced 1, and
Step 8 prose says 'exactly one status label per run'. The workflow only ever
applies a single status label per run (needs-assessment | needs-reproduction |
fix-proposed | fix-blocked), so 1 is the correct, tightest contract and matches
the compiled lock. Set the frontmatter to max:1 so source, lock, and prose all
agree (also avoids the lock staleness guard failing on a frontmatter mismatch).

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Implement label-driven bug fix workflow

3 participants