Skip to content

fix(archive): don't let a broken worktree block archiving#3049

Open
frankh wants to merge 2 commits into
mainfrom
posthog-code/fix-archive-broken-worktree
Open

fix(archive): don't let a broken worktree block archiving#3049
frankh wants to merge 2 commits into
mainfrom
posthog-code/fix-archive-broken-worktree

Conversation

@frankh

@frankh frankh commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Problem

Some worktree tasks can't be archived — every attempt just shows the generic Failed to archive task toast, forever.

Archiving a worktree task first captures a git checkpoint of the worktree (a restore point used on un-archive). That capture (CaptureCheckpointSaga) refuses to run and throws when the worktree is mid-operation or conflicted:

  • Cannot capture checkpoint while git operation is in progress (e.g. an interrupted rebase/merge)
  • Cannot capture checkpoint with unresolved merge conflicts in the index

When capture throws, the entire archive aborts and rolls back (ArchiveService.archiveTask), so the task can never be archived — and because the worktree stays stuck in the bad state, it fails on every retry. The real error is only logged; the UI shows the generic toast.

Confirmed against real data: a worktree left in an interrupted rebase with unmerged index entries reproduces this exactly.

Fix

Checkpoint capture is best-effort recovery data — a failure to snapshot should not block the destructive archive. The flow already degrades gracefully when a worktree isn't a valid git repo (logs a warning, sets checkpointId = null, proceeds). This extends that same tolerance:

  1. Checkpoint capture is non-fatal. On failure, log a warning, drop the restore point (checkpointId = null), and continue teardown. Safe because the saga rolls back its own internal steps before failing, so no partial checkpoint ref is left behind.
  2. Worktree teardown is hardened. A deleteWorktree failure (git lock, filesystem error) now logs a warning and still records the archive.

Trade-off: a task archived this way has no restore point on un-archive — acceptable, since the worktree was already in an unrestorable/broken state. No manual DB/git surgery is needed to recover currently-stuck tasks; retrying Archive after this ships succeeds.

Testing

  • New integration test: archiving a worktree left in an in-progress merge conflict now succeeds with checkpointId: null and the worktree removed (rejects without the fix).
  • pnpm --filter @posthog/workspace-server test — 28/28 archive integration tests pass.
  • pnpm --filter @posthog/workspace-server typecheck — clean.
  • biome lint — clean.

Created with PostHog Code

Archiving a worktree task first captures a git checkpoint as an
un-archive restore point. CaptureCheckpointSaga throws when the worktree
is mid-operation (interrupted rebase/merge) or has unmerged index
entries, which aborted and rolled back the entire archive. Because the
worktree stays in that state, the task could never be archived and only
ever showed a generic "Failed to archive task" toast.

Make checkpoint capture non-fatal: on failure, log a warning, drop the
restore point (checkpointId = null), and proceed with teardown, mirroring
the existing behavior for a non-git worktree. Also harden worktree
teardown so a deleteWorktree failure (git lock, filesystem error) still
records the archive.

Generated-By: PostHog Code
Task-Id: f4af32a4-5962-4dd6-b729-b4659bdd0866
@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown

React Doctor found no issues in the changed files. 🎉

Reviewed by React Doctor for commit b29b2cf.

@frankh frankh requested a review from jonathanlab July 1, 2026 10:07
@frankh

frankh commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

I think it makes more sense to just leave dirty worktrees lying around than being completely unable to archive the task

@greptile-apps

greptile-apps Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Comments Outside Diff (1)

  1. packages/workspace-server/src/services/archive/archive.integration.test.ts, line 436 (link)

    P2 No test coverage for the deleteWorktree failure-tolerance path

    The new test exercises the checkpoint-capture failure branch only. The second hardening — where deleteWorktree itself throws but the archive is still recorded — has no corresponding test. A test that stubs or mocks WorktreeManager.deleteWorktree to throw, then asserts that the archive record is still created, would give confidence that the inner try/catch behaves as intended and that later code (e.g. archiveRepo.create) is still reached.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Reviews (1): Last reviewed commit: "fix(archive): don't let a broken worktre..." | Re-trigger Greptile

Comment on lines 284 to 302
await step(
async () => {
const manager = new WorktreeManager({
mainRepoPath: folderPath,
worktreeBasePath: this.workspaceSettings.getWorktreeLocation(),
});
await manager.deleteWorktree(worktreePath);
const parentDir = path.dirname(worktreePath);
await forceRemove(parentDir);
try {
const manager = new WorktreeManager({
mainRepoPath: folderPath,
worktreeBasePath: this.workspaceSettings.getWorktreeLocation(),
});
await manager.deleteWorktree(worktreePath);
const parentDir = path.dirname(worktreePath);
await forceRemove(parentDir);
} catch (error) {
this.log.warn(
`Failed to remove worktree at ${worktreePath}; archiving anyway (on-disk worktree may need manual cleanup)`,
{ error },
);
}
},
async () => {},
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Silent deletion failure leaves unarchivable state when checkpoint succeeded

When deleteWorktree throws (e.g. a git lock file or a still-registered worktree), the inner try/catch swallows the error and step() considers the deletion a success. If checkpoint capture already succeeded at this point, the archive record is written with a non-null checkpointId, while the git worktree is still registered under its original name in the main repo. A subsequent unarchiveTask call hits shouldRestoreWorktree = true, then restoreWorktreeFromCheckpoint asks the WorktreeManager to add a worktree with preferredName = worktree.name — which git will reject because that name/path is already registered, leaving the task permanently un-restorable.

The fix in the checkpoint path (nulling checkpointId on failure) avoids this because shouldRestoreWorktree becomes false. The same treatment would apply here: if deletion fails and a checkpoint was captured, set archivedTask.checkpointId = null (and clean up the now-orphaned checkpoint ref) so the archive record is internally consistent.

If deleteWorktree throws (git lock, still-registered worktree) after a
checkpoint was captured, the archive record was written with a non-null
checkpointId while the worktree stayed registered under its original name.
A later unarchiveTask would then try to re-add a worktree with that same
name/path, which git rejects — leaving the task permanently un-restorable.

On deletion failure, null out checkpointId (matching the existing capture-
failure path) and best-effort delete the now-orphaned checkpoint ref so the
archive record stays internally consistent.

Adds an integration test that stubs WorktreeManager.deleteWorktree to throw
and asserts the archive is still recorded with a null checkpointId.

Addresses Greptile review feedback on #3049.

Generated-By: PostHog Code
Task-Id: f4af32a4-5962-4dd6-b729-b4659bdd0866
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant