perf(actions): match runner tasks in SQL via normalized job-label table #38282
Open
bircni wants to merge 2 commits into
Replace the in-memory label scan in runner task assignment with an indexed SQL query so a runner poll stays O(1 row) regardless of the waiting backlog:

- Add a normalized action_run_job_label table (one row per RunsOn label) kept in sync on job insert/delete, so labels match in SQL instead of loading and filtering every waiting job in Go.
- Add a composite (status, updated) index so "oldest waiting job" is an index seek instead of a sort of the whole waiting backlog.
- Rewrite CreateTaskForRunner around the SQL match, failing unpreparable jobs so they leave the queue, and re-aggregate run/attempt status on claim so run-level concurrency still sees the occupying run.
- Migration v343 creates the table, adds the index, and backfills labels for assignable (waiting/blocked) jobs.

Assisted-by: Claude:claude-opus-4-8
This migration targets the 1.28 release, so move AddActionRunJobMatchingSchema (migration 343) from the v1_27 package into a new v1_28 package and mark the 1.27 version boundary. The migration ID is unchanged.

Assisted-by: Claude:claude-opus-4-8
Contributor
Pull request overview
This PR introduces database-backed runner label matching and queue-pick optimizations for Gitea Actions task assignment. It normalizes runs-on labels into a dedicated table and rewrites task picking to select the oldest matchable waiting job in SQL, avoiding O(backlog) in-memory filtering and preventing head-of-line stalls from unpreparable jobs.
Changes:
- Add a normalized `action_run_job_label` table plus migration/backfill (v343), and add a composite `(status, updated)` index to speed up “oldest waiting job” selection.
- Route ActionRunJob creation through a single insert path that also persists label rows, and add label cleanup on run/repo deletion paths to avoid orphaned label rows.
- Rewrite `CreateTaskForRunner` into a transactional SQL-based picker with bounded skip/fail handling for unpreparable jobs, plus new/updated tests covering matching and failure behavior.
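The "bounded skip/fail handling" in the last item can be illustrated with a small in-memory model. This is only a sketch of the idea, not the code in `task_pick.go`: the `Job` type, `errUnpreparable`, `pickOldest`, and the `maxSkips` parameter are all hypothetical names, and the real picker works against the database inside a transaction.

```go
package main

import (
	"errors"
	"sort"
)

// Job is a simplified, hypothetical stand-in for a waiting job row.
type Job struct {
	ID      int64
	Updated int64 // last-updated timestamp; smaller means older
	Failed  bool
}

// errUnpreparable marks a job that can never be turned into a task
// (hypothetical sentinel error for this sketch).
var errUnpreparable = errors.New("job cannot be prepared")

// pickOldest walks the waiting jobs oldest-first. An unpreparable job is
// marked failed so it leaves the queue instead of blocking younger jobs.
// After maxSkips such failures the poll gives up and returns no job, which
// bounds the work done per poll. Note: the input slice is sorted in place.
func pickOldest(waiting []*Job, prepare func(*Job) error, maxSkips int) (*Job, error) {
	sort.SliceStable(waiting, func(i, j int) bool {
		return waiting[i].Updated < waiting[j].Updated
	})
	skipped := 0
	for _, job := range waiting {
		if job.Failed {
			continue
		}
		err := prepare(job)
		if err == nil {
			return job, nil // claimed the oldest preparable job
		}
		if !errors.Is(err, errUnpreparable) {
			return nil, err // unexpected error: surface it to the caller
		}
		job.Failed = true
		skipped++
		if skipped >= maxSkips {
			return nil, nil // bounded: try again on the next poll
		}
	}
	return nil, nil
}
```

In this model an old unpreparable job costs one failed `prepare` call and is then out of the way, so it cannot cause a head-of-line stall for the jobs behind it.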
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| services/repository/delete.go | Deletes job-label rows before repo-scoped job deletion to avoid orphan labels. |
| services/actions/run.go | Inserts jobs via InsertActionRunJob to keep label projection in sync. |
| services/actions/reusable_workflow.go | Uses InsertActionRunJob for reusable-workflow child job inserts. |
| services/actions/rerun.go | Uses InsertActionRunJob when cloning jobs for reruns to keep labels synced. |
| services/actions/cleanup.go | Deletes job-label rows before deleting run jobs during run cleanup. |
| models/migrations/v1_28/v343.go | Migration creating label table, adding composite index, and backfilling labels for waiting/blocked jobs. |
| models/migrations/v1_28/v343_test.go | Migration test validating backfill behavior (including dedup). |
| models/migrations/v1_28/main_test.go | TestMain bootstrap for v1_28 migration tests. |
| models/migrations/migrations.go | Registers migration 343. |
| models/fixtures/action_run_job_label.yml | Fixture file for the new label table (empty baseline). |
| models/actions/task.go | Removes the old in-memory scanning picker implementation (moved to task_pick.go). |
| models/actions/task_test.go | Adds tests validating SQL-based picking semantics and unpreparable-job handling. |
| models/actions/task_pick.go | New transactional SQL-based CreateTaskForRunner and helpers. |
| models/actions/run_job.go | Adds composite (status, updated) index tags used by the picker. |
| models/actions/run_job_label.go | New model + insert/delete helpers for normalized runs-on labels and SQL match condition builder. |
| models/actions/run_job_label_test.go | Tests for label matching contract and SQL condition equivalence. |
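The table mentions a label projection with one row per runs-on label and a backfill test that covers deduplication. A minimal sketch of that normalization step follows, assuming a hypothetical `JobLabelRow` shape and `labelRows` helper; the actual model in `run_job_label.go` may use different field names and extra columns.

```go
package main

// JobLabelRow is a hypothetical shape for one row of the label table:
// one (job, label) pair per runs-on label.
type JobLabelRow struct {
	JobID int64
	Label string
}

// labelRows expands a job's runs-on labels into rows, dropping duplicate
// labels so that a job never produces two identical (job, label) pairs.
// The first occurrence of each label determines its position.
func labelRows(jobID int64, runsOn []string) []JobLabelRow {
	seen := make(map[string]struct{}, len(runsOn))
	rows := make([]JobLabelRow, 0, len(runsOn))
	for _, label := range runsOn {
		if _, dup := seen[label]; dup {
			continue
		}
		seen[label] = struct{}{}
		rows = append(rows, JobLabelRow{JobID: jobID, Label: label})
	}
	return rows
}
```

For example, `labelRows(7, []string{"ubuntu-latest", "gpu", "ubuntu-latest"})` yields two rows for job 7, one for `ubuntu-latest` and one for `gpu`.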
Comment on lines +221 to +223

    if fallbackErr != nil {
        return err
    }
Split out of #38150, which is being broken into smaller, independently reviewable PRs. This one contains the database part: the schema and query changes that make runner task assignment scale with the waiting backlog. It is independent of the no-DB split (#38281) and can land in either order.
Runner task assignment previously loaded every waiting job and filtered labels in Go. This replaces that with an indexed SQL query so a poll stays ~O(1 row) regardless of backlog size:
- `action_run_job_label` table: one row per `RunsOn` label, kept in sync on job insert/delete, so label matching happens in SQL instead of in memory.
- `(status, updated)` index on `action_run_job`, so "oldest waiting job" is an index seek instead of a sort of the whole waiting backlog.
- Rewrite `CreateTaskForRunner` around the SQL match; unpreparable jobs are failed so they leave the queue, and run/attempt status is re-aggregated on claim so run-level concurrency still sees the occupying run.