Skip to content

Compactor scheduler: cleanup executor#14629

Merged
andyasp merged 7 commits intofeature/compactor-schedulerfrom
aasp/executor-cleanup
Mar 11, 2026
Merged

Compactor scheduler: cleanup executor#14629
andyasp merged 7 commits intofeature/compactor-schedulerfrom
aasp/executor-cleanup

Conversation

@andyasp
Copy link
Contributor

@andyasp andyasp commented Mar 11, 2026

What this PR does

I was originally preparing a PR to merge executor code into main, but kept finding issues so this is a cleanup PR for the feature branch.

  • Consolidated the configuration for the scheduler client into its own struct and hid the configuration.
    • Renamed flag names for consistency
    • ExecutorRetryMinBackoff and ExecutorRetryMaxBackoff were only used for backing off on updates. Renamed and added validation that was missing for them.
  • In BucketCompactor the syncer (c.sy) was nearly only used in Compact (which compacts for all users) so I was considering passing nil for it instead of constructing one when executing single compaction jobs. There was one odd usage that used it for a metric. That usage was unnecessary though because c.sy.metrics.blocksMarkedForDeletion is the exact same Counter as c.metrics.blocksMarkedForDeletion, so I removed a layer of the indirection. I didn't end up using nil for the syncer for safety against refactors, but kept this change.
- return repairIssue347(ctx, c.logger, c.bkt, c.sy.metrics.blocksMarkedForDeletion, issue347Err)
+ return repairIssue347(ctx, c.logger, c.bkt, c.metrics.blocksMarkedForDeletion, issue347Err)
  • syncDir was being set in the metadata fetcher (which it calls cacheDir) which caches metadata on disk. We don't want to use that in the scheduler mode because any compactor can work on any user so it's far less effective. The plan is to introduce remote cache handling later.
  • The compaction execution was using metaSyncer to retrieve block metadata specified by an added blockIDs argument, but it was a poor fit. It mismatched the other "full sync + filter" usage and adding an argument broke the semantics of MetaFetcher's fetch singleflight usage (I think excludeMarkedForDeletion may also be a bug, but I'll verify that later):
// Run this in thread safe run group.
v, err := f.g.Do("", func() (i interface{}, err error) {
    // Bug: all calls share the same key, but can differ in arguments
    return f.fetchMetadata(ctx, excludeMarkedForDeletion, blockIDs)
})

To fix this I changed it to use MetaFetcher directly and rewrote FetchRequestedMetas to call loadMeta on its own. There is still a useless to us in-memory cache check that loadMeta does, but that may be the place where we can put in future remote cache changes so I let it slide here.

Which issue(s) this PR fixes or relates to

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

Note

Medium Risk
Touches compactor scheduling/metadata-fetch paths and renames scheduler-related flags, which could break existing scheduler-mode deployments or alter job execution behavior; changes are mostly refactors but affect runtime configuration and control flow.

Overview
Refactors compactor scheduler mode by moving all scheduler client settings into a new SchedulerClientConfig (compactor.scheduler-client.* flags), adding stricter validation/backoff options, and updating the compactor to switch modes based on scheduler-client.enabled.

Simplifies BucketCompactor by no longer storing a metaSyncer on the struct (it’s now passed into Compact), removes the unused SyncRequestedMetas path, and updates scheduler job execution to fetch requested block metas directly via MetaFetcher.FetchRequestedMetas (with filters applied) instead of reusing the full-sync singleflight path.

Updates generated config/help/docs/defaults to drop the old planning_mode/compactor.scheduler.*/executor retry flags and to point cluster-validation wiring at the new scheduler client gRPC config.

Written by Cursor Bugbot for commit 1e34584. This will update automatically on new commits. Configure here.

@andyasp andyasp added the changelog-not-needed PRs that don't need a CHANGELOG.md entry label Mar 11, 2026
@andyasp andyasp marked this pull request as ready for review March 11, 2026 01:52
@andyasp andyasp requested a review from a team as a code owner March 11, 2026 01:52
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Unused compactorCfg field on schedulerExecutor struct
    • Removed the unused compactorCfg field from schedulerExecutor and its redundant initialization to eliminate dead code.

Create PR

Or push these changes by commenting:

@cursor push fe83948996
Preview (fe83948996)
diff --git a/pkg/compactor/executor.go b/pkg/compactor/executor.go
--- a/pkg/compactor/executor.go
+++ b/pkg/compactor/executor.go
@@ -139,7 +139,6 @@
 
 // schedulerExecutor requests compaction jobs from an external scheduler.
 type schedulerExecutor struct {
-	compactorCfg             Config
 	cfg                      SchedulerClientConfig
 	logger                   log.Logger
 	schedulerClient          compactorschedulerpb.CompactorSchedulerClient
@@ -151,7 +150,6 @@
 
 func newSchedulerExecutor(cfg Config, logger log.Logger, invalidClusterValidation *prometheus.CounterVec) (*schedulerExecutor, error) {
 	executor := &schedulerExecutor{
-		compactorCfg:             cfg,
 		cfg:                      cfg.SchedulerClientConfig,
 		logger:                   logger,
 		invalidClusterValidation: invalidClusterValidation,

@andyasp andyasp requested a review from tacole02 as a code owner March 11, 2026 02:18
@github-actions
Copy link
Contributor

github-actions bot commented Mar 11, 2026

💻 Deploy preview deleted (Compactor scheduler: cleanup executor).

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Error message references wrong flag names
    • Updated the scheduler endpoint validation error text to reference the actual flags compactor.scheduler-client.endpoint and compactor.scheduler-client.enabled.

Create PR

Or push these changes by commenting:

@cursor push 97d56fef3b
Preview (97d56fef3b)
diff --git a/pkg/compactor/executor.go b/pkg/compactor/executor.go
--- a/pkg/compactor/executor.go
+++ b/pkg/compactor/executor.go
@@ -79,7 +79,7 @@
 }
 
 var (
-	errInvalidSchedulerEndpoint          = fmt.Errorf("invalid compactor.scheduler-client.scheduler-endpoint, required when scheduler-worker.enabled is true")
+	errInvalidSchedulerEndpoint          = fmt.Errorf("invalid compactor.scheduler-client.endpoint, required when compactor.scheduler-client.enabled is true")
 	errInvalidSchedulerUpdateInterval    = fmt.Errorf("invalid compactor.scheduler-client.update-interval, interval must be positive")
 	errInvalidSchedulerLeasingMinBackoff = fmt.Errorf("invalid compactor.scheduler-client.leasing-min-backoff, must be positive")
 	errInvalidSchedulerLeasingMaxBackoff = fmt.Errorf("invalid compactor.scheduler-client.leasing-max-backoff, must be greater than min backoff")

Copy link
Contributor

@npazosmendez npazosmendez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean ✅

Comment on lines +499 to +503
// Note: The syncer isn't used in single job execution (only during full BucketCompactor.Compact), but we construct it rather than pass nil to guard against future changes
syncer, err := newMetaSyncer(userLogger, reg, userBucket, fetcher, NewShardAwareDeduplicateFilter(), c.blocksMarkedForDeletion)
if err != nil {
return compactorschedulerpb.UPDATE_TYPE_REASSIGN, errors.Wrap(err, "failed to create syncer")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if BucketCompactor.Compact took the syncer as an argument instead? I feel like I'm missing something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, that works and removes that entirely now. Thanks!

@andyasp andyasp merged commit 63d03bc into feature/compactor-scheduler Mar 11, 2026
74 of 76 checks passed
@andyasp andyasp deleted the aasp/executor-cleanup branch March 11, 2026 21:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog-not-needed PRs that don't need a CHANGELOG.md entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants