Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query-scheduler: lift tenant data out of userQueue; encapsulate tenant-querier relationship #6133

Merged
merged 30 commits into from
Oct 3, 2023

Conversation

francoposa
Copy link
Member

@francoposa francoposa commented Sep 25, 2023

What this PR does

the overall organization of this code is very much open to feedback.
this is supposed to be a pure refactoring PR, not changing behavior

my goals here are to:

  1. pull tenant data out of the userQueue so we can treat it just like a bare queue, in preparation for replacing it with a queue using the multidimensional-fairness-round-robin approach
  2. draw a line between maintaining the the tenant-querier relationships and and doing the actual enqueuing and dequeuing; this new separation is largely demonstrated by the changes to getOrAddQueue (now getOrAddTenantQueue), and getNextQueueForQuerier.

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@francoposa francoposa force-pushed the francoposa/lift-user-data-out-of-userQueue branch from 2cfd90d to 8ded965 Compare September 26, 2023 15:11
@francoposa
Copy link
Member Author

francoposa commented Sep 26, 2023

Okay so everything before 5e47b0b is the super basic refactoring just lifting the fields out of userQueue and up into queues... now to try to encapsulate this a bit

@pracucci
Copy link
Collaborator

@francoposa For any work on query-scheduler queue, I kindly ask you to get @charleskorn reviews cause he recently worked on it and has a good understanding of it works. Thanks!

@francoposa francoposa changed the title lift user data out of userQueue query-scheduler: lift tenant data out of userQueue; encapsulate tenant-querier relationship Sep 26, 2023
Comment on lines +87 to 89
type userQueue struct {
requests *list.List
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be flattened so it's no longer a separate type at all

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you say flattened, do you mean making this something like type userQueue *list.List? If so, that makes sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I may keep it unflattened for now, for when we move to the newer queue implementation.

That way I can wrap both userQueue and the new queue a a basic queue interface so they can be swapped 1:1.

// We set this to nil if number of available queriers <= maxQueriers.
queriers map[string]struct{}
maxQueriers int
tenantQuerierState tenantQuerierState
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not married to embedding tenantQuerierState into the queues struct, they could be separated.

Copy link
Contributor

@charleskorn charleskorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall direction LGTM.

General comments:

  • We seem to use the terms "user" and "tenant" interchangeably throughout this code, and this was true before this PR as well. It would be good to pick one and use it consistently. (This doesn't have to be done in this PR, this could be a separate PR.)
  • I'm wondering if we need to rename some types - there are a lot of types with "queue" in their name, which I find confusing. Perhaps we should rename RequestQueue to RequestDispatcher, and queues to tenantQueueManager? Then it's clear that tenantQueueManager is responsible for managing all the different queues for different tenants.

pkg/scheduler/queue/queue.go Outdated Show resolved Hide resolved
// - it is detected during request enqueueing that a tenant's queriers
// were calculated from an outdated max-queriers-per-tenant value

queriersByID map[string]*querierConn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to clarify which ID we're referring to as the key here.

We have a couple of different IDs stored as strings throughout this file - perhaps adding some specific type definitions would help clarify which one is what and also catch places where we use the wrong kind of ID?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit verbose because we have to alias search and sort implementations for the slice of querierIDs when we make that a type alias, but I have a good solution - will hold it until your renaming PR gets in because I have to change a bunch of function signatures

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks.

pkg/scheduler/queue/user_queues.go Outdated Show resolved Hide resolved
Comment on lines +87 to 89
type userQueue struct {
requests *list.List
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you say flattened, do you mean making this something like type userQueue *list.List? If so, that makes sense to me.

pkg/scheduler/queue/user_queues.go Outdated Show resolved Hide resolved
pkg/scheduler/queue/user_queues.go Show resolved Hide resolved
pkg/scheduler/queue/user_queues_test.go Show resolved Hide resolved
@francoposa francoposa force-pushed the francoposa/lift-user-data-out-of-userQueue branch from 153a4c9 to d753be3 Compare September 27, 2023 20:51
…y tenant ID; push removeTenant down to tenantQuerierAssignments method
… to hide usage of tenantQuerierAssignments from consumers
@francoposa
Copy link
Member Author

francoposa commented Sep 27, 2023

still WIP on a couple things but stepping away for today

  1. need to re-add that test case for maxQueriers >= available queriers
  2. open question on the mod len(q.users) comment
  3. possible queues to tenantQueueManager rename -> I agree that queues is far too vague, but my first instinct is to avoid naming things somethingManager if possible; would want to brainstorm other possibilities

@francoposa
Copy link
Member Author

everything should be addressed for the moment; save for the further merge conflicts that will need corrected once your renaming PR is in.

renamed queues to queueBroker as I believe it best represents the role that struct is playing in mediating the enqueuing and dequeuing.

@francoposa francoposa marked this pull request as ready for review September 29, 2023 18:32
@francoposa francoposa requested a review from a team as a code owner September 29, 2023 18:32
@francoposa
Copy link
Member Author

francoposa commented Sep 29, 2023

@charleskorn I went ahead and made the type alias changes here for querierID and tenantID, as I will be away for a week and a half; feel free to do add whatever you want to carry this refactoring chunk across the line while I am out.

I didn't place a huge emphasis on converting every usage of user to tenant; in part because there's just a lot in this PR already, and in part because I did not want to address yet whether we need to rename files like user_queue too, as some people prefer to avoid file renaming as much as possibly for git history reasons.

@charleskorn
Copy link
Contributor

@charleskorn I went ahead and made the type alias changes here for querierID and tenantID, as I will be away for a week and a half; feel free to do add whatever you want to carry this refactoring chunk across the line while I am out.

Awesome, thanks. I'll pick this up once #6141 is merged.

I didn't place a huge emphasis on converting every usage of user to tenant; in part because there's just a lot in this PR already, and in part because I did not want to address yet whether we need to rename files like user_queue too, as some people prefer to avoid file renaming as much as possibly for git history reasons.

Leaving that change to a subsequent PR is fine with me. Renaming the file should be fine, but again, doesn't need to be done as part of this PR.

# Conflicts:
#	pkg/scheduler/queue/queue.go
Copy link
Contributor

@charleskorn charleskorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @francoposa

@charleskorn charleskorn enabled auto-merge (squash) October 3, 2023 00:54
@charleskorn
Copy link
Contributor

@francoposa I'll merge this and leave any follow-up refactoring / cleanup (eg. "user" to "tenant") for you when you're back.

@charleskorn charleskorn merged commit bd095a2 into main Oct 3, 2023
28 checks passed
@charleskorn charleskorn deleted the francoposa/lift-user-data-out-of-userQueue branch October 3, 2023 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants