-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add node drain wait period #4506
Conversation
Thank you very much for your PR! Let's work to get this in together. :) And welcome to the |
@rayterrill HNY 🥳 Any updates on this PR? Please feel free to ask us if you need help with this PR :) |
@Himangini I'd still love to get this in somehow - this is something we definitely would like to see in eksctl. Not sure how to proceed - still pretty new to Go from a big project perspective. :) |
@rayterrill No worries. I'll take a look and give some code advice in the morning :) Some proper code, so you can see what I meant about waiting. :) It's not easy, so don't worry about it. |
Sorry @rayterrill but I just realised that we are doing a lot of things now since November or so, and I was wondering if this is still a thing that is required or not working? #1699 (comment) We are doing a proper eviction and are filtering based on pods which can be evicted or deleted based on status. |
This is really about slowing down the eviction/drain process - we have some apps that need some time to come up once they've been evicted from nodes, and the current process basically rapid fire cordon and drains all nodes in the nodegroup. We're looking to inject some wait time in there in between each operation to allow things to stabilize. Is there some docs on how the health check piece works? |
Sadly no. It's just the code, I'm reading. Look at this function: func (d *Evictor) GetPodsForEviction(nodeName string) (*PodDeleteList, []error) { in pkg/drain/evictor/evictor.go.
Ahhhha. I see. So wait in between... for what exactly? Is there a criteria? We can't check if the pod came up at a different location ofc... so is it really just a hard sleep between node evictions? |
That's what I was thinking - a configurable sleep (an option we could pass in). Def crude, but it would help with some app workloads. |
@rayterrill Alright then. Let's do that. |
@rayterrill Can you take a look at the lint and test errors or do you want me to? |
Oh yeah, you have to update the tests and everywhere where the function is called since you updated the function parameter list. :) |
@Skarlso let me take a look at that this wknd. |
…rrill/eksctl into add-node-drain-wait-period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! I have left some comments.
Co-authored-by: Jake Klein <jakelarsj@gmail.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
Co-authored-by: Chetan Patwal <cPu1@users.noreply.github.com>
I think I made all requested changes. |
I have fixed the compilation errors and the merge conflicts for you. We're good to merge. |
Description
Add ability to specify a node drain wait period during a nodegroup drain operation. This helps with applications that may take time to start and do not react well to the "fire and forget" drain operation.
Also discussed in #1699
Checklist
README.md
, or theuserdocs
directory)area/nodegroup
) and kind (e.g.kind/improvement
)BONUS POINTS checklist: complete for good vibes and maybe prizes?! 🤯