improve oom pod error handling #8463

Merged
merged 1 commit into from
Feb 25, 2022

sagor999
Contributor

Description

Correctly handle a pod that failed with an out-of-memory error.
This is the status of a pod that failed this way:

status:
  message: 'Pod Node didn''t have enough resource: memory, requested: 2097152000,
    used: 65865462583, capacity: 66388008960'
  phase: Failed
  reason: OutOfmemory
  startTime: "2022-02-25T14:26:36Z"
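
As a rough illustration of the check this status calls for, here is a minimal Go sketch. It is not the code from this PR: `PodStatus` is a hypothetical stand-in that mirrors only the `phase`, `reason`, and `message` fields shown above, `isOutOfMemoryFailure` is an illustrative name, and the case-insensitive comparison of the reason is an assumption made to tolerate casing differences.

```go
package main

import (
	"fmt"
	"strings"
)

// PodStatus is a hypothetical stand-in that mirrors only the fields
// shown in the status above (phase, reason, message).
type PodStatus struct {
	Phase   string
	Reason  string
	Message string
}

// isOutOfMemoryFailure reports whether the pod failed because the node
// did not have enough memory to admit it. The reason is compared
// case-insensitively, which is an assumption of this sketch.
func isOutOfMemoryFailure(s PodStatus) bool {
	return s.Phase == "Failed" && strings.EqualFold(s.Reason, "OutOfmemory")
}

func main() {
	status := PodStatus{
		Phase:  "Failed",
		Reason: "OutOfmemory",
		Message: "Pod Node didn't have enough resource: memory, " +
			"requested: 2097152000, used: 65865462583, capacity: 66388008960",
	}
	if isOutOfMemoryFailure(status) {
		fmt.Println("pod failed, node out of memory:", status.Message)
	}
}
```

In the status above, the node had 66388008960 - 65865462583 = 522546377 bytes free, which is less than the 2097152000 bytes requested, so the pod could not be placed on the node.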

Related Issue(s)

Fixes #8238
Improves #8372

How to test

Release Notes

Improve handling of the error that occurs when a pod fails to start because the node is out of memory

Documentation

@sagor999 sagor999 requested a review from a team February 25, 2022 17:01
@github-actions github-actions bot added the team: workspace Issue belongs to the Workspace team label Feb 25, 2022

codecov bot commented Feb 25, 2022

Codecov Report

Merging #8463 (d1a75cb) into main (84128c3) will increase coverage by 21.03%.
The diff coverage is 0.00%.

@@             Coverage Diff             @@
##             main    #8463       +/-   ##
===========================================
+ Coverage   12.31%   33.34%   +21.03%     
===========================================
  Files          20       31       +11     
  Lines        1161     4579     +3418     
===========================================
+ Hits          143     1527     +1384     
- Misses       1014     2936     +1922     
- Partials        4      116      +112     
Flag Coverage Δ
components-gitpod-cli-app 11.17% <ø> (ø)
components-local-app-app-darwin-amd64 ?
components-local-app-app-darwin-arm64 ?
components-local-app-app-linux-amd64 ?
components-local-app-app-linux-arm64 ?
components-local-app-app-windows-386 ?
components-local-app-app-windows-amd64 ?
components-local-app-app-windows-arm64 ?
components-ws-manager-app 39.48% <0.00%> (?)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
components/ws-manager/pkg/manager/manager.go 20.41% <0.00%> (ø)
components/local-app/pkg/auth/auth.go
components/local-app/pkg/auth/pkce.go
components/ws-manager/pkg/clock/clock.go 68.62% <0.00%> (ø)
components/ws-manager/pkg/manager/manager_ee.go 0.00% <0.00%> (ø)
components/ws-manager/pkg/manager/monitor.go 9.46% <0.00%> (ø)
components/ws-manager/pkg/manager/probe.go 0.00% <0.00%> (ø)
components/ws-manager/pkg/manager/metrics.go 11.11% <0.00%> (ø)
components/ws-manager/pkg/manager/annotations.go 65.11% <0.00%> (ø)
components/ws-manager/pkg/manager/imagespec.go 0.00% <0.00%> (ø)
... and 6 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 84128c3...d1a75cb.

Member

@aledbf aledbf left a comment

LGTM

@roboquat roboquat merged commit 1f63f30 into main Feb 25, 2022
@roboquat roboquat deleted the pavel/oom-fix branch February 25, 2022 18:07
Labels
release-note size/XS team: workspace Issue belongs to the Workspace team
3 participants