Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A little follow-up on cgroup v2 support #8590

Merged
merged 1 commit into from
Mar 7, 2022
Merged

A little follow-up on cgroup v2 support #8590

merged 1 commit into from
Mar 7, 2022

Conversation

utam0k
Copy link
Contributor

@utam0k utam0k commented Mar 4, 2022

Description

  • Update the docs for cgroup v2
  • Use runc's cgroup v2 determination

Related Issue(s)

Fixes #

How to test

No

Release Notes

Update the docs for cgroup v2

Documentation

No

@utam0k utam0k requested a review from a team March 4, 2022 06:38
@github-actions github-actions bot added the team: workspace Issue belongs to the Workspace team label Mar 4, 2022
@utam0k
Copy link
Contributor Author

utam0k commented Mar 4, 2022

@csweichel Thanks for previous your PR. Great! I have a question about the cgroup v2 hierarchy. Why do we need to create the user cgroup? I guess that if we just wanted to allow workspace users to manipulate the cgroup, we could just create the workspace cgroup.

@utam0k
Copy link
Contributor Author

utam0k commented Mar 4, 2022

If you want to review SVG, the rich diff is very helpful to you.
https://github.blog/2014-10-06-svg-viewing-diffing/

@codecov
Copy link

codecov bot commented Mar 4, 2022

Codecov Report

Merging #8590 (8aa147c) into main (3063396) will decrease coverage by 14.61%.
The diff coverage is n/a.

❗ Current head 8aa147c differs from pull request most recent head d34a781. Consider uploading reports for the commit d34a781 to get more accurate results

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #8590       +/-   ##
===========================================
- Coverage   29.53%   14.91%   -14.62%     
===========================================
  Files         113       49       -64     
  Lines       18177     4732    -13445     
===========================================
- Hits         5369      706     -4663     
+ Misses      12349     3956     -8393     
+ Partials      459       70      -389     
Flag Coverage Δ
components-content-service-app ?
components-content-service-lib ?
components-gitpod-cli-app 11.17% <ø> (ø)
components-image-builder-mk3-app ?
components-local-app-app-darwin-amd64 ?
components-local-app-app-darwin-arm64 ?
components-local-app-app-linux-amd64 ?
components-local-app-app-linux-arm64 ?
components-local-app-app-windows-386 ?
components-local-app-app-windows-amd64 ?
components-local-app-app-windows-arm64 ?
components-supervisor-app ?
components-ws-daemon-app 21.16% <ø> (ø)
components-ws-daemon-lib 21.16% <ø> (ø)
components-ws-manager-app ?
components-ws-proxy-app ?
install-installer-raw-app 4.49% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
components/content-service/pkg/storage/noop.go
components/supervisor/pkg/ports/tunnel.go
components/supervisor/pkg/terminal/service.go
components/supervisor/pkg/ports/exposed-ports.go
components/content-service/pkg/initializer/git.go
components/content-service/pkg/layer/provider.go
components/supervisor/pkg/supervisor/supervisor.go
components/content-service/pkg/storage/namedurl.go
components/ws-manager/pkg/manager/monitor.go
components/ws-manager/pkg/manager/manager.go
... and 54 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3063396...d34a781. Read the comment docs.

@csweichel
Copy link
Contributor

Thanks @utam0k for updating the diagram. At the moment it doesn't make the first cgroup evacuation explicit. The process is as follows:

  1. PrepareForUserNS: evacuate container cgroup to <containerCGroup>/workspace which moves the ring0 process to this new workspace cgroup
  2. start ring2 in its own cgroup namespace (i.e. unshare(CLONE_NEWCGROUP) in ring2)
  3. Evacuate <containerCGroup>/workspace to <containerCGroup>/workspace/user and chown <containerCGroup>/workspace to the gitpod user

Motivation for this process is:

  • the "root" cgroup that the user sees (i.e. the cgroup ring2 was in when we called unshare) must not contain processes. If it did, we could not enable threaded subtree controllers (e.g. pid) and have that cgroup be of type domain. It would necessarily be domain threaded. For k3s support the "ring2 cgroup" must be of type domain.
  • by introducing the workspace cgroup, we can safely chown it to gitpod and give the user access because they're still "confined" by the container runtime's outer cgroup. If we didn't introduce workspace we'd need to give the user full control over the container cgroup so that new child cgroups could be created.

I was wondering: what's the motivation for using runc's cgroup implementation? It introduces a new dependency and with unclear benefit over the current implementation.

@utam0k utam0k force-pushed the to/cgv2 branch 2 times, most recently from fe9e4fa to d34a781 Compare March 7, 2022 00:50
@utam0k
Copy link
Contributor Author

utam0k commented Mar 7, 2022

I got your explanation. Thanks. It makes sense.

Thanks @utam0k for updating the diagram. At the moment it doesn't make the first cgroup evacuation explicit. The process is as follows:

PrepareForUserNS: evacuate container cgroup to /workspace which moves the ring0 process to this new workspace cgroup
start ring2 in its own cgroup namespace (i.e. unshare(CLONE_NEWCGROUP) in ring2)
Evacuate /workspace to /workspace/user and chown /workspace to the gitpod user
Motivation for this process is:

the "root" cgroup that the user sees (i.e. the cgroup ring2 was in when we called unshare) must not contain processes. If it did, we could not enable threaded subtree controllers (e.g. pid) and have that cgroup be of type domain. It would necessarily be domain threaded. For k3s support the "ring2 cgroup" must be of type domain.
by introducing the workspace cgroup, we can safely chown it to gitpod and give the user access because they're still "confined" by the container runtime's outer cgroup. If we didn't introduce workspace we'd need to give the user full control over the container cgroup so that new child cgroups could be created.


Sorry, this was not clear enough to convey my intent. I'm worried about this #8471 (comment). But I left that out of this PR.

I was wondering: what's the motivation for using runc's cgroup implementation? It introduces a new dependency and with unclear benefit over the current implementation.

@roboquat roboquat merged commit fe64389 into main Mar 7, 2022
@roboquat roboquat deleted the to/cgv2 branch March 7, 2022 07:37
@roboquat roboquat added deployed: workspace Workspace team change is running in production deployed Change is completely running in production labels Mar 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployed: workspace Workspace team change is running in production deployed Change is completely running in production release-note size/L team: workspace Issue belongs to the Workspace team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants