Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unmount subpath should only scan the first level of files/directories #82698

Merged
merged 5 commits into from
Nov 8, 2019

Conversation

janario
Copy link
Contributor

@janario janario commented Sep 13, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:

Scenario:
Cronjobs with volume (nfs) mounted multiple time with different subpaths.

When the pod is deleted it tries to clean up all the mounting folders but it shows a lot of warns from files that are present in my volume itself

    I0912 16:27:57.717346    8731 mount_linux.go:247] Unmounting /var/lib/kubelet/pods/c15338f0-d579-11e9-8725-06831030d3a2/volume-subpaths/janario/app/1
    I0912 16:27:57.750994    8731 mount_helper.go:85] "/var/lib/kubelet/pods/c15338f0-d579-11e9-8725-06831030d3a2/volume-subpaths/janario/app/1" is unmounted, deleting the directory
    I0912 16:27:57.751043    8731 mount_linux.go:972] Successfully cleaned subpath directory /var/lib/kubelet/pods/c15338f0-d579-11e9-8725-06831030d3a2/volume-subpaths/janario/app/1
    I0912 16:27:57.751050    8731 mount_linux.go:965] Cleaning up subpath mounts for subpath my-dir
    

W0912 16:27:57.751059    8731 mount_helper.go:37] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/pods/c15338f0-d579-11e9-8725-06831030d3a2/volume-subpaths/janario/app/my-dir

When I look the files before the umount my-dir is actually in a different path from the one printed out on the warn

/var/lib/kubelet/pods/c15338f0-d579-11e9-8725-06831030d3a2/volume-subpaths/janario/app/1/my-dir not /var/lib/kubelet/pods/c15338f0-d579-11e9-8725-06831030d3a2/volume-subpaths/janario/app/my-dir

Turns out, looking at the code, the umount process (subpath_linux.go doCleanSubPaths filepath.Walk) is doing:

  • make a list of filenames under volume-subpaths/janario/app/1/ ['my-dir']
  • umounts `volume-subpaths/janario/app/1/, 'my-dir' was under it so it doesn't exists anymore
  • interact with the list built at the first step and tries to umount each of them, concatenating it with the base path (volume-subpaths/janario/app/my-dir). Then shows warnings because it doesn't exist

subpath_linux.go doCleanSubPaths should only scans the first level of directories/files which are the actual mounting points

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot
Copy link
Contributor

Welcome @janario!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 13, 2019
@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 13, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @janario. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 13, 2019
@andyzhangx
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 13, 2019
@janario
Copy link
Contributor Author

janario commented Sep 13, 2019

I signed it

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Sep 13, 2019
@janario
Copy link
Contributor Author

janario commented Sep 13, 2019

/retest

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 14, 2019
@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 14, 2019
@msau42
Copy link
Member

msau42 commented Oct 4, 2019

/assign @msau42 @jsafrane

@ttousai
Copy link

ttousai commented Oct 31, 2019

Bug triage for 1.17 here with a gentle reminder that code freeze for this release is on Nov. 18. Is this PR still intended for 1.17?

@janario
Copy link
Contributor Author

janario commented Oct 31, 2019

Bug triage for 1.17 here with a gentle reminder that code freeze for this release is on Nov. 18. Is this PR still intended for 1.17?

It would be good to have it
@msau42 @jsafrane @saad-ali @alejandrox1 @andyzhangx
any updates here?

@@ -241,7 +241,8 @@ func doCleanSubPaths(mounter mount.Interface, podDir string, volumeName string)
if err = doCleanSubPath(mounter, fullContainerDirPath, filepath.Base(path)); err != nil {
return err
}
return nil
// skip subdirs of the volume: it only matters the first level to unmount, otherwise it would try to unmount subdir of the volume
return filepath.SkipDir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is going to work when the subpath could be a file. SkipDir will skip the rest of the directory in that case. How about we change this back to use ioutil.ReadDir? I unfortunately can't remember why I had changed it to Walk in the first place. Thanks for catching this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ioutil.ReadDir was my first try :) , but I've had some problems in some e2e tests with file stale handle. I guess ioutil.ReadDir is too eager

For files I'm expecting them to be in the first level and get unmounted

/var/lib/kubelet/pods/<uid>/volume-subpaths/<volume>/<container name>/0 < my-dir
/var/lib/kubelet/pods/<uid>/volume-subpaths/<volume>/<container name>/1 < my-file

But I'll add a test case to make it sure, I'll send an update here soon

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

about the test case, just checked and it is already covered https://github.com/kubernetes/kubernetes/pull/82698/files#r341848461

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing case doesn't quite cover the scenario I'm thinking about. According to golang docs, "If the function returns SkipDir when invoked on a non-directory file, Walk skips the remaining files in the containing directory."

Which I think is not what we want? If "0" was a file and "1" was a directory, then I believe that means "1" will get skipped.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msau42
very good point 👍

added case with files and directories

https://github.com/kubernetes/kubernetes/pull/82698/files#diff-b962320d780ac74dffce9da68eda4f4cR551

/var/lib/kubelet/pods/<uid>/volume-subpaths/<volume>/<container name>/0 < my-file0
/var/lib/kubelet/pods/<uid>/volume-subpaths/<volume>/<container name>/1 < my-dir1
/var/lib/kubelet/pods/<uid>/volume-subpaths/<volume>/<container name>/2 < my-dir2
/var/lib/kubelet/pods/<uid>/volume-subpaths/<volume>/<container name>/3 < my-file3

let me know what you think about it

name: "subpath-with-files",
prepare: func(base string) ([]mount.MountPoint, error) {
path := filepath.Join(base, containerSubPathDirectoryName, testVol, "container1", "0")
path2 := filepath.Join(base, containerSubPathDirectoryName, testVol, "container1", "1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also possible for the path to be a file. Can you add a test case for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to add a case for subpath with file, but turns out it already have one test case for that

https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/util/subpath/subpath_linux_test.go#L434

Do you see something else to cover?

@janario
Copy link
Contributor Author

janario commented Nov 3, 2019

/assign @jsafrane

@janario
Copy link
Contributor Author

janario commented Nov 3, 2019

@msau42 Thanks for the code review, I've just checked and replied your comments

Let me know if you find something else 👍

Also I've re-based master into my branch, it will re-run the tests

@janario
Copy link
Contributor Author

janario commented Nov 3, 2019

/retest

@ttousai
Copy link

ttousai commented Nov 4, 2019

Correction: Code freeze is on Thursday, November 14.

Sorry for the mix up.

}
return mounts, nil
},
unmount: func(mountpath string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we need this unmount function? The RemoveAll is especially scary if the test gets the path wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is set to test with the FakeMounter

The unmount process expects the directory to be empty after the unmount which will then remove the directory

Using the FakeMounter I do a clean up, to avoid it to fail in the remove

The bug itself is that without the skipDir it would start to try to unmount all the were inside my volume, which won't exist after the root unmount

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where I'm confused is that there's nothing under the "0", "1", etc test directories?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the latest changes this is how the files look like:

janario@ubuntu:/tmp/clean-subpaths-subpath-with-files-825360851/volume-subpaths/vol1$ tree .
.
└── container1
    ├── 0
    ├── 1
    │   └── my-dir-1
    ├── 2
    │   └── my-dir-2
    └── 3

5 directories, 2 files

then the filepath.Walk starts with

  • 0 (file): unmount function doesn't do anything because it is a file and the file gets removed here
  • 1 (dir with sub dirs): unmount function will remove my-dir-1 but not 1
  • 2 (dir with sub dirs): same as dir 1
  • 3 (file): same as file 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I think I got your point, the scenario is created from my test case, so in the unmount from my scenario should know that my-dir-1 and my-dir-2 will always be empty

so there is no need for a recursive remove

I've just changed it to not recursive remove

let me know what do you think

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation! I didn't realize there were directories created underneath, so this makes sense to me.

I contemplated whether or not we should just add the cleanup dir logic to the FakeMounter, but then we would need the RemoveAll to be more generic, and I'm very wary of using RemoveAll anywhere. So I think this is fine.

@janario
Copy link
Contributor Author

janario commented Nov 7, 2019

/test pull-kubernetes-e2e-gce-100-performance

@janario
Copy link
Contributor Author

janario commented Nov 7, 2019

/test pull-kubernetes-e2e-gce-csi-serial

@msau42
Copy link
Member

msau42 commented Nov 8, 2019

/lgtm
/assign @jsafrane
for approval

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2019
@jsafrane
Copy link
Member

jsafrane commented Nov 8, 2019

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: janario, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 8, 2019
@k8s-ci-robot k8s-ci-robot merged commit 59aa8fd into kubernetes:master Nov 8, 2019
@janario
Copy link
Contributor Author

janario commented Nov 8, 2019

Thank you all 😉✌️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants