PermissionError: [Errno 13] Permission denied: '/input/images/ct/' #4

Closed
Zrrr1997 opened this issue Jun 10, 2022 · 16 comments

@Zrrr1997

Hi there,

thanks for the great challenge! I am having some trouble with getting the baseline U-Net Docker container to run. I have modified nothing and am just running the following command:

./test.sh

from the autoPET/uNet_baseline directory. I get the following error:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/algorithm/process.py", line 95, in <module>
    Unet_baseline().process()
  File "/opt/algorithm/process.py", line 87, in process
    uuid = self.load_inputs()
  File "/opt/algorithm/process.py", line 55, in load_inputs
    ct_mha = os.listdir(os.path.join(self.input_path, 'images/ct/'))[0]
PermissionError: [Errno 13] Permission denied: '/input/images/ct/'

It is a bit odd that os.listdir() is throwing a PermissionError. What could this be caused by? I am somewhat new to Docker, so this could be a very obvious error, but I would appreciate any help!

@Zrrr1997
Author

Additionally, if one adds the line os.listdir(self.input_path) on line 54 (the line just above the error), the script throws another error:

OSError: [Errno 116] Stale file handle: '/input/'
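
A minimal diagnostic sketch, not part of the repository, that may help narrow down which path component is failing. It walks from the filesystem root down to the target directory and reports the mode, owner and group of each component, or the error that `os.stat` raises. The function name `describe_path_chain` is hypothetical; the idea would be to run it inside the container, for example by temporarily calling it from process.py.

```python
import os
import stat


def describe_path_chain(path):
    """Stat every ancestor of `path`, from the root down, and report the result."""
    path = os.path.abspath(path)
    chain = []
    current = path
    while True:
        chain.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent

    report = []
    for component in reversed(chain):
        try:
            st = os.stat(component)
            info = f"{stat.filemode(st.st_mode)} uid={st.st_uid} gid={st.st_gid}"
        except OSError as exc:
            # e.g. PermissionError (errno 13) or a stale handle (errno 116)
            info = f"{type(exc).__name__}: errno {exc.errno}"
        report.append((component, info))
    return report


if __name__ == "__main__":
    print(f"running as uid={os.getuid()} gid={os.getgid()}")
    for component, info in describe_path_chain("/input/images/ct/"):
        print(component, "->", info)
```

Comparing the printed uid/gid of the process with the owner and mode of each component should show whether the container user is simply not allowed to traverse the mounted directory.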

@thomaskuestner
Member

Can you confirm that you have downloaded the Git LFS files in the test directory?

Can you also please confirm the output of the SCRIPTPATH variable:

echo $SCRIPTPATH

Can you also please check that the permissions on your host system are set correctly, so that the Github_repo_root/test path can be mounted (with read/write access) into the container via -v $SCRIPTPATH/test/input/:/input/?

Otherwise, feel free to change this line

-v $SCRIPTPATH/test/input/:/input/ \
to point to a different directory that has the subfolders "images/pet" and "images/ct", each containing the .mha files.
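
A small sketch of how an alternative input directory could be checked on the host before mounting it. The helper name `check_input_layout` is hypothetical, and the only assumption is the layout described above: `images/ct` and `images/pet`, each with at least one `.mha` file.

```python
from pathlib import Path


def check_input_layout(input_dir):
    """Return a list of problems; an empty list means the layout looks usable."""
    problems = []
    for sub in ("images/ct", "images/pet"):
        folder = Path(input_dir) / sub
        if not folder.is_dir():
            problems.append(f"missing folder: {folder}")
            continue
        if not any(folder.glob("*.mha")):
            problems.append(f"no .mha file in: {folder}")
    return problems


if __name__ == "__main__":
    # Path taken from the thread; replace with your own directory.
    for problem in check_input_layout("test/input"):
        print(problem)
```

This only verifies the layout as seen by the host user; it says nothing about whether the user inside the container can read the same files.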

@Zrrr1997
Author

Hi, the files in the test directory are the following:

zmarinov@i14pc112:~/repos/autoPET$ tree test
test
├── expected_output_nnUNet
│   └── images
│       └── TCIA_001.nii.gz
├── expected_output_uNet
│   └── PRED.nii.gz
└── input
    └── images
        ├── ct
        │   └── af3b6605-c2b9-4067-8af5-8b85aafb2ae3.mha
        └── pet
            └── e260efef-0a29-4c68-972e-9e573c740de5.mha
7 directories, 4 files

The variable SCRIPTPATH seems to be /home/zmarinov/repos/autoPET, where autoPET is the root directory of the repo.

Lastly, what do you mean by changing the permissions for the path? Do you mean running chmod 777 autoPET/test? I did that and it did not solve the issue.

I think I am missing something very simple, but I would be very grateful if someone could point me in the right direction...

@Zrrr1997
Author

Here is the full output, including the stack trace:

zmarinov@i14pc112:~/repos/autoPET/uNet_baseline$ sudo sh test.sh 
Script path: /home/zmarinov/repos/autoPET
/home/zmarinov/repos/autoPET/uNet_baseline
Sending build context to Docker daemon  27.65kB
Step 1/14 : FROM pytorch/pytorch
 ---> ca04e7f7c8e5
Step 2/14 : RUN groupadd -r algorithm && useradd -m --no-log-init -r -g algorithm algorithm
 ---> Using cache
 ---> 856ccacfaf4a
Step 3/14 : RUN mkdir -p /opt/algorithm /input /output/images/automated-petct-lesion-segmentation     && chown -R algorithm:algorithm /opt/algorithm /input /output
 ---> Using cache
 ---> e6b818732b9c
Step 4/14 : USER algorithm
 ---> Using cache
 ---> 4b8cebd28672
Step 5/14 : WORKDIR /opt/algorithm
 ---> Using cache
 ---> add5cf2d7d3b
Step 6/14 : ENV PATH="/home/algorithm/.local/bin:${PATH}"
 ---> Using cache
 ---> fa3de32292fd
Step 7/14 : RUN python -m pip install --user -U pip
 ---> Using cache
 ---> 85dc06b55cce
Step 8/14 : COPY --chown=algorithm:algorithm requirements.txt /opt/algorithm/
 ---> Using cache
 ---> a4c57f95b515
Step 9/14 : COPY --chown=algorithm:algorithm monai_unet.py /opt/algorithm/
 ---> Using cache
 ---> eb9378f550a5
Step 10/14 : RUN python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
 ---> Using cache
 ---> d981f819523f
Step 11/14 : RUN python -m pip install --user -rrequirements.txt
 ---> Using cache
 ---> 5ccacba47aa4
Step 12/14 : COPY --chown=algorithm:algorithm epoch=777-step=64573.ckpt /opt/algorithm/
 ---> Using cache
 ---> b4893b1f1b1e
Step 13/14 : COPY --chown=algorithm:algorithm process.py /opt/algorithm/
 ---> Using cache
 ---> 093ebfb3d27d
Step 14/14 : ENTRYPOINT python -m process $0 $@
 ---> Using cache
 ---> 1e5b77c5dfff
Successfully built 1e5b77c5dfff
Successfully tagged unet_baseline:latest
1+0 records in
1+0 records out
32 bytes copied, 2.3053e-05 s, 1.4 MB/s
unet_baseline-output-787e3f15a4dd6a30b06c529777384ec5
Volume created, running evaluation
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/algorithm/process.py", line 96, in <module>
    Unet_baseline().process()
  File "/opt/algorithm/process.py", line 88, in process
    uuid = self.load_inputs()
  File "/opt/algorithm/process.py", line 56, in load_inputs
    ct_mha = os.listdir(os.path.join(self.input_path, 'images/ct/'))[0]
PermissionError: [Errno 13] Permission denied: '/input/images/ct/'
Initialized UNET baseline>....................................
Checking GPU availability
Available: True
Device count: 1
Current device: 0
Device name: NVIDIA GeForce RTX 3060 Ti
Device memory: 8346140672
Start processing
Evaluation done, checking results
Sending build context to Docker daemon  27.65kB
Step 1/9 : FROM python:3.9-slim
 ---> e86cb69f6aef
Step 2/9 : RUN groupadd -r algorithm && useradd -m --no-log-init -r -g algorithm algorithm
 ---> Using cache
 ---> c1ea6d6a40bd
Step 3/9 : WORKDIR /opt/algorithm
 ---> Using cache
 ---> 237a3b15d5d5
Step 4/9 : USER algorithm
 ---> Using cache
 ---> 96a9de870fd0
Step 5/9 : ENV PATH="/home/algorithm/.local/bin:${PATH}"
 ---> Using cache
 ---> f23930cc6e3a
Step 6/9 : RUN python -m pip install --user -U pip
 ---> Using cache
 ---> 27a746f582b6
Step 7/9 : COPY --chown=algorithm:algorithm requirements.txt /opt/algorithm/
 ---> Using cache
 ---> d8a65befb3e9
Step 8/9 : RUN python -m pip install --user numpy
 ---> Using cache
 ---> acc6d98fd6bd
Step 9/9 : RUN python -m pip install --user simpleitk
 ---> Using cache
 ---> 3e8792e53635
Successfully built 3e8792e53635
Successfully tagged unet_eval:latest
Start
Traceback (most recent call last):
  File "<string>", line 5, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '/output/images/automated-petct-lesion-segmentation'
unet_baseline-output-787e3f15a4dd6a30b06c529777384ec5

@Zrrr1997
Author

Zrrr1997 commented Jun 13, 2022

It seems I have the same LFS bandwidth problem as described in issue #2.

Is this the cause of the PermissionError? And how can it be solved, besides waiting a few days and trying again?

@thomaskuestner
Member

Yes, I assume so.

You can get test.sh working by taking one of the training samples, converting it to mha and putting the files in the respective subfolders. All the necessary code is provided in the repo.

@Zrrr1997
Author

Interestingly, this does not solve the issue. I used the conversion scripts on only one sample and used the CTres.nii.gz and SUV.nii.gz to produce the .mha files.

$ tree test/
test/
├── expected_output_nnUNet
│   └── images
│       └── TCIA_001.nii.gz
├── expected_output_uNet
│   └── PRED.nii.gz
└── input
    └── images
        ├── ct
        │   └── CTres.mha
        └── pet
            └── SUV.mha

7 directories, 4 files

Adding os.path.exists("/input") in process.py prints out True, but when the script tries to list the subdirectories with os.listdir(os.path.join(self.input_path, 'images/ct/')), the same PermissionError: [Errno 13] Permission denied: '/input/images/ct/' occurs.

Interestingly, if you add print(os.listdir(self.input_path)) without appending /images/ct/ you get another error: OSError: [Errno 116] Stale file handle: '/input/'

At this point, I am not quite sure what else to consider to fix this issue. I suspect there might be some more configurations I need to add to Docker or something in that direction. Any help would be appreciated.

@thomaskuestner
Member

It seems that docker is the culprit here.

Please try to mount the directory in a different docker container, e.g. an ubuntu system like this:

docker run -it --rm -v $SCRIPTPATH/test/input/:/input/ ubuntu

Once you have a shell inside this container (via the command above), set the group of this folder to GID 100 and grant read/write/execute access to the owner and the group (read/execute for everyone else):

chown :100 /input
chmod 775 /input

Now you can test it again with the test.sh script.
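
For readers unsure what the octal modes in this thread grant, here is a short sketch that renders them the way `ls -l` would, using only the standard library. The helper name `explain_mode` is hypothetical.

```python
import stat


def explain_mode(octal_mode, is_dir=True):
    """Render permission bits as an ls-style string, e.g. 0o775 -> drwxrwxr-x."""
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(file_type | octal_mode)


if __name__ == "__main__":
    for mode in (0o775, 0o755, 0o777):
        print(oct(mode), explain_mode(mode))
    # 0o775 drwxrwxr-x
    # 0o755 drwxr-xr-x
    # 0o777 drwxrwxrwx
```

Note that a directory needs the execute (`x`) bit for a user to traverse it, so read permission alone is not enough for `os.listdir` on a nested path.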

@Zrrr1997
Author

Hi, thanks for the assistance! It seems that I cannot change the ownership even through the shell in this container. I get the Operation not permitted error. It seems that I have to set docker's permissions somehow globally to get this to run. I'll update if I find a solution, but any help would be appreciated.

docker run -it --rm -v /home/zmarinov/repos/new_autoPET/autoPET/test/input/:/input ubuntu 
root@b9c10c7de1eb:/# ls /input/
images
root@b9c10c7de1eb:/# chown :100 /input/
chown: changing group of '/input/': Operation not permitted
root@b9c10c7de1eb:/# 

@thomaskuestner
Member

Could you provide more information on the host system and operating system that you are trying to run Docker from?

Please also try setting the permissions of the input folder to 755 (chmod 755) on your host system.

@Zrrr1997
Author

OS

lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.6 LTS
Release:	18.04
Codename:	bionic

GPU

nvidia-smi
Tue Jun 14 16:59:25 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:06:00.0 Off |                  N/A |
| 33%   31C    P8    15W / 200W |    493MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1644      G   /usr/lib/xorg/Xorg                 38MiB |
|    0   N/A  N/A      1970      G   /usr/bin/gnome-shell               81MiB |
|    0   N/A  N/A     18980      G   /usr/lib/xorg/Xorg                278MiB |
|    0   N/A  N/A     19093      G   /usr/bin/gnome-shell               31MiB |
|    0   N/A  N/A     22639      G   ...AAAAAAAAA= --shared-files       59MiB |
+-----------------------------------------------------------------------------+

File system

df -T .
Filesystem                       Type   1K-blocks        Used    Available  Use% Mounted on
workstation:/home/zmarinov       nfs4    10652684288 10063552512  52244480 100% /home/zmarinov

UID, GID

id zmarinov
uid=12258(zmarinov) gid=10001(i14staff) groups=10001(i14staff),27(sudo),999(docker),10000(i14)

I suspect the problem with permission might be coming from the User or Group IDs. If you need any other host system info, just let me know.
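
The `df -T` output above shows the repository living on an `nfs4` mount. On NFS exports that use root squashing, requests from root on the client are mapped to an unprivileged anonymous user, which would be consistent with `chown` failing as root inside the container; whether this particular export is configured that way is an assumption. The sketch below looks up the filesystem type for a path from text in the `/proc/mounts` format. The function name `filesystem_for` and the `NETWORK_FS` set are hypothetical, and mount points containing escaped spaces are not handled.

```python
import os

# Filesystem types treated as "network" for this check (illustrative, not exhaustive).
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3", "fuse.sshfs"}


def filesystem_for(path, mounts_text):
    """Return (mount_point, fs_type) of the longest mount point containing `path`."""
    path = os.path.abspath(path)
    best = ("", "")
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > len(best[0]):
            best = (mount_point, fs_type)
    return best


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as fh:
        mount_point, fs_type = filesystem_for(".", fh.read())
    note = " (network filesystem)" if fs_type in NETWORK_FS else ""
    print(f"{mount_point}: {fs_type}{note}")
```

If the input directory turns out to be on a network filesystem, copying it to a local disk such as a directory under /tmp and mounting that instead might be a quick way to test this theory.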

@thomaskuestner
Member

I assume that this is a UID/GID mismatch and privileges problem.

Since you are sudoer on your machine, can't you just set the permissions for the folders directly?

Can you check that your docker config folder has the correct permissions:

 sudo chown "$USER":"$USER" /home/"$USER"/.docker -R
 sudo chmod g+rwx "$HOME/.docker" -R

Alternatively could you try running docker with sudo?

Please check the following blog to see if that solves your problem: https://www.fullstaq.com/knowledge-hub/blogs/docker-and-the-host-filesystem-owner-matching-problem
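
One common way to avoid an owner mismatch is to start the container with the host user's numeric UID and GID via the `--user` option of `docker run`. The sketch below only builds such a command line; it is not what test.sh does, it leaves out the GPU, memory and output-volume options that test.sh passes, and the helper name `docker_run_as_host_user` is hypothetical. The image name `unet_baseline` is taken from the build log earlier in the thread.

```python
import os
import shlex


def docker_run_as_host_user(image, host_input, uid=None, gid=None):
    """Build a `docker run` argument list that runs the container as uid:gid."""
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "--user", f"{uid}:{gid}",         # override the image's USER
        "-v", f"{host_input}:/input/",    # same mount target as in test.sh
        image,
    ]


if __name__ == "__main__":
    cmd = docker_run_as_host_user(
        "unet_baseline", "/home/zmarinov/repos/autoPET/test/input/", 12258, 10001
    )
    print(shlex.join(cmd))
```

Running as a UID that has no entry inside the image can affect anything that expects a home directory, so this is a diagnostic step rather than a guaranteed fix.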

@sepidehamiri

sepidehamiri commented Jul 29, 2022

I have the same issue. I tried all the solutions you suggested. Without sudo the result is:


/home/sepideh/Desktop/autoPET
test.sh: 7: ./build.sh: not found
1+0 records in
1+0 records out
32 bytes copied, 1.2301e-05 s, 2.6 MB/s
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/volumes/create": dial unix /var/run/docker.sock: connect: permission denied
Volume created, running evaluation
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create": dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.
Evaluation done, checking results
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile.eval&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&shmsize=0&t=unet_eval&target=&ulimits=null&version=1": dial unix /var/run/docker.sock: connect: permission denied
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create": dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Delete "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/volumes/unet_baseline-output-bce67a2a299797f0150b72eb69bc982e": dial unix /var/run/docker.sock: connect: permission denied

And with sudo:


/home/sepideh/Desktop/autoPET
test.sh: 7: ./build.sh: not found
1+0 records in
1+0 records out
32 bytes copied, 2.3497e-05 s, 1.4 MB/s
unet_baseline-output-b847a7c611251c4f0607f1bd355d6ac9
Volume created, running evaluation
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Evaluation done, checking results
Sending build context to Docker daemon  104.6MB
Step 1/9 : FROM python:3.9-slim
 ---> ae64b82339a8
Step 2/9 : RUN groupadd -r algorithm && useradd -m --no-log-init -r -g algorithm algorithm
 ---> Using cache
 ---> 125f5d5ccf65
Step 3/9 : WORKDIR /opt/algorithm
 ---> Using cache
 ---> 38dc0fa7327f
Step 4/9 : USER algorithm
 ---> Using cache
 ---> 8fa4dcfe9009
Step 5/9 : ENV PATH="/home/algorithm/.local/bin:${PATH}"
 ---> Using cache
 ---> 365ec37bb932
Step 6/9 : RUN python -m pip install --user -U pip
 ---> Using cache
 ---> 6de32032a3b2
Step 7/9 : COPY --chown=algorithm:algorithm requirements.txt /opt/algorithm/
 ---> Using cache
 ---> 7ee5404348ba
Step 8/9 : RUN python -m pip install --user numpy
 ---> Using cache
 ---> b7093bf26a64
Step 9/9 : RUN python -m pip install --user simpleitk
 ---> Using cache
 ---> a8514dd9986f
Successfully built a8514dd9986f
Successfully tagged unet_eval:latest
Start
Traceback (most recent call last):
  File "<string>", line 5, in <module>
IndexError: list index out of range
unet_baseline-output-b847a7c611251c4f0607f1bd355d6ac9
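
The "permission denied ... /var/run/docker.sock" messages in the run without sudo usually mean that the current user may not talk to the Docker daemon. On many Linux installations the socket is accessible to members of a `docker` group, so a quick check is whether the current process belongs to it. This is a sketch under that assumption; the helper name `in_group` is hypothetical.

```python
import grp
import os


def in_group(group_name, group_ids=None):
    """Return True if the given (or the current process's) group IDs include group_name."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    if group_ids is None:
        group_ids = set(os.getgroups()) | {os.getegid()}
    return gid in set(group_ids)


if __name__ == "__main__":
    print("member of 'docker' group:", in_group("docker"))
```

A newly added group membership normally only takes effect in a new login session, so the check can still report False right after the user has been added to the group.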

@thomaskuestner
Member

@sepidehamiri It seems that for you the container is running (when called with sudo), but is throwing an error inside. Can you please confirm that you downloaded the Git LFS image files?

@sepidehamiri

GIT LFS image files

If I understand correctly, yes, I have all the weights and files with the original size after cloning, so I have correctly downloaded the GIT LFS image files.
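
File size is a reasonable hint, but a more direct check is possible: when Git LFS content has not been fetched, the working tree contains small text pointer files that start with the line `version https://git-lfs.github.com/spec/v1`. The sketch below scans a directory for such pointers; the function names are hypothetical.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_HEADER)) == LFS_HEADER


def find_lfs_pointers(root):
    """List all files under `root` that are still LFS pointers, sorted by path."""
    return sorted(
        str(p) for p in Path(root).rglob("*") if p.is_file() and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    for pointer in find_lfs_pointers("."):
        print("not downloaded:", pointer)
```

An empty result for the test directory and the checkpoint file would suggest that the LFS content is really present.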

@thomaskuestner
Member

Okay, it seems that either you have no GPU installed in your system or Docker is not exposing your GPU to the container:

WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

You would need to sort this out.
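
As a starting point, the sketch below checks whether the tools typically needed for GPU access from Docker are on the PATH. It assumes an NVIDIA GPU and that the "could not select device driver" error stems from missing NVIDIA container support (for example the NVIDIA Container Toolkit, which provides `nvidia-container-cli`); neither is confirmed in this thread. The function name `gpu_prerequisites` is hypothetical.

```python
import shutil


def gpu_prerequisites(which=shutil.which):
    """Report which of the expected executables can be found on the PATH."""
    tools = ("docker", "nvidia-smi", "nvidia-container-cli")
    return {tool: which(tool) is not None for tool in tools}


if __name__ == "__main__":
    for tool, found in gpu_prerequisites().items():
        print(f"{tool}: {'found' if found else 'missing'}")
```

If `nvidia-smi` is missing, the driver is probably not installed on the host; if only `nvidia-container-cli` is missing, installing the container toolkit and restarting the Docker daemon would be the next thing to try.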
