"not enough values to unpack" error when running MoCo self-supervised training on my own dataset; how can I fix it? #160

Closed
ratcat1232 opened this issue Dec 25, 2021 · 5 comments

Comments

@ratcat1232

Could this be a problem with the dataset format?
2021-12-25 15:33:39,456 - mmselfsup - INFO - workflow: [('train', 1)], max: 200 epochs
2021-12-25 15:33:39,457 - mmselfsup - INFO - Checkpoints will be saved to /home/sipl/zy/mmselfsup/work_dirs/imagelist_test by HardDiskBackend.
/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/utils/registry.py:251: UserWarning: The old API of register_module(module, force=False) is deprecated and will be removed, please use the new API register_module(name=None, force=False, module=None) instead.
warnings.warn(
/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/utils/registry.py:251: UserWarning: The old API of register_module(module, force=False) is deprecated and will be removed, please use the new API register_module(name=None, force=False, module=None) instead.
warnings.warn(
/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/utils/registry.py:251: UserWarning: The old API of register_module(module, force=False) is deprecated and will be removed, please use the new API register_module(name=None, force=False, module=None) instead.
warnings.warn(
/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/utils/registry.py:251: UserWarning: The old API of register_module(module, force=False) is deprecated and will be removed, please use the new API register_module(name=None, force=False, module=None) instead.
warnings.warn(
2021-12-25 15:33:52,377 - mmcv - INFO - Reducer buckets have been rebuilt in this iteration.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 28, in _pin_memory_loop
idx, data = r
ValueError: not enough values to unpack (expected 2, got 0)
Traceback (most recent call last):
File "tools/train.py", line 158, in <module>
main()
File "tools/train.py", line 148, in main
train_model(
File "/home/sipl/zy/mmselfsup/mmselfsup/apis/train.py", line 186, in train_model
runner.run(data_loaders, cfg.workflow)
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
for i, data_batch in enumerate(self.data_loader):
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 349, in __iter__
self._iterator._reset(self)
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 852, in _reset
data = self._get_data()
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1029, in _get_data
raise RuntimeError('Pin memory thread exited unexpectedly')
RuntimeError: Pin memory thread exited unexpectedly
Traceback (most recent call last):
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/distributed/launch.py", line 260, in <module>
main()
File "/home/sipl/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/distributed/launch.py", line 255, in main
raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/home/sipl/anaconda3/envs/mmdet/bin/python', '-u', 'tools/train.py', '--local_rank=0', 'configs/selfsup/densecl/densecl_resnet50.py', '--seed', '0', '--launcher', 'pytorch', '--work_dir', 'work_dirs/imagelist_test/']' returned non-zero exit status 1.

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
@fangyixiao18
Collaborator

Could you please provide your env_info and the format of your own dataset?

@ratcat1232
Author

Could you please provide your env_info and the format of your own dataset?

Thank you for your reply. Here are my env_info and the dataset format:

PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 9.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 16.04.6 LTS (x86_64)
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Clang version: Could not collect
CMake version: Could not collect

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 9.2.148
GPU models and configuration: GPU 0: TITAN X (Pascal)
Nvidia driver version: 418.43
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.7.1
[pip3] torch-tb-profiler==0.2.1
[pip3] torchaudio==0.7.0a0+a853dff
[pip3] torchvision==0.8.2
[conda] blas 1.0 mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
[conda] cudatoolkit 9.2 0 defaults
[conda] mkl 2021.2.0 h06a4308_296 defaults
[conda] mkl-service 2.3.0 py38h27cfd23_1 defaults
[conda] mkl_fft 1.3.0 py38h42c9631_2 defaults
[conda] mkl_random 1.2.1 py38ha9443f7_2 defaults
[conda] numpy 1.20.2 py38h2d18471_0 defaults
[conda] numpy-base 1.20.2 py38hfae3a4d_0 defaults
[conda] pytorch 1.7.1 py3.8_cuda9.2.148_cudnn7.6.3_0 pytorch
[conda] torch-tb-profiler 0.2.1
[conda] torchaudio 0.7.2 py38 pytorch
[conda] torchvision 0.8.2 py38_cu92 pytorch
Pillow (6.2.2)

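The dataset is structured like ImageNet.
The train folder holds the images, and meta contains a txt file, train.txt, that lists the image names.
This txt file lists only the file names, with no label information.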
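
For readers who want to reproduce this layout, here is a minimal sketch of how such a label-free list file could be generated. The helper name `write_image_list`, the set of image suffixes, and the assumption that the images sit directly inside the train folder are illustrative and not taken from the thread or from mmselfsup.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_image_list(image_dir, list_file):
    """Write one image file name per line (no labels) and return the names.

    Hypothetical helper: assumes images are stored directly in image_dir.
    """
    image_dir = Path(image_dir)
    names = sorted(
        p.name
        for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    list_file = Path(list_file)
    # Create the meta folder if it does not exist yet.
    list_file.parent.mkdir(parents=True, exist_ok=True)
    list_file.write_text("".join(name + "\n" for name in names))
    return names
```

Calling `write_image_list("train", "meta/train.txt")` would then produce a file in the shape described above, one file name per line.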

Incidentally, I switched to an earlier version of OpenSelfSup (the one that requires mmcv=1.0.3) and the code ran successfully.

@fangyixiao18
Collaborator

Is the data_source in your data config set to ImageNet or ImageList?

@ratcat1232
Author

Is the data_source in your data config set to ImageNet or ImageList?

Hello, I am using ImageList.

@fangyixiao18
Collaborator

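This is most likely a version issue. The PyTorch version check in datasets/builder.py is slightly off; we will change it to 1.8 to avoid the problem. In PyTorch 1.7, pin_memory and persistent_workers cannot be used together. mmcls had a similar issue before: open-mmlab/mmpretrain#472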
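
As a rough illustration of the kind of version guard described here, the sketch below only passes `persistent_workers` to the DataLoader when the PyTorch version is at least 1.8. The helper names `parse_major_minor` and `build_loader_kwargs` are hypothetical and this is not the actual code in datasets/builder.py; it simply assumes the fix amounts to raising the version threshold.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a string such as '1.7.1' or '1.8.0+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def build_loader_kwargs(torch_version, num_workers=4,
                        pin_memory=True, persistent_workers=True):
    """Build DataLoader keyword arguments (hypothetical helper).

    persistent_workers exists since PyTorch 1.7, but per this thread it
    cannot be combined with pin_memory there, so it is only set from 1.8 on.
    """
    kwargs = {"num_workers": num_workers, "pin_memory": pin_memory}
    # DataLoader also rejects persistent_workers=True when num_workers == 0.
    if parse_major_minor(torch_version) >= (1, 8) and num_workers > 0:
        kwargs["persistent_workers"] = persistent_workers
    return kwargs
```

In a training script this could be used as `DataLoader(dataset, **build_loader_kwargs(torch.__version__))`. On PyTorch 1.7, an alternative workaround under the same assumption is to keep the installed version and disable either pin_memory or persistent_workers in the data config.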
