
'CUDA out of memory' in CUDA 11.0 #290

Closed
ybbbbt opened this issue Jan 7, 2021 · 3 comments

Comments

@ybbbbt

ybbbbt commented Jan 7, 2021

Describe the bug
Dear Authors,
Thanks for the great work. I have run into an 'out of memory' error when using MinkowskiEngine (mainly for FCGF) with CUDA 11.0, although the same code works fine with CUDA 10.2.
The output is shown below:

RuntimeError: CUDA out of memory. Tried to allocate 23.81 GiB (GPU 0; 10.76 GiB total capacity; 6.08 MiB already allocated; 5.03 GiB free; 22.00 MiB reserved in total by PyTorch)

To Reproduce

import torch
import numpy as np
import MinkowskiEngine as ME

xyz = np.random.uniform(-10, 10, (2000, 3))  # [N, 3]
feats = []
feats.append(np.ones((len(xyz), 1)))
feats = np.hstack(feats)

voxel_size = 0.025
# Voxelize xyz and feats
coords = np.floor(xyz / voxel_size)
_, unique_map, inverse_map = ME.utils.sparse_quantize(coords, return_index=True, return_inverse=True)
inds = unique_map
coords = coords[inds]
return_coords = xyz[inds]
coords = ME.utils.batched_coordinates([coords])

feats = feats[inds]

feats = torch.tensor(feats, dtype=torch.float32)
coords = coords.to(dtype=torch.int32)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

stensor = ME.SparseTensor(feats, coordinates=coords, device=device)

xx = torch.ones((4, 480, 640)).cuda()
bs, ys, xs = torch.where(xx > 0)

Desktop (please complete the following information):

  • CUDA version: 11.0
  • NVIDIA Driver version: 460.27.04
  • OS: Ubuntu 18.04
  • Minkowski Engine version 0.5.0

Additional context
When I comment out stensor = ME.SparseTensor(feats, coordinates=coords, device=device), the line bs, ys, xs = torch.where(xx > 0) works without running out of memory.

@ybbbbt ybbbbt changed the title CUDA out of memory in CUDA 11.0 'CUDA out of memory' in CUDA 11.0 Jan 7, 2021
@chrischoy
Contributor

chrischoy commented Jan 7, 2021

This bug is not related to Minkowski Engine. You are trying to allocate 23.81 GiB of memory with torch.where.
The bug seems to be fixed in the latest versions of torch (>= 1.7.0).

@ybbbbt
Author

ybbbbt commented Jan 7, 2021

Hi, thanks for your speedy reply.

I use PyTorch 1.7.0, installed with conda install pytorch==1.7.0 torchvision cudatoolkit=11.0 -c pytorch.

With CUDA 10.2, the code above consumes no more than 1 GB of GPU memory.
With CUDA 11.0, even if I reduce the variable xx to a tiny size (e.g. 1×4×6, see the code below), the out-of-memory issue persists.
But when I remove the ME.SparseTensor(*) call, torch.where does not allocate such a large amount of memory.

xx = torch.ones((1, 4, 6)).cuda()
bs, ys, xs = torch.where(xx > 0)
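For reference, torch.where with a single condition argument is documented to be equivalent to condition.nonzero(as_tuple=True), so comparing the two forms can help confirm whether the failing path is the underlying nonzero kernel rather than torch.where itself (a minimal sketch on CPU tensors; add .cuda() to exercise the GPU kernel):

```python
import torch

# torch.where(cond) with one argument is documented as equivalent to
# cond.nonzero(as_tuple=True); if both forms fail the same way on GPU,
# the suspect is the shared nonzero kernel.
xx = torch.ones((1, 4, 6))
bs1, ys1, xs1 = torch.where(xx > 0)
bs2, ys2, xs2 = (xx > 0).nonzero(as_tuple=True)

same = all(torch.equal(u, v) for u, v in [(bs1, bs2), (ys1, ys2), (xs1, xs2)])
print(same)  # True
```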

Besides, I also hit the same out-of-memory issue when writing code like a[a > th] = 0.
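For what it's worth, boolean-index assignment like a[a > th] = 0 also goes through an index/nonzero kernel on CUDA; the same effect can be expressed with masked_fill_, which avoids materializing the index tensor (a hedged sketch on CPU, with th chosen arbitrarily):

```python
import torch

th = 0.5
a = torch.tensor([0.1, 0.7, 0.4, 0.9])
b = a.clone()

a[a > th] = 0              # boolean-index assignment (uses an index kernel)
b.masked_fill_(b > th, 0)  # in-place masked fill, same result

print(torch.equal(a, b))  # True
```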

Even torch BCELoss behaves strangely. (I am sure the input size and target size are identical.)

  File "/home/aaa/anaconda3/envs/torch_1.7_cuda_11.0_py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/aaa/anaconda3/envs/torch_1.7_cuda_11.0_py37/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 530, in forward
    return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
  File "/home/aaa/anaconda3/envs/torch_1.7_cuda_11.0_py37/lib/python3.7/site-packages/torch/nn/functional.py", line 2519, in binary_cross_entropy
    "Please ensure they have the same size.".format(target.size(), input.size()))
ValueError: Using a target size (torch.Size([0])) that is different to the input size (torch.Size([1])) is deprecated. Please ensure they have the same size

This does not happen when I remove ME.SparseTensor(*) or downgrade to CUDA 10.2.
So I guess this may be a compatibility issue with CUDA 11.0?

Anyway, thank you very much for taking the time to look into this issue. I will also try PyTorch 1.7.1 as soon as possible.
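When juggling toolkits like this, it can also help to confirm which CUDA build PyTorch is actually using, since the CUDA version the driver supports and the toolkit PyTorch was compiled against can differ (a small diagnostic sketch):

```python
import torch

# torch.version.cuda is the CUDA toolkit PyTorch was built against
# (None on CPU-only builds); the installed driver may support a newer
# CUDA version than this.
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())
```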

@chrischoy
Contributor

chrischoy commented Jan 7, 2021

I see. This was indeed a CUDA error, and I was able to reproduce it on 11.0. Fortunately, it appears to be fixed in 11.1.

Please go to https://developer.nvidia.com/cuda-11.1.1-download-archive to download 11.1.

wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda_11.1.1_455.32.00_linux.run
sudo sh cuda_11.1.1_455.32.00_linux.run --toolkit --silent --override

# Install MinkowskiEngine with CUDA 11.1
export CUDA_HOME=/usr/local/cuda-11.1; pip install MinkowskiEngine -v --no-deps 

chrischoy added a commit that referenced this issue Jan 7, 2021
Tanazzah pushed a commit to Tanazzah/MinkowskiEngine that referenced this issue Feb 9, 2024