Bug in cv::cuda::warpPerspective #2361

andrejlevkovitch · 2019-11-25T07:11:50Z

System information (version)

OpenCV => 4.1.0
Operating System / Platform => debian10
Compiler => g++-8.3.0
Cuda => cuda-10.1

Detailed description

With some matricies (perspective transformation) I get exception about illegal memory access in GPU memory:

OpenCV(4.1.0) /tmp/opencv_dir/modules/core/src/cuda_stream.cpp:473: error: (-217:Gpu API call) an illegal memory access was encountered in function 'waitForCompletion'

This happens only if I use not default stream and linear interpolation.

Steps to reproduce

Tested with cuda-10.1 and opencv-4.1.0.

// main.cpp

#include <cstdlib>
#include <opencv2/cudawarping.hpp>
#include <vector>

int main() {
  cv::Size inputSize{1000, 1000}; // can be different

  cv::Size outSize{670, 893};
  std::vector<double> values{
      0.3154414342519155,    0.2158186507123098,    133.1013012869572,
      -0.01298153793321658,  0.5681477441984867,    297.7284881948252,
      1.839213382549264e-05, 0.0006366310508410327, 1};

  cv::Mat matrix{cv::Size{3, 3}, CV_64FC1, values.data()};

  cv::cuda::GpuMat image{inputSize, CV_8UC4};

  cv::cuda::Stream stream;
  cv::cuda::warpPerspective(image, image, matrix, outSize, cv::INTER_LINEAR,
                            cv::BORDER_CONSTANT, cv::Scalar{0, 0, 0, 0},
                            stream);
  stream.waitForCompletion();

  return EXIT_SUCCESS;
}

# cmake

cmake_minimum_required(VERSION 3.12)

project(min_ex)

find_package(OpenCV COMPONENTS core cudawarping REQUIRED)

add_executable(${PROJECT_NAME} main.cpp)
target_compile_features(${PROJECT_NAME} PRIVATE cxx_std_11)
target_link_libraries(${PROJECT_NAME} PRIVATE ${OpenCV_LIBS})
target_include_directories(${PROJECT_NAME} PRIVATE ${OpenCV_INCLUDE_DIRS})

The text was updated successfully, but these errors were encountered:

nglee · 2019-12-29T23:50:44Z

Same on 3.4 branch.

nglee · 2020-08-28T07:01:05Z

Testing with CUDA8.0 and CUDA10.1, both cases seem to have this bug. GTX 1080 Ti

andrejlevkovitch · 2021-01-14T09:39:56Z

I simplify example to:

// main.cpp

#include <cstdlib>
#include <iostream>
#include <opencv2/cudawarping.hpp>
#include <vector>

int main() {
  cv::Size inputSize{1, 1};
  cv::Size outSize{1, 1};

  // clang-format off
  std::vector<float> values{
      0, 0, 0,
      0, 0, 2147483648, // XXX after converting to int we get -2147483648
  };
  // clang-format on

  cv::Mat matrix{cv::Size{3, 2}, CV_32FC1, values.data()};

  cv::cuda::GpuMat image{inputSize, CV_8UC1};
  cv::cuda::GpuMat output;

  cv::cuda::Stream stream;
  cv::cuda::warpAffine(image, output, matrix, outSize,
                       cv::INTER_LINEAR | cv::WARP_INVERSE_MAP,
                       cv::BORDER_CONSTANT, cv::Scalar{0}, stream);
  stream.waitForCompletion();

  return EXIT_SUCCESS;
}

Also I recompile opencv with nvcc flag -lineinfo and run this example by cuda-memcheck. Here is a output:

OpenCV(4.1.0) Error: Gpu API call (unspecified launch failure) in waitForCompletion, file /home/user/tmp/opencv_dir/modules/core/src/cuda_stream.cpp, line 473
terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.1.0) /home/user/tmp/opencv_dir/modules/core/src/cuda_stream.cpp:473: error: (-217:Gpu API call) unspecified launch failure in function 'waitForCompletion'

========= CUDA-MEMCHECK
========= Invalid __global__ read of size 1
=========     at 0x00000350 in /home/user/tmp/opencv_dir/modules/core/include/opencv2/core/cuda/filters.hpp:97:void cv::cuda::device::imgproc::warp<cv::cuda::device::imgproc::AffineTransform, cv::cuda::device::LinearFilter<cv::cuda::device::BorderReader<cv::cuda::PtrStep<unsigned char>, cv::cuda::device::BrdConstant<float>>>, unsigned char>(unsigned char, cv::cuda::PtrStepSz<cv::cuda::PtrStep<unsigned char>>)
=========     by thread (0,0,0) in block (0,0,0)
=========     Address 0x7fdb9c600000 is out of bounds
=========     Device Frame:/home/user/tmp/opencv_dir/modules/core/include/opencv2/core/cuda/filters.hpp:97:void cv::cuda::device::imgproc::warp<cv::cuda::device::imgproc::AffineTransform, cv::cuda::device::LinearFilter<cv::cuda::device::BorderReader<cv::cuda::PtrStep<unsigned char>, cv::cuda::device::BrdConstant<float>>>, unsigned char>(unsigned char, cv::cuda::PtrStepSz<cv::cuda::PtrStep<unsigned char>>) (void cv::cuda::device::imgproc::warp<cv::cuda::device::imgproc::AffineTransform, cv::cuda::device::LinearFilter<cv::cuda::device::BorderReader<cv::cuda::PtrStep<unsigned char>, cv::cuda::device::BrdConstant<float>>>, unsigned char>(unsigned char, cv::cuda::PtrStepSz<cv::cuda::PtrStep<unsigned char>>) : 0x350)
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame:/lib/x86_64-linux-gnu/libcuda.so.1 (cuLaunchKernel + 0x2fe) [0x282a4e]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x6ccc19]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x6ccca7]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x702ff5]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x1451b6]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x105e5e]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x105e8f]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x145430]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x15d267]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x14d032]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x14b4d4]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 [0x10365f]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_cudawarping.so.4.1 (_ZN2cv4cuda10warpAffineERKNS_11_InputArrayERKNS_12_OutputArrayES3_NS_5Size_IiEEiiNS_7Scalar_IdEERNS0_6StreamE + 0x8e9) [0x62ca8]
=========     Host Frame:min_ex [0x2469]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xeb) [0x2409b]
=========     Host Frame:min_ex [0x21ba]
=========
========= Program hit cudaErrorLaunchFailure (error 719) due to "unspecified launch failure" on CUDA API call to cudaStreamSynchronize. 
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame:/lib/x86_64-linux-gnu/libcuda.so.1 [0x391b13]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_core.so.4.1 [0xeea61e]
=========     Host Frame:/home/user/tmp/opencv_dir/build/lib/libopencv_core.so.4.1 (_ZN2cv4cuda6Stream17waitForCompletionEv + 0x21) [0x21e61f]
=========     Host Frame:min_ex [0x24a0]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xeb) [0x2409b]
=========     Host Frame:min_ex [0x21ba]
=========
========= Error: process didn't terminate successfully
========= No CUDA-MEMCHECK results found

cuda/filters.hpp:97 is just:

const int y1 = __float2int_rd(y);

from device method of LinearFilter::operator().

andrejlevkovitch · 2021-01-14T10:05:46Z

So, I think, problem in integer overflow. See on our transformation matrix - if we change our y-move value to 2147483583., then all will works fine.

andrejlevkovitch · 2021-01-14T13:04:45Z

Yes, problem is in integer overflow. Integer overflow is UB on cuda (like in c++ standard), so I just try change string that caused error from const int y1 = __float2int_rd(y); to const unsigned int y1 = __float2uint_rd(y); and after it I can't reproduce my error anymore. So, it looks like using signed integer values in cuda device functions is really bad idea.

andrejlevkovitch · 2021-01-14T13:45:33Z

#2712 also not reproduced after previous fix

andrejlevkovitch · 2021-01-15T09:50:38Z

Unfortunately I can't reproduce this error in some separate cuda function

andrejlevkovitch · 2021-01-15T10:31:46Z

Affine transformation supports three types of interpolation: nearest, linear and cubic - but problem reproduce only with INTER_LINEAR and with not default stream only. This is very strange, because function __float2int_rd uses in cubic filter also.

thilipwka · 2021-09-09T20:36:54Z

Hey !
I have an error with INTER_LINEAR type also, with NEAREST everything is working fine.
How to solve overflow in matrix in this code? Trying to solve this problem for 2 days , no luck (.
Thank you.

Mat CudaWarp(Mat aImgOriginal, Mat matrix, float w, float h) {
	Mat output;
	cv::cuda::GpuMat gpuOutput;
	cv::cuda::GpuMat  gpuInput;

	gpuInput.upload(aImgOriginal);

	cv::cuda::warpPerspective(gpuInput, gpuOutput, matrix, Point(w, h), INTER_LINEAR, BORDER_CONSTANT,0);

	gpuOutput.download(output);


	return output;

}

andrejlevkovitch · 2021-09-10T16:37:22Z

@thilipwka please, see my previous comments. Currently you can't fix this problem, because problem is very deep in cuda code

thilipwka · 2021-09-10T18:00:56Z

@thilipwka please, see my previous comments. Currently you can't fix this problem, because problem is very deep in cuda code

thank you

RamadanAhmed · 2022-03-08T11:29:25Z

I have the same problem

cv::cuda::remap(inputGPU, inputGPU, M1Gpu, M2Gpu, cv::INTER_LINEAR);

but it solved when I pass different gpu matrix for input and output

cv::cuda::remap(inputGPU, outputGPU, M1Gpu, M2Gpu, cv::INTER_LINEAR);

Emmanuel-Messulam · 2022-05-30T16:57:01Z

@andrejlevkovitch Is there some other way of doing a remap? A workaround?

rrrand mentioned this issue Oct 12, 2020

Exception in constructor of cv::cuda::Stream #2705

Closed

alalek mentioned this issue Oct 15, 2020

cuda::remap - illegal memory access #2712

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug in cv::cuda::warpPerspective #2361

Bug in cv::cuda::warpPerspective #2361

andrejlevkovitch commented Nov 25, 2019 •

edited

Loading

nglee commented Dec 29, 2019 •

edited

Loading

nglee commented Aug 28, 2020

andrejlevkovitch commented Jan 14, 2021 •

edited

Loading

andrejlevkovitch commented Jan 14, 2021

andrejlevkovitch commented Jan 14, 2021 •

edited

Loading

andrejlevkovitch commented Jan 14, 2021

andrejlevkovitch commented Jan 15, 2021

andrejlevkovitch commented Jan 15, 2021

thilipwka commented Sep 9, 2021 •

edited

Loading

andrejlevkovitch commented Sep 10, 2021

thilipwka commented Sep 10, 2021

RamadanAhmed commented Mar 8, 2022

Emmanuel-Messulam commented May 30, 2022

Bug in cv::cuda::warpPerspective #2361

Bug in cv::cuda::warpPerspective #2361

Comments

andrejlevkovitch commented Nov 25, 2019 • edited Loading

System information (version)

Detailed description

Steps to reproduce

nglee commented Dec 29, 2019 • edited Loading

nglee commented Aug 28, 2020

andrejlevkovitch commented Jan 14, 2021 • edited Loading

andrejlevkovitch commented Jan 14, 2021

andrejlevkovitch commented Jan 14, 2021 • edited Loading

andrejlevkovitch commented Jan 14, 2021

andrejlevkovitch commented Jan 15, 2021

andrejlevkovitch commented Jan 15, 2021

thilipwka commented Sep 9, 2021 • edited Loading

andrejlevkovitch commented Sep 10, 2021

thilipwka commented Sep 10, 2021

RamadanAhmed commented Mar 8, 2022

Emmanuel-Messulam commented May 30, 2022

andrejlevkovitch commented Nov 25, 2019 •

edited

Loading

nglee commented Dec 29, 2019 •

edited

Loading

andrejlevkovitch commented Jan 14, 2021 •

edited

Loading

andrejlevkovitch commented Jan 14, 2021 •

edited

Loading

thilipwka commented Sep 9, 2021 •

edited

Loading