Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for CUDNN 6 #15851

Closed
wants to merge 8 commits into from

Conversation

syed-ahmed
Copy link
Collaborator

This PR aims to remove support for cuDNN 6.

CC: @soumith @ngimel

environment:
JOB_BASE_NAME: pytorch-linux-xenial-cuda8-cudnn6-py3-build
JOB_BASE_NAME: pytorch-linux-xenial-cuda8-cudnn7-py3-build
DOCKER_IMAGE: "308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-cuda8-cudnn6-py3:278"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docker image is still cudnn6? Are there cuda8+cudnn7 images already pushed to aws?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the cuda8+cudnn7 docker image is not pushed

@ngimel
Copy link
Collaborator

ngimel commented Jan 9, 2019

I don't know how cudnn is discovered now, but it probably also has to be changed to require minimum 7.0 version.

@syed-ahmed
Copy link
Collaborator Author

@ngimel Added a check for minimum cudnn version. Will verify if it works tomorrow.

@syed-ahmed
Copy link
Collaborator Author

@ngimel verified that minimum version requirement change in the cmake file works:

-- Found CUDA: /usr/local/cuda (found version "8.0") 
-- Caffe2: CUDA detected: 8.0
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 8.0
-- Found CUDNN: /usr/include/  
-- Found cuDNN: v6.0.21  (include: /usr/include/, library: /usr/lib/x86_64-linux-gnu/libcudnn.so.6)
CMake Error at cmake/public/cuda.cmake:150 (message):
  PyTorch requires cuDNN 7 and above.
Call Stack (most recent call first):
  cmake/Dependencies.cmake:665 (include)
  CMakeLists.txt:199 (include)


-- Configuring incomplete, errors occurred!
See also "/workspace/build/CMakeFiles/CMakeOutput.log".
See also "/workspace/build/CMakeFiles/CMakeError.log".
Failed to run 'bash ../tools/build_pytorch_libs.sh --use-cuda --use-fbgemm --use-nnpack --use-mkldnn --use-qnnpack caffe2'
root@886a9554535e:/workspace# 

@syed-ahmed syed-ahmed force-pushed the remove-cudnn-6-support branch 2 times, most recently from 5f6e2ad to 2b2cb51 Compare January 14, 2019 02:39
ssnl added a commit to ssnl/ossci-job-dsl that referenced this pull request Jan 15, 2019
Prepare for pytorch/pytorch#15851.

The docker side change was done in pietern/pytorch-dockerfiles@913d15b, but this is needed to make Jenkins really build these dockers.
Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely!

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Jan 17, 2019
Summary: This PR aims to remove support for cuDNN 6.

Differential Revision: D13709595

Pulled By: ezyang

fbshipit-source-id: 853624db1cf66b0534d7028654c38c2806fb4107
@syed-ahmed syed-ahmed closed this Jan 17, 2019
@ezyang
Copy link
Contributor

ezyang commented Jan 18, 2019

@syed-ahmed btw if you wanted to also delete

// Note [Legacy CuDNN grouped convolution support]
// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// CuDNN earlier than CuDNN 7 does not directly support group
// convolution, so we provide support for it by sequentially
// running a convolution per group  with appropriately
// adjusted sizes.  https://blog.yani.io/filter-group-tutorial/
// has a fairly good diagram explaining how it works.

that'd be peachy

syed-ahmed added a commit to syed-ahmed/pytorch that referenced this pull request Jan 18, 2019
Summary: This PR aims to remove support for cuDNN 6.

Differential Revision: D13709595

Pulled By: ezyang

fbshipit-source-id: 853624db1cf66b0534d7028654c38c2806fb4107
@syed-ahmed
Copy link
Collaborator Author

Sounds good! Peachy it is. I'll follow up on that soon.

soumith pushed a commit that referenced this pull request Jan 18, 2019
Summary: This PR aims to remove support for cuDNN 6.

Differential Revision: D13709595

Pulled By: ezyang

fbshipit-source-id: 853624db1cf66b0534d7028654c38c2806fb4107
facebook-github-bot pushed a commit that referenced this pull request Jan 23, 2019
Summary:
This PR updates the logic for using cudnnGet* and cudnnFind*. Current version of cudnn find and get (v7) returns a pair of best algorithm and the convDesc mathType. While we were using the returned algorithm, we didn't update the mathType. As a result, we ended up with a slow choice of algorithm and math type. Without this patch, we are seeing a 10x regression in group convolutions.

Changelist:
- Changed the template arguments to be `perf_t` instead of `algo_t` to unify cudnnFind and cudnnGet. Both cudnnFind and cudnnGet have the same purpose and hence, it made sense to unify them and get rid of `getAlgorithm`.
- Used cudnnGet*_v7 everywhere cudnnGet* was being used.
- Removed all cudnn6 paths (This PR depends on #15851)

Differential Revision: D13787601

Pulled By: ezyang

fbshipit-source-id: 81fe86727673d021306fe1c99c3e528b7c9ad17f
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jan 23, 2019
Summary:
This PR updates the logic for using cudnnGet* and cudnnFind*. Current version of cudnn find and get (v7) returns a pair of best algorithm and the convDesc mathType. While we were using the returned algorithm, we didn't update the mathType. As a result, we ended up with a slow choice of algorithm and math type. Without this patch, we are seeing a 10x regression in group convolutions.

Changelist:
- Changed the template arguments to be `perf_t` instead of `algo_t` to unify cudnnFind and cudnnGet. Both cudnnFind and cudnnGet have the same purpose and hence, it made sense to unify them and get rid of `getAlgorithm`.
- Used cudnnGet*_v7 everywhere cudnnGet* was being used.
- Removed all cudnn6 paths (This PR depends on pytorch/pytorch#15851)

Differential Revision: D13787601

Pulled By: ezyang

fbshipit-source-id: 81fe86727673d021306fe1c99c3e528b7c9ad17f
facebook-github-bot pushed a commit that referenced this pull request Feb 5, 2019
Summary:
This PR updates the logic for using cudnnGet* and cudnnFind*. Current version of cudnn find and get (v7) returns a pair of best algorithm and the convDesc mathType. While we were using the returned algorithm, we didn't update the mathType. As a result, we ended up with a slow choice of algorithm and math type. Without this patch, we are seeing a 10x regression in group convolutions.

Changelist:
- Changed the template arguments to be `perf_t` instead of `algo_t` to unify cudnnFind and cudnnGet. Both cudnnFind and cudnnGet have the same purpose and hence, it made sense to unify them and get rid of `getAlgorithm`.
- Used cudnnGet*_v7 everywhere cudnnGet* was being used.
- Removed all cudnn6 paths (This PR depends on #15851)

Differential Revision: D13957944

Pulled By: ezyang

fbshipit-source-id: a88c39d80ae37f2d686665622302b62b50fab404
zdevito pushed a commit to zdevito/ATen that referenced this pull request Feb 5, 2019
Summary:
This PR updates the logic for using cudnnGet* and cudnnFind*. Current version of cudnn find and get (v7) returns a pair of best algorithm and the convDesc mathType. While we were using the returned algorithm, we didn't update the mathType. As a result, we ended up with a slow choice of algorithm and math type. Without this patch, we are seeing a 10x regression in group convolutions.

Changelist:
- Changed the template arguments to be `perf_t` instead of `algo_t` to unify cudnnFind and cudnnGet. Both cudnnFind and cudnnGet have the same purpose and hence, it made sense to unify them and get rid of `getAlgorithm`.
- Used cudnnGet*_v7 everywhere cudnnGet* was being used.
- Removed all cudnn6 paths (This PR depends on pytorch/pytorch#15851)

Differential Revision: D13957944

Pulled By: ezyang

fbshipit-source-id: a88c39d80ae37f2d686665622302b62b50fab404
sakaia added a commit to sakaia/pytorch that referenced this pull request Apr 11, 2019
Remove pointer to nonexistent Note.
It is already removed in "Remove support for CUDNN 6 (pytorch#15851)"
@sakaia sakaia mentioned this pull request Apr 11, 2019
facebook-github-bot pushed a commit that referenced this pull request Apr 11, 2019
Summary:
Remove pointer to nonexistent Note.
It is already removed in "Remove support for CUDNN 6 (#15851)"
Pull Request resolved: #19148

Differential Revision: D14891514

Pulled By: soumith

fbshipit-source-id: dd33cfefa3a21e18afae5b3992dea085adaabda8
@pytorch pytorch deleted a comment from intgogo Apr 16, 2019
zhangguanheng66 pushed a commit to zhangguanheng66/pytorch that referenced this pull request May 6, 2019
Summary:
Remove pointer to nonexistent Note.
It is already removed in "Remove support for CUDNN 6 (pytorch#15851)"
Pull Request resolved: pytorch#19148

Differential Revision: D14891514

Pulled By: soumith

fbshipit-source-id: dd33cfefa3a21e18afae5b3992dea085adaabda8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants