Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pyproject toml #743

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jinmingyi1998
Copy link

@jinmingyi1998 jinmingyi1998 commented May 10, 2023

What does this PR do?

Fixes #740.

When install from a new environment, use Build System Requirements to install torch. Otherwise setup.py could not found torch.

Use Case:
In a new environment, install from source code or source_dist.tar.gz, setup.py could not import torch

Can be Improve:

  • add ninja to requirements for build from source
  • migrate setup.py to pyproject.toml

related reference:

Before submitting

  • Did you have fun?
    • Make sure you had fun coding 🙃
  • Did you read the contributor guideline?
  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
  • Did you make sure to update the docs?
    • N/A
  • Did you write any new necessary tests?
    • N/A
  • Did you update the changelog? (if needed)
    • N/A

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 10, 2023
@jinmingyi1998
Copy link
Author

jinmingyi1998 commented May 10, 2023

@danthe3rd something wrong in CI?

There is no change on dependencies and I have tested installation on my machine.

CI LOG:

Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass'
Submodule 'third_party/flash-attention' (https://github.com/HazyResearch/flash-attention.git) registered for path 'third_party/flash-attention'
Cloning into '/home/circleci/xformers/third_party/cutlass'...
Cloning into '/home/circleci/xformers/third_party/flash-attention'...
Submodule path 'third_party/cutlass': checked out '06eb90cc0daae633b1e25e80ace1ef81ac158baa'
Submodule path 'third_party/flash-attention': checked out 'a1f49a2b92b6fa022379bbebafed9d7f5e96a675'
Submodule 'csrc/flash_attn/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/flash-attention/csrc/flash_attn/cutlass'
Cloning into '/home/circleci/xformers/third_party/flash-attention/csrc/flash_attn/cutlass'...
Submodule path 'third_party/flash-attention/csrc/flash_attn/cutlass': checked out '319a389f42b776fae5701afcb943fc03be5b5c25'
Using pip 23.0.1 from /home/circleci/venv/lib/python3.8/site-packages/pip (python 3.8)
Obtaining file:///home/circleci/xformers
  Running command pip subprocess to install build dependencies
  Collecting setuptools
    Using cached setuptools-67.7.2-py3-none-any.whl (1.1 MB)
  Collecting torch>=1.12
    Downloading torch-2.0.1-cp38-cp38-manylinux1_x86_64.whl (619.9 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 619.9/619.9 MB 3.5 MB/s eta 0:00:00
  Collecting wheel
    Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
  Collecting nvidia-cuda-nvrtc-cu11==11.7.99
    Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.0/21.0 MB 62.6 MB/s eta 0:00:00
  Collecting nvidia-cublas-cu11==11.10.3.66
    Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 317.1/317.1 MB 7.6 MB/s eta 0:00:00
  Collecting jinja2
    Downloading Jinja2-3.1.2-py3-none-any.whl (133 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.1/133.1 kB 51.4 MB/s eta 0:00:00

I added torch requirement in build system which lead it re-install torch==2.0 again, then break CI test

@jinmingyi1998
Copy link
Author

CI test failed because build isolation, so need add --no-build-isolation when install repositry

A build frontend SHOULD, by default, create an isolated environment for each build, containing only the standard library and any explicitly requested build-dependencies. PEP517

@danthe3rd
Copy link
Contributor

danthe3rd commented May 10, 2023

Hi, and thanks for opening this PR!

I'm not sure what is the correct way of doing this. Indeed, we need pytorch to be installed first. And more than that: we need the exact same version at build-time AND run-time. I believe that's why the CI fails.
I'm not sure how other packages with similar requirements work - cc @fmassa maybe for torchvision or @bottler ?

@danthe3rd danthe3rd requested a review from fmassa May 10, 2023 08:53
@fmassa
Copy link
Contributor

fmassa commented May 10, 2023

Hi,

This was somewhat recently added to torchvision in pytorch/vision#6230 , and I haven't looked into that before. There is a comment from @pmeier there mentioning possible issues with this approach (i.e., picking the wrong PyTorch, which seems to be the case here), and there was no solution found at the moment.

So for now I don't have a fix for this.

@pmeier
Copy link

pmeier commented May 10, 2023

My comment pytorch/vision#6230 (comment) is still valid. I'm not familiar with the xformers project, but if you have any special requirements like nightly PyTorch builds or build against a specific CUDA version, this will cause trouble. Although we have that in torchvision (I'm actually not sure why), we aren't using it. The official installation instructions are to run python setup.py. Plus, we explicitly state in our README that if one wants to install with pip, they should use the --no-build-isolation flag, which bypasses the whole build requirements. Not sure if that is even an option with other standard based tools and thus can lead to even more problems.

@jinmingyi1998
Copy link
Author

Hi I make a brief conclusion:
For Users installing xformers:

  • Already have pytorch installed
    • Install from wheel : no impact
    • Install from source.tar.gz, this PR will build in an isolation environment, download and install a new temp-torch
  • no torch installed
    • Install from wheel: no impact
    • Install from source.tar.gz (like the issue no model named 'torch' #740 ) download and install a temp torch for build xformers , install a torch with cached torch wheel for install requirement.

To fix this CI:
add --no-build-isolation flag in "Install repositry" part

@jinmingyi1998
Copy link
Author

jinmingyi1998 commented May 10, 2023

PR Impact Conclusion:

Install from wheel:

no impact

Install from source(from source.tar.gz or install from git repo):

The installation build system will download and install a Temporary torch in isolation env and delete them after installation. After build done, if you do not meet the install requirements(requirements.txt), then install all requirements such as torch, numpy, etc. just like things to install requirements.

If you already have torch install and don't want to install again in the isolation env, build in your existed env, then add flag --no-build-isolation for pip install.

There is no need to install torch first and then install xformers!😆

@fmassa
Copy link
Contributor

fmassa commented May 10, 2023

Following the comments from @pmeier and the potential problems that adding the toml file, I think it might not be worth adding it.

Given that compiling xformers requires the exact same version of PyTorch during build time and runtime, this seems like it will cause a number of issues in the future IMO.

I think we should just update our readme to first ensure that we have PyTorch installed prior to running the xformers installation command

@adampauls
Copy link

@fmassa the problem occurs for me with poetry using a pyproject.toml file. It is not possible to install torch "first", since poetry is supposed to hide all that IIUC. Is there a known workaround? I thought adding it to build-system.requires might help but it does not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

no model named 'torch'
6 participants