Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add version constraints #203

Merged
merged 7 commits into from
Feb 16, 2024
Merged

fix: add version constraints #203

merged 7 commits into from
Feb 16, 2024

Conversation

aryarm
Copy link
Member

@aryarm aryarm commented Jan 2, 2024

I've opened this PR for us to develop minimum version constraints for our dependencies. I've made a first pass at this, which I hope to refine with the help of whoever reviews this! These constraints will reduce the chance that users encounter bugs when installing TRTools in environments that have old versions of our dependencies.

my strategy

I started with the versions of each dependency that had just been published at the time of the first commit to the repository (see 2b1f871). As expected, the tests did not pass when run with some of those versions. In general, support for old dependencies can be broken by ongoing code maintenance and new features.

So I incrementally bumped certain versions until the tests passed (see 4be0437). Some combination of binary search through our dependencies' versions and ctrl+f through our dependencies' changelogs was helpful here. It was also super helpful to use poetry, a popular dependency management tool, to help me navigate version incompatibilities. Based on my experience, I think we will need to continue using it if we want to keep supporting specific python versions. (Refer to #202 for more details.)

a bug requiring us to drop support for py 3.6

For example, there was one instance when I discovered that some of the tests (for code that was added to TRTools recently) would only pass for versions of pandas >= 1.2.0. I suspect that there must be some sort of bug in TRTools when used in combination with versions of pandas < 1.2.0. But when I tried to constrain pandas to >= 1.2.0, poetry informed me that I would be breaking support for python 3.6. Apparently, pandas dropped support for python 3.6 in 1.2.0.

So I had to drop TRTools' support for python 3.6, too. In any case, I don't think the version of setuptools that comes with python 3.6 supports pyproject.toml files (source), so we would probably have to revert our pyproject.toml file to a setup.py file if we wanted to continue supporting python 3.6. Also, we will have to drop support for python 3.6 (and certain versions of numpy, cyvcf2, and pysam) anyway when we merge the haptools.data and tr_harmonizer libraries, since haptools.data only supports python 3.7+.

@aryarm aryarm changed the base branch from master to deps/lock-constrain January 2, 2024 06:45
@aryarm
Copy link
Member Author

aryarm commented Jan 2, 2024

Note to the reviewer of this PR:
It's very likely that there are valid versions lower than the ones I began with -- especially for numpy, scikit-learn, scipy, and statsmodels. I could also have made a mistake when incrementing the version constraints.
So if these constraints aren't low enough, please feel free to lower them further! When reviewing this PR, you can run our tests against a lower version of a dependency via poetry and pytest. For example, to try numpy 1.17.2:

  1. poetry add 'numpy==1.17.2'
  2. poetry run pytest

poetry add will install numpy 1.17.2 and update our lock file + pyproject.toml file accordingly. By default, poetry will install an editable TRTools and its dependencies in a separate virtual environment (in case poetry conflicts with TRTools). poetry run allows you to execute pytest in that environment and poetry shell will allow you to "activate'" that environment, if you ever need to.

I recommend doing all this in a GitHub codespace so that there's zero chance of you messing up your local installation of TRTools. Github codespaces are pre-configured development containers running editors on the cloud, and are basically free for academic use. To create a codespace on this branch, click on "Create codespace" on this page. Wait a bit for the codespace to get set up and then open a terminal ( Ctrl + ` ) and run conda activate trtools to access an environment pre-installed with poetry.

After you're done, you can manually edit the pyproject.toml file to ensure each version of each dependency still has a >= sign in the front of it. And then you should also run poetry lock --no-update to ensure the lock file remains synced to the pyproject.toml file.

@aryarm aryarm requested a review from gymreklab January 2, 2024 07:00
@aryarm
Copy link
Member Author

aryarm commented Feb 15, 2024

unless anyone has any objections, I'll probably merge this PR by EOD

Melissa and I agreed (via personal conversation) that these version constraints are probably low enough for now. We can of course revisit them if someone requests us to

@aryarm
Copy link
Member Author

aryarm commented Feb 16, 2024

this pr also pins numpy to a version less than 2.0 for now, until we have a chance to test it and make sure it doesn't break anything

@aryarm aryarm merged commit ea3ad83 into deps/lock-constrain Feb 16, 2024
11 checks passed
@aryarm aryarm deleted the deps/constraints branch February 16, 2024 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants