Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add version constraints, poetry, and poetry.lock file, and use nox-poetry and py3.12 in CI #202

Merged
merged 60 commits into from
Feb 16, 2024

Conversation

aryarm
Copy link
Member

@aryarm aryarm commented Jan 2, 2024

This PR adds version constraints and a lock file to our repository. It also adds poetry (a dependency management tool) to our development environment, so that we can use it to maintain our lock file. Finally, it reconfigures our CI to use nox-poetry, allowing us to run our CI checks with the locked versions of our dependencies.

The code in this PR was adapted from the haptools repository which was, in turn, inspired by the Hypermodern Python series.

a new development setup

This PR makes some changes to our development workflow. First, it changes the dev installation instructions on the README. Rather than installing the project using pip, it now says to use poetry. It also provides a conda environment (in the dev-env.yml file) to ensure that the right poetry version is installed. The dev-env.yml file also includes nox and some other packages that we require for testing and development.

Within the conda environment, poetry creates its own separate virtual environment with TRTools and all of its dependencies, using the versions in our lock file. To enter it, you can run poetry shell. And then you can continue coding as if you had just done an editable install with pip install -e

It's important that this child environment remain separate from its parent conda environment to reduce the chance of version conflicts between the environments. Some packages installed in the conda environment, like bcftools and ART, will be available in the child environment, but the same won't be true for the python packages like poetry or nox.

To add or modify a dependency of TRTools, we can use the poetry add command, documented in the Contributing section of our README. This will install the new dependency in the child environment and update the pyproject.toml and poetry.lock files accordingly.

why should we use poetry?

To see how poetry might prevent us from breaking support for older python versions, let's consider an example. Let's say we have pandas >= 2.0.3 in our pyproject.toml and pandas is locked to 2.0.3 in our lock file. We want to develop some code that uses a new function/feature of pandas. But unbeknownst to us, the feature is only available in pandas 2.1.0+, which is also the first version of pandas to drop support for python 3.8. (In fact, a situation like this has already happened for python 3.6 in #203 !)

If we weren't using poetry, we would probably use the new function in our code and unknowingly break support for python 3.8. So how does poetry help in this situation? In two ways.

  1. Our development environment is configured to use only the locked versions of our dependencies. This ensures that all developers will use the minimum supported versions of each of our dependencies. So when we try to run our code that uses the new pandas feature, it will fail, since we will have pandas 2.0.3 installed locally instead of 2.1.0

  2. Once we figure out that we need a newer version of pandas, we might try to install pandas 2.1.0 via poetry add 'pandas>=2.1.0'. At that moment, poetry should complain and warn us that pandas 1.2.0 has dropped support for python 3.8. This forces us to either purposefully drop support for python 3.8 or come up with a workaround

To summarize, we should use poetry because 1) it helps us maintain a lock file and 2) it will complain about any incompatibilities between the versions of our dependencies in our pyproject.toml and help us choose the proper versions.

why should we use nox-poetry?

A developer could accidentally subvert poetry's checks and forget to update the lock file by installing pandas 2.1.0 with pip instead of poetry. Our CI should catch that. It uses nox-poetry to ensure that our tests are executed with the locked versions of our dependencies. nox also allows us to locally run our tests against each supported version of python.

Using the locked versions of our dependencies in the CI also protects our CI from any potential regressions that may be introduced in our dependencies. (For examples of this happening, see #206 and #208.) Unfortunately, I wasn't able to lock our dependencies for python 3.9+ without dropping support for python 3.7 because of cjolowicz/nox-poetry#1116 . So our CI tests for py3.9+ are still prone to breakage whenever a regression is introduced. In the future, we'll able to resolve this by using multiple constraint dependencies to maintain different versions of the same package in our lock file. That approach won't currently work because of python-poetry/poetry-plugin-export#183, but once that gets resolved, I will open a new PR to lock our dependencies for python 3.9+, as well. In the meantime, successful checks for python 3.9+ won't be required for PRs.

There are other benefits to using nox and nox-poetry, but it might be best if you just refer to their docs for the rest.

additional CI checks

This PR also adds two new CI checks:

  1. A test for python 3.12

  2. Confirmation that all new files are smaller than 0.5 MB

  3. Up till now, all of the CI checks would fail if the PR's code coverage was insufficient. To fix this, I've separated the coverage test into its own CI check so that the other tests will still pass. This should hopefully make our CI checks easier to interpret. Also, just a (somewhat unrelated) heads-up that I moved the contents of .coveragerc into our pyproject.toml file.

  4. By default, poetry will include all files in the trtools/ folder within the distribution published to PyPI. This bloats the sdist similarly to fix: bloating of sdists by setuptools-scm #195. Poetry's recommended solution is to exclude directories from the sdist by listing them in the [tool.poetry.exclude] section of our pyproject.toml file. But this means that the sdist might get bloated again if a developer adds a new directory (with files that should be excluded) but forgets to list the directory in the pyproject.toml file. To prevent this, I've added a test to check that the distribution folder never exceeds 0.3 MB.
    Note: I've excluded the entire testsupport directory from the source distribution, but I'm not sure that's ok? The python files in that directory weren't excluded until now, but I don't think they're needed by anything, since the test_trtools.sh script seems to download its own version of the code.

move pytest-cov command line args to noxfile so they don't always run
Base automatically changed from build/automated-releases to master January 16, 2024 22:08
@aryarm aryarm changed the title feat: add version constraints, poetry, and poetry.lock file, and use nox-poetry in CI feat: add version constraints, poetry, and poetry.lock file, and use nox-poetry and py3.12 in CI Jan 28, 2024
@aryarm aryarm marked this pull request as ready for review February 16, 2024 19:21
Copy link
Contributor

@LiterallyUniqueLogin LiterallyUniqueLogin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes! I appreciate that we won't unknowingly break support for lower package versions. I don't mind bumping our dependency requirements up regularly if we want their newer features, but it should be done in a documented way, and not produce bugs post-install.

I also appreciate the code coverage test being split into a separate test, that's cleaner.

Despite your good descriptions, I don't actually understand much about what poetry is doing. I think that's the right effort/outcome tradeoff, so I'll do this review without that knowledge. I've also never understood much of the CI configuration here, so cannot review that critically.

In general, I lean towards it being more important that trtools is easy to write tests for, and that it is testable when it is installed, than it is to reduce file sizes. Happy to discuss, perhaps that's not the correct heuristic. But my comments reflect that.

The CI check that all new files are smaller than 0.5 MB should be a warning but able to be overridden. Sometimes the cleanest thing to do for writing a test is to upload a 10MB VCF and write a test against it. That may not be best practice, but it is something I want to support.

Excluding trtools/testsupport/* includes excluding trtools/testsupport/test_trtools.sh, which means installs of trtools will not come with working tests. I don't particularly like that solution.

README.rst Outdated Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
.github/pull_request_template.md Show resolved Hide resolved
@aryarm aryarm merged commit ed9c961 into master Feb 16, 2024
12 checks passed
@aryarm aryarm deleted the deps/lock-constrain branch February 16, 2024 22:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants