[#150] Update the README with info about Python 3.12 and pyspark
We can do our best to support Python 3.12, but it's a bit janky because pyspark
doesn't officially support it. Python 3.10 and 3.11 should be pretty solid.
riley-harper committed Oct 7, 2024
1 parent ed1b01b commit 42b4f45
Showing 1 changed file with 18 additions and 1 deletion.
19 changes: 18 additions & 1 deletion README.md
@@ -17,11 +17,28 @@ hlink requires
- Python 3.10, 3.11, or 3.12
- Java 8 or greater for integration with PySpark

- You can install the newest version of the python package directly from PyPI with pip:
+ You can install the newest version of the Python package directly from PyPI with pip:
```
pip install hlink
```

We do our best to make hlink compatible with Python 3.10-3.12. If you run into
a problem using hlink on one of these Python versions, please open an issue on
GitHub. Versions of Python older than 3.10 are not supported.

Note that pyspark 3.5 does not yet officially support Python 3.12. If you
encounter pyspark-related import errors while running hlink on Python 3.12, try
one of the following:

- Installing the setuptools package. The distutils package was removed from the
standard library in Python 3.12 (PEP 632), but some versions of pyspark still
import it. The setuptools package provides a hacky stand-in distutils library
that should fix some import errors in pyspark. We install setuptools in our
development and test dependencies so that our tests work on Python 3.12.

- Downgrading Python to 3.10 or 3.11. Pyspark officially supports these
versions of Python, so you are more likely to get pyspark working well on
Python 3.10 or 3.11.
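As a rough illustration of why the first workaround helps, here is a small
Python sketch (the helper name `needs_setuptools_shim` is hypothetical, not
part of hlink) that checks whether an interpreter is new enough that the
stdlib no longer ships distutils:

```
import sys

def needs_setuptools_shim(version_info=sys.version_info):
    """Return True if this Python needs setuptools to provide distutils.

    distutils was removed from the standard library in Python 3.12
    (PEP 632), but some pyspark releases still import it, so on 3.12+
    installing setuptools restores a compatible stand-in.
    """
    return tuple(version_info[:2]) >= (3, 12)

print(needs_setuptools_shim((3, 12, 0)))  # True: install setuptools
print(needs_setuptools_shim((3, 11, 5)))  # False: distutils is in the stdlib
```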

## Docs

The documentation site can be found at [hlink.docs.ipums.org](https://hlink.docs.ipums.org).
