Pickle issue with load_ParametricUMAP #1134

Open
eafpres opened this issue Jun 19, 2024 · 3 comments

eafpres commented Jun 19, 2024

Describe the bug

Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:   File "/var/app/current/application.py", line 478, in load_stuff
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:     model = load_ParametricUMAP(model_set + '/' + full_name,
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:   File "/home/user/mambaforge/envs/tensorml/lib/python3.11/site-packages/umap/parametric>
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:     model = pickle.load((open(model_output, "rb")))
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:   File "/home/user/mambaforge/envs/tensorml/lib/python3.11/site-packages/numba/core/seri>
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:     ctor, states = loads(serialized)
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:                    ^^^^^^^^^^^^^^^^^
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]: TypeError: code() argument 13 must be str, not int

To Reproduce
Steps to reproduce the behavior:
Ubuntu 20.04
Python 3.11
umap-learn==0.5.3

  1. Create an embedding:
  import umap

  # model_vectors, data_path and date_prefix are defined elsewhere in the application
  distance = 'sokalsneath'
  op_mix_ratio = 0.3
  embed_dim = 10
  reducer = umap.ParametricUMAP(random_state=42,
                                transform_seed=42,
                                n_neighbors=15,
                                n_epochs=500,
                                metric=distance,
                                min_dist=0.0,
                                set_op_mix_ratio=op_mix_ratio,
                                n_components=embed_dim)
  mapper = reducer.fit(model_vectors)
  mapper.save(data_path + '/' + date_prefix + '/' +
              date_prefix + '_umap_mapper.umap')
  2. Attempt to load the model on a different Linux machine using load_ParametricUMAP(). The load fails with the TypeError traceback shown under "Describe the bug".
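
When a load fails like this, it can help to attach the running interpreter version to the error so that server logs show the environment alongside the failure. The following is only a minimal sketch: `load_with_diagnostics` is a hypothetical helper, not part of umap, and `loader` stands for whatever loading function is in use.

```python
import platform


def load_with_diagnostics(loader, path):
    """Call loader(path); on TypeError, re-raise with the Python version attached.

    Hypothetical helper for illustration; `loader` is any callable that
    takes a path and returns the loaded object.
    """
    try:
        return loader(path)
    except TypeError as exc:
        # Keep the original exception as the cause so the full traceback survives.
        raise RuntimeError(
            f"Failed to load {path!r} under Python "
            f"{platform.python_version()}: {exc}"
        ) from exc
```

In the setup described here, `loader` would be `load_ParametricUMAP` and `path` the saved model location.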

Expected behavior
On another machine this worked. I believe it is a subtle pickle issue. I had issues with other pickle files, which were solved by using pickle.dump(object, open(filename, "wb"), protocol=2). I have not figured out how to get umap to use a specific protocol.
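
For reference, the protocol workaround for plain pickle files can be sketched as below. The helper names `dump_with_protocol` and `load_pickle` are hypothetical and not part of umap. Pinning the protocol may not help with the error in this report if the failure comes from how the pickled contents are reconstructed rather than from the pickle protocol itself.

```python
import pickle


def dump_with_protocol(obj, filename, protocol=2):
    """Pickle obj to filename with an explicit protocol (hypothetical helper)."""
    # Pickle data is binary, so the file must be opened in "wb" mode.
    with open(filename, "wb") as fh:
        pickle.dump(obj, fh, protocol=protocol)


def load_pickle(filename):
    """Load an object written by dump_with_protocol."""
    with open(filename, "rb") as fh:
        return pickle.load(fh)
```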

Desktop (please complete the following information):

  • OS: Windows 11 Pro, running WSL 2 with Ubuntu 20.04

eafpres commented Jun 20, 2024

Update: this may be a Python 3.11-related issue. I have tested downgrading the server to Python 3.9 and things seem to work then. I did try Python 3.11 on my dev system and re-saved the model, but still got the error on the Python 3.11 server.
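
Since the behavior appears to depend on the Python version, one way to catch a mismatch before attempting a load is to record the saving interpreter next to the model file and compare it at load time. This is a sketch under assumptions: the helper names and the `.env.json` sidecar convention are hypothetical and not something umap provides.

```python
import json
import platform
import sys


def write_env_sidecar(model_path):
    """Write <model_path>.env.json describing the saving interpreter."""
    info = {
        "python": platform.python_version(),
        "major_minor": list(sys.version_info[:2]),
    }
    sidecar = str(model_path) + ".env.json"
    with open(sidecar, "w", encoding="utf-8") as fh:
        json.dump(info, fh)
    return sidecar


def check_env_sidecar(model_path):
    """Return (ok, saved, current), comparing major.minor Python versions."""
    with open(str(model_path) + ".env.json", encoding="utf-8") as fh:
        saved = tuple(json.load(fh)["major_minor"])
    current = tuple(sys.version_info[:2])
    return saved == current, saved, current
```

Calling `write_env_sidecar` right after saving the model and `check_env_sidecar` right before loading it gives an explicit warning point instead of a TypeError deep inside the unpickling code.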

timsainb (Collaborator) commented

Hey, can you try the branch in #1123 to see if it resolves the issue on Python 3.11?


kobiche commented Aug 21, 2024

I can confirm this is related to the Python version. How should I proceed?
