
How to convert/load large models #342

Closed
Flocksserver opened this issue Mar 23, 2023 · 2 comments · Fixed by #399

Comments

@Flocksserver

Hey, I would like to use a large model with this library. The small one works great. I can convert the weights as described here:

Pretrained models are available on Hugging Face's model hub and can be loaded using RemoteResources defined in this library. A conversion utility script is included in ./utils to convert PyTorch weights to a set of weights compatible with this library. This script requires Python and torch to be set up, and can be used as follows: python ./utils/convert_model.py path/to/pytorch_model.bin where path/to/pytorch_model.bin is the location of the original PyTorch weights.

When downloading larger models (e.g. flan-t5-xl), there are several weight files:

-rw-r--r--   1 marcelkaufmann  staff  9449717937 23 Mär 17:58 pytorch_model-00001-of-00002.bin
-rw-r--r--   1 marcelkaufmann  staff  1949494999 23 Mär 17:35 pytorch_model-00002-of-00002.bin
-rw-r--r--   1 marcelkaufmann  staff       50781 23 Mär 17:35 pytorch_model.bin.index.json

Is it possible to use these models with rust-bert right now?


sunilmallya commented Mar 30, 2023

import torch

# filename pattern of the sharded checkpoint files
model_file_pattern = "pytorch_model-{:05d}-of-{:05d}.bin"
num_shards = 2

# load each shard on CPU and merge its weights into a single state dict
combined_model_weights = {}
for i in range(num_shards):
    model_file = model_file_pattern.format(i + 1, num_shards)
    shard = torch.load(model_file, map_location=torch.device('cpu'))
    combined_model_weights.update(shard)

# save the merged weights as one file
torch.save(combined_model_weights, "pytorch_model.bin")

Once you have the single .bin file, you can convert it to the Rust-compatible weights as described above.
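
For checkpoints with more shards, a small variation avoids hard-coding the shard count by reading the shard list from pytorch_model.bin.index.json. This is a sketch that assumes the Hugging Face sharded-checkpoint layout, where the index file's "weight_map" maps each parameter name to its shard filename:

import json
import torch

# read the index file that ships with the sharded checkpoint
# (assumes the Hugging Face layout with a "weight_map" entry)
with open("pytorch_model.bin.index.json") as f:
    index = json.load(f)

# collect the unique shard filenames referenced by the index
shard_files = sorted(set(index["weight_map"].values()))

# load each shard on CPU and merge its tensors into one state dict
combined_model_weights = {}
for shard_file in shard_files:
    combined_model_weights.update(torch.load(shard_file, map_location="cpu"))

torch.save(combined_model_weights, "pytorch_model.bin")

After merging, the conversion step from the README applies unchanged: python ./utils/convert_model.py pytorch_model.bin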

Flocksserver (Author) commented Mar 30, 2023

Ah, I see. Thanks for this suggestion 🙏 Maybe this could become an optional step within the "convert_model.py" script in the future. It looks straightforward 🥇
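
For illustration only, a hypothetical shape such an optional step could take (a sketch under the assumption that the script would accept multiple weight files on the command line; this is not the actual change merged in #399):

import argparse
import torch

def merge_sharded_checkpoint(paths):
    # merge several sharded PyTorch weight files into a single state dict
    combined = {}
    for path in paths:
        # load each shard on CPU and fold its tensors into the combined dict
        combined.update(torch.load(path, map_location="cpu"))
    return combined

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # hypothetical: accept one or more weight files instead of a single path
    parser.add_argument("source_files", nargs="+", help="path(s) to the PyTorch weight file(s)")
    args = parser.parse_args()

    weights = merge_sharded_checkpoint(args.source_files)
    # ... the existing conversion of `weights` to a rust-bert compatible
    # format would continue from here ...

The merge itself is order-independent because each parameter appears in exactly one shard, so a plain dict update is enough.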
