GGUF (Breaking Change to Model Files) #633

Merged
merged 11 commits into main on Aug 25, 2023
Conversation

@abetlen (Owner) commented Aug 24, 2023

GGUF support for llama.cpp Closes #628

This currently works. To convert your old GGML v3 llama models, run:

python3 vendor/llama.cpp/convert-llama-ggmlv3-to-gguf.py --input <path-to-ggml> --output <path-to-gguf>
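
Once converted, the GGUF file loads the same way the old GGML file did. A minimal sketch, assuming the converted model was written to ./models/llama-2-7b.gguf (the path is a placeholder):

from llama_cpp import Llama

# placeholder path: wherever convert-llama-ggmlv3-to-gguf.py wrote its --output file
llm = Llama(model_path="./models/llama-2-7b.gguf")
out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
print(out["choices"][0]["text"])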

TODO

  • Fix tests
  • Move convert script into package to make it easier for people to migrate
  • Add docs link to conversion script in llama.cpp
  • Fix detokenization bug (getting extra leading space)

@abetlen abetlen changed the title GGUF GGUF (Breaking Change) Aug 24, 2023
@abetlen abetlen changed the title GGUF (Breaking Change) GGUF (Breaking Change to Model Files) Aug 24, 2023
@rlancemartin rlancemartin mentioned this pull request Aug 25, 2023
@abetlen abetlen merged commit 915bbea into main Aug 25, 2023
15 checks passed
@sndani commented Aug 26, 2023

Hello, I'm excited about trying this out with the CodeLlama GGUF model.

I followed the macOS (Sonoma beta) instructions. How do I get the 'llama' shared library?

llama-cpp-python % python3 -m llama_cpp.server --model $MODEL --n_gpu_layers 1
Traceback (most recent call last):
....
File "... /llama-cpp-python/llama_cpp/llama_cpp.py", line 80, in
_lib = _load_shared_library(_lib_base_name)
File " ... /llama-cpp-python/llama_cpp/llama_cpp.py", line 71, in _load_shared_library
raise FileNotFoundError(
FileNotFoundError: Shared library with base name 'llama' not found

Thanks!

@abetlen (Owner, Author) commented Aug 26, 2023

@sndani try reinstalling with the --verbose flag; it's likely a build error, and cmake will report it. If the issue isn't resolvable, please open an issue and I'll take a look. Cheers

@sndani commented Aug 26, 2023

@abetlen thanks for the great work and for responding.
Yes, I did run
% CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir --verbose

It turns out cmake isn't building the libllama.so target under vendor/llama.cpp (although 'make clean' tries to delete it). The steps below are a dev workaround; I'll open an issue (or the next person who encounters this can) if it isn't just my environment.

% make clean
% cd vendor/llama.cpp
% LLAMA_METAL=on make libllama.so
% cd ../..
% pip install 'llama-cpp-python[server]'
% export LLAMA_CPP_LIB=./vendor/llama.cpp/libllama.so
% python3 -m llama_cpp.server --model $MODEL --n_gpu_layers 1
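
With the server started as above, you can query its OpenAI-compatible HTTP API. A minimal sketch, assuming the server is listening on the default localhost:8000 (adjust if you passed --host/--port):

import requests

# assumes the default host/port used by python3 -m llama_cpp.server
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={"prompt": "Q: What is GGUF? A:", "max_tokens": 32},
)
print(resp.json()["choices"][0]["text"])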

@reddiamond1234 commented
My model is now a lot slower... is there any way to fix this?
