
Bug: Model ignores system prompt when using /completion endpoint #8393

Closed

andreys42 opened this issue Jul 9, 2024 · 5 comments
Labels: bug-unconfirmed, medium severity, stale

Comments


andreys42 commented Jul 9, 2024

What happened?

I'm testing the Meta-Llama-3-8B-Instruct-Q8_0 model with the llama.cpp HTTP server, both through the chatui interface and via direct requests using Python's requests library.

When I use chatui with the chatPromptTemplate option, everything works fine, and the model's output is predictable and desirable.

However, when I make direct requests to the same server with the same model, the output is messy (lots of newline characters, repetition of the question, and so on) and most of the system instructions are ignored (although the general logic of the output is fine): when I ask it to answer only with 0 or 1, the model still tries to justify its decision in the output.

My attempts so far have been:

  1. Using the same template (chatPromptTemplate from chatui) as the prompt key, filled in with the user requests and assistant answers (see the request sketch after this list).

  2. Passing {"chat-template": "llama3"} in the request.

  3. Using the prompt as a raw string containing only the current user's prompt, with the system instructions in the "system_prompt" key.
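
Roughly, the direct requests I'm describing look like this (the server address and generation parameters are just placeholders, not my exact values):

import requests

# Placeholder system/user text for illustration.
system = "Answer only with 0 or 1."
user = "Is the sky blue?"

# Llama 3 Instruct prompt template built by hand.
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    f"{system}<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    f"{user}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# Assumed server address; adjust to wherever llama-server is listening.
resp = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": prompt,
        "n_predict": 16,
        "temperature": 0,
        "stop": ["<|eot_id|>"],
    },
)
print(resp.json()["content"])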

I've spent a lot of time trying to figure out the issue, but all of these approaches work much worse than the chatui way.

I believe the problem lies in my understanding of how to format the input prompts; I'm not familiar enough with the prompt syntax documentation.

Name and Version

latest libs
Meta-Llama-3-8B-Instruct-Q8_0

What operating system are you seeing the problem on?

No response

Relevant log output

No response

andreys42 added the bug-unconfirmed and medium severity labels Jul 9, 2024
dspasyuk (Contributor) commented Jul 9, 2024

@andreys42 Unless you are using llama-cli in conversation mode (-cnv), you will need to use --in-prefix/--in-suffix or wrap your input in the Llama 3 prompt template.

andreys42 (Author) commented

@andreys42 Unless you are using llama-cli in conversation mode (-cnv), you will need to use --in-prefix/--in-suffix or wrap your input in the Llama 3 prompt template.

@dspasyuk Thanks for the suggestion, --in-prefix/--in-suffix does indeed make sense; I will try it, thank you.
As for using the Llama 3 prompt template for my input, I did that and mentioned it above; it made no difference for me...

matteoserva (Contributor) commented

You are probably using the wrong template.

Send your request to the /completion endpoint, then open the /slots endpoint to see what was effectively sent.

You can compare the good and bad prompts to see what was wrong.
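
For instance, a minimal sketch of that check (assuming the server listens on localhost:8080 and exposes the /slots endpoint; exact field names may vary between llama.cpp versions):

import requests

base = "http://localhost:8080"  # assumed server address

# Send the hand-built prompt to /completion first.
requests.post(f"{base}/completion",
              json={"prompt": "...your formatted prompt...", "n_predict": 8})

# Then read the slot state; each slot reports the prompt it actually processed,
# which can be diffed against the prompt produced by the chatui path.
for slot in requests.get(f"{base}/slots").json():
    print(slot.get("id"), repr(slot.get("prompt")))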

dspasyuk (Contributor) commented

@andreys42 here are the settings I use in llama.cui that work well across major models:

../llama.cpp/llama-cli --model ../../models/meta-llama-3-8b-instruct-q5_k_s.gguf --n-gpu-layers 25 -cnv --simple-io -b 2048 --ctx_size 0 --temp 0 --top_k 10 --multiline-input --chat-template llama3 --log-disable

Here is the result:

Screencast.from.2024-07-10.10.20.44.AM.webm

You can test it for yourself here: https://github.com/dspasyuk/llama.cui

github-actions bot added the stale label Aug 10, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
