common : normalize naming style #7462

Merged
merged 3 commits into from
May 22, 2024

ggerganov (Owner) commented May 22, 2024

Planning to do some minor refactoring in the examples. Starting with the basics: moving and renaming things.

github-actions bot (Contributor) commented May 22, 2024

📈 llama.cpp server for bench-server-baseline on Standard_NC4as_T4_v3 for phi-2-q4_0: 553 iterations 🚀

Expand details for performance related PR only
  • Concurrent users: 8, duration: 10m
  • HTTP request : avg=8458.92ms p(95)=20527.39ms fails=, finish reason: stop=506 truncated=47
  • Prompt processing (pp): avg=93.4tk/s p(95)=366.67tk/s
  • Token generation (tg): avg=58.48tk/s p(95)=46.36tk/s
  • ggml-org/models/phi-2/ggml-model-q4_0.gguf parallel=8 ctx-size=16384 ngl=33 batch-size=2048 ubatch-size=256 pp=1024 pp+tg=2048 branch=gg/common-style commit=374a95f924f29b6d202fd269101a377499b2f09b

Charts (each titled "llama.cpp bench-server-baseline on Standard_NC4as_T4_v3 duration=10m 553 iterations", x-axis 1716401542 --> 1716402170):
  • llamacpp:prompt_tokens_seconds
  • llamacpp:predicted_tokens_seconds
  • llamacpp:kv_cache_usage_ratio
  • llamacpp:requests_processing

@mofosyne added the refactoring and Review Complexity : Medium labels May 22, 2024
@ggerganov merged commit 6ff1398 into master May 22, 2024
60 of 74 checks passed
@ggerganov deleted the gg/common-style branch May 22, 2024 17:04
arch-btw (Contributor) commented

@ggerganov since you're working on this file, I wanted to let you know that it still says LLaMa in interactive mode instead of Llama. It might be even better to call it llama.cpp:

control_message = " - To return control to LLaMa, end your input with '\\'.\n"

and:

control_message = " - Press Return to return control to LLaMa.\n"

On lines 479 and 482.
I know you have too many pull requests already, so I figured I'd just let you know instead.
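
A minimal sketch of what the suggested wording could look like, for illustration only. The helper name `control_message_for` and the `multiline_input` flag are hypothetical: the comment above does not say what selects between the two messages, so the sketch assumes they correspond to multi-line and single-line input modes. It is not the code that was merged.

```cpp
#include <string>

// Hypothetical helper, not part of llama.cpp: returns the interactive-mode
// hint with the project spelled "llama.cpp" instead of "LLaMa".
// Assumption: `multiline_input` picks between the two messages quoted above.
std::string control_message_for(bool multiline_input) {
    if (multiline_input) {
        // Multi-line mode: a trailing backslash hands control back.
        return " - To return control to llama.cpp, end your input with '\\'.\n";
    }
    // Single-line mode: pressing Return hands control back.
    return " - Press Return to return control to llama.cpp.\n";
}
```

Only the user-facing text changes here; the selection logic is a placeholder for whatever condition the real code uses.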

ggerganov added a commit that referenced this pull request May 23, 2024
teleprint-me pushed a commit to teleprint-me/llama.cpp that referenced this pull request May 23, 2024
* common : normalize naming style

ggml-ci

* common : match declaration / definition order

* zig : try to fix build
teleprint-me pushed a commit to teleprint-me/llama.cpp that referenced this pull request May 23, 2024
Labels: examples, refactoring, Review Complexity : Medium, server