# LocalAI model gallery
The model gallery is a curated collection of models created by the community and tested with LocalAI.
We encourage contributions to the gallery! However, please note that if you are submitting a pull request (PR), we cannot accept PRs that include URLs to models based on LLaMA or models with licenses that do not allow redistribution. Nevertheless, you can submit a PR with the configuration file without including the downloadable URL.
To load a model from main onto localhost:

```bash
bash ./load.sh wizard
```
For how to use the files in this repository, see the Documentation.

The model configuration files support the following fields:
- `name`: Name of the model
- `parameters`: Prediction parameters
  - `top_p`: Top P value
  - `top_k`: Top K value
  - `maxtokens`: Maximum tokens
  - `temperature`: Temperature
- `model`: Model file
- `f16`: Use F16 format (true/false)
- `threads`: Number of threads
- `debug`: Debug mode (true/false)
- `roles`: Map of roles
- `embeddings`: Use embeddings (true/false)
- `backend`: Backend name
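Put together, the top-level fields above might look like the following in a model configuration file. This is only a sketch: the model name, file names, and parameter values are illustrative placeholders, not a tested configuration.

```yaml
# Hypothetical model config sketch; all values are illustrative.
name: wizard
backend: llama
model: wizard.bin        # model file
f16: true                # use F16 format
threads: 4
debug: false
embeddings: false
roles:                   # map of roles
  user: "USER:"
  assistant: "ASSISTANT:"
parameters:              # prediction parameters
  temperature: 0.2
  top_p: 0.7
  top_k: 80
  maxtokens: 512
```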
- `template`
  - `chat`: Chat template
  - `chat_message`: Chat message template
  - `completion`: Completion template
  - `edit`: Edit template
  - `function`: Function template
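A `template` block referencing the templates above might be sketched like this; the template names are hypothetical placeholders for template files shipped alongside the config.

```yaml
# Illustrative template block; the names are placeholders
# for template files provided with the model.
template:
  chat: wizard-chat
  chat_message: wizard-chat-message
  completion: wizard-completion
  edit: wizard-edit
  function: wizard-function
```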
- `function`
  - `disable_no_action`: Disable no action (true/false)
  - `no_action_function_name`: No action function name
  - `no_action_description_name`: No action description name
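A `function` block using these fields might be sketched as follows; the function and description names are invented for illustration.

```yaml
# Illustrative function block; names are hypothetical.
function:
  disable_no_action: false
  no_action_function_name: answer
  no_action_description_name: answer_description
```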
- `feature_flags`: Map of feature flags
- `llm`
  - `system_prompt`: System prompt
  - `tensor_split`: Tensor split
  - `main_gpu`: Main GPU
  - `rms_norm_eps`: RMS Norm Epsilon
  - `ngqa`: NGQA
  - `prompt_cache_path`: Prompt cache path
  - `prompt_cache_all`: Prompt cache all (true/false)
  - `prompt_cache_ro`: Prompt cache read-only (true/false)
  - `mirostat_eta`: Mirostat ETA
  - `mirostat_tau`: Mirostat TAU
  - `mirostat`: Mirostat
  - `gpu_layers`: GPU layers
  - `mmap`: Use MMAP (true/false)
  - `mmlock`: Use MMLock (true/false)
  - `low_vram`: Low VRAM mode (true/false)
  - `grammar`: Grammar
  - `stopwords`: List of stopwords
  - `cutstrings`: List of cutstrings
  - `trimspace`: List of trimspace
  - `context_size`: Context size
  - `numa`: Use NUMA (true/false)
  - `lora_adapter`: Lora adapter
  - `lora_base`: Lora base
  - `no_mulmatq`: No MulMatQ (true/false)
  - `draft_model`: Draft model
  - `n_draft`: N Draft
  - `quantization`: Quantization
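A few of these LLM options might be sketched like this. The values are placeholders, and the sketch assumes the options sit at the top level of the model config alongside `name` and `backend`; check the Documentation for the exact placement.

```yaml
# Illustrative LLM option sketch; values are placeholders.
context_size: 2048
gpu_layers: 35       # number of layers to offload to the GPU
mmap: true
low_vram: false
mirostat: 2
mirostat_eta: 0.1
mirostat_tau: 5.0
stopwords:
  - "HUMAN:"
  - "### Response:"
```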
- `autogptq`
  - `model_base_name`: Model base name
  - `device`: Device
  - `triton`: Use Triton (true/false)
  - `use_fast_tokenizer`: Use fast tokenizer (true/false)
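An `autogptq` block might be sketched as follows; the base name and device string are illustrative placeholders.

```yaml
# Illustrative autogptq block; values are hypothetical.
autogptq:
  model_base_name: model
  device: "cuda:0"
  triton: false
  use_fast_tokenizer: true
```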
- `diffusers`
  - `pipeline_type`: Pipeline type
  - `scheduler_type`: Scheduler type
  - `cuda`: Use CUDA (true/false)
  - `enable_parameters`: Enable parameters
  - `cfg_scale`: CFG Scale
  - `img2img`: Image to Image Diffuser (true/false)
  - `clip_skip`: Clip skip
  - `clip_model`: Clip model
  - `clip_subfolder`: Clip subfolder
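A `diffusers` block might be sketched like this; the pipeline and scheduler names are examples only, and the values are placeholders.

```yaml
# Illustrative diffusers block; pipeline/scheduler names and
# values are examples, not a tested configuration.
diffusers:
  pipeline_type: StableDiffusionPipeline
  scheduler_type: euler_a
  cuda: true
  cfg_scale: 7
  clip_skip: 1
```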
- `grpc`
  - `attempts`: Attempts
  - `attempts_sleep_time`: Attempts sleep time
- `vall-e`
  - `audio_path`: Audio path
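Finally, the `grpc` and `vall-e` blocks might be sketched together as follows; the retry counts and the audio path are hypothetical placeholders.

```yaml
# Illustrative sketch; values and the path are placeholders.
grpc:
  attempts: 3            # retry attempts
  attempts_sleep_time: 2 # seconds between attempts
vall-e:
  audio_path: /models/audio   # hypothetical path to reference audio
```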