
This repository quantizes and converts the Llama3-8B-Instruct model weights and deploys the model on Android for on-device inference.

Makefile 52 4 Updated May 25, 2024

Proxy that allows you to use ollama as a coding copilot, like GitHub Copilot

Go 305 22 Updated Sep 5, 2024

LLM frontend in a single HTML file

HTML 234 26 Updated Oct 6, 2024

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

TypeScript 2,692 319 Updated Aug 21, 2024

Universal LLM Deployment Engine with ML Compilation

Python 18,823 1,533 Updated Oct 5, 2024

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.

C++ 4,998 350 Updated Oct 7, 2024