  • NVIDIA
  • Toronto, Canada

Organizations

@nuxt-community @kubeflow @triton-inference-server


Generative AI extensions for onnxruntime

C++ 450 108 Updated Oct 4, 2024

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but completely free and 100% private.

TypeScript 2,926 153 Updated Sep 27, 2024

An autoregressive character-level language model for making more things

Python 2,502 658 Updated Jun 4, 2024

Neural Networks: Zero to Hero

Jupyter Notebook 11,645 1,457 Updated Aug 18, 2024

Package management made easy

Rust 3,040 168 Updated Oct 4, 2024

DSPy: The framework for programming—not prompting—foundation models

Python 17,495 1,334 Updated Oct 4, 2024

llama3.np is a pure NumPy implementation of the Llama 3 model.

Python 959 73 Updated Jun 2, 2024

A VSCode extension to generate development environments using micromamba and conda-forge package repository

TypeScript 84 10 Updated Sep 13, 2024

CUDA checkpoint and restore utility

Cuda 204 10 Updated Apr 17, 2024

A book about compiling Racket and Python to x86-64 assembly

TeX 1,293 141 Updated Sep 28, 2024

A Python framework for high performance GPU simulation and graphics

Python 4,154 232 Updated Oct 5, 2024

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,716 91 Updated Jan 21, 2024

Development repository for the Triton language and compiler

C++ 12,928 1,569 Updated Oct 5, 2024

Extending JAX with custom C++ and CUDA code

Python 372 21 Updated Aug 18, 2024

Enabling CPython multi-core parallelism via subinterpreters.

245 6 Updated Aug 19, 2022

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,079 838 Updated Jul 1, 2024

The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

11,397 2,108 Updated Sep 25, 2024

Fast and memory-efficient exact attention

Python 13,634 1,249 Updated Oct 4, 2024

MLX: An array framework for Apple silicon

C++ 16,627 953 Updated Oct 5, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,330 935 Updated Oct 1, 2024

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.

Python 719 50 Updated Sep 25, 2024

Utilities for using Python's PEP 554 subinterpreters

Python 110 6 Updated Oct 1, 2024

torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.

C++ 173 35 Updated Jun 20, 2024

High accuracy RAG for answering questions from scientific documents with citations

Python 6,017 563 Updated Oct 4, 2024

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

Jupyter Notebook 10,134 1,459 Updated Aug 8, 2024

Some notes on things I find interesting and important.

JavaScript 1,966 177 Updated Sep 11, 2024

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++ 1,534 158 Updated Sep 20, 2024

RAPIDS Memory Manager

C++ 478 195 Updated Oct 5, 2024

A hybrid thread/fiber task scheduler written in C++11

C++ 1,863 193 Updated Jul 12, 2024

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,291 186 Updated Feb 7, 2024