monologuer
Popular repositories

  1. flash-attention-minimal

    Forked from tspeterkim/flash-attention-minimal

    Flash Attention in ~100 lines of CUDA (forward pass only)

    Cuda

  2. LLM4Decompile

    Forked from albertan017/LLM4Decompile

    Reverse Engineering: Decompiling Binary Code with Large Language Models

    Python

  3. candle

    Forked from huggingface/candle

    Minimalist ML framework for Rust

    Rust

  4. tiny-gpu

    Forked from adam-maj/tiny-gpu

    A minimal GPU design in Verilog to learn how GPUs work from the ground up

    SystemVerilog

  5. NyuziProcessor

    Forked from jbush001/NyuziProcessor

    GPGPU microprocessor architecture

    C

  6. distributed-llama

    Forked from b4rtaz/distributed-llama

    Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.

    C++