Implementation of Autoregressive Diffusion in Pytorch

Python 180 1 Updated Jul 27, 2024

Fake speech detection with the CodecFake dataset

Python 3 Updated Jul 27, 2024

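Provides a practical interaction interface for LLMs such as GPT/GLM, specially optimized for paper reading, polishing, and writing. Modular design; supports custom shortcut buttons and function plugins; supports project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…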

Python 62,660 7,790 Updated Jul 24, 2024

Audio processing using PyTorch 1D convolution networks

Python 994 88 Updated Feb 13, 2024

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Python 10 Updated Jul 17, 2024

documentation for content creation

99 10 Updated Jul 26, 2024

A lightweight package for some common metrics used in speech

Python 2 Updated Jul 27, 2024

Utilities intended for use with Llama models.

Python 2,664 317 Updated Jul 27, 2024

AudioBench: A Universal Benchmark for Audio Large Language Models

Python 45 Updated Jul 22, 2024

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 3,453 280 Updated Jul 22, 2024

LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.

Python 39 Updated Jul 25, 2024

Implementation of rectified flow and some of its followup research / improvements in Pytorch

Python 107 2 Updated Jul 24, 2024

Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Python 29,833 3,443 Updated Jul 27, 2024

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

475 9 Updated Jul 22, 2024

Python 36 2 Updated Dec 19, 2023

Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.

Python 200 47 Updated May 23, 2023

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Python 1,131 146 Updated Jun 1, 2024

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

Python 853 41 Updated Jul 19, 2024

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,278 183 Updated Jul 20, 2024

This is the official implementation of the SEMamba paper.

Python 98 7 Updated Jul 20, 2024

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

Python 175 14 Updated Jul 27, 2024

Multilingual Voice Understanding Model

Python 1,670 158 Updated Jul 27, 2024

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 2,789 262 Updated Jul 25, 2024

Enjoy the magic of Diffusion models!

Python 6,003 536 Updated Jul 26, 2024

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Python 1,366 88 Updated Jul 26, 2024

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 133 8 Updated Apr 20, 2024

This repository contains the SpeechBrain Benchmarks

Python 77 33 Updated Jul 25, 2024

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 12,544 1,750 Updated Jun 27, 2024