Stars
Automated video uploads to social media: Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, Bilibili
Foundational Models for State-of-the-Art Speech and Text Translation
A version 1.1 of the Alexander Koch low cost robot arm with some small changes.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
API and WebSocket server for SenseVoice. It integrates some enhanced features, such as voice activity detection (VAD), real-time streaming recognition, and speaker verification.
Training YOLOv5/YOLOv9 to detect fire in a video
fire-smoke-detect-yolov4-yolov5 and fire-smoke-detection-dataset: fire detection, smoke detection
Clone a voice in 5 seconds to generate arbitrary speech in real-time
FinGLM: dedicated to building an open, public-interest, long-lasting financial large-model project that uses open source and openness to advance "AI + finance".
The online version is temporarily unavailable because we cannot afford the key. You can clone and run it locally. Note: we set a default OpenAI key. If the keys exceed the plan and become invalid, please tell us…
A modular graph-based Retrieval-Augmented Generation (RAG) system
GLM-4 series: Open Multilingual Multimodal Chat LMs
A generative speech model for daily dialogue.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Self-Supervised Speech Pre-training and Representation Learning Toolkit
A self-supervised learning framework for audio-visual speech
An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
Implementation of BEST-RQ, a model for self-supervised learning of speech signals using a random projection quantizer, in PyTorch.
A collection of resources on digital human including clothed people digitalization, virtual try-on, and other related directions.
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app in one click.
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
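🚀 KIMI AI long-context LLM reverse-engineered API for free-use testing [specialty: long-text interpretation and summarization]. Supports high-speed streaming output, agent conversations, web search, long-document interpretation, image OCR, and multi-turn dialogue, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces.
🔥🔥🔥 AI-driven database tool and SQL client. The hottest GUI client, supporting MySQL, Oracle, PostgreSQL, DB2, SQL Server, SQLite, H2, ClickHouse, and more.
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents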