Stars
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape'. They address monocular depth estimation and 3D scene reconstruction from a single image.
ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation
Accurate geometric camera calibration with generic camera models
This is an official implementation of our CVPR 2021 paper "Deep Dual Consecutive Network for Human Pose Estimation" (https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_Deep_Dual_Consecutive_…
OpenChat: Easy-to-use open-source chatting framework via neural networks
OV²SLAM is a Fully Online and Versatile Visual SLAM for Real-Time Applications
Production First and Production Ready End-to-End Speech Recognition Toolkit
Ubisoft La Forge - Animation Dataset
The official PyTorch implementation of img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation - CVPR 2021
Real-Time High-Resolution Background Matting
Recovers passwords from pixelized screenshots
A Trimap-Free Portrait Matting Solution in Real Time [AAAI 2022]
Code and data for our paper "High-Fidelity 3D Digital Human Creation from RGB-D Selfies".
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
A PyTorch implementation of the neural style transfer algorithm, modified for audio.
Audio style transfer with a shallow CNN using random parameters.
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.
Best Practices, code samples, and documentation for Computer Vision.
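Official TensorFlow implementation of the CVPR 2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”
Get the real streaming media addresses (live sources) and bullet comments (danmaku) of 58 live-streaming platforms, including Douyu, Huya, Bilibili, Douyin, and Kuaishou. The live sources can be played in players such as PotPlayer and flv.js.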
Deformable Style Transfer (ECCV 2020)
Code for our ICCC'20 paper - "Feel The Music: Automatically Generating A Dance For An Input Song"