Stars
Taming Transformers for High-Resolution Image Synthesis
A latent text-to-image diffusion model
🦜🔗 Build context-aware reasoning applications
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Lightweight collaborative Whiteboard / Sketchboard
The open-source city-building game for Game Boy Color.
Create LLM agents with long-term memory and custom tools 📚🦙
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Example script (supported) to help you integrate with our SaaS v1 API
Speech corpora for the speech recognition evaluation system
A neural network based StoryTeller that outputs a short story from an input image
Simple TensorFlow implementation of "Semantic Image Synthesis with Spatially-Adaptive Normalization" a.k.a. GauGAN, SPADE (CVPR 2019 Oral)
A mix of GAN implementations including progressive growing
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
A Python toolkit for parsing natural-language captions into scene graphs (symbolic representations).
PlaneRCNN detects and reconstructs piece-wise planar surfaces from a single RGB image
Deep Extreme Cut http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr
A PyTorch port of the Neural 3D Mesh Renderer
This codebase demonstrates how to synthesize realistic 3D character animations given an arbitrary speech signal and a static character mesh.
This program calculates the word error rate (WER) of an ASR hypothesis and prints the aligned result.
A free audio dataset of spoken digits. An audio version of MNIST.
Program to benchmark various speech recognition APIs