A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.

Python 389 79 Updated Sep 4, 2024

ksanjeevan / crnn-audio-classification

UrbanSound classification using Convolutional Recurrent Networks in PyTorch

Python 380 80 Updated Jun 16, 2021

marcogdepinto / emotion-classification-from-audio-files

Understanding emotions from audio files using neural networks and multiple datasets.

Python 406 133 Updated Jul 1, 2023

subhadarship / kmeans_pytorch

kmeans using PyTorch

Jupyter Notebook 472 76 Updated May 9, 2023

YuanGongND / cav-mae

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".

Python 224 22 Updated Mar 20, 2024

rawbeen248 / audio_classification_finetuning

This project focuses on the classification of animal sounds using deep learning. The core idea is to utilize audio processing techniques and a fine-tuned version of the hubert-base-ls960 model to a…

Python 6 1 Updated Mar 3, 2024

Meituan-AutoML / Twins

Two simple and effective vision transformer designs that are on par with the Swin Transformer

Python 578 69 Updated Feb 14, 2023

qwopqwop200 / MaxVIT-pytorch

Unofficial implementation of MaxViT (MaxViT: Multi-Axis Vision Transformer). https://arxiv.org/abs/2204.01697

Python 9 1 Updated Apr 20, 2022

ChristophReich1996 / MaxViT

PyTorch reimplementation of the paper "MaxViT: Multi-Axis Vision Transformer" [ECCV 2022].

Python 160 18 Updated Jul 12, 2023

lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Python 19,958 2,993 Updated Oct 4, 2024

google-research / maxvit

[ECCV 2022] Official repository for "MaxViT: Multi-Axis Vision Transformer". SOTA foundation models for classification, detection, segmentation, image quality, and generative modeling...

Jupyter Notebook 441 31 Updated Jun 2, 2023

whai362 / PVT

Official implementation of PVT series

Python 1,711 245 Updated Oct 27, 2022

open-mmlab / mmcv

OpenMMLab Computer Vision Foundation

Python 5,845 1,631 Updated Sep 26, 2024

ziplab / LIT

[AAAI 2022] This is the official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers"

Python 90 10 Updated Jun 19, 2022

leoxiaobin / CvT

Forked from microsoft/CvT

This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Python 223 37 Updated Jul 4, 2022
