High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC

Python 1,064 124 Updated Aug 3, 2023

kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,344 445 Updated Apr 29, 2024

CleanDiffuserTeam / CleanDiffuser

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Jupyter Notebook 314 28 Updated Sep 28, 2024

Sea-Snell / Implicit-Language-Q-Learning

Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"

Python 197 18 Updated Jul 31, 2023

BY571 / Implicit-Q-Learning

PyTorch implementation of the implicit Q-learning algorithm (IQL)

Python 41 3 Updated Dec 17, 2021

MathFoundationRL / Book-Mathematical-Foundation-of-Reinforcement-Learning

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 3,523 469 Updated Oct 1, 2024

PKUFlyingPig / cs-self-learning

A self-study guide to computer science

HTML 56,223 6,776 Updated Sep 13, 2024

philippe-eecs / IDQL

Repo for Implicit Diffusion Q-Learning

Python 86 11 Updated Dec 5, 2023

pablopunk / sick.vim

Sick colors for vim. Generated with

Vim Script 3 Updated Feb 14, 2022

tjuHaoXiaotian / ICML-2020-MSBCB

Code for the ICML 2020 paper "Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising"

Python 27 9 Updated Aug 12, 2020

labmlai / annotated_deep_learning_paper_implementations

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 54,394 5,617 Updated Aug 24, 2024

thumbtack / abba

JavaScript and Python libraries for A/B test analysis

JavaScript 246 53 Updated Aug 25, 2017

tangxyw / RecSysPapers

A collection of industry classics and cutting-edge papers in the field of recommendation/advertising/search.

Python 1,244 195 Updated Aug 30, 2024

thunlp / BMCourse

The repo for Tsinghua summer course: Interdisciplinary Seminar on Big Models

Python 318 66 Updated Jul 15, 2022

km1994 / LLMs_interview_notes

This repository mainly collects interview questions for large language model (LLM) algorithm engineers

1,399 99 Updated Mar 31, 2024

NeuronDance / DeepRL

Deep Reinforcement Learning Lab, a platform designed to make DRL technology accessible and fun for everyone

2,357 577 Updated Apr 11, 2022

instantX-research / InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 10,945 798 Updated Jul 18, 2024

vkgo / OCRAutoScore

Automated exam grading project based on OCR

Python 162 48 Updated Mar 20, 2024

ChuaCheowHuan / gym-continuousDoubleAuction

A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.

Jupyter Notebook 138 31 Updated Jan 1, 2023

amzn / auction-gym

AuctionGym is a simulation environment that enables reproducible evaluation of bandit and reinforcement learning methods for online advertising auctions.

Jupyter Notebook 143 37 Updated Sep 3, 2024

kf-liu / The-Art-of-Linear-Algebra-zh-CN

Forked from kenjihiranabe/The-Art-of-Linear-Algebra

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone", Chinese edition of The Art of Linear Algebra. PRs welcome.

PostScript 4,408 447 Updated Feb 4, 2024

stongan

Lists (1)

🚀 My stack

Stars

anuragajay / decision-diffuser

OpenRL-Lab / openrl

johnjim0816 / rl-tutorials

DLR-RM / rl-baselines3-zoo

sfujim / TD3_BC

young-geng / CQL

tinkoff-ai / CORL