Stars
Basic algorithms of reinforcement learning
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"
PyTorch implementation of the implicit Q-learning algorithm (IQL)
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Code of ICML-2020 paper Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
JavaScript and Python libraries for A/B test analysis
A collection of industry classics and cutting-edge papers in the field of recommendation/advertising/search.
The repo for Tsinghua summer course: Interdisciplinary Seminar on Big Models
Deep Reinforcement Learning Lab, a platform designed to make DRL technology accessible and fun for everyone
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.
AuctionGym is a simulation environment that enables reproducible evaluation of bandit and reinforcement learning methods for online advertising auctions.
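Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"; Chinese edition of "The Art of Linear Algebra". PRs welcome.
A collection of materials on mechanisms and strategy in computational advertising, covering research and application papers about strategy in Internet advertising.
Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; English version ongoing.
Curated learning materials for AI fields such as machine learning, NLP, image recognition, and deep learning, plus collected technical material on the architecture and algorithms of search, recommendation, and advertising systems. Includes a compilation of notes from leading algorithm experts.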