Unified Reinforcement Learning Framework

Python 631 61 Updated Sep 6, 2024

basic algorithms of reinforcement learning

Jupyter Notebook 192 53 Updated Aug 23, 2023

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

Python 2,003 510 Updated Aug 6, 2024

Author's PyTorch implementation of TD3+BC, a simple variant of TD3 for offline RL

Python 316 47 Updated Dec 18, 2021

Conservative Q Learning on top of SAC

Python 118 24 Updated Oct 15, 2022

High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC

Python 1,064 124 Updated Aug 3, 2023

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,344 445 Updated Apr 29, 2024

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Jupyter Notebook 314 28 Updated Sep 28, 2024

Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning"

Python 197 18 Updated Jul 31, 2023

PyTorch implementation of the implicit Q-learning algorithm (IQL)

Python 41 3 Updated Dec 17, 2021

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 3,523 469 Updated Oct 1, 2024

A self-study guide to computer science

HTML 56,223 6,776 Updated Sep 13, 2024

Repo for Implicit Diffusion Q-Learning

Python 86 11 Updated Dec 5, 2023

Sick colors for vim. Generated with

Vim Script 3 Updated Feb 14, 2022

Code of ICML-2020 paper Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising

Python 27 9 Updated Aug 12, 2020

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 54,394 5,617 Updated Aug 24, 2024

Javascript and Python libraries for A/B test analysis

JavaScript 246 53 Updated Aug 25, 2017

A collection of industry classics and cutting-edge papers in the field of recommendation/advertising/search.

Python 1,244 195 Updated Aug 30, 2024

The repo for Tsinghua summer course: Interdisciplinary Seminar on Big Models

Python 318 66 Updated Jul 15, 2022

This repository mainly collects interview questions for large language model (LLM) algorithm engineers

1,399 99 Updated Mar 31, 2024

Deep Reinforcement Learning Lab, a platform designed to make DRL technology accessible and fun for everyone

2,357 577 Updated Apr 11, 2022

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 10,945 798 Updated Jul 18, 2024

OCR-based automated exam grading project

Python 162 48 Updated Mar 20, 2024

A custom MARL (multi-agent reinforcement learning) environment where multiple agents trade against one another (self-play) in a zero-sum continuous double auction. Ray [RLlib] is used for training.

Jupyter Notebook 138 31 Updated Jan 1, 2023

AuctionGym is a simulation environment that enables reproducible evaluation of bandit and reinforcement learning methods for online advertising auctions.

Jupyter Notebook 143 37 Updated Sep 3, 2024

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone". Chinese edition of "The Art of Linear Algebra"; PRs welcome.

PostScript 4,408 447 Updated Feb 4, 2024

A collection of materials on mechanisms and strategy in computational advertising (research and application papers about strategy in Internet advertising).

136 15 Updated Feb 18, 2024

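"Hello 算法" (Hello Algo): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; the English version is ongoing.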

Java 96,055 12,181 Updated Sep 28, 2024

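A curated collection of learning materials for AI fields such as machine learning, NLP, image recognition, and deep learning, together with technical materials on the architecture and algorithms of search, recommendation, and advertising systems. A compilation of notes from algorithm experts.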

3,073 477 Updated Apr 15, 2024