schinger

wangxuguang schinger

Hi, I'm Xuguang Wang. Email: xugwang@microsoft.com

5 followers · 2 following

Pinned

pong_actor-critic Public

Trains an agent on Pong with stochastic policy gradients (actor-critic). Uses OpenAI Gym.

Python 12 2
FullLLM Public

Full-stack LLM (pre-training/fine-tuning, PPO (RLHF), inference, quantization, etc.)

Python 1 1
llama2.c Public

Forked from karpathy/llama2.c

Inference Llama 2 in one file of pure C

C
PPO-simplest Public

PPO in one file

Python