Port of Karpathy's Let's Build GPT tutorial to Candle #1525

jeroenvlek · 2024-01-05T10:47:02Z

This adds a link to a port of Karpathy's tutorial to Candle. I also carried all the comments from his notebook over into the code.

You will also see a struggle between following his variable naming and my own standards ;) I will probably make that more consistent soon.

It is meant as a complement to the other tutorial link and it showcases different sides of the Candle API applied to a toy example, especially for people trying to build a model from scratch (as opposed to loading pre-trained weights).

I'm sure not everything in the port is idiomatic Candle, so I'm happy to receive feedback! Just know that it works, and that I had to dive into the API quite often to find the corresponding functionality. (The only thing I couldn't find or work around was on-device multinomial sampling.)
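For readers wondering what a host-side fallback for the missing on-device multinomial sampling could look like, here is a minimal sketch using only the Rust standard library. It is not the code from the port. The function name `sample_multinomial` and the choice to pass the uniform draw `u` in as a parameter are assumptions made for illustration, so that the example stays deterministic and dependency-free.

```rust
/// Draws one index from a categorical distribution by inverse-CDF sampling.
///
/// `probs` holds non-negative weights (they do not have to sum to 1).
/// `u` is a uniform random number in [0, 1) supplied by the caller.
/// Returns `None` if `probs` is empty or has no positive total weight.
fn sample_multinomial(probs: &[f32], u: f32) -> Option<usize> {
    let total: f32 = probs.iter().sum();
    // `!(total > 0.0)` also rejects a NaN total.
    if probs.is_empty() || !(total > 0.0) {
        return None;
    }

    // Scale the uniform draw so the weights need not be normalized.
    let threshold = u * total;
    let mut cumulative = 0.0f32;
    for (index, &p) in probs.iter().enumerate() {
        cumulative += p;
        if threshold < cumulative {
            return Some(index);
        }
    }

    // Floating-point rounding can leave the threshold at or above the final
    // cumulative sum; fall back to the last index with positive weight.
    probs.iter().rposition(|&p| p > 0.0)
}
```

In a Candle program, one would presumably copy the softmax output for a single row to the host first (for example with `Tensor::to_vec1::<f32>()`), draw `u` from a random number generator of choice, and then call `sample_multinomial(&probs, u)`. The cost is a device-to-host transfer per sampled token, which is the price of not sampling on-device.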

LaurentMazare · 2024-01-05T10:59:43Z

That's a great addition, thanks!

add link to gpt-from-scratch-rs

6994af3

LaurentMazare approved these changes Jan 5, 2024

LaurentMazare merged commit 3a7304c into huggingface:main Jan 5, 2024
