Recreating-LLM

Llama 3 and GPT-2 Recreation

This repository contains code and resources for recreating the Llama 3 and GPT-2 models, inspired by the work of Andrej Karpathy, in particular the repositories and code from his Neural Networks: Zero To Hero video lecture series.

Repository Overview

The goal of this repository is a hands-on implementation of Llama 3 and GPT-2 that follows the methods and coding practices demonstrated by Andrej Karpathy. The code aims to recreate the models while staying easy to hack on, explore, and learn from.

Reference Repositories

nanoGPT-Lecture
- Code written during the Neural Networks: Zero To Hero video lecture series, specifically the first lecture on nanoGPT.
- GitHub Repo: nanoGPT model.py
- Note: Model initialization is crucial for good performance. The current code trains and works, but it converges more slowly because the weights start in a suboptimal region of weight space. Future updates may cover initialization in more detail.
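
To make the initialization note concrete, here is a minimal sketch of one common GPT-2-style scheme: most weights are drawn from a normal distribution with standard deviation 0.02, and the output projections that feed the residual stream are scaled down by 1/sqrt(2 * n_layer). Whether this is the exact scheme the note has in mind is an assumption. The helper names (`init_std`, `init_weights`, `sample_std`) and the `c_proj.weight` naming pattern are illustrative and are not taken from this repository's code. Plain Python is used so the sketch runs without any dependencies.

```python
import math
import random

BASE_STD = 0.02  # std commonly used for GPT-2-style weight matrices


def init_std(param_name: str, n_layer: int) -> float:
    """Return the init std for a parameter (hypothetical naming scheme).

    Parameters whose name ends in 'c_proj.weight' are treated as residual
    output projections and scaled by 1/sqrt(2 * n_layer); there are two
    residual additions per transformer block (attention and MLP).
    """
    if param_name.endswith("c_proj.weight"):
        return BASE_STD / math.sqrt(2 * n_layer)
    return BASE_STD


def init_weights(rows: int, cols: int, std: float, rng: random.Random):
    """Draw a rows x cols matrix from a zero-mean normal with the given std."""
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


def sample_std(matrix) -> float:
    """Empirical standard deviation of all entries in a matrix."""
    values = [v for row in matrix for v in row]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


if __name__ == "__main__":
    rng = random.Random(0)
    n_layer = 12  # assumed depth, for illustration only
    for name in ("attn.c_attn.weight", "attn.c_proj.weight", "mlp.c_proj.weight"):
        std = init_std(name, n_layer)
        weights = init_weights(64, 64, std, rng)
        print(f"{name}: target std {std:.5f}, sampled std {sample_std(weights):.5f}")
```

The scaling keeps the variance of the residual stream from growing with depth at the start of training, which is one reason a careful initialization can converge faster than a framework's defaults.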

License

This project is licensed under the MIT License.

Updates

The repository will be updated periodically as new insights and improvements are made available by the original author, or as time permits.

Feel free to explore, contribute, and learn from this project.
