
Vedantsahai18/Recreating-LLM

 
 


Recreating-LLM

Llama 3 and GPT-2 Recreation

This repository contains code and resources for recreating the Llama 3 and GPT-2 models, inspired by the work of Andrej Karpathy, specifically the repositories and code from his Neural Networks: Zero To Hero video lecture series.

Repository Overview

This repository provides a hands-on implementation of Llama 3 and GPT-2 that follows the methods and coding practices demonstrated by Andrej Karpathy. The goal is to recreate the models while keeping the code easy to hack on, explore, and learn from.

Reference Repositories

  1. nanoGPT-Lecture
    • Code created in the Neural Networks: Zero To Hero video lecture series, specifically in the first lecture on nanoGPT.
    • GitHub Repo: nanoGPT model.py
    • Note: Model initialization is crucial for good performance. The current code trains and works, but it converges more slowly because it starts from a suboptimal region of weight space. Future updates may cover initialization in more detail.
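
To make the initialization note above concrete, here is a minimal sketch of one common GPT-2-style scheme: weights are drawn from a normal distribution with a small standard deviation, and the output projections that feed the residual stream are scaled down further by the square root of twice the layer count. This is an illustration under stated assumptions, not the code in this repository. The helper names `init_std` and `init_matrix`, the base value of 0.02, and the `c_proj.weight` naming convention are all assumptions, and plain Python lists stand in for tensors so the example runs without extra dependencies.

```python
import math
import random


def init_std(param_name: str, n_layer: int, base_std: float = 0.02) -> float:
    """Return the standard deviation to use when initializing a parameter.

    Hypothetical helper: assumes residual output projections are the
    parameters whose names end in "c_proj.weight".
    """
    if param_name.endswith("c_proj.weight"):
        # Each block adds two contributions to the residual stream
        # (attention and MLP), so scale by 1 / sqrt(2 * n_layer) to keep
        # the accumulated variance from growing with depth.
        return base_std / math.sqrt(2 * n_layer)
    return base_std


def init_matrix(rows: int, cols: int, std: float, rng: random.Random):
    """Build a rows x cols matrix of zero-mean Gaussian samples."""
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n_layer = 12  # assumed depth, chosen only for illustration
    for name in ("wte.weight", "h.0.attn.c_proj.weight"):
        std = init_std(name, n_layer)
        weights = init_matrix(64, 64, std, rng)
        flat = [x for row in weights for x in row]
        measured = math.sqrt(sum(x * x for x in flat) / len(flat))
        print(f"{name}: target std={std:.5f}, measured={measured:.5f}")
```

In a PyTorch model the same idea would typically be applied by iterating over the model's named parameters and re-initializing each one in place, with biases set to zero.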

License

This project is licensed under the MIT License.

Updates

The repository will be updated periodically as new insights and improvements are made available by the original author, or as time permits.

Feel free to explore, contribute to, and learn from this project.
