Go from no deep learning knowledge to implementing GPT.

Zero to GPT

This course will get you from no knowledge of deep learning to training a GPT model. We'll start with the basics, then build up to complex networks.

To use this course, work through each chapter in order. Read the lessons or watch the optional videos, then study the implementations to solidify your understanding. I also recommend implementing each algorithm on your own.

Course Outline

0. Introduction

Get an overview of the course and what we'll learn. Includes some math and NumPy fundamentals you'll need for deep learning.
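The NumPy fundamentals the course relies on boil down to whole-array operations. A minimal sketch of the three you'll see constantly (elementwise arithmetic with broadcasting, matrix multiplication, and axis reductions):

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([10.0, 20.0])

row_shifted = a + b        # broadcasting: b is added to each row of a
product = a @ a            # matrix multiplication
col_means = a.mean(axis=0) # reduce down each column: [2., 3.]
```

Broadcasting and `axis` arguments are worth internalizing early; nearly every network implementation later in the course is built from these three operations.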

1. Gradient Descent

Gradient descent is how neural networks train their parameters to match the data. It's the "learning" part of deep learning.
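The core loop can be sketched in a few lines of NumPy: compute predictions, compute the gradient of the loss with respect to the parameter, and step the parameter against the gradient. This toy example (fitting a single weight with mean squared error; the data and learning rate are illustrative) shows the idea:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x                  # data generated with a true weight of 2

w = 0.0                      # initial guess
lr = 0.1                     # learning rate
for _ in range(100):
    pred = w * x
    grad = 2 * np.mean((pred - y) * x)  # d(MSE)/dw
    w -= lr * grad                      # step against the gradient
```

After enough steps, `w` converges to the true weight. Real networks do exactly this, just with many parameters at once.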

2. Dense networks

Dense networks are the basic form of a neural network, where every input is connected to an output. These can also be called fully connected networks.
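"Every input is connected to an output" means each layer is a matrix multiply plus a bias. A minimal sketch of one dense layer's forward pass (shapes chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, W, b):
    # every input feature contributes to every output via W
    return x @ W + b

x = rng.normal(size=(4, 3))   # batch of 4 examples, 3 features each
W = rng.normal(size=(3, 2))   # weights: 3 inputs -> 2 outputs
b = np.zeros(2)
out = dense(x, W, b)          # shape (4, 2)
```

Stacking several of these layers with a nonlinearity between them gives a full fully connected network.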

3. Classification with neural networks

In the last two lessons, we learned how to perform regression with neural networks. Now, we'll learn how to perform classification.
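The main change from regression is the output layer: a softmax turns raw network outputs into class probabilities, and cross-entropy scores them. A minimal sketch (the logits are made up for illustration):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # raw network outputs for 3 classes
probs = softmax(logits)             # probabilities that sum to 1
loss = -np.log(probs[0])            # cross-entropy if the true class is 0
```

The loss is small when the network puts high probability on the correct class, and large otherwise.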

4. Recurrent networks

Recurrent neural networks can process sequences of data. They're used for time series and natural language processing.

  • Lesson: Read the recurrent network tutorial (coming soon)
  • Implementation: Notebook
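The defining idea is a hidden state carried forward through the sequence, with the same weights reused at every timestep. A minimal forward-pass sketch (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

W_x = rng.normal(size=(3, 5)) * 0.1  # input -> hidden
W_h = rng.normal(size=(5, 5)) * 0.1  # hidden -> hidden (the recurrence)
b = np.zeros(5)

sequence = rng.normal(size=(7, 3))   # 7 timesteps, 3 features each
h = np.zeros(5)                      # initial hidden state
for x_t in sequence:
    # each step mixes the new input with the previous hidden state
    h = np.tanh(x_t @ W_x + h @ W_h + b)
```

After the loop, `h` summarizes the whole sequence, which is what makes RNNs useful for time series and text.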

5. Regularization

Regularization prevents overfitting to the training set. This means that the network can generalize well to new data.

  • Lesson: Read the regularization tutorial (coming soon)
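One common form is L2 regularization: a penalty on large weights added to the loss, which discourages the network from fitting noise. A minimal sketch (the data loss value and weights are made up for illustration):

```python
import numpy as np

def l2_penalty(weights, lam):
    # lam controls how strongly large weights are punished
    return lam * np.sum(weights ** 2)

w = np.array([0.5, -2.0, 1.0])
data_loss = 1.3                          # hypothetical loss on the data
total_loss = data_loss + l2_penalty(w, lam=0.01)
```

Because the penalty's gradient flows back through training like any other loss term, it nudges every weight toward zero unless the data justifies it.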

6. PyTorch

PyTorch is a framework for deep learning that automates the backward pass of neural networks. This makes it simpler to implement complex networks.

  • Lesson: Read the PyTorch tutorial (coming soon)
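"Automates the backward pass" means you write only the forward computation and call `.backward()`; PyTorch computes every gradient for you. A minimal sketch:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x   # forward pass: y = x^2 + 3x
y.backward()         # PyTorch derives dy/dx automatically
# x.grad now holds dy/dx = 2x + 3 = 7 at x = 2
```

Compare this with the earlier chapters, where every gradient had to be derived and coded by hand.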

7. Gated recurrent networks

Gated recurrent networks let RNNs process long sequences by learning to forget irrelevant information. LSTM and GRU are two popular types of gated networks.

  • Lesson: Read the GRU tutorial (coming soon)
  • Implementation: Notebook
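A single GRU step makes the gating idea concrete: an update gate decides how much of the old hidden state to keep, and a reset gate decides how much of it feeds the new candidate state. A minimal sketch with illustrative sizes (bias terms omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gru_step(x, h, Wz, Wr, Wh):
    xh = np.concatenate([x, h])
    z = sigmoid(xh @ Wz)                              # update gate
    r = sigmoid(xh @ Wr)                              # reset gate
    cand = np.tanh(np.concatenate([x, r * h]) @ Wh)   # candidate state
    return (1 - z) * h + z * cand                     # blend old and new

rng = np.random.default_rng(0)
x, h = rng.normal(size=3), np.zeros(4)
Wz, Wr, Wh = (rng.normal(size=(7, 4)) * 0.1 for _ in range(3))
h = gru_step(x, h, Wz, Wr, Wh)
```

When the update gate is near zero, the old state passes through almost unchanged, which is what lets gradients survive across long sequences.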

8. Encoder/Decoder RNNs

Encoder/decoders are used for NLP tasks where the output isn't the same length as the input. For example, if you want to use questions/answers as training data, the answers may be a different length than the questions.

  • Lesson: Read the encoder/decoder tutorial (coming soon)
  • Implementation: Notebook
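The structure can be sketched with two plain RNNs: the encoder compresses the input into a fixed-size context vector, and the decoder unrolls from that context for a different number of steps. All sizes and lengths here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Encoder: run an RNN over the input, keep only the final hidden state.
W_xe = rng.normal(size=(3, 4)) * 0.1
W_he = rng.normal(size=(4, 4)) * 0.1
question = rng.normal(size=(5, 3))         # input sequence: 5 steps
h = np.zeros(4)
for x_t in question:
    h = np.tanh(x_t @ W_xe + h @ W_he)
context = h                                # fixed-size summary of the input

# Decoder: start from the context and unroll for a *different* length.
W_hd = rng.normal(size=(4, 4)) * 0.1
W_out = rng.normal(size=(4, 3)) * 0.1
h = context
outputs = []
for _ in range(8):                         # output length independent of input
    h = np.tanh(h @ W_hd)
    outputs.append(h @ W_out)
outputs = np.stack(outputs)                # shape (8, 3)
```

Note the input had 5 steps and the output has 8; nothing ties the two lengths together, which is exactly the point.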

9. Transformers

Transformers avoid the vanishing/exploding gradient problems of RNNs by replacing recurrence with attention. Attention lets the network look at the whole sequence at once, instead of processing it one step at a time.

  • Lesson: Read the transformer tutorial (coming soon)
  • Implementation: Notebook
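The heart of the transformer is scaled dot-product attention, which is just a few matrix operations: score every query against every key, softmax the scores, and take a weighted sum of the values. A minimal self-attention sketch (single head, no masking, illustrative sizes):

```python
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # similarity of each query to each key
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V              # weighted sum of values

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))         # 6 tokens, 8-dim embeddings
out = attention(x, x, x)            # self-attention: shape (6, 8)
```

Every token attends to every other token in one pass; there is no loop over timesteps, which is why the gradient problems of RNNs don't arise.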

More Chapters Coming Soon

Optional Chapters

Convolutional networks

Convolutional neural networks are used for working with images and time series.

  • Lesson: Read the convolutional network tutorial (coming soon)
  • Implementation: Notebook and class
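The defining operation is sliding one small filter across the input so the same weights are reused at every position. A minimal 1-D convolution sketch (the signal and filter are illustrative):

```python
import numpy as np

def conv1d(x, kernel):
    # slide the kernel across x, taking a dot product at each position
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

signal = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
edge_filter = np.array([-1.0, 1.0])   # responds to increases in the signal
out = conv1d(signal, edge_filter)     # [1., 1., 1., 1.]
```

Image convolutions apply the same idea in two dimensions; weight sharing is what makes these networks efficient on large inputs.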

Installation

If you want to run these notebooks locally, you'll need to install some Python packages.

  • Make sure you have Python 3.8 or higher installed.
  • Clone this repository.
  • Run `pip install -r requirements.txt`
