In this project, we'll predict tomorrow's temperature using historical data. We'll start by downloading a dataset of local weather, which you can customize to your own location. Then we'll clean the data, get it ready for machine learning, and build a system to make historical predictions. Next, we'll add more predictors to improve the model. We'll end with how to make next-day predictions.
Project Steps
- Download weather data
- Clean and graph data
- Create a testing framework
- Improve model accuracy
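To make the goal concrete, here is a minimal sketch of how "tomorrow's temperature" could be set up as a prediction target. The column names `tmax` and `target` and the temperature values are made up for illustration; they are not taken from the NOAA data or from the project notebook.

```python
import pandas as pd

# Made-up daily maximum temperatures, for illustration only (not NOAA data).
weather = pd.DataFrame(
    {"tmax": [60.0, 62.0, 59.0, 65.0, 63.0]},
    index=pd.date_range("2022-01-01", periods=5, freq="D"),
)

# Each row's target is the next day's maximum temperature.
# shift(-1) moves every value up by one row.
weather["target"] = weather["tmax"].shift(-1)

# The final day has no "tomorrow" yet, so its target is missing; drop it.
weather = weather.dropna(subset=["target"])

print(weather)
```

Each remaining row pairs one day's observations with the following day's temperature, which is the shape a supervised model needs.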
You can find the code for this project here
File overview:
predict.ipynb
- predict the temperature
To complete this project, you'll need to have a good understanding of:
- Python syntax, including functions, if statements, and data structures
- Data cleaning
- Pandas syntax
- Using Jupyter Notebook
You'll also need to know the basics of machine learning.
Please make sure you've completed these Dataquest courses (or know the material) before trying this project:
- Python Introduction
- For Loops and If Statements
- Dictionaries In Python
- Functions and Jupyter Notebook
- Python Intermediate
- Pandas and NumPy Fundamentals
- Data Cleaning
- Machine Learning Fundamentals
To follow this project, please install the following locally:
- JupyterLab
- Python 3.8+
- Python packages
  - pandas
  - scikit-learn
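If you'd like to confirm your setup before starting, a quick check along these lines may help. This is a sketch, not part of the project code; the helper names `missing_packages` and `python_ok` are hypothetical. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
import sys

# pip name -> import name for the packages listed above.
REQUIRED = {"pandas": "pandas", "scikit-learn": "sklearn"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages that cannot be found for import."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


def python_ok(min_version=(3, 8)):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info >= min_version


print("Python 3.8+:", python_ok())
print("Missing packages:", missing_packages() or "none")
```

Any package reported as missing can be installed with pip or conda, depending on how you manage your environment.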
We'll download our dataset from NOAA, a US government agency. You can follow these instructions to download the data:
- Go to NOAA Search
- Enter the years you want data for (I recommend starting with 1970), and search for the closest airport to you.
- Click add to cart on the airport you want.
- Go to the cart.
- Select the CSV format and click continue.
- Select all of the checkboxes for data types.
- Enter your email and click continue.
- You'll get an email with a link to download the data.
- Make sure to take a look at the data documentation as well.
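Once the file arrives, a first look at it might go something like the sketch below. The column names (`STATION`, `DATE`, `TMAX`, `TMIN`), the station ID, and the values are assumptions for illustration; check them against your own file and the data documentation. The helper `load_weather` is hypothetical, and a small in-memory sample stands in for the downloaded CSV so the example runs on its own.

```python
import io

import pandas as pd


def load_weather(path_or_buffer, date_column="DATE"):
    """Read a weather CSV and index it by date (column name is an assumption)."""
    return pd.read_csv(path_or_buffer, index_col=date_column, parse_dates=True)


# Stand-in for the downloaded file; replace with the path to your own CSV.
sample_csv = io.StringIO(
    "STATION,DATE,TMAX,TMIN\n"
    "HYPOTHETICAL01,1970-01-01,55,40\n"
    "HYPOTHETICAL01,1970-01-02,57,42\n"
    "HYPOTHETICAL01,1970-01-03,,41\n"
)

weather = load_weather(sample_csv)

# Fraction of missing values per column, a useful first check before cleaning.
missing_fraction = weather.isna().mean()
print(missing_fraction)
```

Indexing by date up front makes it easier to slice by year and to line up each day with the next one later on.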