Linear Regression
- Goal: develop a secondary school student performance prediction tool that identifies the variables affecting educational success and failure in secondary education. It is based on real-world data, collected from school reports and questionnaires, on two core subjects, mathematics and Portuguese, which provide fundamental knowledge for success in the remaining subjects.
- Tech stack: Python 3.7, numpy, pandas, matplotlib, sklearn, seaborn.
- I developed a supervised linear regression model that predicts a numerical grade, which is then mapped to one of five levels based on the Erasmus grade conversion system: I (excellent/very good), II (good), III (satisfactory), IV (sufficient) and V (fail).
- The Mathematics class model had a variance score of 0.03548 and a root mean squared error (RMSE) of 1.887629.
- The Portuguese class model had a variance score of 0.09151 and a root mean squared error (RMSE) of 1.803660.
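The mapping from a numerical prediction to the five levels could be sketched as below. This is only an illustration: the function name `to_erasmus_band` is hypothetical, and the cut-offs (16-20, 14-15, 12-13, 10-11, 0-9) are an assumption based on the 0-20 Portuguese grading scale, not thresholds stated in this project.

```python
import math


def to_erasmus_band(grade: float) -> str:
    """Map a predicted grade on a 0-20 scale to a level from I to V.

    The thresholds are assumed, not taken from the project itself.
    """
    # Clamp to the valid range, then round half up to a whole grade.
    clamped = min(max(grade, 0.0), 20.0)
    whole = math.floor(clamped + 0.5)

    if whole >= 16:
        return "I"    # excellent / very good
    if whole >= 14:
        return "II"   # good
    if whole >= 12:
        return "III"  # satisfactory
    if whole >= 10:
        return "IV"   # sufficient
    return "V"        # fail


if __name__ == "__main__":
    for predicted in (17.2, 14.0, 13.4, 10.0, 9.4):
        print(predicted, "->", to_erasmus_band(predicted))
```

Predictions are clamped first because a linear regression model can return values outside the 0-20 range.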
❖ There are 33 features/attributes before data preprocessing for both Mathematics and Portuguese classes. The Mathematics class has 395 records while the Portuguese class has 649 records.
STEPS
- Import libraries
- Fetch data and load both mathematics and Portuguese CSV files into separate pandas data frames
- Data Wrangling: Transform and Analyse the data.
- Determine the target variable and select the explanatory variables
- Split the data into training and testing sets
- Build the Linear Regression model
- Run Predictions
- Evaluate the Model
- Visualise results
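
The steps above can be sketched roughly as follows. This is a minimal sketch rather than the project's actual code: it generates synthetic data instead of loading the real CSV files, and the column names (`studytime`, `failures`, `absences`, `schoolsup`), the target name `G3`, the 80/20 split and the random seeds are all assumptions made for illustration. The visualisation step is left out.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import explained_variance_score, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one class's data frame (assumed columns, not real data).
rng = np.random.default_rng(0)
n_records = 395
df = pd.DataFrame({
    "studytime": rng.integers(1, 5, n_records),
    "failures": rng.integers(0, 4, n_records),
    "absences": rng.integers(0, 30, n_records),
    "schoolsup": rng.choice(["yes", "no"], n_records),
})
# Assumed target: a final grade on a 0-20 scale with some noise.
raw_grade = 11 + df["studytime"] - 1.5 * df["failures"] + rng.normal(0, 3, n_records)
df["G3"] = np.clip(np.round(raw_grade), 0, 20)

# Data wrangling: one-hot encode categorical columns as floats.
X = pd.get_dummies(df.drop(columns="G3"), drop_first=True, dtype=float)
y = df["G3"]

# Training and testing split (80/20 is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Build the model and run predictions.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate: RMSE and explained variance score.
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
variance_score = float(explained_variance_score(y_test, y_pred))

print(f"RMSE: {rmse:.4f}")
print(f"Variance score: {variance_score:.4f}")
```

To run this against the real data, the synthetic frame would be replaced by the data frames loaded from the mathematics and Portuguese CSV files, with one model fitted per class.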