Udacity-Data-Scientist-Nanodegree

This repository contains my projects for Udacity's Data Scientist Nanodegree.

Project 1: Write a Data Science Blog Post

For this project I conducted exploratory data analysis on a Wine Review dataset from Kaggle, which contains approximately 130k reviews from Wine Enthusiast. I wanted to explore the data and communicate my findings in a blog post on Medium that gives the reader insight into the questions posed.
Link to notebook
Link to Medium blog post

Project 2: Disaster Response Pipeline

I applied my data engineering skills to analyze disaster data from Figure Eight and build a model for an API that classifies disaster messages. I created a machine learning pipeline to categorize real messages sent during disaster events so that each message could be routed to an appropriate disaster relief agency. The project includes a web app where an emergency worker can input a new message and get classification results in several categories. The web app also displays visualizations of the data.
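To illustrate what "classification results in several categories" means, here is a minimal, dependency-free sketch of multi-label message classification. It trains one binary naive Bayes classifier per category; that choice is an assumption made for illustration, since this README does not say which model the project uses. The messages, category names, and the `OneVsRestNaiveBayes` class are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class OneVsRestNaiveBayes:
    """One binary multinomial naive Bayes model per category (illustrative only)."""

    def fit(self, messages, label_sets):
        self.categories = sorted(set().union(*label_sets))
        self.vocab = {tok for msg in messages for tok in tokenize(msg)}
        self.models = {}
        for cat in self.categories:
            # Token counts and document counts for messages with/without the label
            counts = {True: Counter(), False: Counter()}
            docs = {True: 0, False: 0}
            for msg, labels in zip(messages, label_sets):
                side = cat in labels
                counts[side].update(tokenize(msg))
                docs[side] += 1
            self.models[cat] = (counts, docs)
        return self

    def _log_score(self, cat, side, tokens):
        counts, docs = self.models[cat]
        total = sum(counts[side].values())
        # Laplace-smoothed prior and per-token likelihoods
        score = math.log((docs[side] + 1) / (docs[True] + docs[False] + 2))
        for tok in tokens:
            score += math.log((counts[side][tok] + 1) / (total + len(self.vocab)))
        return score

    def predict(self, message):
        # Tokens never seen in training carry no information, so skip them
        tokens = [t for t in tokenize(message) if t in self.vocab]
        return {
            cat: int(self._log_score(cat, True, tokens)
                     > self._log_score(cat, False, tokens))
            for cat in self.categories
        }


# Hypothetical training data: each message can belong to several categories
messages = [
    "we need water and food",
    "need clean drinking water",
    "people are hungry send food",
    "roads are blocked by the storm",
]
labels = [{"water", "food"}, {"water"}, {"food"}, set()]

model = OneVsRestNaiveBayes().fit(messages, labels)
print(model.predict("drinking water"))
print(model.predict("hungry people"))
```

Each category gets its own yes/no decision, so a single message can be flagged for several relief agencies at once or for none.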

Project 3: Recommendations with IBM

I analysed the interactions that users have with articles on the IBM Watson Studio platform and made recommendations to them about new articles I thought they'd like. I performed EDA, rank-based recommendations, user-user collaborative filtering, and matrix factorisation.
Link to notebook
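As a rough illustration of the first two approaches named above, the sketch below implements rank-based recommendations (most-interacted articles) and a simple user-user collaborative filter that measures similarity by the number of shared articles. The interaction data, the function names, and the similarity measure are assumptions for the example, not taken from the notebook. Matrix factorisation is left out to keep the example short.

```python
from collections import Counter, defaultdict

# Hypothetical interaction log: (user_id, article_id) pairs
interactions = [
    ("u1", "a1"), ("u1", "a2"),
    ("u2", "a1"), ("u2", "a2"), ("u2", "a3"),
    ("u3", "a1"), ("u3", "a4"),
    ("u4", "a2"), ("u4", "a3"),
]


def top_articles(interactions, n):
    """Rank-based: return the n articles with the most interactions."""
    counts = Counter(article for _, article in interactions)
    # Sort by count (descending), then by id so ties are deterministic
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    return ranked[:n]


def user_user_recs(interactions, user, n):
    """User-user: recommend unseen articles read by the most similar users."""
    seen = defaultdict(set)
    for u, article in interactions:
        seen[u].add(article)

    # Similarity = number of articles shared with the target user
    neighbours = sorted(
        (u for u in seen if u != user),
        key=lambda u: (-len(seen[u] & seen[user]), u),
    )

    recs = []
    for other in neighbours:
        for article in sorted(seen[other] - seen[user]):
            if article not in recs:
                recs.append(article)
            if len(recs) == n:
                return recs
    return recs


print(top_articles(interactions, 2))
print(user_user_recs(interactions, "u1", 2))
```

Rank-based recommendations need no history for the target user, which makes them a natural fallback for new users, while the user-user approach personalises results once some interactions exist.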

Project 4: Predicting Customer Churn for a Music Streaming Service

I used PySpark to predict customer churn for a music streaming service. The project involved:

Loading and cleaning a small subset (128 MB) of the full 12 GB dataset
Conducting Exploratory Data Analysis to understand the data and which features are useful for predicting churn
Feature Engineering to create features that will be used in the modelling process
Modelling using machine learning algorithms such as Logistic Regression, Random Forest, Gradient Boosted Trees, Linear SVM, Naive Bayes
Link to notebook
Link to blog post
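The feature engineering step listed above can be sketched as a per-user aggregation over an event log. The example uses plain Python rather than PySpark so that it runs without a Spark session, and everything specific in it is an assumption: the event schema, the page names, and the rule that a user counts as churned once a "Cancellation Confirmation" event appears. None of these details come from this README.

```python
from collections import defaultdict

# Assumed churn marker; the README does not define how churn is labelled
CHURN_PAGE = "Cancellation Confirmation"

# Hypothetical event log: one record per user action
events = [
    {"user": "u1", "page": "NextSong"},
    {"user": "u1", "page": "NextSong"},
    {"user": "u1", "page": "Thumbs Down"},
    {"user": "u1", "page": CHURN_PAGE},
    {"user": "u2", "page": "NextSong"},
    {"user": "u2", "page": "Home"},
]


def build_user_features(events):
    """Aggregate raw events into one feature row per user."""
    features = defaultdict(lambda: {"songs": 0, "thumbs_down": 0, "churned": 0})
    for event in events:
        row = features[event["user"]]  # creates the row on first sight of a user
        if event["page"] == "NextSong":
            row["songs"] += 1
        elif event["page"] == "Thumbs Down":
            row["thumbs_down"] += 1
        elif event["page"] == CHURN_PAGE:
            row["churned"] = 1  # label used as the prediction target
    return dict(features)


user_features = build_user_features(events)
for user, row in sorted(user_features.items()):
    print(user, row)
```

In a Spark version the same idea would typically be expressed as a group-by on the user column with one aggregate per feature, and the resulting table would feed the classifiers listed above.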


Data Scientist Nanodegree Certificate


DsGhostPos3idon/Udacity-Data-Scientist-Nanodegree
