
Identifying customer segments from provided dataset using clustering algorithm. Part of Udacity's Machine Learning curriculum


beks-m/ML-Customer-segments-identifier

 
 


Content: Unsupervised Learning

Project: Creating Customer Segments

In this project, we apply unsupervised learning techniques to product spending data collected from customers of a wholesale distributor in Lisbon, Portugal, to identify customer segments hidden in the data.

Things to learn by completing this project:

  • How to apply preprocessing techniques such as feature scaling and outlier detection.
  • How to interpret data points that have been scaled, transformed, or reduced with PCA.
  • How to analyze PCA dimensions and construct a new feature space.
  • How to optimally cluster a set of data to find hidden patterns in a dataset.
  • How to assess information given by cluster data and use it in a meaningful way.
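As a rough illustration of the preprocessing step named in the list above, the sketch below log-scales one spending column and flags outliers with Tukey's rule (values more than 1.5 times the interquartile range outside the quartiles). It is a dependency-free sketch, not the notebook's code: the helper names `percentile` and `tukey_outliers` are hypothetical, and the spending figures are made up for illustration.

```python
import math


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def tukey_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    ordered = sorted(values)
    q1 = percentile(ordered, 25)
    q3 = percentile(ordered, 75)
    step = k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < q1 - step or v > q3 + step]


# Made-up annual spending figures (m.u.) for a single category; NOT dataset values.
spending = [120, 150, 180, 200, 220, 260, 300, 90000]

# Log scaling compresses the long right tail typical of spending data.
log_spending = [math.log(v) for v in spending]

# Indices of the customers flagged as outliers in this one feature.
outlier_idx = tukey_outliers(log_spending)
print(outlier_idx)
```

In a real analysis the same check would be run per feature, and a customer flagged in several features is a stronger candidate for removal than one flagged in a single feature.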

Part of Udacity's Machine Learning Nanodegree

The project resides in customer_segments.ipynb

Environment

This project runs on Python 3.5 and uses the following libraries:

Data

The customer segments data is included as a selection of 440 data points collected from clients of a wholesale distributor in Lisbon, Portugal. More information can be found on the UCI Machine Learning Repository.

Note: (m.u.) is shorthand for monetary units.

Features

  1. Fresh: annual spending (m.u.) on fresh products (Continuous);
  2. Milk: annual spending (m.u.) on milk products (Continuous);
  3. Grocery: annual spending (m.u.) on grocery products (Continuous);
  4. Frozen: annual spending (m.u.) on frozen products (Continuous);
  5. Detergents_Paper: annual spending (m.u.) on detergents and paper products (Continuous);
  6. Delicatessen: annual spending (m.u.) on delicatessen products (Continuous);
  7. Channel: {Hotel/Restaurant/Cafe - 1, Retail - 2} (Nominal)
  8. Region: {Lisbon - 1, Oporto - 2, or Other - 3} (Nominal)
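To make the feature list above concrete, here is a minimal sketch of reading records with these columns and decoding the two nominal fields into their labels. It assumes the data is stored as CSV with a header row using the feature names listed above; the `load_customers` helper, the column order, and the two sample rows are hypothetical and made up for illustration.

```python
import csv
import io

# Continuous features: annual spending in monetary units (m.u.).
SPENDING_COLUMNS = [
    "Fresh", "Milk", "Grocery", "Frozen", "Detergents_Paper", "Delicatessen",
]

# Nominal features and their codes, as given in the feature list.
CHANNEL_LABELS = {1: "Hotel/Restaurant/Cafe", 2: "Retail"}
REGION_LABELS = {1: "Lisbon", 2: "Oporto", 3: "Other"}

# Two made-up rows for illustration only; NOT taken from the dataset.
SAMPLE_CSV = """Channel,Region,Fresh,Milk,Grocery,Frozen,Detergents_Paper,Delicatessen
1,3,1000,200,300,400,50,60
2,1,500,900,1200,100,700,80
"""


def load_customers(handle):
    """Parse CSV rows into dicts, separating nominal from spending features."""
    customers = []
    for row in csv.DictReader(handle):
        customers.append({
            "channel": CHANNEL_LABELS[int(row["Channel"])],
            "region": REGION_LABELS[int(row["Region"])],
            "spending": {name: float(row[name]) for name in SPENDING_COLUMNS},
        })
    return customers


customers = load_customers(io.StringIO(SAMPLE_CSV))
print(customers[0]["channel"], customers[0]["spending"]["Fresh"])
```

Keeping the nominal fields apart from the spending values reflects the fact that only the six continuous features are on a scale where distance-based clustering is meaningful.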
