Clustering

Our goal is to segment our user base so that we get a high-level understanding of the different types of users we have.
This gives us manageable segments to target when marketing to our users.

Steps:

  1. Attribute Selection:
    Select which attributes to use for segmenting our user base.

  2. Dataset Creation:
    Join and wrangle the flamingo-data tables to obtain the attributes chosen in the previous step.

  3. Clustering:
    Segment the created dataset into 3 clusters.
    Note: All values were normalised/scaled before clustering so that the results are easier to interpret.

  4. Conclusion:
    Provide recommendations to increase revenue.
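
As a rough illustration of step 3, the sketch below scales a small dataset and splits it into 3 clusters with Spark MLlib. It is not taken from the notebook: the choice of k-means and of standard (zero-mean, unit-variance) scaling is an assumption, since the steps above only say that values were normalised/scaled and clustered. The column names `attributeA` and `attributeB` and the sample rows are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.{VectorAssembler, StandardScaler}
import org.apache.spark.ml.clustering.KMeans

// getOrCreate() reuses an existing session if the kernel already provides one.
val spark = SparkSession.builder()
  .appName("clustering-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the dataset built in step 2 (illustrative values only).
val users = Seq(
  (1, 1.0, 10.0), (2, 1.5, 12.0), (3, 2.0, 11.0),
  (4, 20.0, 100.0), (5, 22.0, 95.0), (6, 21.0, 105.0),
  (7, 50.0, 10.0), (8, 52.0, 12.0), (9, 51.0, 9.0)
).toDF("userId", "attributeA", "attributeB")

// Pack the selected attributes into a single vector column, as MLlib expects.
val assembled = new VectorAssembler()
  .setInputCols(Array("attributeA", "attributeB"))
  .setOutputCol("rawFeatures")
  .transform(users)

// Assumption: standard scaling, so that no attribute dominates the distance measure.
val scaler = new StandardScaler()
  .setInputCol("rawFeatures")
  .setOutputCol("features")
  .setWithMean(true)
  .setWithStd(true)
val scaled = scaler.fit(assembled).transform(assembled)

// Assumption: k-means with k = 3; a fixed seed keeps repeated runs comparable.
val kmeans = new KMeans()
  .setK(3)
  .setSeed(1L)
  .setFeaturesCol("features")
  .setPredictionCol("cluster")
val model = kmeans.fit(scaled)
val clustered = model.transform(scaled)

// Cluster sizes and centres (centres are in scaled units, not original units).
clustered.groupBy("cluster").count().orderBy("cluster").show()
model.clusterCenters.foreach(println)
```

Because the centres are reported in scaled units, they are best read relative to each other (for example, "above average on `attributeA`") rather than as raw attribute values.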

Working

This Jupyter Notebook contains all of the work described above, step by step:
reading the files, wrangling and joining them, normalising the dataset, and finally clustering.
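
The wrangling and joining stage might look roughly like the following sketch, which aggregates two tables per user and joins them into one row per user. The tables `purchases` and `adClicks`, their columns, and the derived attributes `totalSpend` and `adClickCount` are hypothetical examples, not the actual flamingo-data schema or the attributes chosen in the notebook. In-memory rows are used here in place of reading files so that the snippet runs on its own.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, sum}

// getOrCreate() reuses an existing session if the kernel already provides one.
val spark = SparkSession.builder()
  .appName("wrangling-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical input tables; the notebook reads its tables from files instead.
val purchases = Seq((1, 3.0), (1, 5.0), (2, 20.0)).toDF("userId", "price")
val adClicks = Seq((1, 101), (2, 102), (2, 103), (3, 104)).toDF("userId", "adId")

// Reduce each table to one row per user.
val spend = purchases.groupBy("userId").agg(sum("price").as("totalSpend"))
val clicks = adClicks.groupBy("userId").agg(count("adId").as("adClickCount"))

// A full outer join keeps users that appear in only one table;
// their missing numeric attributes are filled with 0.
val dataset = spend
  .join(clicks, Seq("userId"), "outer")
  .na.fill(0)
  .orderBy("userId")

dataset.show()
```

Whether to use an outer or inner join depends on whether users with no activity in one table should still be segmented; the outer join here is only one possible choice.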

Note: This notebook requires a Scala kernel to run, and the Apache Spark libraries must be available to that kernel.
Apache Toree is recommended.