basics of clustering


Cluster Analysis For segmentation

Upload: b-nichols

Post on 14-Jan-2017


TRANSCRIPT

Page 1: Basics of Clustering

Cluster Analysis for Segmentation

Page 2: Basics of Clustering

Clustering

What is it? Why do we use it? How do we do it?

Page 3: Basics of Clustering

What is it?

• Cluster analysis is the process of grouping a set of data into clusters.

• A cluster is a collection of data points where each observation is 1) similar to other observations in the same cluster, and 2) dissimilar to observations in other clusters.

Page 4: Basics of Clustering

What is it?

• Cluster analysis is a statistical tool for discovering hidden patterns in groups of observations, e.g., on what criteria are these “clusters” made?

• Cluster analysis is still quite subjective in nature. Does it make sense?

Page 5: Basics of Clustering

In Marketing…

• Clustering is used to discover distinct groups in customer bases (e.g., segments), and to use this knowledge to develop targeted marketing programs.

• Another example: insurance companies use clustering to determine which types of drivers are risky and which are safe, and charge premiums accordingly!

Page 6: Basics of Clustering

Good clusters have:

• High intra-class similarity (observations in the cluster share qualities)

• Low inter-class similarity (distinct clusters are different from one another)
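One way to make these two qualities concrete is to compare average distances, treating a small distance as high similarity. The sketch below is only an illustration: the points are made up, and the helper names `mean_within_distance` and `mean_between_distance` are hypothetical, not part of any package used in this deck.

```python
from itertools import combinations, product
from math import dist

# Made-up 2-D observations, already grouped into two clusters (illustration only).
clusters = {
    "cluster 1": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)],
    "cluster 2": [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)],
}


def mean_within_distance(points):
    """Average distance between all pairs of points inside one cluster."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_between_distance(points_a, points_b):
    """Average distance between every point in one cluster and every point in the other."""
    pairs = list(product(points_a, points_b))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


within = {name: mean_within_distance(pts) for name, pts in clusters.items()}
between = mean_between_distance(clusters["cluster 1"], clusters["cluster 2"])

print("within-cluster distances:", within)
print("between-cluster distance:", between)
```

For good clusters, the within-cluster distances should be clearly smaller than the between-cluster distance, which is the case for these made-up points.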

Page 7: Basics of Clustering

Consider two important characteristics

Student   grades   work hours
a         3.5      0
b         3.7      5
c         2.9      10
d         2.0      12
e         3.0      15
f         2.8      14

[Figure: scatter plot of students a to f, with axes labeled "work hours" and "grades"; two groups are marked cluster 1 and cluster 2]
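As a rough illustration of how a clustering procedure could split these six students into two groups, here is a minimal k-means sketch in plain Python. This is an assumption for teaching purposes: it is not the SPSS Two-Step method used later in the deck, the starting centroids (students a and f) are chosen arbitrarily, and both variables are standardized first so that work hours do not dominate simply because of their larger scale.

```python
from math import dist
from statistics import mean, pstdev

# Data from the slide: student -> (grades, work hours)
students = {
    "a": (3.5, 0), "b": (3.7, 5), "c": (2.9, 10),
    "d": (2.0, 12), "e": (3.0, 15), "f": (2.8, 14),
}


def standardize(rows):
    """Rescale each column to mean 0 and (population) standard deviation 1."""
    stats = [(mean(col), pstdev(col)) for col in zip(*rows)]
    return [tuple((x - m) / s for x, (m, s) in zip(row, stats)) for row in rows]


def k_means(points, start_centroids, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, recompute, repeat."""
    centroids = list(start_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(mean(col) for col in zip(*members))
    return labels


names = list(students)
points = standardize(list(students.values()))
# Assumption: start from the first and last student (a and f).
labels = k_means(points, [points[0], points[-1]])

segments = {}
for name, label in zip(names, labels):
    segments.setdefault(label, []).append(name)
print(segments)
```

With these starting points, the sketch groups a and b together and c, d, e and f together. A different method or different starting points could produce a different split, so the result is not necessarily the grouping drawn on the slide.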

Page 8: Basics of Clustering

How do we use this information?

We have 2 distinct segments.

Other data we have: age, gender, hometown, grade level, major, hair color.

What is the segment profile of each?

[Figure: the same scatter plot of grades and work hours, with cluster 1 and cluster 2 marked]

Page 9: Basics of Clustering

How do we use this information?

We have 2 distinct segments.

Descriptive Statistics

                            age         gender     hometown     major          hair color
segment 1 - works a lot     Mean = 20   57% male   90% NKY      65% Business   50% blonde
segment 2 - good students   Mean = 20   75% male   66% OH, IN   50% Arts       75% brown

Are these both viable targets? That depends on ….?

Are all of these characteristics useful?
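A segment profile like the one above is just a set of descriptive statistics computed separately for each segment. The sketch below shows one possible way to do that in Python. The records are made up for illustration and are not the data behind the slide, and `segment_profile` is a hypothetical helper name.

```python
from collections import Counter, defaultdict
from statistics import mean

# Made-up records for illustration only; NOT the data behind the slide.
records = [
    {"segment": 1, "age": 21, "gender": "male", "major": "Business"},
    {"segment": 1, "age": 19, "gender": "female", "major": "Business"},
    {"segment": 1, "age": 20, "gender": "male", "major": "Arts"},
    {"segment": 2, "age": 20, "gender": "male", "major": "Arts"},
    {"segment": 2, "age": 22, "gender": "female", "major": "Arts"},
]


def segment_profile(rows, numeric=("age",), categorical=("gender", "major")):
    """Mean of each numeric column; most common value and its share (%) for each categorical column."""
    summary = {}
    for col in numeric:
        summary[col] = mean(r[col] for r in rows)
    for col in categorical:
        value, count = Counter(r[col] for r in rows).most_common(1)[0]
        summary[col] = (value, round(100 * count / len(rows)))
    return summary


# Group the records by segment, then profile each group.
by_segment = defaultdict(list)
for record in records:
    by_segment[record["segment"]].append(record)

profiles = {seg: segment_profile(rows) for seg, rows in by_segment.items()}
for seg, summary in profiles.items():
    print(f"segment {seg}: {summary}")
```

Characteristics that look nearly the same in every segment (such as the mean age in the table above) do little to tell the segments apart.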

Page 10: Basics of Clustering

How to do it!

• You need access to SPSS. You can either 1) connect through NKU’s virtual private network (VPN) and use the virtual desktop, or 2) use a University computer. (I suggest the VPN.)

• Use the links below to learn how to use the virtual desktop. If you want to use it off-campus, you first have to install the VPN software.

https://oit.nku.edu/listsoftware/vmware.html

http://access2.nku.edu/view/

Page 11: Basics of Clustering

Steps to follow…

• Open your data set and save it to a portable drive or your NKU “j” drive.

• We will be using “Two-Step” cluster analysis.

• From the SPSS file: Analyze -> Classify -> Two-Step Cluster

• Then follow the instructions in the YouTube tutorial, which is linked here if you need to review it:

https://www.youtube.com/watch?v=DpucueFsigA

Page 12: Basics of Clustering

Warnings

Don’t use binary variables in the clustering process (e.g., gender, team (yes/no)). These are “swamping variables” and will hijack your clusters.

Solutions with 3-4 clusters are ideal, even if you have to force that number and the criteria are not very good. You only have what you have…

Your data set might never give you “perfect” results based on the criteria discussed in the video tutorial. That’s OK. Do the best you can.
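The warning about binary variables can be applied mechanically before running the analysis. The sketch below is one possible way to do this outside SPSS: it sets aside every column that takes only two distinct values. The grades and work hours come from the earlier slide, the gender and team values are made up, and `is_binary` is a hypothetical helper.

```python
def is_binary(values):
    """True when a column takes at most two distinct values (e.g., yes/no)."""
    return len(set(values)) <= 2


# Grades and work hours are from the earlier slide; gender and team are made up.
data = {
    "grades": [3.5, 3.7, 2.9, 2.0, 3.0, 2.8],
    "work_hours": [0, 5, 10, 12, 15, 14],
    "gender": ["m", "f", "f", "m", "m", "f"],
    "team": ["yes", "no", "no", "yes", "no", "no"],
}

# Keep only non-binary columns as clustering inputs.
clustering_inputs = {name: col for name, col in data.items() if not is_binary(col)}
# Binary columns are held out of the clustering step.
held_out = [name for name in data if name not in clustering_inputs]

print("cluster on:", list(clustering_inputs))
print("held out:", held_out)
```

The held-out variables are not wasted: they can still be used afterwards to describe each segment, as gender is in the profile table earlier in the deck.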

Page 13: Basics of Clustering

More on Profiles to come…
