Sales Data Analysis using SQL & Excel

Selecting 5 Iowa counties for an alcoholism program based on the Iowa liquor sales database, with SQL & Excel, by Hanbit Choi

Upload: hanbit-choi

Post on 13-Apr-2017

Page 1: Sales Data Analysis using SQL & Excel

Selecting 5 Iowa counties for an alcoholism program based on the Iowa liquor sales database, with SQL & Excel

Hanbit Choi

Page 2: Sales Data Analysis using SQL & Excel

AGENDA

#1 SCENARIO / OVERVIEW
#2 BUSINESS QUESTIONS / ANALYSIS
#3 EXTERNAL INFO
#4 LIMITATIONS / ASSUMPTIONS
#5 CONCLUSION

Page 3: Sales Data Analysis using SQL & Excel

SCENARIO

Iowa state officials asked me to recommend five counties in which to start a pilot program targeting alcoholism, based on an analysis of the Iowa liquor sales database. I was also allowed to include external information that could support better decision-making.

Page 4: Sales Data Analysis using SQL & Excel

OVERVIEW OF DATA

• 99 counties
• Time frame: January 2014 to February 2015
• Size of data set: 3,039,914 rows × 40 columns
• Total population: 3,046,352
• Average population per county: 30,771.23
• Total alcohol sales amount: $392,293,023.61
• Average alcohol sales amount per purchase: $128.64

Schema of Iowa liquor data set

Page 5: Sales Data Analysis using SQL & Excel

BUSINESS QUESTIONS

Defining the business questions for the scenario:

• How do the counties segment by population size?
• What are the alcohol sales per county and per resident?
• How many active stores are there per county, and how many residents per store?
• What is the alcohol consumption per county?
• What are the most popular alcohol items and categories in the counties, and what are their proof levels?
• What is the average profit per sale generated by county?

Page 6: Sales Data Analysis using SQL & Excel

ANALYSIS

[Chart: total sales and population by county]

Population segments: CASE WHEN population > 100000 THEN 'Large' WHEN population > 20000 THEN 'Medium' ELSE 'Small' END

• How do the counties segment by population size, and what are the alcohol sales per county?

• Counties with large populations generated high total sales
• Polk County accounts for 22% of total sales and 14% of the total population

Average population per county : 30,771.23
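The segmentation rule and the share-of-sales figure above can be sketched as follows. This is a minimal illustration, not the author's actual query: the table `county_summary`, its columns, and the three sample rows are hypothetical, and SQLite is used only so the SQL can run from Python.

```python
import sqlite3

# Hypothetical per-county summary table; names and figures are
# illustrative only and do not come from the Iowa data set.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE county_summary (county TEXT, population INTEGER, total_sales REAL)"
)
conn.executemany(
    "INSERT INTO county_summary VALUES (?, ?, ?)",
    [
        ("County A", 150000, 900000.0),
        ("County B", 45000, 250000.0),
        ("County C", 12000, 100000.0),
    ],
)

# Same thresholds as the slide: > 100,000 Large, > 20,000 Medium, else Small.
SEGMENT_QUERY = """
SELECT county,
       CASE WHEN population > 100000 THEN 'Large'
            WHEN population > 20000  THEN 'Medium'
            ELSE 'Small' END AS size_segment,
       ROUND(100.0 * total_sales /
             (SELECT SUM(total_sales) FROM county_summary), 1) AS pct_of_sales
FROM county_summary
ORDER BY total_sales DESC
"""

segments = conn.execute(SEGMENT_QUERY).fetchall()
for county, size_segment, pct_of_sales in segments:
    print(county, size_segment, pct_of_sales)
```

The same CASE expression could be pasted into a query against the real sales table once the population column is joined in.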

Page 7: Sales Data Analysis using SQL & Excel

ANALYSIS

[Chart: sales per resident and population by county]

• What’s the sales of alcohol per resident?

• However, the ranking by sales per resident differs from the ranking by total sales
• Dickinson has a small population, but its sales per resident are much higher than those of other counties

Average sales per resident : $ 87.31
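To show why the two rankings can differ, here is a small sketch that orders the same counties by total sales and by sales per resident. The table `county_summary`, its columns, and the sample figures are assumptions made for illustration; they are not values from the deck.

```python
import sqlite3

# Hypothetical summary table with made-up figures.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE county_summary (county TEXT, population INTEGER, total_sales REAL)"
)
conn.executemany(
    "INSERT INTO county_summary VALUES (?, ?, ?)",
    [
        ("County A", 150000, 900000.0),
        ("County B", 45000, 250000.0),
        ("County C", 12000, 100000.0),
    ],
)

# Ranking by total sales: dominated by population size.
by_total = [
    row[0]
    for row in conn.execute(
        "SELECT county FROM county_summary ORDER BY total_sales DESC"
    )
]

# Ranking by sales per resident: normalizes for population.
per_resident = conn.execute(
    """
    SELECT county, total_sales / population AS sales_per_resident
    FROM county_summary
    ORDER BY sales_per_resident DESC
    """
).fetchall()
by_resident = [county for county, _ in per_resident]

print("By total sales:       ", by_total)
print("By sales per resident:", by_resident)
```

In this toy data the smallest county moves from last place to first once sales are divided by population, which is the pattern the slide describes for Dickinson.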

Page 8: Sales Data Analysis using SQL & Excel

ANALYSIS

[Chart: drink per resident and total sales by county]

• What’s the alcohol consumption per county?

• Dickinson has a small population, but its alcohol consumption is very high compared with other counties

Average alcohol consumption by county: 6,710.59 liters

Page 9: Sales Data Analysis using SQL & Excel

ANALYSIS

[Chart: residents per liquor store and number of stores by county]

• How many active stores are there per county, and how many residents per store?

Average number of active stores per county: 13.4

• Fewer residents per liquor store could indicate more convenient access to alcohol
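A possible way to compute the store metrics is sketched below. The tables `stores` and `counties`, their columns, and the status code 'A' for an active store are all assumptions for illustration; the deck does not show the actual schema values.

```python
import sqlite3

# Hypothetical tables and rows; 'A' = active and 'I' = inactive are assumed codes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counties (county TEXT, population INTEGER)")
conn.execute("CREATE TABLE stores (store_id INTEGER, county TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO counties VALUES (?, ?)",
    [("County A", 150000), ("County C", 12000)],
)
conn.executemany(
    "INSERT INTO stores VALUES (?, ?, ?)",
    [
        (1, "County A", "A"),
        (2, "County A", "A"),
        (3, "County A", "A"),
        (4, "County A", "I"),  # inactive store, excluded from the count
        (5, "County C", "A"),
        (6, "County C", "A"),
    ],
)

# Count active stores per county and divide population by that count.
store_metrics = conn.execute(
    """
    SELECT c.county,
           COUNT(s.store_id) AS active_stores,
           c.population * 1.0 / COUNT(s.store_id) AS residents_per_store
    FROM counties AS c
    JOIN stores AS s ON s.county = c.county
    WHERE s.status = 'A'
    GROUP BY c.county, c.population
    ORDER BY residents_per_store
    """
).fetchall()

for county, active_stores, residents_per_store in store_metrics:
    print(county, active_stores, residents_per_store)
```

Sorting ascending puts the counties with the fewest residents per store first, matching the slide's reading that a lower value means easier access.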

Page 10: Sales Data Analysis using SQL & Excel

ANALYSIS

[Chart: average profit per sale and total sales by county]

• What is the average profit per sale generated by county?

Average value: $35.03

• Total sales and total profit by county are largely determined by population size
• Therefore, the analysis refers to the average profit per sale by county instead
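The difference between total profit and average profit per sale can be sketched as below. The deck does not define how profit was calculated, so this example assumes profit = (retail price − cost) × bottles sold; the table `sales` and its columns `bottle_cost`, `bottle_retail`, and `bottles_sold` are hypothetical, as are the rows.

```python
import sqlite3

# Hypothetical sales rows: (county, bottle_cost, bottle_retail, bottles_sold).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (county TEXT, bottle_cost REAL, bottle_retail REAL, bottles_sold INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("County A", 10.0, 15.0, 2),
        ("County A", 20.0, 30.0, 1),
        ("County A", 10.0, 18.0, 2),
        ("County C", 8.0, 12.0, 6),
        ("County C", 5.0, 7.5, 4),
    ],
)

# Assumed profit definition: margin per bottle times bottles in the sale.
profit_by_county = conn.execute(
    """
    SELECT county,
           SUM((bottle_retail - bottle_cost) * bottles_sold) AS total_profit,
           AVG((bottle_retail - bottle_cost) * bottles_sold) AS avg_profit_per_sale
    FROM sales
    GROUP BY county
    ORDER BY avg_profit_per_sale DESC
    """
).fetchall()

for county, total_profit, avg_profit in profit_by_county:
    print(county, total_profit, avg_profit)
```

Here the county with more sales rows has the higher total profit, while the smaller county leads on average profit per sale, which is why the average is the more comparable figure across counties of different sizes.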

Page 11: Sales Data Analysis using SQL & Excel

ANALYSIS

Proof levels: CASE WHEN proof <= 30 THEN 'Fine' WHEN proof <= 70 THEN 'Strong' ELSE 'Too strong' END

• What are the most popular alcohol items and categories in the counties, and what are their proof levels?

• The most popular item and category were found with a SQL MODE query, which looks for the most frequently occurring sales
• People in all the counties tend to buy stronger alcoholic drinks

• Alcohol product proof range: 0~151
• Most popular items: Barton Vodka, Black Velvet, Five O’clock, Five Star, Hawkeye Vodka, Mccormick Vodka Pet, Southern Comfort
• Most popular categories: 80 PROOF VODKA / CANADIAN WHISKIES
• Their proof levels: 70 'Strong', 80 'Too strong'
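A sketch of the "most frequent item per county" step together with the proof labels follows. The slide mentions a MODE query; SQLite has no built-in mode aggregate, so this example swaps in GROUP BY with COUNT(*) and picks the top row per county in Python. The table `sales`, its columns, and the item names are hypothetical.

```python
import sqlite3

# Hypothetical sales rows: (county, item, proof).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (county TEXT, item TEXT, proof INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("County A", "Item X", 80),
        ("County A", "Item X", 80),
        ("County A", "Item X", 80),
        ("County A", "Item Y", 70),
        ("County C", "Item Y", 70),
        ("County C", "Item Y", 70),
        ("County C", "Item Z", 30),
    ],
)

# Count sales per county and item, labelling proof with the slide's thresholds.
rows = conn.execute(
    """
    SELECT county, item,
           CASE WHEN proof <= 30 THEN 'Fine'
                WHEN proof <= 70 THEN 'Strong'
                ELSE 'Too strong' END AS proof_level,
           COUNT(*) AS n
    FROM sales
    GROUP BY county, item, proof
    ORDER BY county, n DESC, item
    """
).fetchall()

# Rows are sorted by frequency within each county, so the first row seen
# for a county is its mode (ties are broken alphabetically by item).
most_popular = {}
for county, item, proof_level, n in rows:
    most_popular.setdefault(county, (item, proof_level))

print(most_popular)
```

On a database with an ordered-set mode aggregate, the Python loop could be replaced by a single aggregate call per county.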

Page 12: Sales Data Analysis using SQL & Excel

EXTERNAL INFO

1) There is already a program targeting alcoholism in Iowa

Source : Iowa Department of Public Health

• Polk, Scott, Black Hawk, and Woodbury rank #1, #3, #5, and #7 in total sales
• They rank #1, #3, #4, and #5 in population
• They rank #2, #5, #6, and #15 in sales per resident
• All of them except Woodbury are in the top 15 for alcohol consumption per resident

Page 13: Sales Data Analysis using SQL & Excel

EXTERNAL INFO

2) Revocation data, including drunk driving

Source: The Des Moines Register, http://www.desmoinesregister.com/story/news/crime-and-courts/2016/04/30/driving-while-intoxicated-wrong-way/83651312/

Operating While Impaired means operating a motor vehicle with a blood alcohol content (BAC) of 0.08% or above, or while under the influence of drugs

• Polk, Woodbury, and Scott counties, which rank highest in revocations, have had an alcoholism program called SBIRT IOWA since 2012

Page 14: Sales Data Analysis using SQL & Excel

LIMITATIONS / ASSUMPTIONS

• What are the limitations of this analysis of the Iowa liquor sales data set?

No customer data and no demographic information. Only retail store sales data is included; bars, restaurants, and other venues are not.

Dickinson has a small population but ranks #1 in drink per county, #1 in sales per resident, and #3 in residents per liquor store, so I searched…

Tourist destination (Source: Wikipedia)

Assumption: many visitors drink in Dickinson, so sales and consumption rise with no change in population

Page 15: Sales Data Analysis using SQL & Excel

CONCLUSION

How should the five counties be selected from the analysis results?

By scoring each county's rank in each category covered in the analysis results

Page 16: Sales Data Analysis using SQL & Excel

CONCLUSION

Therefore, my recommendation is…

Johnson, Linn, Pottawattamie, Cerro Gordo, and Hardin

1. List all the counties mentioned in the analysis results
2. Score each county's rank in each category and calculate the total score; the counties with the smallest total scores should be considered for this alcoholism program
3. Exclude counties already enrolled in Iowa's existing alcoholism program
4. Exclude Dickinson, which has a small population but the highest alcohol consumption rank because of its many tourists
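The scoring steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the author's spreadsheet: the county names and metric values are made up, only two metrics are scored, the helper names `rank_counties` and `select_counties` are hypothetical, and the direction of each ranking (for example, fewer residents per store ranking first) is assumed.

```python
def rank_counties(values, highest_first=True):
    """Return {county: rank}, where rank 1 is the most concerning value."""
    ordered = sorted(values, key=values.get, reverse=highest_first)
    return {county: position for position, county in enumerate(ordered, start=1)}


def select_counties(metrics, excluded, how_many):
    """Sum each county's ranks and pick the lowest totals that are not excluded."""
    totals = {}
    for values, highest_first in metrics:
        for county, rank in rank_counties(values, highest_first).items():
            totals[county] = totals.get(county, 0) + rank
    candidates = sorted(
        (county for county in totals if county not in excluded),
        key=lambda county: (totals[county], county),  # ties broken by name
    )
    return candidates[:how_many], totals


# Hypothetical metric values for four made-up counties.
sales_per_resident = {"County A": 120.0, "County B": 90.0, "County C": 300.0, "County D": 60.0}
residents_per_store = {"County A": 2000, "County B": 1500, "County C": 500, "County D": 4000}

metrics = [
    (sales_per_resident, True),    # higher sales per resident ranks first
    (residents_per_store, False),  # fewer residents per store ranks first
]

# Exclusions stand in for counties already in a program or tourist outliers.
selected, totals = select_counties(metrics, excluded={"County C"}, how_many=2)
print(selected, totals)
```

In this toy run the county with the lowest total score is excluded as an outlier, so the next two counties are selected, mirroring steps 3 and 4.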

[Table: county score ranking, with "Already started" and "Tourist spot" annotations on excluded counties and check marks on the five selected counties]

Page 17: Sales Data Analysis using SQL & Excel

THANK YOU