bi retail industry v1.1
8/7/2019 BI Retail Industry v1.1
Business Intelligence for Retail
Submitted by:
Vivek Sharma
Ashutosh Sinha
Retail Industry
1. Introduction
Retail is among the most complex industries: it handles a wide
range of products from a large number of suppliers and serves a
very large number of customers. Market trends in retail change
more frequently than in most other industries. To remain
competitive, a retailer must keep an eye on every operation in
the business. Understanding customer requirements and giving
customers what they want while still maintaining profitability
requires valuable, well-analysed information, so that management
can deal with ever-changing market conditions and stay ahead of
the competition.
As retail markets become more competitive, the ability to
react quickly and decisively to ever-changing trends and to
customize products and services to individual customer
needs is more critical than ever. A business intelligence
system can be a very effective means of organizing and
analyzing the vast amount of information generated in a
retail business, and it helps you build a more effective
business model for keeping your business profitable.
2. Retail and Business Intelligence
Successful retailers strive to accomplish three basic
objectives:
to align their business with client needs;
to differentiate from competitors; and
to optimize product mix and space utilization.
To achieve these basic goals, retailers must be able to
successfully manage inventory, product mixes, promotions,
advertisements, supply chain dynamics, and a host of other
factors. Furthermore, as retail markets become more
competitive, the ability to react quickly and decisively to
ever-changing trends and to customize products and services
to individual customer needs is more critical than ever. Lack
of information is not the problem: data to assist in making
these kinds of decisions is readily available. The problem
is that the complexity and volume of information available to
organizations is overwhelming.
Technology plays an indispensable role in supporting the
backbone of retail businesses. In a retail environment,
transactional systems, such as Point of Sale (POS) systems,
are efficient at what they are intended to do: record and
retrieve large volumes of transactions and operations.
Embedded in the POS systems is a "treasure trove"
of dormant, unclassified and often unused information about
what happened in the business in a preceding period, such as
last week. Legacy reporting systems often present
historical information in the form of standard static layouts.
These reports can neither be viewed from different
perspectives at different times nor provide critical
insight for retailers in answering questions such as:
How can inventory levels be optimized?
What are the most profitable products?
Who are the most valuable customers?
Increasingly, successful retailers will be those that can
effectively categorize this data and use it for
decision making.
3. Retail industry analysis through BI
BI can provide useful insights into the retail sector:
Sales and profitability analysis.
Store operations analysis.
Customer analysis.
Merchandise management.
Supplier performance measurement.
Marketing and e-commerce analysis.
Brand and marketing research.
Market share analysis.
4. Key performance indicators of the retail sector:
Sales per hour: the speed at which each individual salesperson
is selling or attending to customers compared to everyone else
on the shift.
Average Sale: the average selling price achieved by each
individual salesperson compared to everyone else on the shift.
Higher averages show a greater knowledge of product, as the
salesperson is able to sell higher-ticket items. Low figures
reveal that the salesperson lacks skill in either product
knowledge or effective probing.
Items Per Sale: the ability of the salesperson to add on to a
sale.
Conversion Rate: how many visitors to the store are turned into
customers.
Wage to Sales Ratio: compares a salesperson's hourly wages to
hourly sales. This KPI differentiates clear performers from
underperformers.
Average sales per customer or transaction: total sales
for a given period divided by the number of customers or
transactions for the same period.
Sales per square foot / meter: actual sales for a given
period divided by the total floor area (in sq. ft. or sq. m) of
the store.
Sales per selling hour: actual sales for the
store divided by the number of selling hours during the same
period.
Sales per labour hour: actual sales for the store divided
by the number of labour hours used during the same period.
Inventory Turns: how often the average inventory over a given
period of time (usually a year) is sold in that same period of
time.
Store conversion rate: the number of transactions in a given
period divided by the total number of customers who entered the
store during the same period.
Coupon conversion percentage: percentage of coupons
that have been used by customers.
Profit per Customer Visit: profit obtained from each
customer visit. This makes it easy to set goals for the
sales team in order to increase profits.
Units per customer or transaction: total number of
units sold in a given period divided by the number of
customers or transactions for the same period.
Shelf space profitability: measured per product.
Price premium: the relative price of a product compared
to a benchmark price (average retail price).
Promotion share: share of promotion products as a
percentage (%) of total sales.
Shortages/overages in cash registers.
Sell-through percentage (%): the percentage of units
sold during a period, calculated by dividing the
number of units sold by the beginning on-hand inventory (for
that same time period).
Percentage of perishable items with past due date:
number of perishable items past their due date as a
percentage of all items in store.
Product visibility on shelf: the number of frontal
views of a single product package on a fully stocked shelf.
Gross Margin Return on Inventory Investment (GMROII):
multiplies Inventory Turns (which tells us how
healthy our stock is) by Gross margin (which tells us the
percentage of profit we make on each sale).
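Several of the KPIs above are simple ratios, so they can be sketched as small functions. The following is a minimal illustration, not a reference implementation: the function names and all sample figures are hypothetical, and inventory turns is assumed to be the amount sold divided by the average inventory, both expressed in the same measure.

```python
def average_sale(total_sales, transactions):
    """Total sales for a period divided by the number of transactions."""
    return total_sales / transactions


def conversion_rate(transactions, visitors):
    """Transactions divided by the customers who entered the store."""
    return transactions / visitors


def sales_per_square_foot(total_sales, floor_area_sqft):
    """Actual sales divided by the total floor area of the store."""
    return total_sales / floor_area_sqft


def sell_through_pct(units_sold, beginning_on_hand_units):
    """Units sold as a percentage of the beginning on-hand inventory."""
    return 100.0 * units_sold / beginning_on_hand_units


def inventory_turns(amount_sold, average_inventory):
    """Assumed definition: amount sold divided by average inventory."""
    return amount_sold / average_inventory


def gmroii(turns, gross_margin):
    """Inventory turns multiplied by gross margin (as a fraction)."""
    return turns * gross_margin


# Hypothetical figures for one store and one period.
kpis = {
    "average_sale": average_sale(50_000.0, 2_000),
    "conversion_rate": conversion_rate(2_000, 8_000),
    "sales_per_sq_ft": sales_per_square_foot(50_000.0, 4_000),
    "sell_through_pct": sell_through_pct(300, 400),
    "inventory_turns": inventory_turns(240_000.0, 60_000.0),
}
kpis["gmroii"] = gmroii(kpis["inventory_turns"], 0.25)
```

In a BI tool these ratios would normally be computed at query time from summed base measures rather than stored.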
5. Business Intelligence software for retail: Major vendors
1. IBM Cognos
The IBM Cognos BI suite serves more than 23,000 customers in more than 135 countries.
Complete corporate performance management solution.
Supports enterprise planning, scorecarding, and business intelligence.
2. SAS
Exceptionally strong in advanced analytics.
Full analytical power is revealed in the hands of users with knowledge of SAS programming.
Subscription-based pricing model.
3. Oracle
Widely established and respected vendor of business intelligence solutions.
Market leader for data warehousing tools.
Gartner recognized the Siebel Business Analytics platform as one of the most comprehensive and visionary BI platforms available.
Oracle solutions are designed to integrate seamlessly with any existing database systems.
Hosted and on-premise BI solutions available.
4. SAP
As one of the largest business software companies, SAP can integrate BI throughout SAP applications and your company.
Strong data scalability and performance thanks to in-memory analytics and column-based vectoring.
Acquisition of Business Objects makes SAP the largest BI vendor.
5.
6.
6. Sample retail Business Intelligence dashboards:
7. Designing a Business Intelligence data warehouse for the
retail industry
This section deals with the question of how to design a
business intelligence data warehouse for the retail industry. We
explain this step by step through a small retail case
study to make it simple to understand.
Case study:
We are running a retail business which has 100 retail
stores spread over five states.
Each of the stores has a full complement of departments,
including grocery, frozen foods, dairy, meat, produce,
bakery, floral, and health/beauty aids.
Each store has roughly 60,000 individual products on its
shelves.
About 55,000 of the SKUs come from outside
manufacturers and have bar codes imprinted on the product
package.
These bar codes are called universal product codes
(UPCs). UPCs are at the same grain as individual SKUs.
The remaining 5,000 SKUs come from departments such
as meat, produce, bakery, or floral. While these products
don't have nationally recognized UPCs, the grocery chain
assigns SKU numbers to them.
Data is collected at two points:
Point of sale (POS): when customers make purchases.
Point of delivery: when vendors deliver materials.
Management concern: maximizing profit.
Charging as much as possible.
Lowering the cost of product acquisition and overheads.
Attracting more and more customers in a highly
competitive pricing environment.
Management's other major decisions revolve around:
Promotions.
Pricing.
We shall use the dimensional design process, which consists of
the following four steps:
1. Selecting the business process.
2. Declaring the grain.
3. Choosing the dimensions.
4. Identifying the facts.
Step 1: Selecting the business process
In our retail case study, management wants to better
understand customer purchases as captured by the POS
system.
Thus the business process we're going to model is POS
retail sales.
This data will allow us to analyze what products are selling
in which stores on what days under what promotional
conditions.
Step 2: Declaring the grain
Grain here is defined as the level of detail of the data.
In our case study, the most granular data is an individual
line item on a POS transaction. To ensure maximum
dimensionality and flexibility, we will proceed with this
grain.
While users probably are not interested in analyzing single
items associated with a specific POS transaction, we can't
predict all the ways that they'll want to cull through that
data.
For example, business users may want to
understand the difference in sales on Monday versus
Sunday. Or they may want to assess whether it's
worthwhile to stock so many individual sizes of certain
brands, such as cereal. Or they may want to understand
how many shoppers took advantage of the 50-cents-off
promotion on shampoo.
While none of these queries calls for data from one
specific transaction, they are broad questions that require
detailed data sliced in very precise ways.
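The value of the line-item grain can be illustrated with a small sketch: the same detailed rows answer both the Monday-versus-Sunday question and a per-SKU question. The row layout, SKU names and amounts below are hypothetical and chosen only for illustration.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical fact rows at the declared grain:
# one row per line item on a POS transaction.
line_items = [
    {"txn": 101, "date": date(2019, 8, 4), "sku": "CEREAL-12OZ", "qty": 2, "amount": 7.0},
    {"txn": 101, "date": date(2019, 8, 4), "sku": "SHAMPOO", "qty": 1, "amount": 3.5},
    {"txn": 102, "date": date(2019, 8, 5), "sku": "CEREAL-18OZ", "qty": 1, "amount": 4.0},
    {"txn": 103, "date": date(2019, 8, 5), "sku": "SHAMPOO", "qty": 1, "amount": 3.5},
]


def total_by(rows, key_fn, measure):
    """Sum one measure over the rows, grouped by an arbitrary key."""
    totals = {}
    for row in rows:
        key = key_fn(row)
        totals[key] = totals.get(key, 0) + row[measure]
    return totals


# date.weekday() returns 0 for Monday through 6 for Sunday.
sales_by_weekday = total_by(
    line_items, lambda r: WEEKDAYS[r["date"].weekday()], "amount")
units_by_sku = total_by(line_items, lambda r: r["sku"], "qty")
```

Had the rows been stored only as daily store totals, the second question could not be answered without going back to the source system.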
Step 3: Choosing the dimensions
A dimension table contains the textual descriptors of the
business.
The primary dimensions for our grain are date,
product, and store.
We assume that the calendar date is the date value
delivered to us by the POS system. Later, we will see what
to do if we also get a time of day along with the date.
Within the framework of the primary dimensions, we can
ask whether other dimensions can be attributed to the
data, such as the promotion under which the product is
sold.
Step 4: Identifying the facts
A fact table is the primary table in a dimensional model,
where the numerical performance measurements of the
business are stored.
The facts collected by the POS system include the sales
quantity (e.g., the number of cans of chicken noodle
soup), per-unit sales price, and the sales dollar amount. In
some cases they may include the dollar cost.
Three of the facts, sales quantity, sales dollar amount, and
cost dollar amount, are fully additive across all the
dimensions. We can slice and dice the fact table freely,
and every sum of these three facts is valid and
correct.
By contrast, facts such as unit price and gross margin (a
ratio) are non-additive and should be calculated in the query.
Gross profit can likewise be derived in the query as the sales
dollar amount minus the cost dollar amount.
Figure: Measured facts in the retail sales schema
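Additivity can be checked with a short sketch: summing the three additive facts by store or by product gives the same grand total, whereas summing unit prices gives a number with no business meaning. The rows and column names are hypothetical.

```python
from collections import defaultdict

# Hypothetical line-item fact rows.
fact_rows = [
    {"store": "S1", "product": "SOUP", "qty": 2, "unit_price": 1.5, "sales": 3.0, "cost": 2.0},
    {"store": "S1", "product": "MILK", "qty": 1, "unit_price": 4.0, "sales": 4.0, "cost": 3.0},
    {"store": "S2", "product": "SOUP", "qty": 3, "unit_price": 1.0, "sales": 3.0, "cost": 1.5},
]

ADDITIVE_FACTS = ("qty", "sales", "cost")


def sum_facts(rows, dimension):
    """Sum the additive facts for each value of one dimension."""
    totals = defaultdict(lambda: dict.fromkeys(ADDITIVE_FACTS, 0))
    for row in rows:
        for fact in ADDITIVE_FACTS:
            totals[row[dimension]][fact] += row[fact]
    return {key: dict(value) for key, value in totals.items()}


def grand_total(groups, fact):
    """Add one fact across all groups of a summary."""
    return sum(group[fact] for group in groups.values())


by_store = sum_facts(fact_rows, "store")
by_product = sum_facts(fact_rows, "product")

# Derived at query time from additive facts.
gross_profit = grand_total(by_store, "sales") - grand_total(by_store, "cost")

# Non-additive: this sum is meaningless; an average unit price
# must be derived as total sales divided by total quantity.
sum_of_unit_prices = sum(row["unit_price"] for row in fact_rows)
average_unit_price = grand_total(by_store, "sales") / grand_total(by_store, "qty")
```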
Date dimension
The date dimension is the most frequently used dimension of a
data warehouse. Some data in this table cannot be calculated
from the date itself. For example, whether a date was a holiday
has to come from the company's holiday calendar.
Moreover, fiscal periods can be customized (say, the financial
year extends from January to December). A typical date
dimension is shown below.
Figure: A typical date dimension
The date dimension is most frequently used for slicing operations,
and most reports are prepared on a time basis. Drill-down is also
done on the basis of this dimension.
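A date dimension of this kind can be generated ahead of time. The sketch below is one possible approach under stated assumptions: the column names are hypothetical, the holiday set stands in for the company's holiday calendar, and a fiscal year is labelled by the calendar year in which it starts.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def build_date_dimension(start, end, holidays, fiscal_year_start_month=1):
    """Return one row per calendar day from start to end inclusive."""
    rows = []
    current = start
    while current <= end:
        # Assumption: a fiscal year is named after the calendar year
        # in which it begins.
        if current.month >= fiscal_year_start_month:
            fiscal_year = current.year
        else:
            fiscal_year = current.year - 1
        rows.append({
            "date_key": len(rows) + 1,          # surrogate key
            "full_date": current,
            "day_of_week": WEEKDAYS[current.weekday()],
            "month": current.month,
            "calendar_year": current.year,
            "fiscal_year": fiscal_year,
            "is_weekend": current.weekday() >= 5,
            "is_holiday": current in holidays,  # supplied, not computed
        })
        current += timedelta(days=1)
    return rows


# Hypothetical company holiday calendar.
company_holidays = {date(2019, 8, 15)}
date_dim = build_date_dimension(
    date(2019, 8, 14), date(2019, 8, 18), company_holidays)
```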
If the rows in a fact table come from several time zones, it
might be useful to store date and time in both local time and a
standard time. This can be done by having two dimensions for
each date dimension needed: one for local time and one for
standard time. Storing the date in both local and standard time
allows analysis of when facts are created in a local setting
and in a global setting as well.
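As a rough sketch of the two-date approach, a timezone-aware sale timestamp can be turned into a local date key and a standard date key. UTC is assumed here as the standard time, the YYYYMMDD key format is only an illustrative choice, and the store's UTC offset is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def date_keys(sale_time):
    """Return local and standard (UTC) date keys for an aware timestamp."""
    if sale_time.tzinfo is None:
        raise ValueError("sale_time must be timezone-aware")
    standard_time = sale_time.astimezone(timezone.utc)
    return {
        "local_date_key": int(sale_time.strftime("%Y%m%d")),
        "standard_date_key": int(standard_time.strftime("%Y%m%d")),
    }


# Hypothetical store five hours behind UTC, with a late-evening sale.
store_tz = timezone(timedelta(hours=-5))
late_sale = datetime(2019, 8, 7, 23, 30, tzinfo=store_tz)
keys = date_keys(late_sale)
```

The late-evening sale falls on different calendar days locally and in UTC, which is exactly the case where a single date column would be ambiguous.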
Product dimension
The product dimension helps in slicing and dicing the
various reports that can be prepared using it.
Its important function is to hold as many
descriptive attributes of each SKU as possible. Typically,
SKUs roll up to brands, brands to categories, and
categories to departments. Generally, about 50 attributes
make for a reasonably descriptive dimension. This
dimension is one of the three primary dimensions in retail
data marts.
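The SKU to brand to category to department roll-up can be sketched as a lookup from the fact row's product key into the product dimension. The brands, descriptions and amounts below are hypothetical.

```python
# Hypothetical product dimension keyed by surrogate product key.
product_dim = {
    1: {"description": "Brand A corn flakes 12 oz", "brand": "Brand A",
        "category": "Cereal", "department": "Grocery"},
    2: {"description": "Brand A corn flakes 18 oz", "brand": "Brand A",
        "category": "Cereal", "department": "Grocery"},
    3: {"description": "Brand B shampoo 8 oz", "brand": "Brand B",
        "category": "Shampoo", "department": "Health/Beauty Aids"},
}

# Hypothetical fact rows: (product_key, sales dollar amount).
sales_facts = [(1, 7.0), (2, 4.0), (3, 3.5), (1, 3.5)]


def roll_up(facts, dimension, level):
    """Total sales at one level of the product hierarchy."""
    totals = {}
    for product_key, amount in facts:
        group = dimension[product_key][level]
        totals[group] = totals.get(group, 0.0) + amount
    return totals


sales_by_brand = roll_up(sales_facts, product_dim, "brand")
sales_by_department = roll_up(sales_facts, product_dim, "department")
```

Because every hierarchy level is an ordinary attribute on the same row, drilling down is only a change of the `level` argument.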
Store dimension
Store dimension is very similar to the product dimension, and
is subject to slow change.
Promotion dimension
The promotion dimension describes the promotion conditions
under which a product was sold. Instead of a single
promotion attribute such as discount or free product, this
dimension carries several. Promotion conditions include
temporary price, media name, free product name, etc. Each
promotion row sets the appropriate flags to describe itself,
covering price reductions, end-aisle displays, newspaper ads,
and coupons.
This dimension is often called a causal dimension (as opposed
to a casual dimension) because it describes factors thought to
cause a change in product sales.
It may be difficult for users to comprehend, but it is very
useful in protecting against changing dimensions. Another tip
is to avoid a null promotion key in the fact table even if the
product is not under any promotion, as this helps joins return
the right results.
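One common way to avoid null keys is to reserve a dimension row that means "no promotion". The sketch below imitates an inner join in plain Python to show the difference; the key value 0, the row contents and the sales figures are all hypothetical.

```python
NO_PROMOTION_KEY = 0  # hypothetical reserved key

promotion_dim = {
    NO_PROMOTION_KEY: {"promotion_name": "No promotion",
                       "price_reduction": False, "coupon": False},
    1: {"promotion_name": "50 cents off shampoo",
        "price_reduction": True, "coupon": False},
}


def sales_by_promotion(fact_rows, dimension):
    """Inner-join facts to the promotion dimension and total the sales."""
    totals = {}
    for row in fact_rows:
        promotion = dimension.get(row["promotion_key"])
        if promotion is None:
            # An inner join silently drops rows whose key has no match.
            continue
        name = promotion["promotion_name"]
        totals[name] = totals.get(name, 0.0) + row["sales"]
    return totals


facts_with_nulls = [
    {"promotion_key": 1, "sales": 3.0},
    {"promotion_key": None, "sales": 7.0},
]
facts_with_default_row = [
    {"promotion_key": 1, "sales": 3.0},
    {"promotion_key": NO_PROMOTION_KEY, "sales": 7.0},
]

lossy = sales_by_promotion(facts_with_nulls, promotion_dim)
complete = sales_by_promotion(facts_with_default_row, promotion_dim)
```

With null keys the unpromoted sales vanish from the report; with the reserved row they appear under "No promotion" and the totals still reconcile.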
Rationale for the approach
1) Selecting a single business process to model at a time:
We model processes instead of trying to pursue a single
requirement or report, because most end-user requirements
arise case by case. One cannot economically spend time and
effort setting up a new reporting routine every time a new
requirement comes up. Also, scaling an ad hoc reporting
system could take a toll on the OLTP systems that supply
the data feed. Hence, it is wise to approach the problem
in a more systematic way, even if the immediate
requirement seems small.
Moreover, the process approach cuts across
departmental barriers and makes it easier to see the
larger picture. Designers of a production DW also need
clarity about the availability of data, i.e. what data is
already in the DW and what needs to be fetched when
implementing a new requirement. (A large DW is usually
replenished by multiple data feeds, and it gets increasingly
difficult for a designer to keep track of what is already
there.)
2) Declaring the grain
The general rule of thumb for grain declaration is
'the smaller, the better'. For example, one could also
capture the denomination and currency number of the
notes used in a transaction, and that would allow for
very interesting analyses. But there is a high chance
that such a level of detail would mostly go unused.
Also, finer granularity takes a toll on the processing
of cumulative results: additional cumulative tables
would need to be maintained and updated for
every added level of granularity. However, with the
growing power of hardware, this does not seem to be a
big concern.
Not providing the smallest possible grain
would lead to redesign if a requirement for more detail
comes up. So, when designing a system that should still be
working 10 years from now, the designer should feel free
to keep the grain as fine as possible.
3) Choosing the dimensions:
Choosing dimensions is the art in the design of a DW.
How business people describe the data they use can be
learned through anticipation and expertise. Dimensions
help reduce the processing time for queries, which matters
because dealing with large databases is tricky.
Keeping the processing time in check is essential, as the
database size of a data warehouse can exceed
terabytes. Also, the cost of a query is reduced if it can
use fewer SQL functions.
Keeping many dimensions helps in slicing and dicing.
However, with shifting IT contracts, vendors tend to add
new dimensions for every new requirement without
studying the system properly. This can lead to degenerate
dimensions, resulting in higher query cost and a less
user-friendly model.
4) Identifying the facts:
One should avoid storing ratios and percentages in the
fact table. Such data needs to be updated every time an
entry is made to the variables it depends on. This
implies either inconsistency due to delayed updates or
increased load due to frequent updates. Instead, such data
should be calculated by the application logic.
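A further reason to compute ratios in the application is that stored row-level ratios do not aggregate correctly. The sketch below, using hypothetical figures, compares a margin calculated from summed facts with the average of per-row margins.

```python
# Hypothetical fact rows holding only additive measures.
fact_rows = [
    {"sales": 100.0, "cost": 50.0},
    {"sales": 10.0, "cost": 9.0},
]


def gross_margin(rows):
    """Margin computed from summed facts, as application logic would."""
    total_sales = sum(row["sales"] for row in rows)
    total_cost = sum(row["cost"] for row in rows)
    return (total_sales - total_cost) / total_sales


def average_of_row_margins(rows):
    """What aggregating a stored per-row ratio would produce instead."""
    margins = [(row["sales"] - row["cost"]) / row["sales"] for row in rows]
    return sum(margins) / len(margins)


correct_margin = gross_margin(fact_rows)              # 51 / 110
misleading_margin = average_of_row_margins(fact_rows)  # about 0.3
```

The two results differ because the averaged version gives the small sale the same weight as the large one.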
Best practices explained
1) Use of surrogate keys:
Though operational changes may happen only once in a decade
or two, it is always wise to be ready for them.
Thus, it is recommended that surrogate keys be used to
connect fact and dimension tables. This removes the
dependency between the key and the data stored in the row.
The key itself must not carry any meaningful information.
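A minimal sketch of surrogate key assignment follows, assuming keys are handed out sequentially during loading. The class name, the SKU values and the in-memory mapping are hypothetical; a real ETL process would persist the mapping.

```python
class SurrogateKeyMap:
    """Assigns a meaningless integer key to each natural key it sees."""

    def __init__(self):
        self._by_natural_key = {}

    def key_for(self, natural_key):
        # Reuse the existing key, or hand out the next integer.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = len(self._by_natural_key) + 1
        return self._by_natural_key[natural_key]


product_keys = SurrogateKeyMap()

# Hypothetical source rows carrying operational SKU codes.
source_rows = [
    {"sku": "012345678905", "qty": 2},
    {"sku": "PRODUCE-0042", "qty": 1},
    {"sku": "012345678905", "qty": 1},
]

# Fact rows reference only the surrogate key, never the SKU code.
fact_rows = [
    {"product_key": product_keys.key_for(row["sku"]), "qty": row["qty"]}
    for row in source_rows
]
```

If the operational system later reformats or reuses its SKU codes, only the mapping changes; the fact rows are untouched.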
2) Design for extensibility
Ideally, the design should be able to accommodate new
requirements just by adding new dimension tables and
rows in the fact table. In case of product launches, where
the product description is not available until late, using
dummy data can be helpful. However, some
requirements cannot be fulfilled this simply. For
example, if it is desired to increase the granularity of a
dimension table along with the fact table, it would be
necessary to drop the fact table and rebuild it. However,
existing applications would be unaffected.
Also, when a new data source involving unexpected
new dimensions is encountered, it is wise not to force-fit
the measurements into the existing fact table.
3) Snowflaking
Snowflaking refers to the practice of normalizing
dimensions in the data warehouse design.
Snowflaking makes query design more difficult for users.
Also, the queries become more costly due to
the extra joins. The monetary value of the disk space saved
is not very high, and without snowflaking the higher cost of
maintaining consistency is borne by the ETL process, which is
more acceptable than putting the load on OLAP queries.
Moreover, data warehouses are not meant for many
data changes after loading, since they keep historical
data.
4) Centipede dimensions
Keeping too many dimensions in the model usually signals that
two or more of them should be combined.
The rule of thumb here is that if there are more than 25
dimensions, you may need to revisit the design. If the
resulting dimension is noticeably smaller than the Cartesian
product of the separate dimensions, then it could be a
good decision to combine.
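The comparison with the Cartesian product can be sketched as a ratio: the number of attribute combinations actually observed divided by the number of combinations theoretically possible. The attribute names and rows below are hypothetical, and what counts as "noticeably smaller" remains a judgment call.

```python
def combination_ratio(rows, attr_a, attr_b):
    """Observed distinct (a, b) pairs divided by the Cartesian product size."""
    distinct_a = {row[attr_a] for row in rows}
    distinct_b = {row[attr_b] for row in rows}
    observed = {(row[attr_a], row[attr_b]) for row in rows}
    return len(observed) / (len(distinct_a) * len(distinct_b))


# Hypothetical rows with two candidate dimensions.
rows = [
    {"payment_type": "cash", "channel": "in_store"},
    {"payment_type": "card", "channel": "in_store"},
    {"payment_type": "card", "channel": "online"},
    {"payment_type": "wallet", "channel": "online"},
    {"payment_type": "card", "channel": "in_store"},
]

ratio = combination_ratio(rows, "payment_type", "channel")
```

A ratio well below 1 suggests the two attributes are correlated, so a combined dimension would stay small.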
5) Protection against slowly changing dimensions
Dimensions such as product and store are subject to
slow changes. For example, a store may change its
location and thus the related address fields such as
city, state and PIN code. Similarly,
a product may be renamed, and its cost of production or
acquisition may change over time.
It is generally desirable to preserve historical
information while keeping the warehouse actionable
on the basis of current information. Kimball, author of
The Data Warehouse Toolkit, suggests various ways to cope
with the problem of changing dimensions. Of these solutions,
two types are described here that preserve historical data.
Type 2 preserves changing attributes by adding a new row
for each change, with two more attributes, start date and
end date. Type 6 works in a similar way, but also records
the immediately previous applicable value alongside the
current value.
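A Type 2 change can be sketched as closing the current row and appending a new one with a fresh surrogate key. This is a minimal illustration under stated assumptions: the column names, the store data and the convention that an open row has an end date of None are all hypothetical.

```python
from datetime import date


def apply_type2_change(dim_rows, store_id, changes, effective_date):
    """Close the current row for store_id and append a new version."""
    current = next(row for row in dim_rows
                   if row["store_id"] == store_id and row["end_date"] is None)
    current["end_date"] = effective_date
    new_row = {
        **current,
        **changes,
        "store_key": max(row["store_key"] for row in dim_rows) + 1,
        "start_date": effective_date,
        "end_date": None,  # None marks the currently valid row
    }
    dim_rows.append(new_row)
    return new_row


def row_as_of(dim_rows, store_id, on_date):
    """Find the version of a store that was valid on a given date."""
    for row in dim_rows:
        if (row["store_id"] == store_id
                and row["start_date"] <= on_date
                and (row["end_date"] is None or on_date < row["end_date"])):
            return row
    return None


store_dim = [
    {"store_key": 1, "store_id": "S001", "city": "Pune",
     "start_date": date(2015, 1, 1), "end_date": None},
]
apply_type2_change(store_dim, "S001", {"city": "Mumbai"}, date(2019, 8, 7))
```

Fact rows loaded before the move keep pointing at store key 1, so historical reports still show the old city while new facts use key 2.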