Download - Overview of business intelligence
Business Intelligence
“You can’t manage what you can’t measure.
You can’t measure what you can’t describe”
Ahsan Kabir
“A broad category of applications and technologies for gathering, storing,
analyzing, sharing and providing access to data to help enterprise users make
better business decisions” -Gartner
Why BI ?
Performance management
Identify trends
Cash flow trend
Fine-tune operations
Sales pipeline analysis
Future projections
business Forecasting
Decision Making Tools
Convert data into information
How to Think ?
• What happened?
• What is happening?
• Why did it happen?
• What will happen?
• What do I want to happen?
ERP CRM 3PtySCMBlack
books
Past
Present
Future
Data
Major Players in BI Market
Improving organizations by providing business
insights to all employees leading to better, faster,
more relevant decisions
Advanced Analytics
Self Service Reporting
End-User Analysis
Business Performance Management
Operational Applications
Microsoft Business Intelligence Vision
BI implementations
– Corporate BI
Commonly design, implement and maintain data warehouses , data models and
integrated reporting and analytics. It require significant time, expertise and money but
total business is not covered .
– Self-service BI (SSBI)
SSBI is to empower analysts so that they can design, customize and maintain their
own BI solutions. SSBI is a combination of corporate BI and extensions to empower
analysts to more fully exploit it .
– Managed BI
Ensuring responsible BI by managing review, approve and audit solutions
Data is delivered in a compliant, responsive and secure way and access permissions
are enforced
Analysis Services(SSAS)
Reporting Services(SSRS)
Integration Services(SSIS)
Master Data Services(MDS)
SharePoint
Collaboration
Excel Workbooks
PowerPivotApplications
SharePointDashboards &
Scorecards
Microsoft Business Intelligence Components
DQS
ERP/CRM DB Cloud Born Data Social Network
Step 1 : Business Analysis
Step 2 : SSIS
Different Source of Data (RDBMS, FTP, Web Services, XML, CSV, EXCEL, etc.)
DQS (Data Quality Services) Integration, cleansing, profiling
MDS (Master Data Service ) Centrally managing organizational master data
ETL (Extraction, Transformation and Loading) framework
Step 3 : SSAS
Create an OLAP multi-dimensional structure making data available for analytics and reporting
SSAS can pre-calculates, summarizes and stores the data in a highly compressed format
Reporting is provided by data through SSAS cubes
Step 4 : SSRS
SSRS (SQL Server Reporting Services) allows creating formatted and interactive reports
Step 5 :
PowerPivot, Power View, Excel services provide rapid data exploration, visualization, and
presentation experience for users . It allows users to interrogate the data from various aspects
by using charts, graphs, drill-down paths etc.
Excel and PowerPivot services can be used for deploying Excel or PowerPivot to SharePoint in
order to make it available to other people, turning Personal BI into Organizational BI.
Microsoft Business Intelligence Road Map
“…was designed specifically to be a central repository for all data in a company disparate data from transactional systems”
Data Warehouse
Characteristics DW
DW is a relational database that is designed for query and analysis
Ship and integrate data from different sources to the analyst
Contains data derived from transaction, internal-external data & archived data
But it’s not a copy of a source database
High query performance
Analysis queries place extra load on transactional systems
Query optimization is hard to do well
Queries not visible outside warehouse
Local processing at sources unaffected
Can operate when sources unavailable
Can query data not stored in a DBMS
Summarized and Extremal data at warehouse
Advantages of Warehousing
Data is kept in a
specific business
line wise.
Before enter into warehouse
Data is processed
(cleansed and transformed)
DW Architecture
Warehouse
Data Marts
Users query
the data
warehouse
Data WarehouseCorporate/Enterprise-wide
Union of all data marts
Organized on E-R model*
Data MartDepartmental
Single business process
Star-join*
DW vs. Data Mart
Transactional Databases vs. Data warehouse
1. ER modeling is used
2. 3NF Normalized
3. Data is spited into tables
4. Hard to visualize
5. Slows down the response time of the query and report
1. Dimensional modeling
2. De-normalized
3. Data is kept in fact and dimension
4. Flexible for user perspective
5. Response time and increases
the performance
Transactional Databases Warehouse Database
16
Requisition
RID(PK)
CID (FK)
WID (FK)
UID (FK)
Requestion_Date
Warehouse
WID (PK)
Location
Address
district
WU_Code
User_Profile
UId (PK)
Name
Address
CellNo
Product_Profile
PID (PK)
description
brand
category
Client_Information
CID (PK)
Name
Address
Credit_Limit
Requisition_Details
RID (PK)
RDD (FK)
PID (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost
Entity Relation Diagram
17
TIME
time_key (PK)
SQL_date
day_of_week
month
STORE
store_key (PK)
store_ID
store_name
address
district
floor_type
CLERK
clerk_key (PK)
clerk_id
clerk_name
clerk_grade
PRODUCT
product_key (PK)
SKU
description
brand
category
CUSTOMER
customer_key (PK)
customer_name
purchase_profile
credit_profile
Address
City
country
PROMOTION
promotion_key (PK)
promotion_name
price_type
ad_type
Sales - FACT
time_key (FK)
store_key (FK)
clerk_key (FK)
product_key (FK)
customer_key (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost
DIMENSONAL MODEL
Federated Databases vs. Data warehouse
Data warehouse
Create a copy of all the data and Execute queries against the copy
Federated database
Pull data from source systems as needed to answer queries
Data Warehouse Federated Database
Query
Answer
Query
Extraction Rewritten Queries
Answer
Source
Systems
Warehouse
Mediator
Data Quality problems
Name Address City House
No
DoB State Country
Ahsan CDA Avenue CTG 181/1 05/11/1978 BD
Kabir RB Avn CTG 41/6 23/04/1991 DHK Bangladesh
Name Address City House
No
DoB State Country
Ahsan CDA Avenue CTG 181/1 05/11/1978 CT Bangladesh
Kabir RB Avenue DHK 41/6 23/04/1991 DHK Bangladesh
Before
After
Indication : Completeness Accuracy Conformity Consistency
Data Quality Issues
Data Quality Issue Sample Data Problem
Standard Are data elements consistently
defined and understood ?
Gender code = M, F, U in one system and Gender
code = 0, 1, 2 in another system
Complete Is all necessary data present ? 20% of customers’ last name is blank,
50% of zip-codes are 99999
Accurate Does the data accurately represent
reality or a verifiable source?
A Supplier is listed as ‘Active’ but went out of
business six years ago
Valid Do data values fall within acceptable
ranges?
Salary values should be between
60,000-120,000
Unique Data appears several times Both John Ryan and Jack Ryan appear in the
system – are they the same person?
Data Quality Services (DQS)
Data Quality Services (DQS) is a Knowledge-Driven
data quality solution, enabling to easily improve the
quality of their data
DW Design Consideration
Simplicity
Users should understand the design
Data model should match users’ conceptual model
Queries should be easy and intuitive to write
Expressiveness
Include enough information to answer all important queries
Include all relevant data (without irrelevant data)
Performance
An efficient physical design should be possible
Component of Data Warehousing
DW consists of Fact tables and dimensions. The relationship between a Fact table and
dimensions are based on the foreign key and primary key.
Facts are numeric measurements or additive value that represent a specific business aspect or activity.
Examples :
Unit Cost,
Sale Amount,
Quantity Sold
Salary Amount
Purchase amount
Dimension has a primary key, which is called the surrogate key. The primary key of the source system will be stored in the dimension table as the business key
Dimension tables are tables that contain descriptive information. Dimension table contains a list of columns
Example :
Incase of Product
Product Name
Origin
Category
Manufacturer Date
Sales Date
The Fact table is a table with foreign keys pointing to surrogate keys of the dimension tables
24
TIME
time_key (PK)
SQL_date
day_of_week
month
STORE
store_key (PK)
store_ID
store_name
address
district
floor_type
CLERK
clerk_key (PK)
clerk_id
clerk_name
clerk_grade
PRODUCT
product_key (PK)
SKU
description
brand
category
CUSTOMER
customer_key (PK)
customer_name
purchase_profile
credit_profile
Address
City
country
PROMOTION
promotion_key (PK)
promotion_name
price_type
ad_type
Sales - FACT
time_key (FK)
store_key (FK)
clerk_key (FK)
product_key (FK)
customer_key (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost
Dimensional Modeling
25
The diagram resembles a star
Center of the star consists of one fact table
Points of the star are the dimension tables
Optimizes performance by keeping queries simple and
Providing fast response time
Star schema
26Star Schema for the retailer’s DW
Sales
Date
Product Store
Promotion
Fact table
Dimension tables
ONE fact table 4 dimension tables
TIME
time_key (PK)
SQL_date
day_of_week
month
STORE
store_key (PK)
store_ID
store_name
address
district
floor_type
CLERK
clerk_key (PK)
clerk_id
clerk_name
clerk_grade
PRODUCT
product_key (PK)
SKU
description
brand
category
CUSTOMER
customer_key (PK)
customer_name
purchase_profile
credit_profile
Address
City
country
PROMOTION
promotion_key (PK)
promotion_name
price_type
ad_type
Sales - FACT
time_key (FK)
store_key (FK)
clerk_key (FK)
product_key (FK)
customer_key (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost
DIMENSONAL MODEL
28
Goals for Logical Design
Simplicity
Users should understand the design
Data model should match users’ conceptual model
Queries should be easy and intuitive to write
Expressiveness
Include enough information to answer all important queries
Include all relevant data (without irrelevant data)
Performance
An efficient physical design should be possible
29
Step 1 : Identify business subjects and fields of information
of relevant subjects
Step 2 : Discover entities and attributes and relationships
Step 3 : Identify which information belongs to a central fact table
Step 4 : Which information belongs to its associated dimension tables
Step 5 : Identify cleansing points
Step 6 : Which data need to mange centrally
Step 7 : Define surrogate key and business key
Step 8 : Make ETL Package
Step 9 : Organize data structures on disk
Steps of DW Implementation
Thanks