sas bi part1
... value added solutions for business performance…
SAS BI CASE STUDY
Course Notes
Submitted by: Ramesh Palanisamy
SAS Competency - Satyam Computer Services Limited.
INTRODUCTION
ABOUT ORION STAR DATA
1. RELATIONAL (NORMALIZED) DATA MODEL
2. DIMENSIONAL DATA MODEL
SAS MANAGEMENT CONSOLE – REGISTER METADATA PROFILE
1. METADATA REPOSITORY
2. SUPPORT FOR DEV/TEST AND PROD ENVIRONMENT IN DI STUDIO
3. SAS DI STUDIO PROCESS
4. ORION STAR DATA PROCESS IN SAS DI STUDIO
BUILDING A CUSTOM TREE
DEFINE SOURCE DATA
DEFINE TARGET DATA
CREATE FACT & DIMENSION TABLES
1. CREATE FACT TABLE – (ORDERFACT)
2. PRODUCT DIMENSION (PRODDIM)
3. CUSTOMER DIMENSION (CUSTDIM)
4. ORGANIZATION DIMENSION (ORGDIM)
5. TIME DIMENSION (TIMEDIM)
6. GEOGRAPHIC DIMENSION (GEODIM)
DEFINING THE PRIMARY AND FOREIGN KEYS
1. DEFINE PRIMARY KEY FOR DIMENSION TABLES
2. DEFINE PRIMARY KEY FOR FACT TABLE
DI PROCESS LIBRARY (TRANSFORMATIONS IN DI STUDIO)
CREATE ETL JOBS AND LOAD THE TARGET DATA
1. POPULATE THE FACT TABLE (ORDER FACT)
2. POPULATE THE PRODUCT DIMENSION (PRODDIM)
3. POPULATE THE CUSTOMER DIMENSION (CUSTDIM)
4. POPULATE THE TIME DIMENSION (TIMEDIM)
5. POPULATE THE ORGANIZATION DIMENSION (ORGDIM)
DEFINE OLAP CUBE
1. POPULATE THE DETAIL TABLE FOR OLAP (DETAILTABLE)
2. CREATE OLAP CUBE IN DI STUDIO (DWETLCUBE)
VIEW OLAP DATA USING SAS ADD-IN FOR MS OFFICE
VIEW OLAP DATA USING SAS OLAP VIEWER
VIEW OLAP DATA USING SAS ENTERPRISE GUIDE
CREATE INFORMATION MAPS
VIEW OLAP DATA IN SAS WEB REPORT STUDIO
REFERENCES
Introduction
These course notes give an overview of the SAS BI components and of how data is integrated between the tools (SAS DI Studio, SAS OLAP Cube Studio, SAS Add-In for Microsoft Office, SAS Information Map Studio and SAS Web Report Studio). The Orion Star data model is used throughout the course to build the relational and dimensional data models and the OLAP cubes.
About Orion Star Data
The Orion Star data model is organized in two ways:
1. Relational Data Model
2. Dimensional Data Model
The Relational Data Model consists of the following tables:
1. Country
2. Street_Code
3. City
4. Postal_Code
5. Customer_type
6. Geo_type
7. State
8. Customer
9. Supplier
10. Price_List
11. County
12. Orders
13. Order_Item
14. Product_list
15. Continent
16. Staff
17. Organization
18. Organization_level
19. Product_level
20. Discount
21. TimeData
The Dimensional Data Model consists of the following tables:
1. Order_Fact (Fact table)
2. Customer_Dim
3. Organization_Dim
4. Geography_Dim
5. Product_Dim
6. Time_Dim
1. Relational (Normalized) Data Model
2. Dimensional Data Model
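To make the idea of a dimensional model concrete, the sketch below joins a fact table to one dimension and totals a measure per dimension attribute. It is only an illustration: it uses Python's built-in sqlite3 module rather than SAS, the rows are made up and are not Orion Star data, and the function name revenue_by_country is hypothetical.

```python
import sqlite3

# In-memory database standing in for the warehouse library (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_Dim (Customer_ID INTEGER, Customer_Name TEXT, Customer_Country TEXT);
    CREATE TABLE Order_Fact (Order_ID INTEGER, Customer_ID INTEGER,
                             Quantity INTEGER, Total_Retail_Price REAL);
    -- Made-up sample rows, not Orion Star data.
    INSERT INTO Customer_Dim VALUES (1, 'Customer A', 'US'), (2, 'Customer B', 'DE');
    INSERT INTO Order_Fact VALUES (100, 1, 2, 50.0), (101, 1, 1, 20.0), (102, 2, 3, 90.0);
""")


def revenue_by_country(connection):
    """Join the fact table to one dimension and total the measure per attribute."""
    rows = connection.execute("""
        SELECT d.Customer_Country, SUM(f.Total_Retail_Price)
        FROM Order_Fact AS f
        JOIN Customer_Dim AS d ON d.Customer_ID = f.Customer_ID
        GROUP BY d.Customer_Country
        ORDER BY d.Customer_Country
    """).fetchall()
    return dict(rows)


totals = revenue_by_country(conn)
```

The fact table carries the measures and the keys, while the descriptive attribute (here Customer_Country) lives only in the dimension table.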
SAS Management Console – Register Metadata Profile
Open SAS Management Console from Start -> SAS -> SAS Management Console
Select “create a new metadata profile” to create a new profile.
Click Next
Provide a name for the metadata profile and click Next
Provide the server name, port, user ID and password and click Next
Select “Foundation” as the Repository and click Next
Click Finish
1. Metadata Repository
SAS provides three types of repository:
1) Foundation
2) Custom
3) Project
Foundation is the default repository, and the other two repository types depend on it. The metadata administrator defines the repository for each user and gives the user the repository details, user name and so on. Giving each user a separate repository lets the administrator control what that user is authorized to do on the metadata server. For example, SASADM is the default administrative user and has full access to the metadata server, whereas a user ID XYZ created for a project repository is controlled by SASADM and has restricted access. When a project repository is created for each user, SAS DI Studio handles change management (check-in and check-out) for each ETL job.
2. Support for Dev/Test and Prod Environment in DI Studio
3. SAS DI Studio Process
[Figure: SAS DI Studio process flow]
Sources: Oracle, external files, SAP, SAS, PeopleSoft, others
1. ETL Studio (Extract): Extract the enterprise data into the staging area (SAS or DBMS)
2. ETL Studio (Transform): Cleanse and validate the data and load it into a central data warehouse (SAS or DBMS)
3. ETL Studio (Load): Populate the data marts or dimensional model that provides a collection of data (Customers, Products, Suppliers, Staff)
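The three stages can be sketched in a few lines of plain Python. This is an analogy under stated assumptions, not what DI Studio generates: the function names, the validation rule (a row needs a key and a non-blank name) and the sample rows are all hypothetical.

```python
def extract(source_rows):
    """Extract: copy raw source rows into a staging list unchanged."""
    return [dict(row) for row in source_rows]


def transform(staging_rows):
    """Transform: cleanse and validate rows before they reach the warehouse."""
    warehouse = []
    for row in staging_rows:
        name = (row.get("Customer_Name") or "").strip()
        if row.get("Customer_ID") is None or not name:
            continue  # reject rows that fail validation (assumed rule)
        warehouse.append({
            "Customer_ID": row["Customer_ID"],
            "Customer_Name": name,
            "Country": (row.get("Country") or "").upper(),
        })
    return warehouse


def load(warehouse_rows):
    """Load: populate a data mart keyed by Customer_ID."""
    return {row["Customer_ID"]: row for row in warehouse_rows}


# Made-up source rows: one valid, one without a key, one without a name.
source = [
    {"Customer_ID": 1, "Customer_Name": "  Customer A ", "Country": "us"},
    {"Customer_ID": None, "Customer_Name": "No key", "Country": "de"},
    {"Customer_ID": 2, "Customer_Name": "", "Country": "fr"},
]
customer_mart = load(transform(extract(source)))
```

Keeping the stages as separate functions mirrors the separation between staging area, warehouse and data marts in the process above.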
4. Orion Star Data Process in SAS DI Studio
[Figure: Orion Star data process in SAS DI Studio]
Define the Source Data: Source Designer in ETL Studio is the component that defines the metadata for the source tables (Order, Order_Item, Customer, Product_List, Supplier, Organization, Customer_type, Staff)
Define the Target Data: Target Designer is the component in ETL Studio that defines the metadata for the target tables (Order Fact, Customer Dimension, Product Dimension, Organization Dimension, Time Dimension)
Process Designer in ETL Studio is the component that defines the metadata for jobs. Each job consists of a process flow diagram that loads a target table (for example, OrderFact)
SAS Management Console
By default, SAS Management Console logs in to the Foundation repository unless a default repository has been assigned to the user profile. “SAS Main” in Management Console lists the SAS servers that are installed and configured.
Defining Library Metadata in DI Studio
Open SAS DI Studio from Start->SAS->SAS ETL Studio
Select the Inventory tab shown above
Right click on the “Libraries” folder and click “New”.
Select “SAS Base Engine Library” and click Next
Enter “Staging Library” as the library name (the description is optional) and click Next
Select the path for “Staging Library”.
Provide the libref name and click Next
Select “SAS Main” as the SAS Server and click Next and then click Finish
The new library is registered in the metadata server and displayed in the Libraries folder.
Alternatively, the library can be registered in the metadata server from SAS Management Console, which invokes the same wizards as DI Studio.
Building a custom tree
Select the “Custom” tab in DI Studio as shown below.
Right click “Foundation” and select “New Group”. Enter “Source Tables” as the name for the group.
Repeat the above step and enter “Target Tables” as the name for the group.
Once these steps are completed, DI Studio shows the two groups as below.
Define Source Data
Click “Source Designer” and select SAS as the source data (all the source data for this case study is in SAS). Click Next
Select “Staging Library” as the SAS library and click Next
Select the tables “Customer”, “Order” and “Order_item” and click Next
Select the “Source Tables” group and click Next
Click Finish to load the tables into the “Source Tables” group.
DI Studio should now list all the loaded tables as displayed above. Similarly, Management Console shows the tables registered in the metadata server as below.
Repeat the above steps to load the tables listed below
Customer_type
Organization
Product
Staff
Supplier
Country
County
State
Continent
Postal_Code
Geo_Type
Street_Code
Define Target Data
The Target Designer wizard is used to define the target tables. The following target tables are defined to build a star schema model
Click the Target Designer icon. A wizard opens in which the type of target table is selected, as shown below.
Create Fact & Dimension Tables
1. Create Fact Table – (OrderFact)
Select “Target Table” and click Next
The additional properties are optional and can be used to specify notes, table attributes and so on. Type OrderFact as the table name and click Next
Select “SAS” as the DBMS and “Staging Library” as the library and click Next
Move the following columns to the selected columns list as shown below

Table Name     Column Name
Order_Item     Order_ID
               Product_ID
               Quantity
               Total_Retail_Price
               Cost_Price_Per_Unit
               Discount
Orders         Employee_ID
               Order_Type
               Order_date
               Delivery_Date
Customer       Customer_ID
               Street_ID

Click Next
Use the “New” button to define any additional columns. The “Delete” button removes the selected column.
Select the “Define Indexes” button
Click “New” and add “Customer_ID” as the index name. Drag the column “Customer_ID” to the newly created index as shown above.
Repeat the above step to add indexes for Product_ID, Order_date, Order_ID, Street_ID and Employee_ID. The index definition should look as below.
Select OK and click Next
Select “Target Tables” as the group and click Next as shown below
Expand the “Target Tables” group in DI Studio. The “OrderFact” table should be listed as shown below.
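The same table layout and its six simple (single-column) indexes can be sketched outside the wizard. The example below is an illustration in Python with the built-in sqlite3 module, not the code DI Studio generates. The column types are guesses, and the "ix_" prefix on index names is a choice of this sketch, whereas the steps above name each index after its column.

```python
import sqlite3

# Columns that receive a simple index in the steps above.
INDEXED_COLUMNS = ["Customer_ID", "Product_ID", "Order_date",
                   "Order_ID", "Street_ID", "Employee_ID"]


def create_order_fact(connection):
    """Create the OrderFact table and one single-column index per key column."""
    connection.execute("""
        CREATE TABLE OrderFact (
            Order_ID INTEGER, Product_ID INTEGER, Quantity INTEGER,
            Total_Retail_Price REAL, Cost_Price_Per_Unit REAL, Discount REAL,
            Employee_ID INTEGER, Order_Type INTEGER,
            Order_date TEXT, Delivery_Date TEXT,
            Customer_ID INTEGER, Street_ID INTEGER
        )
    """)
    for column in INDEXED_COLUMNS:
        # Column types above and the "ix_" prefix are assumptions of this sketch.
        connection.execute(f'CREATE INDEX "ix_{column}" ON OrderFact ("{column}")')


def index_names(connection):
    """Return the names of all indexes defined on OrderFact, sorted."""
    rows = connection.execute("PRAGMA index_list('OrderFact')").fetchall()
    return sorted(row[1] for row in rows)


conn = sqlite3.connect(":memory:")
create_order_fact(conn)
fact_indexes = index_names(conn)
```

Each indexed column is one that later joins the fact table to a dimension, which is why the index list matches the key columns rather than the measures.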
2. Product Dimension (ProdDim)
The following columns are imported to create the ProdDim table

Source Table Name   Column Name
Product             Product_ID
                    Product_Name
                    Supplier_ID
Supplier            Supplier_Name
                    Country

Manually created columns (these columns need to be created for the target table)
Product_Category (Char(25))
Product_Group (Char(25))
Product_Line (Char(20))
Click Next
To create columns manually, click “New” and add the column name and length as shown below
Rename the column “Country” to “Supplier_Country”
Create two simple indexes, one for the column Product_ID and one for the column Product_Group, as shown below.
3. Customer Dimension (CustDim)
Source Table Name   Column Name
Customer            Customer_ID
                    Country
                    Gender
                    Customer_Name
                    Customer_FirstName
                    Customer_LastName
                    Birth_Date
Customer_type       Customer_Type
                    Customer_Group

Manually created columns (these columns need to be created for the target table)
Customer_Age (Num(8))
Customer_Age_Group (Num(12))
a) Rename the following columns
Country -> Customer_Country
Gender -> Customer_Gender
Birth_date -> Customer_Birth_date
b) Create a simple index for the column Customer_ID
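Customer_Age is a manually created column, so it has to be derived when the table is loaded. This section does not say how, so the sketch below is only one plausible derivation: whole years between the birth date and a reference date. The function name customer_age and the choice of reference date are assumptions.

```python
from datetime import date


def customer_age(birth_date, as_of):
    """Whole years between birth_date and as_of (the reference date is an assumption)."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


# Made-up birth date, evaluated the day before and on the birthday.
age_before_birthday = customer_age(date(1980, 6, 15), date(2010, 6, 14))
age_on_birthday = customer_age(date(1980, 6, 15), date(2010, 6, 15))
```

Comparing (month, day) pairs avoids the drift that comes from dividing a day count by 365.25.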
4. Organization Dimension (OrgDim)
Source Table Name   Column Name
Organization        Employee_ID
                    Country
                    Org_Name
Staff               Job_Title
                    Gender
                    Salary
                    Birth_Date
                    Emp_Hire_date
                    Emp_Term_date

Manually created columns (these columns need to be created for the target table)
Group (Char(40))
Section (Char(40))
Department (Char(40))
Company (Char(30))
Manager_Levels (Num(8))
Manager_Levels1 (Num(8))
Manager_Levels2 (Num(8))
Manager_Levels3 (Num(8))
Manager_Levels4 (Num(8))
Manager_Levels5 (Num(8))
Manager_Levels6 (Num(8))
a) Rename the following columns
Country -> Employee_Country
Org_Name -> Employee_Name
Gender -> Employee_Gender
Birth_date -> Employee_Birth_date
b) Create a simple index for the column Employee_ID
5. Time Dimension (TimeDim)
No columns are imported from a source table for TimeDim.
Manually created columns (these columns need to be created for the target table)
Date_ID (Num(4)) format DATE9.
Weekday_Num (Num(3))
Weekday_Name (Char(20))
Week_Num (Num(3))
Week_Name (Char(7))
Month_Num (Num(3))
Month_Name (Char(20))
Quarter (Char(6))
Year_ID (Char(4))
Holiday_US (Char(45))
Fiscal_Year (Char(4))
Fiscal_Quarter (Char(6))
Fiscal_Month_Num (Num(3))
a) Create a simple index for the column Date_ID
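Because every TimeDim column is created manually, the rows have to be generated from a calendar rather than read from a source table. The sketch below shows one way to derive some of the columns for a date range in Python. The Sunday=1 weekday numbering, ISO week numbers and the "2004Q1" quarter format are assumptions, and Week_Name, Holiday_US and the fiscal columns are left out because their rules are not given here.

```python
from datetime import date, timedelta

WEEKDAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday")
MONTH_NAMES = ("January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December")


def time_dim_row(day):
    """Derive a subset of the TimeDim columns for one calendar date."""
    quarter = (day.month - 1) // 3 + 1
    return {
        "Date_ID": day,
        "Weekday_Num": day.isoweekday() % 7 + 1,   # assumed numbering: Sunday=1 ... Saturday=7
        "Weekday_Name": WEEKDAY_NAMES[day.weekday()],
        "Week_Num": day.isocalendar()[1],          # assumed: ISO 8601 week number
        "Month_Num": day.month,
        "Month_Name": MONTH_NAMES[day.month - 1],
        "Quarter": f"{day.year}Q{quarter}",        # assumed 6-character format
        "Year_ID": str(day.year),
    }


def build_time_dim(start, end):
    """Return one row per day from start to end, inclusive."""
    days = (end - start).days + 1
    return [time_dim_row(start + timedelta(days=i)) for i in range(days)]


first_week = build_time_dim(date(2004, 1, 1), date(2004, 1, 7))
```

Generating one row per day keeps Date_ID unique, which is what allows it to serve as the indexed key of the dimension.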
6. Geographic Dimension (GeoDim)
Source Table Name   Column Name
Continent           Continent_Name
Country             Country_Name
State               State_Code
                    State_Name
County              Region_Name
                    Province_Name
                    County_Name
City                City
Postal_Code         Postal_Code
Street_Code         Street_Name
The final list of source and target tables should contain all the tables shown below.
Defining the Primary and Foreign Keys
This demonstration illustrates adding the Primary and Foreign keys for the Fact and Dimensional tables.
1. Define Primary Key for Dimension Tables
1. Right click on “CustDim” and select “Properties”
2. Select the “Keys” tab and select the “Customer_ID” column.
3. Select the “New” button and click “Primary Key”
4. The primary key for the column “Customer_ID” is defined and the window should look as below.
Repeat the above steps for the other tables, using the following parameters to define the primary key.
Note: If the primary key column is not present in one of the tables mentioned below, it can be imported from the source table. For example, the GeoDim table needs Street_ID as its primary key column, but that column is not in the table, so it must be imported using the steps below.
Right click the “GeoDim” table and select “Properties” from the pop-up menu
Select the “Columns” tab and select “Import”.
Select the “Source Tables” folder, expand the “Street_code” table, select the “Street_ID” column as shown below and click “OK”
The column “Street_ID” is now added to the GeoDim table as shown below.
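What a primary key on a dimension and a matching foreign key on the fact table enforce can be shown with a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for the metadata definitions made in DI Studio. The two-column tables, the sample rows and the helper name try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE CustDim (Customer_ID INTEGER PRIMARY KEY, Customer_Name TEXT);
    CREATE TABLE OrderFact (
        Order_ID INTEGER,
        Customer_ID INTEGER REFERENCES CustDim (Customer_ID)
    );
    -- Made-up sample rows.
    INSERT INTO CustDim VALUES (1, 'Customer A');
    INSERT INTO OrderFact VALUES (100, 1);
""")


def try_insert(connection, sql):
    """Run one INSERT and report whether the key constraints accepted it."""
    try:
        connection.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_key_ok = try_insert(conn, "INSERT INTO CustDim VALUES (1, 'Duplicate')")
orphan_fact_ok = try_insert(conn, "INSERT INTO OrderFact VALUES (101, 99)")
valid_fact_ok = try_insert(conn, "INSERT INTO OrderFact VALUES (102, 1)")
```

The primary key rejects a second dimension row with the same Customer_ID, and the foreign key rejects a fact row whose Customer_ID has no matching dimension row.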