a bank product change1
TRANSCRIPT
-
8/6/2019 a Bank Product Change1
STAGING TABLES - E-R MODELING FOR STAGING TABLES
CUSTOMER_DET
cust_id
name_prefix
first_name
last_name
gender
marital_status
street_nbr
street_name
postal_code
city_name
state_code
CUSTOMER
cust_id
income
age
years_with_bank
nbr_children
gender
marital_status
SAVINGS
acct_nbr
acct_type
cust_id
ref_acct_nbr
Empno
minimum_balance
per_check_fee
account_active
acct_start_date
acct_end_date
starting_balance
ending_balance
ACCTS
Acct_process_nbr
acct_nbr
acct_type
cust_id
acct_start_date
acct_end_date
ref_acct_nbr
Empno
CHECKING
acct_nbr
acct_type
cust_id
ref_acct_nbr
Empno
minimum_balance
account_active
acct_start_date
acct_end_date
starting_balance
ending_balance
LOAN
acct_nbr
cust_id
Agent_id
credit_limit
credit_rating
account_active
acct_start_date
acct_end_date
starting_balance
ending_balance
SAVING_TRAN
Tran_Id
Cust_Id
Acct_Nbr
Channel_Nbr
Session_Id
Tran_Duration
Tran_Amt
Principal_Amt
Interest_Amt
New_Balance
Tran_Date
Tran_Time
Channel
Tran_Code
CHECK_TRAN
Tran_Id
Cust_Id
Acct_Nbr
Channel_Nbr
Session_Id
Tran_Duration
Tran_Amt
Principal_Amt
Interest_Amt
New_Balance
Tran_Date
Tran_Time
Channel
Tran_Code
LOAN_TRAN
Tran_Id
Cust_Id
Acct_Nbr
Channel_Nbr
Session_Id
Tran_Duration
Tran_Amt
Principal_Amt
Interest_Amt
New_Balance
Tran_Date
Tran_Time
Channel
Tran_Code
CREDIT
acct_nbr
cust_id
Agent_id
credit_limit
credit_rating
account_active
acct_start_date
acct_end_date
starting_balance
ending_balance
CREDIT_TRAN
Tran_Id
Cust_Id
Acct_Nbr
Channel_Nbr
Session_Id
Tran_Duration
Tran_Amt
Principal_Amt
Interest_Amt
New_Balance
Tran_Date
Tran_Time
Channel
Tran_Code
SERVICES
Trans_id
Acct_Nbr
Service
Tran_Amt
Tran_Charge
Tran_tot_amt
TRANS
cust_id
acct_type
tran_period
tran_channel
tran_code
tran_type
tran_count
tran_total
WIRE_TRAN
Trans_id
Acct_Nbr
Amt_trans
Trans_amt_fee
AGENT
Agent_id
Agent_name
Agent_type
Location
EMPLOYEE
Empno
Empname
Deptno
Location
Manager
Salary
Designation
TABLES - DIMENSIONAL MODELING
DATE_DIM
Date_key
DT_calender_Date
DT_weekday_full
DT_weekend_full
DT_calen_week_numb
DT_calen_month_numb
DT_calen_qtr_numbr
DT_calen_monthend
DT_calen_quater_number_month
DT_calen_year_nmbr
DT_calen_FISICALYear
TRANSACTION_DIM
Trans_Key
Trans_id
Trans_code
Channel_nbr
Agent_id
Session_IT
Transaction_charge
Transaction_amt
Transaction_time
BANK_TRANS_FACT
Cust_key
Acct_Key
Date_key
Trans_key
Amt_on_transaction
Amt_of_total_earning
Amt_on_internet_emi
Profit_on_services
Profit_on_loan_credit
PRODUCT_DIM
Acct_Key
Acct_nbr
Acct_type
Acct_start_date
Acct_end_date
Trans_code
Trans_id
Earnings
Transaction_fee
Service diagram
Account_active
Account_bal_credit
Channel
Ref_acct_nbr
CUSTOMER_DIM
Cust_Key
Cust_id
Name
Income
Age
Year_with_bank
nbr_children
gender
marital_status
acct_start_date
acct_end_date
street_number
street_name
customer_effi_points
customer_track_points
customer_ref_points
Dimension Overview
Based on the business requirements just listed, the grain and dimensionality of
the initial model begin to emerge. We start with a core fact table that records the
primary balances of every account at the end of each month. Clearly, the grain of
the fact table is one row for each account at the end of each month. Based on
this grain declaration, we initially envision a design with only two dimensions:
month and account. A data-centric designer might argue that all the other
descriptive information, such as household, branch, and product characteristics,
should be embedded as descriptive attributes of the account dimension because
each account has only one household, branch, and product associated with it.
While this schema accurately represents the many-to-one and many-to-many
relationships in the snapshot data, it does not adequately reflect the natural
business dimensions. Rather than collapsing everything into the huge account
dimension table, additional analytic dimensions such as product and branch
mirror the instinctive way that banking users think about their businesses. These
supplemental dimensions provide much smaller points of entry to the fact table.
Thus they address both the performance and usability objectives of a
dimensional model. Finally, given that the master account dimension in a big
bank may approach 10 million members, we must keep type 2 slowly changing
dimension (SCD) processing of this huge dimension workable. The
product and branch attributes are convenient groups of attributes to remove from
the account dimension in order to cut down on the type 2 SCD effects. Later
we'll squeeze the changing demographics and behavioral attributes out of the
account dimension for the same reasons.
The product and branch dimensions are two separate dimensions because there
is a many-to-many relationship between products and branches. They both
change slowly but on different rhythms. Most important, business users think of
them as basic, distinct dimensions of the banking business.
Based on further study of the bank's requirements, we ultimately choose the
following dimensions for our initial schema: month end date, account, household,
branch, product, and status. At the intersection of these six dimensions, we take
a monthly snapshot and record the primary balance and any other metrics that
make sense across all products, such as interest paid, interest charged, and
transaction count. Remember that account balances are just like inventory
balances in that they are not additive across any measure of time. Instead, we
must average the account balances by dividing the balance sum by the number
of months.
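The averaging rule above can be sketched in a few lines of Python. This is only an illustration: the snapshot rows, account labels, and function names are hypothetical and are not part of the bank's schema.

```python
from collections import defaultdict

# Hypothetical month-end snapshot rows: (account, month, balance).
snapshots = [
    ("A-1", "2019-01", 100.0),
    ("A-1", "2019-02", 300.0),
    ("A-1", "2019-03", 200.0),
    ("A-2", "2019-01", 50.0),
    ("A-2", "2019-02", 50.0),
    ("A-2", "2019-03", 80.0),
]

def average_balance(rows):
    """Average each account's balance: balance sum / number of months."""
    totals = defaultdict(float)
    months = defaultdict(set)
    for account, month, balance in rows:
        totals[account] += balance
        months[account].add(month)
    return {acct: totals[acct] / len(months[acct]) for acct in totals}

def balance_for_month(rows, month):
    """Balances may be summed across accounts within a single month."""
    return sum(balance for _, m, balance in rows if m == month)
```

Summing the three monthly balances of account A-1 would give 600, which is not a meaningful balance; dividing by the three months gives the average of 200.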
Product Dimension
The product dimension consists of a simple product hierarchy that describes all
the bank's products, including the name of the product, type, and category. The
need to construct a generic product categorization in the bank is the same need
that causes grocery stores to construct a generic merchandise hierarchy. The
main difference between the bank and grocery store examples is that the bank
also develops a large number of custom product attributes for each product type.
We'll defer discussion regarding the handling of these custom attributes until the
end of this chapter.
The account status dimension is a useful dimension to record the condition of the
account at the end of each month. The status records whether the account is
active or inactive or whether a status change occurred during the month, such as
a new account opening or an account closure. Rather than whipsawing the large
account dimension or merely embedding a cryptic status code or abbreviation
directly in the fact table, we treat status as a full-fledged dimension with
descriptive status decodes, groupings, and status reason descriptions as
appropriate. In many ways we could consider the account status dimension to be
another example of a minidimension.
Customer Dimension
Rather than focusing solely on the bank's accounts, users also want the ability to
analyze the bank's relationship with a customer. They are interested in
understanding the overall profile of a customer, the magnitude of the existing
relationship with the customer, and what additional products should be sold to
the customer. Customer demographic attributes, such as income and marital
status, change over time; as you might suspect, the users want to track the
changes. If the bank focuses on
accounts for commercial entities rather than consumers, it likely has similar
requirements to identify and link corporate families.
From the bank's perspective, a customer may comprise several accounts
and individual account holders. For example, consider John and Mary Smith as a
single customer household. John has a checking account, and Mary has a
savings account. In addition, John and Mary have a joint checking account, credit
card, and mortgage with the bank. All five of these accounts are considered to be
a part of the same Smith household despite the fact that minor inconsistencies
may exist in the operational name and address information.
The process of relating individual accounts to households (or the commercial
business equivalent of a residential household) is not to be taken lightly.
Householding requires the development of business rules and algorithms to assign
accounts to households. There are specialized products and services to do the
matching necessary to determine household assignments. It is very common for
a large financial services organization to invest significant resources in
specialized capabilities to support its householding needs.
We decide to treat the account and customer dimensions separately because of the size of the account dimension
and the volatility of the account constituents within a household dimension, as
referenced earlier. In a large bank, the account dimension is huge, with easily
over 10 million rows that group into several million households. The customer
dimension provides a somewhat smaller point of entry into the fact table without
traversing a 10-million-row account dimension table. In addition, given the
changing nature of the relationship between accounts and customers, we elect to
use the fact table to capture the relationship rather than merely including the
household attributes on each account dimension row. In this way we avoid using
the type 2 SCD approach with the large account dimension.
Various Dimensions
So far we have discussed customer and product analysis. Other bank-related
entities also need dimensions: agent, transaction, and employee. The agent
dimension is maintained so that agent information can be tracked historically
by location and so that one agent's policies can be assigned to other agents.
The transaction dimension is maintained daily for credit accounts and holds the
complete information for each credit transaction. The employee dimension holds
employee information by bank location.
Time Dimension
So far we've restricted our discussions in this financial services chapter to month-
end balance snapshots because this level of detail typically is sufficient for
analysis. If required, we could supplement the monthly-grained snapshot fact
table with a second fact table that provides merely the most current snapshot as
of the last nightly update or perhaps is extended to provide daily-balance
snapshots for the last week or month. However, what if we face the requirement
to report an account's balance at any arbitrarily picked historical point in time?
Creating daily-balance snapshots for a large bank over a lengthy historical time
span would be overwhelming given the density of the snapshot data. If the bank
has 10 million accounts, daily snapshots translate into approximately 3.65 billion
fact rows per year.
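The row-count estimate can be checked with simple arithmetic. The sketch below assumes one fact row per account per day and ignores leap years; the monthly figure is included only for comparison with the month-end snapshot grain described earlier.

```python
ACCOUNTS = 10_000_000   # accounts in a large bank, as stated in the text
DAYS_PER_YEAR = 365     # assumption: one snapshot row per account per day
MONTHS_PER_YEAR = 12    # month-end snapshot grain

daily_rows_per_year = ACCOUNTS * DAYS_PER_YEAR
monthly_rows_per_year = ACCOUNTS * MONTHS_PER_YEAR

print(f"daily snapshots:   {daily_rows_per_year:,} rows per year")
print(f"monthly snapshots: {monthly_rows_per_year:,} rows per year")
```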
Assuming that business requirements already have driven the need to make
transaction detail data available for analysis, we could leverage this transaction
detail to determine an arbitrary point-in-time balance. To simplify matters, we'll
boil the account transaction fact table down to an extremely simple design. The
transaction type key joins to a small dimension table of permissible transaction
types. The transaction sequence number is a continuously increasing
number running for the lifetime of the account. The final flag indicates whether
this is the last transaction for an account on a given day. The transaction amount
is self-explanatory. The balance fact is the ending account balance following the
transaction event.
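A point-in-time balance lookup over such a table might look like the sketch below, written in Python with the standard sqlite3 module. The table name account_tran_fact, its column names, and the sample rows are assumptions made for illustration; the source only describes the columns in words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account_tran_fact (
        account_key   INTEGER,
        date_key      INTEGER,  -- consecutive integer surrogate date key
        tran_type_key INTEGER,
        tran_seq      INTEGER,  -- increases over the account's lifetime
        final_flag    INTEGER,  -- 1 = last transaction of the day
        tran_amt      REAL,
        balance       REAL      -- ending balance after the transaction
    )
""")
# Hypothetical sample data for one account.
rows = [
    (1, 10, 1, 1, 0, 100.0, 100.0),
    (1, 10, 2, 2, 1, -30.0, 70.0),
    (1, 14, 1, 3, 1, 50.0, 120.0),
    (1, 20, 2, 4, 1, -20.0, 100.0),
]
conn.executemany("INSERT INTO account_tran_fact VALUES (?,?,?,?,?,?,?)", rows)

def balance_as_of(conn, account_key, date_key):
    """Return the balance from the most recent end-of-day transaction
    on or before date_key, or None if the account has no such row."""
    row = conn.execute("""
        SELECT balance
        FROM account_tran_fact
        WHERE account_key = ?
          AND final_flag = 1
          AND date_key = (SELECT MAX(date_key)
                          FROM account_tran_fact
                          WHERE account_key = ?
                            AND final_flag = 1
                            AND date_key <= ?)
    """, (account_key, account_key, date_key)).fetchone()
    return row[0] if row else None
```

The inner query relies on the date keys being ordered integers, so MAX(date_key) with an upper bound finds the latest day with activity without needing a balance row for every day.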
Here we are taking advantage of a special situation that exists with the
surrogate date key. The date key is a set of integers running from 1 to N with a
meaningful, predictable sequence. We assign consecutive integers to the date
surrogate key so that we can physically partition a large fact table based on the
date. This neatly segments the fact table so that we can perform discrete
administrative actions on certain date ranges, such as moving archived data to
offline storage or dropping and rebuilding indexes. The date dimension is the
only dimension whose surrogate keys have any embedded semi-intelligence.
Due to its predictable sequence, it is the only dimension on which we dare place
application constraints. This ordering is what allows a query to
locate the most recent prior end-of-day transaction.
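As a rough illustration of consecutive date keys, the Python sketch below builds a small date dimension and derives the key range for one month, which is the kind of range a date-based partition would cover. The start date, the function names, and the two-column row layout are assumptions, not the DATE_DIM definition from this document.

```python
from datetime import date, timedelta

def build_date_dim(start, days):
    """Return (date_key, calendar_date) rows with keys 1..days in date order."""
    return [(i + 1, start + timedelta(days=i)) for i in range(days)]

# Assumed start date; keys run from 1 to N in calendar order.
date_dim = build_date_dim(date(2019, 1, 1), 365)

def key_range_for_month(dim, year, month):
    """Return the (first, last) date_key for a calendar month."""
    keys = [k for k, d in dim if d.year == year and d.month == month]
    return min(keys), max(keys)
```

Because the keys follow calendar order, a month maps to one contiguous key range, so a range constraint on the fact table's date key selects exactly that month's rows.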
Fact Overview
The heterogeneous product technique just discussed is appropriate for fact
tables in which a single logical row contains many product-specific facts.
Snapshots usually fit this pattern.
On the other hand, transaction-grained fact tables often have a single fact that is
generically the target of a particular transaction. In such cases the fact table has
an associated transaction dimension that interprets the amount column. In the
case of transaction-grained fact tables, we typically do not need specific line-of-
business fact tables. We get by with only one core fact table because there is
only one fact. However, we still can have a rich set of heterogeneous products
with diverse attributes. In this case we would generate the complete portfolio of
custom product dimension tables and use them as appropriate, depending on the
nature of the application. In a cross-product analysis, we would use the core
product dimension table because it is capable of spanning any group of products.
In a single-product analysis, we optionally could use the custom-product
dimension table instead of the core dimension if we wanted to take advantage of
the custom attributes specific to that product type.
SYSTEM DEVELOPMENT
5.1 Specifications for Target Tables
Bank Transaction
CUST_KEY               PK/FK  CUST_ID                IF CUSTOMER_DIM.CUSTID = CUST_ID.BANK_TRAN_SOURCE     Lookup on CUSTOMER_DIM (CUST_KEY, CUST_ID)
ACCT_KEY               PK/FK  ACCT_NBR               IF PROD_DIM.ACCT_NBR = ACCT_NBR.BANK_TRANS_SOURCE     Lookup on PRODUCT_DIM (ACCT_KEY, ACCT_NBR)
AGENT_KEY              PK/FK  AGENT_ID               IF AGENT_DIM.AGENT_ID = AGENT_ID.BANK_TRANS_SOURCE    Lookup on AGENT_DIM (AGENT_KEY, AGENT_ID)
TRAN_KEY               PK/FK  TRANS_ID               IF TRANSA_DIM.TRANS_ID = TRANS_ID.BANK_TRANS_SOURCES  Lookup on TRANSACTION_DIM (TRAN_KEY, TRANS_ID)
EMP_KEY                PK/FK  EMPNO                  IF EMP_DIM.EMPNO = EMPNO.BANK_TRANS_SOURCE            Lookup on EMP_BANk-DIM (EMP_KEY, EMPNO)
AMT_ON_TRANSACTION            AMT_ON_TRANSACTION     SYSDATE
AMT_OF_TOTAL_EARNINGS         AMT_OF_TOTAL_EARNINGS  ORACLE USERNAME
PROFIT_ON_SERVICES            PROFIT_ON_SERVICES
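The lookup rules in this specification amount to replacing each natural key in a source row with the surrogate key held in the matching dimension. The Python sketch below shows one way to express that; the dimension contents, the sample key values, and the function name to_fact_row are hypothetical, and a real load would read the dimensions from the database rather than from in-memory dictionaries.

```python
# target column -> (source column, dimension lookup: natural key -> surrogate key)
# All key values below are made-up sample data.
LOOKUPS = {
    "CUST_KEY": ("CUST_ID", {1001: 1, 1002: 2}),
    "ACCT_KEY": ("ACCT_NBR", {9001: 11}),
    "AGENT_KEY": ("AGENT_ID", {77: 5}),
    "TRAN_KEY": ("TRANS_ID", {500: 40}),
    "EMP_KEY": ("EMPNO", {123: 3}),
}
MEASURES = ["AMT_ON_TRANSACTION", "AMT_OF_TOTAL_EARNINGS", "PROFIT_ON_SERVICES"]

def to_fact_row(source_row):
    """Resolve surrogate keys via dimension lookups and copy the measures."""
    fact = {}
    for target_col, (source_col, dim) in LOOKUPS.items():
        natural_key = source_row[source_col]
        if natural_key not in dim:
            # A failed lookup is surfaced rather than silently loaded.
            raise KeyError(f"{source_col}={natural_key} not found for {target_col}")
        fact[target_col] = dim[natural_key]
    for measure in MEASURES:
        fact[measure] = source_row[measure]
    return fact

source = {
    "CUST_ID": 1002, "ACCT_NBR": 9001, "AGENT_ID": 77, "TRANS_ID": 500,
    "EMPNO": 123, "AMT_ON_TRANSACTION": 250.0,
    "AMT_OF_TOTAL_EARNINGS": 12.5, "PROFIT_ON_SERVICES": 3.0,
}
fact_row = to_fact_row(source)
```

Raising an error on a missing dimension row is a design choice made for this sketch; the specification itself does not say how unmatched keys should be handled.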
5.2 Staging Column Specification
Target Table Name: CUSTOMER_DETAILS
Source File Name: CUSTOMER_DETAILS
Staging Column Specifications               Data Source Specifications
Column Name      PK/FK  Format        Null  Column/Field  PK/FK  Format  Null  File/Table Name
CUST_ID          PK     NUMBER        N     1-to-1                             CUSTOMER_DETAILS
INCOME                  NUMBER        N     1-to-1                             CUSTOMER_DETAILS
AGE                     NUMBER        N     1-to-1                             CUSTOMER_DETAILS
YEARS_WITH_BANK         NUMBER        Y     1-to-1                             CUSTOMER_DETAILS
NBR_CHILDERN            NUMBER        Y     1-to-1                             CUSTOMER_DETAILS
STREET_NBR              NUMBER        Y     1-to-1                             CUSTOMER_DETAILS
STREET_NAME             DATE          Y     1-to-1                             CUSTOMER_DETAILS
POSTAL_CODE             VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
CITY_NAME               VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
STATE_CODE              VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
NAME_PREFIX             VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
FIRST_NAME              VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
LAST_NAME               VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
GENDER                  VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
MARITAL_STATUS          VARCHAR2(30)  Y     1-to-1                             CUSTOMER_DETAILS
5.3 Dimension Modeling
Relationship between fact and dimension tables
[Diagram: the Bank Transactions fact table linked to the Customer, Time, Transaction, and Product dimensions]
5.4 SCHEMA
Here is the system-generated schema for the Bank Product Analysis.
5.5 Source table for Bank Product Analysis
Customer_details
Column Name Data Type
Cust_Id (PK) Number
Name prefix Varchar2(4)
First name Varchar2(30)
Last name Varchar2(30)
Gender Varchar2(10)
Marital status Varchar2(1)
Street nbr Number
Street name Varchar2(30)
Postal code Varchar2(5)
City name Varchar2(20)
State code Varchar2(2)
Customer
Column Name Data Type
Cust_id(PK) Number
Income Number(9,2)
Age Number
Years_with_bank Number
Nbr_childern Number
Gender Varchar2(1)
Marital_status Varchar2(1)
Account
Column Name Data Type
Acct_process_nbr Number
Acct_nbr (PK) Number(16)
Acct_type Varchar2(2)
Cust_id Number
Acct_start_date Date
Acct_end_date Date
Ref_acct_nbr Number(16)
Empno Number(3)
Checking_acct
Column Name Data Type
Acct_nbr (PK) Number(3)
Acct_type Varchar2(2)
Cust_id number
Ref_acct_nbr Number(16)
Empno Number(3)
Minimum_balance Number(9,2)
Per_check_fee Number(9,2)
Account_active Varchar2(1)
Account_start_date Date
Acct_end_date Date
Starting_balance Number(9,2)
Ending_balance Number(9,2)
Checking_tran
Column Name Data Type
Tran_id Number(9,2)
Cust_id Number
Acct_nbr Number(16)
Channel_nbr Number
Session_id Number(9,2)
Check_nbr Number
Tran_duration Number
Tran_amt Number(9,2)
Principal_amt Number(9,2)
Interest_amt Number(9,2)
New_balance Number(9,2)
Tran_date Date
Tran_time Varchar2(6)
Channel Varchar2(1)
Tran_code Varchar2(2)
Savings_acct
Column Name Data Type
Acct_nbr (PK) Number(16)
Acct_type Varchar2(2)
Cust_id Number
Ref_acct_nbr Number(16)
Empno Number(3)
Minimum_balance Number(9,2)
Account_active Varchar2(1)
Acct_start_date Date
Acct_end_date Date
Starting_balance Number(9,2)
Ending_balance Number(9,2)
Savings_tran
Column Name Data Type
Tran_id (PK) Number
Cust_id Number
Acct_nbr Number(16)
Channel_nbr Number
Session_id Number(9,2)
Tran_duration Number
Tran_amt Number(9,2)
Principal_amt Number(9,2)
Interest_amt Number(9,2)
New_balance Number(9,2)
Tran_date Date
Tran_time Varchar2(6)
Channel Varchar2(1)
Tran_code Varchar2(2)
Credit_acct
Column Name Data Type
Acct_nbr (PK) Number(16)
Agent_id number
Cust_id Number
Credit_limit Number(9,2)
Credit_rating Number
Minimum_balance Number(9,2)
Account_active Varchar2(1)
Acct_start_date Date
Acct_end_date Date
Starting_balance Number(9,2)
Ending_balance Number(9,2)
Credit_tran
Column Name Data Type
Tran_id (PK) Number
Cust_id Number
Acct_nbr Number(16)
Channel_nbr Number
Agent_id Number(9,2)
Session_id Number(9,2)
Tran_duration Number
Tran_amt Number(9,2)
Principal_amt Number(9,2)
Interest_amt Number(9,2)
New_balance Number(9,2)
Tran_date Date
Tran_time Varchar2(6)
Channel Varchar2(1)
Tran_code Varchar2(2)
Banking_services
Column Name Data Type
Trans_id Number(9,2)
Acct_nbr number(9,2)
Service Varchar2(10)
Tran_amt Varchar2(20)
Tran_charge Number(9,2)
Tran_tot_amt Number(9,2)
Bank_trans_source
Column Name Data Type
cust_id Number
Acct_nbr number
Agent_id Number
Trans_id Number
Empno Number(4)
Transacation_amt Number(9,2)
Amt_total_earning Number(9,2)
Profit_on_services Number(9,2)