building a star schema v1.1
TRANSCRIPT
Star Schemas using the Scalable Performance Data Engine (SAS® Software)
Patrick Cuba – Consultant
Page 2
AGENDA
• Case Study – Need for SPDE
• SPDE Library
• Case Study – Need for SPDS
• SPDS Server: Clusters, Star Schema, StarJoin
• Questions
• References
Page 3
CASE STUDY
• Table build is 6 hours
• Query time is 20 minutes
• Latest is 360GB
• Generation tables hold 24 months
• Generation tables grown to 1TB each
• 300+ columns
• Four balances per credit card (Max 255)
• 20 million customers
• Growing customer base
• Keeps defaults customer balance
Page 4
CASE STUDY
• At month end, the cycle-end and latest credit card data for the month are added to SAS Generation Tables
• Accounts cycle on different days in the month
[Diagram: Cycle-end, Month end, and Latest tables]
Page 5
BASE LIBRARY
SAS Dataset
• SAS Datasets are flat files
libname all_users '/disk1/metadata';
Page 6
SPDE LIBRARY
• Under BASE SAS License
• Scalable Performance Data Engine (SPDE)
• On SMP server (at least 2 CPUs)
• RAID
[Diagram: a SAS SPD Dataset made up of several Data Parts, an HBX Index, an IBX Index, and Meta]
libname all_users spde '/disk1/metadata'
  datapath=('/disk2/userdata' '/disk3/userdata')
  indexpath=('/disk4/userindexes' '/disk5/userindexes')
  partsize=128M;
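As a rough illustration of how a table might be loaded into this library, the sketch below copies a Base dataset into the SPDE library and indexes it. The libref staging, its path, the table card_balances, and the column customer_id are hypothetical names; only the all_users LIBNAME statement comes from the slide.

```sas
/* Hypothetical Base library holding the source table (assumed path) */
libname staging '/disk1/staging';

/* SPDE library from the slide: metadata, data parts, and indexes on separate disks */
libname all_users spde '/disk1/metadata'
  datapath=('/disk2/userdata' '/disk3/userdata')
  indexpath=('/disk4/userindexes' '/disk5/userindexes')
  partsize=128M;

/* Copy the table; SPDE writes it as 128 MB data parts across the data paths */
data all_users.card_balances;
  set staging.card_balances;
run;

/* Index an assumed key column; the index files are written to the index paths */
proc datasets library=all_users nolist;
  modify card_balances;
  index create customer_id;
quit;
```

Apart from the LIBNAME statement, these are the same steps used for a Base library, so existing programs usually need little change beyond the libref.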
Page 7
CASE STUDY
• Star Schema using StarJoin
• Clustered Cycle & Month end totalling 1TB
• Table build is 30-40 minutes
• Query time is seconds to 5 minutes
[Diagram: a Fact table surrounded by four Dimension tables]
Page 8
SPD SERVER
• Scalable Performance Data Server
• Client/Server
• SQL Pass-thru
Page 9
SPD SERVER
• Clusters
[Diagram: member tables M1 to M8 grouped into one Cluster]
PROC SPDO LIBRARY=domain-name;
  SET ACLUSER user-name;
  CLUSTER CREATE cluster-table-name
    MEM = SPD-Server-table1
    MEM = SPD-Server-table2
    MAXSLOT=24;
QUIT;
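For illustration, the template above might be filled in as follows for monthly member tables. The libref carddata, the user etl_user, and the table names are hypothetical. MAXSLOT=24 is taken from the slide and lines up with the 24 months of history in the case study. The CLUSTER ADD step assumes that each new month's table is appended after the initial build; check the PROC SPDO documentation for your SPD Server release before relying on it.

```sas
/* Assumed: libref carddata was already assigned with the sasspds engine */
proc spdo library=carddata;
  set acluser etl_user;

  /* Bind two existing member tables into one cluster table */
  cluster create cycle_end
    mem=cycle_end_201201
    mem=cycle_end_201202
    maxslot=24;

  /* Later, append the next month's table to the same cluster */
  cluster add cycle_end
    mem=cycle_end_201203;
quit;
```

Queries then read cycle_end as a single table, while each month is still loaded as its own member.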
Page 10
SPD SERVER
• Facts and Dimensions
[Diagram: a Fact table surrounded by four Dimension tables]
Pairwise: 7 Joins, 1 Select
StarJoin: 3 Steps
execute(reset nostarjoin=<1/0>)
Page 11
STARJOIN RULES
• 1. Turn it ON
Page 12
STARJOIN RULES
• 2. No Snowflakes
[Diagram: a Fact table and six Dim tables]
Page 13
STARJOIN RULES
• 3. Single Fact Table
• 4. Single Join Condition
• 5. Fact & Dimension Indexes
[Diagram: Fact and Dim tables]
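The five rules can be read together in one pass-through query. The sketch below is only an illustration: the connection values, table names, and column names are all hypothetical, and it assumes that NOSTARJOIN=0 leaves StarJoin enabled and NOSTARJOIN=1 turns it off, as described in the STARJOIN reference listed on the References slide.

```sas
proc sql;
  /* Connection values are placeholders, not real settings */
  connect to sasspds (dbq='carddata' host='spdshost' serv='5400'
                      user='etl_user' passwd='XXXXXXXX');

  /* Rule 1: keep StarJoin switched on */
  execute(reset nostarjoin=0) by sasspds;

  /* Rules 2-4: one fact table, every dimension joined directly to it
     through a single equality condition (no dimension-to-dimension joins).
     Rule 5: the join columns are assumed to be indexed on both sides. */
  create table work.balance_summary as
  select * from connection to sasspds (
    select d.month_name, p.product_name, sum(f.balance) as total_balance
    from fact_balances f, dim_date d, dim_product p
    where f.date_key = d.date_key
      and f.product_key = p.product_key
      and d.cal_year = 2012
    group by d.month_name, p.product_name
  );

  disconnect from sasspds;
quit;
```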
Page 14
QUESTIONS
Patrick Cuba
Email: [email protected]
0458 91 2634
LinkedIn: http://www.linkedin.com/in/patrickcuba
Page 15
REFERENCES
• STARJOIN: http://support.sas.com/documentation/cdl/en/spdsug/63088/HTML/default/viewer.htm#n0mlj75x9c4dtzn1ves84e1op3jt.htm
• SAS® 9.1 Scalable Performance Data Engine: http://support.sas.com/documentation/onlinedoc/91pdf/sasdoc_91/base_dataeng_6996.pdf
• SAS® 9.2 Scalable Performance Data Engine: http://support.sas.com/documentation/cdl/en/engspde/61887/PDF/default/engspde.pdf
• When should you use the SPDE engine: http://support.sas.com/rnd/scalability/spde/when.html