daslab.seas.harvard.edu
Adaptive Denormalization
Zezhou (Alex) Liu, Stratos Idreos
Normalization: less storage/update costs, slow queries (joins)
Denormalization: more storage/update costs, fast queries (scans, no joins)
Adaptive Denormalization: less storage/update costs, fast queries (scans)
• Base data lies in a normalized state
• Hot data is adaptively and partially denormalized on demand
• Enables the advantages of both normalization and denormalization
• Future queries can benefit from faster query processing over the denormalized data
• Still maintains the efficient space utilization, updates, and loading time characteristics found in normalized schemas
In the time it takes to join inputs of 100 million rows in a normalized schema, we can perform a (logical) join by scanning over 10 billion rows in a denormalized schema. The disparity is larger when a higher percentage of rows are selected.
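To make the idea of a "logical join by scanning" concrete, the following minimal sketch answers the same question twice: once by probing a second table for every row (a join over the normalized data) and once by a plain scan over a pre-joined wide table. The table contents and the names `orders`, `customers`, `wide`, `join_then_filter`, and `scan_wide` are hypothetical, and the sketch only illustrates that the results are equivalent; it says nothing about timing.

```python
# Hypothetical normalized tables (illustrative data only).
orders = [(1, 10, 5.0), (2, 11, 7.5), (3, 10, 2.5)]  # (order_id, customer_id, amount)
customers = {10: "Boston", 11: "Austin"}              # customer_id -> city


def join_then_filter(city):
    """Normalized schema: probe `customers` for every order (a hash join)."""
    return sorted(oid for oid, cid, _ in orders if customers[cid] == city)


# Denormalized schema: the city is materialized next to each order once.
wide = [(oid, cid, amt, customers[cid]) for oid, cid, amt in orders]


def scan_wide(city):
    """Denormalized schema: a single scan, no lookup into another table."""
    return sorted(oid for oid, _, _, c in wide if c == city)
```

Both functions return the same order ids for a given city; the second one touches only one table.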
• Queries enrich existing partial universal tables vertically and/or horizontally.
• Future queries replace joins over these data regions with scans over the partial universal tables.
• Amortizes the overhead and cost of denormalization.
[Figure: base tables T1 and T2 and the partial universal table]
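The enrichment behaviour described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the class `PartialUniversalTable`, its `query` method, the `joins_performed` counter, and the contents of `T1` and `T2` are all assumptions made for the example. A query joins only for rows or columns that are not yet in the partial universal table; everything else is served from the already denormalized data.

```python
# Hypothetical normalized base tables: T1 references T2 through a foreign key.
T1 = {0: {"fk": 10}, 1: {"fk": 11}, 2: {"fk": 10}}
T2 = {10: {"city": "Boston", "tier": "gold"},
      11: {"city": "Austin", "tier": "silver"}}


class PartialUniversalTable:
    """Toy cache of joined T1/T2 data, filled as a side effect of queries."""

    def __init__(self):
        self.rows = {}            # T1 row id -> denormalized T2 columns
        self.joins_performed = 0  # number of lookups into T2

    def query(self, row_ids, columns):
        result = {}
        for rid in row_ids:
            if rid not in self.rows:
                self.rows[rid] = {}          # horizontal enrichment: new row
            cached = self.rows[rid]
            missing = [c for c in columns if c not in cached]
            if missing:
                dim = T2[T1[rid]["fk"]]      # the actual join step
                self.joins_performed += 1
                for c in missing:
                    cached[c] = dim[c]       # vertical enrichment: new column
            result[rid] = {c: cached[c] for c in columns}
        return result
```

Repeating a query performs no further joins, asking for an extra row adds one join (horizontal growth), and asking for a new column on a known row adds one join (vertical growth).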
Operates within the given memory budget by dropping regions of the partial universal table in response to memory pressure.
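One way to picture budget-driven dropping is the sketch below. The poster does not say which regions are dropped first; least-recently-used eviction is an assumption chosen here for illustration, as are the names `RegionCache`, `touch`, and `add`, and the choice to measure the budget in rows.

```python
from collections import OrderedDict


class RegionCache:
    """Toy store of denormalized regions with a row budget (LRU is assumed)."""

    def __init__(self, budget_rows):
        self.budget_rows = budget_rows
        self.regions = OrderedDict()  # region id -> list of denormalized rows
        self.used_rows = 0

    def touch(self, region_id):
        """Return a region and mark it as recently used, or None if absent."""
        if region_id not in self.regions:
            return None
        self.regions.move_to_end(region_id)
        return self.regions[region_id]

    def add(self, region_id, rows):
        """Store a region, then drop old regions until the budget is met."""
        if region_id in self.regions:
            self.used_rows -= len(self.regions.pop(region_id))
        self.regions[region_id] = rows
        self.used_rows += len(rows)
        dropped = []
        # Note: a region larger than the whole budget ends up dropped itself.
        while self.used_rows > self.budget_rows and self.regions:
            victim, victim_rows = self.regions.popitem(last=False)
            self.used_rows -= len(victim_rows)
            dropped.append(victim)
        return dropped
```

Dropping a region is safe in this model because the base data stays normalized; a later query can rebuild the region through a join.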
Adaptively denormalizes only regions of the data as they are queried, and only data that has not yet been denormalized by previous queries.
[Figure: queries Q1, Q2, Q3]
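Denormalizing only what earlier queries have not already covered amounts to computing a difference between the requested region and the covered regions. The sketch below assumes, purely for illustration, that regions are half-open ranges of row positions kept as a sorted list of disjoint intervals; the function names `uncovered` and `add_range` are hypothetical.

```python
def uncovered(covered, lo, hi):
    """Return the parts of [lo, hi) that are not in `covered`.

    `covered` is a sorted list of disjoint half-open (start, end) intervals.
    """
    gaps = []
    cursor = lo
    for start, end in covered:
        if end <= cursor:
            continue                  # interval lies entirely before the cursor
        if start >= hi:
            break                     # interval lies entirely after the query
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
        if cursor >= hi:
            break
    if cursor < hi:
        gaps.append((cursor, hi))
    return gaps


def add_range(covered, lo, hi):
    """Merge [lo, hi) into `covered`, keeping it sorted and disjoint."""
    merged = []
    for start, end in sorted(covered + [(lo, hi)]):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

A query would join only over the ranges returned by `uncovered`, record them with `add_range`, and scan the already denormalized data for the rest.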
Adaptive denormalization (AD) achieves significant benefits even when the required data is only partially denormalized.
Adaptive denormalization (AD) improves significantly over repeated join patterns without penalizing the first join queries.