saiku––takingtakingolapdatabases …nurkiewicz.github.io/talks/2014/33degree/slides.pdf ·...

30
Saiku Saiku – taking taking OLAP OLAP databases databases into into 21 21st st century century Tomasz Tomasz Nurkiewicz Nurkiewicz | nurkiewicz nurkiewicz.com com @tnurkiewicz tnurkiewicz Slides: bit.ly/33degree

Upload: vokiet

Post on 28-Jul-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

SaikuSaiku –– takingtaking OLAPOLAP databasesdatabasesintointo 2121stst centurycentury

TomaszTomasz NurkiewiczNurkiewicz||nurkiewicznurkiewicz..comcom @@tnurkiewicztnurkiewicz

Slides: bit.ly/33degree

WhatWhat isis SaikuSaiku??DEMODEMO

CoreCore conceptsconceptsOLAPFactDimensionHierarchy

ExampleExample factsfactsSold productTweet/forum post/shared photoWebsite hitIncoming text message...you name it

DimensionDimension"Properties of facts"

When?What?Where?Who?How?

ExampleExample dimensionsdimensionsAccessAccess loglogTimestampIPURL resourceHTTP response code

HierarchyHierarchyMultiMulti--levellevel aggregationaggregationExampleExample:: locationlocation hierarchyhierarchy

(All)ContinentCountryStateCity

MeasuresMeasuresQuantitative propertiesAggregatematching facts over themCount/Sum/Average/Min/Max

ExampleExample measuresmeasuresLoad time (page hit fact)Total price (sale fact)Age of customer

ChartingCharting -- DEMODEMO

ExportingExporting -- DEMODEMO

DrillDrill downdown -- DEMODEMO

IgnoredIgnored conceptsconceptsHypercubeMondrianMDX

YourYour ownown cubecube

StarStar schemaschema

ETLETL

ETLETL -- challengeschallengesMissing or incomplete dataHeuristicsIncremental, periodic updatesVarious data sources

SchemaSchema filefile<Schema name="Twitter">

<Cube name="Tweets" defaultMeasure="Count">

<Table name="tweet">

<DimensionUsage name="Time" source="Time"

foreignKey="time_id"/>

<Dimension name="Location" foreignKey="location_id">

<Hierarchy hasAll="true" allMemberName="All

locations">

<Table name="location"/>

<Level name="Continent" column="continent"/>

<Level name="Country" column="country"/>

<Level name="City" column="city"/>

</Hierarchy>

</Dimension>

<!-- ... -->

</Schema>

SchemaSchema WorkbenchWorkbench

Source: www.stratebi.com/cursos/olap-mdx

SecuritySecurity -- usersusersStandard user/passwordRolesSpring Security - customizable

SecuritySecurity -- datadataBy roleRestrict what can be seenTop/bottom limit

PerformancePerformanceBig data, before it was cool

Indexes on foreign keysAggregate tables

WithoutWithout AggregateAggregate tabletableSELECT COUNT(id)

FROM tweet NATURAL JOIN locations

GROUP BY locations.continent

WithWith aggregateaggregate tabletableINSERT INTO agg (cnt, l.city, l.country, l.continent)

SELECT COUNT(t.id) AS cnt, city, country, continent

FROM tweet t NATURAL JOIN locations l

GROUP BY l.city

Usages:

SELECT SUM(agg.count)

FROM agg

GROUP BY locations.continent

PentahoPentaho AggregationAggregation DesignerDesigner

Source: infocenter.pentaho.com/help/index.jsp

DeploymentDeploymentmondrian.jar - engine

saiku.war - RESTful web services

ui.war - JS front-end

DisadvantagesDisadvantagesHorizontal scalability?Stuckwith SQL databasesComplex schema definition (XML)Aggregate tables are hard

ThankThank youyou!!

Slides: nurkiewicz.github.io/talks/2014/33degree