laboratory of data science business intelligence...

16
LABORATORY OF DATA SCIENCE Business Intelligence Architectures Data Science & Business Informatics Degree

Upload: others

Post on 22-May-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

LABORATORY OF DATA SCIENCE

Business Intelligence Architectures

Data Science & Business Informatics Degree

Page 2: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

BI Architecture

Lab of Data Science

2

6 lessons – data

access

4 lessons – data

quality & ETL

5 lessons – data

mining

6 lessons – OLAP

and reporting

1 lessons –

analytic SQL

Page 3: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Data sources

Multiple operational data sources

Across departments of the organization, and external

sources

Type and formats

Relational, multidimensional, time-seriers, spatial, text, multimedia, ..

Issues

Standards for representations, codes, formats of text files

Standards for querying relational data sources

Basic programs for data manipulation

We will study:

Python access to text files

Python access to RDBMS

Lab of Data Science

3

Page 4: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Extract, Transform and Load

ETL (extract transform and load) is the process of

extracting, transforming and loading data from heterogeneous

sources in a data base/warehouse.

Typically supported by (visual) tools

We will study:

SQL Server 2016 Integration Services

4

Lab of Data Science

Page 5: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Extract, Transform and Load

Incremental and real-time ETL

Complex Event Processing (CEP)

5

Lab of Data Science

Page 6: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Data warehouse

“A data warehouse is a subject-oriented, integrated,

time-variant, and nonvolatile collection of data in

support of management’s decision-making process.”

W.H. Inmon

Data warehousing: the process of building and usinga datawarehouse

6

Lab of Data Science

Page 7: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Data warehouse servers

Relational DBMS (RDBMS)

With specialized index and optmizations

star-join query, bitmap index, partitioning, materialized views

We will study:

SQL Server 2016 with analytic SQL

MapReduce engine

Big data challenge

Architect (low-cost) data platform

Covered by 687AA «Distributed Data Analysis & Mining»

Lab of Data Science

7

Page 8: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Which DBMS for DW?8

Gartner Magic Quadrant – February 2016

Lab of Data Science

Page 9: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

BI Architecture

Lab of Data Science

9

6 lessons – data

access

4 lessons – data

quality & ETL

4 lessons – data

mining

5 lessons – OLAP

and reporting

1 lessons –

analytic SQL

Page 10: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Mid-tier servers

OnLine Analytical Processing (OLAP)

Provides a multidimensional view of data warehouses

Pre-compute aggregates

in ad-hoc structures (multidimensional OLAP - MOLAP)

in relational DB (relational OLAP - ROLAP)

in-memory OLAP

We will study:

SQL Server 2016 Analysis Services and MDX Query Language

10

Lab of Data Science

Page 11: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Mid-tier servers

Reporting Servers

Enable definition, efficient execution, and rendering of reports

Data is retrieved from datawarehouse or OLAP servers

We will study:

Microsoft Power BI

11

Lab of Data Science

Page 12: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Mid-tier servers

Data/web/text mining servers

Extract descriptive & predictive models from structured/graph/textual data

We will study:

WEKA: a data mining software in Python

12

0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

x

y

y < 0.33?

: 0

: 3

: 4

: 0

y < 0.47?

: 4

: 0

: 0

: 4

x < 0.43?

Yes

Yes

No

No Yes No

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

x

y

Lab of Data Science

Page 13: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Mid-tier servers

Enterprise Search Engine

Crawl, index and search by keywords over different types of data

Covered by 289AA «Information Retrieval»

13

Lab of Data Science

Page 14: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

BI Architecture

Lab of Data Science

14

6 lessons – data

access

4 lessons – data

quality & ETL

4 lessons – data

mining

5 lessons – OLAP

and reporting

1 lessons –

analytic SQL

Page 15: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Front-end applications

Applications through which users perform BI tasks

Spreadsheets

for navigating multidimensional data

We will study: Excel

Enterprise portals

for accessing reports and dashboards

for searching through query

GUI

for accessing mining models

for exploratory data analysis

for ad-hoc queries

Vertical packaged applications for CRM, Supply-Chain, Finance, Opinion mining …

More specialized tools for building storytellings to produce understandable stories to

presents information to the users.

Covered by 602AA «Visual Analytics»

Lab of Data Science

15

Page 16: LABORATORY OF DATA SCIENCE Business Intelligence Architecturesdidawiki.cli.di.unipi.it/lib/exe/fetch.php/mds/lbi/lds.02.bi_architectures.pdf · Front-end applications Applications

Front-end applications

Lab of Data Science

16