3rd International Conference on Computer, Communication, Control and Information Technology 2015

AN INTEGRATED APPROACH TO DEPLOY DATA WAREHOUSE IN BUSINESS INTELLIGENCE ENVIRONMENT

Ghosh R; Halder S; Sen S; “AN INTEGRATED APPROACH TO DEPLOY DATA WAREHOUSE IN BUSINESS INTELLIGENCE ENVIRONMENT”, C3IT 2015

Data, Information, Knowledge

• Data
  – Items that are the most elementary descriptions of things, events, activities, and transactions
  – May be internal or external

• Information
  – Organized data that has meaning and value

• Knowledge
  – Processed data or information that conveys understanding or learning applicable to a problem or activity

Operations on Data

• Activities performed by end users in online systems
  – Specific, open-ended query generation
    • SQL
  – Ad hoc reports
  – Statistical analysis
  – Building DSS applications

• Modeling and visualization capabilities

• Special class of tools
  – DSS/BI/BA front ends
  – Data access front ends
  – Database front ends
  – Visual information access systems

Business Intelligence and Analytics

• Business intelligence
  – Acquisition of data and information for use in decision-making activities

• Business analytics
  – Models and solution methods

• Data mining
  – Applying models and methods to data to identify patterns and trends

Data Warehouse

• Subject oriented

• Scrubbed so that data from heterogeneous sources are standardized

• Time series; no current status

• Nonvolatile
  – Read only

• Summarized

• Not normalized; may be redundant

• Data from both internal and external sources is present

• Metadata included
  – Data about data
    • Business metadata
    • Semantic metadata

From data processing to BI…

• High-level business processing integrates business intelligence with the decision-making process.

• The decision-making process is an outcome of analytical processing that involves OLAP-based data warehousing.

• Analytical processing is performed on transactional data of substantial size, where the input data may come from heterogeneous sources.

• Integrating data from these heterogeneous sources also involves data cleaning.

• Data are processed through different cognitive processes to generate information or a knowledge base.

• The knowledge base can be analyzed to improve and optimize the decisions and performance of a business organization; this is often termed business intelligence.

Proposed Architecture

• Analytical processing plays a major role in business analysis.

• An integrated architecture is proposed to provide business intelligence; it covers integration of data from heterogeneous sources, ETL, data warehouse, data mining, virtual data warehouse, query analyzer, etc.

• The architecture is designed to work in a distributed environment, so it generates knowledge and the corresponding business intelligence both locally and globally.

[Figure 1, diagram not reproduced. Labels: External Data Sources (Flat Files, Semi-Structured Files, Data in wave format, Relational Files, XML Files); Clean; Integrate to RDBMS; ETL; Data Center Mapper; Data Center 1 … Data Center N, each with DW, DM, VDW, Local B.I. Analyzer, Report, Knowledge; Integration of Report from all local sites; Global B.I. Analyzer; Global Knowledge; Global Report; Upgradation of DW, DM & VDW requirement; Query Center Mapper; Set of query; Local Query; Global Query; ROLAP; B.I. Interface; Client Layer Application; User]

Fig. 1: Proposed Architecture to Integrate Data Warehouse with Business Intelligence

Components of the Architecture

• External Data Sources

• Cleaning

• Integrate to RDBMS

• ETL (Extract-Transform-Load)

• Data Center and Data Center Mapper

• Business Intelligence Interface

• Local Business Intelligence Analyzer

• Integration of Report from all Local sites

• Global Business Intelligence Analyzer

• Query Center Mapper

• Client Layer Application

Cleaning

• Data cleaning is the process of detecting and correcting corrupted or inaccurate records in a record set, table, or database.

• The major challenge is working with heterogeneous data sources that contain erroneous data.

• Cleaning algorithms should have low time complexity, because the input data is huge in the majority of database applications.
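
A minimal sketch of the cleaning step, assuming records arrive as Python dicts. The field names (`customer`, `amount`), the validity rules, and the `clean_records` helper are hypothetical and are not taken from the slides. The function makes a single pass over the input, so its running time is linear in the number of records, which is one way to meet the low-time-complexity requirement above.

```python
from typing import Iterable


def clean_records(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into (cleaned, rejected) in a single O(n) pass.

    Hypothetical rules: 'customer' must be a non-empty string and
    'amount' must parse as a non-negative number.
    """
    cleaned, rejected = [], []
    seen = set()  # keys of records already accepted, for duplicate detection
    for rec in records:
        name = str(rec.get("customer") or "").strip()
        try:
            amount = float(rec.get("amount"))
        except (TypeError, ValueError):
            rejected.append(rec)  # amount is missing or not numeric
            continue
        if not name or amount < 0:
            rejected.append(rec)  # detected as inaccurate, cannot be corrected
            continue
        # Correct what can be corrected: whitespace, casing, precision.
        fixed = {"customer": name.title(), "amount": round(amount, 2)}
        key = (fixed["customer"], fixed["amount"])
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        cleaned.append(fixed)
    return cleaned, rejected


raw = [
    {"customer": " alice ", "amount": "12.50"},
    {"customer": "Alice", "amount": 12.5},   # duplicate after normalization
    {"customer": "", "amount": 3},           # missing customer name
    {"customer": "bob", "amount": "n/a"},    # non-numeric amount
]
cleaned, rejected = clean_records(raw)
```

Rejected records are returned rather than discarded so that they can be inspected or repaired at the source.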

Integrate to RDBMS

• Integration combines the cleaned and corrected data from heterogeneous sources and provides users with a unified form of data.

• XML data to RDBMS platform

• RDF/RDFS to relational database
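
As an illustration of the XML-to-RDBMS case, the sketch below maps each XML element to one row of a relational table using only the Python standard library (`xml.etree.ElementTree` and `sqlite3`). The XML layout, the `sale` table, and its columns are assumptions made for the example; the slides do not specify a schema or a target database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML export from one external source.
XML_DOC = """
<sales>
  <sale id="1"><region>East</region><amount>120.5</amount></sale>
  <sale id="2"><region>West</region><amount>80</amount></sale>
</sales>
"""


def load_xml_into_rdbms(xml_text: str, conn: sqlite3.Connection) -> int:
    """Map each <sale> element to one relational row; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sale "
        "(id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    root = ET.fromstring(xml_text.strip())
    rows = [
        (int(el.get("id")), el.findtext("region"), float(el.findtext("amount")))
        for el in root.iter("sale")
    ]
    conn.executemany("INSERT INTO sale (id, region, amount) VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
inserted = load_xml_into_rdbms(XML_DOC, conn)
```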

ETL (Extract-Transform-Load)

• ETL generates analytical data suitable for ROLAP.

• The main issues in this process are workload, effort, and cost.
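
A small sketch of what "analytical data suitable for ROLAP" could look like in practice: source rows are extracted, normalized, and loaded into a star schema (one dimension table, one fact table), which is then aggregated with plain SQL. The table names `dim_region` and `fact_sales`, the input tuple layout, and the use of SQLite are all assumptions for illustration.

```python
import sqlite3


def run_etl(conn: sqlite3.Connection, source_rows: list[tuple[str, str, float]]) -> None:
    """Load (date, region, amount) rows into a small hypothetical star schema."""
    conn.executescript(
        """
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (sale_date TEXT, region_id INTEGER, amount REAL);
        """
    )
    for sale_date, region, amount in source_rows:  # extract
        name = region.strip().upper()              # transform: normalize the key
        conn.execute("INSERT OR IGNORE INTO dim_region (name) VALUES (?)", (name,))
        (region_id,) = conn.execute(
            "SELECT region_id FROM dim_region WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(                              # load
            "INSERT INTO fact_sales VALUES (?, ?, ?)", (sale_date, region_id, amount)
        )
    conn.commit()


def sales_by_region(conn: sqlite3.Connection) -> list[tuple[str, float]]:
    """ROLAP-style aggregation: a relational GROUP BY over the fact table."""
    return conn.execute(
        """
        SELECT d.name, SUM(f.amount)
        FROM fact_sales AS f JOIN dim_region AS d USING (region_id)
        GROUP BY d.name ORDER BY d.name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn, [
    ("2015-01-05", "east", 100.0),
    ("2015-01-06", " East", 50.0),
    ("2015-01-06", "west", 70.0),
])
totals = sales_by_region(conn)
```

The row-by-row loop keeps the example readable; a production ETL job would normally batch the inserts, which is where the workload and cost concerns noted above come in.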

Data Center and Data Center Mapper

• A data center holds the data of its associated DW, DM and VDW.

• The data center mapper is responsible for managing this mapping process, taking into account distribution of data, load balancing, and fault tolerance.
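
The slides do not describe the mapper's algorithm. Purely as an illustration of how the three concerns could interact, the sketch below places each data partition on the least-loaded data center (load balancing) among those currently marked healthy (fault tolerance), which spreads the data across sites (distribution). `DataCenter`, `DataCenterMapper`, and `assign` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    name: str
    healthy: bool = True
    partitions: list[str] = field(default_factory=list)


class DataCenterMapper:
    """Hypothetical mapper: least-loaded placement that skips failed centers."""

    def __init__(self, centers: list[DataCenter]) -> None:
        self.centers = centers

    def assign(self, partition: str) -> str:
        # Fault tolerance: only consider data centers that are up.
        candidates = [c for c in self.centers if c.healthy]
        if not candidates:
            raise RuntimeError("no healthy data center available")
        # Load balancing: pick the center holding the fewest partitions.
        target = min(candidates, key=lambda c: len(c.partitions))
        target.partitions.append(partition)
        return target.name


centers = [DataCenter("dc1"), DataCenter("dc2", healthy=False), DataCenter("dc3")]
mapper = DataCenterMapper(centers)
placement = [mapper.assign(p) for p in ["sales_2014", "sales_2015", "hr_2015"]]
```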

Business Intelligence Interface

• This interface lets users interact with the proposed architecture. Users work through it whenever they want to analyze their business for future planning.

Local Business Intelligence Analyzer

• Local Business Intelligence Analyzer is deployed in every data center.

• The Local Business Intelligence Analyzer processes the data locally to generate the knowledge and report associated with that specific data center only.

Integration of Report from all Local sites

• This module integrates the local reports from all the data centers.

• This integrated report is then passed to the global business intelligence analyzer module for the generation of global knowledge and report.
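
A minimal sketch of the integration step, assuming each local report is a mapping from a measure name to a number and that integration means summing each measure across data centers. Both the report shape and the summing rule are assumptions; the slides do not say how local reports are combined.

```python
from collections import defaultdict


def integrate_reports(local_reports: dict[str, dict[str, float]]) -> dict[str, float]:
    """Combine per-data-center reports into one report by summing each measure."""
    combined: dict[str, float] = defaultdict(float)
    for report in local_reports.values():
        for measure, value in report.items():
            combined[measure] += value
    return dict(combined)


# Hypothetical local reports from two data centers.
local_reports = {
    "dc1": {"revenue": 1200.0, "orders": 30},
    "dc2": {"revenue": 800.0, "orders": 25, "returns": 2},
}
global_input = integrate_reports(local_reports)
```

A measure that appears in only some local reports (here `returns`) is carried through unchanged, so the global analyzer sees the union of all measures.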

Global Business Intelligence Analyzer

• Works with multiple data centers.

• Receives the knowledge and reports generated by the individual data centers.

• Analyzes all these reports to identify further business demands and new processes.

• Prepares a report set on the upgrades required for the associated DW, DM and VDW.

Query Center Mapper

• User-given queries are analyzed by the query center mapper to find the data centers from which the data should be retrieved.

• Queries are classified as local queries or global queries.

• Data for a local query is retrieved from an individual data center, whereas data for a global query is retrieved from multiple data centers.
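
The local/global distinction can be sketched as follows, assuming the mapper keeps a catalog of which data centers hold which tables. The `catalog` structure and the `classify_query` helper are hypothetical; the slides do not state how the classification is implemented.

```python
def classify_query(tables: set[str], catalog: dict[str, set[str]]) -> tuple[str, set[str]]:
    """Return ('local' or 'global', data centers to contact) for a query.

    `catalog` is a hypothetical lookup from table name to the data
    centers that hold that table.
    """
    if not tables:
        raise ValueError("query references no tables")
    unknown = tables - set(catalog)
    if unknown:
        raise KeyError(f"no data center holds: {sorted(unknown)}")
    # Data centers that hold every table the query needs.
    common = set.intersection(*(catalog[t] for t in tables))
    if common:
        # One center can answer alone; pick one deterministically.
        return "local", {min(common)}
    # Otherwise data must be gathered from several centers.
    return "global", set.union(*(catalog[t] for t in tables))


catalog = {"sales": {"dc1", "dc2"}, "customers": {"dc1"}, "inventory": {"dc3"}}
local = classify_query({"sales", "customers"}, catalog)
global_ = classify_query({"sales", "inventory"}, catalog)
```

In this sketch a global query contacts every center holding any required table; a real mapper could narrow that set further.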

Client Layer Application

• Lets users interact with the proposed integrated data warehouse-BI architecture.

• User-given queries are received through this layer and fed into the system.

• The query center mapper then takes care of processing the queries; after the analysis, the users receive the results through the B.I. Interface layer.

Conclusion

• This architecture integrates business intelligence into a data warehouse environment, along with reporting, knowledge generation, query processing, data center management, etc.

• It is deployed in a distributed environment.

• Future work on this architecture includes deployment in a cloud environment to provide the business intelligence service under the cloud computing paradigm.

• In addition, a service-oriented architecture (SOA) could be incorporated over this architecture to provide Business Intelligence as a Service (BIaaS).

Thank You
