sanjay data warehouse
TRANSCRIPT
-
8/12/2019 Sanjay Data Warehouse
1/18
Data Warehousing and Data Mining Concepts in IT Industry
By: Sanjay Kaushik 078
-
Data Warehousing
A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process.
A data warehouse is a centralized repository that stores data from multiple information sources and transforms it into a common, multidimensional data model for efficient querying and analysis.
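One common way to realize a multidimensional data model is a star schema: a central fact table of measures surrounded by dimension tables. The following is a minimal sketch using SQLite from the Python standard library. The table and column names (fact_sales, dim_product, dim_store) and all values are assumptions made up for illustration; they are not from the slides.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, region   TEXT);
    -- The fact table holds the measure (amount) keyed by the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Music');
    INSERT INTO dim_store   VALUES (10, 'North'), (20, 'South');
    INSERT INTO fact_sales  VALUES (1, 10, 12.0), (1, 20, 8.0), (2, 10, 5.0);
""")

def sales_by(dimension_column):
    """Total sales grouped by one dimension attribute (category or region)."""
    allowed = {"category", "region"}
    if dimension_column not in allowed:
        raise ValueError(f"unknown dimension: {dimension_column}")
    query = f"""
        SELECT {dimension_column}, SUM(amount)
        FROM fact_sales
        JOIN dim_product USING (product_id)
        JOIN dim_store   USING (store_id)
        GROUP BY {dimension_column}
        ORDER BY {dimension_column}
    """
    return conn.execute(query).fetchall()

print(sales_by("category"))  # [('Books', 20.0), ('Music', 5.0)]
print(sales_by("region"))    # [('North', 17.0), ('South', 8.0)]
```

The same fact table answers both questions; only the dimension used for grouping changes.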
-
Architecture
-
Various ETL tools used in the market are:
Informatica
IBM DataStage
Oracle Warehouse Builder
Ab Initio
Data Junction
Microsoft SQL Server Integration Services
Transform On Demand
Transformation Manager
-
ETL tools are unified enterprise data integration platforms that allow companies and government organizations of all sizes to access, discover, and integrate data from virtually any business system, in any format, and deliver that data throughout the enterprise for query and reporting (i.e., business intelligence). ETL tools provide developers with an interface for designing source-to-target mappings, transformations, and job control parameters.
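To make the idea of a source-to-target mapping concrete, here is a minimal extract-transform-load sketch in plain Python. It does not use any of the commercial tools listed earlier; the CSV layout, the column names, and the orders target table are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: a CSV export from some business system.
SOURCE_CSV = """cust_nm,ord_amt,ord_dt
alice ,19.99,2019-08-01
BOB,5.00,2019-08-02
"""

# Source-to-target mapping: source column -> (target column, transformation).
MAPPING = {
    "cust_nm": ("customer_name", lambda v: v.strip().title()),
    "ord_amt": ("order_amount", float),
    "ord_dt": ("order_date", str),
}

def extract(text):
    """Read source rows as dictionaries keyed by source column name."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Rename columns and convert values according to MAPPING."""
    return [
        {target: convert(row[source]) for source, (target, convert) in MAPPING.items()}
        for row in rows
    ]

def load(rows, conn):
    """Write transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(customer_name TEXT, order_amount REAL, order_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (:customer_name, :order_amount, :order_date)", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
print(conn.execute("SELECT * FROM orders").fetchall())
```

Keeping the mapping in one data structure, separate from the extract and load steps, mirrors how ETL tools let developers design mappings without rewriting the job itself.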
-
OLAP allows business users to slice and dice data at will. Normally, data in an organization is distributed across multiple data sources that are incompatible with each other.
A retail example: point-of-sale data and sales made via the call center or the Web are stored in different locations and formats. It would be a time-consuming process for an executive to obtain OLAP reports such as: What are the most popular products purchased by customers between the ages of 15 and 30?
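The retail question above can be sketched in a few lines once the two sources are brought into a common format. The record layouts below (sku/cust_age for point of sale, product/customer.age for the Web) and the data are hypothetical.

```python
from collections import Counter

# Hypothetical records from two sources that use different field names.
pos_sales = [
    {"sku": "P1", "cust_age": 22},
    {"sku": "P2", "cust_age": 45},
    {"sku": "P1", "cust_age": 17},
]
web_sales = [
    {"product": "P2", "customer": {"age": 29}},
    {"product": "P1", "customer": {"age": 30}},
    {"product": "P3", "customer": {"age": 61}},
]

def unify(pos, web):
    """Bring both sources into one common (product, age) format."""
    rows = [(r["sku"], r["cust_age"]) for r in pos]
    rows += [(r["product"], r["customer"]["age"]) for r in web]
    return rows

def popular_products(rows, min_age, max_age):
    """Slice by age range, then count purchases per product."""
    counts = Counter(product for product, age in rows if min_age <= age <= max_age)
    return counts.most_common()

print(popular_products(unify(pos_sales, web_sales), 15, 30))  # [('P1', 3), ('P2', 1)]
```

In a warehouse, the unify step is done once during loading, so each report is only the slice and the count.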
OLTP systems are designed for optimal transaction speed. When a consumer makes a purchase online, they expect the transaction to occur instantaneously. With a database design, called a data model, optimized for transactions, the record 'Consumer Name, Address, Telephone, Order Number, Order Name, Price, Payment Method' is created quickly in the database, and the results can be recalled by managers equally quickly if needed.
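As a rough illustration of the OLTP pattern, the sketch below writes one order record in a single transaction and reads it back by its key. It uses SQLite only for convenience; the column names follow the record described above, and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_number   INTEGER PRIMARY KEY,
        consumer_name  TEXT NOT NULL,
        address        TEXT,
        telephone      TEXT,
        order_name     TEXT,
        price          REAL NOT NULL,
        payment_method TEXT
    )
""")

def place_order(order):
    """Insert one order atomically; the row is committed or rolled back as a whole."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO orders VALUES (:order_number, :consumer_name, :address, "
            ":telephone, :order_name, :price, :payment_method)",
            order,
        )

def recall_order(order_number):
    """Look up a single order by primary key, the typical fast OLTP read."""
    return conn.execute(
        "SELECT consumer_name, order_name, price FROM orders WHERE order_number = ?",
        (order_number,),
    ).fetchone()

place_order({
    "order_number": 1001, "consumer_name": "A. Customer", "address": "1 Main St",
    "telephone": "555-0100", "order_name": "Headphones", "price": 49.5,
    "payment_method": "card",
})
print(recall_order(1001))  # ('A. Customer', 'Headphones', 49.5)
```

Each operation touches a single row through its key, which is the opposite of the wide scans and aggregations typical of OLAP reports.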
-
Different OLAP tools in the market:
Oracle Enterprise BI Server: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/oracle-bi-enterprise-edition-obiee/
Microsoft BI & OLAP tools: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/microsoft-bi-business-intelligence-bi-tools/
IBM Cognos Series 10: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/ibm-cognos-business-intelligence/
QlikView: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/qlikview-bi-tool/
Board Management Intelligence Toolkit: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/board-management-intelligence-toolkit/
Hyperion System: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/hyperion-system-9/
SAP NetWeaver BI: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/sap-netweaver-bi/
MicroStrategy: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/microstrategy-bi-tool/
SAP Business Objects Enterprise Xir: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/sap-businessobjects/
SAS Enterprise BI: http://www.businessintelligencetoolbox.com/business-intelligence-vendors/sas-business-intelligence/
-
Data Mining
Data mining is part of the knowledge discovery process that offers a new way to look at data. Data mining consists of the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Data mining is then the process of discovering meaningful new correlations, patterns, and trends by sifting through vast amounts of data using statistical and mathematical techniques.
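As one small example of "discovering correlations" with a statistical technique, the sketch below computes the Pearson correlation for every pair of columns and reports the strong ones. The column names, the values, and the 0.8 threshold are assumptions chosen for illustration; real data mining combines many more techniques.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def strong_correlations(table, threshold=0.8):
    """Return column pairs whose absolute correlation meets the threshold."""
    found = []
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            found.append((a, b, round(r, 3)))
    return found

# Hypothetical columns, made up for illustration.
data = {
    "ad_spend": [1, 2, 3, 4, 5],
    "revenue": [2, 4, 6, 8, 10],
    "returns": [5, 3, 6, 2, 4],
}
print(strong_correlations(data))  # [('ad_spend', 'revenue', 1.0)]
```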
-
Different phases:
1: Exploration.
2: Model building and validation.
3: Deployment.
Different data mining tools in the market:
AlphaBlox (http://www.alphablox.com/)
Tanagra
CART
Darwin
SPSS
And many more
-
Case Study: Insurance Company
Specifically, data mining can help insurance firms in business practices such as:
Optimizing products and pricing.
Acquiring new customers.
Retaining existing customers.
Performing sophisticated campaign management.
Detecting fraudulent claims.
Estimating outstanding loss reserve.
-
Taking the case of detecting fraudulent claims:
Fraudulent claims are an ever-present problem for insurance firms, and techniques for identifying and mitigating fraud are critical for their long-term success. Quite often, successful fraud detection analyses, such as those from a data mining project, can provide a very high return on investment.
-
Insurance companies around the world lose more and more money through fraudulent claims each year. They need to recoup this lost money so they can continue providing superior services for their customers.
Fraudulent claims are typically not the biggest claims, because perpetrators are well aware that the big claims are scrutinized more rigorously than average claims. To find fraudulent claims, analysts must look for unusual associations, anomalies, or outlying patterns in the data. Specific analytical techniques adept at finding such subtleties are social network link analysis, market basket analysis, cluster analysis, and predictive modeling. For example, Informatica uses SPSS to segment customer data by uncovering certain relationships between data sets, which are red flags for fraud-related losses.
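A very reduced form of link analysis looks for "unusual associations" such as several supposedly unrelated claimants sharing a phone number or repair garage. The sketch below only illustrates that idea; the claim fields and values are hypothetical, and it is not how SPSS or any specific product implements link analysis.

```python
from collections import defaultdict

# Hypothetical claims; every value here is made up for illustration.
claims = [
    {"claim_id": 1, "claimant": "A", "phone": "555-0101", "garage": "G1"},
    {"claim_id": 2, "claimant": "B", "phone": "555-0101", "garage": "G1"},
    {"claim_id": 3, "claimant": "C", "phone": "555-0199", "garage": "G2"},
    {"claim_id": 4, "claimant": "D", "phone": "555-0101", "garage": "G3"},
]

def shared_links(records, attribute, min_claimants=2):
    """Find attribute values shared by several distinct claimants."""
    claimants_by_value = defaultdict(set)
    for record in records:
        claimants_by_value[record[attribute]].add(record["claimant"])
    return {
        value: sorted(claimants)
        for value, claimants in claimants_by_value.items()
        if len(claimants) >= min_claimants
    }

print(shared_links(claims, "phone"))   # {'555-0101': ['A', 'B', 'D']}
print(shared_links(claims, "garage"))  # {'G1': ['A', 'B']}
```

None of these claims needs to be large to stand out; the shared contact detail is the anomaly.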
-
Discover small subsets of claims with a high percentage of recoverable fraud?
Isolate the factors that indicate a claim or payment request has a high probability of fraudulence?
Develop rules and use them to flag only those claims or requests most likely to be fraudulent?
Ensure your adjusters could review claims or requests that are not only likely to be fraudulent but also have the greatest adjustment potential?
Capitalize on existing data
Previously audited claims hold the key to recouping money in the future. By creating models from historical information, we can accurately pinpoint fraudulent claims.
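The ideas of flagging claims with rules and sending adjusters the ones with the greatest adjustment potential can be sketched as a simple point score. The rules, point values, field names, and the 50-point cutoff below are all hypothetical; in practice they would be derived from previously audited claims.

```python
# Hypothetical rules and point values; in practice they would come from
# patterns found in previously audited claims.
RULES = [
    ("filed soon after policy start", lambda c: c["days_since_policy_start"] < 30, 40),
    ("no police report", lambda c: not c["police_report"], 30),
    ("several prior claims", lambda c: c["prior_claims"] >= 3, 30),
]

def fraud_score(claim):
    """Sum the points of all rules the claim triggers (0 to 100)."""
    return sum(points for _, test, points in RULES if test(claim))

def review_queue(claims, min_score=50):
    """Flag likely-fraudulent claims, largest score-weighted amount first."""
    flagged = [c for c in claims if fraud_score(c) >= min_score]
    return sorted(flagged, key=lambda c: fraud_score(c) * c["amount"], reverse=True)

claims = [
    {"id": "C1", "amount": 9000, "days_since_policy_start": 10,
     "police_report": False, "prior_claims": 0},
    {"id": "C2", "amount": 2000, "days_since_policy_start": 400,
     "police_report": True, "prior_claims": 1},
    {"id": "C3", "amount": 15000, "days_since_policy_start": 200,
     "police_report": False, "prior_claims": 4},
]
print([c["id"] for c in review_queue(claims)])  # ['C3', 'C1']
```

C1 has the higher score, but C3 is reviewed first because its score-weighted amount is larger.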
-
Steps for data mining: Building models to find fraudulent claims
1: Understand your data
2: Determine your population makeup
3: Discover relationships in your data
4: Build a model
5: Use the model against actual records
6: Compare your subset to the entire population
7: Strategically deploy your data mining results for optimum success
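The model-building and comparison steps can be sketched with a deliberately tiny "model": the historical fraud rate per value of one feature. The channel feature, the data, and the 0.5 cutoff are hypothetical. For brevity the subset is drawn from the same audited claims used to build the model; a real project would score separate, held-out records.

```python
from collections import defaultdict

def build_model(audited, feature):
    """Learn the historical fraud rate for each value of one feature."""
    totals, frauds = defaultdict(int), defaultdict(int)
    for claim in audited:
        totals[claim[feature]] += 1
        frauds[claim[feature]] += claim["fraud"]
    return {value: frauds[value] / totals[value] for value in totals}

def select_subset(claims, model, feature, min_rate):
    """Keep claims whose feature value had at least min_rate fraud historically."""
    return [c for c in claims if model.get(c[feature], 0.0) >= min_rate]

def fraud_rate(claims):
    """Share of claims labeled as fraud."""
    return sum(c["fraud"] for c in claims) / len(claims)

# Hypothetical audited history: fraud is 1 for confirmed fraud, else 0.
audited = [
    {"channel": "agent", "fraud": 0}, {"channel": "agent", "fraud": 0},
    {"channel": "agent", "fraud": 0}, {"channel": "agent", "fraud": 1},
    {"channel": "online", "fraud": 1}, {"channel": "online", "fraud": 1},
    {"channel": "online", "fraud": 0}, {"channel": "online", "fraud": 1},
]
model = build_model(audited, "channel")
subset = select_subset(audited, model, "channel", min_rate=0.5)
# Lift: how much more fraud the subset contains than the whole population.
lift = fraud_rate(subset) / fraud_rate(audited)
print(model, len(subset), lift)
```

A lift above 1 means the selected subset is richer in fraud than the population as a whole, which is the point of comparing the two before deployment.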