Introduction to Data Mining and Business Intelligence
Why Mine Data? Commercial Viewpoint
Lots of data is being collected and warehoused
• Web data, e-commerce
• Purchases at department/grocery stores
• Bank/credit card transactions
Computers have become cheaper and more powerful
Competitive pressure is strong
• Provide better, customized services for an edge (e.g. in Customer Relationship Management)
Why Mine Data? Scientific Viewpoint
Data collected and stored at enormous speeds (GB/hour)
• remote sensors on a satellite
• telescopes scanning the skies
• microarrays generating gene expression data
• scientific simulations generating terabytes of data
Traditional techniques infeasible for raw data
Data mining may help scientists
• in classifying and segmenting data
• in hypothesis formation
Mining Large Data Sets - Motivation
There is often information “hidden” in the data that is not readily evident
Human analysts may take weeks to discover useful information
Much of the data is never analyzed at all
[Figure: The Data Gap. Total new disk (TB) since 1995 compared with the number of analysts, 1995 to 1999; vertical axis from 0 to 4,000,000.]
From: R. Grossman, C. Kamath, V. Kumar, “Data Mining for Scientific and Engineering Applications”
What is business intelligence?
BUSINESS INTELLIGENCE
Business intelligence (BI) – applications and technologies used to gather, provide access to, and analyze data and information to support decision-making efforts
The Problem: Data Rich, Information Poor
Businesses face a data explosion as digital images, email in-boxes, and broadband connections double by 2010
The amount of data generated is doubling every year
Some believe it will soon double monthly
The Solution: Business Intelligence
Improving the quality of business decisions has a direct impact on costs and revenue
BI systems and tools help create an agile, intelligent enterprise
The Solution: Business Intelligence
BI enables business users to receive data for analysis that is:
• Reliable
• Consistent
• Understandable
• Easily manipulated
The Solution: Business Intelligence
BI can answer tough customer questions
What is data mining?
DATA MINING
Data mining (knowledge discovery from data)
• Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data
What is Data Mining?
Many Definitions
• Non-trivial extraction of implicit, previously unknown and potentially useful information from data
• Exploration & analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns
What is (not) Data Mining?
What is Data Mining?
• Certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in the Boston area)
• Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
What is not Data Mining?
• Looking up a phone number in a phone directory
• Querying a Web search engine for information about “Amazon”
DATA MINING
Data-mining tools – use a variety of techniques to find patterns and relationships in large volumes of information
• Clustering
• Classification
• Affinity grouping (Association Detection)
• Statistical Estimation and Prediction
Cluster Analysis
Cluster analysis – a technique used to divide an information set into mutually exclusive groups such that the members of each group are as close to one another as possible and the different groups are as far apart as possible
CRM systems depend on cluster analysis to segment customer information and identify behavioral traits
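As a rough illustration of dividing records into mutually exclusive groups, the sketch below runs a tiny k-means loop, which is one common clustering method and not necessarily the one the slides have in mind. The customer data, the two features, and the function name `k_means` are all hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+

def k_means(points, centroids, iterations=10):
    """Tiny k-means: assign each point to its nearest centroid, move each
    centroid to the mean of its members, and repeat until nothing changes."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: every point lands in exactly one cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customers: (visits per month, average basket in dollars)
customers = [(1, 20), (2, 25), (1, 22), (9, 95), (10, 100), (8, 90)]
centroids, clusters = k_means(customers,
                              centroids=[customers[0], customers[3]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

With this toy data the occasional small-basket shoppers and the frequent large-basket shoppers end up in separate segments, which is the kind of split a CRM system might use. Real data would usually need feature scaling and a more careful choice of starting centroids.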
Classification
Classification – finds a model to categorize input information into several pre-defined groups.
E.g. classification of credit card approval applications, classification of documents, etc.
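A minimal sketch of the idea, assuming a nearest-centroid model: the model is learned from past applications that already carry one of the pre-defined labels, and a new application receives the label whose average profile it most resembles. The applicant features, the labels, and the function names `train` and `classify` are hypothetical, and real credit scoring uses far richer models.

```python
from collections import defaultdict
from math import dist

def train(examples):
    """Build the model: one mean feature vector (centroid) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in grouped.items()}

def classify(model, features):
    """Return the pre-defined label whose centroid is closest."""
    return min(model, key=lambda label: dist(features, model[label]))

# Hypothetical history: (annual income, existing debt), both in thousands
history = [((80, 5), "approve"), ((95, 10), "approve"), ((70, 8), "approve"),
           ((25, 30), "reject"), ((30, 40), "reject"), ((20, 25), "reject")]
model = train(history)

print(classify(model, (85, 7)))   # approve
print(classify(model, (22, 35)))  # reject
```

Unlike clustering, the groups here are fixed in advance by the labels in the training data; the model only decides which existing group a new record belongs to.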
Association Detection
Association detection – reveals the degree to which variables are related and the nature and frequency of these relationships in the information
• Market basket analysis
• E.g. beer and diapers were often purchased together, so move them closer together
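The frequency and strength of such a relationship are often summarized with support, confidence, and lift. These measures are standard in market basket analysis but are not named in the slides, so treat the sketch below as an assumption about how the beer-and-diapers pattern might be quantified. The baskets and the function name `rule_metrics` are made up for illustration.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    n = len(baskets)
    has_a = sum(1 for b in baskets if antecedent in b)
    has_c = sum(1 for b in baskets if consequent in b)
    has_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    support = has_both / n                            # how often both appear
    confidence = has_both / has_a if has_a else 0.0   # P(consequent | antecedent)
    lift = confidence / (has_c / n) if has_c else 0.0  # > 1: bought together
    return support, confidence, lift                  # more often than by chance

# Hypothetical shopping baskets
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "diapers", "milk"},
    {"milk", "bread"},
    {"bread", "chips"},
]
support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
print(support, confidence, lift)  # 0.5 0.75 1.5
```

A lift above 1 suggests the two items appear together more often than their individual popularity would predict, which is the signal behind placing them closer together.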
Statistical Analysis
Statistical analysis – performs such functions as information correlations, distributions, calculations, and variance analysis
• Forecast – predictions made on the basis of time-series information
• Time-series information – time-stamped information collected at a particular frequency
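As a small sketch of forecasting from time-series information, the example below predicts the next value as the average of the most recent observations. A moving average is only one simple forecasting method chosen here for illustration; the monthly sales figures and the function name `moving_average_forecast` are hypothetical.

```python
from statistics import mean

def moving_average_forecast(values, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(values) < window:
        raise ValueError("need at least `window` observations")
    return mean(values[-window:])

# Hypothetical time-stamped data collected at a monthly frequency
monthly_sales = [
    ("2024-01", 120), ("2024-02", 130), ("2024-03", 125),
    ("2024-04", 140), ("2024-05", 150), ("2024-06", 145),
]
values = [amount for _, amount in monthly_sales]
forecast = moving_average_forecast(values)
print(forecast)  # 145, the mean of 140, 150, and 145
```

A larger window smooths out noise but reacts more slowly to recent changes, so the window size is a trade-off that depends on the data.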