introduction to data warehousing and mining
DESCRIPTION
data warehousing and mining introduction class from kl universityTRANSCRIPT
DATAWAREHOUSING AND MINING
BYG.RAJESH CHANDRA
EVOLUTION OF DATABASE TECHNOLOGY 1960s (Primitive File Processing)
Data collection, database creation, IMS and network DBMS 1970s to early 1980s (DBMS)
Relational data model, relational DBMS implementation ,SQL, OLTP,User Interfaces.etc
1980s: to Present (Advanced Data Bases) RDBMS, advanced data models (extended-relational, OO,
deductive, etc.) Application-oriented DBMS (spatial, scientific, engineering, etc.)
1990s: (Advanced Data Analysis) Data mining, data warehousing, multimedia databases, and Web
databases 2000s
Stream data management and mining Data mining and its applications Web technology (XML, data integration) and global information
systems
WHY MINE DATA? COMMERCIAL VIEWPOINT
Lots of data is being collected and warehoused Web data, e-commerce purchases at department/
grocery stores Bank/Credit Card
transactions
Competitive Pressure is Strong Provide better, customized services for an edge (e.g. in
Customer Relationship Management)
WHAT IS DATA MINING…..? Data mining (sometimes called data
Discovery or Knowledge Discovery Data) is the process of analyzing data from different perspectives and summarizing it into useful information.
• Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amount of data
WHY MINE DATA? SCIENTIFIC VIEWPOINT Data collected and stored at
enormous speeds (GB/hour) remote sensors on a satellite telescopes scanning the skies microarrays generating gene
expression data scientific simulations
generating terabytes of data Traditional techniques infeasible for
raw data Data mining may help scientists
in classifying and segmenting data in Hypothesis Formation
EXAMPLES: WHAT IS (NOT) DATA MINING?
What is not Data Mining?
– Look up phone number in phone directory
– Query a Web
search engine for information about “Amazon”
What is Data Mining?
– Certain names are more prevalent in certain US locations (O’Brien, O’Rurke, O’Reilly… in Boston area)
– Group together similar documents returned by search engine according to their context (e.g. Amazon rainforest, Amazon.com,)
DATA MINING IS ALSO CALLED AS..?
• Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
• Real Time Example Gold Mining
DATA WARE HOUSE = COLLECTION OF DATA BASES
WE HAVE TO USE DIFFERENT METHODS
RAW DATA =DATA BASES + NOISE DATA
DATA SELECTION AND TRANSFORMATION
DATA CLEANING AND INTEGRATION
DATA MINING
PATTERN EVALUATION
KNOWLEDGE REPRASENTATION
KNOWLEDGE REPRASENTATION
April 9, 2023
KNOWLEDGE DISCOVERY (KDD) PROCESS
Data mining—core of knowledge discovery process
Data Cleaning
Data Integration
Databases
Data Warehouse
Task-relevant Data
Selection
Data Mining
Pattern Evaluation