morgan-kaufmann-jiawei-han-micheline-kamber-datamining
TRANSCRIPT
![Page 1: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/1.jpg)
![Page 2: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/2.jpg)
![Page 3: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/3.jpg)
![Page 4: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/4.jpg)
Data collection and database creation (1960s and earlier)
- primitive file processing
Database management systems (1970s)
- network and relational database systems
- data modeling tools
- indexing and data organization techniques
- query languages and query processing
- user interfaces
- optimization methods
- on-line transactional processing (OLTP)
Advanced database systems (mid-1980s - present)
- advanced data models: extended-relational, object-oriented, object-relational
- application-oriented: spatial, temporal, multimedia, active, scientific, knowledge bases, World Wide Web
Data warehousing and data mining (late-1980s - present)
- data warehouse and OLAP technology
- data mining and knowledge discovery
New generation of information systems (2000 - ...)
![Page 5: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/5.jpg)
How can I analyze this data?
![Page 6: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/6.jpg)
[Figure: Knowledge as gold nuggets dug out of a mountain of data with a shovel, a pick, and beads of sweat]
![Page 7: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/7.jpg)
[Figure: databases and flat files -> Cleaning & Integration -> data warehouse -> Selection & Transformation -> Data Mining -> patterns -> Evaluation & Presentation -> knowledge]
![Page 8: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/8.jpg)
[Figure: Database and Data Warehouse (data cleaning, data integration, filtering) -> Database or Data Warehouse Server -> Data Mining Engine -> Pattern Evaluation -> Graphic User Interface, alongside a Knowledge Base]
![Page 9: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/9.jpg)
![Page 10: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/10.jpg)
[Figure: data sources in Chicago, New York, and Vancouver -> clean, transform, integrate, load -> data warehouse -> query and analysis tools -> clients]
![Page 11: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/11.jpg)
[Figure: a multidimensional data cube]
a) Cube with dimensions address (cities: Chicago, New York, Montreal, Vancouver), time (quarters: Q1, Q2, Q3, Q4), and item (types: home entertainment, computer, phone, security). Values shown: 605K, 825K, 14K, 400K. Marked cell: <Vancouver, Q1, security>.
b) Drill-down on time data for Q1: time (months: Jan, Feb, March) by address (cities) by item (types); values shown: 150K, 100K, 150K. Roll-up on address: address (regions: North, South, East, West) by time (quarters) by item (types).
![Page 12: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/12.jpg)
![Page 13: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/13.jpg)
![Page 14: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/14.jpg)
![Page 15: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/15.jpg)
![Page 16: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/16.jpg)
![Page 17: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/17.jpg)
![Page 18: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/18.jpg)
[Figure: Data mining and its related disciplines: Database Systems, Statistics, Machine Learning, Information Science, Visualization, Other disciplines]
![Page 19: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/19.jpg)
![Page 20: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/20.jpg)
![Page 21: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/21.jpg)
![Page 22: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/22.jpg)
![Page 23: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/23.jpg)
![Page 24: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/24.jpg)
![Page 25: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/25.jpg)
![Page 26: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/26.jpg)
![Page 27: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/27.jpg)
![Page 28: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/28.jpg)
![Page 29: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/29.jpg)
![Page 30: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/30.jpg)
![Page 31: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/31.jpg)
![Page 32: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/32.jpg)
![Page 33: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/33.jpg)
![Page 34: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/34.jpg)
[Figure: a 3-D data cube of sales with dimensions time (quarters: Q1, Q2, Q3, Q4), item (types: home entertainment, computer, phone, security), and location (cities: Chicago, New York, Montreal, Vancouver); values shown include 605K, 825K, 14K, 400K]
[Figure: a 4-D view drawn as three such cubes, one each for supplier = "SUP1", supplier = "SUP2", and supplier = "SUP3"]
![Page 35: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/35.jpg)
[Figure: lattice of cuboids for the dimensions time, item, location, and supplier]
- 0-D (apex) cuboid: all
- 1-D cuboids: time; item; location; supplier
- 2-D cuboids: time, item; time, location; time, supplier; item, location; item, supplier; location, supplier
- 3-D cuboids: time, item, location; time, item, supplier; time, location, supplier; item, location, supplier
- 4-D (base) cuboid: time, item, location, supplier
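The lattice above is simply every subset of the four dimensions, grouped by size. As a minimal sketch of that idea (the function name `cuboid_lattice` is an illustrative choice, not something from the source), the subsets can be enumerated with Python's standard `itertools.combinations`:

```python
from itertools import combinations

# Dimension names as they appear in the lattice above.
DIMENSIONS = ("time", "item", "location", "supplier")


def cuboid_lattice(dimensions):
    """Group every subset of `dimensions` by its number of dimensions.

    Key 0 holds the apex cuboid (the empty tuple, i.e. "all");
    key len(dimensions) holds the base cuboid.
    """
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }


lattice = cuboid_lattice(DIMENSIONS)
for k, cuboids in lattice.items():
    print(f"{k}-D: {len(cuboids)} cuboid(s)")

# Total number of cuboids across all levels.
total = sum(len(cuboids) for cuboids in lattice.values())
```

With four dimensions and no concept hierarchies, the levels hold 1, 4, 6, 4, and 1 cuboids, so `total` is 16, which is 2 to the power 4.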
[Figure: star schema with a central Sales Fact table and four dimension tables]
- Sales Fact: time_key, item_key, branch_key, location_key, dollars_sold, units_sold
- Time Dimension: time_key, day, day_of_week, month, quarter, year
- Item Dimension: item_key, item_name, brand, type, supplier_type
- Branch Dimension: branch_key, branch_name, branch_type
- Location Dimension: location_key, street, city, province_or_state, country
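A query against the schema above joins the fact table to the dimension tables it references and aggregates a measure. The following is a minimal sketch, assuming an in-memory SQLite database through Python's standard `sqlite3` module. The table names (`sales_fact`, `time_dim`, `item_dim`) are hypothetical, only two of the four dimensions and a few of their attributes are kept, and every row is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed-down version of the schema: one fact table, two dimension tables.
# Table names and all rows are assumptions made for this sketch.
conn.executescript("""
CREATE TABLE time_dim (
    time_key INTEGER PRIMARY KEY,
    quarter  TEXT,
    year     INTEGER
);
CREATE TABLE item_dim (
    item_key  INTEGER PRIMARY KEY,
    item_name TEXT,
    type      TEXT
);
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    dollars_sold REAL,
    units_sold   INTEGER
);
INSERT INTO time_dim VALUES (1, 'Q1', 1999), (2, 'Q2', 1999);
INSERT INTO item_dim VALUES (10, 'laptop', 'computer'), (20, 'handset', 'phone');
INSERT INTO sales_fact VALUES
    (1, 10, 1000.0, 2),
    (1, 20,  150.0, 3),
    (2, 10,  500.0, 1);
""")

# Join the fact table to both dimensions and total dollars_sold
# per (quarter, item type).
rows = conn.execute("""
SELECT t.quarter, i.type, SUM(f.dollars_sold) AS dollars_sold
FROM sales_fact AS f
JOIN time_dim AS t ON f.time_key = t.time_key
JOIN item_dim AS i ON f.item_key = i.item_key
GROUP BY t.quarter, i.type
ORDER BY t.quarter, i.type
""").fetchall()

for row in rows:
    print(row)
```

Each fact row carries only keys and measures. The descriptive attributes used for grouping (`quarter`, `type`) live in the dimension tables and are reached through the key joins.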
![Page 36: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/36.jpg)
[Figure: snowflake schema with a central Sales Fact table and normalized dimension tables]
- Sales Fact: time_key, item_key, branch_key, location_key, dollars_sold, units_sold
- Time Dimension: time_key, day, day_of_week, month, quarter, year
- Item Dimension: item_key, item_name, brand, type, supplier_key
- Supplier Dimension: supplier_key, supplier_type
- Branch Dimension: branch_key, branch_name, branch_type
- Location Dimension: location_key, street, city_key
- City Dimension: city_key, city, province_or_state, country
![Page 37: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/37.jpg)
[Figure: fact constellation schema with two fact tables sharing dimension tables]
- Sales Fact: time_key, item_key, branch_key, location_key, dollars_sold, units_sold
- Shipping Fact: item_key, time_key, shipper_key, from_location, to_location, dollars_cost, units_shipped
- Time Dimension: time_key, day, day_of_week, month, quarter, year
- Item Dimension: item_key, item_name, brand, type
- Branch Dimension: branch_key, branch_name, branch_type
- Location Dimension: location_key, street, city, province_or_state, country
- Shipper Dimension: shipper_key, shipper_name, location_key, shipper_type
![Page 38: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/38.jpg)
![Page 39: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/39.jpg)
![Page 40: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/40.jpg)
![Page 41: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/41.jpg)
[Figure: a concept hierarchy for location, with levels all > country > province_or_state > city]
- all
  - Canada
    - British Columbia: Vancouver, Victoria, ...
    - Ontario: Toronto, ...
    - Quebec: Montreal, ...
    - ...
  - USA
    - New York: New York, ...
    - California: Los Angeles, San Francisco, ...
    - Illinois: Chicago, ...
    - ...
  - ...
[Figure: attribute orderings within a dimension]
- location: street < city < province_or_state < country
- time: day < {month < quarter; week} < year
![Page 42: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/42.jpg)
[Figure: a concept hierarchy of dollar ranges]
- ($0 - $1000]
  - ($0 - $200]: ($0 - $100], ($100 - $200]
  - ($200 - $400]: ($200 - $300], ($300 - $400]
  - ($400 - $600]: ($400 - $500], ($500 - $600]
  - ($600 - $800]: ($600 - $700], ($700 - $800]
  - ($800 - $1000]: ($800 - $900], ($900 - $1000]
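The hierarchy above nests half-open ranges of width $1000, $200, and $100, each excluding its lower bound and including its upper bound. A minimal sketch of how a single value could be placed at every level follows; the function name `price_ranges` and its `widths` parameter are hypothetical names chosen for this example.

```python
import math


def price_ranges(price, widths=(1000, 200, 100)):
    """Return the (low, high] range containing `price` at each level,
    coarsest level first. Ranges exclude `low` and include `high`.
    """
    if not 0 < price <= 1000:
        raise ValueError("price must lie in ($0 - $1000]")
    ranges = []
    for width in widths:
        # Rounding up to a multiple of `width` gives the inclusive upper bound.
        high = math.ceil(price / width) * width
        ranges.append((high - width, high))
    return ranges


print(price_ranges(250))
print(price_ranges(200))
```

A value of 250 falls in ($0 - $1000], ($200 - $400], and ($200 - $300]. A boundary value such as 200 belongs to the range it closes, ($0 - $200] and ($100 - $200], because the upper bound is inclusive.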
![Page 43: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/43.jpg)
[Figure: OLAP operations on a central data cube with dimensions location (cities: Chicago, New York, Montreal, Vancouver), time (quarters: Q1, Q2, Q3, Q4), and item (types: home entertainment, computer, phone, security); values shown: 605K, 825K, 14K, 400K]
- roll-up on location (from cities to countries): location (countries: US, Canada) by time (quarters) by item (types)
- drill-down on time (from quarters to months): time (months: January to December) by location (cities) by item (types); values shown: 150K, 100K, 150K
- slice for time = "Q2": location (cities) by item (types)
- dice for (location = "Montreal" or "Vancouver") and (time = "Q1" or "Q2") and (item = "home entertainment" or "computer")
- pivot: item (types) by location (cities)
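The operations named above (roll-up, slice, dice, pivot) can be sketched on a tiny cube stored as a Python dictionary keyed by (location, quarter, item). This is an illustration under stated assumptions, not the figure's data: every cell value is invented, the function names are hypothetical, and the city-to-country mapping covers only the four cities shown.

```python
from collections import defaultdict

# Hypothetical cells: (city, quarter, item type) -> units sold.
cube = {
    ("Vancouver", "Q1", "computer"): 5,
    ("Montreal", "Q1", "computer"): 6,
    ("Montreal", "Q1", "home entertainment"): 4,
    ("Vancouver", "Q2", "phone"): 3,
    ("Chicago", "Q2", "computer"): 7,
    ("New York", "Q3", "security"): 2,
}

CITY_TO_COUNTRY = {
    "Vancouver": "Canada",
    "Montreal": "Canada",
    "Chicago": "USA",
    "New York": "USA",
}


def roll_up_location(cells):
    """Roll-up: climb location from cities to countries, summing cells."""
    out = defaultdict(int)
    for (city, quarter, item), value in cells.items():
        out[(CITY_TO_COUNTRY[city], quarter, item)] += value
    return dict(out)


def slice_time(cells, quarter):
    """Slice: fix one quarter, leaving a 2-D (location, item) view."""
    return {(c, i): v for (c, q, i), v in cells.items() if q == quarter}


def dice(cells, locations, quarters, items):
    """Dice: keep cells whose value on every dimension is in the given set."""
    return {
        key: v
        for key, v in cells.items()
        if key[0] in locations and key[1] in quarters and key[2] in items
    }


def pivot(cells_2d):
    """Pivot: swap the two axes of a 2-D view."""
    return {(b, a): v for (a, b), v in cells_2d.items()}


by_country = roll_up_location(cube)
q2 = slice_time(cube, "Q2")
sub_cube = dice(
    cube,
    locations={"Montreal", "Vancouver"},
    quarters={"Q1", "Q2"},
    items={"home entertainment", "computer"},
)
q2_pivoted = pivot(q2)
```

Drill-down is the inverse of roll-up and needs the finer-grained data (for example, monthly cells) to be stored, so it is not derivable from this quarterly cube alone.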
![Page 44: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/44.jpg)
[Figure: four dimensions and their abstraction levels]
- time: day, month, quarter, year
- location: street, city, province_or_state, country, continent
- customer: name, category, group
- item: name, brand, category, type
![Page 45: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/45.jpg)
![Page 46: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/46.jpg)
![Page 47: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/47.jpg)
[Figure: data warehousing architecture, from data sources to front-end tools]
- Data sources: Operational Databases, External sources
- Data Cleaning and Data Integration: Extract, Clean, Transform, Load, Refresh
- Data Storage: Data Warehouse, Data Marts, Metadata Repository, Monitoring, Administration
- OLAP Engine: OLAP Server, OLAP Server
- Front-End Tools: Query/Report, Analysis, Data Mining
- Output
![Page 48: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/48.jpg)
[Figure: Define a high-level corporate data model -> Data Marts and Enterprise Data Warehouse (each with model refinement) -> Distributed Data Marts -> Multi-Tier Data Warehouse]
![Page 49: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/49.jpg)
![Page 50: Morgan-Kaufmann-Jiawei-Han-Micheline-Kamber-DataMining](https://reader035.vdocument.in/reader035/viewer/2022070604/62c2736611640160995b184a/html5/thumbnails/50.jpg)