what is data mining ?
DESCRIPTION
General description of data mining, its business context, the differences between data mining and statistics, and an example of an application.
TRANSCRIPT
![Page 1: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/1.jpg)
What is data mining ?
Johan Blomme
Circulation Manager, AMP
The Datamining Garden kick-off workshop
June 19th, 2007
Regus Pegasus, Diegem
![Page 2: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/2.jpg)
1. Introduction : “Competing on Analytics”
![Page 3: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/3.jpg)
• Thomas Davenport : organizations that have built their very business on the ability to collect, analyze and act on data are consistently the leaders in their industry.
• The demands of business today are creating an increasing need for access to data and the use of it to maintain a sustainable competitive advantage :
– the rapid construction of data-driven analytics :
• descriptive statistics ;
• predictive modeling and optimization techniques ;
– the rapid deployment of knowledge derived from data ;
– the need to give end users access to results in a form that helps them gain the insights they need to make critical business decisions.
![Page 4: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/4.jpg)
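Industrial Age vs. Information Age :
• Processes : linear, sequential (Industrial Age) ; interwoven, collaborative (Information Age)
• Tempo : periodic, slow (Industrial Age) ; continuous, rapid (Information Age)
• Assets : tangibles (Industrial Age) ; intangibles (Information Age)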
![Page 5: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/5.jpg)
![Page 6: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/6.jpg)
2. Business drivers of data mining
![Page 7: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/7.jpg)
Time and information drive the information age, and competitiveness will be based on obtaining real-time information and acting on it promptly and effectively. The following changes indicate how to compete in the information age :
• more complex business environments due to globalization and deregulation ;
• greater impact of change from external causes ;
• a power shift from sellers to buyers, rapidly shifting customer demands and subsequent reduced product life cycles ;
• constant technology change ;
• faster business cycles and temporary competitive advantage ;
• the need to explore collaborative strategies ;
• constant change at ever-increasing speeds and shrinking strategy time horizons.
![Page 8: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/8.jpg)
• Technology facilitates data gathering :
– e.g. RFID ;
– currently : applications mainly in production environment and logistics ;
– future possibilities : narrowcasting ;
– privacy issues !
![Page 9: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/9.jpg)
• Technology transforms the way we live and interact :
– ubiquitous access to information is changing the economics of knowledge ;
– consumer preferences are becoming more complex and are changing more rapidly ;
– customers will increasingly choose how they would like to interact with organizations and will only do business with companies that meet their interaction needs ;
– the customer takes the lead ;
– technology changes the behaviour of consumers ; consequently, it is very important to track customer interactions and customer behaviour.
![Page 10: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/10.jpg)
3. Data mining defined
![Page 11: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/11.jpg)
• Data mining is the extraction of actionable knowledge from large datasets to acquire and sustain a competitive advantage.
• Data mining is about achieving the organization’s goals, not about the maths and the statistics.
![Page 12: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/12.jpg)
• The introduction of data warehousing in the 1990s resulted in a wider acceptance of data mining :
– operational data stored in corporate data warehouses has the potential to be exploited as business intelligence ;
– data warehouses are multidimensional structures used for online analytical processing (OLAP) ;
– OLAP :
• analyze information about past performance on an aggregate level
• verification-based approach : the user develops a hypothesis and then tests the data to prove or disprove the hypothesis
– data mining :
• prospective data analysis
• predicting future trends, allowing businesses to make proactive, knowledge-driven decisions
Data mining and statistics/OLAP can complement each other : the inductively revealed relationships between variables can be used to formulate hypotheses and the insights gained
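The contrast between an aggregate, backward-looking view and a prospective view can be sketched in a few lines of Python. This is only an illustration: the sales records, the title names and the simple least-squares trend are hypothetical and are not taken from the slides or from AMP data.

```python
from collections import defaultdict

# Hypothetical weekly sales records: (title, week, copies sold).
sales = [
    ("Title A", 1, 100), ("Title A", 2, 110), ("Title A", 3, 120),
    ("Title B", 1, 80), ("Title B", 2, 70), ("Title B", 3, 60),
]


def total_by_title(records):
    """OLAP-style view: aggregate past performance per title."""
    totals = defaultdict(int)
    for title, _week, copies in records:
        totals[title] += copies
    return dict(totals)


def forecast_next_week(records, title):
    """Prospective view: fit a least-squares line and extrapolate one week."""
    points = [(week, copies) for t, week, copies in records if t == title]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    intercept = mean_y - slope * mean_x
    next_week = max(x for x, _ in points) + 1
    return intercept + slope * next_week


print(total_by_title(sales))
print(forecast_next_week(sales, "Title A"))
```

The first function only summarizes what already happened; the second uses the same records to say something about a period that has not yet been observed.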
![Page 13: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/13.jpg)
![Page 14: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/14.jpg)
• Statistics vs. data mining :
– Statistical analysis is primarily concerned with confirmatory data analysis (model fitting) : testing whether a proposed model of hypothetical relationships between variables provides a good explanation of the observed data.
Statistical models are based on assumptions or some theory about relationships between variables and assume a deductive process.
– Data mining : rather than verifying hypothetical patterns, data mining uses the data itself to detect such patterns.
Computational algorithms play a much greater role in building models through exploratory data analysis (EDA). The nature of the process is inductive.
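A minimal sketch of the deductive versus inductive distinction, assuming a small hypothetical dataset (the variable names and values below are invented for illustration). The confirmatory function checks one relationship that was proposed in advance; the exploratory function scans every pair of variables and lets the data point to the strongest one. A real confirmatory analysis would also include a significance test, which is omitted here.

```python
from itertools import combinations
from math import sqrt

# Hypothetical observations per point of sale (illustrative values only).
data = {
    "price": [2.0, 2.5, 3.0, 3.5, 4.0],
    "shelf_pos": [1.0, 3.0, 2.0, 5.0, 4.0],
    "sales": [50.0, 45.0, 40.0, 35.0, 30.0],
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def confirm(dataset, x, y):
    """Deductive: measure one relationship that was hypothesized beforehand."""
    return pearson(dataset[x], dataset[y])


def explore(dataset):
    """Inductive: search all variable pairs for the strongest correlation."""
    return max(
        combinations(dataset, 2),
        key=lambda pair: abs(pearson(dataset[pair[0]], dataset[pair[1]])),
    )


print(confirm(data, "price", "sales"))
print(explore(data))
```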
![Page 15: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/15.jpg)
![Page 16: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/16.jpg)
Figure : analytics capabilities plotted by degree of intelligence against business value, in ascending order : standard reports ; query / drill down ; alerts ; forecasting ; predictive modeling ; optimization.
![Page 17: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/17.jpg)
The CRISP-DM model is an industry- and application-neutral standard for fitting data mining into the general problem-solving strategy of a business.
![Page 18: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/18.jpg)
4. An example of DM
The case of demand planning of magazines (AMP)
![Page 19: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/19.jpg)
Distribution of press products : 2.8 million copies every night
![Page 20: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/20.jpg)
Business problem :
The market for printed magazines is declining. Key reasons :
- advertising is migrating to e-media ;
- publishers are not investing in the future of printed magazines at the same rate as they are in the future of e-media products ;
- the young generation is brought up in an e-media world and will be less inclined to read printed products ;
- publishers’ drive to reduce costs makes e-media publishing an attractive proposition, since paper, printing and distribution costs can be eliminated.
The big issue in single copy sales is that of unsolds. If sales volumes go down, the distribution cost per copy increases, since the overhead of the distribution system has to be spread over fewer magazines, and returns as a proportion of delivered magazines increase. The fee earned by distributors is based on the cover prices of magazines and the number of copies sold, instead of a cost-to-serve model.
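The arithmetic behind this problem can be made explicit with a small sketch. All figures below (overhead, variable cost, volumes) are hypothetical and chosen only to show the mechanism; they are not AMP numbers.

```python
def cost_per_copy(fixed_overhead, variable_cost_per_copy, copies):
    """Distribution cost per copy: fixed overhead spread over the volume."""
    return fixed_overhead / copies + variable_cost_per_copy


def return_rate(copies_delivered, copies_sold):
    """Unsolds (returns) as a proportion of delivered copies."""
    return (copies_delivered - copies_sold) / copies_delivered


# Hypothetical figures: the same overhead spread over a halved volume.
before = cost_per_copy(100_000, 0.05, 1_000_000)
after = cost_per_copy(100_000, 0.05, 500_000)
print(before, after)

# Hypothetical figures: deliveries stay at 1000 copies while sales drop.
print(return_rate(1000, 400), return_rate(1000, 300))
```

With the overhead unchanged, halving the volume raises the cost per copy, and with deliveries unchanged, falling sales raise the return rate.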
![Page 21: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/21.jpg)
Objective :
How to build an intelligent supply chain to improve supply chain efficiency,
reduce costs and increase profits ?
![Page 22: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/22.jpg)
Diagram : Product Planning & Development ; Retail ; Catalog - Mail ; Internet, WWW ; Kiosks ; Suppliers ; Sales Force ; SAP Business Warehouse
Business Understanding
• make-to-stock environment
• lack of visibility of supply chain, esp. day-to-day demand and stock positions
• excessive inventory levels
• return rates of over 60 % are not uncommon in our industry
=> Information is key : integrate internal SC activities of AMP with those of partners to gain efficiencies across the supply chain
![Page 23: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/23.jpg)
the traditional (linear) supply chain
![Page 24: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/24.jpg)
Publisher, Distributor, Newsstand
Information & Intelligence Sharing for Effectiveness
Product Flow
Information Flow
• POS Data Sharing
• Inventory levels
• Forecasts
• Promotional Activities
• New Product Introduction
• Production & delivery schedules
the intelligent supply chain
![Page 25: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/25.jpg)
Business Understanding
Data Preprocessing
. data normalization
. handling missing data
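A minimal sketch of these two preprocessing steps. The slides do not say which techniques were used, so mean imputation and min-max scaling are assumptions chosen for simplicity, and the sales series is hypothetical.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant series carries no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical weekly sales for one point of sale, with two missing weeks.
weekly_sales = [120, None, 80, 100, None, 60]
filled = fill_missing(weekly_sales)
normalized = min_max_normalize(filled)
print(filled)
print(normalized)
```

Normalization of this kind makes titles with very different sales volumes comparable before modeling; other imputation or scaling choices may suit a given dataset better.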
![Page 26: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/26.jpg)
Business Understanding
Data Preprocessing
Develop Forecast Model
. flat sales model
. intermittent data modeling
. discrete data : low volume model
. apply business rules
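The slides do not name the model used for intermittent data. As an illustration only, the sketch below uses Croston's method, a commonly used technique for intermittent demand, which smooths the non-zero demand sizes and the intervals between them separately. The demand series and the smoothing constant `alpha` are hypothetical, and initialization conventions vary between implementations.

```python
def croston(demand, alpha=0.1):
    """Per-period demand forecast after the last observation.

    Croston's method applies exponential smoothing separately to the
    non-zero demand sizes and to the intervals between them; the forecast
    is the smoothed size divided by the smoothed interval.
    """
    size = None          # smoothed non-zero demand size
    interval = None      # smoothed number of periods between demands
    periods_since = 1    # periods elapsed since the previous demand
    for d in demand:
        if d > 0:
            if size is None:
                # Initialize with the first observed demand and its delay.
                size, interval = float(d), float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


# Hypothetical sales of a low-volume title at one newsstand.
history = [0, 0, 4, 0, 0, 4, 0, 0, 4]
print(croston(history))
```

Smoothing the zero-filled series directly would bias the forecast toward the most recent period; separating size and interval avoids that for sparse sales histories.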
![Page 27: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/27.jpg)
Business Understanding
Data Preprocessing
Develop Forecast Model
Deploy Forecasts
. interpret results : simulation
. workflow integration (operations)
![Page 28: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/28.jpg)
Figure : service level, monthly titles
![Page 29: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/29.jpg)
Figure : scatter plot of % weighted oos against % unsolds (both axes from 0 to 100), with a linear trend line for the reference period and a logarithmic trend line for the draw regulation period. Reported fits : R² = 0,5696 and R² = 0,0213.
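For readers unfamiliar with the R² values attached to the trend lines, the sketch below shows how the coefficient of determination of a least-squares line is computed. The data points are hypothetical and are not read off the chart.

```python
def r_squared(xs, ys):
    """R² of an ordinary least-squares line fitted to the points (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation the fitted line leaves unexplained.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: variation around the mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical (% unsolds, % oos) pairs for a handful of titles.
unsolds = [1, 2, 3, 4]
oos = [2, 4, 5, 9]
print(r_squared(unsolds, oos))
```

A value near 1 means the line explains most of the variation in the points, a value near 0 means it explains almost none. A logarithmic trend line can be handled the same way by passing `math.log(x)` values in place of `x`.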
![Page 30: What is data mining ?](https://reader033.vdocument.in/reader033/viewer/2022051515/54c3af324a79594d028b458f/html5/thumbnails/30.jpg)
Shared visibility across supply chain :
• Improved understanding, forecasting and analysis of consumer demand
• Improved capability to respond and react to changes
• Improved stability, predictability and efficiency of supply chain operations
Increased Sales :
• Improved fill rates
• Improved on-shelf availability
• More effective demand generation activities
Reduced Inventories :
• Reduced lead times
• Reduced inventories
Reduced Costs :
• Smoother SC execution
• More efficient processes
• Reduction of costs for handling returns