2015-2016
Phil Smith
Learning outcomes
On successful completion of this unit you will:
1. Understand data models and database technologies. (Assignment 1)
Today we will complete LO1.
Recap
Last lesson – Normalisation.
Today
First task is to progress the activity started last lesson on normalisation.
You have 40 minutes for this and remember this will be part of assignment 1.
Then we will cover new developments; this is the final part of learning outcome 1.
Finally, Assignment 1 will be reviewed and issued.
Task 1
First task is to progress the activity started last lesson on normalisation.
You have 40 minutes for this and remember this will be part of assignment 1.
New developments
data mining and data warehousing
dynamic storage
web enabled database applications
other developments e.g. multimedia databases, document management systems, digital libraries
I will cover data warehousing but then you will research and present the other “new” developments listed above.
Data mining and data warehousing
Motivation and context
“Modern organizations are drowning in data but
starving for information”.
Operational processing (transaction processing)
captures, stores and manipulates data to support
daily operations. The main thrust of this unit.
Information processing is the analysis of data or other
forms of information to support decision making.
A data warehouse can consolidate and integrate
information from many internal and external sources
and arrange it in a meaningful format for making
business decisions.
Definition
Data Warehouse (W.H. Inmon):
A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes.
Subject-oriented: e.g. customers, patients, students, products.
Integrated: Consistent naming conventions and formats, applied to data from multiple sources.
Time-variant: Can study trends and changes.
Non-updatable: Read-only, periodically refreshed.
Need for Data Warehousing
Integrated, company-wide view of high-quality
information (from disparate databases)
Separation of operational and informational systems and data (for improved performance)
Table 11-1: comparison of operational and informational systems
Meaning of disparate
Definition of disparate in English:
adjective
Essentially different in kind; not able to be compared http://www.oxforddictionaries.com/definition/english/disparate
In our case it relates to different sources of data.
E.g. SQL Server, MySQL, Excel, Access, etc.
Figure 1: Generic architecture (ETL)
One, company-wide warehouse
Periodic extraction: data is not completely current in warehouse
The ETL Process
Capture
Scrub or data cleansing
Transform
Load and Index
ETL = Extract, transform, and load
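The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the two source lists, the column names (name, spend) and the SQLite table customer_spend are all made up for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical rows captured from two disparate sources (e.g. a spreadsheet
# export and an operational database). Names and values are invented.
SOURCE_A = [{"name": " Alice ", "spend": "120.50"}, {"name": "BOB", "spend": "80"}]
SOURCE_B = [{"name": "carol", "spend": "n/a"}, {"name": "Bob", "spend": "20"}]


def capture():
    """Capture (extract): pull the raw rows from every source."""
    return SOURCE_A + SOURCE_B


def scrub(rows):
    """Scrub: drop rows whose spend is not a number and tidy up the names."""
    clean = []
    for row in rows:
        try:
            spend = float(row["spend"])
        except ValueError:
            continue  # unusable value, discard the row
        clean.append({"name": row["name"].strip().title(), "spend": spend})
    return clean


def transform(rows):
    """Transform: total the spend per customer (one row per subject)."""
    totals = {}
    for row in rows:
        totals[row["name"]] = totals.get(row["name"], 0.0) + row["spend"]
    return sorted(totals.items())


def load_and_index(conn, rows):
    """Load and index: write the rows to the warehouse table and index it."""
    conn.execute("CREATE TABLE customer_spend (name TEXT, total REAL)")
    conn.executemany("INSERT INTO customer_spend VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX idx_customer_name ON customer_spend (name)")
    conn.commit()


def run_etl():
    conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
    load_and_index(conn, transform(scrub(capture())))
    return conn


if __name__ == "__main__":
    warehouse = run_etl()
    query = "SELECT name, total FROM customer_spend ORDER BY name"
    for row in warehouse.execute(query):
        print(row)
```

Note how the two spellings of Bob from different sources end up as one customer after scrubbing, and the row with an unusable spend value never reaches the warehouse.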
Company Facts
The data warehouse will have a table of facts,
usually specified by business analysts and
implemented by data analysts who
understand the warehouse topology.
The facts are predefined objects (e.g. stored
procedures/views) which when combined can
produce information for decision making. The
data itself can be derived from multiple
sources.
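As an illustration of a predefined fact, the sketch below defines a view that combines data from two sources. It assumes SQLite as a stand-in for the warehouse; the tables shop_sales and web_sales, the view name fact_sales_by_product and the figures are all hypothetical.

```python
import sqlite3


def build_warehouse():
    """Create a toy warehouse holding two source tables and one view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE shop_sales (product TEXT, amount REAL);
        CREATE TABLE web_sales  (product TEXT, amount REAL);
        INSERT INTO shop_sales VALUES ('kettle', 30.0), ('toaster', 25.0);
        INSERT INTO web_sales  VALUES ('kettle', 10.0);

        -- The predefined "fact": total sales per product across both sources.
        CREATE VIEW fact_sales_by_product AS
        SELECT product, SUM(amount) AS total
        FROM (SELECT product, amount FROM shop_sales
              UNION ALL
              SELECT product, amount FROM web_sales)
        GROUP BY product;
        """
    )
    return conn


def sales_by_product(conn):
    """Query the view exactly as a decision maker's report would."""
    return conn.execute(
        "SELECT product, total FROM fact_sales_by_product ORDER BY product"
    ).fetchall()


if __name__ == "__main__":
    print(sales_by_product(build_warehouse()))
```

The person reading the report only needs to know the name of the view, not which source tables sit behind it.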
Data Mining
Goals:
Explain observed events or conditions
Confirm hypotheses
Explore data for new or unexpected relationships
Techniques:
Case-based reasoning
Rule discovery
Signal processing
Neural nets
Fractals
Data Mining
Data mining is knowledge discovery using a blend
of statistical, AI, and computer graphics techniques.
New buzzword, old idea.
Inferring new information from already collected data.
Traditionally the job of data analysts.
Computers have changed this. It is far more efficient to comb through data using a machine than to eyeball statistical data.
Data Mining – Two Main Components
Knowledge Discovery
Concrete information gleaned from known data. Data you may not have known, but which is supported by recorded facts.
Knowledge Prediction
Uses known data to forecast future trends, events, etc. (e.g. stock market predictions)
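Knowledge prediction can be illustrated with the simplest possible forecast: fit a straight line to past values and extend it one step. This is a sketch only, assuming evenly spaced observations and at least two data points; the function names and the monthly sales figures are invented, and real data mining tools use far richer models.

```python
def fit_line(ys):
    """Least-squares line y = a + b*x through (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(ys, steps_ahead=1):
    """Extend the fitted line beyond the last known observation."""
    intercept, slope = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


if __name__ == "__main__":
    monthly_sales = [100, 110, 120, 130]  # hypothetical known data
    print(forecast(monthly_sales))        # predicted value for the next month
```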
Data Mining vs. Data Analysis
In terms of software and the marketing thereof
Data Mining != Data Analysis
Data Mining implies that the software uses some intelligence, beyond simple grouping and partitioning of data, to infer new information.
Data Analysis is more in line with standard statistical software (e.g. web stats). These usually present information about subsets and relations within the recorded data set (e.g. browser/search engine usage, average visit time, etc.)
Key Component of Data Mining
Whether Knowledge Discovery or Knowledge Prediction, data mining takes information that was once quite difficult to detect and presents it in an easily understandable format (e.g. graphical or statistical).
Data mining techniques involve sophisticated algorithms, including decision tree classification, association detection, and clustering.
Data mining goes hand in hand with data warehouses.
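To give a feel for clustering, here is a very small k-means sketch for one-dimensional data. It is an assumption-laden toy: the function name kmeans_1d is made up, the caller supplies the starting centroids, and the customer ages are invented. Real tools handle many dimensions and choose starting points for you.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group numbers around centroids; `centroids` is the initial guess."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value with its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(v - centroids[i])
            )
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its old centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    ages = [1, 2, 3, 20, 21, 22]  # hypothetical customer ages
    print(kmeans_1d(ages, centroids=[0, 10]))
```

Nobody told the program where the two groups were; it found them from the data alone, which is the point of clustering.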
Uses of Data Mining
AI/Machine Learning
Combinatorial/Game Data Mining
Good for analyzing winning strategies to games, and thus developing intelligent AI opponents (e.g. chess).
Business Strategies
Market Basket Analysis
Identify customer demographics, preferences, and purchasing patterns.
Risk Analysis
Product Defect Analysis
Analyze product defect rates for given plants and predict possible complications (read: lawsuits) down the line.
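Market basket analysis can be sketched as counting which items are bought together. The example below is a simplified illustration of association detection, not a full algorithm such as Apriori; the function name frequent_pairs, the min_support threshold and the baskets are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support=0.5):
    """Return item pairs found together in at least `min_support` of baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) ignores duplicates and gives each pair one spelling.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


if __name__ == "__main__":
    baskets = [  # hypothetical shopping baskets
        ["bread", "butter", "milk"],
        ["bread", "butter"],
        ["milk", "tea"],
        ["bread", "butter", "tea"],
    ]
    print(frequent_pairs(baskets))
```

Here the support of a pair is the fraction of baskets that contain both items; a shop might use a high-support pair to decide which products to place together.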
Data warehouse/mining
This was a very brief overview of data warehousing and
data mining.
We shall re-visit data warehousing and data mining next semester in the unit on Distributed computing.
New developments
Task 2
Research the following terms
You need to include –
What the term means
How it may be used
1. dynamic storage
2. data mining and data warehousing
3. web enabled database applications
4. other developments e.g. multimedia databases, document management systems, digital libraries
You will be asked to give a precis of your research.
Meaning of precis
Make a precis of (a text or speech).
Synonyms: summarize, sum up, give a summary/synopsis/precis of, give the main points of; abridge, condense, shorten, synopsize, abstract, outline.
Example: "another strategy for improving your writing skills is to precis a passage"
New developments
Assignment 1
Review and issue.
Summary
What have we learnt today?