lecture 3 - pdf - full
2014-01-18
AP/ADMS 2511 Management Information Systems
Session 3 – Chapter 3
Data, Information, and Knowledge Management
Learning Objectives
1. Describe the difficulties of managing data
2. Explain how data governance is facilitated by master data management
3. Use the data hierarchy and build E-R (entity-relationship) diagrams
4. Explain the characteristics of relational database management systems and their role in information reporting
5. Explain the nature of data warehouses and data marts, their advantages and disadvantages, and their role in data mining
6. Describe the knowledge management system cycle and discuss types of knowledge
What can “go wrong” with data?
• It can be …
– Lost
– Copied
– Erased
– Changed
– Stored in multiple copies (that are slightly different!)
– Hard to find
– Overwhelming!
Data quality problems almost cost Library and Archives Canada $200,000
• Library and Archives Canada wanted to buy an old map worth $200,000
• Then, they found that they already had two copies!
• Culprits: translation errors, incomplete data entry and data entry errors
Managing Data
Difficulties in Managing Data
– Amount of data increases exponentially
– Data are scattered and collected by many individuals using various methods and devices
– Data come from many sources
– Data security, quality, and integrity are critical
Examples of Data Sources
E-mails
Credit card swipes
RFID tags
Digital video surveillance
Radiology scans
Blogs
Note the difference between transaction and master data
• Master data (kept in master files) are semi-permanent data, such as employee name, address, customer name, and customer credit limit
• Transaction data represent business activities or events, such as a payroll cheque or a customer invoice
• Importance of master data management
– Video: http://www.youtube.com/watch?v=MRgjLKufu34
Data Governance
Data Governance encompasses the people, processes and procedures to create a consistent, enterprise view of your data in order to:
– Improve data security
– Increase consistency & confidence in decision making
– Decrease the risk of regulatory fines
http://www.youtube.com/watch?v=tylX6GvTu5o
Relationship between data governance and master data management
Video: http://www.youtube.com/watch?v=0uR_-w-UQI0
Practice question 1
Ace Hardware
Catch [Figure 6-2]
Organizing Data in a Traditional File Environment
File organization concepts
• Bit: Smallest unit of data; binary digit (0, 1)
• Byte: Group of bits that represents a single character
• Field: Group of words or a complete number
• Record: Group of related fields
• File: Group of records of the same type
• Database: Group of related files
• Entity: Person, place, thing, or event about which information is maintained
• Attribute: Description of a particular entity
• Key field: Identifier field used to retrieve, update, or sort a record
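The hierarchy above can be traced in a few lines of Python. This is only a minimal sketch: the record reuses the Door Latch part that appears in this lecture, while the variable names and the helper find_by_key are illustrative assumptions.

```python
# Bit and byte: in ASCII, one character is stored as one byte (8 bits).
character = "D"
byte_value = character.encode("ascii")      # a single byte
bits = format(byte_value[0], "08b")         # the 8 bits inside that byte

# Field: a group of characters forming a word or a complete number.
part_name = "Door Latch"

# Record: a group of related fields describing one entity.
record = {
    "part_number": 137,
    "part_name": part_name,
    "unit_price": 22.00,
    "supplier_number": 8259,
}

# File: a group of records of the same type.
part_file = [record]


def find_by_key(records, part_number):
    """Use the key field to retrieve one record, or None if absent."""
    for rec in records:
        if rec["part_number"] == part_number:
            return rec
    return None


print(bits)
print(find_by_key(part_file, 137)["part_name"])
```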
Data design for database and file-based systems
Data term for database systems | Examples | Data term for file-based systems
Entity OR instance (information about one part) | 137, Door Latch, $22.00, 8259 | Record
Entity class (OR flat file if extracted from the database) | (all of the information for a particular entity type) | File (or table)
Data design for database and file-based systems (p. 73)
Data term for database systems | Examples | Data term for file-based systems
Attribute | Part_Name: Door Latch | Field
Primary key OR identifier | 137 (unique part number) | Same: Primary key
Secondary key or Foreign key | 8259 (Supplier_Number) | Same: Secondary key
An Example of a Table
[Figure: table for the entity class Part; rows are records and columns are attributes]
Primary Keys & Foreign Keys
To ensure that each record is unique in each table, we can set one field to be a Primary Key field.
A Primary Key is a field that will contain no duplicates and no blank values.
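The two rules (no duplicates, no blanks) can be seen in action with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the part table layout and the extra parts (Door Hinge, Door Knob) are made up for illustration, and only part 137 comes from the lecture example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is written out explicitly: in an ordinary SQLite table a TEXT
# PRIMARY KEY column would otherwise accept NULL (blank) values.
conn.execute(
    "CREATE TABLE part ("
    " part_number TEXT NOT NULL PRIMARY KEY,"
    " part_name TEXT,"
    " unit_price REAL,"
    " supplier_number TEXT)"
)
conn.execute("INSERT INTO part VALUES ('137', 'Door Latch', 22.00, '8259')")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO part VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Hypothetical rows used only to exercise the primary key rules.
duplicate_result = try_insert(("137", "Door Hinge", 9.50, "8259"))  # key already used
blank_result = try_insert((None, "Door Knob", 15.00, "8259"))       # key left blank
new_result = try_insert(("138", "Door Hinge", 9.50, "8259"))        # new, unique key

row_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
print(duplicate_result, blank_result, new_result, row_count)
```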
Example
Based on the report above, find the entity classes, attributes and primary key that are in the database from which the report was issued.
Entity class | Attributes | Primary Key
Employee | Employee ID, employee name | Employee #
Department | Dept. number, dept. name, number of employees | Dept. #
Job | Job #, job name, hours | Job #
Designing Databases – Data Model
• A map or diagram that represents entities and their relationships
• Used by Database Administrators to design tables with their corresponding associations
Entity-Relationship Modeling (ER)
Database designers plan the database design in a process called entity-relationship (ER) modeling.
ER diagrams consist of entities, attributes and relationships.
Entities:
– Entity instance: person, place, object, event, or concept (often corresponds to a row in a table)
– Entity class: collection of entities (often corresponds to a table)
Relationships:
– Relationship instance: link between entities (corresponds to primary key-foreign key equivalencies in related tables)
– Relationship type: category of relationship; a link between entity types
Attribute: property or characteristic of an entity or relationship type (often corresponds to a field (column) in a table)
E-R Model Constructs
Cardinality of Relationships
• One-to-One
– Each entity in the relationship will have exactly one related entity
• One-to-Many
– An entity on one side of the relationship can have many related entities, but an entity on the other side will have a maximum of one related entity
• Many-to-Many
– Entities on both sides of the relationship can have many related entities on the other side
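A one-to-many relationship is usually implemented by putting a foreign key on the "many" side. The sketch below assumes a supplier table and a part table (supplier 8259 and part 137 come from the lecture example; the supplier name and the other parts are hypothetical) and uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE supplier (
    supplier_number INTEGER PRIMARY KEY,
    supplier_name   TEXT NOT NULL
);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,
    part_name       TEXT NOT NULL,
    -- Foreign key: each part points at exactly one supplier.
    supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number)
);
INSERT INTO supplier VALUES (8259, 'Example Supplier');
INSERT INTO part VALUES (137, 'Door Latch', 8259);
INSERT INTO part VALUES (138, 'Door Hinge', 8259);
""")

# One supplier, many parts.
parts_per_supplier = conn.execute(
    "SELECT s.supplier_number, COUNT(p.part_number)"
    " FROM supplier s JOIN part p ON p.supplier_number = s.supplier_number"
    " GROUP BY s.supplier_number"
).fetchall()

# A part that names a non-existent supplier breaks the relationship.
try:
    conn.execute("INSERT INTO part VALUES (139, 'Door Knob', 9999)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(parts_per_supplier, orphan_allowed)
```

A many-to-many relationship cannot be stored with a single foreign key; it is normally resolved with a third (junction) table that holds one row per pairing.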
Entity-Relationship diagram models also document what is happening with data (p. 118)
In Class Entity-relationship diagram practice
Let's design an E-R diagram for a movie database
• Movies have: title, year of release, length (minutes), genre (e.g. comedy, action), directors, actors, rating (e.g. PG, A), studio
• Studios (with a name and address) produce one or more movies
• Actors (name and date of birth) have roles in one or more movies
• Directors (name and date of birth) direct one or more movies
Video explaining how to do E-R diagrams: http://www.youtube.com/watch?gl=CA&v=mQ4D0drMrYI
Solution
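One possible relational mapping of the movie exercise is sketched below; it is not necessarily the solution shown in class. It assumes each movie belongs to a single studio and that a movie can have several actors and several directors. All table and column names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE studio (
    studio_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    address   TEXT
);
CREATE TABLE movie (
    movie_id     INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    release_year INTEGER,
    length_min   INTEGER,
    genre        TEXT,
    rating       TEXT,
    studio_id    INTEGER REFERENCES studio(studio_id)  -- one studio, many movies
);
CREATE TABLE actor (
    actor_id      INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    date_of_birth TEXT
);
CREATE TABLE director (
    director_id   INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    date_of_birth TEXT
);
-- Junction tables resolve the two many-to-many relationships.
CREATE TABLE role (
    movie_id INTEGER REFERENCES movie(movie_id),
    actor_id INTEGER REFERENCES actor(actor_id),
    PRIMARY KEY (movie_id, actor_id)
);
CREATE TABLE direction (
    movie_id    INTEGER REFERENCES movie(movie_id),
    director_id INTEGER REFERENCES director(director_id),
    PRIMARY KEY (movie_id, director_id)
);
""")

# List the entity classes (tables) that were created.
table_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(table_names)
```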
Problems with the traditional file environment
• Data redundancy - The presence of duplicate data in multiple data files so that the same data are stored in more than one place or location.
• Data inconsistency - The same attribute may have different values.
• Program-data dependence - The coupling of data stored in files and the specific programs required to update and maintain those files such that changes in programs require changes to the data
• Lack of flexibility - A traditional file system can deliver routine scheduled reports after extensive programming efforts, but it cannot deliver ad-hoc reports or respond to unanticipated information requirements in a timely fashion
• Poor security - Management may have no knowledge of who is accessing or making changes to the organization’s data
• Lack of data sharing and availability - Information cannot flow freely across different functional areas or different parts of the organization.
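Redundancy and inconsistency are easy to reproduce. In the sketch below two departments each keep their own customer file; the customers, the file names and the helper find_inconsistencies are all hypothetical.

```python
# Two departmental files that each store their own copy of customer data
# (data redundancy).
sales_file = [
    {"customer_id": 1, "name": "A. Lee", "credit_limit": 5000},
    {"customer_id": 2, "name": "B. Singh", "credit_limit": 3000},
]
billing_file = [
    {"customer_id": 1, "name": "A. Lee", "credit_limit": 5000},
    {"customer_id": 2, "name": "B. Singh", "credit_limit": 2500},  # stale copy
]


def find_inconsistencies(file_a, file_b, key, attribute):
    """Return the keys whose attribute value differs between the two files."""
    b_by_key = {rec[key]: rec for rec in file_b}
    return [
        rec[key]
        for rec in file_a
        if rec[key] in b_by_key and rec[attribute] != b_by_key[rec[key]][attribute]
    ]


# The same attribute now has two different values (data inconsistency).
conflicts = find_inconsistencies(sales_file, billing_file, "customer_id", "credit_limit")
print(conflicts)
```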
Database management systems (DBMS)
• How a DBMS solves the problems of the traditional file environment
The Database Approach to Data Management
A Database Management System (DBMS) is a set of computer programs that controls the creation, maintenance, and use of an organization's database by the organization and its end users.
Disadvantages of Database Management Systems
• More complex (and costly) to set up and maintain
• Complex structures may be slower for processing high volume periodic transaction updates
• How a photo-sharing site manages its data (video): http://news.cnet.com/1606-2_3-23122.html
Using the relational database model
• Based on linked two-dimensional tables
• Comprises:
– Query language (SQL)
– Data dictionary
– Normalization process
• How would our student information be organized in a relational table?
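As a rough illustration of the student question, the sketch below stores students in one two-dimensional table and then asks the DBMS for its own description of that table, which is the role a data dictionary plays. The column names and the sample student are assumptions, not course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " last_name TEXT NOT NULL,"
    " first_name TEXT NOT NULL,"
    " program TEXT)"
)
# Hypothetical student row: one record, one value per attribute.
conn.execute("INSERT INTO student VALUES (1001, 'Lee', 'Ana', 'BAS')")

# Data dictionary: SQLite's PRAGMA table_info returns one row per column
# in the form (cid, name, type, notnull, dflt_value, pk).
data_dictionary = [
    {
        "column": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(student)"
    )
]

for entry in data_dictionary:
    print(entry)
```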
Homework: Practice Question, ABM Inc., looks at data management and relational database aspects.
Relationships
Practice Question 2
DBMS at China Sports Lottery
What is a data warehouse?
• A specialized form of database
• An architectural structure that defines how historical data is stored
• Linking components of the data warehouse via data communications increases the scope of data available
Advantages of data warehouses
• Data is:
– Organized and consistent
– Integrated (and possibly ‘cleansed’)
– Historical and nonvolatile
– Optimized for access (for OLAP use, multidimensional)
Disadvantages/constraints of data warehouses
VERY costly and complex to establish (hardware, software and people)
Requires continual maintenance as supporting applications change
Requires high levels of security to restrict access to authorized users
What Is a Data Warehouse Used for?
• Knowledge discovery
– Making consolidated reports
– Finding relationships and correlations, trends and patterns of behavior
– Data mining
– Examples
  • Banks identifying credit risks
  • Insurance companies searching for fraud
  • Medical research
Data Marts
• How does a data mart differ from a data warehouse?
• Data Mart – A logical subset of the complete data warehouse. Often viewed as a restriction of the data warehouse to a single business process or to a group of related business processes targeted toward a particular business group.
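The idea of a "logical subset" can be illustrated with a database view. This is only a sketch: real data marts are often separate physical stores, and the warehouse_sales table, the Hardware department and the figures below are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- A tiny stand-in for the complete data warehouse.
CREATE TABLE warehouse_sales (
    sale_date  TEXT,
    department TEXT,
    region     TEXT,
    amount     REAL
);
INSERT INTO warehouse_sales VALUES ('2013-05-01', 'Hardware', 'East', 120.0);
INSERT INTO warehouse_sales VALUES ('2013-05-02', 'Apparel',  'East', 80.0);
INSERT INTO warehouse_sales VALUES ('2013-06-01', 'Hardware', 'West', 200.0);

-- The mart exposes only the rows that one business group needs.
CREATE VIEW hardware_mart AS
    SELECT sale_date, region, amount
    FROM warehouse_sales
    WHERE department = 'Hardware';
""")

warehouse_rows = conn.execute("SELECT COUNT(*) FROM warehouse_sales").fetchone()[0]
mart_rows = conn.execute("SELECT COUNT(*) FROM hardware_mart").fetchone()[0]
mart_total = conn.execute("SELECT SUM(amount) FROM hardware_mart").fetchone()[0]
print(warehouse_rows, mart_rows, mart_total)
```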
Data life cycle
Data: elementary description of transactions that are recorded, classified, and stored but not organized to convey any specific meaning.
Information: refers to data that have been organized so that they have meaning and value to the recipient.
Knowledge: data and/or information that has been organized and processed to convey understanding, experience, accumulated learning, and expertise.
Wisdom: knowledge accumulated and applied
The data life cycle in an organization, Fig 3.1, p. 114
Knowledge Management
Knowledge management (KM) is a process that helps organizations manipulate important knowledge that is part of the organization’s memory, usually in an unstructured format
http://www.youtube.com/watch?v=aIM9hmI-t6w&feature=related
Two Kinds of Knowledge
Knowledge is intangible, dynamic, and difficult to measure, but without it no organization can survive.
• Tacit: tacit (or unarticulated) knowledge is more personal, experiential, context specific, and hard to formalize; is difficult to communicate or share with others; and is generally in the heads of individuals and teams.
• Explicit: explicit knowledge can easily be written down and codified.
People need to know what to do with data
• Data mining / data analytics
– http://www.youtube.com/watch?v=BjznLJcgSFI
• Big data, TEDxUofM - Jameson Toole - Big Data for Tomorrow
– http://www.youtube.com/watch?v=HSVQ5RDBEJs
Knowledge Management System Cycle
Create knowledge
Capture knowledge
Refine knowledge
Store knowledge
Manage knowledge
Disseminate knowledge
http://www.youtube.com/watch?v=9vm77Ge2Kxs
http://www.youtube.com/watch?v=LYq9jmVtQU8
SAP LAB
SAPGUI - System ID GB2
Add the “GB2” connection if it is missing.
• Highlight (single left click) the “Connections” entry in the left-hand pane of the Logon screen.
• While pointing at the highlighted entry, right click.
• Select “Add a New Entry” from the floating menu.
• Click on “Next”.
• Fill in the GB2 connection details:
  Description: “GB2”
  Application Server: “GB2.UCC.UWM.EDU”
  Instance Number: “00” [two zeros]
  System ID: “GB2”
• Click on “Finish”.
Client Number: 527
Student user IDs: GBI-### (see the posted list for your ###). Case sensitive.
Do not try to delete the asterisks. Enter the password.
“###” in the SAP instruction document means your specific GBI #, i.e. the unique number you have after the “GBI-” in your user ID login.
Example: GBI-222. The ### in the assignment will be: 222
When you log on for the first time the system will request you to change your password.
Enter the initial password provided here and change it to the LAST SIX DIGITS of your STUDENT ID. Maintain this password throughout the course and don’t change it.
Student initial Passwords: Fall2511*case sensitive
• Next, the following window will pop up. Click on
Introduction to Navigation in SAP Solutions and Products
Open the file Navigation exercise.doc in the SAP (Session 3) tab.
Assignment 1
Now you can start working on Assignment 1.
Please go to SAP Assignment 1 in the Session 4 tab.
Please note that I will not be able to assist with your SAP concerns over email; therefore, the next session will include an SAP session as well.
Extra Reading
Normalization
Normalization is a method for analyzing and reducing a relational database to its most streamlined form for:
• Minimum redundancy
• Maximum data integrity
• Best processing performance
Data are normalized when the attributes in a table depend only on the primary key.
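A small sketch of the idea, assuming a flat list in which supplier details are repeated on every part row. The supplier name, the second part and the helper normalize are hypothetical; the goal is only to show each attribute ending up in the table whose primary key it depends on.

```python
# Non-normalized relation: the supplier name depends on supplier_number,
# not on part_number, so it is repeated on every part row.
flat_rows = [
    {"part_number": 137, "part_name": "Door Latch",
     "supplier_number": 8259, "supplier_name": "Example Supplier"},
    {"part_number": 138, "part_name": "Door Hinge",
     "supplier_number": 8259, "supplier_name": "Example Supplier"},
]


def normalize(rows):
    """Split the flat rows into a part table and a supplier table."""
    parts = {}
    suppliers = {}
    for row in rows:
        # Attributes that depend only on part_number stay with the part.
        parts[row["part_number"]] = {
            "part_name": row["part_name"],
            "supplier_number": row["supplier_number"],  # foreign key
        }
        # Attributes that depend only on supplier_number move out.
        suppliers[row["supplier_number"]] = {"supplier_name": row["supplier_name"]}
    return parts, suppliers


parts, suppliers = normalize(flat_rows)
print(parts)
print(suppliers)
```

After the split the supplier name is stored once, so changing it requires a single update rather than one per part.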
Non-Normalized Relation
Normalizing the Database (part A)
Normalizing the Database (part B)
Normalization Produces Order
What is the ultimate purpose of a database management system?
It's to transform: Data → Information → Knowledge → Action/wisdom
Data-driven decision making
http://www.youtube.com/watch?v=M3-4lT1K1Fc&feature=related
Metadata
The term "meta" comes from a Greek word that denotes something of a higher or more fundamental nature. Metadata, then, is data about other data. The term refers to any data used to aid the identification, description and location of networked electronic resources
What is Metadata?
• time period
• author
• sources
• (file) size
• title
• supplemental information
• abstract
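For a file on disk, some of these items can be read directly from the operating system. The sketch below writes a small temporary file (its content is made up) and then collects facts about the file rather than its contents, using only the Python standard library.

```python
import os
import tempfile
from datetime import datetime, timezone

# The data: the contents of a small file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("Door Latch,22.00")
    path = handle.name

# The metadata: data about the file, not the data inside it.
info = os.stat(path)
metadata = {
    "size_bytes": info.st_size,
    "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    "extension": os.path.splitext(path)[1],
}
os.remove(path)  # clean up the temporary file

print(metadata)
```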
©2005 CSC Brands, L.P. All Rights Reserved
Metadata is analogous to product labeling.
What is Metadata?
• entity
• attributes
Metadata
Query Languages
The most popular way to request information from a database is to use a query language.
SQL: short for Structured Query Language, a standard language used to request information from databases. Servers that can handle SQL are known as SQL servers.
Sample of an SQL query.
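A runnable sample, assuming a part table shaped like the lecture's Door Latch example (the second part and the price threshold are hypothetical). The SQL statement is the string held in query; Python's sqlite3 module is used only to execute it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,
    part_name       TEXT,
    unit_price      REAL,
    supplier_number INTEGER
);
INSERT INTO part VALUES (137, 'Door Latch', 22.00, 8259);
INSERT INTO part VALUES (138, 'Door Hinge', 9.50, 8259);
""")

# SELECT names the attributes wanted, FROM names the table,
# WHERE filters the records, ORDER BY sorts the result.
query = (
    "SELECT part_number, part_name, unit_price"
    " FROM part"
    " WHERE unit_price > ?"
    " ORDER BY part_name"
)
expensive_parts = conn.execute(query, (10.00,)).fetchall()
print(expensive_parts)
```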
Online Analytical Processing (OLAP)
Supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions
• Each aspect of information (product, pricing, cost, region, or time period) represents a different dimension
• E.g. comparing sales in East in June vs. May and July
Enables users to obtain online answers to ad hoc questions such as these fairly rapidly
The view that is showing is product versus region.
If you rotate the cube 90 degrees, the face that will show is product versus actual and projected sales.
If you rotate the cube 90 degrees again, you will see region versus actual and projected sales.
Other views are possible.
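Rotating the cube amounts to summarizing the same facts along a different pair of dimensions. The sketch below uses invented sales figures and a hypothetical helper named view; real OLAP tools work on far larger, pre-aggregated data.

```python
from collections import defaultdict

# Each fact has three dimensions (product, region, measure) and a value.
# All figures are made up for illustration.
facts = [
    ("Nuts",  "East", "actual",    50),
    ("Nuts",  "East", "projected", 60),
    ("Nuts",  "West", "actual",    40),
    ("Bolts", "East", "actual",    70),
    ("Bolts", "West", "projected", 90),
]
DIMENSIONS = {"product": 0, "region": 1, "measure": 2}


def view(rows_dim, cols_dim):
    """Sum the values onto the cube face defined by two dimensions."""
    r, c = DIMENSIONS[rows_dim], DIMENSIONS[cols_dim]
    face = defaultdict(int)
    for fact in facts:
        face[(fact[r], fact[c])] += fact[3]
    return dict(face)


product_by_region = view("product", "region")    # the face showing first
product_by_measure = view("product", "measure")  # after one rotation
region_by_measure = view("region", "measure")    # after a second rotation

print(product_by_region)
print(product_by_measure)
print(region_by_measure)
```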
Multidimensional Data Model
Data warehousing at HBC (see Fig 4.11, p. 121)
• HBC spent three years implementing data marts and data query processes
• It was difficult to get business users to give up high levels of detail
• Expected to improve access to data for over 2,000 employees
Example - From Exam
The Board of Directors of RTAL agreed that setting up a DBMS for client and candidate records is the way to go. They would like to know what data analysis tools are available to them once a DBMS is set up. You suggest data mining.
Define data mining and provide two examples of how RTAL can use data mining to improve its operational efficiencies.
Legislation can affect data archiving requirements
• Two relatively recent legislative changes that have an effect on storage:
– In Canada, PIPEDA (Personal Information Protection and Electronic Documents Act) came into effect January 1, 2004
– In the U.S., Sarbanes-Oxley, July 2002
• More local bills also provide requirements for information system implementation and management
– For example, Texas State Legislature Bill 3740 has requirements for data management, systems implementation and data governance. See: http://www.legis.state.tx.us/tlodocs/81R/billtext/html/HB03740I.htm
Team work created this course
• The materials for this course were developed by Cristobal Sanchez-Rodriguez and Ingrid Splettstoesser, with the help of:
– Hila Cohen
– Ken Cudeck
– Marius Dobre
– John Kucharczuk
– Carl Lapp
– Donna Rex
– Mario Vasilkovs