ACS1803 Lecture Outline 2 DATA MANAGEMENT CONCEPTSText, Ch. 3 How do we store data (numeric and character records) in a computer so that we can optimize on storage and retrieve data efficiently and flexibly [different possible combinations]? - we have already seen data management basics in material on MS Access dbms - we now expand this to computers in general
The Database Approach
• Traditional approach to data management:– Each distinct operational system used data files
dedicated to that system
• Database approach to data management:– Pool of related data is shared by multiple
application programs
2
-
firstly, there were individual files for each set of programslater, database approach: o one centralized source of data for all programs
to tap into o minimizes redundancy and provides retrieval
efficiency o can have one database for some subject area:
student database, employee database- how do we structure the data [on disk] for a database?- Different people need different combinations [views] of data; yet the database must satisfy all views
Relational Database
- data set up as a centralized collection of tables
- tables are linked by common columns of data
o - tables are designed together to minimize repetition
- this is the most common database structure
- - this is the one used in microcomputer databases and many larger ones too
The database management system -software that sets up the database structure (‘skeleton’ on disk according to a certain model, fills the structure with data and retrieves the data to provide meaningful information (p. 130 bot) -related to using a dbms:- data dictionary: defines each field and record, explains what each field means and who is authorized to update it (p. 143)-query language: used to extract data that satisfy certain criteria from a database
The Database Management System
Data Dictionary
• Is a document that database designers prepare to help individuals enter data
• Provides several pieces of information about each attribute in the database including:
- Name- Key (is it a key or part of a key?)- Data Type (date, alphanumeric, numeric, etc.)- Valid Value (the format or numbers allowed)
• Can be used to enforce Business Rules which are captured by the database designer to prevent illegal or illogical values from entering the database. (e.g. who has authority to enter certain kinds of data)
Data Dictionary
• Is a document that database designers prepare to help individuals enter data
• Provides several pieces of information about each attribute in the database including:
- Name- Key (is it a key or part of a key?)- Data Type (date, alphanumeric, numeric, etc.)- Valid Value (the format or numbers allowed)
• Can be used to enforce Business Rules which are captured by the database designer to prevent illegal or illogical values from entering the database. (e.g. who has authority to enter certain kinds of data)
6
e.g., SQL: Structured Query Language:- Popular language for making requests to a relational dbms (p. 145 bot) e.g., SELECT LAST NAME, DEPARTMENT, SALARY FROM EMPLOYEE WHERE DEPARTMENT = ‘4530’ AND SALARY > 25000 Popular dbms:
Access - individualOracle - Client / Server network
Data bases and computer networks - - can have centralized database (at one location)- or, fragmented, distributed database: different parts of the database are stored in the locations where they are accessed most often, but continue to be fully accessible to others;
Database Administrator (DBA): person in an IS department responsible for the database (p. 147) Databases on the Web: - the only way for organizations to conduct business on the Internet is to people outside the organizations access to their databases (e.g., items for auction) 138 bot
Text databases Databases are thought of as consisting of structured records of fields; but can also have ‘databases’ of text - - these can be on the web and can use search engines to search for keywords- - such ‘databases’ can come on CD-ROMs
Data Warehousing
- most regular organizational databases are ‘transactional’ and current; their contents change daily- - but transaction data in such a database as of a certain time (a ‘snapshot’) can be archived- - a data warehouse is a large (relational) database of archived data used to gain extra insights to support management decision making- -combines data from databases across the organization
--data mart: smaller collection of data that focuses on a particular subject or department-
DepartmentDatabases
Data Warehouse
Data Mart
Operational Informational
ExtractData
ExtractData
ExtractData
ExtractData
• Day-to-day department transactions
• Used primarily by departments
• Extracted department transactions
• Used for business analysis
• Extracted subset of a data warehouse
• Used for highly specific business analysis
Data Warehouses, Data Marts, and Data Mining (continued)
11
Data Warehousing and Miningp. 155-157
• data in warehouses is accumulated and may reflect many years of business activity
• various business analysis programs (models) can then be applied to the data warehouse to gain insight
• organizations often choose mainframe computers with multiple CPUs to store and manage data warehouses;
•• Data Mining• the process of selecting, exploring, and modelling
large amounts of data to discover previously unknown relationships
- data mining software searches through large amounts of data for meaningful patterns of information
- often used by marketing managers to see purchasing patterns
- can be used in banking: to find profitable customers and patterns of fraud
Data Warehouses, Data Marts, and Data Mining (continued)
• Predictive analysis:– Form of data mining that combines historical
data with assumptions about future conditions to predict outcomes of events
– Used by retailers to upgrade occasional customers into frequent purchasers
– Software can be used to analyze a company’s customer list and a year’s worth of sales data to find new market segments
14
Database Design: - -before we can even start asking a dbms to set up a database structure for us, we must know what data we need to store and in what form- -this work of determining what our database should look like is called database design- - it is the ‘up front’ work in using dbms
- - a first step in database design is data modelling - we look at a business and see what main items we want to keep data about [data entities] and what are the organizational relationships between these items
we then draw a diagram of entities and relationships between them; this is called data modelling - we draw an entity-relationship diagram
(see board)
STUDENT takes COURSEIs taken by
- from this first step a data analyst would eventually determine a database design (schema) (what tables we will have and where the common columns will be) -
Designing Databases - Associations
Entity Relationship Diagram (ERD) • Diagramming tool used to express entity relationships• Very useful in developing complex databases
Example• Each Home Stadium has a Team (One-to-One) • Each Team has Players (One-to-Many)• Each Team participates in Games• For each Player and Game there are Game Statistics
17
More design
• The entity-relationship diagram is then augmented to show attributes of each entity
-Then, a large ERD is “trimmed down’ (normalized) and after that, entities usually become tables and attributes become columns in the tables
Relationships in an ERD
• Relationships between entities in the ERD dictate where the common column should be to relate certain tables
The Relational Model - Example
20
REA Model
• - an ERD where the entities model specific business RESOURCES, EVENTS and AGENTS
is called a Resource Event Agent model
- This model is useful in designing databases for various business information systems (e.g. modern accounting systems)
REA Example
• COOKIES(resource)
Offered for
SALE(ev) org by SALESPERSON (ag)
attended by
CUSTOMER(ag)
Why is database knowledge important?
how you structure your data, so you shall be able to process your data -***a database is the ‘back end’ of a computer-based information system
-See handout (MS Word) on Data Modeling Explanation