Post on 12-Jan-2016
Lecture 1
Dr. Nawaz Khan, School of Computing Science, E-mail: n.x.khan@mdx.ac.uk
BIS4435 – Data Management for Decision Support
BIS4435 – Database Management for Decision Support
BIS4435 - Online Database Systems
Coursework
Discussion topics
- Each week you will have a discussion topic (DQ).
- The DQ will be available on Fridays; reply to it by Tuesday.
- Review and comment on at least one of your classmates' responses.
DQ requirements
- Your initial response to the DQ should be 250 words.
- Your review and comments on a classmate's initial response should be 150 words.
- Give appropriate references at the end of the text and cite them properly throughout the text (avoid quotes). Use Harvard-style referencing.
- Print your initial post, your responses to your classmates' initial responses, and any comments that classmates make on your post. Keep them so that you can attach them to your CW portfolio.
More on Coursework: Self Reflection
Make sure you include a self-reflection of one to one and a half pages covering both pieces of coursework.
It should contain:
- What you have achieved so far
- What you found most difficult
- What you found easiest
- What challenges you faced in doing the coursework
- How you overcame the challenges
- What motivated you
- What new concepts you came across and how they help to achieve the learning outcomes
- What you could have done better
Introduction
- Copying coursework from someone? See the University plagiarism policy in the students' handbook.
- Don't forget to include references (follow the specific guidelines for referencing).
- Acknowledge sources properly, since collusion is not accepted.
- Make sure you have cited your references.
- Examination: governed by the University regulations.
- Contents of the module: 3 main parts
  - Theoretical aspects of relational databases
  - Large databases: data warehouse and data mining
  - Decision support systems
- General diagram of the module: see next slide
[Figure: general diagram of the module. Box labels: Introduction; Theoretical Aspects of Relational DB; Example Relational DB: Oracle; DB Back-End Decision Support System; Large Data Management; Data Strategy and Data Warehousing; Intelligent Database and Data Mining; Expert system and knowledge engineering; Uncertainty, probability and linear model, feed-forward network, self-organising map; Fuzzy logic, genetic algorithm and hybrid intelligent]
Reading Materials
Connolly, T.M., and Begg, C.E., Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 4th Edition, ISBN: 0321210255
Belavkin, R., Blundell, B., Cairns, P., Huyck, C., Mitchell, I., Stockman, T. (2005). Management Support Systems, Middlesex University Press, ISBN: 1-898253-68-4
Module Learning Units at OasisPlus
Introduction
- File-based approach
- Shared file approach
- Database approach: 3-schema architecture
  - External schema
  - Conceptual schema
  - Internal schema
- DBMS components
- Characteristics of the database approach
- Brief history of database systems and applications
- Extending database capabilities for new applications
- Reading suggestion:
  - Connolly, T.M., and Begg, C.E., Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 4th Edition, ISBN: 0321210255 (chapter 1)
  - Global campus materials on OASIS: http://oasis.mdx.ac.uk/ (unit 1)
Introduction
File-based approach
- Data is stored in one or more separate computer files
- Data is then processed by computer programs (applications)
- Problems:
  - Data redundancy
  - Data inconsistency
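The two problems can be illustrated with a minimal sketch. It assumes two hypothetical applications (invoicing and ordering) that each keep a private copy of the same customer record; the dictionaries, the customer id and the addresses are invented for illustration and do not come from the lecture.

```python
# Hypothetical illustration: each application owns its own customer "file".
invoicing_file = {"C001": {"name": "A. Smith", "address": "1 High St"}}
orders_file = {"C001": {"name": "A. Smith", "address": "1 High St"}}  # redundant copy


def update_address(customer_file, customer_id, new_address):
    """Update the address in one application's private file only."""
    customer_file[customer_id]["address"] = new_address


def inconsistent_customers(file_a, file_b):
    """Return the ids of customers whose records differ between the two files."""
    return [cid for cid in file_a if cid in file_b and file_a[cid] != file_b[cid]]


# The invoicing application records a change of address,
# but the ordering application's copy is never updated.
update_address(invoicing_file, "C001", "9 Park Rd")

print(inconsistent_customers(invoicing_file, orders_file))
```

Storing the record twice is the redundancy; the two copies drifting apart after a single update is the inconsistency.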
Introduction
[Figure: file-based approach versus shared file approach. Applications: Customer Invoicing, Purchase Orders, Customer Orders, Stock Control. Files: Customer File, Order File, Stock File, Supplier File. In the file-based panel the file labels are repeated per application; in the shared file panel each file appears once.]
Introduction
Shared file approach
- Data (files) is shared between different applications
- The data redundancy problem is alleviated
- The data inconsistency problem across different versions of the same file is solved
- Other problems:
  - Rigid data structure: if applications have to share files, the file structure that suits one application might not suit another
  - Physical data dependency: if the structure of the data file needs to be changed in some way, this alteration will need to be reflected in all application programs that use that data file
  - No support for concurrency control: while a data file is being processed by one application, the file will not be available for other applications or for ad hoc queries
Introduction
Database approach
- Database management system (DBMS): a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications
- Database: a collection of related data managed by a DBMS
- Data: known facts that can be recorded and that have implicit meaning
- Database system = the database + DBMS software
- A DBMS provides facilities for querying, data security, integrity and concurrency control
- Database application: a set of programs that use the DBMS to perform a particular business function
Introduction
Database approach
- Advantage: physical and logical data independence, achieved by a hierarchy of levels of data specification
[Figure: three-schema architecture. External Schema 1, External Schema 2, ..., External Schema N; interface between conceptual schema and external schemas; Conceptual Schema; interface between conceptual schema and internal schema; Internal Schema; database physically stored in files on disks.]
Introduction
Database approach
- External schema:
  - Describes the database as it is seen by users and applications; reflects a simplified model of the world
  - Allows applications to see as much of the data as they require, while excluding unrelated data items
  - Interfaces with the conceptual schema
  - May be modified or created without altering the physical storage of data; the modification is reflected in the interface
Introduction
Database approach
- Conceptual schema:
  - Describes the universe of interest to the users of the database system, i.e. the data required
  - Concerned with data rather than storage or access; concentrates on describing entities, data types, relationships, user operations, and constraints
  - Interfaces with the external and internal schemas
  - Logical data independence is the capability to change the conceptual schema without having to change external schemas or application programs
Introduction
Database approach
- Internal schema:
  - Defines the way in which data is physically stored
  - The interface with the conceptual schema identifies how an item in the conceptual schema is stored and accessed
  - Physical data independence is the capability to change the internal schema without having to change the conceptual schema. Hence, the external schemas need not be changed either
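The three levels can be approximated in a small sketch using Python's built-in `sqlite3` module. This is only an analogy under stated assumptions: a table stands in for the conceptual schema, a view for one external schema, and an index for a change to the internal schema. The `employee` table, the `staff_directory` view and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical description of the data (hypothetical table).
conn.execute(
    "CREATE TABLE employee (empno INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Sales", 30000.0), (2, "Bob", "IT", 35000.0)],
)

# External level: a view that exposes only what one application needs (no salary).
conn.execute("CREATE VIEW staff_directory AS SELECT empno, name, dept FROM employee")


def directory():
    """Query written against the external schema only."""
    return conn.execute(
        "SELECT name, dept FROM staff_directory ORDER BY empno"
    ).fetchall()


before = directory()

# Internal level: change the physical organisation by adding an index.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

after = directory()
print(before == after)
```

The query in `directory()` is unchanged by the new index, which is the point of physical data independence: the storage-level change is invisible to the application.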
Introduction
DBMS components
- DBMS engine: central component
- User interface: languages & interfaces
- Data dictionary: data structure, users & access rights, rules
- Performance management: query optimisation & DBMS reorganisation
- Data integrity: intra-record, referential integrity & concurrency
- Backup and recovery: log
- Application development: CASE tool
- Security management: protect and control access to the database & data dictionary
- See the reading suggestion for more details
Introduction
Characteristics of the database approach
- Self-describing nature of a database system
- Insulation between programs and data, and data abstraction
  - Program-data independence + program-operation independence = data abstraction
  - A data model is a type of data abstraction
- Support of multiple views of the data
- Sharing of data and multi-user transaction processing
- Other advantages of using the DBMS approach
  - Controlling redundancy
  - Restricting unauthorized access
  - Providing persistent storage for program objects
  - Providing storage structures for efficient query processing
  - Providing backup and recovery
Introduction
Characteristics of the database approach
- Other advantages of using the DBMS approach (cont.)
  - Providing multiple user interfaces
  - Representing complex relationships among data
  - Enforcing integrity constraints
  - Permitting inference and actions using rules: deductive database systems
Introduction
Extending database capabilities for new applications
- Example applications: storage and retrieval of images and videos, data mining (large amounts of data need to be stored and analyzed), spatial databases, time series applications, …
- More complex data structures than the relational representation
- New data types beyond the basic numeric and character string types
- New operations and query languages for the new data types
- New storage and retrieval methods
Distributed Database
A distributed database management system (DDBMS) employs a number of computer workstations at different sites. These workstations are part of a local network system. The workstations contain a set of hardware and software that allows them to be an integral part of this network, and the DDBMS must rely on these network components for its data exchange. The workstations need to be attached to each other through a communication medium that allows the sites to interact and to carry data.
Introduction
Summary: Introduction to the module, BIS4435
- File-based approach
- Shared file approach
- Database approach
- DBMS components
- Characteristics of the database approach
- Brief history of database systems and applications
- Distributed Database
Reading suggestion: do not forget!
Next week: Relational Data Model
Reading suggestion:
- [1]: Chapter 1 (The relational data model and relational database constraints)
- [3]: Unit 1: Global Campus Unit
BIS4435
BIS4435 – Data Management for Decision Support
Lecture 2: Relational Data Model
Dr. Nawaz Khan, School of Computing Science, E-mail: n.x.khan@mdx.ac.uk
BIS4435
Introduction
Unit 1: Introduction to the Module: Introduction to the Database
Unit 2: Fundamentals of Relational and Object Model
Reading suggestion:
Unit 1:
- Connolly, T.M., and Begg, C.E., Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 4th Edition, ISBN: 0321210255 (chapter 1)
- Global campus materials on OASIS: http://oasis.mdx.ac.uk/ (unit 1)
Unit 2:
- Connolly, T.M., and Begg, C.E., Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 4th Edition, ISBN: 0321210255 (chapter 3)
Relational Data Model
Outline
- Basic Concepts: relational data model, relation schema, domain, tuple, cardinality & degree, database schema, etc.
- Relational Data Model Constraints
  - key, primary key & foreign key
  - entity integrity constraint
  - referential integrity
- Update Operations on Relations
  - insert
  - deletion
  - modification
- Summary
- Q&A
Relational Data Model
Basic concepts
- Relational data model: represents a database in the form of relations, i.e. 2-dimensional tables with rows and columns of data. A database may contain one or more such tables. A relation schema is used to describe a relation.
- Relation schema: R(A1, A2, …, An) is made up of a relation name R and a list of attributes A1, A2, ..., An. Each attribute Ai is the name of a role played by some domain D in the relation schema R. R is called the name of this relation. The degree of a relation is the number of attributes n of its relation schema.
- Domain D: D is called the domain of Ai and is denoted by dom(Ai). It is a set of atomic values and a set of integrity constraints.
Example: STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA)
- Degree = ??
- dom(GPA) = ??
Relational Data Model
Basic concepts
- Tuple: a row in the table, a record
- Cardinality: the number of tuples (rows)
- Database schema: S = {R1, R2, …, Rm}
- A relation (or relation state, relation instance) r of the relation schema R(A1, A2, ..., An), also denoted by r(R), is a set of n-tuples r = {t1, t2, ..., tm}. Each n-tuple t is an ordered list of n values t = <v1, v2, ..., vn>, where each value vi, i = 1..n, is an element of dom(Ai) or is a special null value. The ith value in tuple t, which corresponds to the attribute Ai, is referred to as t[Ai].
[Figure: concept hierarchy. Labels: Relational data model; Database schema; Relation schema; Relation; Tuple; Attribute.]
Relational Data Model
- A relation can be conveniently represented by a table, as the above example shows
- The columns of the tabular relation represent attributes
- Each attribute has a distinct name, and is always referenced by that name, never by its position
- Each row of the table represents a tuple. The ordering of the tuples is immaterial and all tuples must be distinct
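These properties can be mirrored in a short sketch that models a relation state as a Python set of named tuples. The schema EMP(EmpNo, Name, Dept) and its rows are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical relation schema EMP(EmpNo, Name, Dept).
Emp = namedtuple("Emp", ["EmpNo", "Name", "Dept"])

# A relation state is a set of tuples: duplicates collapse and order is irrelevant.
r = {
    Emp(1, "Ann", "Sales"),
    Emp(2, "Bob", "IT"),
    Emp(2, "Bob", "IT"),  # duplicate tuple, kept only once by the set
}

degree = len(Emp._fields)  # number of attributes in the schema
cardinality = len(r)       # number of tuples in the relation state

# Attributes are referenced by name (t[Ai] in the slide's notation), not by position.
names = sorted(t.Name for t in r)

print(degree, cardinality, names)
```

A set is used rather than a list precisely because a relation has no tuple ordering and no duplicate tuples.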
Relational Data Model
Relational Data Model Constraints
- Key, primary key & foreign key
  - Key: A key K of a relation R is a subset of the attributes of R which has the following time-independent properties:
    - Unique identification: the value of K must uniquely identify each tuple in R
    - Non-redundancy: no attribute in K can be discarded without destroying the unique identification property
  - Primary key: There may be more than one set of attributes that satisfies both properties. Each such set is called a candidate key. Exactly one candidate key is chosen as the primary key (PK). A primary key cannot contain a null value.
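The two key properties can be checked mechanically on a given set of rows, as in the sketch below. The helper names and the sample rows are hypothetical. Because the properties are time-independent, a check on one relation state can only rule a key out; it cannot prove that the attribute set is a key for every future state.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_candidate_key(rows, attrs):
    """Check unique identification and non-redundancy on the given rows."""
    if not is_unique(rows, attrs):
        return False
    # Non-redundancy: no proper subset of attrs may already be unique.
    return not any(
        is_unique(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Hypothetical task rows spanning two projects.
rows = [
    {"PROJECT_NAME": "P1", "TASK_NAME": "Design", "EMPNO": 10},
    {"PROJECT_NAME": "P1", "TASK_NAME": "Build", "EMPNO": 11},
    {"PROJECT_NAME": "P2", "TASK_NAME": "Design", "EMPNO": 10},
]

print(is_candidate_key(rows, ("TASK_NAME",)))
print(is_candidate_key(rows, ("PROJECT_NAME", "TASK_NAME")))
print(is_candidate_key(rows, ("PROJECT_NAME", "TASK_NAME", "EMPNO")))
```

In this sample, TASK_NAME alone fails unique identification, the pair (PROJECT_NAME, TASK_NAME) satisfies both properties, and the three-attribute set is unique but redundant.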
Relational Data Model
Relational Data Model Constraints
- Key, primary key & foreign key
  - Single primary key:
    TASK (TASK_NAME, START_DATE, EXPECTED_COMP_DATE, COMP_DATE, EMPNO)
  - Combined primary key: if we want to store in the task table the details of tasks for a number of different projects, TASK_NAME is no longer a unique identifier. The solution is to use a further attribute to provide a unique identifier for each task.
    TASK (PROJECT_NAME, TASK_NAME, START_DATE, EXPECTED_COMP_DATE, COMP_DATE, EMPNO)
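A combined primary key can be tried out with Python's built-in `sqlite3` module, as a minimal sketch. The module uses Oracle as its example DBMS, so SQLite is an assumption made here for a self-contained example; the column subset, project names and dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE task (
        project_name TEXT NOT NULL,
        task_name    TEXT NOT NULL,
        start_date   TEXT,
        empno        INTEGER,
        PRIMARY KEY (project_name, task_name)  -- combined (composite) primary key
    )
    """
)
conn.execute("INSERT INTO task VALUES ('P1', 'Design', '2020-01-06', 10)")
# The same task name in a different project is allowed.
conn.execute("INSERT INTO task VALUES ('P2', 'Design', '2020-02-03', 11)")


def try_insert(row):
    """Return 'ok' if the row is accepted, 'rejected' if a constraint fails."""
    try:
        conn.execute("INSERT INTO task VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Repeating an existing (project_name, task_name) pair violates the key.
duplicate = try_insert(("P1", "Design", "2020-03-02", 12))
row_count = conn.execute("SELECT COUNT(*) FROM task").fetchone()[0]
print(duplicate, row_count)
```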
Relational Data Model
Example: A relational database schema that we call COMPANY = {EMPLOYEE, DEPARTMENT, DEPT_LOCATIONS, PROJECT, WORKS_ON, DEPENDENT}
Relational Data Model
Relational Data Model Constraints
- Key, primary key & foreign key
  - Foreign key (FK): a set of attributes in R1 is a FK of R1 if it refers to the primary key of a relation schema R2 (usually another relation), so that every non-null FK value in R1 matches a primary key value in R2
  - A FK represents the relationship between R1 and R2
  - A FK can contain a null value, but we may avoid this by replacing the null with a flag
  - Example:
    TASKS (TASK_NAME, START_DATE, EXP_COMP_DATE, COMP_DATE, EMPNO, PROJECT_NAME*)
    PROJECT (PROJECT_NAME, START_DATE, EXP_COMP_DATE, COMP_DATE, PROJECT_LEADER)
  - To form the link between the two tables, we place the primary key of the PROJECT table into the TASKS table. Through the use of PROJECT_NAME as a FK, we can see, for any given task, the project to which it belongs
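The TASKS/PROJECT link can be sketched with Python's built-in `sqlite3` module. SQLite stands in for the module's DBMS here as an assumption, only a subset of the columns is used, and the sample rows are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute(
    "CREATE TABLE project (project_name TEXT PRIMARY KEY, project_leader TEXT)"
)
conn.execute(
    """
    CREATE TABLE tasks (
        task_name    TEXT PRIMARY KEY,
        empno        INTEGER,
        project_name TEXT REFERENCES project (project_name)  -- the FK
    )
    """
)
conn.execute("INSERT INTO project VALUES ('Payroll', 'Ann')")
conn.execute("INSERT INTO tasks VALUES ('Design', 10, 'Payroll')")
conn.execute("INSERT INTO tasks VALUES ('Unassigned', 11, NULL)")  # a null FK is permitted

# Follow the FK from a task to the project it belongs to.
row = conn.execute(
    """
    SELECT t.task_name, p.project_name, p.project_leader
    FROM tasks AS t
    JOIN project AS p ON p.project_name = t.project_name
    WHERE t.task_name = 'Design'
    """
).fetchone()
print(row)
```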
Relational Data Model
Relational Data Model Constraints
- Entity integrity constraint: all entities in a database must be identified by the primary key. That is why NO primary key value can be null
- Referential integrity constraint: the value of a foreign key must be meaningful, which means that a tuple in one relation that refers to another relation must refer to an existing tuple in that relation
- Example: JobList and Company tables (see notes for labs 02, 03)
  JobList (JobID, CompanyID, JobTitle, Salary, …)
  Company (CompanyID, CompanyName, Address, ...)
Relational Data Model
Update Operations on Relations
- Insertion: to insert a new tuple t into a relation R. When inserting a new tuple, make sure that the database constraints are not violated:
  - The value of an attribute should be of the correct data type (i.e. from the appropriate domain)
  - The value of a prime attribute (i.e. the key attribute) must not be null
  - The key value(s) must not be the same as that of an existing tuple in the same relation
  - The value of a foreign key (if any) must refer to an existing tuple in the corresponding relation (labs!)
- Two options if the constraints are violated:
  - Reject the operation
  - Rectify the operation
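The four insertion checks can be exercised in a sketch with Python's built-in `sqlite3` module, where the DBMS takes the "reject the operation" option. The tables loosely follow the JobList/Company example, but the column names, the CHECK used to model the salary domain and all sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE company (company_id INTEGER PRIMARY KEY, company_name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE joblist (
        job_id     TEXT NOT NULL PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES company (company_id),
        job_title  TEXT,
        -- Domain check: salary must be numeric (or null).
        salary     REAL CHECK (salary IS NULL OR typeof(salary) IN ('integer', 'real'))
    )
    """
)
conn.execute("INSERT INTO company VALUES (1, 'Acme')")


def try_insert(row):
    """Return True if the DBMS accepts the tuple, False if it rejects it."""
    try:
        conn.execute("INSERT INTO joblist VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid tuple": try_insert(("J1", 1, "Analyst", 30000.0)),
    "wrong domain": try_insert(("J2", 1, "Analyst", "lots")),
    "null key": try_insert((None, 1, "Clerk", 20000.0)),
    "duplicate key": try_insert(("J1", 1, "Clerk", 20000.0)),
    "dangling foreign key": try_insert(("J3", 99, "Clerk", 20000.0)),
}
for case, accepted in results.items():
    print(case, "->", "accepted" if accepted else "rejected")
```

Only the first tuple is accepted; each of the other four violates exactly one of the constraints listed above.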
Relational Data Model
Update Operations on Relations
- Deletion: to remove an existing tuple t from a relation R. When deleting a tuple, the following constraints must not be violated:
  - The tuple must already exist in the database
  - The referential integrity constraint is not violated
- Four options if the constraints are violated:
  - Reject the operation
  - Rectify the operation to find the existing tuple concerned
  - Also remove the tuples that reference the tuple being deleted
  - Modify the referencing attribute values (i.e. values of foreign keys) that cause the violation
- Modification: to change values of some attributes of an existing tuple t in a relation R
- Test all above operations and constraints during the labs
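Three of the deletion options map onto standard SQL foreign key actions, sketched below with Python's built-in `sqlite3` module: RESTRICT rejects the deletion, CASCADE also removes the referencing tuples, and SET NULL modifies the referencing foreign key values. The tables and rows are hypothetical, and SQLite is an assumption made for a self-contained example.

```python
import sqlite3


def build(on_delete):
    """Create a fresh hypothetical company/joblist pair with the given FK action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE company (company_id INTEGER PRIMARY KEY, company_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE joblist (job_id INTEGER PRIMARY KEY, "
        "company_id INTEGER REFERENCES company (company_id) ON DELETE " + on_delete + ")"
    )
    conn.execute("INSERT INTO company VALUES (1, 'Acme')")
    conn.execute("INSERT INTO joblist VALUES (100, 1)")
    return conn


def delete_company(on_delete):
    """Delete the referenced company; report the outcome and the remaining jobs."""
    conn = build(on_delete)
    try:
        conn.execute("DELETE FROM company WHERE company_id = 1")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    return outcome, conn.execute("SELECT job_id, company_id FROM joblist").fetchall()


for action in ("RESTRICT", "CASCADE", "SET NULL"):
    print(action, delete_company(action))
```

Which action is appropriate depends on the meaning of the relationship: CASCADE suits tuples that cannot exist without their parent, while SET NULL suits an optional link.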
BIS4229 – Industrial Data Management Technology
ACTIVITY
Introduction
Summary: Relational Data Model
- Basic concepts
- Constraints
- Update operations on relations
Reading suggestion: do not forget! (other relational operations)
Next week: ER Modeling
Reading suggestion:
- [1]: Chapter 3, 7
- [3]: Unit 2