![Page 1: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/1.jpg)
September 23, 2015 Sam Siewert
CS317 File and Database Systems
Lecture 5, Part-2 – ORDBMS http://www.ibmbigdatahub.com/video/ibm-big-data-minute-drowning-petabytes
![Page 2: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/2.jpg)
SQL Theory and Standards
DBMS Design (Connolly-Begg Chapter 10)
Part-2 Development Lifecycle
![Page 3: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/3.jpg)
For Discussion… Big Data – Velocity, Volume, Variety, Veracity [2014]
1. Daily – 2.5 quintillion bytes (2,500,000,000,000,000,000), which is 2.5 Exabytes in powers of 10 or about 2.2 Exabytes in powers of 2, or 46,566,128 50GB Blu-Ray Discs, IBM Estimate
2. Annually – 7.5 billion in global population, produce/consume 2.25 unique Blu-Rays per Year, or 23 DVDs (assuming even distribution – unlikely)
3. Annually – If produced/consumed by US population alone – 53 Blu-Rays per Year or 564 DVDs per person
4. Data in Total is 40 trillion gigabytes or 800 billion Blu-Rays for just over 100 (unique) Blu-Rays per person globally
5. Data by Powers of 10 and 2 – 2^64 bytes is 16 Exabytes of Addressable Data [PC limit]
6. Data Max Velocity is 100 Gbps, the Fastest Ethernet [assuming 8b/10b-style encoding overhead – 10 billion bytes per second]
7. How much is Truly Unique Data vs. Duplicated
8. What is the Quality (Veracity) of this Data?
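The estimates above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: the constants are the slide's own figures, the variable names are made up for illustration, and the 50GB disc is assumed to be counted in binary gigabytes because that is what reproduces the 46,566,128 figure.

```python
# Back-of-envelope check of the slide's estimates (constants come from the slide).
DAILY_BYTES = 2.5e18            # 2.5 quintillion bytes produced per day
BLU_RAY_BYTES = 50 * 2**30      # assumption: 50 GB disc counted in binary gigabytes
WORLD_POPULATION = 7.5e9        # slide's population figure

discs_per_day = DAILY_BYTES / BLU_RAY_BYTES
discs_per_person_per_year = discs_per_day * 365 / WORLD_POPULATION
total_discs = 40e12 / 50        # 40 trillion GB of data stored on 50 GB discs
total_discs_per_person = total_discs / WORLD_POPULATION
addressable_exabytes = 2**64 // 2**60   # 64-bit address space in binary exabytes

print(int(discs_per_day))               # 46566128
print(discs_per_person_per_year)        # a little over 2 discs per person per year
print(total_discs_per_person)           # just over 100 discs per person
print(addressable_exabytes)             # 16
```

The per-person result comes out slightly above the slide's rounded 2.25, which is within the precision of an estimate like this.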
![Page 4: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/4.jpg)
Big Data Volume and Velocity Can Be Estimated as Shown
– Disk drives shipped and in use
– Online data only, or removable and archive media as well?
– Bit-rot (media eventually fails, limited storage lifetime)
Variety, Depends on Level of Data Duplication
– Enterprise Storage System Deduplication – E.g. EMC Deduplication
– Internet Archive [petabytes] and Wayback machine, http://www.loc.gov/about/general-information/ [traditional volumes], Stanford Digital Repository, National Archives, National A/V Conservation
Veracity, perhaps Most Challenging Part
– Is the Data Correct – Not Corrupted
– Is it Valid – From a Known, Trusted Source, Corresponding to Metadata Description
– Has the Data Been Processed and if so, How?
– Is it Raw Data (from a sensor, user, other)?
– Veracity is difficult – E.g. http://berkeleyearth.org/about-data-set
![Page 5: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/5.jpg)
Quiz #2
Let’s Go Over it …
![Page 6: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/6.jpg)
Quiz #2 Average was 68.3, Std. Deviation was 17.5 – Primarily Need to Study Book More
Quiz #1 – 81.5, 8.5 (Ideal) – Mostly from In-Class Notes
Let’s Go Over Solutions Now with Book Citations
Solutions Provide References Back to the Book – Posted on Canvas as Well
![Page 7: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/7.jpg)
Quiz #2 - Review
Equi-join is a specific type of Theta-Join where the Predicate tests for EQUIVALENCE ONLY
Review BOOK citations for Correct Answer Carefully before Next Quiz and Exam
![Page 8: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/8.jpg)
Quiz #2 - Review
See p. 119, 132:
1) Selection [Restriction]
2) Projection [Projection]
3) Union [Join – Specific Union]
4) Set Difference [Codd Omits]
5) Cartesian Product [Permutation]
Encouraged! See Class Notes and Example of TC, RA, and Use of DISTINCT
Review BOOK citations for Correct Answer Carefully before Next Quiz and Exam
![Page 9: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/9.jpg)
Required [Except Intersection]
Pearson Education © 2014 9
Intersection can be composed from Set Difference: R ∩ S = R – (R – S)
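The composition can be checked on small relations. A minimal sketch, modelling each relation as a Python set of tuples; the relation contents are invented for illustration.

```python
# Relations modelled as sets of tuples (contents are illustrative only).
R = {(1, "a"), (2, "b"), (3, "c"), (4, "d")}
S = {(3, "c"), (4, "d"), (5, "e")}

only_in_r = R - S            # tuples of R that are not in S
composed = R - only_in_r     # what remains of R must also be in S

print(composed == (R & S))   # True: same result as the built-in intersection
```

This is why Intersection does not need to be in the required set: two applications of Set Difference give the same result.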
![Page 10: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/10.jpg)
Nice to Have! - Relational Algebra Operations – Composed from Required
Pearson Education © 2014 10
![Page 11: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/11.jpg)
Quiz #2 - Review
Review BOOK citations for Correct Answer Carefully before Next Quiz and Exam
![Page 12: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/12.jpg)
PK, FK EQUIVALENCE
Book Says that EQUIVALENCE for Equi-Join is a Predicate that Uses "=" – p. 126 (bottom)
This is Simplistic, especially for Multi-table Joins and PKs formed from more than One Attribute
E.g. if (X == Y) Can in Fact Involve a Complex Comparison
– E.g. if X is a vector = [1, 1, 3] and Y is a vector, then EQUIVALENCE requires Comparison of Each Component
– if ((X[0] == Y[0]) && (X[1] == Y[1]) && (X[2] == Y[2]))
Likewise, Consider Simple Tuples of FirstName, LastName, DoB [PK=FirstName, LastName]
Another Relation [FK=FirstName, LastName] with Street Address, City, Zipcode
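A composite-key equi-join along these lines might look like the sketch below. It uses Python's built-in `sqlite3` module rather than MySQL so that it runs without a server; the table names `Person` and `Address`, the column names, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (
    firstName TEXT, lastName TEXT, dob TEXT,
    PRIMARY KEY (firstName, lastName)            -- composite PK
);
CREATE TABLE Address (
    firstName TEXT, lastName TEXT, street TEXT, city TEXT, zipcode TEXT,
    FOREIGN KEY (firstName, lastName)
        REFERENCES Person (firstName, lastName)  -- composite FK
);
INSERT INTO Person VALUES ('Ann', 'Lee', '1990-01-01'), ('Ann', 'Ray', '1985-05-05');
INSERT INTO Address VALUES ('Ann', 'Lee', '1 Main St', 'Prescott', '86301');
""")

# Equivalence on the whole key: every component must match.
full_key = conn.execute("""
    SELECT p.firstName, p.lastName, a.city
    FROM Person p JOIN Address a
      ON p.firstName = a.firstName AND p.lastName = a.lastName
""").fetchall()

# Comparing only one component wrongly pairs Ann Ray with Ann Lee's address.
partial_key = conn.execute("""
    SELECT p.firstName, p.lastName, a.city
    FROM Person p JOIN Address a ON p.firstName = a.firstName
""").fetchall()

print(full_key)          # [('Ann', 'Lee', 'Prescott')]
print(len(partial_key))  # 2
```

The second query shows why a single "=" is too simple a picture: with a multi-attribute key the join predicate is a conjunction of equalities, one per key component.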
![Page 13: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/13.jpg)
Join Cheat Sheet http://www.codeproject.com/KB/database/Visual_SQL_Joins/Visual_SQL_JOINS_orig.jpg
![Page 14: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/14.jpg)
JOINS You Must Know
MySQL Join Support – Inner, Cross, Left, Right, Outer, Natural, Multi-table with Predicates (Theta and Equi-Join)
Cross-Join [p. 171, Matches Theory p. 126]
Theta-Join [p. 170 – 3 Table Join]
Equi-Join [p. 168-169]
Natural-Join (Rarely Used, but Matches Theory on p. 127)
Inner-Join (Not in Book! But, Common in MySQL)
Alternative Form – Nested Queries [p. 164]
Other Joins You are Not Responsible For (Less Useful)
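The join types listed above can be compared side by side on a tiny data set. This is a sketch under stated assumptions: it runs against SQLite through Python's `sqlite3` module as a stand-in for MySQL (the join syntax shown is accepted by both), and the `Staff` and `Branch` tables, their columns, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, city TEXT);
CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER, branchNo TEXT);
INSERT INTO Branch VALUES ('B1', 'Prescott'), ('B2', 'Boulder');
INSERT INTO Staff VALUES ('S1', 30000, 'B1'), ('S2', 45000, 'B1'), ('S3', 50000, NULL);
""")

def count(sql):
    """Return the number of rows a query produces."""
    return len(conn.execute(sql).fetchall())

# Cross join: Cartesian product, 3 staff x 2 branches.
cross = count("SELECT * FROM Staff CROSS JOIN Branch")
# Equi-join written as an inner join: predicate uses only '='.
equi = count("SELECT * FROM Staff s INNER JOIN Branch b ON s.branchNo = b.branchNo")
# Natural join: implicit equality on the shared column name branchNo.
natural = count("SELECT * FROM Staff NATURAL JOIN Branch")
# Theta join: any comparison operator, here '<' on salary.
theta = count("SELECT * FROM Staff s JOIN Staff t ON s.salary < t.salary")
# Left outer join: keeps S3 even though it has no branch.
left = count("SELECT * FROM Staff s LEFT JOIN Branch b ON s.branchNo = b.branchNo")

print(cross, equi, natural, theta, left)  # 6 2 2 3 3
```

The equi-join and natural join return the same rows here; the natural join simply infers the equality predicate from the matching column name.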
![Page 15: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/15.jpg)
Connolly-Begg Chapter 9
ORDBMS Extensions to SQL (SQL:2011)
Part-2
![Page 16: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/16.jpg)
Unstructured Data
BLOBs – Binary Large Objects
– Images
– Digital Video and Audio
– Digital Media
– Binary Data (Documents and Code), Perhaps Proprietary
– http://mercury.pr.erau.edu/~siewerts/extra/images/example-images/Moose-to-Skeleton.png
– http://mercury.pr.erau.edu/~siewerts/extra/images/example-images/Sled-Dogs.jpg
– http://mercury.pr.erau.edu/~siewerts/extra/images/example-images/korean-air-profile.jpg
CLOBs – Character Large Objects
– Log files and Traces (IT)
– Transaction Logs
– XML, HTML, XDS, etc. [Web documents typically via HTTP, HTTPS]
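To make the BLOB and CLOB distinction concrete, here is a minimal sketch that stores one of each in a table. It assumes SQLite via Python's `sqlite3` module as a stand-in for a full DBMS; the `Media` table, its columns, and the sample values are hypothetical, and the four image bytes are only the start of a PNG signature, not a real image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite treats a column declared CLOB as text and BLOB as raw bytes.
conn.execute("CREATE TABLE Media (name TEXT PRIMARY KEY, image BLOB, log CLOB)")

image_bytes = bytes([0x89, 0x50, 0x4E, 0x47])         # stand-in for binary image data
log_text = "2015-09-23 12:00:00 txn 42 committed\n"   # stand-in for a log file

conn.execute(
    "INSERT INTO Media VALUES (?, ?, ?)",
    ("example.png", image_bytes, log_text),
)
image, log = conn.execute(
    "SELECT image, log FROM Media WHERE name = ?", ("example.png",)
).fetchone()

print(type(image).__name__, type(log).__name__)  # bytes str
```

The database stores both values opaquely: it can return them, but it cannot query inside the image or the log without type-specific functions, which is part of the motivation for the ORDBMS extensions.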
![Page 17: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/17.jpg)
OO Concepts – “Real World”
OOA – Object Oriented Analysis
– Define Class Hierarchies (Abstract Classes with Attributes) and Interfaces (Public, Private) and Methods (Operations)
– Inheritance and Multiple Inheritance
OOD – OO Design
– Encapsulation of Methods with Data (Attributes) for Abstract and Derived Classes
– Instantiation and Use of Objects [Use Cases]
OOP – Object Oriented Programming (Java, C++, …)
– Programming Language – Direct Implementation of OOD
– Implementation of Re-useable OO Code Libraries
– Boost - http://www.boost.org/
– OpenCV [C++ version]
– Many More … in other OOPLs
![Page 18: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/18.jpg)
Classes Useful in Real World
E.g. Biology – Kingdom, Phylum, Class, Order, Family, Genus, Species [Multiple Inheritance Examples], Proven Use
Parts – Components compose Sub-system(s) compose System(s) compose System of Systems
Supports Re-Use of Objects Instantiated from Class Hierarchy
Multiple Inheritance – Odd?
Can be Abstract, Derived and Concrete
– E.g. Mathematical, Data Structures, Image Processing
– Organization of Information (Classes in Web Ontology Language)
– Simulation of Physical Systems
– Most Often Software Libraries
http://en.wikipedia.org/wiki/Platypus#mediaviewer/File:Wild_Platypus_4.jpg
https://www.youtube.com/watch?v=kDay5OWDPn4#t=26
![Page 19: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/19.jpg)
Quick Review of OO [not just C++]
Encapsulation of Data and Methods in an Instantiated Object
Objects are Instances from a Class Hierarchy
– Classes Define Encapsulated Data and Methods
– Virtual Functions can Be Refined
– Pure Virtual Functions Defined in Abstract Classes must be Refined
– Can Inherit Data and Methods from Parent Classes
– Can In Fact Have Multiple Inheritance
– Instantiated Objects Call Dynamically Bound Methods [Determined at Runtime]
Enables Semantic Overload [Can be Done without OO too]
– Overloaded Functions (Methods), Resolved by Type Signatures or Subtype/Sub-class
– Overloaded Operators (E.g. math operators work not only on integers and real numbers, but also vectors, matrices, and complex numbers)
– Derived Data Types from Base types
Polymorphism
– Parametric – Re-useable Templates (E.g. Ada and Java Generic, C++ Template)
– Functional Semantic Overloading
– Dynamic or Subtype or Subclass Polymorphism using Late Binding
OOPLs – Smalltalk to more current Java, C++, Ada95, … CLOS
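Several of these ideas (abstract classes, methods that must be refined, inheritance, and late binding) fit in one short sketch. Python is used here only as a convenient OOPL; the `Shape`, `Square`, and `Circle` classes are hypothetical examples, and `@abstractmethod` plays the role of a pure virtual function.

```python
import math
from abc import ABC, abstractmethod

class Shape(ABC):
    """Abstract class: cannot be instantiated until area() is refined."""

    @abstractmethod
    def area(self) -> float:
        """Counterpart of a pure virtual function; subclasses must define it."""

    def describe(self) -> str:
        # Inherited by every subclass. The call to self.area() is bound at
        # runtime to whichever subclass the object was instantiated from.
        return f"{type(self).__name__} with area {self.area():.2f}"

class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side          # encapsulated data

    def area(self) -> float:
        return self.side ** 2

class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2

shapes = [Square(2.0), Circle(1.0)]
descriptions = [shape.describe() for shape in shapes]
print(descriptions)  # ['Square with area 4.00', 'Circle with area 3.14']
```

The loop never checks which concrete class it holds; that is subtype polymorphism with late binding.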
![Page 20: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/20.jpg)
Operator and Function Overloading
What is Required to Be OO?
Common Consensus is – Encapsulation, Class Hierarchy, Polymorphism (Parametric & Subtype or Subclass with Late Binding), Inheritance
Operator Overloading Not Required (E.g. Java Frowns Upon, No Support)
Some PLs have OO Features, but not All
http://en.wikipedia.org/wiki/Operator_overloading
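For a language that does support it, operator overloading might look like the following sketch. The `Vector` class is a hypothetical example; `@dataclass` generates a component-by-component `==`, while `+` and `*` are overloaded by hand.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vector:
    """Three-component vector; the generated __eq__ compares each component."""
    x: float
    y: float
    z: float

    def __add__(self, other: "Vector") -> "Vector":
        # Overloads '+' for vectors: add component by component.
        return Vector(self.x + other.x, self.y + other.y, self.z + other.z)

    def __mul__(self, scale: float) -> "Vector":
        # Overloads '*' for vector-times-scalar.
        return Vector(self.x * scale, self.y * scale, self.z * scale)

total = Vector(1, 1, 3) + Vector(1, 0, 0)
doubled = Vector(1, 1, 3) * 2

print(total == Vector(2, 1, 3))            # True
print(Vector(1, 1, 3) == Vector(1, 1, 4))  # False: one component differs
```

The same `+` symbol keeps working on integers and floats, so the operator's meaning is chosen by the operand types.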
![Page 21: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/21.jpg)
Storing Objects in Relational Databases
One approach to achieving persistence with an OOPL is to use an RDBMS as the underlying storage engine.
– O2 – merged with Informix and acquired by IBM
– ObjectStore - http://www.objectstore.com/
– Objectivity - http://www.objectivity.com/products/objectivitydb
– Versant - http://www.actian.com/products/operational-databases/
Requires mapping class instances (i.e. objects) to one or more tuples distributed over one or more relations.
To handle class hierarchy, have two basic tasks to perform:
(1) design relations to represent class hierarchy;
(2) design how objects will be accessed.
Pearson Education © 2009 21
![Page 22: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/22.jpg)
Storing Objects in Relational Databases
Pearson Education © 2009 22
![Page 23: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/23.jpg)
Mapping Classes to Relations
Number of strategies for mapping classes to relations, although each results in a loss of semantic information.
(1) Map each class or subclass to a relation:
Staff (staffNo, fName, lName, position, sex, DOB, salary)
Manager (staffNo, bonus, mgrStartDate)
SalesPersonnel (staffNo, salesArea, carAllowance)
Secretary (staffNo, typingSpeed)
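Strategy (1) can be sketched with two of the relations above. The schema follows the slide; the choice of SQLite through Python's `sqlite3`, the column types, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY, fName TEXT, lName TEXT, position TEXT,
    sex TEXT, DOB TEXT, salary REAL
);
CREATE TABLE Manager (
    staffNo TEXT PRIMARY KEY REFERENCES Staff (staffNo),  -- subclass shares the PK
    bonus REAL, mgrStartDate TEXT
);
INSERT INTO Staff VALUES ('S1', 'John', 'White', 'Manager', 'M', '1965-10-01', 30000);
INSERT INTO Staff VALUES ('S2', 'Ann', 'Beech', 'Assistant', 'F', '1980-11-10', 12000);
INSERT INTO Manager VALUES ('S1', 2000, '2010-01-01');
""")

# Rebuilding a complete Manager object needs a join of superclass and subclass.
managers = conn.execute("""
    SELECT s.staffNo, s.lName, s.salary, m.bonus
    FROM Staff s JOIN Manager m ON s.staffNo = m.staffNo
""").fetchall()

print(managers)  # [('S1', 'White', 30000.0, 2000.0)]
```

The inherited attributes live in `Staff` and the specialised ones in `Manager`, so every access to a full object pays for a join, and nothing in the schema itself says that `Manager` is a kind of `Staff`.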
Pearson Education © 2009 23
![Page 24: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/24.jpg)
Mapping Classes to Relations
(2) Map each subclass to a relation
Manager (staffNo, fName, lName, position, sex, DOB, salary, bonus, mgrStartDate)
SalesPersonnel (staffNo, fName, lName, position, sex, DOB, salary, salesArea, carAllowance)
Secretary (staffNo, fName, lName, position, sex, DOB, salary, typingSpeed)
(3) Map the hierarchy to a single relation
Staff (staffNo, fName, lName, position, sex, DOB, salary, bonus, mgrStartDate, salesArea, carAllowance, typingSpeed, typeFlag)
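Strategy (3) trades joins for NULLs. The sketch below uses a shortened version of the single `Staff` relation (several columns are omitted to keep it small); SQLite via Python's `sqlite3`, the `CHECK` constraint, and the sample rows are illustrative assumptions rather than anything prescribed by the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY,
    lName TEXT,
    salary REAL,
    bonus REAL,            -- meaningful for Manager only
    salesArea TEXT,        -- meaningful for SalesPersonnel only
    typingSpeed INTEGER,   -- meaningful for Secretary only
    typeFlag TEXT NOT NULL
        CHECK (typeFlag IN ('Manager', 'SalesPersonnel', 'Secretary'))
);
INSERT INTO Staff VALUES ('S1', 'White', 30000, 2000, NULL, NULL, 'Manager');
INSERT INTO Staff VALUES ('S2', 'Beech', 12000, NULL, 'North', NULL, 'SalesPersonnel');
INSERT INTO Staff VALUES ('S3', 'Howe', 9000, NULL, NULL, 60, 'Secretary');
""")

# No join needed: the subclass is selected with the flag.
managers = conn.execute(
    "SELECT staffNo, bonus FROM Staff WHERE typeFlag = 'Manager'"
).fetchall()
# The cost: attributes that do not apply to a row are stored as NULL.
null_bonus = conn.execute(
    "SELECT COUNT(*) FROM Staff WHERE bonus IS NULL"
).fetchone()[0]

print(managers, null_bonus)  # [('S1', 2000.0)] 2
```

Nothing in this table stops a Secretary row from carrying a bonus, which is one example of the semantic information that is lost in the mapping.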
Pearson Education © 2009 24
![Page 25: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/25.jpg)
ORDBMSs
RDBMSs currently dominant database technology with estimated sales of US$24 billion in 2011, expected to grow to US$37 billion by 2016.
Vendors of RDBMSs conscious of threat and promise of OODBMS.
Agree that RDBMSs not currently suited to advanced database applications, and added functionality is required.
Reject claim that extended RDBMSs will not provide sufficient functionality or will be too slow to cope adequately with new complexity.
Can remedy shortcomings of relational model by extending model with OO features.
Pearson Education © 2014 25
![Page 26: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/26.jpg)
ORDBMSs - Features
OO features being added include:
– user-extensible types,
– encapsulation,
– inheritance,
– polymorphism,
– dynamic binding of methods,
– complex objects including non-1NF objects,
– object identity.
Pearson Education © 2014 26
![Page 27: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/27.jpg)
ORDBMSs - Features
However, no single extended relational model.
All models:
– share basic relational tables and query language,
– all have some concept of ‘object’,
– some can store methods (or procedures or triggers).
Some analysts predict ORDBMS will have 50% larger share of market than RDBMS.
Pearson Education © 2014 27
![Page 28: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/28.jpg)
Stonebraker’s View
Pearson Education © 2014 28
![Page 29: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/29.jpg)
Advantages of ORDBMSs
Resolves many of known weaknesses of RDBMS.
Reuse and sharing:
– reuse comes from ability to extend server to perform standard functionality centrally;
– gives rise to increased productivity both for developer and end-user.
Preserves significant body of knowledge and experience gone into developing relational applications.
Pearson Education © 2014 29
![Page 30: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/30.jpg)
Disadvantages of ORDBMSs
Complexity.
Increased costs.
Proponents of relational approach believe simplicity and purity of relational model are lost.
Some believe RDBMS is being extended for what will be a minority of applications.
OO purists not attracted by extensions either.
SQL now extremely complex.
Pearson Education © 2014 30
![Page 31: CS317 File and Database Systemsmercury.pr.erau.edu/~siewerts/cs317/documents/Lectures/...– Bit-rot (media eventually fails, limited storage lifetime) Variety, Depends on Level of](https://reader035.vdocument.in/reader035/viewer/2022071402/60ede90fc81be058025c9de5/html5/thumbnails/31.jpg)
SQL:2011 - New OO Features
Type constructors for row types and reference types.
User-defined types (distinct types and structured types) that can participate in supertype/subtype relationships.
User-defined procedures, functions, methods, and operators.
Type constructors for collection types (arrays, sets, lists, and multisets).
Support for large objects – BLOBs and CLOBs.
Recursion.
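Of the features listed, recursion is the easiest to try out, since recursive queries are available in many current SQL engines. The sketch below assumes SQLite (via Python's `sqlite3`) and a hypothetical `Staff` table with a `managerNo` column; it finds everyone who reports to S1 directly or indirectly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, managerNo TEXT);
INSERT INTO Staff VALUES
    ('S1', NULL), ('S2', 'S1'), ('S3', 'S2'), ('S4', 'S1'), ('S5', NULL);
""")

rows = conn.execute("""
    WITH RECURSIVE reports(staffNo) AS (
        -- base case: direct reports of S1
        SELECT staffNo FROM Staff WHERE managerNo = 'S1'
        UNION ALL
        -- recursive step: reports of anyone already found
        SELECT s.staffNo
        FROM Staff s JOIN reports r ON s.managerNo = r.staffNo
    )
    SELECT staffNo FROM reports ORDER BY staffNo
""").fetchall()

print(rows)  # [('S2',), ('S3',), ('S4',)]
```

Without recursion this query would need one self-join per level of the hierarchy, with the depth fixed in advance.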
Pearson Education © 2014 31