Info Systems, Spring 2013

Focus of class
• Cutting edge topics in information systems
• Conceptual material, as opposed to coding
• If you don’t have a good knowledge of relational databases and SQL, now is a good time to study it…
• A good MySQL book: Murach's MySQL

Class blog & reaching me
• http://rogerking.me
  • All assignments and announcements will be on the blog.
  • I will not send out class email.
  • Only read blog entries starting on December 30.
• To contact me
  • [email protected]
  • Do not use my CU mail, please. It is inundated with spam.
  • 303 437 7419.
  • To talk to me, see me after class, or send email for an appointment.
  • rogerbuzzking on Skype, but I do not stay logged into Skype; you must call or email first.

Required books
• NoSQL Distilled
• Seven Databases in Seven Weeks

Assignments and exams
• Two exams – material will be from
  • 2 required books
  • Slides posted on blog, along with corresponding lectures
  • Handouts posted on blog
• Exams and projects
  • Midterm – 35% of final grade
    • On March 11
  • Second exam – 35% of final grade
    • This will be on the last scheduled day of class, NOT at the final slot
    • i.e., it will be on May 1
  • Small projects/assignments – 30% of final grade total

The First Assignment
• Build a simple application using Mongo
• Groups of 1 or 2
• Due date to be posted on blog
• The application will hold very large video and audio objects
• More later…

Suggestions…
• Please come to class.
• Feel free to suggest references and/or topics.
• Speak up at any time – it would be nice to have an interactive class.

Likely and possible topics
• Post-relational database systems
• Full text processing
• “Big” data
• Databases and XML
• Media and databases (images, video, audio, animation)
• Personal information systems (not just databases)
• Integrating diverse data sets
• ???

Review of relational databases
• Schema based technology
  • Small schema
  • Lots of data
• SQL, set-based data manipulation
• Multi-language application development
• Strict 2 phase ACID transactions
• Rigidly structured, identically structured, simply structured data
• User-defined type extensions for “objects in the small”

Relational basics
• Tables (relations)
• Columns (attributes)
• Domains
• Rows (tuples)
• Primary and candidate keys
• Foreign keys
• Null values
• Triggers
• SQL as DDL and DML
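
The terms on this slide can be seen together in one small schema. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and column names (loosely modeled on the vendors/invoices tables in later slides) are assumptions, not the book's actual schema.

```python
import sqlite3

# Hypothetical schema for illustration; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

# DDL: tables, columns with domains (types), keys, and nullability.
conn.executescript("""
CREATE TABLE vendors (
    vendor_id   INTEGER PRIMARY KEY,      -- primary key
    vendor_name TEXT NOT NULL UNIQUE      -- a candidate key
);
CREATE TABLE invoices (
    invoice_id    INTEGER PRIMARY KEY,
    vendor_id     INTEGER NOT NULL REFERENCES vendors(vendor_id),  -- foreign key
    invoice_total REAL NOT NULL,
    payment_date  TEXT                    -- NULL until the invoice is paid
);
""")

# DML: insert rows (tuples) and query them.
conn.execute("INSERT INTO vendors VALUES (1, 'Acme')")
conn.execute("INSERT INTO invoices VALUES (10, 1, 250.0, NULL)")


def insert_orphan_invoice():
    """Try to insert an invoice whose vendor does not exist."""
    try:
        conn.execute("INSERT INTO invoices VALUES (11, 99, 75.0, NULL)")
        return True
    except sqlite3.IntegrityError:
        return False  # the foreign key constraint rejected the row


orphan_accepted = insert_orphan_invoice()
unpaid = conn.execute(
    "SELECT COUNT(*) FROM invoices WHERE payment_date IS NULL"
).fetchone()[0]
print(orphan_accepted, unpaid)
```

Note that null values are tested with IS NULL rather than with the = operator.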

Relational applications
• A lot of data, but in amounts no longer considered huge
• Transactions manipulating sets of rows
• Complete accuracy is required
• Not real time or interactive
• If distributed, only a modest number of centrally controlled servers
• If distributed, there is a single, unified schema
• Complex objects built by connecting tables
• Logic of data is mostly in schema and SQL, not app
• Mix of reads and writes, with writes making up more than a minimal share
• SQL programs (run as transactions) compiled and optimized once, run many times

Important facts about SQL DBs
• Elegant, set theoretic underpinnings
• Originally defined with an algebra:
  • Projection, join, union/difference/intersection, selection
  • Closed under this algebra
• SQL is based on a tuple calculus that is equivalent in expressive power to the algebra
  • Declarative
  • Far more compact than algebra
• Equally elegant algebra for optimization
• Query results are legal view tables
• Serious impedance mismatch between object-oriented languages and SQL
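
To make "closed under this algebra" concrete, here is a minimal sketch of the algebra in plain Python (not SQL). It assumes a relation is modeled as a set of tuples, each tuple being a frozenset of (attribute, value) pairs; the helper names and the sample data are made up for illustration. Every operator takes relations and returns a relation, so operators can be nested freely.

```python
def relation(rows):
    """Build a relation (a set of tuples) from a list of dicts."""
    return {frozenset(row.items()) for row in rows}


def select(rel, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in rel if predicate(dict(t))}


def project(rel, attrs):
    """Projection: keep only the named attributes (duplicates collapse)."""
    return {frozenset((a, dict(t)[a]) for a in attrs) for t in rel}


def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    joined = set()
    for t in left:
        for u in right:
            a, b = dict(t), dict(u)
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                joined.add(frozenset({**a, **b}.items()))
    return joined


vendors = relation([
    {"vendor_id": 1, "name": "Acme"},
    {"vendor_id": 2, "name": "Bolt"},
])
invoices = relation([
    {"vendor_id": 1, "total": 500},
    {"vendor_id": 1, "total": 50},
    {"vendor_id": 2, "total": 75},
])

# Operators nest because every result is itself a relation.
big_spenders = project(
    select(natural_join(vendors, invoices), lambda row: row["total"] > 100),
    ["name"],
)
print(big_spenders == relation([{"name": "Acme"}]))
```

Union, difference, and intersection come for free in this model as Python's set operators `|`, `-`, and `&`.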

Syntax of SQL
SELECT select_list
FROM table_source
WHERE search_condition

Example
SELECT invoice_total, payment_total, credit_total,
       invoice_total - payment_total - credit_total AS balance_due
FROM invoices
WHERE NOT (invoice_total >= 5000 OR NOT invoice_date <= '2011-08-01')
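
The NOT (... OR NOT ...) condition in this example is easier to read after applying De Morgan's law: it selects invoices with a total below 5000 and a date on or before 2011-08-01. The sketch below checks that the two forms return the same rows. It runs on Python's built-in sqlite3 rather than MySQL, and the four sample rows are invented for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, "
    "invoice_total REAL, invoice_date TEXT)"
)
# Made-up rows covering each combination of the two conditions.
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", [
    (1, 4000.0, "2011-07-15"),  # small and early
    (2, 6000.0, "2011-07-15"),  # too large
    (3, 4000.0, "2011-09-01"),  # too late
    (4, 5000.0, "2011-08-01"),  # total is not below 5000
])

ORIGINAL = """
    SELECT invoice_id FROM invoices
    WHERE NOT (invoice_total >= 5000 OR NOT invoice_date <= '2011-08-01')
    ORDER BY invoice_id
"""
SIMPLIFIED = """
    SELECT invoice_id FROM invoices
    WHERE invoice_total < 5000 AND invoice_date <= '2011-08-01'
    ORDER BY invoice_id
"""

original_ids = [row[0] for row in conn.execute(ORIGINAL)]
simplified_ids = [row[0] for row in conn.execute(SIMPLIFIED)]
print(original_ids, simplified_ids)
```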

Example
SELECT 'Paid' AS source, invoice_number, invoice_date, invoice_total
FROM paid_invoices
WHERE invoice_date >= '2011-06-01'
ORDER BY invoice_total DESC

Manipulating multiple tables
• You can use the WHERE clause, T.color = R.color
• You can use an explicit join
  • Equijoin, outer join

SELECT vendor_name, invoice_number, invoice_total
FROM vendors LEFT JOIN invoices
    ON vendors.vendor_id = invoices.vendor_id
ORDER BY vendor_name
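
The difference between the two styles shows up when a vendor has no invoices. The sketch below runs both forms on Python's built-in sqlite3 (standing in for MySQL); the two vendors and the single invoice are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendors  (vendor_id INTEGER PRIMARY KEY, vendor_name TEXT);
CREATE TABLE invoices (invoice_number TEXT, vendor_id INTEGER, invoice_total REAL);
INSERT INTO vendors  VALUES (1, 'Acme'), (2, 'Bolt');  -- Bolt has no invoices
INSERT INTO invoices VALUES ('A-1', 1, 100.0);
""")

# Join expressed in the WHERE clause: an equijoin, unmatched vendors vanish.
where_join = conn.execute("""
    SELECT vendor_name, invoice_number, invoice_total
    FROM vendors, invoices
    WHERE vendors.vendor_id = invoices.vendor_id
    ORDER BY vendor_name
""").fetchall()

# Explicit outer join: every vendor appears, padded with NULLs if unmatched.
left_join = conn.execute("""
    SELECT vendor_name, invoice_number, invoice_total
    FROM vendors LEFT JOIN invoices
        ON vendors.vendor_id = invoices.vendor_id
    ORDER BY vendor_name
""").fetchall()

print(where_join)
print(left_join)
```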

Aggregates
AVG([ALL|DISTINCT] expression)
SUM([ALL|DISTINCT] expression)
MIN([ALL|DISTINCT] expression)
MAX([ALL|DISTINCT] expression)
COUNT([ALL|DISTINCT] expression)
COUNT(*) – counts all rows, including rows that contain nulls

Example
SELECT COUNT(*) AS number_of_invoices,
       SUM(invoice_total - payment_total - credit_total) AS total_due
FROM invoices
WHERE invoice_total - payment_total - credit_total > 0
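
How the aggregates treat nulls and DISTINCT is easiest to see on a tiny table. This is a sketch on Python's built-in sqlite3 (standing in for MySQL) with four invented rows, one of which has a NULL in the aggregated column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, credit_total INTEGER)"
)
# Made-up data: one NULL and one repeated value.
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 100), (2, None), (3, 100), (4, 300)],
)

count_all, count_col, count_distinct, total, total_distinct = conn.execute("""
    SELECT COUNT(*),                      -- every row
           COUNT(credit_total),           -- rows where the column is not NULL
           COUNT(DISTINCT credit_total),  -- distinct non-NULL values
           SUM(credit_total),             -- NULLs are skipped
           SUM(DISTINCT credit_total)
    FROM invoices
""").fetchone()

print(count_all, count_col, count_distinct, total, total_distinct)
```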

Strict 2 phase transactions
• Each SQL program is within a begin and end transaction pair
• Each transaction has its own workspace for DB items it is going to update
• Any transactions that overlap in execution time will appear to have run in some serial order
• This is done by transactions requesting read and write locks (also known as shared and exclusive locks)
• Read locks can be shared with other readers
• Write locks cannot be shared with readers or writers
• All locks are held until the end of the transaction
• At the end, the changes that a transaction has made are moved to the DB and its locks are released
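
The sharing rules for read and write locks can be sketched as a toy lock table. This is an illustration only, not how any particular DBMS implements locking: the class and method names are hypothetical, and a request that would have to wait simply returns False instead of blocking.

```python
class LockTable:
    """Toy lock manager with shared (read) and exclusive (write) locks."""

    def __init__(self):
        self.readers = {}  # item -> set of transaction ids holding read locks
        self.writer = {}   # item -> transaction id holding the write lock

    def read_lock(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False  # another transaction holds the write lock
        self.readers.setdefault(item, set()).add(txn)
        return True

    def write_lock(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False  # write locks are never shared
        if self.readers.get(item, set()) - {txn}:
            return False  # other readers block the write (or the upgrade)
        self.writer[item] = txn
        return True

    def release_all(self, txn):
        """Strict 2PL: a transaction gives up every lock at once, at its end."""
        for holders in self.readers.values():
            holders.discard(txn)
        for item in [i for i, t in self.writer.items() if t == txn]:
            del self.writer[item]


locks = LockTable()
t1_reads = locks.read_lock("T1", "x")           # granted
t2_reads = locks.read_lock("T2", "x")           # granted: readers share
t2_upgrade_early = locks.write_lock("T2", "x")  # refused while T1 reads x
locks.release_all("T1")                         # T1 ends
t2_upgrade_late = locks.write_lock("T2", "x")   # granted: T2 is the only reader
t1_reads_again = locks.read_lock("T1", "x")     # refused: T2 holds the write lock
print(t1_reads, t2_reads, t2_upgrade_early, t2_upgrade_late, t1_reads_again)
```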

Serializability & transactions
• The net effect is that the transactions that overlap in execution time appear to have run in some serial order
• Transactions can be undone by throwing away the local store (conceptually, at least)
• The write period at the end of the transaction must be atomic
• The two phases:
  • Request read, write, upgrade locks (and wait on locks) and process
  • Move updates to the DB and release locks
• There is a notion of “serializability” which means that the actual schedule of executed steps corresponds to some serial order of running the transactions
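
One standard way to test whether a schedule corresponds to some serial order is conflict serializability: draw an edge from Ti to Tj whenever a step of Ti precedes a conflicting step of Tj (same item, at least one write), then check that the graph has no cycle. The slides do not name this test, so treat the sketch below as one possible reading; the schedule encoding as (transaction, "r" or "w", item) triples is an assumption.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (transaction, op, item) with op in {"r", "w"}."""
    # Build the precedence graph from pairs of conflicting steps.
    edges = set()
    for i, (t1, op1, item1) in enumerate(schedule):
        for t2, op2, item2 in schedule[i + 1:]:
            if t1 != t2 and item1 == item2 and "w" in (op1, op2):
                edges.add((t1, t2))

    # The graph is acyclic iff we can keep removing transactions that have
    # no incoming edge from a remaining transaction (Kahn's algorithm).
    remaining = {t for t, _, _ in schedule}
    while remaining:
        free = {
            n for n in remaining
            if not any(dst == n and src in remaining for src, dst in edges)
        }
        if not free:
            return False  # every remaining transaction is on a cycle
        remaining -= free
    return True


# T1 finishes with x before T2 touches it: equivalent to the order T1, T2.
good = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
# Classic lost update: both read x, then both write it.
bad = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]

good_ok = is_conflict_serializable(good)
bad_ok = is_conflict_serializable(bad)
print(good_ok, bad_ok)
```

Strict two phase locking would never allow the second schedule: T1's write would have to wait for T2's read lock, and vice versa.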

Major concept: DB “state”
• Changes at the end of each transaction (potentially)
• Relies on 2 Phase transactions
• We trade off throughput in favor of having safe state changes
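
The idea that the DB state changes only when a transaction ends can be tried out directly. The sketch below uses Python's built-in sqlite3 with explicit BEGIN, ROLLBACK, and COMMIT statements; SQLite is only a stand-in for MySQL here, and the accounts table is invented.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# An abandoned transaction: its half-finished transfer is thrown away.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 40 WHERE name = 'a'")
conn.execute("ROLLBACK")
after_rollback = balances()

# A completed transaction: both updates become part of the state together.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 40 WHERE name = 'a'")
conn.execute("UPDATE accounts SET balance = balance + 40 WHERE name = 'b'")
conn.execute("COMMIT")
after_commit = balances()

print(after_rollback, after_commit)
```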

Normalization
• Start with an entity model
• Map to tables
• Create PKs and FKs
• Create other constraints
• Normalize tables
• Goals
  • Minimize redundant data
  • Minimize “update anomalies”
  • Minimize misinterpretation of the DB by documenting more constraints
• Problems
  • Skinny tables and more joins
  • Increases impedance mismatch

Dependencies
• FD
  • We say that ai FD-> aj
  • Or “ai functionally determines aj”
• MVD
  • We say that ai MVD-> aj
  • Or “ai multivalued determines aj”
• Note: the right side of an FD or an MVD can be a set of attributes
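
A functional dependency says that rows agreeing on the left side must agree on the right side. A minimal sketch of that check in Python follows; the function name `fd_holds` is made up, and the sample rows are borrowed from the customer table used in a later slide. Keep in mind that sample data can only refute an FD, never prove it, since an FD is a statement about all legal data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


customers = [
    {"customer_id": 18, "address": "112 First", "zip": "80304"},
    {"customer_id": 17, "address": "123 Ash", "zip": "80303"},
    {"customer_id": 16, "address": "123 Ash", "zip": "80303"},
]

address_to_zip = fd_holds(customers, ["address"], ["zip"])
zip_to_customer = fd_holds(customers, ["zip"], ["customer_id"])
# The right side may be a set of attributes.
key_to_rest = fd_holds(customers, ["customer_id"], ["address", "zip"])
print(address_to_zip, zip_to_customer, key_to_rest)
```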

Third normal form
• First (1NF): The value stored at the intersection of each row and column must be a scalar value, and a table must not contain any repeating columns.
• Second (2NF): Every non-key column must depend on the entire primary key.
• Third (3NF): Every non-key column must depend only on the primary key.

Fixed 3NF and 4NF
• Boyce-Codd (BCNF): A non-key column can’t be dependent on another non-key column.
• Fourth (4NF): A table must not have more than one multivalued dependency, where the primary key has a one-to-many relationship to non-key columns.

Example

| Customer ID | Address   | ZIP   |
|-------------|-----------|-------|
| 18          | 112 First | 80304 |
| 17          | 123 Ash   | 80303 |
| 16          | 123 Ash   | 80303 |

Break into two tables:
• Customer ID, Address
• Address, ZIP
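
The split can be checked mechanically: joining the two smaller tables on Address should give back exactly the original rows, while the ZIP for "123 Ash" is stored only once. The sketch below does this with Python's built-in sqlite3 (standing in for MySQL); the table names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, address TEXT, zip TEXT);
INSERT INTO customers VALUES
    (18, '112 First', '80304'),
    (17, '123 Ash',   '80303'),
    (16, '123 Ash',   '80303');

-- The two tables from the slide, filled by projecting the original.
CREATE TABLE customer_address AS SELECT customer_id, address FROM customers;
CREATE TABLE address_zip      AS SELECT DISTINCT address, zip FROM customers;
""")

original = conn.execute(
    "SELECT customer_id, address, zip FROM customers ORDER BY customer_id"
).fetchall()

rejoined = conn.execute("""
    SELECT c.customer_id, c.address, a.zip
    FROM customer_address c JOIN address_zip a ON a.address = c.address
    ORDER BY c.customer_id
""").fetchall()

zip_rows = conn.execute("SELECT COUNT(*) FROM address_zip").fetchone()[0]
print(rejoined == original, zip_rows)
```

The join is lossless here because Address functionally determines ZIP in the sample data.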
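
Example

Original table columns: Mothers_Phone, Fathers_Phone, Child_Name

Break into:

| Mothers_Phone | Child_Name |
|---------------|------------|
| 3030000000    | Sue        |
| 3031111111    | Sue        |

And

| Fathers_Phone | Child_Name |
|---------------|------------|
| 3032222222    | Sue        |
| 3033333333    | Sue        |

Note: both fields needed for PK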
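
Why the single three-column table is a problem can be sketched in Python. Because the mother's phones and the father's phones are independent facts about the child, the combined table has to hold every pairing of them, which is what a multivalued dependency requires. The function name `mvd_holds` and the lowercase attribute names are assumptions made for this illustration.

```python
from itertools import product


def mvd_holds(rows, lhs, rhs):
    """Check lhs MVD-> rhs: for any two rows that agree on lhs, the row
    taking rhs from the first and all other attributes from the second
    must also be present."""
    rest = [a for a in rows[0] if a not in lhs and a not in rhs]
    order = lhs + rhs + rest
    present = {tuple(r[a] for a in order) for r in rows}
    for r1 in rows:
        for r2 in rows:
            if all(r1[a] == r2[a] for a in lhs):
                mixed = tuple(r1[a] for a in lhs + rhs) + tuple(r2[a] for a in rest)
                if mixed not in present:
                    return False
    return True


mothers = ["3030000000", "3031111111"]
fathers = ["3032222222", "3033333333"]

# The single table must pair every mother's phone with every father's phone.
single_table = [
    {"child": "Sue", "mother": m, "father": f} for m, f in product(mothers, fathers)
]
full_ok = mvd_holds(single_table, ["child"], ["mother"])
# Dropping any one pairing breaks the dependency.
partial_ok = mvd_holds(single_table[:3], ["child"], ["mother"])

# The two tables from the slide store each phone number once.
mother_table = sorted({(r["mother"], r["child"]) for r in single_table})
father_table = sorted({(r["father"], r["child"]) for r in single_table})
print(len(single_table), len(mother_table), len(father_table))
```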

Stored programs
• Stored procedures (can be called by an application)
• Stored functions (can be called by an SQL program)
• Triggers (tied to an operation like INSERT)
• Events (tied to a clock)
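
Of the four kinds of stored program, a trigger is the easiest to try in a few lines. The sketch below uses Python's built-in sqlite3, which supports triggers but, unlike MySQL, has no stored procedures or scheduled events, so it covers only the third bullet. The table and trigger names are made up, and MySQL's trigger syntax differs in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices      (invoice_id INTEGER PRIMARY KEY, invoice_total REAL);
CREATE TABLE invoice_audit (invoice_id INTEGER, action TEXT);

-- Runs inside the database every time a row is inserted into invoices.
CREATE TRIGGER invoices_after_insert
AFTER INSERT ON invoices
BEGIN
    INSERT INTO invoice_audit VALUES (NEW.invoice_id, 'INSERT');
END;
""")

# The application only inserts the invoice; the trigger writes the audit row.
conn.execute("INSERT INTO invoices VALUES (1, 99.5)")
audit = conn.execute("SELECT invoice_id, action FROM invoice_audit").fetchall()
print(audit)
```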