CS 4400 Database Systems
Meeting 2: Data management systems
Brandon Myers
University of Iowa
Databases and database management systems (DBMS)
• Examples of databases
• Examples of DBMSs
An example: online music streaming service
• What data must it contain?
• What capabilities are needed?
Summary of data management requirements
1. Able to describe real-world entities in terms of stored data
2. Persistently store large datasets
3. Efficiently query & update
4. Change structure (e.g., add attributes)
5. Concurrency control: enable simultaneous updates
6. Crash recovery
7. Security and integrity, provenance
A DBMS provides these so that users can focus on application logic
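A minimal sketch of a few of these requirements, using Python's standard `sqlite3` module. The `song` table, its columns, and the sample rows are hypothetical, chosen to match the music streaming example; an in-memory database is used only to keep the sketch self-contained, so it does not demonstrate persistence (requirement 2).

```python
import sqlite3

# In-memory database for illustration only; a real service would use a
# file or a database server so the data persists.
conn = sqlite3.connect(":memory:")

# Requirement 1: describe a real-world entity (a song) as stored data.
conn.execute(
    "CREATE TABLE song ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " artist TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO song (title, artist) VALUES (?, ?)",
    [("Song A", "Artist X"), ("Song B", "Artist Y")],
)
conn.commit()

# Requirement 4: change the structure by adding an attribute.
conn.execute("ALTER TABLE song ADD COLUMN play_count INTEGER NOT NULL DEFAULT 0")

# Requirement 3: update and query.
conn.execute("UPDATE song SET play_count = play_count + 1 WHERE title = ?", ("Song A",))
conn.commit()
rows = conn.execute("SELECT title, play_count FROM song ORDER BY title").fetchall()

# Requirement 7 (integrity), in miniature: the second statement violates
# NOT NULL, so the whole transaction, including the first update, is undone.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE song SET play_count = play_count + 100")
        conn.execute("INSERT INTO song (title, artist) VALUES (NULL, 'Artist Z')")
except sqlite3.IntegrityError:
    pass

rows_after = conn.execute("SELECT title, play_count FROM song ORDER BY title").fetchall()
```

The application code never says how rows are laid out on disk or how the rollback is carried out; that is the part the DBMS takes over.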
People and databases
1. App developer: writes programs that update and query the data in the DB
2. DB designer: models the data by choosing tables and their attributes
3. DB admin (“DBA”): operates the database, diagnoses performance problems
4. Data analyst: data mining (inferring useful information), data integration (combining disparate data)
5. DBMS implementer: builds the DBMS
In 4400 we’ll try to give you some experience in all of these roles, although 4 and 5 are huge topics that demand their own courses
Data structures and databases
• CS 2230 (or equivalent) was all about data structures
• What is the difference between a database and a data structure?
How do we represent our data for the purposes of making queries and updates?
DATA MODEL, the interface to your data
We have a bunch of students, each with a name and a major. What might be a good DATA MODEL for this dataset?
• e.g., relational
• semi-structured, specifically XML, JSON
• graph
• key-value
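To make the four options concrete, here is a rough sketch of the students dataset under each data model, written with plain Python values rather than real database systems. The student names, the `minor` field, the `majors_in` edge label, and the `student:` key prefix are all made up for illustration.

```python
import json

# Relational: a set of rows that all share one fixed schema (name, major).
students_relation = [("Ada", "CS"), ("Grace", "Math")]

# Semi-structured (JSON): self-describing records whose fields may vary.
students_json = json.dumps([
    {"name": "Ada", "major": "CS"},
    {"name": "Grace", "major": "Math", "minor": "Physics"},
])

# Graph: nodes connected by labeled edges (source, label, target).
students_edges = [("Ada", "majors_in", "CS"), ("Grace", "majors_in", "Math")]

# Key-value: a value looked up by its key, with no further structure.
students_kv = {"student:Ada": "CS", "student:Grace": "Math"}


def cs_majors_relational(rows):
    return sorted(name for name, major in rows if major == "CS")


def cs_majors_json(text):
    return sorted(r["name"] for r in json.loads(text) if r.get("major") == "CS")


def cs_majors_graph(edges):
    return sorted(src for src, label, dst in edges
                  if label == "majors_in" and dst == "CS")


def cs_majors_kv(store):
    # A key-value store has no query by value, so every key is scanned.
    return sorted(key.split(":", 1)[1] for key, value in store.items()
                  if value == "CS")
```

The same question ("who majors in CS?") has the same answer in all four, but the data model decides what the query looks like and which questions are easy to ask.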
Now, the relational data model
(see the board)
SQL and SQLite
(see the notes in the .sql file)
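The board work and the .sql notes are not part of this transcript. As a stand-in, the sketch below shows the students dataset as a relation in SQLite, driven from Python's standard `sqlite3` module. The table name `student`, the choice of `name` as primary key, and the sample rows are assumptions for illustration, not the contents of the course's .sql file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# A relation is a table: each row is a tuple, each column an attribute.
conn.executescript("""
    CREATE TABLE student (
        name  TEXT PRIMARY KEY,  -- assumes names are unique, for simplicity
        major TEXT NOT NULL
    );
    INSERT INTO student (name, major) VALUES
        ('Ada', 'CS'), ('Grace', 'Math'), ('Edgar', 'CS');
""")

# SQL is declarative: the query states what is wanted (students per major),
# and the DBMS decides how to compute it.
majors = conn.execute(
    "SELECT major, COUNT(*) FROM student GROUP BY major ORDER BY major"
).fetchall()
```

The same statements can be typed directly into the `sqlite3` command-line shell; Python is used here only to keep the example self-contained.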
Data warehouses to data lakes
• Conventionally, businesses would have:
1. Business operations supported by: a DBMS for transactions (e.g., sales, supply chain orders)
2. Business intelligence supported by: a DBMS for storing a structured and indexed archive of recent and historical data (think library) called a data warehouse. Employees analyzed the data to inform decisions.
• Today, companies like Microsoft refer to data lakes, replacing the carefully maintained databases of a data warehouse with enormous quantities of raw data
• When the data needs to be analyzed, it is transformed with parallel processing systems
• In 4400 we’ll explore semi-structured data, parallel processing, and non-relational systems (“NoSQL”)
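A toy sketch of the data lake idea: raw records are kept as they arrived and are parsed and aggregated only at analysis time, with the work split across workers. The event format and the names `raw_events`, `count_plays`, and `chunks` are hypothetical, and a local thread pool stands in for a real parallel processing system, which would spread the chunks over many machines.

```python
import json
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Raw, uncurated records as they might sit in a data lake (made-up data).
raw_events = [
    '{"user": "u1", "action": "play", "song": "Song A"}',
    '{"user": "u2", "action": "play", "song": "Song B"}',
    '{"user": "u1", "action": "skip", "song": "Song B"}',
    'not valid json',
    '{"user": "u3", "action": "play", "song": "Song A"}',
]


def count_plays(chunk):
    """Parse one chunk of raw lines and count plays per song."""
    counts = Counter()
    for line in chunk:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # raw data is messy; skip lines that do not parse
        if event.get("action") == "play":
            counts[event["song"]] += 1
    return counts


def chunks(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Each chunk is transformed independently, then the partial results merge.
with ThreadPoolExecutor(max_workers=2) as pool:
    partial_counts = list(pool.map(count_plays, chunks(raw_events, 2)))

play_counts = sum(partial_counts, Counter())
```

In a warehouse, the cleaning step (dropping the malformed line, picking out the fields) would have happened once at load time; here it is repeated whenever the data is analyzed.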
From Chaos to Order | by Wiertz Sébastien
What to do now
• HW1, which is due 1/25, 11:59pm
  • start early because it involves a new tool: sqlite
• Look for the course policies survey in the Announcements of ICON, if you haven’t taken it
Attribution
• Some slides inspired by or quoted from UW CSE 344
  • People and databases
  • Data warehouses to data lakes
  • Summary of data management requirements
• https://courses.cs.washington.edu/courses/cse344/