Information Management
DIG 3563 – Lecture 17: File Structures
and
Cloud Computing
J. Michael Moshell
University of Central Florida
Original image* by Moshell et al.
Imagery is from Wikimedia except where marked with *.
File System Organization
recovermyfiles.com
* Disks have sectors; each sector
has an address (integer)
* A file is a collection of sectors. They can
be contiguous or fragmented.
* To find the sectors comprising a file,
we need a directory.
* The directory system records which
sectors belong to each file.
* The Operating System has software
to manage directories & files.
planetoftunes.com
Formatting a Disk
* factory (low level) format:
- timing tracks, etc.
"marks in the parking lot"
- usually not re-doable
* local reformatting:
* Check for read/write errors
* Mark good sectors and bad ones
* Create a list of available sectors
* Set up file structure:
- directory
- boot sector (for bootable drives)
stripespls.com
File System Organization
recovermyfiles.com
* A simple (conceptual) architecture:
Directory:
Dirnum  Filename       Filesize  Headsector
1       addresses.doc  144300    22010
2       employees.doc  99800     33500
3       payroll.xls    17100     33482
4       etc
* at sectors 22010, 22021 we have records:
block  dirnum  nextblock  data ...
22010  1       22021      Adams, John \t 222 West ...
22021          22040      Wilson, Steve \t 333 East ...
So the file is a linked list (like a treasure
hunt) through the disk's sectors.
(Not all disks are organized this way.)
planetoftunes.com
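The treasure-hunt idea can be sketched in a few lines of Python. This is only a toy model, not how any real file system is implemented: the names `DIRECTORY`, `SECTORS` and `read_file` are hypothetical, the use of `None` as an end-of-file marker is an assumption, and so are the dirnum of sector 22021 and the whole content of sector 22040 (the slide does not show them).

```python
# Toy model of the directory and sector records from the slide.
# All names here are hypothetical, chosen for illustration only.

# Directory: filename -> (dirnum, filesize, head sector)
DIRECTORY = {
    "addresses.doc": (1, 144300, 22010),
    "employees.doc": (2, 99800, 33500),
    "payroll.xls": (3, 17100, 33482),
}

# Sectors: address -> (dirnum, next sector, data)
# Assumption: the last sector of a file stores None as its next link.
SECTORS = {
    22010: (1, 22021, "Adams, John \t 222 West ..."),
    22021: (1, 22040, "Wilson, Steve \t 333 East ..."),  # dirnum assumed
    22040: (1, None, "(invented last block)"),           # not on the slide
}


def read_file(filename):
    """Follow the links from the head sector and collect each block's data."""
    _dirnum, _size, sector = DIRECTORY[filename]
    blocks = []
    while sector is not None:
        _owner, next_sector, data = SECTORS[sector]
        blocks.append(data)
        sector = next_sector
    return blocks


if __name__ == "__main__":
    for block in read_file("addresses.doc"):
        print(block)
```

Note that the directory only knows where the file starts; everything else depends on the links stored in the sectors themselves.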
File System Errors
* Disk drive hardware checks parity when reading sectors
* If a parity error occurs, data may have been lost
* Usually this just reports a failure to the OS and you're stuck.
However – the actual disk drive hardware can probably still
read the data; it just doesn't LIKE it.
So, specialized software can sometimes get this "bad checksum" data and display it ... we discuss this shortly.
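A minimal sketch of the idea, assuming a single even-parity bit per sector (real drives use much stronger error-correcting codes). The function names and the `force` flag are hypothetical; they only illustrate the difference between "refuse to return the data" and "return it anyway".

```python
def parity_bit(data: bytes) -> int:
    """Return 1 if the number of 1-bits in data is odd, else 0."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def write_sector(data: bytes):
    """Store the data together with its parity bit."""
    return data, parity_bit(data)


def read_sector(stored, force: bool = False) -> bytes:
    """Return the data; raise on a parity mismatch unless force is set."""
    data, parity = stored
    if parity_bit(data) != parity and not force:
        raise IOError("parity error: sector may be corrupted")
    return data


if __name__ == "__main__":
    good = write_sector(b"Adams")
    damaged = (b"Adamr", good[1])  # one bit flipped in the last byte
    try:
        read_sector(damaged)
    except IOError as err:
        print("normal read failed:", err)
    # Recovery software ignores the complaint and looks at the bytes anyway.
    print("forced read:", read_sector(damaged, force=True))
```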
File System Organization
recovermyfiles.com
* Deleting a file:
The OS keeps an available sector list
of sectors that can be reused.
To DELETE a file, the system just
changes its first and last links. (Think of out-of-service boxcars).
The data is not gone, it's just unlinked.
It will be overwritten, when (and if)
the OS needs more space.
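Here is a small sketch of why "deleted" data is still there. It assumes the available-sector list is itself a linked chain and that the directory stores both the first and last sector of each file; the `ToyDisk` class and its fields are hypothetical.

```python
class ToyDisk:
    """Hypothetical disk with linked sectors and a linked free list."""

    def __init__(self):
        # address -> {"next": following sector or None, "data": contents}
        self.sectors = {
            100: {"next": 101, "data": "Adams, John"},
            101: {"next": None, "data": "Wilson, Steve"},
            200: {"next": None, "data": "(old, unused)"},
        }
        # filename -> (first sector, last sector)
        self.directory = {"addresses.doc": (100, 101)}
        # head of the chain of sectors that may be reused
        self.free_head = 200

    def delete_file(self, name):
        """Unlink the file: only the first and last links change."""
        head, tail = self.directory.pop(name)
        self.sectors[tail]["next"] = self.free_head  # join onto the free chain
        self.free_head = head                        # the data is untouched


if __name__ == "__main__":
    disk = ToyDisk()
    disk.delete_file("addresses.doc")
    print(disk.directory)             # the file is gone from the directory
    print(disk.sectors[100]["data"])  # but its contents are still on disk
```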
tdc.ca
Losing and Recovering Data
recovermyfiles.com
Now what if the directory or a sector gets
screwed up?
a) software error: erase the pointer or link to
a file.
or
b) hardware error: part of directory or sector gets corrupted
The data is still out there, but OS can't find it.
If you can directly READ THE SECTORS, you will find
broken strands of spaghetti ... with clues in 'em.
restaurantwidow.com
Recovering Data
What clues exist?
* Links (obviously) if it's a linked system
  - Try to reconstruct the files, or fragments of them
* Directory item numbers, if these exist
  - Try to "work backwards" and reconstruct the directory
* The data itself (e.g. search for "Adams")
  - Use syntactic knowledge to match up partial sentences
    in blocks. Which block might match that one?
.. and we re
nguins live in Antarc...
spect the opinions of...
492.7 \t 333.9e14 ...
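Two of these clues, the links and the data itself, are easy to sketch in code. The example assumes the raw blocks have already been read into a dictionary of address -> (next address, data); the function names and the sample addresses 400 and 500 are hypothetical. It treats any block that no other block points to as the start of a fragment.

```python
def rebuild_chains(blocks):
    """Reassemble fragments by following links, without any directory."""
    pointed_to = {nxt for nxt, _data in blocks.values() if nxt is not None}
    heads = sorted(addr for addr in blocks if addr not in pointed_to)
    chains = []
    for head in heads:
        chain, addr, seen = [], head, set()
        # Stop at the end of the file, at a missing block, or on a loop.
        while addr in blocks and addr not in seen:
            seen.add(addr)
            nxt, data = blocks[addr]
            chain.append(data)
            addr = nxt
        chains.append(chain)
    return chains


def find_blocks(blocks, text):
    """Return the addresses of blocks whose data contains the given text."""
    return sorted(addr for addr, (_nxt, data) in blocks.items() if text in data)


if __name__ == "__main__":
    raw = {
        22010: (22021, "Adams, John"),
        22021: (22040, "Wilson, Steve"),  # 22040 itself is unreadable
        400: (500, ".. and we re"),
        500: (None, "spect the opinions of..."),
    }
    for chain in rebuild_chains(raw):
        print(chain)
    print(find_blocks(raw, "Adams"))
```

A chain that has lost its links entirely will not be found this way; that is where a human reading the fragments comes in.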
Recovering Data
If you have 'bad sectors' (i.e. bad checksums):
Read the data and override the parity error messages.
Humans are normally required to look at the data and piece it
back together.
Success is not guaranteed.
A full format writes 0 in all the sectors (a quick format only
rebuilds the directory). SOME claim they
can recover what was there before (maybe NSA can?)
But it is not a high-percentage bet.
Forensics: Finding Hidden Stuff
* simplest cases: just "erased" your files?
- straightforward disk recovery may work.
* the famous photocopier story.
- copiers have hard drives and remember what was copied.
http://www.cbsnews.com/stories/2010/04/19/eveningnews/main6412439.shtml
* RAM sticks (flash drives) are just like hard drives; "delete" does not empty them.
(Nonvolatile RAM versus volatile RAM.
Why isn't it ALL nonvolatile?)
macforensiclabs.com
Forensics: Finding Hidden Stuff
* virtual memory: copies part of your RAM
onto the computer's hard drive.
* those images may include print queues and other information
that can be recovered.
* backup systems may not have been reformatted even if the main
hard drive was reformatted.
* offsite backup probably was NOT reformatted; old sectors may
have copies of data you wanted to make disappear.
macforensiclabs.com
File Structures: Summary
* vocabulary terms throughout lecture
* backup/archive/redundant storage
* criteria for choice of offsite backup
* understand and explain disk organization
* understand how disk errors occur
* analyze what data could be recovered from a particular accident
* discuss forensic issues concerning disk data erasure and recovery
motifake.com
Cloud Computing and
Digital Asset Management
• First let's look at the Cloud
- Where did it come from?
- What is it?
- How can it help me?
- What new skills will I need to use it?
- What effect does Cloud have on DAM?
As of the Year 2000 ...
• Most Internet Service Providers sold ( ... rented ...)
• dedicated hosting
One website: delivered by 1 computer
mystore.com
• shared virtual hosting
N websites each got 1/Nth computer
yourstore.com
histore.com
herstore.com
And a few giants (Yahoo, Google,
Amazon)
Built giant 'ad hoc'
systems with
thousands of CPUs
and petabytes of
storage.
Amazon noticed ...
less than 10% of their capacity was being used
most of the time.
phaseoneenterprises.com
en.wikimedia.org
... and in 2006 launched
Amazon Web Services
The 'utility model': power plants
have capacity to meet
AVERAGE demand
and so can
deliver UNLIMITED*
power to some customers
when needed.
(*"Unlimited" as long as << total capacity)
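A little arithmetic shows why sharing works when customers peak at different times. The demand figures below are invented for illustration (they are not Amazon's numbers), and the function names are hypothetical.

```python
# Hypothetical hourly demand (in servers) for three customers over six hours.
DEMAND = {
    "mystore.com": [10, 2, 2, 2, 2, 2],
    "yourstore.com": [2, 2, 10, 2, 2, 2],
    "herstore.com": [2, 2, 2, 2, 10, 2],
}


def dedicated_capacity(demand):
    """Servers needed if every customer buys enough for its own peak."""
    return sum(max(hours) for hours in demand.values())


def pooled_capacity(demand):
    """Servers needed if everyone shares one pool sized for the combined peak."""
    return max(sum(hour) for hour in zip(*demand.values()))


if __name__ == "__main__":
    print("dedicated:", dedicated_capacity(DEMAND))
    print("pooled:   ", pooled_capacity(DEMAND))
```

With these numbers, each customer can still burst to 10 servers, yet the shared pool is less than half the size of the three dedicated setups combined. The saving disappears if everyone peaks at the same hour, which is the footnote's point about staying well under total capacity.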
The Shared Telescope Model
Astronomers worldwide
now schedule time on
big telescopes
through the Internet
and don't have to go to a cold mountaintop
and stay up all night
to capture imagery.
as.utexas.edu
The Shared Computing Model
NASA released NEBULA in 2008,
to share research computers
instead of building additional
data centers.
NEBULA is an open source cloud management
system.
... resembles the old Mainframe
Timeshare model
Before PCs, we
programmed on punch-cards
and thought it was a
great INNOVATION
when time-sharing
became possible.
as.utexas.edu
But with one fundamental difference:
In 1965 this was SCARCE
and we were NUMEROUS
(relatively)
(Skilled specialists who wanted to use computers)
as.utexas.edu
redlinecs.com.au
In 2012 this is ABUNDANT
and we are
EVERYONE
allthingsdistributed.com
reuters.com
... may reduce your company's IT costs
* software is expensive – so RENT it
* hardware is expensive to update – so RENT it
* buildings are expensive – so share them
* land is expensive – build in rural areas
... relies on fast, reliable networks
Key Cloud Concepts:
1. Agility through dynamic provisioning
- Order up "supercomputer for an hour"
2. API Accessibility
- Your program can specify the needed QOS*
* QOS: Quality of Service:
- Maximum guaranteed latency (e.g. <1ms)
- Minimum guaranteed CPU (e.g. >1 petaflop)
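What "your program can specify the needed QOS" might look like can be sketched as follows. This is not any real provider's API: the `QosRequest` and `Offer` classes, their fields, and the `satisfies` function are all hypothetical, and only the two thresholds come from the slide.

```python
from dataclasses import dataclass


@dataclass
class QosRequest:
    """What the program asks for (hypothetical structure)."""
    max_latency_ms: float  # maximum guaranteed latency
    min_flops: float       # minimum guaranteed CPU, in flops


@dataclass
class Offer:
    """What a provider says it can guarantee (hypothetical structure)."""
    name: str
    latency_ms: float
    flops: float


def satisfies(offer: Offer, request: QosRequest) -> bool:
    """True if the offer is under the latency cap and over the CPU floor."""
    return offer.latency_ms < request.max_latency_ms and offer.flops > request.min_flops


if __name__ == "__main__":
    # The slide's example: latency < 1 ms and more than 1 petaflop.
    request = QosRequest(max_latency_ms=1.0, min_flops=1e15)
    offers = [Offer("big", 0.5, 2e15), Offer("small", 5.0, 1e12)]
    print([offer.name for offer in offers if satisfies(offer, request)])
```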
What's a flop?
"Floating Point Operations" like x=239.44*456.3733
per second
Math models (physics, stock market, statistics)
may need teraflops (tera = million*million flops) or more
giga = 10^9
tera = 10^12
peta = 10^15
exa = 10^18
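To get a feel for these prefixes, here is a quick calculation of how long a job takes at different speeds. The job size of 10^18 operations is an arbitrary example, and `seconds_needed` is a hypothetical helper.

```python
PREFIXES = {
    "giga": 10**9,
    "tera": 10**12,
    "peta": 10**15,
    "exa": 10**18,
}


def seconds_needed(operations, prefix, units=1):
    """Time to finish a job on a machine rated at `units` <prefix>flops."""
    return operations / (units * PREFIXES[prefix])


if __name__ == "__main__":
    job = 10**18  # an arbitrary example: one billion*billion operations
    for prefix in PREFIXES:
        print(f"1 {prefix}flop machine: {seconds_needed(job, prefix):,.0f} seconds")
```

At one teraflop this job takes a million seconds (more than 11 days); at one petaflop it takes 1,000 seconds.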
Key Cloud Concepts:
1. Agility through dynamic provisioning
- Order up "supercomputer for an hour"
2. API Accessibility
- Your program can specify the needed QOS*
3. Virtualization
- You "THINK" you have your own machine
- Protection models don't need to be reinvented
http://www.vmware.com/virtualization/
One Key Cloud Concern:
SECURITY.
(I know this guy)
http://www.acsac.org/2012/workshops/ccw/
One solution (for larger firms): Build your own Cloud.
http://www.enterprisenetworkingplanet.com/ebooks/50950510/95900/4190310/
Quickly, web-hosts realized that they
could virtualize their service
bigbird.com
cookie.com
elmo.com
kermit.com
piggie.com
Software as a Service (SaaS)
The 800 pound anthropoid:
Salesforce.com
http://www.salesforce.com
sales cloud (CRM systems)
force.com – build your own
pin.primate.wisc.edu
Digital Asset Management
in the Cloud
pin.primate.wisc.edu
1. Simple: Dropbox
2. Specialized for software: Github
3. Rich metadata -> DAM (e.g. AlienBrain)
Media Valet - http://www.mediavalet.co/home.aspx
Widen
Fordela
Media Valet - http://www.mediavalet.co/home.aspx
"CMIS compliant?"
Content Management
Interoperability Services (CMIS)
http://en.wikipedia.org/wiki/Content_Management_Interoperability_Services
CMIS is an open standard that defines how DAM
systems can manage metadata ("generic properties")
for files and folders.
Adobe, HP, IBM, Microsoft, Oracle + + +
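To make "generic properties" concrete, here is a rough sketch of asset metadata keyed by CMIS-style property names. The property IDs (`cmis:name`, `cmis:objectTypeId`, `cmis:createdBy`, `cmis:contentStreamMimeType`) are, to the best of my knowledge, defined in the CMIS specification; everything else (the sample assets, the user names, and the `find_assets` helper) is hypothetical and is not part of CMIS or of any DAM product.

```python
# Hypothetical assets described with CMIS-style property names.
ASSETS = [
    {
        "cmis:name": "logo.png",
        "cmis:objectTypeId": "cmis:document",
        "cmis:createdBy": "alice",
        "cmis:contentStreamMimeType": "image/png",
    },
    {
        "cmis:name": "trailer.mp4",
        "cmis:objectTypeId": "cmis:document",
        "cmis:createdBy": "bob",
        "cmis:contentStreamMimeType": "video/mp4",
    },
    {
        "cmis:name": "campaign",
        "cmis:objectTypeId": "cmis:folder",
        "cmis:createdBy": "alice",
    },
]


def find_assets(assets, prop, value):
    """Return the names of assets whose property `prop` equals `value`."""
    return [asset["cmis:name"] for asset in assets if asset.get(prop) == value]


if __name__ == "__main__":
    print(find_assets(ASSETS, "cmis:createdBy", "alice"))
    print(find_assets(ASSETS, "cmis:contentStreamMimeType", "video/mp4"))
```

Because files and folders share the same property vocabulary, the same search works on both, which is the interoperability the standard is after.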
Widen - http://www.widen.com/
Fordela - http://www.fordela.com/ - VIDEO focus
(started by LucasArts veterans)
Choosing a DAM System
pin.primate.wisc.edu
Here's a logically organized Buyer's Guide
http://www.datamation.com/storage/digital-asset-management-buying-guide-1.html
End of lecture ... End of lectureS.
When we return ... Project Show-and-tell!