Datawarehouse Workflow: ETLP
Extract Transform Load Provide
Make user-friendly formats
Dynamic database
Charts & Maps
Tools & websites
Archive native formats
Datawarehouse Workflow: ETLP
Extract Transform Load Provide
[Diagram labels: data providers = data users; data; tools; models; transform to netCDF; add meta information; netCDF on web server; netCDF on OPeNDAP server]
Make user-friendly formats
Dynamic database
Charts & Maps
Tools & websites
Archive native formats
DATA = RAW DATA + PROCESSING
RAW DATA (volts)
• History will never change!
• one parameter at one place at one time
PROCESSING
• Interpretation does change!
• e.g. instrument deterioration, recalibration
NASA satellite data with open source SeaDAS processing toolkit (in IDL)
• L0: dump of recorded voltages, only averaged over 16 pixels
  • LAC: MLAC
  • GAC
• L1: voltages + satellite track
• L2 ~ physical quantities
• L3 ~ binned in space (1 grid instead of zillions of warped photos)
• L4 ~ binned in time (climatology)
Deltares Aukepc for flumes
• Stored raw data
• With calibration coefficients
• Allows for recalibration
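The split between raw data and processing can be sketched in a few lines of Python. This is only an illustration, not how SeaDAS or Aukepc work internally: the linear calibration form and the names `Calibration` and `apply_calibration` are assumptions made for the example. The raw voltages are stored once and never edited; recalibrating only means applying a different set of coefficients to the same raw data.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Calibration:
    """Hypothetical linear calibration: value = gain * volts + offset."""
    gain: float
    offset: float


def apply_calibration(raw_volts: Sequence[float], cal: Calibration) -> List[float]:
    """Return physical values; the raw voltages themselves are left untouched."""
    return [cal.gain * v + cal.offset for v in raw_volts]


# Raw data: stored once, never modified (a tuple cannot be edited in place).
raw_volts = (0.5, 1.0, 1.5)

# Processing: the interpretation may change, e.g. after a recalibration.
first_cal = Calibration(gain=2.0, offset=0.0)
recal = Calibration(gain=2.0, offset=0.5)

first_values = apply_calibration(raw_volts, first_cal)
recalibrated_values = apply_calibration(raw_volts, recal)
```

Because the coefficients live next to the raw data rather than being baked into it, both interpretations can be reproduced at any time.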
Datawarehouse Workflow
Extract Transform Load Provide
[Diagram labels: Subversion repository; data providers = data users; data; tools; models; transform to netCDF; add meta information; netCDF on web server; netCDF on OPeNDAP server]
Make user-friendly formats
Dynamic database
Charts & Maps
Tools & websites
Archive native formats
Programme today, and current session
Extract Transform Load Provide
[Diagram labels: 1; 2; 3 D:\...; 3 http://…; Subversion repository; data providers = data users; data; tools; models; transform to netCDF; add meta information; netCDF on web server; netCDF on OPeNDAP server]
Repository username
• Get a username and password.
• Why? OpenEarth is open, right? Yes, but it is a closed community.
• For best quality, all actions are logged:
  • Nothing can be lost, only temporarily disabled
  • So anyone can be allowed to join
Every file is logged …
… and every line in every file is logged.
REPOSITORY basics
[Diagram: central database (repos.deltares.nl) and local copy (D:\ E:\ F:\); actions: browse, checkout, update, commit, add, delete, copy]
REPOSITORY browse
REPOSITORY checkout
• Not handy to get files one by one with a browser
• Get them all at once with a free program
REPOSITORY checkout
• Download and install TortoiseSVN (http://tortoisesvn.net/)
• Make a checkout in e.g. F:\checkouts\
• No need to back this up, it’s only a copy ...
REPOSITORY checkout
• Copy the URL from the browser (case sensitive!)
• Make sure that the tree of the local copy resembles the server
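One way to keep the local tree resembling the server is to derive the checkout folder from the URL itself. The sketch below assumes the command-line `svn` client rather than the TortoiseSVN dialogs shown in the slides. The helper names `local_checkout_path` and `checkout_command` are hypothetical, as is the rule that everything after `/repos/` is mirrored locally. The command is only built here, not executed.

```python
from pathlib import PureWindowsPath
from typing import List
from urllib.parse import urlparse

# Assumption: the part of the URL after this prefix is mirrored locally.
REPOS_PREFIX = "/repos/"


def local_checkout_path(url: str, root: str = r"F:\checkouts") -> PureWindowsPath:
    """Map a repository URL to a local folder that mirrors the server tree."""
    path = urlparse(url).path
    if not path.startswith(REPOS_PREFIX):
        raise ValueError(f"unexpected repository URL: {url}")
    # Keep the case exactly as on the server: repository URLs are case sensitive.
    parts = [p for p in path[len(REPOS_PREFIX):].split("/") if p]
    return PureWindowsPath(root, *parts)


def checkout_command(url: str, root: str = r"F:\checkouts") -> List[str]:
    """Build the `svn checkout URL PATH` command without running it."""
    return ["svn", "checkout", url, str(local_checkout_path(url, root))]


cmd = checkout_command("http://repos.deltares.nl/repos/OpenEarthTools/sandbox")
```

On a PC with the `svn` client and repository credentials, the list could be passed to `subprocess.run(cmd, check=True)`.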
REPOSITORY commit
[Figure labels: up to date; modified]
REPOSITORY update
REPOSITORY statistics
REPOSITORY add
REPOSITORY add a raw dataset
• OpenEarthRawData is very big: don’t make a full checkout
• To add something, first make an empty checkout of the destination.
REPOSITORY add a raw dataset
• There are 2 copies of 1 file on your PC:
  • Visible working copy, for editing
  • Hidden shadow copy, to detect changes
• Before adding a file to the server, a shadow copy must be created.
• Allows for offline working
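The idea of a shadow copy can be illustrated with a small Python sketch. It is a simplification, not Subversion's actual storage format: the `.shadow` folder and the function names are hypothetical. It shows why a local unmodified copy is enough to detect changes without contacting the server.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def digest(path: Path) -> str:
    """Checksum of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def create_shadow(working_file: Path, shadow_dir: Path) -> Path:
    """Store an unmodified copy next to the working copy (hypothetical layout)."""
    shadow_dir.mkdir(exist_ok=True)
    shadow_file = shadow_dir / working_file.name
    shutil.copyfile(working_file, shadow_file)
    return shadow_file


def is_modified(working_file: Path, shadow_file: Path) -> bool:
    """Detect local edits by comparing with the shadow copy, fully offline."""
    return digest(working_file) != digest(shadow_file)


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "data.txt"
    work.write_text("1.0 2.0 3.0\n")
    shadow = create_shadow(work, Path(tmp) / ".shadow")
    before_edit = is_modified(work, shadow)   # nothing changed yet
    work.write_text("1.0 2.0 3.5\n")          # edit the visible working copy
    after_edit = is_modified(work, shadow)    # now differs from the shadow copy
```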
REPOSITORY add a raw dataset
• Now the addition simply has to be committed, like any other change
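For readers who prefer the command line to the TortoiseSVN menus, the three steps (empty checkout, add, commit) might look as follows. This is a sketch under assumptions: it uses the standard `svn` client, the helper name `add_dataset_commands` is made up, and the local folder and item name are placeholders. The commands are only assembled, so nothing touches the repository.

```python
from typing import List


def add_dataset_commands(destination_url: str, local_dir: str,
                         new_item: str, message: str) -> List[List[str]]:
    """Build the svn commands for adding one item; nothing is executed here."""
    return [
        # Empty checkout: creates the working copy without downloading its contents.
        ["svn", "checkout", "--depth", "empty", destination_url, local_dir],
        # Register the new file or folder in the local working copy.
        ["svn", "add", f"{local_dir}/{new_item}"],
        # Send the addition to the server, like any other change.
        ["svn", "commit", "-m", message, local_dir],
    ]


commands = add_dataset_commands(
    "http://repos.deltares.nl/repos/OpenEarthTools/sandbox",
    "F:/checkouts/OpenEarthTools/sandbox",   # hypothetical local folder
    "my_test_folder",                        # hypothetical item name
    "Add test folder to sandbox",
)
```

Between the first and second command, the new item has to be copied into the local folder by hand.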
REPOSITORY add
• The repository is supposed to be working at any time
• Do not play with the actual repository
• All advanced users will be annoyed by this
• But then, how can I learn how to work with it?
• Solution: use the sandbox
• Play around at the highest level as much as you like
• And clean up afterwards (delete)
• With your browser: http://repos.deltares.nl/repos/OpenEarthTools/sandbox
REPOSITORY delete
• There are 2 copies of 1 file on your PC:
  • Visible working copy, for editing
  • Hidden shadow copy, to detect changes
• When deleting a file on the server, your shadow copy must be informed
• Allows for working offline
REPOSITORY delete
• Now the deletion simply has to be committed, like any other change
REPOSITORY delete
• Now delete the addition you made in http://repos.deltares.nl/repos/OpenEarthTools/sandbox
• And check the log file, to see what colleagues did.
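A command-line equivalent of this clean-up exercise could look like the sketch below. It assumes the standard `svn` client instead of TortoiseSVN; the helper name, the local folder and the item name are hypothetical. As before, the commands are only built, not run.

```python
from typing import List


def delete_and_log_commands(working_copy: str, item: str, repo_url: str,
                            message: str, limit: int = 10) -> List[List[str]]:
    """Build svn commands to remove an item and inspect recent history."""
    return [
        # Schedule the deletion locally; this informs the shadow copy.
        ["svn", "delete", f"{working_copy}/{item}"],
        # Commit the deletion to the server, like any other change.
        ["svn", "commit", "-m", message, working_copy],
        # Show the most recent log entries to see what colleagues did.
        ["svn", "log", "--limit", str(limit), repo_url],
    ]


cleanup = delete_and_log_commands(
    "F:/checkouts/OpenEarthTools/sandbox",   # hypothetical local folder
    "my_test_folder",                        # hypothetical item name
    "http://repos.deltares.nl/repos/OpenEarthTools/sandbox",
    "Clean up sandbox test folder",
)
```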
REPOSITORY copy
• Again: first inform the shadow copy locally, then commit to the server …
• Drag with the right mouse button
OpenEarthRawData
Raw data are stored under https://repos.deltares.nl/repos/OpenEarthRawData/trunk/
• Data are stored with the copyright holder as main directory. This allows
  • copyright holders to maintain their own data
  • copyright holders to shift easily from private to open source
  • users to identify whom to acknowledge
• Data should also contain
  • dedicated processing scripts (if not in OpenEarthTools)
  • url file to web source
  • INSPIRE XML meta-data file
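A quick check of a dataset folder against this list could be sketched as follows. The slides do not prescribe file names, so recognising the companion files by extension (`*.url`, `*.xml`) is an assumption, as are the function name and the example folder layout. Processing scripts are left out because they may live in OpenEarthTools instead.

```python
import tempfile
from pathlib import Path
from typing import List


def missing_companions(dataset_dir: Path) -> List[str]:
    """List which recommended companion files are absent from a dataset folder."""
    # Assumption: companions are recognised by extension only.
    expected = {
        "url file to web source": "*.url",
        "INSPIRE XML meta-data file": "*.xml",
    }
    return [label for label, pattern in expected.items()
            if not any(dataset_dir.rglob(pattern))]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <copyright holder>/<dataset>/...
    dataset = Path(tmp) / "some_holder" / "some_dataset"
    dataset.mkdir(parents=True)
    (dataset / "source.url").write_text("placeholder\n")
    missing = missing_companions(dataset)
```

Here only the url file is present, so the check reports the meta-data file as missing.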
Summary: current session
Extract Transform Load Provide
[Same diagram as on the slide 'Programme today, and current session']
Next: use OpenEarthTools to make netCDF
Extract Transform Load Provide
[Same diagram as on the slide 'Programme today, and current session']