CMIP5 Download Tutorial
Jennifer M. Adams
12 January 2012
/data/cmip5/extras/CMIP5_Tutorial.pptx
Vocabulary Lesson
/data/cmip5/docs/data_reference_syntax.pdf
Eleven keywords uniquely describe a data set: Project, Institute, Model, Experiment, Frequency, Realm, MIP Table, Product, Variable, Ensemble, Version.
Project (Activity): cmip5
Institute: The institute responsible for model results, e.g. MOHC, NOAA-GFDL, NCAR, MPI-M, NASA-GMAO, …
Model: The name of the model used, e.g. HadCM3, GFDL-ESM2M, CCSM4, MPI-ESM-LR, …
Experiment: The name of the experiment family and type, e.g. amip, amip4xCO2, decadal1960, piControl, historical, …
Frequency: The interval between time samples. Options are: yr, mon, day, 6hr, 3hr, subhr, monClim, fx
Realm: The high level modeling component. Options are: atmos, ocean, land, landIce, seaIce, aerosol, atmosChem, ocnBgchem
MIP Table: A spreadsheet* entry for realm and variable components, e.g. Amon, Omon, Lmon, LImon, day, …
* /shared/cmip5/docs/standard_output.xlsx
Variable: A short name that identifies a physical quantity, e.g. pr, ps, psl, tas, tauu, tauv, uas, vas, …
Product: A designation of CMIP5 files. Options are: output, output1, output2, unsolicited
Ensemble: A name that distinguishes among closely related simulations and includes 3 numbers: realization (rN), initialization method (iM), and perturbed physics (pL), e.g. r1i1p1, r7i1p1, r1i3p1, r1i1p2, …
Version: Uniquely identifies a particular version of the data set, e.g. v20110923, v20111208, v1, v2, …
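The three numbers in an Ensemble label can be pulled apart with a few lines of shell. This is only a sketch: parse_ensemble is a hypothetical helper name, not part of any CMIP5 or ESGF tool, and it assumes the label has exactly the rNiMpL form described above.

```shell
# Hypothetical helper: split an ensemble label such as r7i1p1 into its numbers.
parse_ensemble() {
    # Reject anything that is not r<digits>i<digits>p<digits>.
    if ! printf '%s\n' "$1" | grep -Eq '^r[0-9]+i[0-9]+p[0-9]+$'; then
        echo "not an ensemble label: $1" >&2
        return 1
    fi
    # Rewrite the label as three name=value pairs.
    printf '%s\n' "$1" |
        sed -E 's/^r([0-9]+)i([0-9]+)p([0-9]+)$/realization=\1 initialization=\2 physics=\3/'
}

parse_ensemble r7i1p1   # prints: realization=7 initialization=1 physics=1
```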
Desired & Acquired Data Sets
/project/cmip5/jma/desired.txt
This list, based on user requests, is managed by Tim and Jennifer.
/project/cmip5/jma/acquired.txt
This list, based on contents of /data/cmip5/data, is auto-updated.
Each Downloader will be assigned an item from the desired list and must grab all available models and ensemble members for the given Experiment/Realm/Frequency/MIP Table.
ESGF Gateways
For Downloading:
http://pcmdi3.llnl.gov/esgcet/home.htm
http://cmip-gw.badc.rl.ac.uk/home.htm
http://ipcc-ar5.dkrz.de/home.htm
For Searching:
http://esg.prototype.ucar.edu/home.htm
Always Use Firefox!
Gateway Login
Authentication
OpenID: https://pcmdi3.llnl.gov/esgcet/myopenid/jennifer
Username: jennifer
Password: sdf,WER.5
Gateway Search
Gateway Search Results
Dataset URL
Dataset ID
Select Files
Download Wget Script
Rename Wget Script
mv wget-download.sh wget.cmip5.output1.MOHC.HadCM3.decadal2000.mon.atmos.Amon.r1i2p1.v20110708.sh
IT IS VERY IMPORTANT THAT YOU DO THIS CORRECTLY!
Because …
1. Otherwise there is no way to tell what dataset wget-download.sh is configured to grab
2. I need the keywords in the new script name to enable several automation scripts
3. Some of the metadata (esp. version number) can’t be captured any other way
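To see why the name matters, here is a rough sketch of how a script could recover the keywords from a renamed wget script. parse_wget_name is a hypothetical helper, not one of the actual automation scripts. The field order is inferred from the example name above, and the sketch assumes no keyword contains a dot.

```shell
# Hypothetical helper: split a renamed wget script name into its keywords.
# Assumed field order (taken from the example name in this tutorial):
# wget.project.product.institute.model.experiment.frequency.realm.table.ensemble.version.sh
parse_wget_name() {
    name=${1#wget.}     # drop the leading "wget."
    name=${name%.sh}    # drop the trailing ".sh"
    old_ifs=$IFS
    IFS=.
    set -- $name        # split on dots into $1 .. ${10}
    IFS=$old_ifs
    if [ $# -ne 10 ]; then
        echo "expected 10 keywords, got $#" >&2
        return 1
    fi
    echo "project=$1 product=$2 institute=$3 model=$4 experiment=$5"
    echo "frequency=$6 realm=$7 table=$8 ensemble=$9 version=${10}"
}

parse_wget_name wget.cmip5.output1.MOHC.HadCM3.decadal2000.mon.atmos.Amon.r1i2p1.v20110708.sh
```

An unrenamed wget-download.sh fails the keyword count, which is reason 1 in action.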
Set Up Work Environment
Create a top level working directory (you will need my help with this):
mkdir /shared/working/cmip5/jma
Make several subdirectories under your top level directory, one for each wget:
cd /shared/working/cmip5/jma
mkdir a01 a02 a03 a04 a05 a06 a07 a08 a09 a10 a11 a12 a13 a14 a15 a16
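If you would rather not type the subdirectory names by hand, a loop does the same job. A small sketch: make_subdirs is a hypothetical helper name, and the zero-padded aNN naming simply follows the names used in this tutorial.

```shell
# Hypothetical helper: create a01 .. aNN under a top level working directory.
make_subdirs() {
    top=$1      # top level working directory (must already exist)
    count=$2    # how many subdirectories to create
    i=1
    while [ "$i" -le "$count" ]; do
        # printf zero-pads the counter: a01, a02, ...
        mkdir -p "$top/$(printf 'a%02d' "$i")" || return 1
        i=$((i + 1))
    done
}

# Example call (not run here):
# make_subdirs /shared/working/cmip5/jma 16
```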
sftp wget.renamed.sh from laptop to /shared/working/cmip5/jma/a01/
Certificates
mkdir $HOME/.esg
cd $HOME/.esg
cp /data/cmip5/extras/MyProxyLogon-ESG.jar .
cp /homes/jma/.esg/update .
./update
Launch the Wget
Log in to a server: cpu1-cpu6
cd /shared/working/cmip5/jma/a01
/project/cmip5/jma/dorun.sh &
This script filters out unwanted files, notes wget format, runs edited wget script, captures output in log file, and monitors script’s progress. Run it in the background.
/project/cmip5/jma/ckrun.sh
When dorun.sh is no longer working, this script evaluates the success of wgets, checks if all desired files are here, returns status of download.
Repeat As Necessary
If ckrun.sh determines that the download is complete, it will create a file called ‘done’. Take a moment to celebrate, then move on to next data set.
If ckrun.sh determines that download is incomplete, there was an error. Look for a file in working subdir named ‘formatA’ or ‘formatB’ or ‘formatC’.
If format=A:
• Go back to the Gateway and get a new wget script
• Copy new wget.renamed.sh into working subdirectory
• Rerun dorun.sh
If format=B or C:
• You may need to update your certificates
• You do not need a new wget script; just rerun dorun.sh
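The decision steps above can be sketched as a small shell function that looks at the marker files in a working subdirectory. next_step is a hypothetical helper, not part of dorun.sh or ckrun.sh, and the last branch (no marker file at all) is an assumption about what that situation means.

```shell
# Hypothetical helper: suggest the next step from the marker files in a working subdir.
next_step() {
    dir=${1:-.}   # working subdirectory, defaults to the current directory
    if [ -e "$dir/done" ]; then
        echo "complete: move on to next data set"
    elif [ -e "$dir/formatA" ]; then
        echo "incomplete: get a new wget script from the Gateway, copy it here, rerun dorun.sh"
    elif [ -e "$dir/formatB" ] || [ -e "$dir/formatC" ]; then
        echo "incomplete: update certificates if needed, then rerun dorun.sh"
    else
        # Assumption: no marker file means dorun.sh or ckrun.sh has not finished yet.
        echo "no marker file found"
    fi
}

# Example call (not run here):
# next_step /shared/working/cmip5/jma/a01
```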
Pitfalls & Shortcuts
• Watch out for zombies, a.k.a. processes that are hung and can’t be killed
• Use dataset URLs to circumvent the search
• Start by downloading only these high-priority models:
Model        Gateway
CCSM4        http://www.earthsystemgrid.org
GFDL-CM3     http://pcmdi3.llnl.gov/esgcet
GISS-E2-H    http://pcmdi3.llnl.gov/esgcet
HadCM3       http://cmip-gw.badc.rl.ac.uk
MPI-ESM-LR   http://ipcc-ar5.dkrz.de