iRODS 4.0 Tutorial
I. Getting Started
Leesa Brieger
2
iRODS Tutorial Preview
I. iRODS Getting Started
– Unix client (icommands)
– Usage
II. iRODS Data Grid Administration
– Installing server and iCAT
– Changing default settings
– Setting up users
– Installing resource server and adding new resources to a data grid/zone
– Federating with other grids/zones, remote users
– Microservices and rules for policy implementation and enforcement
3
I. iRODS Getting Started
4
iRODS Info
• Main page: http://irods.org
• Github repositories: https://github.com/irods/
(Request a guest account to get access.)
• Github tickets (issues):
https://github.com/irods/irods/issues?state=open
• Chat list: [email protected]
• The original iRODS wiki: https://wiki.irods.org
• The iRODS Manual: http://irods.org/dev/wp-content/uploads/2014/03/irods-manual-4.0.0.pdf
5
iRODS Books
Available from Amazon
• The integrated Rule-Oriented Data System (iRODS) Micro-service Workbook: http://www.amazon.com/dp/1466469129
• An updated microservices reference book is coming out soon.
6
iRODS Download
• Download link for binary packages from the iRODS main page
http://irods.org/download/
• Open source, BSD license
• Binary packages – rpm and deb – available for CentOS, Red Hat, Debian, Ubuntu, SUSE. Planned: Mac OS X, Windows
– See http://irods.org/2014/03/irods-4-0-0-released/
– 4.0 tested platforms: CentOS 5 & 6, SuSE 11 & 12, Ubuntu 10 & 12
• iRODS source is available via GitHub: https://github.com/irods/irods
• Other iRODS packages and plug-ins are available via GitHub at https://github.com/irods
7
Install iRODS Unix Client (icommands)
• Download the client package for your platform at https://irods.org/download/
• Install the package
– Ubuntu: sudo dpkg -i irods-icommands-4.0.0-64bit.deb
– CentOS 6: sudo rpm -i irods-icommands-4.0.0-64bit-centos6.rpm
• This will:
– install the icommands to your path
– advise you to create an environment file: ~/.irods/.irodsEnv
8
The iRODS Environment File
• ~/.irods/.irodsEnv
• The .irodsEnv file determines which data grid (zone) the icommands client connects to; it determines your whole iRODS environment
– server host
– iRODS user name
– port number
– default resource
– home directory
– working directory
– …
• Use multiple environment files to choose from among many grids
– only one file at a time has the name .irodsEnv
– such environment changes are only feasible for Unix accounts that are not the “irods” service account
9
Working with a Demo Data Grid
• If you have an account on an iRODS data grid, find
– your account name and password
– the environment file (.irodsEnv) that goes with it
• Make directory .irods/ in your home directory
> mkdir ~/.irods
• Copy the .irodsEnv file into ~/.irods; edit if necessary to insert your user name, zone name, default resource, etc
• This will direct your client to the intended data grid, as the intended user.
10
Sample .irodsEnv file (partial)
# iRODS server host name:
irodsHost 'tutoricat.renci.org'
# iRODS server port number:
irodsPort 1247
# Default storage resource name:
irodsDefResource 'resc1'
# Home directory in iRODS:
irodsHome '/centosgrid/home/leesa'
# Current directory in iRODS:
irodsCwd '/centosgrid/home/leesa'
# Account name:
irodsUserName 'leesa'
# Zone:
irodsZone 'centosgrid'
…
In this example, user name is “leesa”
iRODS data grids are referred to as “zones”
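To make the file layout concrete, here is a minimal Python sketch that reads environment text of this kind into a dictionary. It assumes one "name value" pair per line, optional single quotes around values, and "#" comment lines; the function name parse_irods_env is made up for illustration and is not part of iRODS.

```python
import shlex


def parse_irods_env(text):
    """Parse .irodsEnv-style text (one "name value" pair per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        # shlex drops '#' comments and strips the quotes around values.
        tokens = shlex.split(line, comments=True)
        if len(tokens) >= 2:
            settings[tokens[0]] = tokens[1]
    return settings


sample = """\
# iRODS server host name:
irodsHost 'tutoricat.renci.org'
# iRODS server port number:
irodsPort 1247
# Account name:
irodsUserName 'leesa'
# Zone:
irodsZone 'centosgrid'
"""

env = parse_irods_env(sample)
print(env["irodsUserName"], "in zone", env["irodsZone"], "via", env["irodsHost"])
```

All values come back as strings, so a caller that needs the port as a number would convert it explicitly.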
11
Some iRODS Clients
• iDrop web – iDrop, iDrop-lite: http://iren-web.renci.org:8080/idrop-web/login/login
• PHP web browser: http://iren-web.renci.org/rodsweb
• icommands – Unix client: https://www.irods.org/index.php/icommands
• FUSE (Filesystem in Userspace) client: https://www.irods.org/index.php/iRODS_FUSE
• Many others supplied by user communities
12
Unix client: icommands
See https://wiki.irods.org/index.php/icommands for the complete list and full documentation.
Unix-like
ils
ipwd
icd
ichmod
irm
imkdir
ipasswd
irsync
ichksum
imv
icp
ienv
FTP-like
iinit
iexit
iput
iget
These slides do not contain an exhaustive list of the icommands.
13
icommands (continued)
Metadata-related
imeta
iquest
idbo
Informational
ienv
ilsresc
iuserinfo
ihelp – list all icommands
Functional
ireg
ibun
irepl
Rule-oriented
irule
iqstat
iqdel
iqmod
idbug
The -h option to any icommand gives documentation on that command.
14
icommands - examples
> iinit
Enter your current iRODS password: *****
(This stores the scrambled password in ~/.irods/.irodsA.)
> ienv
NOTICE: Release Version = rods4.0.0, API Version = d
NOTICE: irodsHost=tutoricat.renci.org
NOTICE: irodsPort=1247
NOTICE: irodsDefResource=resc1
NOTICE: irodsHome=/centosgrid/home/leesa
NOTICE: irodsCwd=/centosgrid/home/leesa
NOTICE: irodsUserName=leesa
NOTICE: irodsZone=centosgrid
…
(View your environment.)
> iexit full
(This deletes the ~/.irods/.irodsA authentication file.)
15
icommands - examples
> ipwd
/centosgrid/home/leesa
(See your current working directory.)
> ils
/centosgrid/home/leesa:
  irods-resource-4.0.0-64bit.deb
  C- /centosgrid/home/leesa/slides
(Listing the contents of your current directory. The C- entry is a collection; directories in iRODS are called collections.)
Collection naming convention: /zone_name/home/user_name/collection_name
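The naming convention /zone_name/home/user_name/collection_name can be handled with plain path manipulation. The sketch below is illustrative only: the helper names home_collection and split_logical_path are invented here, and nothing in it talks to an iRODS server.

```python
import posixpath


def home_collection(zone, user, *subcollections):
    """Build /zone_name/home/user_name[/collection_name...]."""
    return posixpath.join("/", zone, "home", user, *subcollections)


def split_logical_path(path):
    """Return (zone, remaining components) for an absolute logical path."""
    parts = [p for p in path.split("/") if p]
    if not path.startswith("/") or not parts:
        raise ValueError("expected an absolute path that begins with a zone name")
    return parts[0], parts[1:]


print(home_collection("centosgrid", "leesa", "slides"))
print(split_logical_path("/centosgrid/home/leesa/slides"))
```

The first path component is always the zone name, which is what lets a client tell local paths apart from paths in a federated zone.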
16
icommands – to ingest and retrieve data
> iput -K notes (-K: to calculate and store checksums)
> iput iquest-attributes (no checksums)
> ils
/centosgrid/home/leesa:
  iquest-attributes
  irods-resource-4.0.0-64bit.deb
  notes
  C- /centosgrid/home/leesa/slides
> iget -k irods-resource-4.0.0-64bit.deb (-k: to verify checksum without storing)
> irm -f notes (-f: to force the remove, physically. Without -f, the file is simply moved to the Trash collection on the same resource.)
-k and -K options for checksum calculation and comparison
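As an illustration of what checksum comparison involves, the following Python sketch computes a file's checksum locally and compares it with a stored value. It assumes MD5, based only on the 32-hex-digit checksums shown in these slides; a grid may be configured differently, and the helper names are invented for this example.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, stored_checksum):
    """Compare a local file against a checksum string such as one shown by ils -L."""
    return md5_of_file(path) == stored_checksum.strip().lower()


# Demonstration on a throwaway local file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"sample contents\n")
    demo_path = tmp.name

stored = md5_of_file(demo_path)  # stands in for the checksum kept in the catalog
print("match:", checksum_matches(demo_path, stored))
os.remove(demo_path)
```

A check like this is useful after iget when the transfer itself was run without checksum verification.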
17
icommands - examples
> ils -L
/centosgrid/home/leesa:
  leesa 0 demoResc 1883 2014-05-01.13:58 & iquest-attributes
      demoResc generic /var/lib/irods/iRODS/Vault/home/leesa/iquest-attributes
  leesa 0 resc1 28937358 2014-04-30.14:47 & irods-resource-4.0.0-64bit.deb
      02b7dd912af6e615f4680594fe3e2dc8 resc1 generic /var/irodsVault/home/leesa/irods-resource-4.0.0-64bit.deb
  C- /centosgrid/home/leesa/slides
Long listing of my current directory
Shows owner, resource, file, checksum (if calculated), physical path, …
18
ilsresc – list available resources
See the resources available on your data grid
On centosgrid:
> ilsresc
resc1
demoResc
resc2
Resources can share hosts but typically don’t.
Each resource has its own unique physical location (path).
19
icommands – ACLs (access control lists) and ichmod
> ils -A
/centosgrid/home/leesa:
  ACL - leesa#centosgrid:own
  Inheritance - Disabled
  iquest-attributes
    ACL - leesa#centosgrid:own
  irods-resource-4.0.0-64bit.deb
    ACL - leesa#centosgrid:own
  C- /centosgrid/home/leesa/slides
> ichmod read guest iquest-attributes
> ils -A
/centosgrid/home/leesa:
  ACL - leesa#centosgrid:own
  Inheritance - Disabled
  iquest-attributes
    ACL - leesa#centosgrid:own guest#centosgrid:read object
  irods-resource-4.0.0-64bit.deb
    ACL - leesa#centosgrid:own
  C- /centosgrid/home/leesa/slides
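Each ACL entry in the listing has the form user#zone:permission, with a g: prefix for groups. A small Python sketch for pulling such entries apart is given below. It is an assumption-laden illustration: the permission names are limited to "own" and "read object" as seen in these slides plus "modify object", and AclEntry and parse_acl are names invented here.

```python
import re
from typing import NamedTuple


class AclEntry(NamedTuple):
    name: str
    zone: str
    permission: str
    is_group: bool


# Permission names are assumed from the listings in these slides.
_ACL_PATTERN = re.compile(r"(g:)?([^\s#:]+)#([^\s#:]+):(own|read object|modify object)")


def parse_acl(line):
    """Extract every user#zone:permission entry found on an 'ACL - ...' line."""
    return [
        AclEntry(name, zone, permission, bool(group))
        for group, name, zone, permission in _ACL_PATTERN.findall(line)
    ]


for entry in parse_acl("ACL - leesa#centosgrid:own g:public#centosgrid:read object"):
    print(entry)
```

Parsing command output is fragile by nature; treat this as a way to read the listing format, not as a stable interface.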
20
ichmod and group “public”
> ichmod -r read public slides
> ils -A slides
/centosgrid/home/leesa/slides:
  ACL - leesa#centosgrid:own g:public#centosgrid:read object
  Inheritance - Disabled
  irods4.0-intro.pptx
    ACL - public#centosgrid:read object leesa#centosgrid:own
Every user in the data grid is a member of user group “public”.
“ichmod -h” for help
21
irepl – replicating data objects
a replica ≠ a copy
> ils
/centosgrid/home/rods:
  hello
> irepl -R resc1 hello
> ils
/centosgrid/home/rods:
  hello
(The collection doesn’t change with replication of data object “hello”, which is still the same logical object as before.)
> ils -L
/centosgrid/home/rods:
  rods 0 demoResc 22 2014-04-16.12:28 & hello
      demoResc generic /var/lib/irods/iRODS/Vault/home/rods/hello
  rods 1 resc1 22 2014-05-01.17:03 & hello
      resc1 generic /var/irodsVault/home/rods/hello
(Use the long listing to see all replicas and multiple physical locations.)
Replicating does not create a new logical object; copying does.
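The difference between a replica and a copy can be modeled in a few lines of Python. This is a toy in-memory catalog written for illustration, not how the iCAT is implemented; ToyCatalog and its method names are invented, and the paths are the ones from the listing.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    logical_path: str
    replicas: list = field(default_factory=list)  # (resource, physical_path) pairs


class ToyCatalog:
    def __init__(self):
        self.objects = {}

    def put(self, logical_path, resource, physical_path):
        self.objects[logical_path] = DataObject(logical_path, [(resource, physical_path)])

    def replicate(self, logical_path, resource, physical_path):
        # Same logical object, one more physical location.
        self.objects[logical_path].replicas.append((resource, physical_path))

    def copy(self, source, destination, resource, physical_path):
        # A brand-new logical object with its own first replica.
        if source not in self.objects:
            raise KeyError(source)
        self.put(destination, resource, physical_path)

    def list_collection(self, collection):
        prefix = collection.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.objects if p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)


catalog = ToyCatalog()
home = "/centosgrid/home/rods"
catalog.put(home + "/hello", "demoResc", "/var/lib/irods/iRODS/Vault/home/rods/hello")
catalog.replicate(home + "/hello", "resc1", "/var/irodsVault/home/rods/hello")
print(catalog.list_collection(home))              # still a single logical entry
print(len(catalog.objects[home + "/hello"].replicas))
```

Replication leaves the collection listing unchanged and only grows the replica list, while a copy would add a second name to the collection.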
22
irepl
• See https://irods.sdsc.edu/index.php/irepl for complete list of options
• Some options
-B: Backup mode - if a good copy already exists in selected resource, don't make another copy
-P: show progress
-r: recursive replication
-R: destination resource
-X restart_filename: turn on restart option and accumulate restart info into restart_filename
“irepl -h” for help
23
ireg – register data into iRODS
In-place ingestion of data into iRODS
Ingest data into iRODS without making an additional copy and with no data transfer
Example
• Unix directory /projects/state-data (not an iRODS collection) contains 500TB of state LiDAR data.
• We want to manage the LiDAR data with iRODS.
• How to ingest the data into iRODS without copying or transferring 500TB?!
“ireg -h” for help
continued…
24
ireg example (continued)
Set-up
• /projects/state-data must be mounted on the iRODS server host
• Data admin sets up /projects/state-data as an iRODS resource:
> iadmin mkresc stateResc unixfilesystem tutorres.renci.org:/projects/state-data
Creating resource:
Name: "stateResc"
Type: "unixfilesystem"
Host: "tutorres.renci.org"
Path: "/projects/state-data"
Context: ""
stateResc is now listed as a resource:
> ilsresc
resc1
demoResc
resc2
stateResc
continued …
25
ireg example (continued)
Usage
• Data admin registers existing data object or collection into iRODS:
> ireg -C -R stateResc /projects/state-data/ /centosgrid/home/state-data/
> ils /centosgrid/home/state-data
/centosgrid/home/state-data:
  lidar-d23720431300go20040824.asc
(/centosgrid/home/state-data/ is now an iRODS collection contained in stateResc.)
Register incoming files rigorously OR modify a directory only through iRODS once it has been registered, to keep the iCAT consistent with the directory.
If new files are added to the directory, ireg can be run again, with option -f, to pick up as-yet unregistered files.
26
ibun – for bundling files
Tar up and expand files for efficient iput/iget
• iput a tarball and expand it within iRODS
> tar -chlf tutorials.tar -C irods-4.0 . (Unix command to tar up a directory)
> iput -Dtar tutorials.tar (icommand)
> ils
/centosgrid/home/leesa:
  iquest-attributes
  irods-resource-4.0.0-64bit.deb
  notes
  tutorials.tar
> ibun -x tutorials.tar irods-4.0 (icommand)
> ils
/centosgrid/home/leesa:
  iquest-attributes
  irods-resource-4.0.0-64bit.deb
  notes
  tutorials.tar
  C- /centosgrid/home/leesa/irods-4.0
  C- /centosgrid/home/leesa/slides
27
ibun – for bundling files
Tar up and expand files for efficient iput/iget
• Tar up files in iRODS for iget
> ibun -cDtar slides.tar slides (icommand)
> ils (icommand)
/centosgrid/home/leesa:
  iquest-attributes
  irods-resource-4.0.0-64bit.deb
  notes
  slides.tar
  tutorials.tar
  C- /centosgrid/home/leesa/irods-4.0
  C- /centosgrid/home/leesa/slides
> iget slides.tar (icommand)
> tar -xvf slides.tar -C slides (Unix)
“ibun -h” for help
28
iquest – querying the iCAT
• To see attributes that can be queried:
> iquest attrs
• See https://wiki.irods.org/index.php/iquest for examples.
• > iquest "SELECT DATA_PATH, DATA_NAME, DATA_CHECKSUM WHERE DATA_RESC_NAME like 'resc2'"
DATA_PATH = /opt/irodsVault/home/leesa/notes
DATA_NAME = notes
DATA_CHECKSUM = 0ac4e079aa230d87353fec33f1ec71cd
------------------------------------------------------------
DATA_PATH = /opt/irodsVault/home/rods/yum-fuse-libraries-notes
DATA_NAME = yum-fuse-libraries-notes
DATA_CHECKSUM =
------------------------------------------------------------
(Checksums are displayed only when they have already been calculated and stored.)
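When iquest output is captured by a script, the "NAME = value" blocks separated by dashed lines are easy to turn into records. The Python sketch below assumes exactly that default layout, as shown on this slide; parse_iquest_output is a name invented for the example.

```python
def parse_iquest_output(text):
    """Turn 'NAME = value' blocks separated by dashed lines into a list of dicts."""
    rows, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if line and set(line) == {"-"}:
            # A dashed separator closes the current record.
            if current:
                rows.append(current)
                current = {}
        elif "=" in line:
            name, _, value = line.partition("=")
            current[name.strip()] = value.strip()
    if current:
        rows.append(current)
    return rows


sample_output = """\
DATA_PATH = /opt/irodsVault/home/leesa/notes
DATA_NAME = notes
DATA_CHECKSUM = 0ac4e079aa230d87353fec33f1ec71cd
------------------------------------------------------------
DATA_PATH = /opt/irodsVault/home/rods/yum-fuse-libraries-notes
DATA_NAME = yum-fuse-libraries-notes
DATA_CHECKSUM =
------------------------------------------------------------
"""

for row in parse_iquest_output(sample_output):
    status = "has checksum" if row["DATA_CHECKSUM"] else "no checksum stored"
    print(row["DATA_NAME"], "-", status)
```

An empty DATA_CHECKSUM value comes back as an empty string, which makes it simple to find objects that were ingested without checksums.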
29
iquest – querying the iCAT
• Useful when you want to remove a resource and you discover it isn’t empty:
> iquest "SELECT DATA_NAME, USER_NAME, COLL_NAME WHERE DATA_RESC_NAME like 'tempResc'"
DATA_NAME = slide-list-html
USER_NAME = rods
COLL_NAME = /centosgrid/trash/home/rods
-----------------------------------------------------------------
(A file was removed with irm but not physically deleted; it was moved to Trash on the resource.)
• Administrator can add SQL strings to be invoked by users of the data grid
• Data grid-specific queries (added by admin)
> iquest --sql 'pre-defined SQL string' [format] [arguments]
30
imeta – add, view, modify metadata
• > imeta add -d hello "Date" "2 february 2012"
• > imeta ls -d hello
AVUs defined for dataObj hello:
attribute: Meta1
value: hello
units:
----
attribute: Date
value: 2 february 2012
units:
• > imeta rm -d hello "Meta1" "hello"
“imeta -h” for help
31
iadmin – administrative functions
• h for help, “h <command>” for help for that command
• q to exit
• Add new users, modify passwords, add new resources, federate to remote zones, create resource groups, …
– mkresc/rmresc
– mkuser/rmuser, moduser (modify passwords)
– mkzone/rmzone, modzone (for federation)
• Information on users, resources, tokens, etc.
– lt (el-tee), “lt <token type>” for listing of that token’s possible values
– lu, lr, lz, …
32
Cloud Resources
1. Admin user must set up the cloud resource, for example:
– S3 Resource: s3Resc
2. This is just another iRODS resource for the users, who can manage their cloud data just as all other iRODS data:
> iput -K -R s3Resc my_file (put data into the cloud resource)
> irepl -R s3Resc another_file (replicate into the cloud resource)
> ichmod read public my_file (give public access to data in the cloud)
> ichmod write user1 my_collection (give user1 permission to write to a collection in the cloud)
33
Realizable Objects – coming in 4.1???
• Typical iRODS objects contain data
• Realizable objects:
– symbolic links to iRODS objects
– symbolic links to external data sources
– workflow procedures to regenerate data
• Implemented in community iRODS, still to come for the merged release (planned for 4.1)
34
Database Resources????
• Database Resource (DBR): a database, queried and updated via SQL (or other, for non-SQL)
• Database object (DBO): an interface to a DBR, typically a (SQL) query that returns results
• iRODS agent will open and close DB as needed; results of the query are directly returned to user
• Query results are stored to an iRODS data object, a DBO Results file (DBOR).
• iRODS access controls are applied on the DBR and DBO.
35
Microservices and Rules
Microservices
• C code
• the unit of work within iRODS
• called by rules – can only be run by a rule
Rules
• execute the microservices
• compose microservices and other rules into workflows
36
Rules
• Rules can run workflows of microservices (and other rules)
• Implement computer actionable policies
• Provide implementation of data policies
– Retention, distribution, arrangement
– Authenticity, provenance, description
– Integrity, replication, synchronization
– Deletion, trash cans, versioning
– Archiving, staging, caching
– Authentication, authorization, sharing
– Access, approval, IRB, audit trails, report generation
– Assessment criteria, validation
– Derived data product generation, format parsing
37
Running Rules
• Triggered by events/policy points
• Contained in the (distributed) rule base:
– /etc/irods/core.re
– many rules with the same name; the first rule with satisfied condition is executed; others are skipped
• Can be run with irule: manual execution
• Delayed execution
– iqstat
– iqdel
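The "first rule with satisfied condition is executed; others are skipped" behavior can be pictured with a short Python model. This is not the iRODS rule language or its engine; the rule name pick_resource, the size threshold, and the function run_first_matching are hypothetical and exist only to show the selection order.

```python
def run_first_matching(rule_base, name, *args):
    """Run the first rule called `name` whose condition holds; skip the rest."""
    for rule_name, condition, action in rule_base:
        if rule_name == name and condition(*args):
            return action(*args)
    raise LookupError("no rule named %r has a satisfied condition" % name)


# Two rules share one name; order in the rule base decides which one fires.
rule_base = [
    ("pick_resource", lambda size: size > 1_000_000, lambda size: "resc2"),
    ("pick_resource", lambda size: True, lambda size: "demoResc"),
]

print(run_first_matching(rule_base, "pick_resource", 5_000_000))
print(run_first_matching(rule_base, "pick_resource", 10))
```

Because evaluation stops at the first match, a catch-all rule placed at the top of the rule base would hide every more specific rule below it.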
38
irule – to run a rule manually
• Example rules to tweak and run in the software distribution iRODS/clients/icommands/test/rules3.0
• irule -F rulemsiListEnabledMS.r – list enabled microservices
Works with 4.0.1, not 4.0
• irule -F rulemsiAdmShowCoreRE.r - can only be run by admin users
• NB: not all the test rules in iRODS/clients/icommands/test/rules3.0 are running yet in 4.0.