San Diego Supercomputer Center | www.irods.org | iRODS DGMS
Towards Data Grid Standard Implementations
Arun Jagatheesan
San Diego Supercomputer Center
Open Grid Forum 19 Jan 31, 2007 – session II
Outline
• Community Introduction: OGF-GFS
• User perspective
• Developer/Vendor Perspective
• Need for standard community implementation
• Community implementation process
• GFS-WG community architecture sketch
• Follow-up actions
Motivation
• Global namespace for unstructured data storage
• Collaboration amongst multiple partners / teams
• Long-term management of unstructured data
• Files, collection-based digital entities
NIH BIRN Data Grid
World Wide Datagrid
Used or Required by
• Large-scale academic projects
• Federal agencies (NARA, LoC, …)
• Fortune 500, Forbes Global 2000, …
DGMS Concept-wise
• Large-scale logical file system + File System + Database System + Grid Computing = Data Grid Management System (DGMS)
• Core Concepts
  • Logical shared collections
  • Logical shared resources
  • Collaborative communities
Problem solved / Requirements –1
• Collaborative logical namespace
  • Global collaborations of multiple teams
  • Collaborations of multiple organizations
  • Avoid multiple mount points as they restrict scalability of the collaboration
  • Coordinated data sharing at any level of granularity (data, metadata, annotations, …)
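
A minimal sketch of the collaborative logical namespace idea, assuming a hypothetical in-memory catalog: collaborators address data by one shared logical path, and the catalog maps that path to whichever site stores the bytes. The class names, site names and paths are illustrative assumptions, not part of any OGF-GFS or iRODS interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalLocation:
    """Where a file actually lives (hypothetical fields)."""
    site: str   # organization or storage resource holding the bytes
    path: str   # site-local path, never exposed to collaborators


class LogicalNamespace:
    """Maps one shared logical path space onto many physical sites."""

    def __init__(self) -> None:
        self._catalog: dict[str, PhysicalLocation] = {}

    def register(self, logical_path: str, location: PhysicalLocation) -> None:
        self._catalog[logical_path] = location

    def resolve(self, logical_path: str) -> PhysicalLocation:
        # Raises KeyError if the logical path is unknown.
        return self._catalog[logical_path]

    def list_collection(self, prefix: str) -> list[str]:
        """Logical paths under a collection, regardless of the storing site."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(p for p in self._catalog if p.startswith(prefix))


ns = LogicalNamespace()
ns.register("/grid/brain-study/scan001.img",
            PhysicalLocation("site-a", "/vol7/u12/scan001.img"))
ns.register("/grid/brain-study/scan002.img",
            PhysicalLocation("site-b", "/data/export/scan002.img"))

print(ns.list_collection("/grid/brain-study"))
print(ns.resolve("/grid/brain-study/scan002.img").site)
```

Because users only ever see the logical paths, no one needs a separate mount point per partner site.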
Problem solved / Requirements –2
• Data Distribution
  • Multi-site replicas reduce access times
  • Replicas have the same logical name everywhere in the enterprise (big plus for users)
  • Concept of replica, copy, cache
  • Replicas controlled by the user, the admin, or the system (automated or policy-based)
  • Reduce WAN latency (chattiness)
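
A rough sketch of how one logical name could front several replicas, assuming a hypothetical catalog that records a measured latency per site. The selection rule (lowest latency wins) and all names and numbers are assumptions for illustration; a real DGMS could apply any user, admin or policy-driven rule here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    latency_ms: float  # assumed round-trip time from the caller to this site


def pick_replica(replicas: list[Replica]) -> Replica:
    """Choose the replica with the lowest assumed latency."""
    if not replicas:
        raise ValueError("no replicas registered for this logical name")
    return min(replicas, key=lambda r: r.latency_ms)


# One logical name, several physical copies (all values are made up).
replica_catalog = {
    "/grid/climate/run42.nc": [
        Replica("san-diego", 4.0),
        Replica("tokyo", 110.0),
        Replica("london", 140.0),
    ],
}

best = pick_replica(replica_catalog["/grid/climate/run42.nc"])
print(best.site)
```

The caller asks for the same logical name at every site; only the chosen replica differs.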
Problem solved / Requirements –3
• Data Classification and Discovery
  • Major advantage for Global 2000 companies
  • Tag data with any arbitrary metadata schema
  • Each team can organize its data based on user-defined attributes
  • Multiple teams can have different metadata attributes on the same data
  • Query, discover and access data without knowing path or protocol to be used
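
The tagging and discovery requirements above can be sketched as follows, assuming a hypothetical in-memory catalog in which each team keeps its own attribute-value tags on the same logical path. The class, method and attribute names are invented for illustration only.

```python
from collections import defaultdict


class MetadataCatalog:
    """Attribute-value tags per logical path, kept separately for each team."""

    def __init__(self) -> None:
        # (team, logical_path) -> {attribute: value}
        self._tags: dict[tuple[str, str], dict[str, str]] = defaultdict(dict)

    def tag(self, team: str, logical_path: str, **attributes: str) -> None:
        self._tags[(team, logical_path)].update(attributes)

    def query(self, team: str, **criteria: str) -> list[str]:
        """Logical paths whose tags for `team` match every criterion."""
        return sorted(
            path
            for (owner, path), tags in self._tags.items()
            if owner == team
            and all(tags.get(k) == v for k, v in criteria.items())
        )


catalog = MetadataCatalog()
catalog.tag("imaging", "/grid/study/scan001.img", modality="MRI", subject="s01")
catalog.tag("imaging", "/grid/study/scan002.img", modality="PET", subject="s01")
# A second team describes the same file with its own schema.
catalog.tag("stats", "/grid/study/scan001.img", quality="passed")

print(catalog.query("imaging", modality="MRI"))
print(catalog.query("stats", quality="passed"))
```

Queries return logical paths only, so the caller never needs to know where or over which protocol the data is stored.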
User Perspective
• Designed for off-the-shelf use
  • Don’t want to assemble (or DIY)
  • But able to customize the solution
• One point of contact or responsibility
  • If it does not work, I have one mailing list or number to call
Vendor/developer perspective
• “OGF-GFS compatible”
  • OGF-GFS Data Grid Applications
  • OGF-GFS Data Grid Appliance
• Ease of standard evolution
  • Avoid unnecessary dependencies on multiple interfaces for operations that are at the same level of granularity
• Ability to collaborate, learn and compete
  • An end-to-end solution with a common interface
  • Additional capabilities that add value to the solution
Lessons Learnt
• Software vs. Specification
  • A software implementation lets us engage and collaborate as we define standards (unless everyone wants to invest in software development from the start)
• Make both the user and vendor/developer happy
  • Users should be happy and confident enough to share requirements and to demand the standards from vendors/developers
  • Vendors/developers know it’s a real thing that can be implemented around their existing products or software
The scope (from GFS Architecture)
• A single interface
• Protocols
  • A hybrid of XML and byte-level protocol
  • XML – command channel of operations
  • Byte-level – data movement
• Possible Functionalities
  • File namespace and file operations (read, write, …)
  • Metadata operations (user-defined metadata, search)
  • Data Grid Language for policy, rules, etc.
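
To make the hybrid protocol idea concrete, here is a small sketch under stated assumptions: commands travel as XML documents and file contents travel as length-prefixed raw bytes. The element name `dgms-command`, the `operation` attribute and the 4-byte length prefix are all invented for illustration; the slides do not define a wire format.

```python
import struct
import xml.etree.ElementTree as ET


def build_command(operation: str, logical_path: str) -> bytes:
    """Encode a command as a small XML document (element names are assumptions)."""
    root = ET.Element("dgms-command", {"operation": operation})
    ET.SubElement(root, "path").text = logical_path
    return ET.tostring(root, encoding="utf-8")


def parse_command(message: bytes) -> tuple[str, str]:
    """Decode the XML command back into (operation, logical path)."""
    root = ET.fromstring(message)
    return root.get("operation"), root.findtext("path")


def frame_data(payload: bytes) -> bytes:
    """Prefix raw bytes with a 4-byte big-endian length for the data channel."""
    return struct.pack("!I", len(payload)) + payload


def unframe_data(frame: bytes) -> bytes:
    """Strip the length prefix and return exactly that many payload bytes."""
    (length,) = struct.unpack("!I", frame[:4])
    return frame[4:4 + length]


# Command channel: human-readable, easy to extend with new operations.
command = build_command("read", "/grid/study/scan001.img")
print(parse_command(command))

# Data channel: opaque bytes, no XML escaping or parsing overhead.
frame = frame_data(b"\x00\x01raw-file-bytes")
print(unframe_data(frame))
```

Keeping bulk data out of XML avoids encoding overhead on large transfers, while the command channel stays easy to inspect and evolve.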
What could be the right high level picture?
[Figure: a DGMS connected through XML-command protocol and byte-level data protocol links; labels: "Object-transfer", "Facilitate SOA"]
What could be the right high level picture?
[Figure: three DGMS servers connected through XML-command protocol and byte-level data protocol links]
User perspective
[Figure labels: logical resources; multiple replicas; users from different organizations; user-defined metadata for data discovery; "Secret Recipe"]
So what will we be doing (products?)
• Definition
  • Concept (data grid namespace, resource-namespace…)
  • Initial functionalities (DGMS operations to be targeted)
  • Namespace (Files, Metadata, Resource, Policy rules)
• XML protocol
  • XML-handshake and message transfer between DGMS-client and DGMS-server
• Most importantly…
  • Software as a common framework for the evolution, adoption and growth of the standard and DGMS concepts
So how will we do it? (process)
• Community-based open design (OPEN FORUM)
  • Design discussions as a community
  • Code through multiple parties to make sure we keep the vendor/developer community and user community engaged
• Community-based open standard (OPEN STDS)
  • Specs written using wiki and other mechanisms
  • Community-based spec for OGF
  • Interoperability workshops, and workshops held along with other relevant organizations such as SNIA or DMTF
How can you get started?
• Initial requirements
  • Can you delete email? (Sign up for our mailing list)
  • Got bandwidth and a browser? (Visit our group page)
  • Can you scream or shout or smile? (Join our WG sessions)
• Are you a user or consumer or researcher?
  • Tell us what is needed
  • What should be there for you to put this open source software/standard in production?
• Are you a vendor/developer?
  • Have your engineer or developer talk to us (we will convert them to a DGMS developer or DGMS guru)
  • We are developing an open standard: take advantage of it and develop a value-added solution around it
When do we get started?
• Right now (hmm... we did a long time back)
• Conference calls every other week
  • Mostly Wednesdays
  • Attend by phone call, Skype or Polycom video conference (anything you like)
  • Discussions influencing design requirements
• Face-to-face meeting
  • Once every quarter (planned), OGF sessions
Suggestions, comments, critiques
• TO DO
  • Standard operations based on policies/rules
  • Take advantage of OGF standards wherever possible
  • Other commercial or magic tools could be used below the standard
• NOT TO DO
Conclusions
• Data Grids
• Data Grid Management Systems (DGMS)
• Strong user need in academic and non-academic settings
• Need for standards framed by the Grid File System WG
• Software-included Spec Strategy