WP3
The status of the EU DataGrid's R-GMA system
Steve Fisher / RAL
24/4/2003

Who we are
• Heriot-Watt, Edinburgh
– Andrew Cooke, Werner Nutt
• IBM-UK
– James Magowan, (Manfred Oevers), Paul Taylor
• INFN
– Roberto Barbera, Giuseppe Save, Gennaro Tortone
• Queen Mary, University of London
– Roney Cordenonsi, (Ari Datta)
• CCLRC/PPARC
– Rob Byrom, Laurence Field, Steve Hicks, Manish Soni, Antony Wilson, (Xiaomei Zhu), Jason Leake
– Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton
• SZTAKI, Hungary
– Peter Kacsuk, Norbert Podhorszki
• Trinity College Dublin
– Brian Coghlan, Stuart Kenny, David O’Callaghan, (John Ryan)

GMA (Grid Monitoring Architecture)
• From GGF
• Very simple model
• Does not define:
– Data model
– How data are moved from Producer to Consumer
– What registry looks like
[Diagram: the Producer stores its location in the Registry; the Consumer looks up that location in the Registry, then executes a query on, or streams data from, the Producer]

R-GMA
• Use the GMA from GGF
• A relational implementation
– Powerful data model and query language
• Applied to both information and monitoring
• Creates the impression that you have one RDBMS per VO
[Diagram: GMA components Producer, Consumer and Registry, with arrows labelled "store location", "lookup location" and "execute or stream"]

Relational Data Model
• Not a general distributed RDBMS, but a way to use the relational model in a distributed environment where global consistency is not important.
• Producers
– announce: SQL “CREATE TABLE”
– publish: SQL “INSERT”
• Consumers
– collect: SQL “SELECT”
• Some producers, the Registry and Schema make use of RDBMS as appropriate – but what is central is the relational model.
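
The three SQL roles above can be tried out locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for a single producer's store. It is not the R-GMA API, and the cpuLoad table and column names are assumptions borrowed from the CPULoad example in this talk.

```python
import sqlite3

# Stand-in for one producer's local store (assumption: plain SQLite, not R-GMA).
conn = sqlite3.connect(":memory:")

# Producer "announces" the table it will publish into.
conn.execute(
    "CREATE TABLE cpuLoad (country TEXT, site TEXT, facility TEXT, "
    "load REAL, timestamp TEXT)"
)

# Producer "publishes" tuples.
rows = [
    ("UK", "RAL", "CDF", 0.3, "19055711022002"),
    ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
]
conn.executemany("INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)", rows)

# Consumer "collects" with an ordinary SELECT.
result = conn.execute(
    "SELECT facility, load FROM cpuLoad WHERE site = 'RAL' ORDER BY facility"
).fetchall()
print(result)
```

In R-GMA the same statements would be routed through the Producer and Consumer APIs rather than sent to a single database.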

Producer Consumer
• Consumer can issue one-off queries
– Similar to normal database query
• Consumer can also start a continuous query
– Requests all data published which matches the query
– Can be seen as an alert mechanism
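
The difference between the two query styles can be sketched in a few lines of Python. StreamSketch and its method names are hypothetical and are not the R-GMA API. A one-off query scans what has already been published, while a continuous query registers a predicate and is called back for every matching tuple published afterwards.

```python
class StreamSketch:
    """Hypothetical in-memory producer, for illustration only."""

    def __init__(self):
        self.rows = []         # tuples published so far
        self.subscribers = []  # (predicate, callback) pairs

    def one_off(self, predicate):
        # Evaluate once against the data already held.
        return [r for r in self.rows if predicate(r)]

    def continuous(self, predicate, callback):
        # Remember the query; it fires on future publishes only.
        self.subscribers.append((predicate, callback))

    def publish(self, row):
        self.rows.append(row)
        for predicate, callback in self.subscribers:
            if predicate(row):
                callback(row)


producer = StreamSketch()
alerts = []
# Continuous query used as an alert: tell me whenever load exceeds 1.0.
producer.continuous(lambda r: r["load"] > 1.0, alerts.append)

producer.publish({"site": "RAL", "facility": "CDF", "load": 0.3})
producer.publish({"site": "RAL", "facility": "ATLAS", "load": 1.6})

snapshot = producer.one_off(lambda r: r["site"] == "RAL")
print(len(snapshot), alerts)
```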

Registry choices
• Decided early to keep the Registry and the Schema separate
• In fact they have different requirements for distribution/replication
• Each implemented with one RDBMS per instance
[Diagram: two separate components: Registry (of Producers and Consumers) and Schema (descriptions of tables)]

Virtual RDBMS
• Creates the impression that you have one RDBMS per VO
– This makes it very easy to use
– 1 integrated system
– 1 query language
• Users like it
• But how will it fit in with Grid Services?

Producers
• DataBaseProducer
– Supports History Queries
– Information not lost
– Supports joins
– Clean up strategy
• StreamProducer
– Supports Continuous Queries
– In memory data structure
– Can define minimum retention period
• ResilientStreamProducer
– Supports Continuous Queries
– Like the StreamProducer but won’t lose data if system crashes
– So slightly slower
• LatestProducer
– Supports Latest Queries
– Just holds the latest information for any “primaryish” key
– Supports joins
• CanonicalProducer
– Supports anything
– Offers anything as relations
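
To make the difference between history and latest storage concrete, here is a minimal Python sketch. HistorySketch and LatestSketch are hypothetical names, not the R-GMA classes. The first keeps every tuple, and the second keeps one tuple per key and overwrites it on each insert.

```python
class HistorySketch:
    """Keeps every published tuple (the DataBaseProducer idea)."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def query(self):
        return list(self.rows)


class LatestSketch:
    """Keeps only the most recent tuple per key (the LatestProducer idea)."""

    def __init__(self, key_columns):
        self.key_columns = key_columns
        self.latest = {}

    def insert(self, row):
        key = tuple(row[c] for c in self.key_columns)
        self.latest[key] = row  # overwrite any older tuple with the same key

    def query(self):
        return list(self.latest.values())


history = HistorySketch()
latest = LatestSketch(key_columns=("site", "facility"))
for load in (0.3, 0.7):
    row = {"site": "RAL", "facility": "CDF", "load": load}
    history.insert(row)
    latest.insert(row)

print(len(history.query()), latest.query())
```

The choice of ("site", "facility") as the key is an assumption made for this example.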

Archiver (Re-publisher)
• It is a combined Consumer-Producer
• You just have to tell it what to collect and it does so on your behalf
• Re-publishes to any kind of “Insertable” (i.e. not to the CanonicalProducer)

Canonical Producer
• Allows user defined code to be invoked to respond to SQL query
• Developed in collaboration with CrossGrid
[Diagram: components CPAPI, User Code, CanonicalProducerServlet and Files; labels CreateTable, Port, Protocol, Security, SQL Support, Multiple Query Support, Insert, Query, Register]

Functionality - mediator
• Queries posed against a virtual database
• The Mediator must:
– find the right Producers
– combine information from them
• Hidden component – but vital to R-GMA
• Can now merge information from several producers
• The final mediator will take “any” SQL statement and do the right thing
See Werner Nutt’s talk

Topologies
• Normally publish via SP
• Archivers instantiated with a Producer and a Predicate
• Must avoid cycles in the graph
[Diagram: four SP nodes and four Archiver (A) nodes; two Archivers are paired with an SP, one with an LP and one with an HP]
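
The "no cycles" rule can be checked mechanically. The sketch below is one way to do it, assuming the topology is given as a list of (source, destination) re-publishing links. The node names are made up for the example, and nothing here is taken from the R-GMA code.

```python
def has_cycle(edges):
    """Return True if the directed graph given by (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge: we returned to our own path
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


# Hypothetical topology: two stream producers feed an archiver,
# which feeds a second archiver.
links = [("SP1", "A1"), ("SP2", "A1"), ("A1", "A2")]
ok = has_cycle(links)
# Re-publishing A2's output back into A1 would loop the data forever.
bad = has_cycle(links + [("A2", "A1")])
print(ok, bad)
```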

Schema & Contributions
CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002
CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CPULoad (Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002

Contributions are Views
CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'
CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'
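
Because each producer declares its contribution as a view, a mediator can use those declarations to decide which producers to contact. The following Python sketch illustrates the idea for equality predicates only. The data layout and the function names relevant and mediate are assumptions for this example, not the actual Mediator algorithm.

```python
# Each producer declares its view as fixed column values and holds matching rows.
producers = [
    {
        "view": {"country": "UK", "site": "RAL"},
        "rows": [
            {"country": "UK", "site": "RAL", "facility": "CDF", "load": 0.3},
            {"country": "UK", "site": "RAL", "facility": "ATLAS", "load": 1.6},
        ],
    },
    {
        "view": {"country": "UK", "site": "GLA"},
        "rows": [
            {"country": "UK", "site": "GLA", "facility": "CDF", "load": 0.4},
            {"country": "UK", "site": "GLA", "facility": "ALICE", "load": 0.5},
        ],
    },
]


def relevant(view, query):
    # Skip a producer only if its view pins a column to a different value.
    return all(view.get(col, val) == val for col, val in query.items())


def mediate(producers, query):
    """Contact only relevant producers and merge their matching rows."""
    contacted, merged = 0, []
    for p in producers:
        if relevant(p["view"], query):
            contacted += 1
            merged.extend(
                r for r in p["rows"] if all(r[c] == v for c, v in query.items())
            )
    return contacted, merged


# WHERE site = 'GLA': only Producer 2 needs to be asked.
n_gla, gla_rows = mediate(producers, {"site": "GLA"})
# WHERE facility = 'CDF': no view rules a producer out, so both are asked.
n_cdf, cdf_rows = mediate(producers, {"facility": "CDF"})
print(n_gla, len(gla_rows), n_cdf, [r["site"] for r in cdf_rows])
```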

R-GMA Tools
• R-GMA CLI
– Command Line Interface (similar to MySQL)
– Supports single query and interactive modes
– Can perform simple operations with Consumers, Producers and Archivers
• R-GMA Browser
– JSP application dynamically generating web pages
– Supports pre-defined and user-defined queries
• Pulse
– R-GMA Java client-based GUI
– Supports streaming and simple graphical displays

GIN and GOUT (Gadget IN and Gadget OUT)
[Diagram: on the input side, LDAPInfoProviders feed GIN, which publishes into R-GMA (GLUE Schema) through CircularBuffer Producers. On the output side, GOUT uses the ConsumerAPI with Consumers for CE, SE and SiteInfo, together with an Archiver, a DataBase Producer and an RDBMS, and an LDAPServer]

R-GMA – How?
• Currently based on servlet technology
– Behind every API there is a Servlet
– Multiple hand crafted APIs: Java, C++, C, Python and Perl
– Tomcat
– Soft state registration
– Uniform exception handling, to ensure that useful messages and stack traces are preserved

OGSIfication
• Have recently started the migration to web and grid services
– Apache Axis
– WSDL-generated APIs
– Will provide a wrapper for backwards compatibility

OGSIfied R-GMA
• All Grid Services
• OGSA Factories, GSH, GSR
• Registry includes HandleMapper
• SQL as Service Data Element Query Language
[Diagram: Sensor with ProducerAPI; Application with ConsumerAPI; ProducerFactory and ProducerInstance; ConsumerFactory and ConsumerInstance; Registry; Schema]

OGSIfication issues
• Consider XML as internal representation of service data elements
– Depends on other developments
• Consider XQuery as service data elements query language
– Depends on how XQuery develops
• X-GMA ??
– Will this be distinguishable from what is in GT3?

Resilience - Registry
• Will have one logical registry and schema per VO
• Each logical registry will have multiple physical “copies”
• Each entry in registry has 3 possible states
• Transmit new records and deleted records and checksum after records deleted locally
• Self healing even supports new registry instances
• Consumer uses any instance
• Fail over mechanism not yet implemented
• Schema more tricky
[Diagram: Producer1, Producer2 and three registry instances. Each of Registry1, Registry2 and Registry3 holds the info it masters plus copies of the info from the other two]
See poster

Soft-state Registration and the Registry
• Registry records existence of Producers and Consumers
• Registry holds last contact time and ‘expiry’ time
• Producers and Consumers periodically refresh their time stamps
• Producer and Consumer servlets avoid unnecessary traffic to Registry
• Scheduled removal of entries that have timed out
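
The soft-state scheme above amounts to "refresh or be forgotten". A minimal Python sketch follows. SoftStateRegistry, its method names and the 60-second lifetime are assumptions for illustration, and times are passed in explicitly so the behaviour is easy to follow.

```python
class SoftStateRegistry:
    """Hypothetical registry: entries vanish unless refreshed in time."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.entries = {}  # name -> (last_contact, expiry)

    def refresh(self, name, now):
        # Registering and refreshing are the same operation in soft state.
        self.entries[name] = (now, now + self.lifetime)

    def purge(self, now):
        # Scheduled clean-up: drop everything whose expiry time has passed.
        expired = [n for n, (_, expiry) in self.entries.items() if expiry <= now]
        for name in expired:
            del self.entries[name]
        return expired

    def lookup(self):
        return sorted(self.entries)


registry = SoftStateRegistry(lifetime=60)
registry.refresh("Producer1", now=0)
registry.refresh("Producer2", now=0)

registry.refresh("Producer1", now=50)  # Producer1 keeps in touch
removed = registry.purge(now=70)       # Producer2 has gone silent
print(removed, registry.lookup())
```

A crashed producer needs no explicit de-registration, since its entry simply times out.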

Resilience Testing
• Taking 7 components
– Schema
– 2 registry instances
– Producer API
– Consumer API
– Producer Servlet with other APIs
– Consumer Servlet with other APIs
• Consider each component in turn
– Break the network and bring it back
– Close the component down and bring it back
– Crash the component and bring it back
• Will also consider real life scenarios

Performance
• By design:
– Very flexible - to avoid bottlenecks
– Powerful queries allow a single query to be made
• Performance and Optimisation
– Will use NetLogger and profiling tools to identify possible bottlenecks
• Internally not high speed because of XML etc

Summary
• R-GMA is a combined Grid information and monitoring system
• Supports notion of Virtual Database
• Recently deployed in the EDG development testbed
• Now focusing on reliability, stability and performance
Thanks to the EU and our national funding agencies for their support of this work
http://hepunx.rl.ac.uk/edg/wp3/

And finally GGF8…
• RGIS-RG
– Two short sessions will be held:
  • Session 1: Database use cases and best practices in the grid environment (outside the traditional data areas)
    » Using databases to store application metadata
    » Using databases to store monitoring information
    » Using databases as a grid registry
    » Creating grid registries for locating relational and XML databases
  • Session 2: Data discovery in the grid environment
– We will also discuss our milestones and future directions (e.g. should we include XML as well as relational models?)
– See http://hepunx.rl.ac.uk/ggf/rgis-rg
• A GMA BOF is planned for GGF8