
Introduction on R-GMA

Shi Jingyan

Computing Center, IHEP

Content

R-GMA

R-GMA concept

R-GMA components

Accounting system

Relational Grid Monitoring Architecture

-- introduction

Models the information infrastructure of a Grid as a set of Consumers (who request information), Producers (who provide information) and a single Registry (which mediates the communication between producers and consumers).

Imposes a standard query language (a subset of SQL): a producer publishes tuples with INSERT statements; a consumer queries tuples with SELECT statements.

All tuples carry a time-stamp to support monitoring.
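The publish/query split can be illustrated with a small sketch. This is not the R-GMA API: an in-memory SQLite database stands in for one virtual table, and the table name cpuLoad, its columns and the helper names publish and query are assumptions made for illustration only.

```python
import sqlite3
import time

# Assumption: an in-memory SQLite database stands in for one R-GMA virtual table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpuLoad (site TEXT, host TEXT, loadAvg REAL, MeasurementTime REAL)"
)

def publish(site, host, load_avg):
    """Producer side: publish one tuple with INSERT and attach a time-stamp."""
    conn.execute(
        "INSERT INTO cpuLoad (site, host, loadAvg, MeasurementTime) VALUES (?, ?, ?, ?)",
        (site, host, load_avg, time.time()),
    )

def query(sql, params=()):
    """Consumer side: request tuples with SELECT."""
    return conn.execute(sql, params).fetchall()

publish("IHEP", "wn01", 0.42)
publish("IHEP", "wn02", 1.30)

rows = query(
    "SELECT host, loadAvg FROM cpuLoad WHERE site = ? ORDER BY host", ("IHEP",)
)
print(rows)
```

In R-GMA itself the tuples would be held by producers rather than in one local database; the sketch only shows the INSERT/SELECT division of roles and the time-stamp on every tuple.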

R-GMA Introduction (cont.)

Architecture:

R-GMA Introduction (cont.)

The information resources of a VO are presented as a single virtual database containing a set of virtual tables.

A single schema contains the name and structure of each virtual table in the system.

A single registry contains a list, for each table, of the producers that have offered to publish rows for the table.

A consumer runs an SQL query against a table, and the registry selects the best producers to answer the query in a process called mediation. The consumer then contacts each producer directly, combines the information, and returns a set of tuples.

The mediation process is hidden from the user. There is no central repository holding the contents of the virtual table.
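A minimal sketch of this flow, assuming plain Python objects in place of the real services. The class and method names (Producer, Registry, Consumer, mediate, select) are hypothetical, and this toy registry simply returns every registered producer rather than choosing the best ones.

```python
from collections import defaultdict

class Producer:
    """Holds the rows it has published, per virtual table."""
    def __init__(self, name):
        self.name = name
        self.rows = defaultdict(list)   # table name -> list of row dicts

    def insert(self, table, row):
        self.rows[table].append(row)

    def answer(self, table, predicate):
        return [r for r in self.rows[table] if predicate(r)]

class Registry:
    """Knows, for each table, which producers offered to publish rows."""
    def __init__(self):
        self.offers = defaultdict(list)

    def register(self, table, producer):
        self.offers[table].append(producer)

    def mediate(self, table):
        # Simplification: every registered producer is selected.
        return list(self.offers[table])

class Consumer:
    """Asks the registry who to contact, then contacts each producer directly."""
    def __init__(self, registry):
        self.registry = registry

    def select(self, table, predicate=lambda row: True):
        results = []
        for producer in self.registry.mediate(table):
            results.extend(producer.answer(table, predicate))
        return results

registry = Registry()
site_a, site_b = Producer("siteA"), Producer("siteB")
registry.register("cpuLoad", site_a)
registry.register("cpuLoad", site_b)
site_a.insert("cpuLoad", {"host": "wn01", "load": 0.4})
site_b.insert("cpuLoad", {"host": "wn02", "load": 1.7})

consumer = Consumer(registry)
busy = consumer.select("cpuLoad", lambda r: r["load"] > 1.0)
print(busy)
```

The rows stay with the producers; the registry stores only who publishes to which table, which matches the point that no central repository holds the table contents.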

R-GMA Introduction (cont.)

Producers:

Primary producer: the user's code periodically inserts tuples, which are then stored internally by the producer. The producer answers consumer queries from its own storage.

Secondary producer: populates its own storage by running its own query against the virtual table. The user code only sets the process running; the tuples come from other producers.

On-demand producer: has no internal storage; data is provided by the user code in direct response to a query forwarded to it by the producer service.
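The difference between the three producer types is mainly where the tuples live. The sketch below uses hypothetical class names and a simplified refresh step for the secondary producer (the real service would receive tuples through its own query against the virtual table); it is meant only to contrast the storage behaviour.

```python
class PrimaryProducer:
    """User code inserts tuples; queries are answered from internal storage."""
    def __init__(self):
        self.storage = []

    def insert(self, row):
        self.storage.append(row)

    def answer(self, predicate):
        return [r for r in self.storage if predicate(r)]

class SecondaryProducer:
    """Fills its own storage from other producers; user code only starts it."""
    def __init__(self, sources, predicate=lambda r: True):
        self.sources = sources
        self.predicate = predicate
        self.storage = []

    def refresh(self):
        # Simplification: pull once from each source instead of streaming.
        self.storage = [r for s in self.sources for r in s.answer(self.predicate)]

    def answer(self, predicate):
        return [r for r in self.storage if predicate(r)]

class OnDemandProducer:
    """No storage: user code is called each time a query arrives."""
    def __init__(self, callback):
        self.callback = callback

    def answer(self, predicate):
        return [r for r in self.callback() if predicate(r)]

everything = lambda r: True

primary = PrimaryProducer()
primary.insert({"host": "wn01", "load": 0.4})

archive = SecondaryProducer([primary])
archive.refresh()

on_demand = OnDemandProducer(lambda: [{"host": "wn09", "load": 2.0}])

print(archive.answer(everything))
print(on_demand.answer(everything))
```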

R-GMA Introduction (cont.)

Consumer: each consumer represents a single SQL SELECT query on the virtual database and obtains the answer tuples from the producers after mediation.

Mediation: the query is first passed to the Registry to identify which producers, for each virtual table in the query, must be contacted to answer it.

R-GMA Introduction (cont.)

Types of query:

Continuous query: all new tuples matching the query are streamed into the consumer's tuple storage as soon as they are inserted into the virtual table by the producers.

One-time queries:

History-query: all versions of any matching tuples are returned.

Latest-query: only the tuples representing the "current state" are returned.

Static query: a database-like query; the results do not contain R-GMA time-stamps.
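The continuous, history and latest query types can be contrasted with a toy in-memory table. The VirtualTable class, its method names and the idea of a per-row key identifying "the same tuple" are assumptions made for this sketch, not R-GMA definitions.

```python
class VirtualTable:
    def __init__(self):
        self.rows = []          # every published version, in insertion order
        self.subscribers = []   # (predicate, callback) pairs for continuous queries

    def insert(self, key, value, timestamp):
        row = {"key": key, "value": value, "timestamp": timestamp}
        self.rows.append(row)
        # Continuous query: matching tuples are pushed as soon as they arrive.
        for predicate, callback in self.subscribers:
            if predicate(row):
                callback(row)

    def continuous(self, predicate, callback):
        self.subscribers.append((predicate, callback))

    def history(self, predicate):
        # History-query: all versions of matching tuples.
        return [r for r in self.rows if predicate(r)]

    def latest(self, predicate):
        # Latest-query: only the newest version per key ("current state").
        newest = {}
        for r in self.rows:
            current = newest.get(r["key"])
            if predicate(r) and (current is None or r["timestamp"] >= current["timestamp"]):
                newest[r["key"]] = r
        return list(newest.values())

table = VirtualTable()
streamed = []
table.continuous(lambda r: r["value"] > 0.5, streamed.append)

table.insert("wn01", 0.4, timestamp=1)
table.insert("wn01", 0.9, timestamp=2)
table.insert("wn02", 0.1, timestamp=3)

anything = lambda r: True
print(len(table.history(anything)))
print([(r["key"], r["value"]) for r in table.latest(anything)])
print([(r["key"], r["value"]) for r in streamed])
```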

R-GMA Introduction (cont.)

Retention Periods:

LatestRetentionPeriod: inserted into each tuple published by a Primary Producer; it remains there when the tuple is re-published by a Secondary Producer.

HistoryRetentionPeriod: Producers declare a HistoryRetentionPeriod for each table to which they publish tuples.

A latest-query returns only those tuples which have not exceeded their LatestRetentionPeriod for the table. A history-query returns all versions of tuples which have not exceeded the producer's HistoryRetentionPeriod for the table.
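As a rough illustration of how the two retention periods filter results, the sketch below keeps the LatestRetentionPeriod on each tuple and passes the HistoryRetentionPeriod as a per-table setting of one producer. Field names, the use of seconds, and the function names are assumptions for this example.

```python
def latest_query(rows, now):
    """Newest version per key, kept only while inside its LatestRetentionPeriod."""
    newest = {}
    for r in rows:
        current = newest.get(r["key"])
        if current is None or r["timestamp"] > current["timestamp"]:
            newest[r["key"]] = r
    return [r for r in newest.values()
            if now - r["timestamp"] <= r["latest_retention"]]

def history_query(rows, now, history_retention):
    """All versions still inside the producer's HistoryRetentionPeriod."""
    return [r for r in rows if now - r["timestamp"] <= history_retention]

# Times are in seconds; each tuple carries its own LatestRetentionPeriod.
rows = [
    {"key": "wn01", "value": 0.4, "timestamp": 0,  "latest_retention": 30},
    {"key": "wn02", "value": 0.1, "timestamp": 50, "latest_retention": 30},
    {"key": "wn01", "value": 0.9, "timestamp": 90, "latest_retention": 30},
]

current_state = latest_query(rows, now=100)
recent_history = history_query(rows, now=100, history_retention=60)

print([(r["key"], r["value"]) for r in current_state])
print([(r["key"], r["timestamp"]) for r in recent_history])
```

At time 100 the wn02 tuple is 50 seconds old, so it has left the latest view but is still within the 60 second history window.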

R-GMA Introduction (cont.)

Web Service Architecture:

R-GMA conforms to the Web Services Architecture.

6 principal services: Primary producer, Secondary producer, On-demand producer, Consumer, Registry and Schema.

Each service has one WSDL document.

Messages are used to communicate with the services.

Message sequence and format are also specified in WSDL.

R-GMA Introduction (cont.)

R-GMA uses "SOAP messaging over HTTP/S" in a request/response pattern.
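To make the message format concrete, here is a sketch of a SOAP 1.1 request envelope built and read back with Python's standard library. The service namespace urn:example:rgma, the operation name insert and its statement parameter are invented for illustration; the real operation names and message layout are whatever each service's WSDL document defines. No network call is made.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "urn:example:rgma"                            # hypothetical service namespace

def build_request(operation, **params):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def parse_request(xml_text):
    """Service side: recover the operation name and parameters from the body."""
    envelope = ET.fromstring(xml_text)
    op = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    name = op.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return name, params

request = build_request(
    "insert", statement="INSERT INTO cpuLoad (host, loadAvg) VALUES ('wn01', 0.42)"
)
name, params = parse_request(request)
print(name, params)
```

In a request/response pattern this XML text would be the body of an HTTP POST, and the service would reply with a second envelope carrying the result or a fault.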

Apel: accounting in LCG-2

The Apel software is composed of the Apel Log Processor and the Flexible Archiver.

Apel Log Processor: parses log files to extract job information and publishes it using R-GMA.

Flexible Archiver: located at the Grid Operation Center (GOC). It receives the data for the accounting table from all sites participating in the R-GMA configuration, so it contains an amalgamation of the accounting data from every site.

Apel: accounting in LCG-2 (cont.)

Apel Log Processor

The Apel Log Processor parses the GateKeeper and PBS event logs generated at a site. The extracted data is pieced together to form an accounting record detailing the owner of a submitted job and the resources used to execute the job.

Accounting records are then published using R-GMA and collated into a centralised repository at the GOC using an R-GMA Secondary Producer.

Apel Log Processor (cont.)

Parsed log files:

/var/log/globus-gatekeeper.log

/var/log/messages

/var/spool/pbs/server_priv/accounting
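As an illustration of the parsing step for the PBS accounting file, the sketch below assumes the usual semicolon-separated record layout (date and time; record type; job id; space-separated key=value attributes) and that attribute values contain no spaces. The sample line, host names and function names are made up for the example, and this is not the Apel parser itself.

```python
def hms_to_seconds(hms):
    """Convert an HH:MM:SS duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def parse_pbs_record(line):
    """Split one accounting line into its header fields and attribute dict."""
    timestamp, record_type, job_id, message = line.rstrip("\n").split(";", 3)
    attrs = dict(item.split("=", 1) for item in message.split() if "=" in item)
    return {"timestamp": timestamp, "type": record_type,
            "job_id": job_id, "attrs": attrs}

# Illustrative record only; not taken from a real site log.
SAMPLE = (
    "03/10/2005 12:30:00;E;1234.ce.example.org;"
    "user=alice group=atlas jobname=sim01 queue=short "
    "start=1110454260 end=1110457800 exec_host=wn01/0 Exit_status=0 "
    "resources_used.cput=00:55:10 resources_used.walltime=00:59:00"
)

record = parse_pbs_record(SAMPLE)
cpu_seconds = None
if record["type"] == "E":   # "E" marks a job-end record, which carries usage data
    cpu_seconds = hms_to_seconds(record["attrs"]["resources_used.cput"])

print(record["job_id"], record["attrs"]["user"], cpu_seconds)
```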

Tables used in Apel:

EventRecords

GkRecords

MessageRecords

SpecRecords

LcgRecords (published)

Flexible Archiver

Examples – Two Servlets

The first one:

Provides a web page as the user interface.

Creates a consumer to show statistical information from the accounting data for the date the user provides.

Example – Two servlets (cont.)

The second example:

Creates a primary producer to publish statistical information about the accounting data, which can be queried from the browser servlet provided by the R-GMA software package.

IHEP Accounting plan

PBS log file: /var/spool/pbs/server_priv/accounting

A Perl program analyses the log file to generate the DB data.

A Java program uses a producer to publish the necessary accounting info by joining the DB data.

The R-GMA server has the registry function to maintain the virtual table.

Summary accounting info is produced with respect to each user.

+-------------------------+-------------+------+-----+---------+-------+
| Field                   | Type        | Null | Key | Default | Extra |
+-------------------------+-------------+------+-----+---------+-------+
| theDate                 | date        | YES  |     | NULL    |       |
| eventID                 | varchar(60) | YES  |     | NULL    |       |
| siteName                | varchar(30) | YES  |     | NULL    |       |
| localUser               | varchar(20) | YES  |     | NULL    |       |
| localGroup              | varchar(20) | YES  |     | NULL    |       |
| jobName                 | varchar(30) | YES  |     | NULL    |       |
| queueName               | varchar(20) | YES  |     | NULL    |       |
| jobCreateTime           | varchar(10) | YES  |     | NULL    |       |
| jobQueuedTime           | varchar(10) | YES  |     | NULL    |       |
| jobEligibleTime         | varchar(10) | YES  |     | NULL    |       |
| startTime               | varchar(10) | YES  |     | NULL    |       |
| endTime                 | varchar(10) | YES  |     | NULL    |       |
| execHOST                | varchar(30) | YES  |     | NULL    |       |
| resource_List_cput      | time        | YES  |     | NULL    |       |
| resource_List_neednodes | varchar(30) | YES  |     | NULL    |       |
| sessionID               | int(10)     | YES  |     | NULL    |       |
| exitStatus              | int(2)      | YES  |     | NULL    |       |
| resources_Used_cput     | time        | YES  |     | NULL    |       |
| resources_Used_mem      | int(16)     | YES  |     | NULL    |       |
| resources_Used_vmem     | int(16)     | YES  |     | NULL    |       |
| resources_Used_walltime | time        | YES  |     | NULL    |       |
+-------------------------+-------------+------+-----+---------+-------+
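A per-user summary over rows of this shape could be computed roughly as follows. The sketch uses only the localUser, resources_Used_cput and resources_Used_walltime fields from the table above; the sample rows, the function names and the choice to report seconds are assumptions for illustration.

```python
from collections import defaultdict

def to_seconds(hms):
    """Convert an HH:MM:SS time value to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def summarize_by_user(records):
    """Count jobs and total CPU and wall-clock time for each local user."""
    summary = defaultdict(lambda: {"jobs": 0, "cput_s": 0, "walltime_s": 0})
    for rec in records:
        entry = summary[rec["localUser"]]
        entry["jobs"] += 1
        entry["cput_s"] += to_seconds(rec["resources_Used_cput"])
        entry["walltime_s"] += to_seconds(rec["resources_Used_walltime"])
    return dict(summary)

# Illustrative rows only.
records = [
    {"localUser": "alice", "resources_Used_cput": "00:10:00",
     "resources_Used_walltime": "00:12:00"},
    {"localUser": "bob", "resources_Used_cput": "00:00:45",
     "resources_Used_walltime": "00:01:00"},
    {"localUser": "alice", "resources_Used_cput": "01:00:30",
     "resources_Used_walltime": "01:05:00"},
]

summary = summarize_by_user(records)
for user, totals in sorted(summary.items()):
    print(user, totals)
```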