Federated Hierarchical Filter Grids
• STTR-funded project with Indiana, Caltech and Deep Web Technologies
• A Grid infrastructure for Data Analysis
• Integrates with the LHC Tiered Computing Model
• Directly supports general Scientific Analysis
• In the HEP case, the Gridlet is instantiated as a Rootlet
Fig. 1: The FHFG Architecture, composed of Information Service (IS) Gridlets managed by general Grid system services, with a portlet-based portal user interface.
Labels in the figure:
• IS Gridlets
• Search, Planning, Construction, Management
• Portal, Presentation
• Federation, Macroscopic Workflow
• Session Management
• General System Services: Messaging/Data transport, Notification, Security, Fault Tolerance, Metadata, Directory, Collaboration, Replica Management
[Figure: Grids of services exchanging SOAP Messages. Legend: SS = Sensor Service, FS = Filter Service, OS = Other Service, MD = MetaData. Other labels: Database, Portal, Another Service, Another Grid. Stages shown: Raw Data → Data → Information → Knowledge → Wisdom → Decisions.]
Filter Grids
Three Features:
• Information services present data through traditional interfaces
• Filters that accept data between these interfaces, transform and re-present
• Streaming connections between all services:
– High performance
– Archiving
– Security
– Fault tolerance
– Narada Brokering
Fig. 2: FHFG is built from Information resources wrapped as Web Services and Basic Filters that either transform or aggregate Information. Information Services and Filters support identical Service Interfaces.
• IS = Information Service, wrapping an Information Resource:
– Receive Request/Select
– Get Status
– Multi-Resolution Data Get
• BFS = Basic Filter Service, wrapping a Filter Resource:
– Receive Request/Select
– Get Status
– Multi-Resolution Data Get
– Issue Request/Select
– Request Status
– Multi-Resolution Data Put
Filter Grids are built from Information resources wrapped as Web Services and Basic Filters that either transform or aggregate Information. Information Services and Filters support identical Service Interfaces.
• Information Resource (IS = Information Service): Request/Select, Status, MultiResolution Get
• Filter Resource (BFS = Basic Filter Service): Request/Select, Status, MultiResolution Get, MultiResolution Put, Issue Queries
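The idea that Information Services and Filters expose identical interfaces, so that filters can be stacked freely, can be illustrated with a minimal Python sketch. All class and method names here (`Service`, `select`, `status`, `get`) are assumptions made for illustration and are not the FHFG Web Service definitions. "Multi-resolution" is modelled crudely as returning every n-th record, and the filter-side Put operation is not modelled.

```python
from typing import Callable, List, Protocol


class Service(Protocol):
    """Interface shared by Information Services and Filters (hypothetical names)."""

    def select(self, query: Callable[[float], bool]) -> None: ...
    def status(self) -> str: ...
    def get(self, resolution: int) -> List[float]: ...


class InformationService:
    """Wraps an information resource, here just a list of numbers."""

    def __init__(self, records: List[float]) -> None:
        self._records = list(records)
        self._selected = list(records)

    def select(self, query: Callable[[float], bool]) -> None:
        self._selected = [r for r in self._records if query(r)]

    def status(self) -> str:
        return f"{len(self._selected)} records selected"

    def get(self, resolution: int) -> List[float]:
        # resolution 1 = full detail, n = every n-th selected record
        return self._selected[::resolution]


class TransformFilter:
    """Filter that transforms the data of one upstream service."""

    def __init__(self, upstream: Service, transform: Callable[[float], float]) -> None:
        self._upstream = upstream
        self._transform = transform

    def select(self, query: Callable[[float], bool]) -> None:
        self._upstream.select(query)  # issue the request/select upstream

    def status(self) -> str:
        return self._upstream.status()

    def get(self, resolution: int) -> List[float]:
        return [self._transform(x) for x in self._upstream.get(resolution)]


class AggregateFilter:
    """Filter that aggregates the data of several upstream services."""

    def __init__(self, upstreams: List[Service]) -> None:
        self._upstreams = list(upstreams)

    def select(self, query: Callable[[float], bool]) -> None:
        for upstream in self._upstreams:
            upstream.select(query)

    def status(self) -> str:
        return "; ".join(u.status() for u in self._upstreams)

    def get(self, resolution: int) -> List[float]:
        merged: List[float] = []
        for upstream in self._upstreams:
            merged.extend(upstream.get(resolution))
        return merged


# Two information services, aggregated and then transformed.
a = InformationService([1.0, 2.0, 3.0, 4.0])
b = InformationService([10.0, 20.0])
grid = TransformFilter(AggregateFilter([a, b]), lambda x: x * 2)
grid.select(lambda x: x > 1.5)
result = grid.get(1)
```

Because a filter satisfies the same interface it consumes, a client cannot tell whether it is talking to a raw information service or to a stack of filters.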
HEP Event Analysis using Filter Grids
Analysis tool of choice is Root. Typical analysis activity is:
– Loading many files containing event data
– Passing each event through a selection filter
– Subjecting each selected event to a set of algorithms
– Creating summary information in the form of histograms/tables/files
Analysis starts with small event samples and is then applied to much larger samples. Frequently these are remotely located in the Grid.
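The typical analysis activity listed above can be sketched as a small pipeline. This is plain Python rather than Root, and everything in it is an assumption chosen for illustration: the `Event` record, the selection cut, the "summed energy of the two leading photons" algorithm, and the in-memory "files".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass
class Event:
    photon_energies: List[float]  # hypothetical field, in GeV


def load_events(files: Dict[str, List[Event]]) -> Iterator[Event]:
    """Step 1: load many files containing event data."""
    for name in sorted(files):
        yield from files[name]


def passes_selection(event: Event, min_energy: float = 20.0) -> bool:
    """Step 2: selection filter, two leading photons above a threshold."""
    leading = sorted(event.photon_energies, reverse=True)[:2]
    return len(leading) == 2 and all(e >= min_energy for e in leading)


def leading_pair_energy(event: Event) -> float:
    """Step 3: an algorithm applied to each selected event."""
    return sum(sorted(event.photon_energies, reverse=True)[:2])


def fill_histogram(values: Iterable[float], low: float, high: float, nbins: int) -> List[int]:
    """Step 4: summary information as a fixed-width histogram."""
    counts = [0] * nbins
    width = (high - low) / nbins
    for value in values:
        if low <= value < high:
            counts[int((value - low) / width)] += 1
    return counts


files = {
    "a.root": [Event([30.0, 40.0]), Event([5.0, 50.0])],
    "b.root": [Event([60.0, 70.0, 10.0]), Event([25.0])],
}
values = [leading_pair_energy(e) for e in load_events(files) if passes_selection(e)]
histogram = fill_histogram(values, low=0.0, high=200.0, nbins=4)
```

The same four functions apply unchanged whether `files` holds a small local sample or a much larger one, which is the property the Filter Grid relies on when the data is remote.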
Our HEP implementation is a Filter Grid consisting of Clarens-hosted “Rootlets”.
Each Rootlet is a full instance of the Root application, but limited in scope:
– The user’s Root loads a Clarens plug-in
– The Clarens interface to the Dataset Location Service allows a list of remote datasets to be generated
– The client contacts each remote Grid node, connects to the Clarens server there, and instantiates a Rootlet
– The user’s analysis selection code is passed over the network to the Rootlet
– The list of event data files is passed to the Rootlet
– The Rootlet executes, and terminates
– The output histograms/tables/files are then made available via the Clarens server, and fetched, aggregated and processed as required
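The scatter and gather pattern in the steps above can be simulated in a few lines. This is a sketch under stated assumptions, not the Clarens or Root API: `CATALOG` stands in for the Dataset Location Service, `run_rootlet` stands in for a remote Rootlet and runs in the local process, the selection code is a Python callable instead of .C/.h files, and aggregation is taken to be bin-wise addition of histograms.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical catalogue: node name -> file name -> per-event values.
CATALOG: Dict[str, Dict[str, List[float]]] = {
    "tier2-a": {"run1.root": [12.0, 48.0, 95.0], "run2.root": [51.0]},
    "tier2-b": {"run3.root": [7.0, 55.0, 60.0]},
}


def locate_datasets() -> Dict[str, List[str]]:
    """Stand-in for the Dataset Location Service: files available per node."""
    return {node: sorted(files) for node, files in CATALOG.items()}


def run_rootlet(node: str, file_names: List[str],
                selection: Callable[[float], bool],
                nbins: int = 4, low: float = 0.0, high: float = 100.0) -> List[int]:
    """Stand-in for a remote Rootlet: select events and histogram them."""
    counts = [0] * nbins
    width = (high - low) / nbins
    for name in file_names:
        for value in CATALOG[node][name]:
            if selection(value) and low <= value < high:
                counts[int((value - low) / width)] += 1
    return counts


def merge(histograms: List[List[int]]) -> List[int]:
    """Aggregate partial histograms by adding them bin by bin."""
    return [sum(bins) for bins in zip(*histograms)]


def analyse(selection: Callable[[float], bool]) -> List[int]:
    """Contact every node, run one Rootlet per node, merge the outputs."""
    datasets = locate_datasets()
    with ThreadPoolExecutor() as pool:
        partial = list(pool.map(
            lambda item: run_rootlet(item[0], item[1], selection),
            datasets.items(),
        ))
    return merge(partial)


total = analyse(lambda value: value > 10.0)
```

Only the small partial histograms travel back to the client; the event data stays on the node that holds it.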
• Physicist at Tier3 using Root on GBytes of ntuples
• Loads Clarens Root plugin. Connects to Clarens at Tier2. Sends analysis code (.C/.h files).
• Clarens creates Rootlet, passes it .C/.h files
• Rootlet runs analysis code on TBytes of ntuples, creating high statistics output data
• Root at Tier3 receives and plots data
Rootlets
Root embedded in a Clarens server
[Figure: Root at Tier3, with GBytes of Root nTuples and the Clarens Plugin, connected via XML/RPC to a Rootlet at Tier2 with ~10 TBytes of Root nTuples. Analysis.C and Analysis.h are sent from Tier3 to Tier2.]
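To make the XML/RPC link between Tier3 and Tier2 concrete, here is a minimal sketch using Python's standard `xmlrpc` modules. It is not the Clarens API: the method name `rootlet.run`, its arguments, and its reply are assumptions for illustration, and the "server" runs locally and only reports what it received instead of running Root.

```python
import threading
from typing import Dict, List
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def run_rootlet(sources: Dict[str, str], data_files: List[str]) -> dict:
    """Hypothetical server-side handler standing in for a Rootlet.

    A real Rootlet would run the analysis code with Root; this stub only
    acknowledges the source files and counts the data files it was given.
    """
    return {"sources": sorted(sources), "files_processed": len(data_files)}


def demo() -> dict:
    # "Tier2" side: a local XML-RPC server on an OS-assigned free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(run_rootlet, "rootlet.run")
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # "Tier3" side: send the analysis code and the list of data files.
        with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
            sources = {
                "Analysis.C": "void Analysis() {}",
                "Analysis.h": "// declarations",
            }
            return proxy.rootlet.run(sources, ["ntuple1.root", "ntuple2.root"])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` performs one round trip: the two source files and the file list go out as XML-RPC parameters, and a small summary structure comes back.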
Higgs diphoton Analysis using Rootlets