CLAs Reconstruction and Analysis
Physics Data Processing with an SOA-Based Framework
Vardan Gyurjyan, on behalf of the CLAS12 software group
Outline
Problem statement
SOA-based framework as a solution
Current status of the ClaRA project
Future plans
Conclusion
April 19, 2023 V. Gyurjyan
Computing power
CMOS Technology
Single Chip Integration
Node/Rack Integration
Network Integration
Integration
Chip: 4 cores
Compute Card: 1 chip, 13.6 GF/s
Node Card: 32 compute cards, 435 GF/s
Rack: 32 node cards, 13.9 TF/s
IBM Blue Gene: 72 racks, 1 PF/s
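The integration figures on this slide multiply through level by level. The short sketch below checks that arithmetic; the constant and function names are illustrative and do not come from the slides.

```python
# Peak-performance arithmetic for the packaging hierarchy on the slide.
# All numbers are taken from the slide; the names are illustrative only.
CORES_PER_CHIP = 4
GF_PER_COMPUTE_CARD = 13.6        # GF/s, one chip per compute card
COMPUTE_CARDS_PER_NODE_CARD = 32
NODE_CARDS_PER_RACK = 32
RACKS_PER_SYSTEM = 72


def peak_gflops():
    """Return the peak GF/s at each integration level."""
    per_core = GF_PER_COMPUTE_CARD / CORES_PER_CHIP
    node_card = GF_PER_COMPUTE_CARD * COMPUTE_CARDS_PER_NODE_CARD
    rack = node_card * NODE_CARDS_PER_RACK
    system = rack * RACKS_PER_SYSTEM
    return {"core": per_core, "node_card": node_card, "rack": rack, "system": system}


levels = peak_gflops()
print(f"node card: {levels['node_card']:.1f} GF/s")
print(f"rack:      {levels['rack'] / 1e3:.1f} TF/s")
print(f"system:    {levels['system'] / 1e6:.2f} PF/s")
```

The rounded values agree with the slide: about 435 GF/s per node card, 13.9 TF/s per rack, and 1 PF/s for 72 racks.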
Network Evolution
[Figure: normalized growth since 1980 (logarithmic axis, 1 to 1,000,000) versus year, 1980 to 2001]
User Traffic: 2x / 12 months
Router Capacity: 2.2x / 18 months
Moore’s Law: 2x / 18 months
Network Capacity: 2x / 7 months
CAT4: 10 Mbps, 10base-T
CAT5: 100 Mbps, 100base-T
CAT5e: 1 Gbps, 1000base-T
2003: CAT6, 10 Gbps
2007: CAT7, 100 Gbps
High Performance Computing Trends
1. Exponential growth in processor performance (coming to an end)
2. Power cost = System cost: invention required
3. Growth in level of parallelism (near term solution)
IBM Approach – Path to Petascale
Multiple modest cores on a single chip rather than one high-performance processor.
Watts/FLOP will not improve much from future technologies.
Linux environment and MPI (standard messaging interface).
Specifics of the Offline Software
Lifetime of the software >= lifetime of the experiment.
Collaborative nature of the development.
Coexistence of applications running in parallel for a single experiment.
Unprecedented scale and complexity of the physics computing environment.
The physics computing environment must keep up with fast-growing computing technologies.
Large worldwide user base.
PDP (Physics Data Processing) Application
Conventional vs. parallel/distributed
Running Conventional Software Application
[Flowchart. Steps: Copy/checkout, Configure, Compile, Fix errors, Run. Decision: Modified? Branch labels: yes, no, ok. Outcomes: Give up, Complain.]
Programming Errors
Compile time:
Program does not compile.
Compiler reports a “best guess” of the problem.
Undeclared variables or functions
Missing semicolon or brace
Typos
Missing files or libraries
Type ambiguities
Run time:
Executable crashes or has unexpected behavior.
May not appear for all conditions or all data sets.
Uninitialized variables
Memory errors
Numeric errors
Type errors in print statements
Closing a NULL file pointer
Accessing a NULL pointer
Variables out of scope
Challenges of the Conventional Approach
Difficult to organize and coordinate activities.
Difficult to maintain.
Inevitable fragmentation of the software.
Poor scalability.
Computing skills are required to use physics data processing applications.
[Diagram: groups A, B, and C, shown for CLAS 6 and CLAS 12]
A+B << C
C: requires few or no programming skills
Where do we start?
Each bite is a clear, simple, single-purpose application, developed by a group B member.
Group A, in tight collaboration with groups B and C, shall control and manage the process, never losing maniacal focus on the big picture (the elephant).
“Things should be made as simple as possible, but not simpler.”
Albert Einstein
Language and Architecture Evolution
Assembly language
Structured and procedural programming
Object-oriented programming
Service-oriented programming
SOA
SOA promotes the goal of separating service users from the service implementation.
Style of building reliable systems that deliver functionality as services.
Loose coupling between interacting services.
Directories and addressing mechanisms are at the center of SOA.
Program: arbitrary format in, arbitrary format out (complex)
Service: standard format in, standard format out (specialized, simple)
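The role of directories and addressing can be illustrated with a minimal sketch. It assumes a message is a plain dictionary and that services are looked up by name; `ServiceDirectory`, `square_service`, and the name `math/square` are hypothetical and are not part of ClaRA or cMsg.

```python
from typing import Callable, Dict

# Assumption: a message is a plain dictionary. The actual ClaRA message
# format is not shown in the slides.
Message = Dict[str, object]
Service = Callable[[Message], Message]


class ServiceDirectory:
    """Hypothetical directory that maps service names to endpoints."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def register(self, name: str, service: Service) -> None:
        self._services[name] = service

    def lookup(self, name: str) -> Service:
        try:
            return self._services[name]
        except KeyError:
            raise LookupError(f"no service registered under {name!r}") from None


def square_service(request: Message) -> Message:
    """Toy service: squares the number stored under the 'value' key."""
    value = float(request["value"])
    return {"status": "ok", "value": value * value}


directory = ServiceDirectory()
directory.register("math/square", square_service)

# The consumer knows only the service name and the message format,
# not where or how the service is implemented.
reply = directory.lookup("math/square")({"value": 3})
print(reply)
```

Because the consumer holds only a name, the provider can move or replace the implementation without any change on the consumer side.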
Attributes of Services
Well defined, easy-to-use, somewhat standardized interface.
Self-contained with no visible dependencies to other services (almost).
Always available but idle until requests come.
Location transparency.
Easily accessible and readily usable, no “integration” required.
New services can be offered by combining existing services.
Quantifiable quality of service.
Service Interface
Standard message based.
Highly polymorphic.
Intent is enough.
Implementation can be changed in ways that do not break all the service consumers.
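As a rough illustration of a message-based interface whose implementation can change without breaking consumers, the sketch below defines two interchangeable implementations of one toy service. The service, its message keys (`x`, `y`, `status`, `distance`), and the function names are assumptions made for this example.

```python
import math
from typing import Callable, Dict

Message = Dict[str, object]  # assumed message shape: a plain dictionary


def distance_service_v1(request: Message) -> Message:
    """First implementation: square root of the summed squares."""
    x, y = request["x"], request["y"]
    return {"status": "ok", "distance": (x * x + y * y) ** 0.5}


def distance_service_v2(request: Message) -> Message:
    """Replacement implementation; same request and reply keys."""
    return {"status": "ok", "distance": math.hypot(request["x"], request["y"])}


def consumer(service: Callable[[Message], Message], x: float, y: float) -> float:
    """Consumer code depends only on the message contract."""
    reply = service({"x": x, "y": y})
    if reply["status"] != "ok":
        raise RuntimeError("service reported a failure")
    return reply["distance"]


old = consumer(distance_service_v1, 3.0, 4.0)
new = consumer(distance_service_v2, 3.0, 4.0)
print(old, new)
```

The consumer function is identical for both versions, which is the property the slide refers to.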
Service Orientation is scalable
End users can consume and combine a lot of services since they don’t have to know or “learn” how the services are made.
Service providers (A+B) can offer their services to a lot more consumers by optimizing:
The user interface
Access
Implementations
“On Demand” Physics Data Processing
Use software as you need it.
Much lower setup time; forget about:
Installation
Implementation
Training
Maintenance
Scalable and effective usage of resources.
Parallelism (CPU, Storage, Bandwidth…)
What is ClaRA?
Framework that implements SOA.
Service development environment.
Toolbox of generic physics data processing services.
Network distributed platform.
The “Glue”, binding together services into an algorithmic data analysis application.
Design criteria
Framework services shall be simple to use and easy to learn.
Framework services should be customizable, so that they can adapt to different data processing tasks.
The framework shall provide context-sensitive help and assistance, with many real-world physics data processing application examples.
The framework shall provide ready-to-use services, encapsulating essential functionalities of the physics data processing system.
Services shall be reusable and easily replaceable.
Physics data processing application design and implementation shall require few or no programming skills.
Neither a specific computing environment nor compiling shall be necessary to build and run a physics data processing application.
The framework shall provide a graphical environment for physics data processing application development.
The framework's platform shall be network distributed and shall have temporal continuity.
The new system shall provide World Wide Web access to the services for remote configuration and execution of the data processing applications.
The necessary security considerations must be addressed.
Data and Algorithm
Framework advocates clear separation between:
a) data and algorithm
b) transient and persistent data
Methods in the data object will be limited to manipulations of the internal data members only.
Algorithm will process one type of data and generate data objects of a different type.
[Diagram: Data -> Algorithm -> Data]
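The separation of data and algorithm can be sketched as follows. The data classes `HitCollection` and `Centroid` and the function `centroid_algorithm` are hypothetical stand-ins, chosen only to show a data object whose methods touch its own members and an algorithm that maps one data type to another.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HitCollection:
    """Input data object: its methods manipulate only its own members."""
    hits: List[Tuple[float, float]]  # (x, y) positions; units left unspecified

    def count(self) -> int:
        return len(self.hits)


@dataclass
class Centroid:
    """Output data object, of a different type than the input."""
    x: float
    y: float


def centroid_algorithm(data: HitCollection) -> Centroid:
    """Algorithm: consumes one data type and generates another."""
    n = data.count()
    if n == 0:
        raise ValueError("empty hit collection")
    sum_x = sum(hit[0] for hit in data.hits)
    sum_y = sum(hit[1] for hit in data.hits)
    return Centroid(sum_x / n, sum_y / n)


result = centroid_algorithm(HitCollection([(0.0, 0.0), (2.0, 4.0)]))
print(result)
```

Keeping the computation out of the data classes means the algorithm can be replaced without touching the data definitions.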
Persistent and Transient Data
Physics algorithm objects should not use data objects directly in the persistent storage.
Transient data storage as a means of communication between physics algorithms.
Two different optimization criteria for applications using persistent and transient data.
Being independent from the persistent storage technology.
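One way to picture these points is a transient store that algorithms use to exchange data, with a separate converter as the only component that knows the persistent format. This is a sketch under assumptions: `TransientStore`, `JsonFileConverter`, and `calibrate` are hypothetical names, and JSON stands in for whatever storage technology is actually used.

```python
import json
import os
import tempfile
from typing import Dict


class TransientStore:
    """Hypothetical in-memory store used by algorithms to exchange data."""

    def __init__(self) -> None:
        self._objects: Dict[str, object] = {}

    def record(self, key: str, obj: object) -> None:
        self._objects[key] = obj

    def retrieve(self, key: str) -> object:
        return self._objects[key]


class JsonFileConverter:
    """Only this class knows the persistent format (JSON, an arbitrary choice)."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self, store: TransientStore, key: str) -> None:
        with open(self.path) as handle:
            store.record(key, json.load(handle))

    def save(self, store: TransientStore, key: str) -> None:
        with open(self.path, "w") as handle:
            json.dump(store.retrieve(key), handle)


def calibrate(store: TransientStore) -> None:
    """Physics algorithm: reads and writes the transient store only."""
    raw = store.retrieve("raw_adc")
    store.record("calibrated", [0.5 * value for value in raw])


with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "raw.json")
    out_path = os.path.join(tmp, "calibrated.json")
    with open(in_path, "w") as handle:
        json.dump([2, 4, 6], handle)

    store = TransientStore()
    JsonFileConverter(in_path).load(store, "raw_adc")
    calibrate(store)
    JsonFileConverter(out_path).save(store, "calibrated")

    with open(out_path) as handle:
        saved = json.load(handle)

print(saved)
```

Swapping the storage technology would mean replacing the converter class only; `calibrate` would stay unchanged.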
Current Status
Geometry Service
Magnetic Field Map Service
GEMC Service
Tracking Service
bCNU Service
Event Data Service
ClaRA cMsg Platform
Thin Clients
WWW
ClaRA WebServices Platform
Math Service
Stat Service
Probability Service
Geometry Service
Matrices Service
Examples
EVIO event producer and EVIO event consumer services (C++).
data producer and data consumer services. C examples use cMsg payload (ASCII).
C++ geometry service client example
Java geometry service client example
Web services JSP clients
Tracking composite application
Transient data
Space-point maker
Coarse track finder
Cluster Analyzer
Ambiguity solver
Track fitter
Histogram builder
Persistent data
ClaRA cMsg Platform
Thin Clients
Tracking application service decomposition
[Diagram: tracking state machine. A Supervisor starts the services; the services retrieve data from, and record data to, Transient Storage.]
Data objects: DetectorData, EvtData, StatData, TransientEvtData, TransientDetData, TransientStatData
Data stages: Raw Data, Space Points, Track candidates, Resolved Tracks, Final Tracks
Services: SpacePointFormation, CoarseTrackFinder, SeedMaker, VertexFinder, ClusterAnalyzer, AmbiguitySolver, TrackFitter, TrackScoring
Supervisor actions: start
Service actions on Transient Storage: retrieve, record
Tracking State machine
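The supervisor pattern in this diagram can be sketched as a small state machine. Service and data names are borrowed from the slide, but which service reads and writes which data stage is an assumption, the transformations are toy placeholders rather than tracking code, and the `Supervisor` and `TransientStorage` classes are hypothetical.

```python
from typing import Callable, List, Tuple


class TransientStorage:
    """Hypothetical shared store with the two actions named on the slide."""

    def __init__(self) -> None:
        self._data = {}

    def record(self, key: str, value: object) -> None:
        self._data[key] = value

    def retrieve(self, key: str) -> object:
        return self._data[key]


# Each stage: (service name, input key, output key, toy transformation).
# The pairing of services with data stages is assumed for illustration.
Stage = Tuple[str, str, str, Callable[[list], list]]

STAGES: List[Stage] = [
    ("SpacePointFormation", "Raw Data", "Space Points",
     lambda raw: [(v, v) for v in raw]),
    ("CoarseTrackFinder", "Space Points", "Track candidates",
     lambda pts: [pts[i:i + 2] for i in range(0, len(pts), 2)]),
    ("AmbiguitySolver", "Track candidates", "Resolved Tracks",
     lambda cands: [c for c in cands if len(c) == 2]),
    ("TrackFitter", "Resolved Tracks", "Final Tracks",
     lambda tracks: [{"n_points": len(t)} for t in tracks]),
]


class Supervisor:
    """Starts each service in turn; the state is the service being run."""

    def __init__(self, storage: TransientStorage, stages: List[Stage]) -> None:
        self.storage = storage
        self.stages = stages
        self.state = "idle"
        self.history: List[str] = []

    def run(self) -> None:
        for name, source, target, work in self.stages:
            self.state = name                       # "start" the service
            data = self.storage.retrieve(source)    # service retrieves input
            self.storage.record(target, work(data))  # service records output
            self.history.append(name)
        self.state = "done"


storage = TransientStorage()
storage.record("Raw Data", [1, 2, 3, 4, 5])
supervisor = Supervisor(storage, STAGES)
supervisor.run()
final_tracks = storage.retrieve("Final Tracks")
print(supervisor.history, final_tracks)
```

Services never call each other directly in this sketch; they communicate only through the transient storage, so the supervisor can reorder or replace stages by editing the stage list.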