
Page 1: FairRoot Status and plans

FairRoot

Status and plans

Mohammad Al-Turany

6/25/13 M. Al-Turany, ALICE Offline Meeting 1

Page 2: FairRoot Status and plans

What is the FairRoot framework, and why is it needed?

http://fairroot.gsi.de

• Simulation, reconstruction, and analysis framework
• Started in 2003 as a two-person project for the CBM experiment at FAIR
• Long list of ready-to-use modules and base classes needed by particle physics experiments

Page 3: FairRoot Status and plans

Current hot topics in FairRoot

• Database interface
  o Re-design of the database interface based on TSQLServer

• ZeroMQ integration
  o Use of ZeroMQ as a communication layer

• Building, testing and quality assurance systems
  o Coverage tests, quality tests and unit tests

• Online monitoring
  o For test beams and detector prototypes

• GPU support and integration

• Time-based simulation

Page 4: FairRoot Status and plans

FairRoot Developers

A long list of people have contributed pieces of code to FairRoot since the project started at the end of 2003.

Core team:
• Mohammad Al-Turany (IT)
• Denis Bertini (IT)
• Florian Uhlig (CBM / IT)
• Radek Karabowicz (PANDA / IT)
• Dmytro Kresan (R3B / IT)
• Tobias Stockmanns (PANDA, FZJ)

People who contributed to major features:
• Ilse König (HADES)
• Volker Friese (CBM)
• Olaf Hartman (PANDA)

Students:
• Dennis Klein (finished 02.2013)
• Alexey Rybalchenko (EE)

Page 5: FairRoot Status and plans

FairRoot Group at the GSI

• Mohammad Al-Turany (IT)

• Denis Bertini (IT)

• Radoslaw Karabowicz (IT/PANDA)

• Dmytro Kresan (IT/R3B)

• Anar Manafov (IT)

• Alexey Rybalchenko (Master Student)

• Yago Gonzalez Rozas (Guest scientist)

• Florian Uhlig (IT/CBM)

• N.N. (Sep.2013) (IT)


Page 6: FairRoot Status and plans

Design

[Figure: FairRoot design, from Florian Uhlig, ROOT Users Workshop, Saas Fee, 13.03.13. Blocks shown:
ROOT: TEve, ROOT IO, TGeo, TVirtualMC, Cint, TTree, Proof, …
Geant3, Geant4, Geant4_VMC, VGM, libraries
FairRoot: Run Manager, IO Manager, Runtime DB, DB Interface, Event Display, MC Application, Module, Detector, Task, Magnetic Field, Event Generator
Experiment frameworks: CbmRoot, PandaRoot, AsyEosRoot, R3BRoot, SofiaRoot, MPDRoot, FopiRoot, EICRoot]

Page 7: FairRoot Status and plans

FairRoot: Timeline

[Figure: timeline with year markers 2004, 2006, 2010, 2011, 2012 and 2013. Milestones shown:]
• Start of testing the VMC concept for CBM
• First release of CbmRoot
• MPD (NICA) also starts using FairRoot
• ASYEOS joined (ASYEOSRoot)
• GEM-TPC separated from the PANDA branch (FOPIRoot)
• Panda decided to join → FairRoot: same base package for different experiments
• R3B joined
• EIC (Electron Ion Collider, BNL): EICRoot
• SOFIA (Studies On Fission with Aladin)
• ENSAR-ROOT: collection of modules used by structural nuclear physics experiments

Page 8: FairRoot Status and plans

Database Re-Design


Page 9: FairRoot Status and plans

Database in FairRoot: the real database is completely hidden from the user and the software developer

• The runtime database is not a database in the classical sense, but a parameter manager.
• It knows the I/Os defined by the user and all parameter containers needed for the actual analysis and/or simulation.
• It manages the automatic initialization and saving of the parameter containers.
• After all initialization, the complete list of runs and related parameter versions is saved either to a database (Oracle, MySQL, …) or to ROOT files.

Page 10: FairRoot Status and plans

FairRoot DB Design (Old)

[Figure: inside FairRoot, the Run Manager sits above the RunTime Database and the IO Manager. The RunTime Database is connected to an ASCII file (configuration parameters), a ROOT file (configuration parameters) and Oracle. The IO Manager is connected to a ROOT file (MC points, digits, etc.).]

Page 11: FairRoot Status and plans

FairRoot DB extended

[Figure: inside FairRoot, the Run Manager sits above the RunTime Database and the IO Manager. The RunTime Database is connected to an ASCII file (configuration parameters), a ROOT file (configuration parameters) and a DB Interface, which uses TSQLServer to reach Oracle, PostgreSQL and MySQL. The IO Manager is connected to a ROOT file (MC points, digits, etc.).]

Page 12: FairRoot Status and plans

Re-design of the database interface based on the ROOT Database Connectivity (RDBC) API, which provides a uniform interface to Oracle, MySQL and PgSQL

• Database interface in FairRoot using TSQLServer
  o (MySQL, Oracle, PostgreSQL, ...)

• Allows multiple connections to DBs at runtime

• Adds version management
  o Data type: real and/or MC
  o Detector type
  o Date and time range

• Reduces SQL coding
  o Simple predefined table
  o Only simple SQL used
  o Ultimately generic container

• Handles write/read access

Page 13: FairRoot Status and plans

Version Management

• It must be possible to get a consistent set of information for any date (e.g. the start time of a certain run).

• It must be possible to answer the question: 'Which parameters were used when analyzing this run X years ago?' (The calibration might have been optimized several times since that date, and some bugs may have been detected and corrected in the meantime.)

[Figure: validity time ranges (UTC) per detector and version, with entries STS CAL, MVD CAL and MVD TEMP drawn against a time axis on which a RunID is marked at time t.]

Page 14: FairRoot Status and plans

Version Management: the query process

1. The context (timestamp, detector, version) is the primary key.
2. The context is converted to a unique SeqNo.
3. The SeqNo is used as the key to access all rows in the main table.
4. The system gives the user access to all such rows.

[Figure (D. Bertini): the context is matched in the auxiliary validity table (SeqNo, validity frame) to a SeqNo, here 900001020, which selects the rows of the main table (SeqNo, Col 1 … Col n).]

Bigtable: A Distributed Storage System for Structured Data, Google Inc., OSDI 2006
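The four query steps above can be sketched with an in-memory SQLite database. This is only an illustration of the two-step lookup, not the FairRoot schema: the table names `validity` and `main`, their columns, the toy values and the helper `query` are all assumptions made for the example.

```python
import sqlite3

def build_demo_db():
    """Create a toy auxiliary validity table and main table (assumed layout)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE validity (seqno INTEGER PRIMARY KEY, detector TEXT,
                               version INTEGER, t_start INTEGER, t_end INTEGER);
        CREATE TABLE main (seqno INTEGER, channel INTEGER, gain REAL);
    """)
    db.executemany("INSERT INTO validity VALUES (?, ?, ?, ?, ?)", [
        (900001020, "STS", 1, 100, 200),
        (900001021, "STS", 2, 100, 200),
        (900001022, "MVD", 1, 150, 300),
    ])
    db.executemany("INSERT INTO main VALUES (?, ?, ?)", [
        (900001020, 0, 1.00), (900001020, 1, 1.10),
        (900001021, 0, 1.02), (900001021, 1, 1.08),
        (900001022, 0, 0.95),
    ])
    return db

def query(db, timestamp, detector, version):
    """Context -> SeqNo (validity table) -> all matching rows (main table)."""
    row = db.execute(
        "SELECT seqno FROM validity WHERE detector = ? AND version = ? "
        "AND t_start <= ? AND ? < t_end",
        (detector, version, timestamp, timestamp)).fetchone()
    if row is None:
        return []  # no parameter set is valid for this context
    return db.execute(
        "SELECT channel, gain FROM main WHERE seqno = ? ORDER BY channel",
        row).fetchall()
```

Calling `query(db, 120, "STS", 1)` and `query(db, 120, "STS", 2)` on the demo database returns two different parameter sets for the same timestamp, which is the property needed to reproduce an analysis done with an older calibration.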

Page 15: FairRoot Status and plans

New Data transfer layer for FairRoot


Page 16: FairRoot Status and plans

The online reconstruction and analysis

[Figure: two online data-reduction chains. Labels: 300 GB/s, 20M Evt/s → < 1 GB/s, 25K Evt/s; and 1 TB/s → 1 GB/s. Each chain is marked "> 60 000 CPU-core or equivalent GPU, FPGA, …".]

We have the fastest algorithms, but:
• How to distribute the processes?
• How to manage the data flow?
• How to recover processes when they crash?
• How to monitor the whole system?
• …

Page 17: FairRoot Status and plans

Design constraints

• Highly flexible:
  o Different data paths should be modeled.

• Adaptive:
  o Sub-systems are continuously under development and improvement.

• Should work for simulated and real data:
  o Developing and debugging the algorithms.

• It should support all possible hardware where the algorithms could run (CPU, GPU, FPGA).

• It has to scale to any size, with minimal or ideally no effort.

Page 18: FairRoot Status and plans

Data transport

• How to handle dynamic components, i.e. pieces that go away temporarily?

• How to handle messages that we can't deliver immediately, particularly if we're waiting for a component to come back online?

• What if we need to use a different network transport, say multicast instead of TCP unicast, or IPv6? Do we need to rewrite the applications, or is the transport abstracted in some layer?

Page 19: FairRoot Status and plans

Before Re-inventing the Wheel

• What is available on the market and in the community?
  o A very promising package: ZeroMQ, which has been available for two years

• Do we intend to separate online and offline? NO

• Multi-threaded concept, or multiple processes based on message queues?
  o Message-based systems allow us to decouple producers from consumers.
  o We can spread the work to be done over several processes and machines.
  o We can manage/upgrade/move around programs (processes) independently of each other.

Page 20: FairRoot Status and plans

ØMQ (zeromq)

• A socket library that acts as a concurrency framework.

• Carries messages across inproc, IPC, TCP, and multicast.

• Connect N-to-N via fanout, pubsub, pipeline, request-reply.

• Asynch I/O for scalable multicore message-passing apps.

• 30+ languages including C, C++, Java, .NET, Python.

• Most OS’s including Linux, Windows, OS X, PPC405/PPC440.

• Large and active open source community.

• LGPL free software with full commercial support from iMatix.


Page 21: FairRoot Status and plans

What does it deliver?

• It handles I/O asynchronously, in background threads.
  o These communicate with application threads using lock-free data structures.
  o Concurrent ØMQ applications need no locks, semaphores, or other wait states.

• Components can come and go dynamically, and ØMQ will automatically reconnect.
  o You can start components in any order.
  o You can create "service-oriented architectures" (SOAs) where services can join and leave the network at any time.

• When a queue is full, ØMQ
  o automatically blocks senders, or
  o throws away messages, depending on the kind of messaging you are doing (the so-called "pattern").
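The two full-queue behaviours can be illustrated without ZeroMQ itself, using a bounded queue from the Python standard library as a stand-in for a socket's send buffer. The function names `send_blocking` and `send_lossy` are hypothetical; the sketch only mimics the semantics described above and says nothing about how ØMQ implements them.

```python
import queue

def send_blocking(q, msg, timeout=None):
    """Pipeline-style send: wait until the consumer frees a slot."""
    q.put(msg, block=True, timeout=timeout)

def send_lossy(q, msg):
    """Publish-style send: never block; drop the message if the queue is full."""
    try:
        q.put_nowait(msg)
        return True
    except queue.Full:
        return False

def demo():
    """Push four messages into a two-slot buffer that nobody drains."""
    q = queue.Queue(maxsize=2)   # plays the role of a bounded send buffer
    delivered = [send_lossy(q, i) for i in range(4)]
    return delivered, q.qsize()
```

In `demo()` the first two messages are queued and the last two are dropped. The same four calls through `send_blocking` would stall on the third message until a consumer removed one.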

Page 22: FairRoot Status and plans

What does it deliver?

• It does not impose any format on messages.
  o They are blobs from zero bytes to gigabytes in size.
  o You can use any other product on top to represent your data (Google's Protocol Buffers, etc.).

• Applications talk to each other over arbitrary transports: TCP, multicast, in-process, inter-process.
  o You don't need to change your code to use a different transport.

Page 23: FairRoot Status and plans

The built-in core ØMQ patterns are:

• Request-reply, which connects a set of clients to a set of services (remote procedure call and task distribution pattern).

• Publish-subscribe, which connects a set of publishers to a set of subscribers (data distribution pattern).

• Pipeline, which connects nodes in a fan-out / fan-in pattern that can have multiple steps and loops (parallel task distribution and collection pattern).

• Exclusive pair, which connects two sockets exclusively.

Page 24: FairRoot Status and plans

Current Status

• The framework delivers components that can be connected to each other in order to optimize the data flow topology.

• All components share a common base called Device (ZeroMQ class).

• Devices are grouped into three categories:
  o Source: Sampler
  o Message-based processor: Sink, BalancedStandaloneSplitter, StandaloneMerger, Buffer
  o Content-based processor: Processor

Page 25: FairRoot Status and plans

Panda Example

[Figure: class diagram of the FairMQ package for the Panda example, distinguishing experiment/detector-specific code from framework classes that can be used directly.]

Page 26: FairRoot Status and plans

Example for Panda online reconstruction hierarchy (scenario)

[Figure: components shown are Detector Simulation, Computing Unit, MVD pixel data, MVD strip data, two Clusterers, a Tracker, a Parameter database, Log Aggregate and Log Writer. The data and parameter links are labelled with REQ/REP and PUB/SUB socket pairs; the logging links use XPUB/XSUB sockets.]

Page 27: FairRoot Status and plans

Correct semantics for logging

• Pub/Sub sockets

• Never block

• Lossy! (if needed)

• Buffer sizes / locations configurable

• Arbitrary message size


Page 28: FairRoot Status and plans

Results

• A throughput of 940 Mbit/s was measured, which is very close to the theoretical limit of TCP/IPv4 over Gigabit Ethernet.

• The throughput of the named-pipe transport between two devices on one node was measured at around 1.7 GB/s.

Each message consists of the digits of one Panda event for one detector, with a size of a few kBytes.
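As a rough cross-check of the Gigabit Ethernet figure, the best-case TCP payload rate can be estimated from per-frame overheads. The sketch below assumes standard framing that the slides do not state: a 1500-byte MTU, no VLAN tag, no jumbo frames, and 20-byte IPv4 and TCP headers. The function name and parameters are illustrative only.

```python
def tcp_goodput_mbit(link_mbit=1000.0, mtu=1500, ip_header=20, tcp_header=20,
                     tcp_options=0):
    """Best-case TCP payload rate on Ethernet, ignoring ACK traffic and losses."""
    # Bytes on the wire per frame in addition to the MTU-sized IP packet:
    # preamble + SFD (8), Ethernet header (14), FCS (4), inter-frame gap (12).
    ethernet_overhead = 8 + 14 + 4 + 12
    payload = mtu - ip_header - tcp_header - tcp_options
    return link_mbit * payload / (mtu + ethernet_overhead)
```

Under these assumptions the limit is about 949 Mbit/s, or about 941 Mbit/s when the 12-byte TCP timestamp option is enabled (`tcp_options=12`), so a measured 940 Mbit/s leaves very little headroom.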

Page 29: FairRoot Status and plans

Payload in MByte/s as a function of message size

[Figure: payload throughput (y axis, 0 to 1400 MByte/s) versus message size (x axis, 128 Byte to 512 kByte, doubling at each step) for two links: 10 Gbit and 56 Gbit IB.]

ZeroMQ works on InfiniBand, but using IP over IB.

Page 30: FairRoot Status and plans

Integrating the existing software: ZeroMQ and ROOT (event loop)

[Figure: on the ROOT event-loop side, FairRunAna and FairRootManager read ROOT files, Lmd files, a remote event server, … and drive FairTasks through Init(), Re-Init(), Exec() and Finish(). On the ZeroMQ side, a FairMQProcessorTask exposes the same Init(), Re-Init(), Exec() and Finish() methods.]

Page 31: FairRoot Status and plans

FairBase/examples/Tutorial3


Fairbase/example/Tutorial3

Page 32: FairRoot Status and plans

Next to implement

• Local and central Log processors

• Command channels and objects (messages)

• Automatic monitoring and configuration

(hopefully by the end of this year!)

Page 33: FairRoot Status and plans

Summary

• The ZeroMQ communication layer is integrated into our offline framework (FairRoot).

• In the short term we will keep both options: the ROOT-based event loop, and concurrent processes communicating with each other via ZeroMQ.

• In the long term we are moving away from a single event loop to distributed processes.

Thank you!

Page 34: FairRoot Status and plans

Native InfiniBand/RDMA is faster than IP over IB


Implementing ZeroMQ over IB verbs will improve the performance.

Page 35: FairRoot Status and plans

Device

• Each processing stage of a pipeline is occupied by a process that executes an instance of the Device class.

Page 36: FairRoot Status and plans

Sampler

• Devices with no inputs are categorized as sources

• A sampler loops (optionally: infinitely) over the loaded events and sends them through the output socket.

• A variable event rate limiter has been implemented to control the sending speed
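A minimal sketch of such a sampler loop, written in Python for brevity rather than in the framework's own language. The function `sample`, its parameters and the `send` callable (standing in for the output socket) are assumptions for illustration; the infinite-loop mode is left out to keep the example finite.

```python
import time

def sample(events, send, rate_hz=None, loops=1):
    """Send every loaded event `loops` times, at most `rate_hz` events per second.

    `send` stands in for the output socket; `rate_hz=None` disables the limiter.
    """
    interval = 1.0 / rate_hz if rate_hz else 0.0
    next_slot = time.monotonic()
    sent = 0
    for _ in range(loops):
        for event in events:
            delay = next_slot - time.monotonic()
            if delay > 0:
                time.sleep(delay)    # rate limiter: wait for the next send slot
            send(event)
            sent += 1
            next_slot += interval    # schedule the following event
    return sent
```

Scheduling against fixed slots, rather than sleeping a fixed interval after each send, keeps the average rate at the requested value even when individual sends take a variable amount of time.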


Page 37: FairRoot Status and plans

Message format (Protocol)

• Potentially any content-based processor or any source can change the application protocol. Therefore, the framework provides a generic message class (FairMQMessage) that works with any arbitrary, contiguous chunk of memory.

• One has to pass a pointer to the memory buffer and its size in bytes, and can optionally pass a function pointer to a destructor, which will be called once the message object is discarded.
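The idea of a message that wraps existing memory and notifies its owner on release can be sketched as follows. This is not the FairMQMessage interface: Python has no raw pointers, so a buffer object takes the place of the pointer and a plain callable takes the place of the destructor function pointer. The class name `Message` and its methods are hypothetical.

```python
class Message:
    """Wraps a contiguous buffer without copying; runs a cleanup hook on release."""

    def __init__(self, buffer, size, on_release=None):
        # Zero-copy view of the first `size` bytes of the caller's buffer.
        self._view = memoryview(buffer)[:size]
        self._on_release = on_release

    def data(self):
        """Return the payload view (valid until release() is called)."""
        return self._view

    def size(self):
        """Return the payload size in bytes."""
        return self._view.nbytes

    def release(self):
        """Discard the message and notify the buffer's owner exactly once."""
        if self._view is not None:
            self._view.release()
            self._view = None
            if self._on_release is not None:
                self._on_release()
```

Because the message never copies the payload, the producer must keep the buffer alive until the release hook fires, which is the reason for the optional destructor argument described above.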

Page 38: FairRoot Status and plans

New simple classes without ROOT are used in the Sampler. This enables us to use non-ROOT clients and reduces the message size.

Page 39: FairRoot Status and plans

Processor design

[Figure: a Processor with N data input sockets and N data output sockets, plus links labelled Log server and Command Client.]

Page 40: FairRoot Status and plans

Content-based Processor

• The Processor device has at least one input and one output socket.

• A task is meant for accessing and potentially changing the message content.
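The split between a processor and its task can be sketched as below. The class names, the `init`/`exec` method names, the `receive`/`send` callables (standing in for the input and output sockets) and the `None` end-of-stream sentinel are all assumptions made for the example, not the framework's API.

```python
class ScaleTask:
    """Hypothetical task: reads the payload as bytes and doubles each value."""

    def init(self):
        self.processed = 0

    def exec(self, payload):
        self.processed += 1
        return bytes((b * 2) % 256 for b in payload)

class Processor:
    """Receives a message, lets the task rewrite its content, forwards the result."""

    def __init__(self, task, receive, send):
        self.task, self.receive, self.send = task, receive, send

    def run(self):
        self.task.init()
        while True:
            payload = self.receive()
            if payload is None:      # sentinel marks the end of the input stream
                break
            self.send(self.task.exec(payload))
```

The processor owns the transport and the loop, while the task only sees message content, so the same task could in principle be driven by a different loop.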


Page 41: FairRoot Status and plans

Message-based Processor

• All message-based processors inherit from Device and operate on messages without interpreting their content.

• Four message-based processors have been implemented so far
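Two such devices, a splitter and a merger, can be sketched in a few lines. The sketch assumes that "balanced" means round-robin distribution, which the slides do not spell out. The class names are hypothetical and plain callables stand in for sockets.

```python
from itertools import cycle

class BalancedSplitter:
    """Fan-out: forwards each message, unread, to the next output in turn."""

    def __init__(self, outputs):
        self._outputs = cycle(outputs)

    def handle(self, message):
        next(self._outputs)(message)

class Merger:
    """Fan-in: forwards messages from any input to a single output."""

    def __init__(self, output):
        self._output = output

    def handle(self, message):
        self._output(message)
```

Neither class inspects the message, so both work for any payload format. Only the number of outputs or inputs has to be configured.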


Page 42: FairRoot Status and plans

Example of fan-out/fan-in of the data path for load balancing

[Figure: a single chain (MVD data → Clusterer → MVD Tracker) shown next to a load-balanced version, in which MVD data passes through a FairMQBalancedStandaloneSplitter to three Clusterers, followed by a FairMQStandaloneMerger and three Trackers.]

Page 43: FairRoot Status and plans

Example of fan-out/fan-in of the data path for load balancing

[Figure: a single chain (MVD data → Clusterer → MVD Tracker) shown next to a load-balanced version, in which MVD data passes through a FairMQBalancedStandaloneSplitter to three Clusterers, followed by a FairMQStandaloneMerger and one MVD Tracker.]