it/ee palaver - 20.06.06fair daq - j.adamczewski, s.linev1

30
IT/EE Palaver - 20.06.0 6 FAIR DAQ - J.Adamczewski, S.Linev 1

Upload: milo-chapman

Post on 13-Dec-2015

217 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 1

Page 2: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 2

Outline

• Requirements of DAQ for CBM

• The InfiniBand Cluster, uDAPL

• XDAQ tests on the IB-Cluster

• Summary and further evaluation

Page 3: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 3

CBM data acquisition

data dispatcher

FEEdeliver time stamped

data

CNetcollect data into buffers

Detector Detector Detector

collect

~50000 FEE chips

event dispatcher

1000x1000 switchingBNetsort time stamped data

~1000 links a 1 GB/sec

HNethigh level selection

to high level computing

and archiving~1 GB/sec Output

processing

PNetprocess events

level 1&2 selection

subfarm subfarm subfarm ~100 subfarms

~100 nodes per subfarm

~10 dispatchers → subfarm

~1000 collectors

~1000 active buffers

TNettime distribution

Page 4: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 4

CBM DAQ features

• Triggerless data transport until filter farm

• Event building on full data rate ~1TB/s

• B-net: ~1000 nodes, high-speed interconnections

• Linux may run on all DAQ nodes (even FPGAs)

• Test cluster with InfiniBand:

small „demonstrator“ set-up within next year

Page 5: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 5

InfiniBand test cluster

• 4 nodes:

– Double Opteron 2.2 GHz, 2GB RAM

– Mellanox MHES18-XT host adapter

– 2x Gigabit Ethernet host adapters

– SuSE Linux 9.3, x64bit version

• Mellanox MTS2400 24X InfiniBand switch

PCI Express x8

Page 6: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 6

Testcluster networks

InfiniBand switch: MTS2400

InfiniBand HCA: MHES18-XT

Gigabit Ethernet switch

Ethernet cabling

InfiniBand cabling

Page 7: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 7

InfiniBand software

• Mellanox IB Gold 1.8.0– IPoIB: IP over InfiniBand driver– uDAPL: User Direct Access Programming Layer– MPI: Message Passing Interface– others ...

Page 8: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 8

Direct Access Transport

• Developed by DAT collaborative - http://www.datcollaborative.org

• Objectives:– Transport and platform (OS) independence– Define user- (uDAPL) and kernel-level (kDAPL)

APIs

• Features:– Zero-copy data transfer– RDMA - Remote Direct Memory Access

Page 9: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 9

uDAPL test application

• C++ wrapper for C-based uDAPL library• Same program runs on all nodes• All-to-all connectivity• Time synchronization between nodes ~2 µs• One master (deliver commands) and many slaves• Several kinds of traffic schedule can be generated

and executed• Statistics like transfer rate, lost packets, schedule

accuracy can be measured

Page 10: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 10

All-to-all communication test

• Each node able to send data to any other node

• Message- and RDMA-based transfer are supported

• Master generate schedule and distribute it to all slaves

• At predefined time schedule execution is started

• Several schedules were tested:

– One-to-all schedule

– All-to-one schedule

– All-to-all round-robin

– All-to-all fix target

master

node02node01

node03

Page 11: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 11

Scheduled test results

Message-based data transfer

0100200300400500600700800900

1000

1 10 100

Packet size, KB

Dat

a ra

te, B

/us

round-robin

fix target

one-to-all

all-to-one

RDMA-based data transfer

0100200300400500600700800900

1000

1 10 100

Packet size, KB

Dat

a ra

te, B

/us

round-robin

fix target

one-to-all

all-to-one

• Slight difference between message- and RDMA-based transfer

• Reasonable transfer rates starting from 16 KB buffers

Page 12: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 12

Chaotic (schedule-less) transfer tests

• Packets send without schedule

• Next packet send when previous is completed

• Performance depends on receiving queue size

• For small packets performance 2 times better than scheduled approach

• Low CPU usage 0

100

200

300

400

500

600

700

800

900

1000

1 10 100

Packet size, KB

Da

ta r

ate

, B

/us

maxium queue mid queue size

min queue size scheduled transfer

Page 13: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 13

Data Acquisition framework requirements

• Configuration of multiple nodes

(connections and algorithms, database?)

• Controls and monitoring

• Data transport (format, protocol, hardware drivers?)

• Message logging

• Error handling and failure recovery (ARMOR?)

• Modular architecture, extensability! (sub-detector tests)

• User interface?

Page 14: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 14

Standard DAQ framework for LHC CMS experiment

(Orsini, Gutleber) http://xdaqwiki.cern.ch

• C++ libraries on Linux, modular packages

• Each node: context, xdaq process with

embedded xdaq applications

• Configuration: XML

• Data transport: I2O protocol (Intelligent IO)

• Communication: http, cgi; SOAP messages

The CMS XDAQ framework

Page 15: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 15

XDAQ features continued... http://xdaqwiki.cern.ch

• State machines (sync/async FSM)

• Message logger, error handler

• Monitoring tool

• Hardware access library (HAL)

• Front End Driver (FED kit, for CMS!)

• Job Control (task handler for node control)

• others:

• exceptions

• threads

• infospace

• data (de)serializers , ...

The CMS XDAQ framework

Page 16: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 16

XDAQ on the CBM InfiniBand cluster

Installation problems:

• XDAQ distribution not 64 bit save!

Code adjustments required

• kernel modules for CERN linux excluded!

• drivers for CMS hardware (FED) excluded

• coretools done, worksuite partially,…

Basic evaluation possible!

Page 17: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 17

XDAQ tests on CBM InfiniBand cluster

Tested features on IB cluster:

http://wiki.gsi.de/cgi-bin/view/Daq4FAIR/DaqClusterSoftwareXdaqStatus

• hyperdaq webserver (main test controls ui)

• xml configuration setup

• control variable access and export (infospace)

• SOAP messaging (test controller application)

• state machines (sync./async. with web/SOAP ui)

• monitoring application (very raw ui!)

• multithreading (workloop framework)

• exceptions, data wrappers, message logging, toolbox,…

• i2o data transport (tcp roundtrip example)

Page 18: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 18

XDAQ: web server

hyperdaq web interface

Page 19: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 19

XDAQ executive with applications

Common XML configuration file:

• Each node knows all

applications on all nodes!

• Unique addressing by

context (url) and application id

Page 20: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 20

XDAQ: SOAP messaging

xrelay: send SOAP control messages

• example commands:

Configure, Enable, Halt, Reset,..

• any new commands may be defined

• web interface not suited as

real control system UI!

• XDAQ applications may exchange

SOAP messages with other UI

(Labview?)

Page 21: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 21

XDAQ: messaging and peer transport

0..*

0..*

0..1

0..1

pt::JxtaMessenger

+send:void

service:std::string

exception

UnknownProtocolOrService DuplicateEndpoint AddressMismatch DuplicateListener InvalidFrame Exception NoListener SendFailure PeerTransportNotFound InvalidAddress ReceiveFailure NoMessenger MaxMessageSizeExceeded

pt::JxtaListener

+processIncomingMessage:void

service:std::string

pt::BinaryMessenger

+~BinaryMessenger+send:void

localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

pt::Messenger

+~Messenger

service:std::string

pt::HTAccessSecurityPolicy

+checkAccess:bool+checkAuth:bool+isAccessLimited:bool+isAuthLimited:bool

policyName:std::string name:std::string type:std::string

pt::PeerTransport

+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+isServiceSupported:bool

type:TransportType protocol:std::string supportedServices:std::vector<std::string>

pt::PeerTransportAgent

+addPeerTransport:void+removePeerTransport:void+getPeerTransport:pt::PeerTransport*+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+getMessenger:pt::Messenger::Reference+removeListener:void+getListener:pt::Listener*+getListeners:std::vector<pt::Listener*>+removeAllListeners:void+addListener:void

peerTransports:std::vector<pt::PeerTransport*>

pt::SecurityPolicyFactory

-instance_:SecurityPolicyFactory*-policies_:std::map<std::string,pt::SecurityPolicy*,std::less<std::string>>

+getInstance:SecurityPolicyFactory*+destroyInstance:void+getSecurityPolicy:pt::SecurityPolicy*+addSecurityPolicyImplementation:void+removeSecurityPolicyImplementation:void+getSecurityPolicyNames:std::vector<std::string>-SecurityPolicyFactory-~SecurityPolicyFactory

pt::SecurityPolicy

+~SecurityPolicy

policyName:std::string

pt::PeerTransportSender

+getMessenger:pt::Messenger::Reference

pt::PeerTransportAgentImpl

-instance_:PeerTransportAgentImpl*-senders_:std::vector<pt::PeerTransport*>-receivers_:std::vector<pt::PeerTransport*>-services_:std::map<std::string,pt::Listener*,std::less<std::string>>

+~PeerTransportAgentImpl+instance:PeerTransportAgent*+addPeerTransport:void+removePeerTransport:void+getPeerTransport:pt::PeerTransport*+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+getMessenger:pt::Messenger::Reference+addListener:void+removeListener:void+getListener:pt::Listener*+getListeners:std::vector<pt::Listener*>+removeAllListeners:void

peerTransports:std::vector<pt::PeerTransport*>

pt::Address

+~Address+toString:std::string+equals:bool

protocol:std::string service:std::string serviceParameters:std::string

pt::BinaryListener

+processIncomingMessage:void

pt::SOAPListener

+processIncomingMessage:xoap::MessageReference

service:std::string

pt::Listener

service:std::string

pt::SOAPMessenger

+~SOAPMessenger+send:xoap::MessageReference

service:std::string localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

pt::PeerTransportReceiver

-listeners_:std::map<std::string,pt::Listener*,std::less<std::string>>

+~PeerTransportReceiver+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+isExistingListener:bool+getListener:pt::Listener*+config:void

peer transport

xoap::SOAPBodyElementxoap::SOAPFault

+SOAPFault+getFaultCode:std::string+getFaultString:std::string+setFaultCode:void+setFaultString:void

xoap::StandardObjectPolicy

+dispose:void

exception

HTTPException Exception

xoap::Endpoint

#id_:std::string

+Endpoint+toString:std::string

xoap::SOAPConnection

#httpURLConnection_:HTTPURLConnection

+SOAPConnection+call:xoap::SOAPMessage

xoap::URLEndpoint

+URLEndpoint

URL:std::string

xoap::SOAPAllocator

+allocate:void *+dispose:void+testMemoryLeakage:void

xoap::MimeHeaders

-headers_:std::multimap<std::string,std::string,std::less<std::string>>

+MimeHeaders+getHeader:std::vector<std::string>+setHeader:void+addHeader:void+removeHeader:void+removeAllHeaders:void

allHeaders:std::multimap<std::string,std::string,std::less<std::string>>

xoap::XStr

-fUnicodeForm:XMLCh*

+XStr+XStr+~XStr+unicodeForm:const XMLCh*+operatorconstXMLCh*:

xoap::HTTPURLConnection

#socket_:int

+HTTPURLConnection+~HTTPURLConnection+receiveFrom:std::string+sendTo:void+close:void+connect:void#receive:int#send:void#open:void#extractMIMEBoundary:std::string#writeHttpPostMIMEHeader:void#writeHttpPostHeader:void

xoap::SimpleReferenceCount

-counter:unsigned long *

+SimpleReferenceCount+init:void+dispose:void+increment:void+decrement_and_is_zero:bool

xoap::SOAPName

#prefix_:std::string#name_:std::string#uri_:std::string

+SOAPName+operator==:bool

qualifiedName:std::string localName:std::string prefix:std::string URI:std::string

T:typenameSimpleReferenceCount:typename CounterPolicy =StandardObjectPolicy:typename ObjectPolicy =

CounterPolicyObjectPolicy

xoap::CountingPtr

-object_pointed_to:T*

+CountingPtr+CountingPtr+CountingPtr+~CountingPtr+operator=:CountingPtr<T,CP,OP>&+operator=:CountingPtr<T,CP,OP>&+operator->:T*+operator*:T&+isNull:bool-init:void-attach:void-detach:void

LISTENER:classxoap::Method

+obj_:LISTENER*+func_:xoap :: MessageReference+name_:std::string+namespaceURI_:std::string

+~Method+type:std::string+invoke:xoap::MessageReference+name:std::string+name:void+namespaceURI:void+namespaceURI:std::string

toolbox::lang::Methodxoap::MethodSignature

+invoke:xoap::MessageReference+name:std::string

i2o

0..*

0..1

1..*

0..1

0..1

0..1

0..1

xdaq::WebApplicationxdata::ActionListener

PeerTransportTCP

-pts_:tcp::PeerTransportSender*-ptr_:tcp::PeerTransportReceiver*-autosize_:xdata::Boolean-maxPacketSize_:xdata::UnsignedLong-poolName_:xdata::String

+XDAQ_INSTANTIATOR:+PeerTransportTCP+~PeerTransportTCP+Default:void+apply:void+failurePage:void-actionPerformed:void-reset:void

tcp::Transmitter

+isConnected_:bool+retry_:unsigned long

+Transmitter+~Transmitter+connect:void+disconnect:void+receive:int+send:void+close:void+isConnected:bool

pt::Addresstcp::Address

#url_:toolbox::net::URL#service_:std::string

+Address+~Address+toString:std::string+equals:bool

protocol:std::string service:std::string serviceParameters:std::string host:std::string port:std::string URL:std::string path:std::string socketAddress:sockaddr_in

i2o::Messengertcp::I2OMessenger

-pt_:tcp::PeerTransportSender*-channel_:tcp::Channel*-destination_:pt::Address::Reference-local_:pt::Address::Reference

+I2OMessenger+~I2OMessenger+send:void

localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

tcp::Channel

#sockaddress_:sockaddr_in#sockaddressSize_:socklen_t#socket_:int

+Channel+~Channel+connect:void+disconnect:void+receive:int+send:void+close:void+isConnected:bool

exception

CannotConnect Exception

pt::PeerTransportSenderTask

tcp::PeerTransportSender

-outputQueue_:toolbox::squeue<tcp::PostDescriptor>-activated_:bool-done_:bool

+PeerTransportSender+~PeerTransportSender+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+isServiceSupported:bool+getMessenger:pt::Messenger::Reference+svc:int+post:void+start:void+stop:void

type:pt::TransportType protocol:std::string supportedServices:std::vector<std::string>

toolbox::lang::Classtcp::ReceiverLoop

-sockets_:std::vector<int>-address_:pt::Address::Reference-fdset_:fd_set-allset_:fd_set-timeout_:timeval-done_:bool-stop_:bool-maxfd_:int-current_:int-listenfd_:int-accepted_:int-nochannels_:int-nready_:int-autoSize_:bool-maxPacketSize_:unsigned long-listener_:i2o::Listener*-pool_:toolbox::mem::Pool*-manager_:toolbox::mem::MemoryPoolFactory*-process_:toolbox::task::ActionSignature*-logger_:Logger

+ReceiverLoop+~ReceiverLoop+receive:int+send:void+disconnect:void+connect:void+close:void+process:bool+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+activate:void+cancel:void-accept:int-isConnected:bool-isActive:bool-onRequest:void-safeReceive:void

_enum ReceiverLoop h 50

address:pt::Address::Reference pool:toolbox::mem::Pool* maxPacketSize:unsigned long autoSize:bool

structtcp::PostDescriptor

+ref:toolbox::mem::Reference*+channel:tcp::Channel*+handler:toolbox::exception::HandlerSignature*+context:void *

pt::PeerTransportReceivertcp::PeerTransportReceiver

#autoSize_:bool#maxPacketSize_:unsigned long#loop_:std::vector<tcp::ReceiverLoop*>#pool_:toolbox::mem::Pool*#logger_:Logger

+PeerTransportReceiver+~PeerTransportReceiver+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+isServiceSupported:bool+config:void+start:void+stop:void

type:pt::TransportType protocol:std::string supportedServices:std::vector<std::string> pool:toolbox::mem::Pool* maxPacketSize:unsigned long autoSize:bool

tcp0..1

1..*

0..1

exception

Exception

i2o::Messengerfifo::I2OMessenger

-pt_:fifo::PeerTransport*

+I2OMessenger+send:void

localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

pt::PeerTransportSenderpt::PeerTransportReceiver

toolbox::lang::Classfifo::PeerTransport

#sync_:BSem*#mutex_:BSem*#fifo_:toolbox::rlist<fifo::PostDescriptor>#pending_:int#i2oListener_:i2o::Listener*#messenger_:pt::Messenger::Reference#localAddress_:pt::Address::Reference#process_:toolbox::task::ActionSignature*#logger_:Logger

+PeerTransport+~PeerTransport+post:void+process:bool+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+isServiceSupported:bool+getMessenger:pt::Messenger::Reference+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+config:void

_enum PeerTransport h 119

type:pt::TransportType protocol:std::string supportedServices:std::vector<std::string>

pt::Addressfifo::Address

#url_:toolbox::net::URL#service_:std::string

+Address+~Address+toString:std::string+equals:bool

protocol:std::string service:std::string serviceParameters:std::string host:std::string port:std::string URL:std::string path:std::string

xdaq::ApplicationPeerTransportFifo

-pt_:fifo::PeerTransport*

+XDAQ_INSTANTIATOR:+PeerTransportFifo+~PeerTransportFifo

structfifo::PostDescriptor

+ref:toolbox::mem::Reference*+handler:toolbox::exception::HandlerSignature*+context:void *

fifo0..1

0..1

0..1

0..*

pt::SOAPMessengerhttp::SOAPLoopbackMessenger

-local_:pt::Address::Reference-destination_:pt::Address::Reference

+SOAPLoopbackMessenger+~SOAPLoopbackMessenger+send:xoap::MessageReference

localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

http::Utils

+receiveHeaderFrom:char *+receiveBodyFrom:char *+receiveFrom:char *+sendTo:void+replyTo:void+replyBodyTo:void+replyHeaderTo:void#extractMIMEBoundary:std::string#writeHttpPostMIMEHeader:void#writeHttpPostHeader:void#writeHttpReplyMIMEHeader:void#writeHttpReplyHeader:void

toolbox::lang::Classhttp::ReceiverLoop

-sockets_:std::vector<int>-clientIP_:std::map<int,std::string,std::less<int>>-clientHost_:std::map<int,std::string,std::less<int>>-address_:pt::Address::Reference-fdset_:fd_set-allset_:fd_set-maxfd_:int-current_:int-listenfd_:int-accepted_:int-nochannels_:int-nready_:int-listener_:pt::SOAPListener*-cgiListener_:xgi::Listener*-process_:toolbox::task::ActionSignature*-httpRootDir_:std::string-logger_:Logger-policy_:pt::HTAccessSecurityPolicy*-is_:xdata::InfoSpace*-aliasName_:std::string-aliasPath_:std::string

+ReceiverLoop+~ReceiverLoop+receive:int+send:void+disconnect:void+connect:void+close:void+process:bool+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+activate:void-accept:int-isConnected:bool-isActive:bool-onRequest:void-reply:void-authenticateUser:bool-verifyAccess:bool-isBrowserSupported:bool

_enum ReceiverLoop h 40

address:pt::Address::Reference

exception

CannotConnect Exception

http::ClientChannel

+ClientChannel+connect:void+disconnect:void+receive:int+send:void+close:void+isConnected:bool

pt::Addresshttp::Address

#url_:toolbox::net::URL*#service_:std::string

+Address+~Address+toString:std::string+equals:bool

protocol:std::string service:std::string serviceParameters:std::string host:std::string port:std::string URL:std::string path:std::string socketAddress:sockaddr_in

pt::PeerTransportReceiverhttp::PeerTransportReceiver

#loop_:std::vector<http::ReceiverLoop*>#logger_:Logger#is_:xdata::InfoSpace*

+PeerTransportReceiver+~PeerTransportReceiver+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+isServiceSupported:bool+config:void

type:pt::TransportType protocol:std::string supportedServices:std::vector<std::string>

pt::PeerTransportSenderhttp::PeerTransportSender

#sync_:BSem*#mutex_:BSem*#logger_:Logger

+PeerTransportSender+~PeerTransportSender+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+isServiceSupported:bool+getMessenger:pt::Messenger::Reference

type:pt::TransportType protocol:std::string supportedServices:std::vector<std::string>

http::Channel

#sockaddress_:sockaddr_in#sockaddressSize_:socklen_t#socket_:int#mutex_:BSem

+Channel+~Channel+connect:void+disconnect:void+receive:int+send:void+close:void+isConnected:bool+lock:void+unlock:void

xdaq::ApplicationPeerTransportHTTP

-pts_:http::PeerTransportSender*-ptr_:http::PeerTransportReceiver*-aliasName_:xdata::String-aliasPath_:xdata::String

+XDAQ_INSTANTIATOR:+PeerTransportHTTP+~PeerTransportHTTP

pt::SOAPMessengerhttp::SOAPMessenger

-channel_:http::ClientChannel*-local_:pt::Address::Reference-destination_:pt::Address::Reference

+SOAPMessenger+~SOAPMessenger+send:xoap::MessageReference

localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

http

0..1

0..1

i2o::Listener

+getService:std::string+processIncomingMessage:void

service:std::string

LISTENER:classstruct

i2o::Method

+obj_:LISTENER*+func_:void+fid_:unsigned long

+type:std::string+invoke:void+fid:unsigned long+fid:void

toolbox::lang::Methodi2o::MethodSignature

+invoke:void+fid:unsigned long

i2o::Messenger

+getService:std::string

service:std::string

exception

Exception

i2o::utils::Dispatcher

#logger_:Logger#registry_:xdaq::ApplicationRegistry*#context_:xdaq::ApplicationContext*#addressMap_:i2o::utils::AddressMap*

+Dispatcher+~Dispatcher+processIncomingMessage:void

pt::BinaryListener

+processIncomingMessage:void

pt::BinaryMessenger

+~BinaryMessenger+send:void+getLocalAddress:pt::Address::Reference+getDestinationAddress:pt::Address::Reference localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

pt::Listener

+getService:std::string

service:std::string

pt::Messenger

+~Messenger+getService:std::string

service:std::string

i2o::utils::AddressMap

-instance_:AddressMap*-tidToApplication_:std::map<unsignedlong,xdaq::ApplicationDescriptor*,std::less<unsignedlong>>-applicationToTid_:std::map<xdaq::ApplicationDescriptor*,unsignedlong,std::less<xdaq::ApplicationDescriptor*>>

+getApplicationDescriptor:xdaq::ApplicationDescriptor*+getTid:unsigned long+setApplicationDescriptor:void+removeTid:void+clear:void+getInstance:AddressMap*+destroyInstance:void

instance:AddressMap*

SOAPmessages

transport layer

0..*

0..1

1..*

0..1

0..1

0..1

0..1

xdaq::WebApplicationxdata::ActionListener

PeerTransportTCP

-pts_:tcp::PeerTransportSender*-ptr_:tcp::PeerTransportReceiver*-autosize_:xdata::Boolean-maxPacketSize_:xdata::UnsignedLong-poolName_:xdata::String

+XDAQ_INSTANTIATOR:+PeerTransportTCP+~PeerTransportTCP+Default:void+apply:void+failurePage:void-actionPerformed:void-reset:void

tcp::Transmitter

+isConnected_:bool+retry_:unsigned long

+Transmitter+~Transmitter+connect:void+disconnect:void+receive:int+send:void+close:void+isConnected:bool

pt::Addresstcp::Address

#url_:toolbox::net::URL#service_:std::string

+Address+~Address+toString:std::string+equals:bool

protocol:std::string service:std::string serviceParameters:std::string host:std::string port:std::string URL:std::string path:std::string socketAddress:sockaddr_in

i2o::Messengertcp::I2OMessenger

-pt_:tcp::PeerTransportSender*-channel_:tcp::Channel*-destination_:pt::Address::Reference-local_:pt::Address::Reference

+I2OMessenger+~I2OMessenger+send:void

localAddress:pt::Address::Reference destinationAddress:pt::Address::Reference

tcp::Channel

#sockaddress_:sockaddr_in#sockaddressSize_:socklen_t#socket_:int

+Channel+~Channel+connect:void+disconnect:void+receive:int+send:void+close:void+isConnected:bool

exception

CannotConnect Exception

pt::PeerTransportSenderTask

tcp::PeerTransportSender

-outputQueue_:toolbox::squeue<tcp::PostDescriptor>-activated_:bool-done_:bool

+PeerTransportSender+~PeerTransportSender+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+isServiceSupported:bool+getMessenger:pt::Messenger::Reference+svc:int+post:void+start:void+stop:void

type:pt::TransportType protocol:std::string supportedServices:std::vector<std::string>

toolbox::lang::Classtcp::ReceiverLoop

-sockets_:std::vector<int>-address_:pt::Address::Reference-fdset_:fd_set-allset_:fd_set-timeout_:timeval-done_:bool-stop_:bool-maxfd_:int-current_:int-listenfd_:int-accepted_:int-nochannels_:int-nready_:int-autoSize_:bool-maxPacketSize_:unsigned long-listener_:i2o::Listener*-pool_:toolbox::mem::Pool*-manager_:toolbox::mem::MemoryPoolFactory*-process_:toolbox::task::ActionSignature*-logger_:Logger

+ReceiverLoop+~ReceiverLoop+receive:int+send:void+disconnect:void+connect:void+close:void+process:bool+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+activate:void+cancel:void-accept:int-isConnected:bool-isActive:bool-onRequest:void-safeReceive:void

_enum ReceiverLoop h 50

address:pt::Address::Reference pool:toolbox::mem::Pool* maxPacketSize:unsigned long autoSize:bool

structtcp::PostDescriptor

+ref:toolbox::mem::Reference*+channel:tcp::Channel*+handler:toolbox::exception::HandlerSignature*+context:void *

pt::PeerTransportReceivertcp::PeerTransportReceiver

#autoSize_:bool#maxPacketSize_:unsigned long#loop_:std::vector<tcp::ReceiverLoop*>#pool_:toolbox::mem::Pool*#logger_:Logger

+PeerTransportReceiver+~PeerTransportReceiver+createAddress:pt::Address::Reference+createAddress:pt::Address::Reference+addServiceListener:void+removeServiceListener:void+removeAllServiceListeners:void+isServiceSupported:bool+config:void+start:void+stop:void

type:pt::TransportType protocol:std::string supportedServices:std::vector<std::string> pool:toolbox::mem::Pool* maxPacketSize:unsigned long autoSize:bool

DAPL

new development

Page 22: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 22

XDAQ: I2O messaging and peer transport

Page 23: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 23

Peer Transport DAPL (ptdapl)

GSI development 03/2006-05/2006

• C++ wrapper class for uDAPL C library

(class TBasic)

• peer transport tcp as „template“ (starting point)

• uDAPL buffers managed within XDAQ memory pool

(class TEndpointBuffer)

• avoids memcopy and new buffer allocation for each send

package:

• lookup if posted memory reference is known as send buffer

• user code can write directly into uDAPL send buffer

• multiple threads for sending, releasing, and receiving buffer

Page 24: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 24

Peer Transport DAPL structure

uDAPLpeer

transport

sender receiver

transmitter

messenger

receiverloop

buffer

pool

message (I2O)

channel

Page 25: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 26

XDAQ executive

MyRoundTrip

HyperDAQ

I2O roundtrip setup

sender receiver

XDAQ executive

MyRoundTrip

token() token()

EnableAction()

default()

I2O callbackI2O callback

web application

state machine

Peer Transport

Page 26: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 28

HyperDAQHyperDAQ

I2O sender/receiver setup

sender receiver

XDAQ executive

MyDataSource

Benchmark()

EnableAction()

default()

workloop

(thread)

web application

state machine

XDAQ executive

MyDataDrain

token()

I2O callbackPeer Transport

default()

Page 27: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 29

Transport bandwidth B and latency

P: package size

: transfer time („latency“)

Observation: linear with P:

B

P

P

P big: network limit

P small: framework latency !

Page 28: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 30

ptDAPL benchmark results

Test Nbufs,rcv 0

(µs)C

(µs/kByte)1/0

(kHz)Bmax

(MByte/s)

MyRoundTrip 20 5.53 1.069 181 935

MySourceDrain 50 0.99 1.037 1010 964

MySourceDrain 100 1.7 1.017 588 983

MySourceDrain 200 1.83 1.013 546 987

MySourceDrain 1000 1.22 1.029 820 972

Page 29: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 31

Benchmark summary

955 MB/s

plain uDAPL

234 MB/s

XDAQ, I2O,

TCP over IB

905 MB/s

XDAQ, I2O,

ptDAPL

Page 30: IT/EE Palaver - 20.06.06FAIR DAQ - J.Adamczewski, S.Linev1

IT/EE Palaver - 20.06.06 FAIR DAQ - J.Adamczewski, S.Linev 32

Summary and Outlook

• CBM DAQ requires fast builder network: InfiniBand test cluster

• IB hardware and software installed, tested,...

• uDAPL: developed C++ wrapper and test applications

• many tests with different traffic patterns were performed

• more tests on bigger InfiniBand clusters are required

• XDAQ as software framework:

• tested features: SOAP, web server, data transport,...

• I2O implementation for uDAPL done!

• cluster configuration? control system? scalability?

hardware access?