LHCb Continuous Integration and Deployment System
A message-based approach
S.-G. Chitic, B. Couturier, M. Clemencic, J. Closier on behalf of the LHCb collaboration
CHEP 2018, Sofia, Bulgaria
12/07/2018 LHCb Continuous Integration and Deployment System 2
Introduction
Distributed Continuous Integration System
Deployment System
Conclusion
Why the need for a complex system?
• AFS is being phased out, so users needed a centralized location where the (nightly) builds can be installed
• CVMFS has been chosen, BUT deployment is laborious and slow:
  • Each installation needs to be done on the Stratum0
  • Each file modification needs to be included in a transaction
  • Even with the current infrastructure, the Stratum0 server is busy all day for approx. 220 GB of installations
  • Transactions need to be serialized: no parallel transactions can co-exist
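The transaction constraint above can be sketched as follows. This is a minimal illustration, not the LHCb installer: the repository name `example.cern.ch`, the `SerializedInstaller` class and the injected `run` callable are assumptions made for the example. Only the `cvmfs_server transaction`, `publish` and `abort -f` subcommands come from the CernVM-FS command line. A lock stands in for the rule that only one transaction may be open at a time.

```python
import threading
from typing import Callable, List

Command = List[str]


class SerializedInstaller:
    """Wraps each installation in one CVMFS transaction, one at a time.

    `run` is injected so the sketch can be exercised without a Stratum-0;
    on a real server it could be something like subprocess.check_call.
    """

    def __init__(self, repo: str, run: Callable[[Command], None]) -> None:
        self._repo = repo            # hypothetical repository name
        self._run = run
        self._lock = threading.Lock()  # no parallel transactions

    def install(self, install_cmd: Command) -> None:
        with self._lock:
            self._run(["cvmfs_server", "transaction", self._repo])
            try:
                self._run(install_cmd)
            except Exception:
                # Drop the half-finished changes instead of publishing them.
                self._run(["cvmfs_server", "abort", "-f", self._repo])
                raise
            self._run(["cvmfs_server", "publish", self._repo])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    installer = SerializedInstaller("example.cern.ch", print)
    installer.install(["echo", "copy nightly build into the repository"])
```

Because every installation holds the lock for its whole transaction, a burst of builds turns into a queue of waiting installations, which is the bottleneck the rest of the talk addresses.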
Distributed Continuous Integration System
General architecture
[Architecture diagram. Components: build servers, test servers, periodic scheduler, results reporting, reporting dashboard front-end, performance testing*, STRATUM-0, STRATUM-1. Labelled flows: commit code, pull commits, trigger builds, trigger tests, save build results, save test results, notify build completed, trigger build installation.]
* Poster 271. Improvements to the LHCb software performance testing infrastructure using message queues and big data technologies - M.P. Szymanski, B. Couturier
Why RabbitMQ?
• Multi-protocol support: AMQP, MQTT, etc.
• Reliability: persistence, delivery acknowledgements (ACK) and high availability
• Flexible routing
• Management UI
• Clustering and federation: already tested for our usage
• Plugin system and community-supported client libraries for different programming languages (e.g. pika for Python)
AMQP Protocol
• Network wire-level protocol
  • Defines how clients and brokers talk
  • Data serialization (framing)
  • Heartbeat
  • Hidden in client libraries
• AMQP model
  • Defines routing and storing of messages
  • Defines rules for how these are wired together
  • Exported API
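To make the framing and heartbeat bullets concrete, here is a small sketch of the general frame layout, assuming AMQP 0-9-1 (the protocol version RabbitMQ implements natively; the slide does not name a version). Each frame is a one-octet type, a two-octet channel, a four-octet payload size, the payload, and a closing `0xCE` octet. The helper names `encode_frame` and `decode_frame` are made up for this example; client libraries such as pika normally hide this layer.

```python
import struct

# Frame type codes from the AMQP 0-9-1 specification.
FRAME_METHOD, FRAME_HEADER, FRAME_BODY, FRAME_HEARTBEAT = 1, 2, 3, 8
FRAME_END = b"\xce"
HEADER = struct.Struct("!BHI")  # type (1 octet), channel (2), size (4)


def encode_frame(frame_type: int, channel: int, payload: bytes = b"") -> bytes:
    """Serialize one frame: header, payload, frame-end octet."""
    return HEADER.pack(frame_type, channel, len(payload)) + payload + FRAME_END


def decode_frame(data: bytes):
    """Parse one frame and return (type, channel, payload)."""
    frame_type, channel, size = HEADER.unpack(data[:HEADER.size])
    end = HEADER.size + size
    if data[end:end + 1] != FRAME_END:
        raise ValueError("missing frame-end octet")
    return frame_type, channel, data[HEADER.size:end]


# A heartbeat is an empty frame on channel 0.
heartbeat = encode_frame(FRAME_HEARTBEAT, 0)

if __name__ == "__main__":
    print(heartbeat.hex())  # 08000000000000ce
    print(decode_frame(encode_frame(FRAME_BODY, 1, b"build ready")))
```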
RabbitMQ usage in LHCb CI System
• Used as a message bus between the different components of the system
• Decouples message producers from consumers on different nodes at different stages in the system
• Provides persistent queues
• Allows for message prioritization
• Easily used and managed with pika in Python
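The decoupling described above can be illustrated with a toy, in-memory stand-in for a direct exchange. This is only a model of the routing idea, not RabbitMQ or pika code: the `DirectExchange` class, the `build.done` routing key and the message fields are all assumptions for the example. The producer publishes to the exchange with a routing key and never refers to a consumer's queue.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class DirectExchange:
    """Toy direct exchange: a message is copied to every queue bound
    with exactly the routing key it was published with."""

    def __init__(self) -> None:
        self._bindings: Dict[str, List[Deque[dict]]] = defaultdict(list)

    def bind(self, queue: Deque[dict], routing_key: str) -> None:
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key: str, body: dict) -> int:
        """Deliver the message and return the number of queues reached."""
        queues = self._bindings.get(routing_key, [])
        for queue in queues:
            queue.append(body)
        return len(queues)


# Two independent consumers subscribe to the same event.
exchange = DirectExchange()
test_queue: Deque[dict] = deque()
install_queue: Deque[dict] = deque()
exchange.bind(test_queue, "build.done")
exchange.bind(install_queue, "build.done")

# The build server only knows the exchange and the routing key.
delivered = exchange.publish("build.done", {"project": "Gaudi", "slot": "nightly"})

if __name__ == "__main__":
    print(delivered, list(test_queue), list(install_queue))
```

Adding a third consumer later only needs another `bind` call; the producer side is unchanged, which is the property the message bus is used for here.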
Deployment System
[Deployment diagram. Components: Continuous Integration agent, RabbitMQ, CVMFS, Connector, CVMFSLogger, CVMFSExecuter, CERN IT Monitoring Gateway. Labelled flows: notify build ready; consume build ready; prioritized builds installation; gets jobs and returns results; sends messages to log; sends stats for IT monitoring; stats to Kibana.]
Priority policy
• Needed because of the "burst" effect from the build servers
• Allows more important components to be installed first
• Ordering is transparent for the CVMFS installer
• The order can be changed during a day's installation through the Continuous Integration agent
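A priority policy of this kind can be sketched with a heap. This is an in-memory illustration, not the RabbitMQ priority queue used in the system: the `InstallQueue` class is hypothetical, the priorities are invented, and "larger number is installed first" is a convention chosen for the example. Pushing a name that is already pending changes its priority, which mirrors reordering during the day; the consumer just calls `pop` and never sees the ordering logic.

```python
import heapq
import itertools
from typing import Dict, List


class InstallQueue:
    """Pending installations, highest priority first, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap: List[list] = []
        self._order = itertools.count()   # tie-breaker keeps arrival order
        self._pending: Dict[str, list] = {}

    def push(self, name: str, priority: int) -> None:
        """Add an installation, or change the priority of a pending one."""
        if name in self._pending:
            self._pending[name][2] = None  # invalidate the old entry
        entry = [-priority, next(self._order), name]
        self._pending[name] = entry
        heapq.heappush(self._heap, entry)

    def pop(self) -> str:
        """Return the next installation; the caller never sees priorities."""
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            if name is not None:
                del self._pending[name]
                return name
        raise IndexError("no pending installations")


if __name__ == "__main__":
    queue = InstallQueue()
    # A burst of finished builds arrives at once (illustrative priorities).
    queue.push("DaVinci", 1)
    queue.push("Gaudi", 5)
    queue.push("LHCb", 5)
    queue.push("DaVinci", 9)   # reprioritized while still waiting
    print([queue.pop() for _ in range(3)])  # ['DaVinci', 'Gaudi', 'LHCb']
```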
Flexibility
• For better distribution of installations:
  • First, install all the components for the most important platform on all the software projects
  • Afterwards, install on a per-software-project priority basis
• Smaller installations are propagated faster
• Possibility of injecting / removing installations
• Possibility of reordering the installations
• Better management of installation errors using a separate queue in RabbitMQ
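The two-phase ordering and the separate error queue can be sketched as below. The names `Install`, `order_installs` and `run_installs` are hypothetical, the platform strings and project priorities are illustrative, and ordering the first phase by project priority as well is an assumption the slide does not state. Failed installations are collected in their own list, standing in for the separate RabbitMQ error queue.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class Install(NamedTuple):
    project: str
    platform: str


def order_installs(pending: List[Install], top_platform: str,
                   project_priority: Dict[str, int]) -> List[Install]:
    """Most important platform first, then by project priority (high first)."""
    def key(item: Install):
        return (item.platform != top_platform,
                -project_priority.get(item.project, 0))
    return sorted(pending, key=key)


def run_installs(ordered: List[Install], install: Callable[[Install], None]
                 ) -> Tuple[List[Install], List[Tuple[Install, str]]]:
    """Run installations in order; failures go to a separate error list."""
    done, errors = [], []
    for item in ordered:
        try:
            install(item)
            done.append(item)
        except Exception as exc:  # keep going, report the failure separately
            errors.append((item, str(exc)))
    return done, errors


if __name__ == "__main__":
    top, other = "x86_64-centos7-gcc7-opt", "x86_64-slc6-gcc62-opt"
    pending = [Install("DaVinci", other), Install("Gaudi", other),
               Install("DaVinci", top), Install("Gaudi", top)]
    for item in order_installs(pending, top, {"Gaudi": 10, "DaVinci": 1}):
        print(item.platform, item.project)
```

Keeping failures out of the main queue means one broken installation does not block the ones behind it, and the error list can be inspected or retried later.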
Future work
• Reduce the single-point-of-failure risk by using a cluster of messaging nodes
• Take advantage of the new message bus:
  • Notify other distributed components
  • Inform users about the status of the system
• Improve scalability of the system
• Improve the system's monitoring and error management
Conclusion
• End-to-end continuous integration and deployment system
• Decoupled components on different nodes using a messaging bus (RabbitMQ)
• Flexible installation system for an otherwise laborious and slow task
• Complex system, but easily developed and monitored with Python
• New opportunities opened up by the new messaging bus