Solving big problems with open source: Condor
DESCRIPTION
This is an introduction to the queue distribution system Condor and how we are using it at the I3A (http://i3a.unizar.es)
TRANSCRIPT
> Solving Big problems with OS: Condor
> Antonio Sanz ([email protected]) > 09 / Nov / 11
2
3
> Antonio Sanz
> I3A System Manager
> HERMES HPC cluster sysadmin
> @antoniosanzalc
4
5
Show / Know / Use
6
Initial problem
3. Queue management systems: Condor
> Dr. Good
> Neurologist
> Alzheimer research
> Process 20000 brain image scans (1h/image)
> A thousand times. Maybe two.
7
> Mrs. Nice
> Santa’s Logistic Officer
> Gift transportation
> Analyze 6x10e7 possible load/reindeer/route combinations (10min/analysis)
> Before Christmas!
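To put a number on these workloads: Dr. Good's 20000 scans at 1 hour each add up to 20000 CPU hours, which is over two years on a single core. The sketch below estimates the wall-clock time when the same independent jobs are spread over a pool. The pool sizes and the `wall_hours` helper are illustrative assumptions, not figures from the talk.

```shell
#!/bin/sh
# Wall-clock hours needed for a batch of independent, equally long jobs.
# $1 = number of jobs, $2 = hours per job, $3 = cores working in parallel.
wall_hours() {
    # Jobs run in "waves" of $3 at a time; a partly filled last wave
    # still takes a full job duration, so round up.
    waves=$(( ($1 + $3 - 1) / $3 ))
    echo $(( waves * $2 ))
}

# Dr. Good: 20000 scans, 1 hour each (from the slide).
# The pool sizes below are hypothetical.
for cores in 1 100 1000; do
    echo "$cores cores: $(wall_hours 20000 1 "$cores") hours"
done
```

This ignores scheduling and data transfer overhead, so it is a lower bound rather than a prediction.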
8
Hey … ! It’s a 64K one !
9
Queue distribution systems
10
Condor Basics
Single queue
11
12
Multiple queues
13
14
Problem partitioning
15
Problem can be broken into independent pieces
16
Oh Yeah!
17
For loops are your best friends
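A for loop whose iterations do not depend on each other maps naturally onto a queue: each iteration becomes one job. The sketch below shows the same work written first as a sequential loop and then as a submit description in which `$(Process)` plays the role of the loop variable. The names `process_piece` and `process_piece.sh` are hypothetical placeholders for the real per-piece program.

```shell
#!/bin/sh
# Hypothetical stand-in for the real per-piece program.
process_piece() {
    echo "piece $1 done"
}

# Sequential version: one machine handles every index in turn.
run_sequential() {
    for i in 0 1 2 3; do
        process_piece "$i"
    done
}

# Queue version: print a submit description asking Condor for one job
# per index. $(Process) takes the values 0..N-1, like the loop variable.
# process_piece.sh is an assumed wrapper script, not a file from the talk.
print_submit() {
    cat <<EOF
Universe   = vanilla
Executable = process_piece.sh
Arguments  = \$(Process)
Log        = pieces.log
Queue $1
EOF
}

run_sequential
print_submit 4
```

Redirecting the output of `print_submit` to a file gives something that could be handed to `condor_submit`.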
18
While loops … can sometimes be convinced
19
Do it yourself!
20
21
Heterogeneous computing
22
Resource harvesting
23
Requirements
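Assuming this slide refers to job requirements, a job can state what it needs from a machine with a `Requirements` expression in its submit description. The sketch below prints such a description. `OpSys`, `Arch` and `Memory` (in MB) are standard machine attributes, while the specific values and the `print_submit_with_requirements` helper are assumptions made for illustration.

```shell
#!/bin/sh
# Print a submit description that only matches 64-bit Linux machines
# with at least the given amount of memory.
# $1 = minimum memory in MB (illustrative value chosen by the caller).
print_submit_with_requirements() {
    cat <<EOF
Universe     = vanilla
Executable   = hola.sh
Requirements = (OpSys == "LINUX") && (Arch == "X86_64") && (Memory >= $1)
Log          = hola.log
Queue
EOF
}

print_submit_with_requirements 2048
```

Machines that do not satisfy the expression are simply never matched to the job.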
24
Job Surveillance
25
Fair use of resources
26
Checkpoints
27
Nested jobs (DAG)
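Jobs that depend on each other are described to Condor's DAGMan as a directed acyclic graph: each node names a submit file, and `PARENT ... CHILD ...` lines state the ordering. The sketch below prints a small diamond-shaped DAG. The node names and the four `.sub` files are hypothetical.

```shell
#!/bin/sh
# Print a DAGMan input file for a diamond-shaped workflow.
# The .sub file names are placeholders, not files from the talk.
print_dag() {
    cat <<'EOF'
# B and C start only after A finishes; D waits for both.
JOB A prepare.sub
JOB B analyze_left.sub
JOB C analyze_right.sub
JOB D merge.sub
PARENT A CHILD B C
PARENT B C CHILD D
EOF
}

print_dag
```

Saving the output as, say, `pipeline.dag` and running `condor_submit_dag pipeline.dag` would hand the whole workflow to DAGMan.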
28
Email Notifications
29
Grid & Cloud Computing
30
Flexibility
31
… with Hadoop, MPI, OpenMP, GPU
32
33
How Condor works
34
Management
[Hello, Dave]
35
Compute
36
Job list → ClassAd
37
Resource list → ClassAd
38
Matchmaking
39
Priority Management
40
Data Transfer
41
Job running
42
Job Monitoring
43
Job End
44
Example
45
Hello, World !!
#!/bin/sh
# I'm hola.sh
echo Hola mundo desde `hostname`

#
# A Hello World .. In Condor!
#
# I'm hello.sub
Universe = vanilla
Executable = hola.sh
Log = hola.log
Output = hola.out
Error = hola.err
Queue
46
Running the computation
condor_submit
4. Condor Basics – An easy computation
47
condor_q
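The two commands are typically used together: `condor_submit` hands a submit description to the queue, and `condor_q` lists the jobs waiting or running. A minimal sketch of that sequence follows. The `run` wrapper and the `DRY_RUN` variable are conventions of this sketch only, added so the script can be tried on a machine without Condor installed.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN is set.
# DRY_RUN is a convention of this sketch, not a Condor feature.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Submit a description file, then list the queue.
# $1 = path to the submit description, e.g. hello.sub
submit_and_list() {
    run condor_submit "$1" && run condor_q
}
```

With `DRY_RUN=1` set, `submit_and_list hello.sub` only prints the two commands. Without it, the commands run for real and the new job should appear in the `condor_q` listing.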
48
Something tastier…
#!/bin/sh
# I'm hello2.sh
# $1 is the job number passed in by Condor
OUTPUT=hello${1}.result
# Copy the shared input into this job's own result file
cat hello.input >> "$OUTPUT"
# Append the greeting so the copied input is kept
echo "Hello world, I'm job $1 here from `hostname`" >> "$OUTPUT"

# Execute n times with different outputs
Universe = vanilla
Executable = hello2.sh
Transfer_input_files = hello.input
WhenToTransferOutput = ON_EXIT_OR_EVICT
Arguments = $(Process)
Log = hello.log
Output = hello.out
Queue 10
49
Perfect Simulation
50
Extra Bonus
51
Dynamic Partitioning
52
Configurable Jobs
53
Advanced Accounting
54
Dynamic Checkpointing
55
Hadoop Integration
56
Green Computing
57
GPU Integration
58
I3A & Condor
59
Gaming AI
60
MRI Brain Analysis
61
Communication Systems
62
Tissue Modelling
63
64
> Conclusions
65
Example
66
Antonio Sanz
@antoniosanzalc
Slides here:
http://web.hermes.cps.unizar.es/doc/condor.pdf