How to Build Parallel Computers in a Lab
Juhan Kim, The KIAS CAC
January 23, 2011
Moore's Law
Courtesy of Greg Bruno
The Cheapest Way to Get a Supercomputer
Build a Linux cluster from commodity hardware: CPUs (Xeon, AMD), networks (Gigabit Ethernet, InfiniBand), etc.
Overview of TOP500.org
http://www.top500.org/
Top 500 supercomputers from 1993 to now
Listed in order of LINPACK test results (double precision)
Rpeak (theoretical peak), Rmax (maximum LINPACK performance achieved)
Top500 : Countries
Top500 : processor types
Top500 : network
Top500 : Operating systems
Top500: Applications
Top500: # of CPU cores
Cluster Hardware
CPU
  AMD Opteron, Pentium (Xeon), Itanium2
  Benchmark test: http://www.spec.org/cpu/
  12 integer benchmarks: video compression, quantum computing, path-finding, combinatorial optimization, discrete event simulation, etc.
  17 floating-point benchmarks: fluid dynamics, quantum chemistry, general relativity, ray tracing, weather prediction, biochemistry, etc.
  Score: geometric mean of the 12/17 resulting speeds/throughputs of the integer/floating-point calculations
  No I/O and no network communication
  Spec[xx]2006: measures the speed of a core
  Spec[xx]2006_rate: measures the throughput of a chip/chips
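The way a single SPEC score is formed from the individual benchmarks can be sketched as follows. SPEC CPU2006 combines the per-benchmark ratios with a geometric mean; the function name `spec_score` and the three ratios below are hypothetical and chosen only to make the arithmetic easy to follow. They are not measured results.

```python
import math


def spec_score(ratios):
    """Geometric mean of per-benchmark ratios (reference time / measured time)."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    # Averaging the logarithms avoids overflow when many ratios are multiplied.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical ratios for illustration only (not real SPEC results).
example_ratios = [2.0, 8.0, 4.0]
score = spec_score(example_ratios)
print(f"{score:.3f}")
```

A geometric mean keeps one unusually fast benchmark from dominating the score, which is why it is preferred over an arithmetic mean for ratios.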
Network
  Commodity 1G/10G Ethernet
  InfiniBand
Intel x86
  Most widely used processor
  EM64T: 64-bit supported
  Fastest integer arithmetic
  Fastest floating-point arithmetic
AMD Opteron
  x86_64 architecture
  Good performance/cost in floating-point and integer arithmetic
Itanium2
  First released in 2001
  Impressive LINPACK efficiency: ~90%
[Figure: SpecFP2006_rate and SpecFP2006 versus price; price labels: W300,000, W500,000, W1,000,000, $700]
Processor              GHz  Cores  SPECint2006   SPECfp2006      Price
Xeon E5620             2.4  4      31.6          37              0.5 MW
Opteron 4184           2.8  6      25.2          34.2            350 USD
Opteron 6174           2.2  12     21.5          30.7            1.5 MW
Intel Core i7-940      2.9  4      30.8          31.3            0.8 MW
Intel Itanium2 (18MB)  1.6  2      11.5          17.7            700 USD
IBM Power7 (780)       -    16     44 (4.14GHz)  71.5 (3.86GHz)  ???
(MW = million won)
Processor Summary (Processor Speed)
• Performance: IBM Power >> Xeon > Opteron 4184 > Core i7 > Opteron 6174 > Itanium2
• Performance/cost: Xeon > Opteron 4184 > Core i7 > Opteron 6174 > Itanium2
Processor              GHz (Gflops)     Ncore  Nthread  SPECint2006_rate/core  SPECfp2006_rate/core
Xeon E5620             2.4 (38.4)       4      2        27.8                   21.1 (84.8/chip)
Opteron 4184           2.8 (67.2)       6      1        19.8                   15.1 (90.6/chip)
Opteron 6174           2.2 (105.6)      12     1        16.1                   13.0 (156/chip)
Intel Core i7-940      2.9 (46.4)       4      2        29                     21.5 (86/chip)
Intel Itanium2 (18MB)  1.6 (12.8)       2      1        11.7                   7.25 (14.5/chip)
IBM Power7 (780)       3.72 (89.3 (?))  6      4        40                     35.5 (142/chip)
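The Gflops figures in parentheses appear to follow from clock rate x cores x floating-point operations per cycle. The sketch below assumes 4 operations per cycle per core. That value is inferred from the table rather than stated on the slide, and the helper name `peak_gflops` is hypothetical.

```python
def peak_gflops(ghz, cores, flops_per_cycle=4):
    """Theoretical peak: clock (GHz) x cores x floating-point ops per cycle.

    flops_per_cycle=4 is an assumption inferred from the table values.
    """
    return ghz * cores * flops_per_cycle


# (clock in GHz, cores per chip) as listed in the table.
chips = {
    "Xeon E5620": (2.4, 4),
    "Opteron 4184": (2.8, 6),
    "Opteron 6174": (2.2, 12),
    "Core i7-940": (2.9, 4),
    "Itanium2 (18MB)": (1.6, 2),
    "Power7 (780)": (3.72, 6),  # the slide marks this entry with "(?)"
}

peaks = {name: round(peak_gflops(ghz, cores), 1) for name, (ghz, cores) in chips.items()}
for name, gflops in peaks.items():
    print(f"{name}: {gflops} Gflops")
```

Under this assumption the computed values reproduce the table's parenthesized numbers, including the uncertain Power7 entry.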
Processor Summary (multi processing speed)
[Figure: performance (throughput)/cost; points labeled E5620, i7-940, X5670, i7-860, Opteron 6172, Opteron 6136, E5630, X5550, i7-920; Opteron points marked 8 cores and 12 cores]
i7-860/920: best performance/cost
Opteron series: good performance/cost
i7-940, X5670: poor ratio
Off-the-shelf computer boxes
Dell PowerEdge R610
  1U rack mount
  2-way Xeon E5620 (2.4GHz, 4 cores)
  24GB memory
  160GB hard disk
  3,075 USD (5.38 MWon)
  Specfp2006: 37
  Specfp2006_rate: 169
  Xeon E5620 vs. E5506 (2.13GHz): 20% cheaper, 30% slower
Dell Opteron server
Dell PowerEdge R815
  1U rack mount
  2-way Opteron 6128 (2GHz, 8 cores)
  32GB memory
  250GB disk
  4,556 USD
  Specfp2006: 28
  Specfp2006_rate: 233
  Opteron 6128 vs. 6136 (2.4GHz): 40% more expensive, 20% faster
IBM Xeon server
IBM x3650 M3
  2-way Xeon E5620 (2.4GHz, 4 cores)
  2U rack mount
  24GB memory
  146GB disk
  5,958 USD
  Specfp2006: 37
  Specfp2006_rate: 169
IBM Opteron server
IBM x3755 M3
  3U rack mount
  2-way Opteron 6128 (2GHz, 8 cores)
  24GB memory
  300GB disk
  7,729 USD
  Specfp2006: 28
  Specfp2006_rate: 233
  Opteron 6128 vs. 6136 (2.4GHz): 10% more expensive, 20% faster
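The "X% cheaper/more expensive, Y% slower/faster" notes on the server slides can be turned into a single relative performance/cost figure. This is a minimal sketch that assumes the percentages apply to the whole-server price and to the SPECfp throughput; the slides do not say so explicitly, and the function name is hypothetical.

```python
def relative_perf_per_cost(price_change, speed_change):
    """Performance/cost of the alternative relative to the baseline.

    Both arguments are fractional changes, e.g. -0.20 for "20% cheaper".
    A result above 1.0 means the alternative gives more performance per unit cost.
    """
    return (1.0 + speed_change) / (1.0 + price_change)


# Values quoted on the Dell and IBM server slides.
e5506_vs_e5620 = relative_perf_per_cost(-0.20, -0.30)     # 20% cheaper, 30% slower
dell_6136_vs_6128 = relative_perf_per_cost(+0.40, +0.20)  # 40% more expensive, 20% faster
ibm_6136_vs_6128 = relative_perf_per_cost(+0.10, +0.20)   # 10% more expensive, 20% faster

for label, value in [
    ("E5506 vs. E5620", e5506_vs_e5620),
    ("Dell 6136 vs. 6128", dell_6136_vs_6128),
    ("IBM 6136 vs. 6128", ibm_6136_vs_6128),
]:
    print(f"{label}: {value:.3f}")
```

Read this way, only the IBM upgrade to the Opteron 6136 improves performance per cost; the other two alternatives lose ground against their baselines.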
IBM Power7 server
IBM Power 710 Express
  1-way socket
  Power7 / 3.GHz (4 cores), 4GB memory, 150GB disk: 5,852 USD; Specfp2006_rate: ???
  6 cores, 3.7GHz, 16GB: 8,120 USD
  8 cores, 3.55GHz, 16GB: 14,620 USD
  Specfp2006_rate
    213 (6 cores, 3.72GHz, 8,120 USD)
    248 (8 cores, 3.55GHz, 14,620 USD)
Comparisons
Throughput/cost
1. Dell PowerEdge Xeon R610: 0.055/$
2. Dell PowerEdge Opteron R815: 0.051/$
3. IBM Opteron x3755: 0.03/$
4. IBM Xeon x3650: 0.028/$
5. IBM Power7 6-core: 0.026/$
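The ranking above is consistent with dividing each server's Specfp2006_rate by its price in USD, using the figures quoted on the preceding server slides. A short sketch of that arithmetic follows; the variable names are illustrative.

```python
# (server, Specfp2006_rate, price in USD) as quoted on the server slides.
servers = [
    ("Dell PowerEdge R610 (Xeon)", 169, 3075),
    ("Dell PowerEdge R815 (Opteron)", 233, 4556),
    ("IBM x3755 (Opteron)", 233, 7729),
    ("IBM x3650 (Xeon)", 169, 5958),
    ("IBM Power7 6-core", 213, 8120),
]

# Throughput per dollar, sorted from best to worst.
ranked = sorted(
    ((name, rate / usd) for name, rate, usd in servers),
    key=lambda item: item[1],
    reverse=True,
)

for rank, (name, ratio) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {ratio:.3f}/$")
```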
Network
Myrinet
http://www.myri.com/myrinet/product_list.html
Throughput: 800 MB/s
Latency: 2.7 μs
Cost: 20,000 USD for 16 ports (1,250 USD/port ~ 1.4 MWon/port)
InfiniBand network (latency: 0.1 μs)
• Throughput (QDR): 40 Gbps
• InfiniBand chassis: 50 MWon
• Leaf (18 ports): 10 MW
• Adapter card: 1 MW
• Cable: 0.8 MW
• 64 nodes: 255 MW
• 32 nodes: 130 MW (?)
• Cost/port = 4 MW/port
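The per-port figures for both interconnects follow from dividing the quoted total by the number of ports. The sketch below only repeats that division with the slide's numbers; it does not attempt to rebuild the InfiniBand totals from the component prices, since the slide does not give the component counts.

```python
def cost_per_port(total_cost, ports):
    """Total interconnect cost divided by the number of ports (nodes)."""
    if ports <= 0:
        raise ValueError("ports must be positive")
    return total_cost / ports


# Myrinet: 20,000 USD for a 16-port configuration.
myrinet_usd_per_port = cost_per_port(20_000, 16)

# InfiniBand totals in MWon (million won), as quoted for 64 and 32 nodes.
ib_64_mwon_per_port = cost_per_port(255, 64)
ib_32_mwon_per_port = cost_per_port(130, 32)

print(f"Myrinet: {myrinet_usd_per_port:.0f} USD/port")
print(f"InfiniBand, 64 nodes: {ib_64_mwon_per_port:.2f} MWon/port")
print(f"InfiniBand, 32 nodes: {ib_32_mwon_per_port:.2f} MWon/port")
```

Both InfiniBand configurations come out close to the quoted 4 MW/port.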
Quadrics
No longer available; closed in 2009
System Monitoring Tool
Ganglia
  Uses lm_sensors
  CPU load, network usage, disk usage, temperature
  Provides statistics (daily, weekly, monthly usage)
  How to install: http://blog.naver.com/PostView.nhn?blogId=me2you2&logNo=100037667623&redirect=Dlog&widgetTypeCall=true
Adding a temperature monitoring script to Ganglia
# cat > /etc/cron.d/cputemp
0-59/1 * * * * root /packages/examples/cputemp
^D
#
#!/bin/bash
# author: Mike Snitzer <[email protected]>
# desc: used to make lm_sensors metrics available to ganglia
# /etc/sysconfig/ganglia is used to specify INTERFACE
CONFIG=/etc/sysconfig/ganglia
[ -f $CONFIG ] && . $CONFIG

# default to eth0
if [ -z "$MCAST_IF" ]; then
    MCAST_IF=eth0
fi
GMETRIC_BIN=/usr/bin/gmetric
# establish a base commandline
#GMETRIC="$GMETRIC_BIN -i $MCAST_IF"
GMETRIC="$GMETRIC_BIN"
SENSORS=/usr/bin/sensors
# load the lm_sensors modules
module=`/sbin/lsmod | awk '{print $1}' | grep i2c-piix4`
if [ -z "$module" ]; then
    /sbin/modprobe i2c-piix4
    # lm87 is for supermicro P3TDLE, replace when appropriate
    /sbin/modprobe lm87
    /sbin/modprobe coretemp
fi
# send cpu temps if gmond is running
`/sbin/service gmond status > /dev/null`
if [ $? -eq 0 ]; then
    # send cpu temperatures
    let count=0
    for temp in `${SENSORS} | grep Core | cut -b 8-18`; do
        $GMETRIC -t float -n "cpu${count}_temp" -u "C" -v $temp
        let count+=1
    done

    # send cpu fan speed
    let count=0
    for fan in `${SENSORS} | grep fan1 | cut -b 9-17`; do
        $GMETRIC -t uint32 -n "cpu${count}_fan" -u "RPM" -v $fan
        let count+=1
    done
fi
Research Funds Available in Korea
http://www.nrf.re.kr/html/kr/business/business_01_01_02.html
Danawa Linux CPU Cluster (33 nodes)
  Installation: Jan. 2010
  Intel i7 Bloomfield 920
  Single socket, 8GB memory
  1G network
  SpecFP2006_rate = 78.9 per 4 cores
  4 cabinets (44U)
  4 x 3kW UPS
  Danawa price: 51 MWon (http://www.danawa.com/)
  Company-proposed price: 150 MWon
  1 month of assembly by one person
  1 month of system installation by one person
  Needs a lot of manpower
GPU Cluster @ CAC
  25 sets of CPU+GPU machines
  Peak performance ~ 10 x that of a 64-node (dual socket, quad core) CPU cluster
  Total cost ~ 1.5 x that of a 64-node CPU cluster
  CPU nodes: IBM x3550 (dual Xeon E3640, 24 GB, 3x1Gbps)
  20 kW power
Fund vs. Cluster size
1. 32-node Danawa cluster (Specfp2006_rate: 2560)
2. 64-node IBM Xeon cluster (8960) + 3-year warranty
3. 32-node Dell Opteron cluster (7488)
4. 64-node Dell Opteron cluster (14976)
5. 24-node Nvidia S2050 + IBM Xeon cluster (10 x #2)
[Figure: fund vs. cluster size; points labeled 1, 2, 3, 4, 5 and 2i, 3i, 4i]
Conclusion
Spec2006/Rpeak: Xeon E5620 > i7-940 > Power7 > Opteron >> Itanium
Spec2006/cost: i7 series > Xeon E-series > Opteron > Xeon X-series >> Itanium
Cost: IBM > Dell >> Danawa
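The Spec2006/Rpeak ordering can be checked against the multiprocessing table by dividing each chip's SPECfp2006_rate by its peak Gflops. This sketch assumes that the slide's ratio means exactly that, using the per-chip rate and the parenthesized Gflops values. It is a consistency check on the slide's own numbers, not an independent measurement.

```python
# (SPECfp2006_rate per chip, peak Gflops per chip) from the multiprocessing table.
chips = {
    "Xeon E5620": (84.8, 38.4),
    "Opteron 4184": (90.6, 67.2),
    "Opteron 6174": (156.0, 105.6),
    "Core i7-940": (86.0, 46.4),
    "Itanium2": (14.5, 12.8),
    "Power7 (780)": (142.0, 89.3),  # peak value is marked "(?)" on the slide
}

# SPEC throughput delivered per peak Gflop: higher means better use of the peak.
ratio = {name: rate / gflops for name, (rate, gflops) in chips.items()}
ranking = sorted(ratio, key=ratio.get, reverse=True)

for name in ranking:
    print(f"{name}: {ratio[name]:.2f}")
```

With these inputs the order comes out as Xeon E5620, Core i7-940, Power7, the two Opterons, then Itanium2, which matches the ordering stated in the conclusion.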
Conclusion 2 / Discussion
Interconnection
  InfiniBand/Myrinet:
    40/20% of total cost for the IBM/Dell cluster
    80/60% of total cost for the Danawa cluster
Disk storage
  Lustre File System: a few tens of MWon for 10TB
    one (a few?) MDS (Metadata Server) + many OSS (Object Storage Servers)
  SAN: very expensive