![Page 1: Copyright, HiPERiSM Consulting, LLC, George Delic, Ph.D. HiPERiSM Consulting, LLC (919)484-9803 P.O. Box 569, Chapel Hill, NC](https://reader034.vdocument.in/reader034/viewer/2022052701/56649ca45503460f94964a30/html5/thumbnails/1.jpg)
Copyright, HiPERiSM Consulting, LLC, http://www.hiperism.com
George Delic, Ph.D.
HiPERiSM Consulting, LLC
(919)484-9803
P.O. Box 569,
Chapel Hill, NC [email protected]
http://www.hiperism.com
HiPERiSM Consulting, LLC.
CHOOSING A COMPILER FOR AQM APPLICATIONS ON LINUX
George Delic, Ph.D.
Models-3 User’s Workshop
October 27-29, 2003
RTP, NC
Overview
1. Introduction
2. Choice of Hardware
3. Choice of Compilers
4. Choice of Benchmarks
5. Comparing Execution Times
6. Evaluation of SSE Results
7. Tests for AQM’s
8. Conclusions
Introduction
Motivation
• AQM’s are migrating to COTS hardware
• Linux is preferred
• Rich choice of compilers is now available
• Need to learn about portability issues
What is known about compilers for IA-32?
• CMAQ releases switch compilers w/o comment
• Where is the analysis of differences in:
  • Performance?
  • Numerical accuracy & stability?
  • Portability problems?
Choice of Hardware & Compilers
Hardware
• Intel Pentium III (933 MHz, dual processor) with SSE extensions and 256 KB L2 cache
• Linux 2.4.20 kernel
Fortran compilers for IA-32
• Absoft 8.0
• Intel 7.1
• Lahey 5.6
• Portland CDK 4.0
Choice of Benchmarks
Kallman Integer and Logical Algorithm
• Uses only integer (I) and logical (L) operations with bit intrinsics
• Negligible I/O and memory operations
• Six cases with problem size scaling
Stommel Ocean Model (SOM): single-precision (sp) floating-point algorithm
• Jacobi iteration sweep over a 2-D physical domain
• Regular loops, optimal for testing vectorization
• Six cases in the range N = 2×10³ to 7×10³, with N² = 4 to 49 million data points
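To make the "Jacobi iteration sweep" concrete, here is a minimal sketch of one sweep over a 2-D grid. It is not the Stommel Ocean Model code: it assumes a generic five-point Poisson stencil with fixed boundary values, and the names `jacobi_sweep`, `solve`, `u`, `f`, and `h` are illustrative only.

```python
def jacobi_sweep(u, f, h):
    """Return a new grid after one Jacobi sweep of a five-point stencil.

    u: current iterate as a list of rows (boundary values are held fixed)
    f: right-hand side on the same grid (assumed Poisson problem)
    h: grid spacing, assumed uniform in both directions
    """
    n_rows, n_cols = len(u), len(u[0])
    new = [row[:] for row in u]  # copy so the boundary rows and columns carry over
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            # Every interior point reads only the previous iterate, so the
            # inner loop has no loop-carried dependence and can be vectorized.
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                + u[i][j - 1] + u[i][j + 1]
                                - h * h * f[i][j])
    return new


def solve(u, f, h, tol=1e-6, max_sweeps=10_000):
    """Repeat Jacobi sweeps until the largest point update falls below tol."""
    for sweep in range(1, max_sweeps + 1):
        new = jacobi_sweep(u, f, h)
        change = max(abs(new[i][j] - u[i][j])
                     for i in range(len(u)) for j in range(len(u[0])))
        u = new
        if change < tol:
            return u, sweep
    return u, max_sweeps
```

Because each update depends only on the previous iterate, the doubly nested loop is the kind of regular loop that a vectorizing compiler can handle well, which is why such a kernel is a natural vectorization test.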
Choice of Benchmarks (cont.)
Princeton Ocean Model (POM): double-precision (dp) floating-point algorithm
• Example of “real-world” code that is numerically unstable with sp arithmetic!
• 500+ vectorizable loops to exercise compilers
• 9 procedures account for 85% of CPU time
• 2-day simulation for two cases:
  • Small problem: 65 x 49 x 21
  • Large problem: 100 x 40 x 15
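The sensitivity to precision can be illustrated with a toy accumulation. This sketch has nothing to do with POM itself: it simply adds 0.1 repeatedly, once with every intermediate result rounded to IEEE single precision and once in double precision, and compares the error. The helper names `to_single` and `accumulate` are made up for this example.

```python
import struct


def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step, count, single):
    """Add `step` to a running total `count` times.

    When `single` is True, the step and every partial sum are rounded
    to single precision, mimicking sp arithmetic.
    """
    inc = to_single(step) if single else step
    total = 0.0
    for _ in range(count):
        total += inc
        if single:
            total = to_single(total)
    return total


exact = 10_000.0  # 0.1 added 100,000 times in exact arithmetic
sp_error = abs(accumulate(0.1, 100_000, single=True) - exact)
dp_error = abs(accumulate(0.1, 100_000, single=False) - exact)
print(f"sp error: {sp_error:.3e}")
print(f"dp error: {dp_error:.3e}")
```

In a long time-stepping simulation, rounding errors of this kind can accumulate over many steps, which is one plausible reason a code may need dp arithmetic.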
Comparing Execution Times: Kallman compiler switches
| Compiler and version | Compiler command and selected switches |
| --- | --- |
| Absoft 8.0 | `f90 -O3 -ffixed` |
| Intel 7.1 | `ifc -O3 -tpp6 -FI` |
| Lahey 5.6 | `lf95 -tpp -fix` |
| Portland 4.0 | `pgf90 -fast` |
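The timings that follow are wall-clock times. As a rough sketch of how such a measurement could be scripted, the snippet below records the compile commands from the table and times an arbitrary command. The source file name `kallman.f` is hypothetical, the helper `wall_time` is illustrative, and nothing here is taken from the author's actual procedure.

```python
import subprocess
import sys
import time

# Commands and switches as listed in the table above.
# "kallman.f" is a hypothetical source file name used only for illustration.
COMPILE_COMMANDS = {
    "Absoft 8.0": ["f90", "-O3", "-ffixed", "kallman.f"],
    "Intel 7.1": ["ifc", "-O3", "-tpp6", "-FI", "kallman.f"],
    "Lahey 5.6": ["lf95", "-tpp", "-fix", "kallman.f"],
    "Portland 4.0": ["pgf90", "-fast", "kallman.f"],
}


def wall_time(command):
    """Run a command to completion and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any of the compilers
    # installed; replace it with the benchmark executable in practice.
    elapsed = wall_time([sys.executable, "-c", "pass"])
    print(f"wall time: {elapsed:.3f} s")
```

Wall-clock time includes everything the process waits on, so repeated runs on an otherwise idle machine give more comparable numbers.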
Comparing Execution Times: Kallman (seconds)
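| N | Absoft | Intel | Lahey | Portland |
| --- | --- | --- | --- | --- |
| 30 | 0.21 | 0.36 | 0.48 | 0.60 |
| 44 | 40.38 | 80.19 | 98.45 | 135.29 |
| 48 | 6.44 | 13.15 | 16.16 | 22.52 |
| 52 | 23.03 | 48.20 | 59.30 | 83.28 |
| 56 | 197.78 | 412.83 | 509.31 | 712.42 |
| 60 | 12891.58 | 26734.09 | 32833.08 | 45451.38 |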
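The ratio-to-Absoft comparison can be reproduced directly from the timings above. This is a small sketch using the tabulated values; the function name `ratios_to_baseline` is chosen for this example.

```python
# Kallman wall times in seconds, copied from the table above.
kallman_seconds = {
    30: {"Absoft": 0.21, "Intel": 0.36, "Lahey": 0.48, "Portland": 0.60},
    44: {"Absoft": 40.38, "Intel": 80.19, "Lahey": 98.45, "Portland": 135.29},
    48: {"Absoft": 6.44, "Intel": 13.15, "Lahey": 16.16, "Portland": 22.52},
    52: {"Absoft": 23.03, "Intel": 48.20, "Lahey": 59.30, "Portland": 83.28},
    56: {"Absoft": 197.78, "Intel": 412.83, "Lahey": 509.31, "Portland": 712.42},
    60: {"Absoft": 12891.58, "Intel": 26734.09, "Lahey": 32833.08,
         "Portland": 45451.38},
}


def ratios_to_baseline(times, baseline="Absoft"):
    """Divide each compiler's time by the baseline compiler's time."""
    base = times[baseline]
    return {name: t / base for name, t in times.items() if name != baseline}


for n, times in kallman_seconds.items():
    ratios = ratios_to_baseline(times)
    print(n, {name: round(r, 2) for name, r in ratios.items()})
```

A ratio above 1 means the compiler's executable ran slower than the Absoft one for that case.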
Comparing Execution Times: Kallman (log10 seconds)
[Figure: Kallman Integer & Logical Algorithm (PIII 933 MHz). Log10 of wall time (seconds) versus case (1 to 6) for the Absoft, Intel, Lahey, and Portland compilers.]
Comparing Execution Times: Kallman (ratio to Absoft time)
[Figure: Kallman Integer & Logical Algorithm (PIII 933 MHz). Ratio to Absoft time versus case (1 to 6) for Intel / Absoft, Lahey / Absoft, and Portland / Absoft.]
Comparing Execution Times: SOM (POM) compiler switches (without SSE)

| Compiler and version | Compiler command and selected switches |
| --- | --- |
| Absoft 8.0 | `f90 -s -cpu:p6 -O3 (-N113) -ffixed` |
| Intel 7.1 | `ifc -O3 (-r8) -tpp6 -FI` |
| Lahey 5.6 | `lf95 -tpp (-dbl) -fix` |
| Portland 4.0 | `pgf90 -fast (-r8) -Mvect` |

Switches in parentheses apply to POM only.
Comparing Execution Times: SOM without SSE (seconds)
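| N | Absoft | Intel | Lahey | Portland |
| --- | --- | --- | --- | --- |
| 2000 | 50.0 | 38.8 | 36.4 | 41.4 |
| 3000 | 110.5 | 94.4 | 87.7 | 92.7 |
| 4000 | 197.7 | 159.6 | 150.3 | 163.3 |
| 5000 | 305.3 | 224.3 | 246.8 | 253.1 |
| 6000 | 443.4 | 320.0 | 332.0 | 388.5 |
| 7000 | 586.5 | 427.6 | 477.9 | 524.4 |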
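The spread across compilers for each problem size can be summarized with a mean, a standard deviation, and a coefficient of variation (standard deviation divided by mean). The sketch below computes these from the SOM timings above. It assumes the sample standard deviation (n - 1 denominator); the source does not say which form was used, so the values may differ slightly from the plotted ones.

```python
import statistics

# SOM wall times in seconds (Absoft, Intel, Lahey, Portland), from the table above.
som_seconds = {
    2000: [50.0, 38.8, 36.4, 41.4],
    3000: [110.5, 94.4, 87.7, 92.7],
    4000: [197.7, 159.6, 150.3, 163.3],
    5000: [305.3, 224.3, 246.8, 253.1],
    6000: [443.4, 320.0, 332.0, 388.5],
    7000: [586.5, 427.6, 477.9, 524.4],
}

stats = {}
for n, times in som_seconds.items():
    mean = statistics.mean(times)
    # Assumption: sample standard deviation; use statistics.pstdev for
    # the population form instead.
    sd = statistics.stdev(times)
    stats[n] = {"mean": mean, "sd": sd, "cov": sd / mean}
    print(f"N={n}: mean={mean:.1f} s, sd={sd:.1f} s, cov={sd / mean:.3f}")
```

A small coefficient of variation means the four compilers produce executables with similar run times for that case.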
Comparing Execution Times: SOM (without SSE)
[Figure: SOM Floating Point Algorithm (PIII 933 MHz). Wall time (seconds) versus case (1 to 6) for the Absoft, Intel, Lahey, and Portland compilers.]
Statistics for four compilers: SOM (without SSE)
[Figure: SOM Floating Point Algorithm (PIII 933 MHz): statistics for four compilers. Mean, standard deviation, and coefficient of variation x 1000 of wall time (seconds) versus case (1 to 6).]
Comparing Execution Times: POM (without SSE)
[Figure: POM Floating Point Algorithm (PIII 933 MHz). Wall time (seconds) versus case (1 and 2) for the Absoft, Intel, Lahey, and Portland compilers.]

| Case | Absoft | Intel | Lahey | Portland |
| --- | --- | --- | --- | --- |
| 1 | 909.1 | 826.4 | 728.8 | 836.3 |
| 2 | 825.1 | 786.9 | 671.2 | 755.3 |
Statistics for four compilers: Variability vs. problem size
[Figure: Coefficient of variation for four compilers (PIII 933 MHz, without SSE). Standard deviation / mean versus case (1 to 6) for Kallman, SOM, and POM.]
Evaluation of SSE Results
IA-32 hardware
• Intel Pentium III+ supports Streaming Single-Instruction-Multiple-Data (SIMD) Extensions (SSE)
• Linux 2.4.20 kernel supports SSE
Fortran compilers that enable SSE
• Intel 7.1
• Portland CDK 4.0
Comparing Execution Times: SOM (POM) compiler switches (with SSE)

| Compiler and version | Compiler command and selected switches |
| --- | --- |
| Intel 7.1 | `ifc -O3 -xK (-r8) -tpp6 -FI` |
| Portland 4.0 | `pgf90 -fast (-r8) -Mvect=sse` |

Switches in parentheses apply to POM only.
Comparing Execution Times: SOM (with SSE)
[Figure: SOM Floating Point Algorithm (PIII 933 MHz). Wall time (seconds) versus case (1 to 6) for Intel, Intel (SSE), Portland, and Portland (SSE).]
Comparing Execution Times: POM (with SSE)
[Figure: POM Floating Point Algorithm (PIII 933 MHz). Wall time (seconds) versus case (1 and 2) for Intel, Intel (SSE), Portland, and Portland (SSE).]
Evaluation of SSE Results
Fortran compilers with SOM (sp)
• Intel 7.1: average speedup of 1.44
• Portland CDK 4.0: average speedup of 1.70
Fortran compilers with POM (dp)
• Intel 7.1: average speedup of 1.25
• Portland CDK 4.0: average speedup of 1.19
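Speedup here means the time without SSE divided by the time with SSE. As a back-of-the-envelope sketch, the snippet below inverts that relation to estimate what the SOM run times with SSE would be if the average speedup applied uniformly to every case. That uniformity is an assumption made only for this example; the outputs are estimates, not measured values from the source.

```python
# SOM wall times without SSE in seconds, from the earlier table (cases N=2000..7000).
som_without_sse = {
    "Intel": [38.8, 94.4, 159.6, 224.3, 320.0, 427.6],
    "Portland": [41.4, 92.7, 163.3, 253.1, 388.5, 524.4],
}

# Average SOM speedups reported above.
average_speedup = {"Intel": 1.44, "Portland": 1.70}


def estimated_time_with_sse(time_without, speedup):
    """Invert speedup = time_without / time_with to get time_with."""
    return time_without / speedup


estimates = {
    name: [estimated_time_with_sse(t, average_speedup[name]) for t in times]
    for name, times in som_without_sse.items()
}

for name, values in estimates.items():
    # Rough estimates only: the per-case speedups are not given in the source.
    print(name, [round(v, 1) for v in values])
```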
Tests for AQM’s
Next steps for CMAQ with four compilers:
• Report on portability issues
• Re-compilation of all libraries
• Performance instrumentation & analysis
• Numerical & stability analysis
• OpenMP performance study
Please propose scenarios worth using for these tests!
Conclusions
• Hardware: COTS is the way to go, but ...
• Linux: the operating system is popular, but ...
• Programming environment: rich in choices
• Consequences for AQM: the combination of hardware, Linux, and programming environment needs careful ongoing evaluation.
HiPERiSM is ready for this task!
HiPERiSM’s URL
http://www.hiperism.com
Talk to us about your requirements