http://www.gsic.titech.ac.jp/sc18
Inter-Cloud Infrastructure for Big Data Analysis
[Overview figure: inter-cloud infrastructure connecting
- Academic Cloud @ Hokkaido Univ.
- Cloud Resource @ NII
- National Institute of Genetics
- Cloud Platform @ Kyushu Univ.
- TSUBAME3.0 @ Tokyo Tech
- Public Cloud Resource @ SINET Inter-Cloud, etc.
backed by an exabyte DB on Global Object Storage and TSUBAME3 big-data storage, over a high-bandwidth, secure network by SINET5: supercomputer and inter-cloud resources are tightly connected via SINET L2VPN.]
Background
- HPC: HPC users can cause congestion on the PFS, and not all HPC systems have a fast PFS.
- Cloud: cloud storage cannot provide enough I/O throughput for data-intensive applications, and offers only a loose consistency model.

System
- HuronFS (Hierarchical, UseR-level and ON-demand File System) as a new tier in the storage hierarchy.
- Several dedicated nodes/instances act as burst buffers, accelerating accesses by buffering intermediate data.
- Implemented on CCI: utilizes the high-performance networks in HPC while remaining highly portable.
- Implemented with FUSE: supports POSIX, so no code modification is required (see the I/O sketch after this list).
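Because HuronFS is mounted through FUSE and exposes POSIX semantics, applications reach the burst-buffer tier with ordinary file I/O. A minimal illustration (the mount point path below is an assumption, not a fixed HuronFS path):

import os

MOUNT = "/mnt/huronfs"  # illustrative mount point, not mandated by HuronFS

# Plain POSIX-style file I/O; no application changes are required because
# FUSE presents HuronFS as a regular file system.
with open(os.path.join(MOUNT, "intermediate.dat"), "wb") as f:
    f.write(b"checkpoint data" * 1024)   # absorbed by the burst buffers

with open(os.path.join(MOUNT, "intermediate.dat"), "rb") as f:
    data = f.read()                      # served back from the IOnodes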
[Architecture figure: a compute node X running HuronFS uses consistent hashing to map each file to one of M sub-HuronFS groups (SHFS 0 ... SHFS M-1); each SHFS consists of one Master and multiple IOnodes, all backed by the Parallel File System / Shared Cloud Storage.]
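The consistent hashing in the figure above is what lets a compute node locate, without central coordination, which SHFS owns a given file. A minimal Python sketch of the idea (class, parameter, and path names are illustrative, not HuronFS's actual code):

import hashlib
from bisect import bisect

class ConsistentHashRing:
    """Map file paths to SHFS masters via a hash ring with virtual nodes."""
    def __init__(self, masters, vnodes=64):
        # Place each master at several virtual points on the ring.
        self.ring = sorted((self._hash(f"{m}#{v}"), m)
                           for m in masters for v in range(vnodes))
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def lookup(self, path):
        # First master clockwise from the path's position on the ring.
        i = bisect(self.keys, self._hash(path)) % len(self.keys)
        return self.ring[i][1]

ring = ConsistentHashRing([f"SHFS-{i}" for i in range(4)])
print(ring.lookup("/scratch/job42/intermediate.dat"))  # e.g. "SHFS-2"

With virtual nodes, adding or removing an SHFS group remaps only a small fraction of files, which suits an on-demand burst-buffer tier.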
HuronFS: A New Storage Hierarchy for HPC & Cloud
Evaluation of HPC-Big Data Applications

Background
Cloud platforms exhibit elasticity, flexibility, usability, and scalability, which have been attracting users to these environments as a cost-effective way to run their applications or businesses. However, the feasibility of running high-performance computing applications on clouds has always been a concern, mainly due to virtualization overheads and the high latency of the interconnection network.

Goals
- Investigate the potential role of these virtual machines in addressing the needs of HPC and data-intensive workloads.

Experimentation
- Performance evaluation of applications on AWS C4 instances against baseline results from a supercomputer, TSUBAME-KFC.
Acknowledgments. This research is supported by JST CREST Grant Numbers JPMJCR1303 and JPMJCR1501 (Research Area: Advanced Core Technologies for Big Data Integration).
* 2 MPI ranks and 12 OpenMP threads per node achieved the best performance on TSUBAME-KFC; on AWS EC2, the best configuration was 8 MPI ranks and 12 OpenMP threads per instance.
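These starred configurations amount to a standard hybrid MPI+OpenMP launch. A hedged sketch of how such a run could be driven (mpirun flags follow Open MPI conventions; the binary name and launcher wrapper are assumptions, not the poster's actual scripts):

import os
import subprocess

def launch(nodes, ranks_per_node, omp_threads, binary="./nicam-dc-mini"):
    """Launch a hybrid MPI+OpenMP job; illustrative only."""
    env = dict(os.environ, OMP_NUM_THREADS=str(omp_threads))
    cmd = ["mpirun", "-np", str(nodes * ranks_per_node),
           "--map-by", f"ppr:{ranks_per_node}:node", binary]
    subprocess.run(cmd, env=env, check=True)

# launch(16, 2, 12)  # best on TSUBAME-KFC: 2 ranks x 12 threads per node
# launch(16, 8, 12)  # best on AWS EC2: 8 ranks x 12 threads per instance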
Graph500 Benchmark at Scale 26
NICAM-DC-MINI: a compute-intensive miniapp from the Fiber miniapp suite
TSUBAME-KFC node: Intel Xeon E5-2620 v2 ×2, InfiniBand FDR
Amazon EC2 c4.8xlarge instance: Intel Xeon E5-2666 v3, 10 Gbps Ethernet
[Figure: Graph500 execution time (sec), broken into Computation, Communication, and Stall, vs. number of nodes/instances (1, 4, 16; 10 MPI ranks each), on TSUBAME-KFC (left) and AWS EC2 c4.8xlarge instances (right).]
[Figure: NICAM-DC-MINI execution time (sec), split into App Time and MPI Time, for 2-16 nodes/instances: TSUBAME-KFC with 2 MPI ranks and 12 OpenMP threads per node (left), AWS EC2 c4.8xlarge with 8 MPI ranks and 12 OpenMP threads per instance (right).]
[Figure: throughput (MiB/s) of CloudBB read/write vs. Amazon S3 read/write, as the number of compute nodes (= number of IONs) scales from 1 to 64.]
[Figure: multiple-client sequential I/O performance: write and read throughput (MiB/s) vs. number of nodes (1-16, 8 processes per node).]
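For context on what a multiple-client sequential I/O test measures, here is a minimal sketch of a per-client sequential-write probe (the benchmark actually used is not specified on the poster; the path and sizes below are illustrative):

import os
import time

def seq_write_mib_s(path, total_mib=1024, chunk_mib=5):
    """Stream fixed-size chunks to `path` and report achieved MiB/s."""
    chunk = os.urandom(chunk_mib * 1024 * 1024)
    n = total_mib // chunk_mib
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # include flush-to-storage in the timing
    return n * chunk_mib / (time.perf_counter() - start)

print(f"{seq_write_mib_s('/mnt/huronfs/bench.dat'):.1f} MiB/s")

The aggregate throughput in the plot is the sum over all client processes, with each of the 8 processes per node presumably running such a stream concurrently.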
Building a Testbed Infrastructure on the Overlay Cloud
- Using the SINET5 network infrastructure (100 Gbps network)
- Cooperation between cloud and supercomputer
- Providing the science data repository
- Testbed: petabyte-class object storage; real system: exabyte-class object storage

Cloud and Supercomputer Federation
- Development of collaboration technology between cloud and supercomputer, using TSUBAME 3.0 as a test case
Docker containers: applying cloud container technology to the supercomputer for resource cooperation with the inter-cloud.
Code available at: https://github.com/EBD-CREST/HuronFS
System: Amazon EC2 Tokyo Region; instance type: m3.xlarge; cloud storage: Amazon S3; mount method: s3fs; chunk size: 5 MB; client local buffer size: 100 MB
[Evaluation platforms: TSUBAME2 and Amazon Web Services]