what is gridftp? l high-performance, reliable data transfer protocol optimized for high-bandwidth...

32
What is GridFTP? High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks Based on FTP protocol - defines extensions for high-performance operation and security We supply a reference implementation: Server Client tools (globus-url-copy) Development Libraries Multiple independent implementations can interoperate Fermi Lab and U. Virginia have home grown servers that work with ours.

Upload: laurel-norris

Post on 05-Jan-2016

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

What is GridFTP?

High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks

Based on FTP protocol - defines extensions for high-performance operation and security

We supply a reference implementation: Server Client tools (globus-url-copy) Development Libraries

Multiple independent implementations can interoperate Fermi Lab and U. Virginia have home grown servers that

work with ours.

Page 2: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GridFTP

Two channel protocol like FTP Control Channel

Communication link (TCP) over which commands and responses flow

Low bandwidth; encrypted and integrity protected by default

Data Channel Communication link(s) over which the

actual data of interest flows High Bandwidth; authenticated by

default; encryption and integrity protection optional

Page 3: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Performance

Disk transfer between Urbana, IL and San Diego, CA

0

5

10

15

20

0 10 20 30 40 50 60 70

Degree of Striping

Throughput (Gbit/s) # Stream = 1 # Stream = 2 # Stream = 4# Stream = 8 # Stream = 16 # Stream = 32

Page 4: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GridFTP over UDT

GridFTP uses XIO for network I/O operations XIO presents a POSIX-like interface to many

different protocol implementations

GSI

TCP

Default GridFTP

GridFTP over UDT

GSI

UDT

Page 5: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Reliable File Transfer Service (RFT)

RFT Service

RFT Client

SOAP Messages

Notifications(Optional)

GridFTP Server

GridFTP Server

CC CC

DC

Persistent Store

GridFTP client WSRF complaint fault-tolerant service

Page 6: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Architecture Overview

RFTClient

VO 1

RFT Service

GridFTPStriped Server

GridFTPClient A

GridFTPServer

GridFTPServer GridFTP

Client BBB

VO 2

GridFTPStriped Server

GridFTPServer

RFT

A

Page 7: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GridFTP Service

On demand transfer service When a connection is formed, resources are dedicated GridFTP might say “not now” Not a queuing service

Transfer data as fast as possible Maximize resource usage

Without over heating!

Page 8: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

What GridFTP Does

Fast data transfer service Cluster to cluster copy tool

Intra-cluster broadcast tool Multi-cast transfers

Scalable Need more throughput, add more stripes

Page 9: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

RFT Service

Orchestrates transfers on client’s behalf Third party transfers Interacts with many GridFTP servers

Sees a bigger picture VO level

Queue requests RFT should not say no

Retry requests on failure Optimizes its workload

Page 10: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

What RFT Does

Reliable service DB backend Recovers from GridFTP and RFT service failures

Batch requests Light weight sessions Submit a Request Wait for notifications

Started, finished, failed, etc

Page 11: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GridFTP: On Demand Service

Resources are limited Data transfers are heavy weight operations Sometimes hardware is too busy

Adding another transfer can cause thrashing Collective system throughput goes down

GridFTP might say “no”

Transfer requests happen immediately We do not queue, or delay transfers An established session means an active transfer

Page 12: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Why Doesn’t GridFTP Queue

A GridFTP session is heavy weight Idle sessions consume resources

Backward compatible protocol

Sometimes less is more Goal: Maximize the collective throughput

Sum of all active transfer rates Too many transfers cause thrashing

Results in lower collective throughput Avoid overheating system resources

It is in the systems best interest We know what’s good for you

Page 13: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GridFTP Session Resources

Even for an idle session Active TCP control channel

Part of the 959 protocol. A session is defined by a TCP connection

Fork/setuid process

Robustness File system/OS permissions

OS buffer space

Data channels require large TCP OS buffers

Active transfers Lots of memory/Net/Disk IO

Avoid too small of partitions

Page 14: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

If GridFTP Always Said Yes

OOM: the out of memory handle OS optimistic provision of TCP buffers Random processes will be killed Meltdown

Shared FS overuse Pushing the I/O throughput beyond optimal Causing OOM on IOD machines

Shares of bandwidth too small 1 Million transfers at 500b/s each? OR 10 transfers at 100Mb/s each

Page 15: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Simultaneous Sessions

Goal: Collective throughput entire servers bytes transferred / time

Not the number of transfers at once

Only reasons for more than 1 connection Provide an interactive service for many One session does not use all of the local resource

The remote side is the bottleneck Hide control messaging overhead in another sessions data transfer payload

Page 16: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Remote Bottleneck

Allow more than one simultaneous transfer to use all resources

10Gb/s

Page 17: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

But We Want Queuing!

May I offer you something in an RFT? RFT says yes Server side retries

Light weight sessions GridFTP does the heavy lifting Queues up requests of pending transfers Notification upon completion Scalability

Manages/Optimizes access to GridFTP Servers

Page 18: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GridFTPdst

GridFTPsrc

RFT Session Interactions

RFT

GridFTPsrc

GridFTPdst

ClientRequest

Notification

Page 19: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Scalability

GridFTP Connection rejection is a feature

It SHOULD say no Intended to scale to system transfer rates

Not beyond them To scale up add more nodes as stripes (dynamic

backbends) Use faster NICs

RFT Intended to scale to memory

It should not say no

Page 20: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GridFTP Broke My Cluster!

GridFTP will push hardware as hard as it is allowed

But not harder

sudo rm –rf / Did sudo break the FS?

ssh –u root host1 fork.bomb Did sshd take down the host?

globus-url-copy –tcp-bs 100GB <src> <dst>

Did GridFTP break the cluster?

Page 21: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Resource Protection

Limits need to be in place to protect Knowing it is ok to say ‘no’ is step 1

What will hardware allow? How fast are my disks? How fast is my NIC? How fast is can I send data while using the NetFS? How many WAN transfers can I support with system memory? How many simultaneous transfers can are reasonable to sustain?

Page 22: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Fast Transfer Resources

CPU Packet switching

Memory OS buffers (BWDP) User space buffers WAN needs much more

System bus Disk

Shared FS? (net also)

Network Router and LAN

TCP Buffers

CPU

Bus

Page 23: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Cluster Components

Disk Shared I/O servers

Net Backplate bandwidth

Systems CPU/Memory Are IODs and GridFTP servers co

located?Shared IO Servers

GridFTP Backends

GridFTP Frontends

Page 24: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Connection Caps

As a function of system memory Cap = |mem| / (2MB + avg(BWDP)) Never more than |mem| / 4MB

service gsiftp{ instances = 20 socket_type = stream wait = no

env += GLOBUS_LOCATION=… env += LD_LIBRARY_PATH=… server = /usr/local/globus-4.0.1/sbin/globus-gridftp-server server_args = -i -p 2811

disable = no}

% globus-gridftp-server –connection-max 20

Page 25: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Connection Caps

As a function of system bandwidth Cap = min(FS.BW, Net.BW) / (Target average transfer rate)

As a function of my gut 20 - 50 Best guess based on personal experience Typically this is where collective BW plateaus

Page 26: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

System Buffer Limits

Limit the amount of OS space per conneciton

Auto tuning 16MB - 64MB

% sysctl -w net.core.rmem_max=<value>% sysctl -w net.core.wmem_max=<value>

% cat /proc/sys/net/ipv4/tcp_wmem4096 16384 4194304

% cat /proc/sys/net/ipv4/tcp_rmem4096 16384 4194304

Page 27: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

GFork Memory Manager

Dynamically rations memory 10% of the allowed connections get 90% of the memory Remaining session get half of available memory

Allows for high connection limits |mem| / 2MB

Page 28: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Conclusions GridFTP is an on demand service

OK to say no RFT is a VO level queuing service

please use it http://www.gridftp.org [email protected]

Page 29: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

What is OGSA DAI?

OGSA-DAI executes workflows OGSA-DAI is not just for data access, also

does data updates, transformations and delivery.

Page 30: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

The Globus Replica Location Service

Distributed registry Records the locations of data

copies Allows replica discovery RLS maintains mappings

between logical identifiers and target names

Must perform and scale well: support hundreds of millions

of objects hundreds of clients

Mature and stable component of the Globus Toolkit

Replica Location Indexes

Local Replica Catalogs

LRC LRC LRC LRC

RLI RLI RLI

RLI RLI

Page 31: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Approach: Combine Pegasus Workflow Management with Globus Data Replication Service

Workflow Planner: Pegasus

Data Placement

Service: Globus

DRS

Compute Cluster Storage

ElementsJobs Data

Transfer

Workflow Tasks

Staging Request

Setup Transfers

Page 32: What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines

Examples of Placement Policies