![Page 1: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/1.jpg)
ACCELERATED SOFTWARE AS A SERVICE
Mike Houston
Principal Engineer, Mobile and Cloud Computing
NVIDIA
![Page 2: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/2.jpg)
GPU ACCELERATION AS A SERVICE
Easy datacenter deployment and use
GPUs are new to the datacenter
— Lots of acceleration potential
— Low-level libraries can be challenging to use and deploy
Provide acceleration through a simple REST API
— Equivalent of a dynamic library in the datacenter
— Wrap all functionality and technical magic
— Easier integration with deployed SW infrastructure
![Page 3: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/3.jpg)
IMAGE RESIZING
![Page 4: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/4.jpg)
Challenge:
Heterogeneity at Increasing Scale
Photo by Brad Frost / CC BY 2.0
![Page 5: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/5.jpg)
Common Ingest Pipeline of today
2K-res photo -> JPEG decode -> Resize -> JPEG encode -> Storage (up to 20 different resolutions)
![Page 6: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/6.jpg)
Facebook + Instagram: 400M Photo Uploads Per Day
[Chart: number of photo uploads per day (in millions), 2009 to 2012; y-axis from 0 to 450]
![Page 7: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/7.jpg)
Photo Delivery
[Chart: number of photos per day (in millions), 2009 to 2012; y-axis from 0 to 25,000]
400M Uploads / Day
25B Views / Day
(web + mobile)
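To put these volumes in per-second terms, here is a rough back-of-the-envelope calculation. It uses only the figures quoted on the slides (400M uploads/day, 25B views/day, up to 20 resolutions per photo) and assumes, as an upper bound, that every upload is resized to all 20 resolutions and that load is spread evenly over the day. Real traffic is burstier, so peaks would be higher.

```python
# Back-of-the-envelope rates from the slide figures.
# Assumption: load is uniform over 24 hours and every upload
# is resized to the maximum of 20 resolutions (an upper bound).
SECONDS_PER_DAY = 24 * 60 * 60

UPLOADS_PER_DAY = 400_000_000      # 400M uploads / day
VIEWS_PER_DAY = 25_000_000_000     # 25B views / day
MAX_RESOLUTIONS = 20               # "up to 20 different resolutions"


def per_second(per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


uploads_per_sec = per_second(UPLOADS_PER_DAY)
resizes_per_sec = uploads_per_sec * MAX_RESOLUTIONS
views_per_sec = per_second(VIEWS_PER_DAY)

if __name__ == "__main__":
    print(f"uploads/s: {uploads_per_sec:,.0f}")   # about 4,630
    print(f"resizes/s: {resizes_per_sec:,.0f}")   # about 92,593
    print(f"views/s:   {views_per_sec:,.0f}")     # about 289,352
```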
![Page 8: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/8.jpg)
WHY FAST RESIZING
Generate multiple sizes on ingest
— Reduce server count and handle higher loads
— Afford to do higher quality filtering
Generate resizes on the fly
— Reduce storage costs, especially for the long tail
— Offer lots of different options for image manipulation
— Remove the need for predefined breakpoints
— Optimize for a given target device
Less bandwidth, faster page load
Requires high throughput, low latency solution -> GPUs
![Page 9: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/9.jpg)
HOW?
Key topics we’ll cover
— Request handling
— Resource access
— Performance
— Deployment scenarios
Punch-line for GPU offload
— Efficient management and scheduling of resources is critical
![Page 10: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/10.jpg)
REST SYNTAX EXAMPLES – HTTP GET
URL layout: /<operation>/<width>x<height>/path/to/file.[jpg|jpeg]?param1=value1[;|&]param2=value2...
Examples:
http://sass.com/resize/1024x1024/pix/Growth_of_cubic_bacteria_25x16.jpg
http://sass.com/resize/800x/pix/example.jpg?quality=50&crop=10x10x100x100
http://sass.com/resize/x/pix/example.jpg?quality=75&type=progressive&optimize=yes
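A minimal sketch of how a server might parse this URL layout. The function name `parse_request`, the returned dictionary shape, and the choice to represent an omitted width or height as `None` are all assumptions for illustration. The slides do not say how the real service parses requests or what an empty dimension means.

```python
import re
from urllib.parse import urlsplit

# /<operation>/<width>x<height>/path/to/file.[jpg|jpeg]
# Width and height may each be empty, as in "800x" or "x".
_PATH = re.compile(
    r"^/(?P<op>[a-z]+)/(?P<w>[0-9]*)x(?P<h>[0-9]*)/(?P<file>.+\.(?:jpg|jpeg))$"
)


def parse_request(url: str) -> dict:
    """Split a request URL into operation, size, file path, and parameters.

    Raises ValueError if the path does not match the documented layout.
    """
    parts = urlsplit(url)
    match = _PATH.match(parts.path)
    if match is None:
        raise ValueError(f"path does not match layout: {parts.path!r}")

    # Parameters may be separated by ';' or '&'.
    params = {}
    for pair in re.split(r"[;&]", parts.query):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        params[key] = value

    return {
        "operation": match["op"],
        "width": int(match["w"]) if match["w"] else None,
        "height": int(match["h"]) if match["h"] else None,
        "file": match["file"],
        "params": params,
    }


if __name__ == "__main__":
    print(parse_request(
        "http://sass.com/resize/800x/pix/example.jpg?quality=50&crop=10x10x100x100"
    ))
```

Parameter values are left as strings here; validating them (for example, range-checking `quality`) would be a separate step.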
![Page 11: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/11.jpg)
REQUEST HANDLING 1
Need something that can handle heavy request load
— Sometimes referred to as the “C10k” problem
— Leads you towards event-based solutions
Need deep control over resources and scheduling
— Leads you towards thread-based solutions
Recommendation: use a hybrid solution
— Event model for asynchronous IO
High connection concurrency and good failover handling
— Thread pool for GPU access
Allocations, streams, transfers, and kernels belong to a worker thread
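The hybrid model above can be sketched as follows, assuming Python's `asyncio` as the event loop and a `ThreadPoolExecutor` as the worker pool. No GPU is touched here: `_init_worker` and `_resize_on_worker` are hypothetical stand-ins for the places where per-thread allocations, streams, transfers, and kernel launches would live. The talk itself points to Boost::ASIO for a real implementation.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

# Per-thread state: each worker thread owns its own resources.
_worker = threading.local()


def _init_worker() -> None:
    # Hypothetical stand-in for creating a stream and device buffers
    # once per worker thread.
    _worker.name = threading.current_thread().name
    _worker.jobs_done = 0


def _resize_on_worker(job_id: int):
    # Hypothetical stand-in for transfers and kernel launches; it only
    # uses state owned by the current worker thread.
    _worker.jobs_done += 1
    return job_id, _worker.name


async def handle_request(pool: ThreadPoolExecutor, job_id: int):
    # The event loop stays free to accept connections while a worker
    # thread does the blocking work.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, _resize_on_worker, job_id)


async def serve(num_requests: int, num_workers: int = 2):
    with ThreadPoolExecutor(
        max_workers=num_workers,
        thread_name_prefix="gpu-worker",
        initializer=_init_worker,
    ) as pool:
        return await asyncio.gather(
            *(handle_request(pool, i) for i in range(num_requests))
        )


if __name__ == "__main__":
    for job_id, worker_name in asyncio.run(serve(8)):
        print(job_id, worker_name)
```

The event loop handles connection concurrency, while the fixed-size pool bounds how many requests touch GPU resources at once.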
![Page 12: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/12.jpg)
REQUEST HANDLING 2
Scheduling the GPUs is critical
— Want overlapped communication and computation
— Schedule to multiple GPUs
Load balancing
Split CPU and GPU load
— Split the processing pipeline according to which core makes the most sense
— CPU
JPEG parsing and decode
— GPU
DCT/IDCT, resize, filters, compression, etc.
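As one possible illustration of load balancing across multiple GPUs, here is a sketch of a least-loaded dispatcher. The slides do not state which policy the service uses. The class name `LeastLoadedScheduler` and the "fewest jobs in flight" rule are assumptions chosen for simplicity.

```python
import threading


class LeastLoadedScheduler:
    """Pick the GPU with the fewest jobs in flight (ties go to the lowest index)."""

    def __init__(self, num_gpus: int) -> None:
        if num_gpus < 1:
            raise ValueError("need at least one GPU")
        self._in_flight = [0] * num_gpus
        self._lock = threading.Lock()  # worker threads call this concurrently

    def acquire(self) -> int:
        """Reserve a slot on the least-loaded GPU and return its index."""
        with self._lock:
            gpu = min(range(len(self._in_flight)),
                      key=self._in_flight.__getitem__)
            self._in_flight[gpu] += 1
            return gpu

    def release(self, gpu: int) -> None:
        """Mark one job on the given GPU as finished."""
        with self._lock:
            if self._in_flight[gpu] == 0:
                raise ValueError(f"GPU {gpu} has no jobs in flight")
            self._in_flight[gpu] -= 1


if __name__ == "__main__":
    sched = LeastLoadedScheduler(num_gpus=2)
    first, second = sched.acquire(), sched.acquire()
    print(first, second)  # one job lands on each GPU
    sched.release(first)
    sched.release(second)
```

A worker thread would call `acquire()` before submitting work and `release()` once the result has been copied back.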
![Page 13: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/13.jpg)
RESOURCE ACCESS
Local files
— Async file I/O
HTTP backend
— Talk to backend servers – Amazon S3, webserver
— Need robust failover handling – more on this later
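One simple form of failover handling is to try a list of backends in order and move on when one fails. The sketch below assumes that policy. The slides do not describe the actual mechanism, and the names `fetch_with_failover` and `http_get` are hypothetical. The fetch function is injectable so the logic can be exercised without a network.

```python
import urllib.request


def http_get(url: str, timeout: float = 2.0) -> bytes:
    """Fetch a URL with a timeout; network errors surface as OSError subclasses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_failover(path: str, backends, fetch=http_get) -> bytes:
    """Try each backend base URL in order and return the first successful body.

    Raises OSError listing every failure if no backend succeeds.
    """
    errors = []
    for base in backends:
        try:
            return fetch(base + path)
        except OSError as exc:  # URLError, timeouts, connection resets
            errors.append((base, exc))
    raise OSError(f"all {len(errors)} backends failed for {path!r}: {errors}")


if __name__ == "__main__":
    # Fake fetcher for demonstration: the first backend is "down".
    def fake_fetch(url: str) -> bytes:
        if url.startswith("http://primary.invalid"):
            raise ConnectionError("primary unreachable")
        return b"image-bytes"

    body = fetch_with_failover(
        "/pix/example.jpg",
        ["http://primary.invalid", "http://secondary.invalid"],
        fetch=fake_fetch,
    )
    print(body)
```

In an event-driven server the fetch would be asynchronous rather than blocking; the ordering logic stays the same.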
![Page 14: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/14.jpg)
PLACES TO START - HTTP
Full solutions to plug into
— Nginx
— Apache
— Varnish
Roll your own
![Page 15: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/15.jpg)
ROLLING YOUR OWN HTTP SERVER
Why?
— No policy imposed on you
— Better control of GPU resources and scheduling
How?
— Boost::ASIO examples
“HTTP Server 3”
— Single IO service and a thread pool
“HTTP Client”
— Asynchronous communication with server
— HTTP request/response parsing
http-parser (Joyent)
— NVIDIA NPP
Fundamental image processing routines
![Page 16: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/16.jpg)
CRITICAL OPTIMIZATIONS
JPEG parser
— Can quickly become CPU bottleneck when GPU is doing the majority of the work
Huffman decode
— Decode can be quite expensive, especially progressive decode
Huffman tree build – progressive output support
— Not generally GPU friendly, but offload here helps reduce CPU load
Latency, latency, latency
— No batching allowed
latency kills
— Critical scheduling of transfers, kernels, and allocation
No blocking CPU or GPU, ever
![Page 17: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/17.jpg)
HOW FAST?
Compared to GraphicsMagick 1.3.19 + OpenMP + scheduling optimizations
AWS EC2 g2.2xlarge
— ~5X the throughput, ~5.5X better latency
Dual E5-2667 + 2 Tesla K10s
— ~23X the throughput, ~20X better latency
![Page 18: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/18.jpg)
WEB INFRASTRUCTURE CAN BE TRICKY
There is the HTTP specification and there is reality
— You must be super strict when interacting with clients
— You must be pretty lax when interacting with servers
Network stack tuning
— Still needed for heavy connection loads
— Some distributions have “modes” for this – see RHEL
Load test!
— Different tools hit you differently
Siege – generate high load and beat on systems
ApacheBench – easy to use and good statistics
Iago – can generate consistent transaction load and excellent statistics
Security
![Page 19: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/19.jpg)
QUICK INTRODUCTION TO SECURITY
Errors in your SW you really want to prevent
— Denial of service attacks
— Privilege escalation
— Information leakage
Server lockdown
— Do everything you can at the OS level
Lots of scripts out there
Make sure you are up to date
— Lock down IPs you talk to
If talking only to localhost, lock it down to localhost
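As a small illustration of the last point, a service that should only be reachable from the same machine can bind its listening socket to the loopback address instead of all interfaces. The helper name `listen_local` is hypothetical, and this is only one layer: firewall rules at the OS level would still apply on top.

```python
import socket


def listen_local(port: int = 0, backlog: int = 128) -> socket.socket:
    """Open a TCP listening socket bound to 127.0.0.1 only.

    Binding to 127.0.0.1 rather than 0.0.0.0 means the kernel refuses
    connections arriving on any external interface. Port 0 asks the OS
    to pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    server = listen_local()
    host, port = server.getsockname()
    print(f"listening on {host}:{port}")
    server.close()
```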
![Page 20: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/20.jpg)
QUICK INTRODUCTION TO FUZZING
What you build may be exposed to the whole world
Inject lots of errors and really beat on things
You will likely find
— Segfaults
— Infinite loops
— Races
— Unexpected behavior
Makes your code better
But, makes you bitter
![Page 21: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/21.jpg)
RADAMSA - https://code.google.com/p/ouspg/wiki/Radamsa
Black box fuzzer
— Generate lots of negative testing inputs
— Given exemplar inputs, generate random versions
Great for testing parsers
— HTTP
— JPEG
— Commands
Can take a really long time to trigger issues
Use it! (or alternative)
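To show the idea of mutating exemplar inputs, here is a toy mutation fuzzer. It is not Radamsa and is far less capable. The mutation operators, the `fuzz` harness, and the `parse_dimensions` target are all made up for illustration. The harness treats `ValueError` as a clean rejection and records any other exception as a finding.

```python
import random


def mutate(sample: bytes, rng: random.Random, max_edits: int = 4) -> bytes:
    """Return a randomly damaged copy of an exemplar input."""
    data = bytearray(sample)
    for _ in range(rng.randint(1, max_edits)):
        choice = rng.choice(("flip", "insert", "delete", "repeat"))
        pos = rng.randrange(len(data)) if data else 0
        if choice == "flip" and data:
            data[pos] ^= 1 << rng.randrange(8)        # flip one bit
        elif choice == "insert":
            data.insert(pos, rng.randrange(256))      # insert a random byte
        elif choice == "delete" and data:
            del data[pos]                             # drop one byte
        elif choice == "repeat" and data:
            data[pos:pos] = data[pos:pos + rng.randint(1, 8)]  # duplicate a chunk
    return bytes(data)


def parse_dimensions(raw: bytes):
    """Toy target: parse b'<width>x<height>'; either side may be empty."""
    text = raw.decode("ascii")  # UnicodeDecodeError is a ValueError
    width, sep, height = text.partition("x")
    if not sep:
        raise ValueError("missing 'x' separator")
    return (int(width) if width else None, int(height) if height else None)


def fuzz(target, samples, iterations: int, seed: int = 0):
    """Feed mutated samples to target; collect inputs that fail unexpectedly."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        case = mutate(rng.choice(samples), rng)
        try:
            target(case)
        except ValueError:
            pass                      # clean rejection of bad input
        except Exception as exc:      # anything else is worth a look
            findings.append((case, exc))
    return findings


if __name__ == "__main__":
    found = fuzz(parse_dimensions, [b"1024x1024", b"800x", b"x"], iterations=2000)
    print(f"{len(found)} unexpected failures")
```

A harness like this catches exceptions but not hangs or memory errors; against a native parser it would need timeouts and a separate process.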
![Page 22: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/22.jpg)
DEPLOYMENT SCENARIOS
Amazon EC2 – g2.2xlarge instances
Content Delivery Networks (CDNs)
Backend infrastructure
![Page 23: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/23.jpg)
AMAZON EC2 – G2 INSTANCES
Hardware virtualized
You get ½ a GRID K520 and 8 vCPUs
Take an off-the-shelf AMI, add the CUDA driver/toolkit, launch
— Ubuntu 12.04 LTS
— CUDA 6.0
Integration with current SW infrastructure
— Run service behind stack on same node
— Nodes as standalone server
![Page 24: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/24.jpg)
CONTENT DELIVERY NETWORKS
Massive distributed caching
Sit behind Varnish/Squid/Nginx as a backend
— Filter requests
— Cache output from image server
— Load balance multiple image servers at the datacenter level
Talk back to Varnish/Squid/Nginx as server
— Cache base images
— Customer’s access keys don’t go through your SW (S3 buckets)
![Page 25: Accelerated Software as a Service (SaaS)on-demand.gputechconf.com/gtc/2014/presentations/S4787-acceler… · ACCELERATED SOFTWARE AS A SERVICE Mike Houston ... Mobile and Cloud Computing](https://reader033.vdocument.in/reader033/viewer/2022050205/5f58b4985fd5466d8c41f6f8/html5/thumbnails/25.jpg)
BACKEND INFRASTRUCTURE
“Bring your own servers”
— Add Tesla K10/K20/K40
GRID VCA