Caching Up and Down the Stack Long Island/Queens Django Meetup 5/20/14

Upload: dan-kuebrich

Post on 06-May-2015

DESCRIPTION

Whether you're looking to make your web app run faster or scale better, one great way to achieve both is to simply do less work. How? By using caches, the data hidey-holes which generations of engineers have thoughtfully left at key junctures in computing infrastructure, from your CPU to the backbone of the internet. Requests into web applications, which span great distances and often involve expensive frontend and backend lifting, are great candidates for caching of all types. We'll discuss the benefits and tradeoffs of caching at different layers of the stack and how to find low-hanging cacheable fruit, with a particular focus on server-side improvements.

TRANSCRIPT

Page 1: Caching Up and Down the Stack

Caching Up and Down the Stack

Long Island/Queens Django Meetup 5/20/14

Page 2: Caching Up and Down the Stack

Hi, I’m Dan Kuebrich

● Software engineer, python fan
● Web performance geek
● Founder of Tracelytics, now part of AppNeta
● Once (and future?) Queens resident

Page 3: Caching Up and Down the Stack
Page 4: Caching Up and Down the Stack

DJANGO

Page 5: Caching Up and Down the Stack

What is “caching”?

● Caching is avoiding doing expensive work
  o by doing cheaper work
● Common examples?
  o On repeat visits, your browser doesn’t download images that haven’t changed
  o Your CPU caches instructions, data so it doesn’t have to go to RAM… or to disk!
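The same idea can be seen inside a single Python process. The following is a minimal sketch, not something from the talk: `slow_square` is a hypothetical stand-in for expensive work, and the counter exists only to show that the second call never reaches the function body.

```python
from functools import lru_cache

# Counts how many times the "expensive" path really runs.
call_count = 0


@lru_cache(maxsize=128)
def slow_square(n):
    """Hypothetical expensive function; the counter stands in for real cost."""
    global call_count
    call_count += 1
    return n * n


first = slow_square(12)   # miss: runs the function body
second = slow_square(12)  # hit: answered from the in-process cache
```

Both calls return the same value, but only the first one does the work; the second is the "cheaper work" of a lookup.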

Page 6: Caching Up and Down the Stack

What is “caching”?

[Diagram, "Uncached": Client connected directly to Data Source]

Page 7: Caching Up and Down the Stack

What is “caching”?

[Diagram: "Uncached" shows Client and Data Source; "Cached" shows Client, Cache Intermediary, and Data Source]

Page 8: Caching Up and Down the Stack

What is “caching”?

[Diagram: "Uncached" shows Client and Data Source; "Cached" shows Client, Cache Intermediary, and Data Source, with hops labeled "Fast!" and "Slow..."]

Page 9: Caching Up and Down the Stack

“Latency Numbers Every Programmer Should Know”

Systems Performance: Enterprise and the Cloud by Brendan Gregg http://books.google.com/books?id=xQdvAQAAQBAJ&pg=PA20&lpg=PA20&source=bl&ots=hlTgyxdrnR&sig=CCjddHrY1H6muMVW9BFcbdO7DDo&hl=en&sa=X&ei=dS7oUquhOYr9oAT9oYGoDw&ved=0CCkQ6AEwAA#v=onepage&q&f=false

Page 10: Caching Up and Down the Stack

A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
  o Full-page
  o Fragment
  o Object cache
● Database
  o Query cache
  o Denormalization

[Arrow alongside the list: "Closer to the user" at the top, "Closer to the data" at the bottom]

Page 11: Caching Up and Down the Stack

Caching in Django apps: Frontend

● Client-side assets
● Full pages

Page 12: Caching Up and Down the Stack

Client-side assets

Page 13: Caching Up and Down the Stack

Client-side assets

Page 14: Caching Up and Down the Stack

Client-side assets

● Use HTTP caches!
  o Browser
  o CDN
  o Intermediate proxies
● Set policy with cache headers
  o Cache-Control / Expires
  o ETag / Last-Modified

Page 15: Caching Up and Down the Stack

HTTP Cache-Control and Expires

● Stop the browser from even asking for it
● Expires
  o Pick a date in the future, good til then
● Cache-Control
  o More flexible
  o Introduced in HTTP 1.1
  o Use this one
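To make the two headers concrete, here is a rough sketch using only the Python standard library. The helper name `cache_headers` and its `now` parameter are illustrative assumptions, not part of Django or of the talk; the point is only that `Cache-Control: max-age` is a relative lifetime while `Expires` is an absolute date.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Return Cache-Control and Expires headers for a response.

    `now` is injectable for testing; it must be a timezone-aware UTC datetime.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    expires_at = now + timedelta(seconds=max_age_seconds)
    return {
        # Relative lifetime in seconds, introduced with HTTP/1.1.
        "Cache-Control": f"max-age={max_age_seconds}",
        # Absolute expiry date in HTTP date format.
        "Expires": format_datetime(expires_at, usegmt=True),
    }
```

Sending both is common: caches that understand `max-age` use it in preference to `Expires`, and the date remains as a fallback.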

Page 16: Caching Up and Down the Stack

HTTP Cache-Control and Expires

dan@JLTM21:~$ curl -I https://login.tv.appneta.com/cache/tl-layouts_base_unauth-compiled-162c2ceecd9a7ff1e65ab460c2b99852a49f5a43.css

HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=315360000
Content-length: 5955
Content-Type: text/css
Date: Tue, 20 May 2014 23:12:16 GMT
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Last-Modified: Fri, 16 May 2014 20:51:19 GMT
Server: nginx
Connection: keep-alive

Page 17: Caching Up and Down the Stack

HTTP Cache Control in Django

https://docs.djangoproject.com/en/dev/topics/cache/

Page 18: Caching Up and Down the Stack

ETag + Last-Modified

Page 19: Caching Up and Down the Stack

ETag + Last-Modified

dan@JLTM21:~$ curl -I www.appneta.com/stylesheets/styles.css

HTTP/1.1 200 OK
Last-Modified: Tue, 20 May 2014 05:52:50 GMT
ETag: "30854c-1c3d3-4f9ce7d715080"
Vary: Accept-Encoding
Content-Type: text/css
...

Page 20: Caching Up and Down the Stack

ETag + Last-Modified

dan@JLTM21:~$ curl -I www.appneta.com/stylesheets/styles.css --header 'If-None-Match: "30854c-1c3d3-4f9ce7d715080"'

HTTP/1.1 304 Not Modified
Last-Modified: Tue, 20 May 2014 05:52:50 GMT
ETag: "30854c-1c3d3-4f9ce7d715080"
Vary: Accept-Encoding
Content-Type: text/css
Date: Tue, 20 May 2014 23:21:12 GMT
...

Page 21: Caching Up and Down the Stack

ETag vs Last-Modified

● Last-Modified is date-based
● ETag is content-based
● Most webservers generate both
● Some webservers (Apache) generate etags that depend on local state
  o If you have a load-balanced pool of servers working here, they might not be using the same etags!
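One way around the load-balancer problem is to derive the ETag from the response bytes alone, so every server in the pool computes the same value. The sketch below is a simplified illustration under that assumption: `content_etag` and `conditional_response` are hypothetical helpers, and the comparison handles only a single exact tag, not the comma-separated lists, `*`, or weak validators that real `If-None-Match` headers may carry.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Derive an ETag from the response bytes alone (no inode, no mtime)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_response(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = content_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        # The client's copy is current: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body
```

A first request gets a 200 with the tag; replaying the tag in `If-None-Match` yields the 304 with an empty payload, mirroring the two curl runs above.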

Page 22: Caching Up and Down the Stack

A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
  o Full-page
  o Fragment
  o Object cache
● Database
  o Query cache
  o Denormalization

Page 23: Caching Up and Down the Stack

CDNs

● Put content closer to your end-users
  o and offload HTTP requests from your servers
● Best for static assets
● Same cache control policies apply

Page 24: Caching Up and Down the Stack

Full-page caching

[Diagram: Client, Varnish, and Data Source]

No internet standards necessary!

Page 25: Caching Up and Down the Stack

Full-page caching: mod_pagespeed

[Diagram: Client, mod_pagespeed, and Data Source]

● Dynamically rewrites pages with frontend optimizations
● Caches rewritten pages

Page 26: Caching Up and Down the Stack

A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
  o Full-page
  o Fragment
  o Object cache
● Database
  o Query cache
  o Denormalization

Page 27: Caching Up and Down the Stack

Full-page caching in Django

Page 28: Caching Up and Down the Stack

Wait, where is this getting cached?

● Django makes it easy to configure
  o In-memory
  o File-based
  o Memcached
  o etc.
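As a rough illustration of what that configuration looks like, here is a settings fragment naming three of Django's built-in backends. The alias names other than `default`, the `LOCATION` values, and the choice of which backend is the default are placeholders chosen for this sketch; `PyMemcacheCache` is the Memcached backend available in Django 3.2 and later, so older projects would use a different class path.

```python
# settings.py fragment (sketch): one alias per backend from the list above.
CACHES = {
    # In-memory, per-process cache: simple, but not shared between workers.
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "LOCATION": "example-locmem",  # placeholder name
    },
    # File-based cache: each entry is stored as a file under LOCATION.
    "files": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        "LOCATION": "/var/tmp/django_cache",  # placeholder path
    },
    # Memcached: shared across processes and machines.
    "memcached": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",  # placeholder host:port
    },
}
```

Application code does not change when the backend does, which is what makes it cheap to start with the in-memory backend and move to Memcached later.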

Page 29: Caching Up and Down the Stack

Full-page caching: dynamic pages?

Page 30: Caching Up and Down the Stack

Full-page caching: dynamic pages?

Page 31: Caching Up and Down the Stack

Fragment caching

Page 32: Caching Up and Down the Stack

Full-page caching: dynamic pages?

Page 33: Caching Up and Down the Stack

Full-page caching: the ajax solution

Page 34: Caching Up and Down the Stack

Object caching

def get_item_by_id(key):
    # Look up the item in our database
    return session.query(User)\
        .filter_by(id=key)\
        .first()

Page 35: Caching Up and Down the Stack

Object caching

# mc: memcached client; session: database session (both set up elsewhere)
def get_item_by_id(key):
    # Check in cache
    val = mc.get(key)
    # If exists, return it
    if val is not None:
        return val
    # If not, get the val, store it in the cache
    val = session.query(User)\
        .filter_by(id=key)\
        .first()
    mc.set(key, val)
    return val

Page 36: Caching Up and Down the Stack

Object caching

# @decorator is assumed here to come from the third-party `decorator` package
@decorator
def cache(expensive_func, key):
    # Check in cache
    val = mc.get(key)
    # If exists, return it
    if val is not None:
        return val
    # If not, get the val, store it in the cache
    val = expensive_func(key)
    mc.set(key, val)
    return val

Page 37: Caching Up and Down the Stack

Object caching

@cache
def get_item_by_id(key):
    # Look up the item in our database
    return session.query(User)\
        .filter_by(id=key)\
        .first()
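For readers who want to run the pattern without a memcached server or a database, the following is a self-contained sketch of the same cache-aside decorator. Everything in it is a stand-in: `fake_cache` is a plain dict playing the role of `mc`, `db_reads` records trips to a pretend database, and the decorator is written with `functools.wraps` rather than whatever helper the slides used.

```python
import functools

_MISSING = object()  # sentinel so a cached None still counts as a hit
fake_cache = {}      # stands in for a memcached client in this sketch


def cached(func):
    """Cache-aside decorator keyed on the function name and its single argument."""
    @functools.wraps(func)
    def wrapper(key):
        cache_key = f"{func.__name__}:{key}"
        val = fake_cache.get(cache_key, _MISSING)
        if val is not _MISSING:
            return val  # hit: skip the expensive call
        val = func(key)  # miss: do the work, then remember it
        fake_cache[cache_key] = val
        return val
    return wrapper


db_reads = []  # records every trip to the pretend database


@cached
def get_item_by_id(key):
    db_reads.append(key)
    return {"id": key, "name": f"user-{key}"}
```

Prefixing the cache key with the function name keeps two decorated functions from colliding on the same id, which a bare `mc.get(key)` would not.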

Page 38: Caching Up and Down the Stack

Object caching in Django

Page 39: Caching Up and Down the Stack

A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
  o Full-page
  o Fragment
  o Object cache
● Database
  o Query cache
  o Denormalization

Page 40: Caching Up and Down the Stack

Query caching

[Diagram: Client, Database, Query Cache ("Cached?"), and Actual tables]

Page 41: Caching Up and Down the Stack

Query caching

mysql> select SQL_CACHE count(*) from traces;
+----------+
| count(*) |
+----------+
|  3135623 |
+----------+
1 row in set (0.56 sec)

mysql> select SQL_CACHE count(*) from traces;
+----------+
| count(*) |
+----------+
|  3135623 |
+----------+
1 row in set (0.00 sec)

Page 42: Caching Up and Down the Stack

Query caching

Page 43: Caching Up and Down the Stack

Query caching

Uncached

Cached

Page 44: Caching Up and Down the Stack

Denormalization

mysql> select table1.x, table2.y from table1 join table2 on table1.z = table2.q where table1.z > 100;

mysql> select table1.x, table1.y from table1 where table1.z > 100;
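The two queries can be tried end to end with an in-memory SQLite database. This is a sketch with made-up rows, using SQLite instead of MySQL purely so that it runs anywhere; the table and column names follow the slide, and the data values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (x INTEGER, z INTEGER);
    CREATE TABLE table2 (q INTEGER, y TEXT);
    INSERT INTO table1 VALUES (1, 50), (2, 150), (3, 200);
    INSERT INTO table2 VALUES (50, 'low'), (150, 'mid'), (200, 'high');
""")

# Normalized read: every query pays for the join.
joined = conn.execute(
    "SELECT table1.x, table2.y FROM table1 "
    "JOIN table2 ON table1.z = table2.q "
    "WHERE table1.z > 100 ORDER BY table1.x"
).fetchall()

# Denormalize once: copy y into table1 so later reads skip the join.
conn.execute("ALTER TABLE table1 ADD COLUMN y TEXT")
conn.execute(
    "UPDATE table1 "
    "SET y = (SELECT table2.y FROM table2 WHERE table2.q = table1.z)"
)

# Denormalized read: same rows, single table.
flat = conn.execute(
    "SELECT table1.x, table1.y FROM table1 "
    "WHERE table1.z > 100 ORDER BY table1.x"
).fetchall()
```

The copied column is itself a cache: any later change to `table2.y` has to be propagated to `table1.y`, which is the invalidation problem in another form.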

Page 45: Caching Up and Down the Stack

A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
  o Full-page
  o Fragment
  o Object cache
● Database
  o Query cache
  o Denormalization

Page 46: Caching Up and Down the Stack

Caching: what can go wrong?

● Invalidation
● Fragmentation
● Stampedes
● Complexity

Page 47: Caching Up and Down the Stack

Invalidation

[Diagram: Client, Cache Intermediary, and Data Source; an "Update!" leads to a "Write" and an "Invalidate"]
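The write-then-invalidate flow on this slide can be sketched in a few lines. `CachedStore` is a hypothetical class with two dicts standing in for the data source and the cache intermediary; a real deployment would also have to decide what happens when the write succeeds but the invalidation does not.

```python
class CachedStore:
    """Data source fronted by a cache that is invalidated on every write."""

    def __init__(self):
        self.source = {}  # stands in for the database
        self.cache = {}   # stands in for the cache intermediary

    def read(self, key):
        if key in self.cache:
            return self.cache[key]   # served from the cache
        value = self.source[key]     # miss: go to the data source
        self.cache[key] = value
        return value

    def write(self, key, value):
        self.source[key] = value     # 1. write to the data source
        self.cache.pop(key, None)    # 2. invalidate the stale cache entry
```

Dropping the entry rather than overwriting it keeps the write path simple: the next read repopulates the cache from the source of truth.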

Page 48: Caching Up and Down the Stack

Invalidation on page-scale

● Browser cache
● CDN
● Proxy / optimizer
● Application-based
  o Full-page
  o Fragment
  o Object cache
● Database
  o Query cache
  o Denormalization

[Arrow alongside the list: "More savings, generally more invalidation..." at the top, "Smaller savings, generally less invalidation" at the bottom]

Page 49: Caching Up and Down the Stack

Fragmentation

● What if I have a lot of different things to cache?
  o More misses
  o Potential cache eviction
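A toy least-recently-used cache shows how quickly this bites. The sketch below assumes an LRU eviction policy and a capacity of three entries, both chosen only for illustration; `LRUCache` and `hit_rate` are hypothetical helpers, not library APIs.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, compute):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = compute(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
        return value


def hit_rate(keys, capacity=3):
    """Replay an access pattern and return the fraction served from cache."""
    cache = LRUCache(capacity)
    for key in keys:
        cache.get(key, lambda k: k.upper())
    return cache.hits / len(keys)


few_keys = hit_rate(["a", "b"] * 4)             # working set fits in the cache
many_keys = hit_rate(["a", "b", "c", "d"] * 2)  # working set is one entry too large
```

With two distinct keys the hit rate is 0.75; with four keys cycling through a three-entry cache, every access evicts the entry that is needed next and the hit rate drops to 0.0.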

Page 50: Caching Up and Down the Stack

Fragmentation

[Chart: x-axis "Your pages / objects", y-axis "Frequency of Access"]

Page 51: Caching Up and Down the Stack

Fragmentation

[Chart: x-axis "Your pages / objects", y-axis "Frequency of Access"]

Page 52: Caching Up and Down the Stack

Stampedes

● On a cache miss extra work is done
● The result is stored in the cache
● What if multiple simultaneous misses?
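One common mitigation is to let only one caller recompute a missing key while the others wait for its result. The following is a single-process sketch of that idea using threads and a lock per key; `StampedeSafeCache` and `render_page` are hypothetical names, and a multi-server deployment would need a shared lock or a similar mechanism instead.

```python
import threading
import time


class StampedeSafeCache:
    """Cache where simultaneous misses on one key trigger a single computation."""

    def __init__(self):
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, key):
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get_or_compute(self, key, compute):
        if key in self._values:          # fast path: plain hit
            return self._values[key]
        with self._lock_for(key):        # only one thread per key gets past here
            if key in self._values:      # filled while we were waiting
                return self._values[key]
            value = compute(key)
            self._values[key] = value
            return value


render_calls = []  # records how many times the slow path actually ran


def render_page(key):
    render_calls.append(key)
    time.sleep(0.05)  # pretend this is an expensive page render
    return key.upper()


cache = StampedeSafeCache()
results = []
threads = [
    threading.Thread(
        target=lambda: results.append(cache.get_or_compute("home", render_page))
    )
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads receive the rendered page, but `render_page` runs only once; without the per-key lock each simultaneous miss would have triggered its own render.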

Page 53: Caching Up and Down the Stack

Stampedes

http://allthingsd.com/20080521/stampede-facebook-opens-its-profile-doors/

Page 54: Caching Up and Down the Stack

Complexity

● How much caching do I need, and where?
● What is the invalidation process
  o on data update? on release?
● What happens if the caches fall over?
● How do I debug it?

Page 55: Caching Up and Down the Stack

Takeaways

● The ‘how’ of caching:
  o What are you caching?
  o Where are you caching it?
  o How bad is a cache miss?
  o How and when are you invalidating?

Page 56: Caching Up and Down the Stack

Takeaways

● The ‘why’ of caching:
  o Did it actually get faster?
  o Is speed worth extra complexity?
  o Don’t guess – measure!
  o Always use real-world conditions.
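As a starting point for "measure", a before-and-after timing can be as small as the sketch below. The names `median_seconds`, `slow_lookup`, and `cached_lookup` are hypothetical, and the 10 ms sleep is an invented stand-in for a backend call; a micro-benchmark like this only confirms the direction of a change and is no substitute for measuring under real traffic.

```python
import statistics
import time


def median_seconds(func, *args, repeat=5):
    """Median wall-clock time of func(*args) over `repeat` runs."""
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def slow_lookup(key):
    time.sleep(0.01)  # pretend backend call
    return key * 2


_cache = {}


def cached_lookup(key):
    if key not in _cache:
        _cache[key] = slow_lookup(key)  # only the first call pays the cost
    return _cache[key]


uncached = median_seconds(slow_lookup, 21)
cached = median_seconds(cached_lookup, 21)
```

Using the median rather than the mean keeps the one cold-cache run from dominating the cached figure.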

Page 57: Caching Up and Down the Stack

Questions?

?

Page 58: Caching Up and Down the Stack

Thanks!

● Interested in measuring your Django app’s performance?
  o Free trial of TraceView: www.appneta.com/products/traceview
● See you at Velocity NYC this fall?
● Twitter: @appneta / @dankosaur

Page 59: Caching Up and Down the Stack

Resources

● Django documentation on caching: https://docs.djangoproject.com/en/dev/topics/cache/
● Varnish caching, via Disqus: http://blog.disqus.com/post/62187806135/scaling-django-to-8-billion-page-views
● Django cache option comparisons: http://codysoyland.com/2010/jan/17/evaluating-django-caching-options/
● More Django-specific tips: http://www.slideshare.net/csky/where-django-caching-bust-at-the-seams
● Guide to cache-related HTTP headers: http://www.mobify.com/blog/beginners-guide-to-http-cache-headers/
● Google PageSpeed: https://developers.google.com/speed/pagespeed/module