NANOG Archive · 2018. 7. 27.


How LinkedIn used TCP Anycast to make the site faster

Ritesh Maheshwari, Shawn Zandi

Anycast

• Anycast provides a distributed service via routing.
• It is not really different from unicast.
• NLRI object with multiple next-hops.
• It simply works for both TCP and UDP applications (use with caution!).

[Figure: client Bob and three sites (SF, CHI, NYC), each announcing www.linkedin.com at 2001:db8::1/56]

Anycast with ECMP

• Not a real issue in today's Internet
• Consistent flow routing is required (per-packet load balancing breaks anycast) – pretty much standard
• Most BGP implementations do not load balance across different AS-PATHs, even when they have the same length.
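
The difference between per-flow and per-packet balancing can be sketched in a few lines of Python. This is only an illustration: the next-hop names, the example 5-tuple, and the use of SHA-256 as the hash are assumptions, and real routers use their own hash functions.

```python
import hashlib
from itertools import cycle

# Hypothetical equal-cost next hops toward the same anycast prefix.
NEXT_HOPS = ["pop-a", "pop-b", "pop-c"]


def per_flow_next_hop(flow, next_hops=NEXT_HOPS):
    """Pick a next hop from a hash of the flow 5-tuple (consistent per flow)."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]


def per_packet_next_hops(packet_count, next_hops=NEXT_HOPS):
    """Round-robin each packet across next hops (this breaks anycast TCP)."""
    rotation = cycle(next_hops)
    return [next(rotation) for _ in range(packet_count)]


# (src IP, src port, dst IP, dst port, protocol); documentation addresses.
flow = ("198.51.100.7", 51512, "203.0.113.10", 443, "tcp")

flow_choices = {per_flow_next_hop(flow) for _ in range(6)}
packet_choices = set(per_packet_next_hops(6))

print(len(flow_choices))    # every packet of the flow takes the same next hop
print(len(packet_choices))  # packets of one flow are spread over several PoPs
```

With per-flow hashing, all segments of a TCP connection reach the same PoP, which is what keeps the connection state valid.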

Anycast Complications

• Broken MTU challenges
• An ICMP message may not reach the intended receiver to report an MTU problem. Adjusting MSS can help.
• RPF checks
• Multiple covering prefixes - only one service address should be covered by each advertised prefix (/24 or /56)
• Monitoring!
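
As a small worked example of the MSS adjustment mentioned above: the MSS is the MTU minus the IP and TCP headers. The sketch assumes headers without options, and the 1400-byte MTU is a hypothetical reduced path MTU, not a value from the talk.

```python
IPV4_HEADER = 20  # bytes, without options
IPV6_HEADER = 40  # bytes, fixed header
TCP_HEADER = 20   # bytes, without options


def mss_for_mtu(mtu, ipv6=False):
    """Largest TCP payload that fits in one packet of the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


print(mss_for_mtu(1500))             # 1460
print(mss_for_mtu(1500, ipv6=True))  # 1440
print(mss_for_mtu(1400))             # 1360, hypothetical reduced path MTU
```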

But!

How to measure anycast effectiveness?

What is RUM?

JavaScript (client code) to measure performance

• DNS time
• Connection time
• First byte time
• Download time
• Page load time

What are PoPs?

Point of Presence / PoP
• Small-scale data centers
• Proxy servers at LinkedIn (ATS)

Without PoPs

[Diagram: Browser and Data Center]

• Connection time: 250 ms
• Server compute time: 500 ms
• First byte time + page download time: 3-5 round trips; 5 RTTs = 5 x 250 ms = 1250 ms
• Total = 2000 ms

With PoPs

[Diagram: Browser, PoP, and Data Center; Browser to PoP 100 ms, PoP to Data Center 250 ms, with an old TCP connection between PoP and Data Center]

• Connection time: 100 ms (one round trip)
• Server compute time: 500 ms
• First byte time + page download time: 5 RTTs = 5 x 100 ms = 500 ms
• Total = 1100 ms (900 ms gain!)
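
The arithmetic on these slides can be reproduced with a short Python sketch. It follows the slides' simplified model (connection time + server compute time + download round trips) and uses only the slides' numbers; the function and parameter names are illustrative.

```python
def page_time_ms(connect_ms, compute_ms, download_rtts, rtt_ms):
    """Connection setup + server compute + round trips to download the page."""
    return connect_ms + compute_ms + download_rtts * rtt_ms


without_pops = page_time_ms(connect_ms=250, compute_ms=500, download_rtts=5, rtt_ms=250)
with_pops = page_time_ms(connect_ms=100, compute_ms=500, download_rtts=5, rtt_ms=100)

print(without_pops)              # 2000
print(with_pops)                 # 1100
print(without_pops - with_pops)  # 900
```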

How are users assigned to PoPs?

Through DNS: the IP is handed out based on the country of the user's resolver

# Spain
$ dig @109.69.8.51 +short www.linkedin.com
91.225.248.80

# California
$ dig +short www.linkedin.com
216.52.242.80

Should India connect to Singapore or Dublin?

How to ensure optimal PoP assignment?

RUM beacons

Fetch a tiny object from each candidate PoP

For each pop_name:
1. Start timer
2. Fetch {pop_name}.perf.linkedin.com/pop/admin
3. Stop timer
Send data back to our servers

• Millions of agents!
• Analyze data to find "optimal" PoP per country
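
The beacon loop above can be sketched as follows. The real RUM code is JavaScript running in the browser; Python is used here only to illustrate the steps. The `https://` scheme, the PoP names, the injectable `fetch` and `clock` parameters, and the `fake_fetch` stand-in are all assumptions made so the sketch runs offline.

```python
import time


def measure_pops(pop_names, fetch, clock=time.perf_counter):
    """Time one small fetch per candidate PoP; return {pop_name: elapsed_ms}."""
    timings = {}
    for pop_name in pop_names:
        url = f"https://{pop_name}.perf.linkedin.com/pop/admin"
        start = clock()                                  # 1. start timer
        fetch(url)                                       # 2. fetch the tiny object
        timings[pop_name] = (clock() - start) * 1000.0   # 3. stop timer
    return timings


def fake_fetch(url):
    """Stand-in for a real HTTP request (hypothetical, no network access)."""
    time.sleep(0.001)


# Hypothetical PoP names; the result is what would be sent back to the servers.
beacon = measure_pops(["hongkong", "dublin", "singapore"], fake_fetch)
print(sorted(beacon))
```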

We can assign countries to new PoPs!

Country   PoP          Median Beacon Time (ms)
China     Hong Kong    434
China     Dublin       1216
China     Singapore    515
India     Hong Kong    1368
India     Dublin       1042
India     Singapore    898
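
One way to turn beacon data into an assignment is to take the median per (country, PoP) pair and pick the PoP with the lowest median. The sketch below is an assumption about how such an analysis could look, not LinkedIn's actual pipeline; it feeds in the medians from the table as single samples.

```python
from collections import defaultdict
from statistics import median


def optimal_pop_per_country(samples):
    """samples: iterable of (country, pop, beacon_ms). Returns {country: pop}."""
    by_pair = defaultdict(list)
    for country, pop, beacon_ms in samples:
        by_pair[(country, pop)].append(beacon_ms)

    best = {}  # country -> (median_ms, pop)
    for (country, pop), values in by_pair.items():
        candidate = (median(values), pop)
        if country not in best or candidate < best[country]:
            best[country] = candidate
    return {country: pop for country, (_, pop) in best.items()}


# One sample per pair, taken from the medians in the table.
samples = [
    ("China", "Hong Kong", 434), ("China", "Dublin", 1216), ("China", "Singapore", 515),
    ("India", "Hong Kong", 1368), ("India", "Dublin", 1042), ("India", "Singapore", 898),
]
print(optimal_pop_per_country(samples))  # {'China': 'Hong Kong', 'India': 'Singapore'}
```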

We can audit current assignment!

Country       Is PoP optimal?   Current PoP      Optimal PoP
India         TRUE              Singapore        Singapore
Pakistan      FALSE             Singapore        Dublin
Spain         TRUE              Dublin           Dublin
Brazil        FALSE             US West Coast    US East Coast
Netherlands   TRUE              Dublin           Dublin
UAE           FALSE             US West Coast    Dublin
Italy         TRUE              Dublin           Dublin
Mexico        TRUE              US West Coast    US West Coast
Russia        FALSE             US West Coast    Dublin
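
An audit like this one amounts to comparing two mappings. The sketch below uses three rows from the table; the function name and the output shape are illustrative assumptions.

```python
def audit_assignments(current, optimal):
    """Return {country: (is_optimal, current_pop, optimal_pop)}."""
    return {
        country: (current[country] == optimal[country], current[country], optimal[country])
        for country in current
    }


# Three rows from the table: current assignment vs. RUM-derived optimal PoP.
current = {"India": "Singapore", "Pakistan": "Singapore", "Brazil": "US West Coast"}
optimal = {"India": "Singapore", "Pakistan": "Dublin", "Brazil": "US East Coast"}

for country, row in audit_assignments(current, optimal).items():
    print(country, row)
```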

[Figure: LinkedIn Homepage Download Time Improvement. Bar chart of percentage improvement (axis from 0% to 30%) for India, Pakistan, Singapore, Russia, and Brazil, with two series: Median Improvement and 90th Percentile Improvement.]

Plot Twist: Assignment far from optimal

• About 31% of US traffic gets assigned to a suboptimal PoP.
  – 45% of East Coast
• About 10% of traffic globally gets assigned to a suboptimal PoP.

DNS PoP assignment is suboptimal

• Assignment based on resolver IP, not client IP

[Diagram: a DNS resolver, New York and California, and PoPs US East and US West]

• Bad IP-to-geo databases
  – Resolver really in NY, but database says CA

Story so far

1. We built PoPs
2. …used RUM to assign users to optimal PoPs
3. …found DNS-based assignment is suboptimal

Accurate PoP assignment problem

• Bug our DNS providers (31% -> 27%)
• Run our own DNS

How about anycast?

Anycast – One IP, Multiple Servers

[Diagram: client Bob and PoP A, PoP B, and PoP C, each announcing 1.1.1.1]

✓ Client IP, not resolver IP, used!
✓ No geo-IP databases

How does anycast compare to DNS?

Will anycast send more users to the optimal PoP?

Let's test it!

RUM to the rescue

For each PoP:
1. Announce the same anycast IP (108.174.13.10)
2. Configure a domain ac.perf.linkedin.com to point to 108.174.13.10

RUM to the rescue

For each page view:
1. RUM downloads a tiny object: ac.perf.linkedin.com/pop/admin
2. Read the X-Li-Pop response header to record which PoP served the object
3. Send this back to LinkedIn with RUM data

Data:
1. For each user, the anycast PoP
2. For each user, the optimal PoP (from PoP beacons)
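
Given those two data points per user, the share of optimal assignments per region can be computed as sketched below. The record layout, field names, and sample records are made up for illustration; `anycast_pop` stands for the value read from the X-Li-Pop header.

```python
from collections import defaultdict


def percent_optimal(records, assigned_key):
    """records: dicts with 'region', 'optimal_pop', and the assigned_key field.

    Returns {region: percentage of records whose assigned PoP is the optimal one}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        totals[record["region"]] += 1
        if record[assigned_key] == record["optimal_pop"]:
            hits[record["region"]] += 1
    return {region: 100.0 * hits[region] / totals[region] for region in totals}


# Made-up records, not measurements from the talk.
records = [
    {"region": "Illinois", "dns_pop": "US West", "anycast_pop": "US East", "optimal_pop": "US East"},
    {"region": "Illinois", "dns_pop": "US East", "anycast_pop": "US East", "optimal_pop": "US East"},
    {"region": "Arizona", "dns_pop": "US West", "anycast_pop": "US East", "optimal_pop": "US West"},
    {"region": "Arizona", "dns_pop": "US West", "anycast_pop": "US West", "optimal_pop": "US West"},
]
print(percent_optimal(records, "dns_pop"))
print(percent_optimal(records, "anycast_pop"))
```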

Results :)

Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
Illinois            70                         90
Florida             73                         95
Georgia             75                         93
Pennsylvania        85                         95

Results :(

Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
Arizona             60                         39
Brazil              88                         33
New York            77                         74

Fewer hops != Lower latency

• Carriers prefer to haul packets within their own network
• Peering can create inter-continental shortcuts

[Diagram: client Alice and PoPs X, Y, and Z, each announcing 1.1.1.1, with an inter-continental link]
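
A toy example makes the point concrete: the path with the fewest hops can still be the slowest if one of its hops crosses a continent. The per-hop latencies below are made-up numbers, and hop count stands in as a simplification for the path-length preference of routing.

```python
# Hypothetical paths from Alice to three PoPs announcing the same anycast IP.
# Each path lists per-hop latencies in milliseconds (made-up numbers).
PATHS = {
    "X": [5, 5, 5, 5],   # four short hops on the same continent
    "Y": [10, 15, 10],   # three hops on the same continent
    "Z": [5, 120],       # two hops, one over an inter-continental link
}


def fewest_hops(paths):
    """PoP that a hop-count-based routing decision would favor."""
    return min(paths, key=lambda pop: len(paths[pop]))


def lowest_latency(paths):
    """PoP with the smallest total latency."""
    return min(paths, key=lambda pop: sum(paths[pop]))


print(fewest_hops(PATHS))     # Z
print(lowest_latency(PATHS))  # X
```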

Maybe DNS wasn't so bad

Continent-level assignments
City / State level assignments

"Regional" Anycast

DNS-based: 1 anycast IP per continent

Ran a RUM experiment, all was fine

[Diagram: client Alice and PoPs X, Y, and Z; two PoPs announce 1.1.1.1 and one announces 2.2.2.2, with an inter-continental link]
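
The two-step idea (DNS picks a per-continent anycast IP, then routing picks a PoP among those announcing it) can be sketched as below. The continent names, the fallback IP, and which PoP announces which IP are assumptions; 1.1.1.1 and 2.2.2.2 are the placeholder addresses from the diagram.

```python
# Hypothetical mapping; the diagram uses 1.1.1.1 and 2.2.2.2 as placeholders.
ANYCAST_IP_BY_CONTINENT = {
    "North America": "1.1.1.1",
    "Europe": "2.2.2.2",
}
DEFAULT_ANYCAST_IP = "1.1.1.1"  # assumed fallback, not from the talk


def resolve(resolver_continent):
    """DNS step: pick the continent's anycast IP from the resolver's location."""
    return ANYCAST_IP_BY_CONTINENT.get(resolver_continent, DEFAULT_ANYCAST_IP)


def pops_reachable(anycast_ip, announcements):
    """Routing step: only PoPs announcing that IP can receive the traffic."""
    return sorted(pop for pop, ip in announcements.items() if ip == anycast_ip)


# PoP labels X, Y, Z follow the diagram; the IP-to-PoP pairing is assumed.
announcements = {"X": "1.1.1.1", "Y": "1.1.1.1", "Z": "2.2.2.2"}

ip = resolve("North America")
print(ip, pops_reachable(ip, announcements))
```

Because a client on one continent never receives the other continent's IP, routing can no longer carry it across the inter-continental link to a distant PoP.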

USA Ramp Results

[Figure: line chart of % traffic going to optimal PoP (axis from 50 to 100) by date (20141207 to 20141217) for Illinois, Florida, North Carolina, Indiana, NY, NJ, VA, WV, and LA.]

Ramp outside USA: in progress

Story so far

1. We built PoPs
2. …used RUM to assign users to optimal PoPs
3. …found DNS-based assignment is suboptimal
4. …evaluated anycast as a solution using RUM
5. …now using anycast to assign users to PoPs

Next play:
• Build more PoPs!

Story: The End

Learnings
• Clients are your measurement agents
• Trust, but verify
• You can have a bigger impact if you collaborate

Next Play
• Keep evaluating anycast
• Keep building new PoPs

©2014 LinkedIn Corporation. All Rights Reserved.