Internet (large) scale Applications L. Grewe

Post on 22-Dec-2015


Page 1: Internet (large) scale Applications L. Grewe. What do I mean? Examples include Web, Email, Search, content delivery networks (e.g., Akamai, and Limelight),

Internet (large) scale Applications

L. Grewe


What do I mean?

• Examples include
– Web, Email, Search, content delivery networks (e.g., Akamai and Limelight), IPTV, P2P content distribution (e.g., BitTorrent, LimeWire, PPLive), multimedia/social networks (e.g., Skype, Facebook, MySpace), and cloud computing (e.g., Amazon EC2, Google App Engine, and Microsoft Azure cloud services).

• Applications at such a scale that a single application may use as many as hundreds of thousands of servers.


Some issues

• Server scaling
• Adaptive, open clients
• Scalability and reliability
• Service-oriented software design
• Cloud computing paradigms
• Protocol specification
• Performance modeling
• Debugging and diagnosis
• Deployment and licensing


Growth of the Internet in Terms of Number of Hosts

Number of Hosts on the Internet:

Aug. 1981            213
Oct. 1984          1,024
Dec. 1987         28,174
Oct. 1990        313,000
Jul. 1993      1,776,000
Jul. 1996     19,540,000
Jul. 1999     56,218,000
Jul. 2004    285,139,000
Jul. 2005    353,284,000
Jul. 2007    489,774,000
Jul. 2008    570,937,000
Jul. 2009    681,064,000
Jul. 2010    768,913,036
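To put the host counts in perspective, the average yearly growth can be estimated from the first and last entries. This is only a rough sketch: the helper name `annual_growth_rate` is made up for illustration, the day of the month is assumed to be the 1st, and it assumes smooth compound growth, which the real data follows only approximately.

```python
from datetime import date


def annual_growth_rate(start_count, end_count, start, end):
    """Compound annual growth rate between two (date, host count) samples."""
    years = (end - start).days / 365.25
    return (end_count / start_count) ** (1 / years) - 1


# First and last entries of the host counts (day of month is an assumption).
rate = annual_growth_rate(213, 768_913_036, date(1981, 8, 1), date(2010, 7, 1))
print(f"average growth: {rate:.0%} per year")
```

A single average hides the fact that growth was much faster in the early years than in the later ones; running the same helper on adjacent entries shows the slowdown.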

CAIDA router-level view


Internet Physical Infrastructure

[Figure labels: Backbone ISP, ISP, ISP]

• Residential access
– Cable
– Fiber
– DSL
– Wireless

• Campus access, e.g.,
– Ethernet
– Wireless

The Internet is a network of heterogeneous networks

Each individually administered network is called an Autonomous System (AS)


US Traffic

http://atlas.grnoc.iu.edu/atlas.cgi?map_name=Internet2%20IP%20Layer


Qwest Backbone Map

http://www.qwest.com/largebusiness/enterprisesolutions/networkMaps/preloader.swf


AT&T Global Backbone IP Network

From http://www.business.att.com


Traffic in US, 1/24/2015

Source: comScore Media Metrix (http://www.comscore.com)


Unique Visitors – top 50 sites in U.S. (Jan. 2011)

Source: comScore Media Metrix (http://www.comscore.com)


Top Sites, Mexico, Oct. 2014

http://www.comscore.com/Insights/Data-Mine/Top-Properties-in-Mexico-for-October-2014


How Much Data?


1 PB = 1000 TB
1 EB = 1000 PB


How Much Data?

• Wayback Machine has 2 PB + 20 TB/month (2006)
• NOAA has ~1 PB climate data (2007)
• Google processes 20 PB a day (2008)
• Internet traffic 5-8 EB (Dec. 2008)
• Size of world’s digital content 500 EB (May 2009)
• 2014: 50 billion Web pages (Google); Netflix is 34% of US download traffic and YouTube 14%, with approx. 8 GB per Netflix user per month

640K ought to be enough for anybody.

1 PB = 1000 TB
1 EB = 1000 PB
http://en.wikipedia.org/wiki/Exabyte
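The unit conversions make it easy to compare the figures on this slide with a little arithmetic. The sketch below uses the decimal units given here and the slide's own numbers; the variable names are illustrative only, and the comparison ignores that the figures come from different years.

```python
# Decimal units as given on the slide: 1 PB = 1000 TB, 1 EB = 1000 PB.
TB = 1
PB = 1000 * TB
EB = 1000 * PB

google_per_day = 20 * PB    # Google processing per day (2008)
world_content = 500 * EB    # size of world's digital content (May 2009)
wayback_growth = 20 * TB    # Wayback Machine growth per month (2006)

# Days needed, at the 2008 rate, to process 500 EB once.
days_needed = world_content / google_per_day

# Months of Wayback Machine growth that equal one day of Google processing.
months_equiv = google_per_day / wayback_growth

print(f"{days_needed:.0f} days, {months_equiv:.0f} months")
```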


Processing Examples

• Crawling, indexing, searching, mining the Web
• E-commerce transactions
• Software as a service
• …


Large Data Centers

• One idea/trend: centralization of computing resources in large data centers
• Necessary ingredients: space + ?
– What do Oregon, Iceland, and abandoned mines have in common?
• Major design point: scale out, not scale up


Maximilien Brice, © CERN


Evolving Computing Models

• Do it yourself (build your own data centers)
• Utility computing: Infrastructure as a Service (IaaS)
– Why buy machines when you can rent cycles?
– Examples: Amazon’s EC2, GoGrid, AppNexus
• Platform as a Service (PaaS)
– Give me a nice API and take care of the implementation
– Example: Google App Engine
• Software as a Service (SaaS)
– Just run it for me!
– Examples: Gmail; MS Exchange; MS Office Online


Programming Architecture Matters

• Performance vs. software extensibility


Software Architecture Matters

• It all boils down to…
– Divide-and-conquer (to the grid?)
– Throwing more hardware at the problem as the problem grows bigger


Divide and Conquer

[Diagram: “Work” is partitioned into w1, w2, w3; each piece goes to a “worker”; the workers produce r1, r2, r3, which are combined into “Result”.]

It is simple to state, hard to master…
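The partition / worker / combine pattern can be sketched in a few lines. This is a toy illustration, not a production design: the function names are invented here, the "work" is just a list of numbers to square and sum, and threads in one process stand in for what would be separate machines in an Internet-scale system.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(work, n_workers):
    """Split the work into n_workers roughly equal pieces (w1, w2, ...)."""
    return [work[i::n_workers] for i in range(n_workers)]


def worker(chunk):
    """Each worker turns its piece into a partial result (r1, r2, ...)."""
    return sum(x * x for x in chunk)


def combine(partials):
    """Merge the partial results into the final result."""
    return sum(partials)


def divide_and_conquer(work, n_workers=3):
    chunks = partition(work, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker, chunks))
    return combine(partials)


result = divide_and_conquer(list(range(10)))
print(result)
```

The hard parts are exactly what this sketch leaves out: uneven pieces, workers that fail midway, and results that cannot be combined with a simple sum.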


Different Workers

• Where are the workers?
– Different threads in the same core
– Different cores in the same CPU
– Different CPUs in a multi-processor system
– Different machines in a distributed system (grid)

• Many design issues
– Which worker does what?
– How do the workers communicate/coordinate?
– What if some workers die or are separated from others?
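Two of these questions, "which worker does what?" and "what if some workers die?", can be illustrated with one simple approach: pick a worker by hashing the task name, and fall through to the next live worker if the chosen one is down. This is one possible scheme among many, shown under assumptions: the worker names, the task name, and the `assign` helper are hypothetical, and real systems also need a way to detect failures, which is assumed away here.

```python
import hashlib


def stable_hash(key):
    """Hash that is identical across runs (built-in hash() of str is not)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


def assign(task, workers, alive):
    """Pick a worker for a task, skipping workers known to be dead."""
    start = stable_hash(task) % len(workers)
    for offset in range(len(workers)):
        candidate = workers[(start + offset) % len(workers)]
        if candidate in alive:
            return candidate
    raise RuntimeError("no live workers left")


workers = ["worker-a", "worker-b", "worker-c"]  # hypothetical names
alive = set(workers)

first = assign("crawl:example.com", workers, alive)
alive.discard(first)  # simulate that worker dying
second = assign("crawl:example.com", workers, alive)
print(first, "->", second)
```

Because the choice depends only on the task name and the set of live workers, any node can compute the assignment without asking a central coordinator.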


Example Architecture: Three-Tiered Architecture

• Stateless frontend
• Soft state middle tier containing application logic and common services
• Backend persistent storage
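The three tiers can be sketched as follows. This is a minimal illustration under stated assumptions: the class and function names are invented here, a dict stands in for the persistent store, and the middle tier's only soft state is a cache. The point it tries to show is that the cache can be thrown away at any time without losing data, and that the frontend keeps nothing between requests.

```python
class Backend:
    """Persistent storage tier (a dict stands in for a database)."""

    def __init__(self):
        self._rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._rows.get(key)

    def put(self, key, value):
        self._rows[key] = value


class MiddleTier:
    """Application logic plus soft state: a cache that can be lost safely."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def profile(self, user):
        if user not in self.cache:
            self.cache[user] = self.backend.get(user)
        return self.cache[user]

    def drop_soft_state(self):
        self.cache.clear()


def frontend(middle, request):
    """Stateless: everything it needs arrives in the request itself."""
    name = middle.profile(request["user"])
    return f"Hello, {name}!"


backend = Backend()
backend.put("u1", "Ada")
middle = MiddleTier(backend)

print(frontend(middle, {"user": "u1"}))  # first call reads the backend
print(frontend(middle, {"user": "u1"}))  # second call is served from cache
middle.drop_soft_state()                 # losing the cache loses no data
print(frontend(middle, {"user": "u1"}))  # rebuilt from the backend
```

Because the frontend is stateless, any number of copies can sit behind a load balancer, which is what makes this layout a natural fit for scaling out.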


More 3 tier ideas/images

Traditional (from Cisco)


More 3 tier ideas/images

Moving into cloud


More 3 tier ideas/images

Thinking Cloud Storage


More 3 tier ideas/images

Moving into cloud: IaaS


More 3 tier ideas/images

Moving into cloud: IaaS (here featuring Amazon)


3 Tier with GAE and Google Cloud

• See https://cloud.google.com/solutions/architecture/webapp
• Uses the autoscaling compute power of App Engine, a distributed in-memory cache, task queues, and datastore to create robust applications quickly and easily.


3 Tier GAE for Udacity

• See http://googleappengine.blogspot.com/2012_10_01_archive.html


3 Tier GAE for WordChums Game

• See http://googlecloudplatform.blogspot.com/2014_03_01_archive.html


Mobile on GAE

• See https://cloud.google.com/developers/articles/developing-mobile-games-on-google-app-engine-compute-engine/


Adding Google Cloud onto GAE for Leanplum.com

• Addition of Cloud Storage and BigQuery
• BigQuery lets us run arbitrary queries on arbitrary data sets
• It has improved our customer response time by allowing us to query over our logs in seconds whenever we receive a support call.
• Cloud Datastore lets us store vast amounts of structured data
• Cloud Storage provides secure, scalable storage (like Amazon S3)
• Compute Engine lets us take advantage of more powerful cores for processing large amounts of data when generating reports (like Amazon EC2)

See http://googlecloudplatform.blogspot.com/2014/03/google-cloud-platform-and-leanplum-help-app-developers-conduct-on-the-fly-ab-tests.html


Platform Matters

“Developers who have worked at the small scale might be asking themselves why we
need to bother with “platform design” when we could just use some kind of out-of the-box solution. For small-scale applications, this can be a great idea. We save time and money up front and get a working and serviceable application. The problem comes at larger scales—there are no off-the-shelf kits that will allow you to build something like Amazon or Friendster. While building similar functionality might be fairly trivial, making that functionality work for millions of products, millions of users, and without spending far too much on hardware requires us to build something highly customized and optimized for our exact needs. There’s a good reason why the largest applications on the Internet are all bespoke creations: no other approach can create massively scalable applications within a reasonable budget.”


http://www.evontech.com/symbian/55.html