NETWORK PROGRAMMING INTRO TO DISTRIBUTED SYSTEMS FALL 2013 L1-Intro Dongsu Han Some material taken from publicly available lecture slides including Srini Seshan’s and David Anderson’s



Today’s Lecture

Administrivia
What is a distributed system and what does it do?
Walking through an example
Overview of the topics covered in this course

Instructors

Instructor: 한동수 (Dongsu Han), [email protected], N1 814
Office hours: Wednesday 1-3pm

Teaching Assistant: 정은영, [email protected], N1 820

Course Goals

Become familiar with the principles and practice of distributed systems

Understand the challenges and common techniques in distributed systems design

Learn how to write distributed applications that use the network
  How does Dropbox work?
  How does a Content Distribution Network work?

Course Format

~30 lectures
References (no single textbook):
  Distributed Systems: Principles and Paradigms
  Distributed Systems: Concepts and Design (CDK)
  Computer Networks: A Systems Approach
Exams: Midterm and Final
Programming assignments
  5 to 6 assignments
  Loosely tied to lecture materials (start early)

About Programming Assignments

Low-level systems programming in C
Must be robust; error handling must be rock solid
Handle concurrency
Understand the system’s failure modes
Interfaces specified by documented protocols
1 or 2 TA-led hands-on sessions on programming/debugging

Grading

10% late penalty per day
Can’t be more than 3 days late
Exceptions: documented medical/personal emergency
Two “late points” to use over the entire course (up to one point for each assignment)

Regrade requests must be made in writing within a week of the original grading.

Grading

Weights:
  20% for Midterm exam
  25% for Final exam
  45% for Homework/programming assignments
  10% for class participation

You MUST demonstrate competence in both projects and tests to pass the course

Collaboration

Working together is important
  Discuss course material
  Work on problem debugging

Programming assignments must be your own work
  Partial credit (points)
  Will run plagiarism detection on source code
  “Copy and paste” code will be severely penalized
  Implication: You will fail this course if you copy someone else’s code.

Why do I need this course?

“Everything” is distributed these days.

Web, Google, Dropbox, KakaoTalk, YouTube, calendar, email, Facebook, CAIS, the cloud, …

“Everything” relies on distributed systems
  Learn how they really work.
  Learn how to design systems that scale.

Why do I need this course?

Enables new things
  Search engine
  Facebook (Social Networking Systems)
  Dropbox

Makes existing things more efficient
  Scale Facebook for the next billion users
  Scale CAIS to work well even when everyone tries to access it
  Make computer graphics rendering faster using clusters

Examples of Scale

Updates/Posts
  Twitter: The record is 25,088 tweets per second (when Castle in the Sky was broadcast in Japan)

Searches
  Google: 5,134,000,000 searches per day.

Network operations
  10Gbps == 14,880,952 packets per second (@64bytes)
  Fast key-value store: 50~70 Mops/sec
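The packet-rate figure above can be reproduced with a short calculation. This is a minimal sketch, assuming the slide counts minimum-size 64-byte Ethernet frames plus the standard per-frame wire overhead (8 bytes of preamble and start-of-frame delimiter, 12 bytes of inter-frame gap); the function name is illustrative.

```python
# Per-frame overhead on an Ethernet wire, in bytes (standard values).
PREAMBLE_AND_SFD = 8   # 7-byte preamble + 1-byte start-of-frame delimiter
INTER_FRAME_GAP = 12   # mandatory idle time between frames, counted in bytes


def max_packets_per_second(link_bps: int, frame_bytes: int) -> int:
    """Upper bound on frames per second for a link rate and frame size."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_SFD + INTER_FRAME_GAP) * 8
    return link_bps // bits_on_wire


pps = max_packets_per_second(10 * 10**9, 64)
print(pps)  # 14880952
```

Each 64-byte frame occupies 84 bytes (672 bits) on the wire, so a 10 Gbps link carries at most 10,000,000,000 / 672 ≈ 14.88 million frames per second.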

Examples of Scale

Akamai running 105,000 servers in more than 1,900 networks

Number of networks on the Internet: 45,000 (2013/8/26)

Microsoft: 1 million servers
Google envisions 10 million servers.

Map of Google’s Datacenters

Microsoft Data Center

Google’s Datacenter

Google’s Server

What do they enable?

means $

What do they enable?

In-class Activity

Form a group of 3
Questions to answer:
  Name a few distributed systems you know of.
  Pick one and draw its components as best as you can.
  Think about how many servers there are or how many requests it handles per second. (Make some assumptions)
  Describe each component in ~2 sentences.
No right or wrong answer.
10 mins.
1 or 2 teams will present.

Today’s Lecture

Administrivia
What is a distributed system and what does it do?
Walking through an example
Overview of the topics covered in this course

A Real Distributed System

Google search

Remember IP...

From: 128.2.185.33

To: 66.233.169.103

<packet contents>

hosts.txt

www.google.com 66.233.169.103
www.cmu.edu 128.2.185.33
www.cs.cmu.edu 128.2.56.91
www.areyouawake.com 66.93.60.192
...
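A flat hosts.txt file amounts to a single lookup table that every machine needs a copy of. The sketch below parses the entries listed on the slide into a dictionary; the parser name and the comment-skipping rule are assumptions made for illustration.

```python
HOSTS_TXT = """\
www.google.com 66.233.169.103
www.cmu.edu 128.2.185.33
www.cs.cmu.edu 128.2.56.91
www.areyouawake.com 66.93.60.192
"""


def parse_hosts(text):
    """Build a name -> address table from 'name address' lines."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (assumed convention)
        name, address = line.split()
        table[name.lower()] = address
    return table


hosts = parse_hosts(HOSTS_TXT)
print(hosts["www.google.com"])
```

Every new host means editing and redistributing this one file, which is the scaling problem a hierarchical naming system avoids.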

Domain Name System

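[Figure: resolving www.google.com through the DNS hierarchy]

Local DNS server receives: who is www.google.com?

Local DNS server -> . DNS server: who is www.google.com?
. DNS server -> Local DNS server: ask the .com guy... (here’s his IP)

Local DNS server -> .com DNS server: who is www.google.com?
.com DNS server -> Local DNS server: ask the google.com guy... (IP)

Local DNS server -> google.com DNS server: who is www.google.com?
google.com DNS server -> Local DNS server: www.google.com is 66.233.169.103

Local DNS server returns: 66.233.169.103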

Decentralized - admins update own domains without coordinating with other domains

Scalable - used for hundreds of millions of domains

Robust - handles load and failures well
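The referral chain from the root server to the .com server to the google.com server can be simulated in memory. This is a toy sketch, not a real DNS client: the server names, the record layout, and the `ask`/`resolve` helpers are all hypothetical, and only the final address comes from the slides.

```python
# Hypothetical in-memory "servers". Each maps a name suffix to either a
# referral (the next server to ask) or a final answer (an address).
SERVERS = {
    "root": {"com": ("referral", "com-server")},
    "com-server": {"google.com": ("referral", "google-server")},
    "google-server": {"www.google.com": ("answer", "66.233.169.103")},
}


def ask(server, name):
    """Return the record for the longest suffix of `name` the server knows."""
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SERVERS[server]:
            return SERVERS[server][suffix]
    raise KeyError(name)


def resolve(name):
    """Follow referrals from the root until some server gives an answer."""
    server = "root"
    path = [server]
    while True:
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        server = value          # referral: ask the next server down
        path.append(server)


address, path = resolve("www.google.com")
print(address, path)
```

Each server only knows about its own zone, which is what lets admins update their domains without coordinating with anyone else.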

But there’s more...

Request from 128.2.53.5 to the google.com DNS server: who is www.google.com?

The google.com DNS server asks:
  Which Google datacenter is 128.2.53.5 closest to?
  Is it too busy?

Answer: 66.233.169.99

Search!
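The two questions the DNS server asks (which datacenter is closest, and is it too busy) can be sketched as a small selection function. Everything here is hypothetical: the slides do not describe Google's actual policy, and the datacenter names, distances, loads, threshold, and the second address (from a documentation-only range) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datacenter:
    name: str
    address: str
    distance_ms: float  # assumed estimate of network distance to the client
    load: float         # assumed utilization between 0 and 1


def pick_datacenter(candidates, max_load=0.8):
    """Pick the closest datacenter that is not too busy."""
    if not candidates:
        raise ValueError("no datacenters configured")
    not_busy = [dc for dc in candidates if dc.load < max_load]
    pool = not_busy if not_busy else list(candidates)  # all busy: use any
    return min(pool, key=lambda dc: dc.distance_ms)


datacenters = [
    Datacenter("near-but-busy", "192.0.2.10", distance_ms=5.0, load=0.95),
    Datacenter("a-bit-farther", "66.233.169.99", distance_ms=12.0, load=0.40),
]
chosen = pick_datacenter(datacenters)
print(chosen.address)
```

With these made-up numbers the nearest datacenter is overloaded, so the function falls back to the next closest one.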

[Figure: a Query goes to one of several Front-end servers and a Result comes back; slide from Jeff Dean, Google]

How do you index the web?

1. Get a copy of the web.
2. Build an index.
3. Profit (insert advertisements)

There are over 60 trillion individual web pages

Hundreds of millions of websites

How do you index the web?

Crawling -- download those web pages
Indexing -- harness 10s of thousands of machines to do it
Profiting -- we leave that to you.

“Data-Intensive Computing”

[Figure: the data items i1, i2, i3, i4, ... drawn as three identical copies]

Data is split into chunks

Replicate: Handle load

GFS distributed filesystem
  Replicated
  Consistent
  Fast

Storing: Google File System

doc1,2,3,..n
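Splitting data into chunks and replicating each chunk can be sketched in a few lines. This is an illustrative toy, not how GFS actually places data: the round-robin placement, the tiny chunk size, and the server names are assumptions.

```python
def split_into_chunks(data: bytes, chunk_size: int):
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(num_chunks: int, servers, replicas: int = 3):
    """Assign each chunk to `replicas` distinct servers, round-robin."""
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    placement = {}
    for chunk_id in range(num_chunks):
        placement[chunk_id] = [
            servers[(chunk_id + r) % len(servers)] for r in range(replicas)
        ]
    return placement


chunks = split_into_chunks(b"abcdefghij", chunk_size=4)
placement = place_replicas(len(chunks), ["s0", "s1", "s2", "s3"])
print(chunks)
print(placement)
```

Because every chunk lives on several servers, reads can be spread across the copies and a single server failure does not lose data.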

Indexing

doc1:
  hello hadoop
  goodbye hadoop
  hello you
  hello me

doc2:
  hello world
  goodbye world

Inverted index:
  goodbye  doc1=>1  doc2=>1
  hadoop   doc1=>2
  hello    doc1=>3  doc2=>1
  me       doc1=>1
  world    doc2=>2
  you      doc1=>1
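The inverted index for the two example documents can be built on a single machine as follows. This is a minimal sketch that splits on whitespace only; the function name is illustrative.

```python
from collections import Counter, defaultdict

DOCS = {
    "doc1": "hello hadoop goodbye hadoop hello you hello me",
    "doc2": "hello world goodbye world",
}


def build_inverted_index(docs):
    """Map each word to {document id: number of occurrences}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word, count in Counter(text.split()).items():
            index[word][doc_id] = count
    return dict(index)


index = build_inverted_index(DOCS)
for word in sorted(index):
    print(word, index[word])
```

A search for a word then becomes a single dictionary lookup instead of a scan over every document.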

MapReduce / Hadoop

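[Figure: Doc 1~n in Storage are split into Data Chunks and spread across Computers; the stages are Data Transformation, Sort by key, and Data Aggregation, with output written back to Storage. Example keys shown: hello, you]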

MapReduce / Hadoop

[Figure: MapReduce pipeline: Data Chunks on Computers, Data Transformation, Sort, Data Aggregation, Storage]

Why? Hiding details of programming 10,000 machines!

Programmer writes two simple functions:

map (data item) -> list(tmp values)
reduce (list(tmp values)) -> list(out values)

MapReduce system balances load, handles failures, starts job, collects results, etc.
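The division of labor between the programmer's two functions and the framework can be sketched on one machine. This is a toy word count, not the Hadoop or Google MapReduce API: `run_mapreduce` stands in for the framework, and the key/value pair shape is an assumption.

```python
from itertools import groupby
from operator import itemgetter


def map_fn(text):
    """Programmer-supplied: one data item -> list of (key, value) pairs."""
    return [(word, 1) for word in text.split()]


def reduce_fn(key, values):
    """Programmer-supplied: all values for one key -> list of outputs."""
    return [(key, sum(values))]


def run_mapreduce(items, map_fn, reduce_fn):
    """Stand-in for the framework: map, sort by key, then reduce."""
    intermediate = []
    for item in items:                      # data transformation
        intermediate.extend(map_fn(item))
    intermediate.sort(key=itemgetter(0))    # sort by key
    output = []
    for key, group in groupby(intermediate, key=itemgetter(0)):
        output.extend(reduce_fn(key, [v for _, v in group]))  # aggregation
    return output


counts = run_mapreduce(["hello world", "goodbye world"], map_fn, reduce_fn)
print(counts)  # [('goodbye', 1), ('hello', 1), ('world', 2)]
```

In a real deployment the loop over items and the per-key reduce calls run on many machines, and the framework takes care of scheduling, load balancing, and retrying failed tasks.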

All that...

Hundreds of DNS servers
Protocols on protocols on protocols
Distributed network of Internet routers to get packets around the globe
Hundreds of thousands of servers
... to find Ryu Hyun-jin in under 0.2 seconds

Today’s Lecture

Administrivia
What is a distributed system and what does it do?
Walking through an example
Overview of the topics covered in this course


What is a Distributed System?

A distributed system is: “A collection of independent computers that appears to its users as a single coherent system”

"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." – Leslie Lamport


Distributed Systems

The middleware layer extends over multiple machines, and offers each application the same interface.


What does it do?

Hide complexity from programmers/users
Hide the fact that its processes and resources are physically distributed across multiple machines.

Transparency in a Distributed System


How?

The middleware layer extends over multiple machines, and offers each application the same interface.
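One way to picture a middleware layer that offers every application the same interface is a key-value store whose callers never see which machine holds a key. This is a hypothetical in-process sketch: the class name, the hash-based placement, and the use of plain dictionaries as "machines" are all assumptions.

```python
import hashlib


class DistributedStore:
    """Same put/get interface regardless of which node stores the key."""

    def __init__(self, node_names):
        self._names = sorted(node_names)
        self._nodes = {name: {} for name in self._names}  # one dict per node

    def _node_for(self, key: str) -> str:
        # Deterministic hash so every client maps a key to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self._names[int.from_bytes(digest[:8], "big") % len(self._names)]

    def put(self, key: str, value) -> None:
        self._nodes[self._node_for(key)][key] = value

    def get(self, key: str):
        return self._nodes[self._node_for(key)][key]


store = DistributedStore(["node-a", "node-b", "node-c"])
store.put("user:42", "Dongsu")
print(store.get("user:42"))
```

The application calls `put` and `get` as if the store were local, while the layer underneath decides where the data actually lives.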

What we will learn

Learn principles in distributed systems design.

Distributed systems differ from traditional software because components are dispersed.

Many assumptions break for this reason.
We will study important aspects that we must consider in dealing with distributed systems.

Challenges (1/2)

Heterogeneity
  Networks, computer hardware, OS, programming languages, different implementations
Openness
  Different implementations or extensions can be added (e.g., Firefox, Internet Explorer).
  Key interfaces are published.
Security
  Confidentiality, integrity, and availability
  E-doctor (you don’t want someone else to see your record)

Challenges (2/2)

Scalability
  Does the system remain effective with a significant increase in # of users?
Failure handling
  Detection, masking, tolerating failure, recovery
Concurrency
  Need synchronization when accessing shared resources.
Transparency
Quality of service
  Video quality may suffer when the network is overloaded.
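The concurrency point can be illustrated with a shared counter updated by several threads. This is a minimal single-process sketch using Python's `threading.Lock`; the class and function names are illustrative, and in a distributed setting the shared resource would sit behind a network service rather than in local memory.

```python
import threading


class SharedCounter:
    """A counter that is safe to update from many threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes the read-modify-write step atomic.
        with self._lock:
            self.value += 1


def run(num_threads=8, per_thread=10_000):
    counter = SharedCounter()

    def worker():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run())  # 80000
```

Without the lock, two threads can read the same old value and both write back the same new one, so some increments may be lost.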

Next Lecture

How does the Internet work? – Intro to networking

Programming assignment
  Socket programming
  If you got an A+ in the Computer Networks course, please see the instructor after class (might let you skip this).