distributed systems aka special topics in networking cs 7780

Post on 04-Jan-2016


Distributed Systems

aka Special Topics in Networking, CS 7780

Welcome

• This is CS7780
  – Everyone in the right room?
  – Ok, good.

• Who am I?
  – Professor David Choffnes
  – choffnes@ccs.neu.edu
  – West Village H 256
  – No office hours: just e-mail to make an appointment.

• No TAs, either

Why take this course?

• The reason you’re here is because of a DS
  – Registering for a course
  – Checking class times and location
  – Visiting my website
  – Getting directions to class
  – Writing notes in a Gdoc
  – Checking your e-mail

Why take this course?

• When you registered, how were you guaranteed a slot and that this slot wasn’t overwritten by someone else?

• When you got directions, how did you get results in milliseconds when looking up one of billions of locations/tiles?

• How do your notes stay properly synced even though you never hit save and sometimes use the doc while offline?

Why take this class?

• Without DS, life would be pretty boring
  – I’d have to assign you a textbook
  – And photocopy manuscripts to read
  – And I couldn’t paste in pictures like this


Goals

• Fundamental understanding of DS
  – All the way from core concepts and principles
  – … to the applications that use them in various ways

• Focus on software systems and protocols
  – Not hardware (treat as black box)
  – Minimal theory (but some for proofs)

• Paper-centric
  – Learn DS from source material
  – Build your vocabulary and awareness of foundational DS concepts

• Research projects
  – Apply these concepts in your own original research


Online Resources

• http://david.choffnes.com/classes/cs7780fa14/

• Class forum is on Piazza
  – Sign up today!
  – Install their iPhone/Android app

• When in doubt, post to Piazza
  – Piazza is preferable to e-mail
    • If you e-mail me a question, I will tell you to post it on Piazza

• HotCRP for paper reviews
  – Mandatory for all papers assigned (except this week)

Sept 10 Intro

Sept 14 No Class Monday, CAP/Clocks

Sept 21 Consistency/Consensus, No class Thursday (NSDI), proposals due

Sept 28 Fault Tolerance/Availability

Oct 5 Distributed/Remote Processing, Distributed Cache

Oct 12 No class Monday: Columbus Day, DHTs

Oct 19 File systems (early, modern)

Oct 22 Overlays (maybe), No class: IMC, Midterm reports due

Nov 2 Wild card (Christo and friends)

Nov 9 The Internet, CDNs

Nov 16 Privacy, Anonymity, BitCoin (Field trip to DTL workshop)

Nov 23 DCNs, Thanksgiving Thursday, no class

Nov 30 SDNs, Management, security

Dec 7 Project presentations

Dec 14 Reports due

Schedule

• The schedule will probably slip
• If there’s a paper/topic you really want to discuss that is not on the list, let me know

Teaching Style

• This is not a lecture course
  – There is no textbook
  – There are no homework assignments
  – There is no hand holding

• Class will be very interactive
  – I will ask you questions
  – You should ask questions
  – Discussion is paramount

• That said, I will lead the first few lectures

How you are evaluated

• Attend class
• Read and summarize papers
  – Present a subset of papers
• Present your project
• Write it up

Grading

• Pretty much all on final project
• Research report + presentation
  – Some weight on attendance/participation

Projects

• Anything that is related to distributed systems
  – Needs to be approved by me (9/24)
  – Should be your ongoing research
  – If you need a project, come see me

• Midterm progress report due 10/26

• 6-page (minimum) writeup due at end of semester
  – Also will need to give a 15-20 minute talk

Papers and reviews

• Most content will come from original sources
  – Students must pick papers to present
    • Ok to reuse some slides from authors’ presentations
  – Everyone needs to enter a review in HotCRP

• Student presents paper, then we discuss

Cheating

• Do not plagiarize
• Do not do it
  – Seriously, don’t make me say it again

• Cheating is an automatic zero
  – Will be referred to the university for discipline and possible expulsion
  – I’m not kidding: I will send any suspects to OSCCR without exception

• Research code and text must be original
  – If you have any questions about whether there might be an issue, ask me

Questions?

What is a distributed system?

Google’s definition

• an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate together to perform a single or small set of related tasks

Why build a DS?

• Scale
• Availability
• … no other way to connect lots of components
• …

Key challenge

• Everything fails
  – No really, on a long enough timeline, everything fails

• Node goes down
• Network unavailable
• Storage is corrupted
• Attack on system
• Packet loss
• Bugs
• …

Example: Google Docs

Handling failures can be hard

• Multiple layers of components can mask problems or interfere with recovery

• Achieving agreement/consistency after failures can be difficult
  – Even without failures this is hard

• Reliability during failures is challenging

Related Challenges

• Fault tolerance
• Availability
• Recoverability
• Consistency
• Scalability
• Security
• Predictability, Simplicity (?)

The Network

• Communication, coordination require a network
  – What are examples of networks that DSes use?
  – Why are they used?

How not to design a DS

Assume:
• Nodes are always online.
• The network is reliable.
• Latency is zero.
• Bandwidth is infinite.
• The network is secure.
• Topology doesn't change.
• There is one administrator.
• Transport cost is zero.
• The network is homogeneous.

Communication/Coordination

• How do you compartmentalize tasks in a computer program?

• Analogy in DS: Remote Procedure Calls
  – Anyone have examples?

Key RPC components

• Protocol
• Client/Server implementation
• Error handling
  – What can go wrong?
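To make the "what can go wrong?" question concrete, here is a minimal sketch of the three components in Python. Everything in it is an illustrative assumption and not part of the course material: the `CounterServer`, the simulated `LossyNetwork`, the `rpc_increment` client stub, and the 30% loss rate are all hypothetical. The "protocol" is just a request id plus an argument. The point it tries to show is that a lost request and a lost reply look identical to the client, so the client retries with the same id and the server suppresses duplicates.

```python
import random


class RpcTimeout(Exception):
    """Raised when no reply arrives (the request or the reply was lost)."""


class CounterServer:
    """Hypothetical server exposing one remote procedure: increment."""

    def __init__(self):
        self.value = 0
        self.replies = {}  # request_id -> cached reply, for duplicate suppression

    def handle(self, request_id, amount):
        # A retried request must not be applied a second time.
        if request_id not in self.replies:
            self.value += amount
            self.replies[request_id] = self.value
        return self.replies[request_id]


class LossyNetwork:
    """Simulated network that drops each message with probability `loss`."""

    def __init__(self, server, loss, seed=0):
        self.server = server
        self.loss = loss
        self.rng = random.Random(seed)

    def send(self, request_id, amount):
        if self.rng.random() < self.loss:
            raise RpcTimeout("request lost")
        reply = self.server.handle(request_id, amount)
        if self.rng.random() < self.loss:
            # The server did the work, but the client cannot tell.
            raise RpcTimeout("reply lost")
        return reply


def rpc_increment(network, request_id, amount, max_attempts=50):
    """Client stub: looks like a local call, but retries on timeout."""
    for _ in range(max_attempts):
        try:
            return network.send(request_id, amount)
        except RpcTimeout:
            continue  # unknown which half failed, so resend the same id
    raise RpcTimeout(f"gave up after {max_attempts} attempts")


server = CounterServer()
network = LossyNetwork(server, loss=0.3)
results = [rpc_increment(network, request_id=i, amount=1) for i in range(10)]
print(server.value, results)
```

Without the reply cache on the server, a lost reply followed by a retry would increment the counter twice. Whether a real system wants this "at most once" behavior or something weaker is a design choice, not something this sketch settles.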

End-to-End Argument


Where to Place Functionality

• How do we distribute functionality across a DS?
  – Example: who is responsible for security?

[Figure: network path through switches and a router, with each component marked "?"]

• “The End-to-End Arguments in System Design”
• Saltzer, Reed, and Clark
• Endlessly debated by researchers and engineers


Basic Observation

• Some applications have end-to-end requirements
  – Security, reliability, etc.

• Implementing this stuff inside the network is hard
  – Every step along the way must be fail-proof
  – Different applications have different needs

• End hosts…
  – Can’t depend on the network
  – Can satisfy these requirements without network-level support


Example: Reliable File Transfer

Solution 1: Make the network reliable
Solution 2: App level, end-to-end check, retry on failure

[Figure: the transfer path, labeled "Integrity Check" at three points]

App has to do a check anyway!


Example: Reliable File Transfer

[Figure: the same two solutions, now labeled "Please Retry"]

Full functionality can be built at App level

• In-network implementation…
  – Doesn’t reduce host complexity
  – Does increase network complexity
  – Increases overhead for apps that don’t need the functionality
• But, in-network performance may be better
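Solution 2 from the file transfer example can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real protocol: `unreliable_send` is a hypothetical stand-in for the network that corrupts one byte 40% of the time, and the sketch assumes the sender's SHA-256 digest reaches the receiver intact. The endpoints get a correct copy without the network promising anything.

```python
import hashlib
import random


def unreliable_send(data, rng, corrupt_prob=0.4):
    """Simulated network path: sometimes flips one byte in transit."""
    if data and rng.random() < corrupt_prob:
        damaged = bytearray(data)
        i = rng.randrange(len(damaged))
        damaged[i] ^= 0xFF  # XOR with 0xFF always changes the byte
        return bytes(damaged)
    return data


def transfer(data, rng, max_attempts=100):
    """Endpoints only: checksum at the ends, retry on mismatch."""
    # Computed by the sender; assumed here to arrive uncorrupted.
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(1, max_attempts + 1):
        received = unreliable_send(data, rng)
        # Receiver's end-to-end integrity check.
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
        # Mismatch: the receiver asks the sender to retry.
    raise RuntimeError("transfer failed after retries")


rng = random.Random(1)
original = b"lecture notes " * 1000
copy, attempts = transfer(original, rng)
print(copy == original, attempts)
```

Even if every hop ran its own integrity check, the receiver would still want this final comparison, since corruption can also happen on the end hosts themselves. Per-hop checks would then act only as a performance optimization that catches errors earlier.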


Conservative Interpretation

“Don’t implement a function at the lower levels of the system unless it can be completely implemented at this level” (Peterson and Davie)

Basically, unless you can completely remove the burden from endpoints, don’t bother


Radical Interpretation

• Don’t implement anything in an intermediate DS component that can be implemented correctly by the endpoints

• Make each DS component absolutely minimal

• Ignore performance issues


Moderate Interpretation

• Think twice before implementing functionality in a given component of the system

• If an endpoint can implement functionality correctly, implement it at a lower layer only as a performance enhancement

• But do so only if it does not impose a burden on applications that do not require that functionality…
  – …and if it doesn’t cost too much to implement
  – Cost = $ or complexity

For next week

• Read the papers!
  – I will set up HotCRP shortly

• No class Monday
• Start looking at papers to present
  – When I open bidding, it will be FCFS
  – Let me know if you want to add anything

• Start figuring out your project proposal
