Distributed Systems aka Special Topics in Networking CS 7780

Upload: brianna-christal-flynn

Post on 04-Jan-2016

Page 1: Distributed Systems aka Special Topics in Networking CS 7780

Distributed Systems

aka Special Topics in Networking

CS 7780

Page 2: Distributed Systems aka Special Topics in Networking CS 7780

Welcome

• This is CS 7780
– Everyone in the right room?
– Ok, good.

• Who am I?
– Professor David Choffnes
– [email protected]
– West Village H 256
– No office hours: just e-mail to make an appointment

• No TAs, either

Page 3: Distributed Systems aka Special Topics in Networking CS 7780

Why take this course?

• The reason you’re here is because of a DS
– Registering for a course
– Checking class times and location
– Visiting my website
– Getting directions to class
– Writing notes in a Gdoc
– Checking your e-mail

Page 4: Distributed Systems aka Special Topics in Networking CS 7780

Why take this course?

• When you registered, how were you guaranteed a slot and that this slot wasn’t overwritten by someone else?

• When you got directions, how did you get results in milliseconds when looking up one of billions of locations/tiles?

• How do your notes stay properly synced even though you never hit save and sometimes use the doc while offline?

Page 5: Distributed Systems aka Special Topics in Networking CS 7780

Why take this class?

• Without DS, life would be pretty boring
– I’d have to assign you a textbook
– And photocopy manuscripts to read
– And I couldn’t paste in pictures like this

Page 6: Distributed Systems aka Special Topics in Networking CS 7780

Goals

• Fundamental understanding of DS
– All the way from core concepts and principles
– … to the applications that use them in various ways

• Focus on software systems and protocols
– Not hardware (treat as black box)
– Minimal theory (but some for proofs)

• Paper-centric
– Learn DS from source material
– Build your vocabulary and awareness of foundational DS concepts

• Research projects
– Apply these concepts in your own original research

Page 7: Distributed Systems aka Special Topics in Networking CS 7780

Online Resources

• http://david.choffnes.com/classes/cs7780fa14/
• Class forum is on Piazza
– Sign up today!
– Install their iPhone/Android app

• When in doubt, post to Piazza
– Piazza is preferable to e-mail
  • If you e-mail me a question, I will tell you to post it on Piazza

• HotCRP for paper reviews
– Mandatory for all papers assigned (except this week)

Page 8: Distributed Systems aka Special Topics in Networking CS 7780

Sept 10 Intro

Sept 14 No Class Monday, CAP/Clocks

Sept 21 Consistency/Consensus, No class Thursday (NSDI), proposals due

Sept 28 Fault Tolerance/Availability

Oct 5 Distributed/Remote Processing, Distributed Cache

Oct 12 No class Monday: Columbus Day, DHTs

Oct 19 File systems (early, modern)

Oct 22 Overlays (maybe), No class: IMC, Midterm reports due

Nov 2 Wild card (Christo and friends)

Nov 9 The Internet, CDNs

Nov 16 Privacy, Anonymity, BitCoin (Field trip to DTL workshop)

Nov 23 DCNs, Thanksgiving Thursday, no class

Nov 30 SDNs, Management, security

Dec 7 Project presentations

Dec 14 Reports due

Page 9: Distributed Systems aka Special Topics in Networking CS 7780

Schedule

• The schedule will probably slip
• If there’s a paper/topic you really want to discuss that is not on the list, let me know

Page 10: Distributed Systems aka Special Topics in Networking CS 7780

Teaching Style

• This is not a lecture course
– There is no textbook
– There are no homework assignments
– There is no hand holding

• Class will be very interactive
– I will ask you questions
– You should ask questions
– Discussion is paramount

• That said, I will lead the first few lectures

Page 11: Distributed Systems aka Special Topics in Networking CS 7780

How you are evaluated

• Attend class
• Read and summarize papers
– Present a subset of papers

• Present your project
• Write it up

Page 12: Distributed Systems aka Special Topics in Networking CS 7780

Grading

• Pretty much all on final project
• Research report + presentation
– Some weight on attendance/participation

Page 13: Distributed Systems aka Special Topics in Networking CS 7780

Projects

• Anything that is related to distributed systems
– Needs to be approved by me (9/24)
– Should be your ongoing research
– If you need a project, come see me

• Midterm progress report due 10/26

• 6-page (minimum) writeup due at end of semester
– You will also need to give a 15-20 minute talk

Page 14: Distributed Systems aka Special Topics in Networking CS 7780

Papers and reviews

• Most content will come from original sources
– Students must pick papers to present
  • Ok to reuse some slides from authors’ presentations
– Everyone needs to enter a review in HotCRP

• Student presents paper, then we discuss

Page 15: Distributed Systems aka Special Topics in Networking CS 7780

Cheating

• Do not plagiarize
• Do not do it
– Seriously, don’t make me say it again

• Cheating is an automatic zero
– Will be referred to the university for discipline and possible expulsion
– I’m not kidding: I will send any suspects to OSCCR without exception

• Research code and text must be original
– If you have any questions about whether there might be an issue, ask me

Page 16: Distributed Systems aka Special Topics in Networking CS 7780

Questions?

Page 17: Distributed Systems aka Special Topics in Networking CS 7780

What is a distributed system?

Page 18: Distributed Systems aka Special Topics in Networking CS 7780

Google’s definition

• an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate together to perform a single or small set of related tasks

Page 19: Distributed Systems aka Special Topics in Networking CS 7780

Why build a DS?

• Scale
• Availability
• … no other way to connect lots of components
• …

Page 20: Distributed Systems aka Special Topics in Networking CS 7780

Key challenge

• Everything fails
– No really, on a long enough timeline, everything fails

• Node goes down
• Network unavailable
• Storage is corrupted
• Attack on system
• Packet loss
• Bugs
• …
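A rough back-of-the-envelope calculation shows why failure becomes the common case at scale. It is only a sketch: it assumes each of n nodes fails independently with the same probability p over some time window, and the values of p and n are made up for illustration, not measurements.

```latex
% Assumption: n nodes, each failing independently with probability p
% during the time window of interest.
\[
  P(\text{at least one failure}) = 1 - (1 - p)^{n}
\]
% Illustrative numbers only: p = 0.001 per day and n = 1000 nodes.
\[
  1 - (1 - 0.001)^{1000} \approx 1 - e^{-1} \approx 0.63
\]
```

Under these assumptions, a component that fails about once in a thousand days still produces at least one failure somewhere in a 1000-node system on roughly two days out of three.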

Page 21: Distributed Systems aka Special Topics in Networking CS 7780

Example: Google Docs

Page 22: Distributed Systems aka Special Topics in Networking CS 7780

Handling failures can be hard

• Multiple layers of components can mask problems or interfere with recovery

• Achieving agreement/consistency after failures can be difficult
– Even without failures this is hard

• Reliability during failures is challenging

Page 23: Distributed Systems aka Special Topics in Networking CS 7780

Related Challenges

• Fault tolerance
• Availability
• Recoverability
• Consistency
• Scalability
• Security
• Predictability, Simplicity (?)

Page 24: Distributed Systems aka Special Topics in Networking CS 7780

The Network

• Communication, coordination require a network
– What are examples of networks that DSes use?
– Why are they used?

Page 25: Distributed Systems aka Special Topics in Networking CS 7780

How not to design a DS

Assume:
• Nodes are always online.
• The network is reliable.
• Latency is zero.
• Bandwidth is infinite.
• The network is secure.
• Topology doesn't change.
• There is one administrator.
• Transport cost is zero.
• The network is homogeneous.

Page 26: Distributed Systems aka Special Topics in Networking CS 7780

Communication/Coordination

• How do you compartmentalize tasks in a computer program?

• Analogy in DS: Remote Procedure Calls
– Anyone have examples?

Page 27: Distributed Systems aka Special Topics in Networking CS 7780

Key RPC components

• Protocol
• Client/Server implementation
• Error handling
– What can go wrong?
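The three components above can be sketched in a few lines of Python. This is a toy under stated assumptions, not any real RPC framework: the JSON message format, the `Server`, `LossyTransport`, and `Client` classes, and the `add` procedure are all hypothetical names chosen for illustration, and the "network" is an in-process object that randomly drops messages.

```python
import json
import random


class TransportError(Exception):
    """Raised when the simulated network drops a message."""


class Server:
    """Server side: decode a request, dispatch it, encode a reply."""

    def __init__(self):
        # Hypothetical procedure table; "add" is only an example.
        self.procedures = {"add": lambda a, b: a + b}

    def handle(self, raw_request: bytes) -> bytes:
        request = json.loads(raw_request)
        proc = self.procedures.get(request["method"])
        if proc is None:
            reply = {"id": request["id"], "error": "unknown method"}
        else:
            reply = {"id": request["id"], "result": proc(*request["params"])}
        return json.dumps(reply).encode()


class LossyTransport:
    """Stand-in for a network: drops each message with probability loss_rate."""

    def __init__(self, server, loss_rate, seed=0):
        self.server = server
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)

    def send(self, raw_request: bytes) -> bytes:
        if self.rng.random() < self.loss_rate:
            raise TransportError("request lost")
        raw_reply = self.server.handle(raw_request)
        # The server has already executed the call when the reply is lost.
        if self.rng.random() < self.loss_rate:
            raise TransportError("reply lost")
        return raw_reply


class Client:
    """Client stub: makes a remote call look like a local one, with retries."""

    def __init__(self, transport, max_attempts=5):
        self.transport = transport
        self.max_attempts = max_attempts
        self.next_id = 0

    def call(self, method, *params):
        self.next_id += 1
        raw_request = json.dumps(
            {"id": self.next_id, "method": method, "params": params}
        ).encode()
        for _ in range(self.max_attempts):
            try:
                reply = json.loads(self.transport.send(raw_request))
            except TransportError:
                continue  # retry: gives at-least-once execution
            if "error" in reply:
                raise RuntimeError(reply["error"])
            return reply["result"]
        raise TimeoutError(f"no reply after {self.max_attempts} attempts")
```

For example, `Client(LossyTransport(Server(), loss_rate=0.3)).call("add", 2, 3)` either returns 5 or raises `TimeoutError`. Note what the retry loop hides: when a reply is lost, the server has already run the procedure, so a retried call may execute more than once. That is harmless for `add` but not for something like "charge this credit card", which is one answer to "what can go wrong?".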

Page 28: Distributed Systems aka Special Topics in Networking CS 7780

End-to-End Argument

Page 29: Distributed Systems aka Special Topics in Networking CS 7780

Where to Place Functionality

• How do we distribute functionality across a DS?
– Example: who is responsible for security?

[Figure: network path through a switch, a router, and a switch, with question marks at the components along the path]

• “The End-to-End Arguments in System Design”
• Saltzer, Reed, and Clark
• Endlessly debated by researchers and engineers

Page 30: Distributed Systems aka Special Topics in Networking CS 7780

Basic Observation

• Some applications have end-to-end requirements
– Security, reliability, etc.

• Implementing this stuff inside the network is hard
– Every step along the way must be fail-proof
– Different applications have different needs

• End hosts…
– Can’t depend on the network
– Can satisfy these requirements without network level support

Page 31: Distributed Systems aka Special Topics in Networking CS 7780

Example: Reliable File Transfer

Solution 1: Make the network reliable
Solution 2: App level, end-to-end check, retry on failure

[Figure: integrity checks at each step along the network path]

App has to do a check anyway!

Page 32: Distributed Systems aka Special Topics in Networking CS 7780

Example: Reliable File Transfer

Solution 1: Make the network reliable
Solution 2: App level, end-to-end check, retry on failure

[Figure: the receiver answers “Please Retry” after a failed end-to-end check]

Full functionality can be built at App level

• In-network implementation…
– Doesn’t reduce host complexity
– Does increase network complexity
– Increased overhead for apps that don’t need functionality
• But, in-network performance may be better
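Solution 2 can be sketched as follows. This is a minimal illustration, not the mechanism from the Saltzer, Reed, and Clark paper: the `corrupting_channel` function is a made-up stand-in for an unreliable path, SHA-256 is just one possible integrity check, and the sketch assumes the sender's digest reaches the receiver intact.

```python
import hashlib
import random


def corrupting_channel(data: bytes, rng: random.Random, error_rate: float) -> bytes:
    """Hypothetical unreliable path: flips one bit with probability error_rate."""
    if data and rng.random() < error_rate:
        corrupted = bytearray(data)
        index = rng.randrange(len(corrupted))
        corrupted[index] ^= 0x01
        return bytes(corrupted)
    return data


def transfer(data: bytes, rng: random.Random, error_rate: float,
             max_attempts: int = 10):
    """End-to-end check: compare digests at the endpoints, retry on mismatch.

    Returns (received_bytes, attempts_used), or raises OSError if every
    attempt arrives corrupted.
    """
    # Computed by the sending application over the original file contents.
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(1, max_attempts + 1):
        received = corrupting_channel(data, rng, error_rate)
        # Checked by the receiving application, whatever the network did.
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
        # Mismatch: the "Please Retry" step, so loop and send again.
    raise OSError(f"transfer failed after {max_attempts} attempts")
```

Calling `transfer(b"file contents", random.Random(1), error_rate=0.5)` returns the original bytes together with the number of attempts it took. The check lives entirely in the two applications, so it would still be needed even if every link on the path did its own integrity checking.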

Page 33: Distributed Systems aka Special Topics in Networking CS 7780

Conservative Interpretation

“Don’t implement a function at the lower levels of the system unless it can be completely implemented at this level” (Peterson and Davie)

Basically, unless you can completely remove the burden from endpoints, don’t bother

Page 34: Distributed Systems aka Special Topics in Networking CS 7780

Radical Interpretation

• Don’t implement anything in an intermediate DS component that can be implemented correctly by the endpoints

• Make each DS component absolutely minimal

• Ignore performance issues

Page 35: Distributed Systems aka Special Topics in Networking CS 7780

Moderate Interpretation

• Think twice before implementing functionality in a given component of the system

• If an endpoint can implement functionality correctly, implement it in a lower layer only as a performance enhancement

• But do so only if it does not impose a burden on applications that do not require that functionality…
– …and if it doesn’t cost too much to implement
– Cost = $ or complexity

Page 36: Distributed Systems aka Special Topics in Networking CS 7780

For next week

• Read the papers!
– I will set up HotCRP shortly

• No class Monday
• Start looking at papers to present
– When I open bidding, it will be FCFS
– Let me know if you want to add anything

• Start figuring out your project proposal