Data Management in Distributed Systems

Minqi Zhou
Software Engineering Institute
Office: Room 111 Mathematics Building
E-mail: mqzhou@sei.ecnu.edu.cn
Phone: 32204750-167
2010-09-16


Course Introduction

• Data Management in P2P Systems
  – Weeks 1-4

• Data Management in Cloud Systems
  – Weeks 5-10

• Computational Advertising
  – Weeks 11-18


Final Grades

• Regular coursework (60%)
  – Attendance
  – Presentation

• Final report (40%, English preferred)
  – Survey
  – Paper


A Brief Introduction to Distributed Systems


What Is a Distributed System?

• Multiple computers (“machines,” “hosts,” “boxes,” etc.)
  – Each with CPU, memory, disk, network interface
  – Interconnected by LAN or WAN (e.g., Internet)

• Application runs across this dispersed collection of networked hardware

• But user sees single, unified system


What Is a Distributed System? (Alternate Take)

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.”

– Leslie Lamport, Microsoft Research (ex DEC)


Start Simple: Centralized System

• Suppose you run Gmail
• Workload:
  – Inbound email arrives; store on disk
  – Users retrieve, delete their email

• You run Gmail on one server with disk

[Figure: one GmailServer (PC) with three EmailSender clients and three EmailReader clients]

What are the shortcomings of this design?


Why Distribute? For Availability

• Suppose the Gmail server goes down, or the network between it and the client goes down

• No incoming mail delivered, no users can read their inboxes

• Fix: replicate the data on several servers
  – Increased chance some server will be reachable
  – Consistency? One server is down when a message is deleted, then comes back up; the message returns to the inbox
  – Latency? Replicas should be far apart so they fail independently, but distance adds latency between them
  – Partition resilience? e.g., airline seat database splits, one seat remains, bought twice, once in each half!
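
The consistency question above can be made concrete with a small sketch. It assumes a hypothetical `Replica` class and a naive "take the union" synchronization step; neither comes from the slides, and real mail systems use more careful schemes (for example, recording deletions explicitly).

```python
class Replica:
    """A hypothetical mail replica that stores message IDs in a set."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.inbox = set()

    def deliver(self, msg_id):
        # A replica that is down misses the update entirely.
        if self.up:
            self.inbox.add(msg_id)

    def delete(self, msg_id):
        if self.up:
            self.inbox.discard(msg_id)


def naive_sync(replicas):
    """Give every live replica the union of all live inboxes.

    No record of deletions is kept, so a message deleted on one replica
    looks the same as a message that replica never received.
    """
    live = [r for r in replicas if r.up]
    merged = set().union(*(r.inbox for r in live))
    for r in live:
        r.inbox = set(merged)


a, b = Replica("a"), Replica("b")
for r in (a, b):
    r.deliver("msg-1")

b.up = False          # replica b goes down
a.delete("msg-1")     # the user deletes the message; only a sees the delete
b.up = True           # replica b comes back with its stale copy
naive_sync([a, b])

print(sorted(a.inbox))  # the deleted message is back in the inbox
```

Under these assumptions the deleted message reappears on replica `a`, which is the behavior the slide warns about.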


Why Distribute? For Scalable Capacity

• What if Gmail is a huge success?
• Workload exceeds capacity of one server
• Fix: spread users across several servers
  – Best case is linear scaling: if each box supports U users, N boxes support NU users
  – Bottlenecks? If each user’s inbox is on one server, how to route inbound mail to the right server?
  – Scaling? How close to linear?
  – Load balance? Some users get more mail than others!
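
One possible answer to the routing question is to hash the user name to a server index. The sketch below is illustrative only: the function `server_for`, the server count, and the skewed workload are all assumptions, not part of the slides.

```python
import hashlib
from collections import Counter


def server_for(user, num_servers):
    """Map a user name to a server index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(user.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


NUM_SERVERS = 4
users = [f"user{i}@example.com" for i in range(1000)]

# Any front end can compute the destination locally; no central lookup table is needed.
users_per_server = Counter(server_for(u, NUM_SERVERS) for u in users)

# Assumed skewed workload: one user in a hundred receives 500 messages, the rest receive 5.
mail_per_server = Counter()
for i, u in enumerate(users):
    mail_per_server[server_for(u, NUM_SERVERS)] += 500 if i % 100 == 0 else 5

print("users per server:", dict(users_per_server))
print("mail per server: ", dict(mail_per_server))
```

Hashing spreads users roughly evenly, but the mail volume per server can still be uneven when a few users receive far more mail than others. Note also that with `hash % N`, changing the number of servers remaps most users; consistent hashing is the usual technique for limiting that movement.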


Performance Can Be Subtle

• Goal: predictable performance under high load

• 2 employees run a Starbucks
  – Employee 1: takes orders from customers, calls them out to Employee 2
  – Employee 2:
    • writes down drink orders (5 seconds per order)
    • makes drinks (10 seconds per order)

• What is throughput under increasing load?


Starbucks Throughput

• Peak system performance: 4 drinks / min (5 s + 10 s = 15 s per order)
• What happens when load > 4 orders / min?
• What happens to efficiency as load increases?

What would the preferable curve be? What design achieves that goal?
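
The questions above can be explored with a small simulation. The slides do not say how Employee 2 chooses between tasks, so this sketch assumes one particular policy: Employee 2 always writes down a called-out order before making any drink. It also assumes orders arrive evenly spaced. The function name `drinks_per_minute` is hypothetical.

```python
WRITE_SECONDS = 5.0   # time to write down one order (from the slide)
MAKE_SECONDS = 10.0   # time to make one drink (from the slide)


def drinks_per_minute(orders_per_minute, minutes=60):
    """Simulate Employee 2 under the assumed 'write first' policy."""
    horizon = minutes * 60.0
    interval = 60.0 / orders_per_minute  # evenly spaced arrivals (assumption)
    now = 0.0
    written = 0  # orders written down so far
    made = 0     # drinks made so far
    while now < horizon:
        # Orders called out by time `now`; the first one arrives at t = 0.
        arrived = int(now // interval) + 1
        if written < arrived:
            now += WRITE_SECONDS          # writing always takes priority
            written += 1
        elif made < written:
            now += MAKE_SECONDS           # make a drink only when no order is waiting
            made += 1
        else:
            now = arrived * interval      # idle until the next order arrives
    return made / minutes


for load in (2, 4, 6, 8, 12, 16):
    print(f"{load:2d} orders/min -> {drinks_per_minute(load):.1f} drinks/min")
```

Under this policy throughput tracks the load up to the 4 drinks / min peak, then falls as more of Employee 2's time goes to writing down orders, and reaches zero once orders arrive at least as fast as they can be written down (12 per minute). A different policy would give a different curve.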


Why Are Distributed Systems Hard to Design?

• Failure: of hosts, of network
  – Remember Lamport’s lament

• Heterogeneity
  – Hosts may have different data representations

• Need consistency (many specific definitions)
  – Users expect familiar “centralized” behavior

• Need concurrency for performance
  – Avoid waiting synchronously, leaving resources idle
  – Overlap requests concurrently whenever possible
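
The concurrency point can be illustrated with a short sketch that compares waiting for each request in turn against overlapping them. The `fetch` coroutine is a stand-in that only sleeps; the host names and the 50 ms delay are made up for illustration.

```python
import asyncio
import time


async def fetch(host, delay=0.05):
    """Stand-in for a network request; `delay` simulates the round trip."""
    await asyncio.sleep(delay)
    return f"reply from {host}"


async def sequential(hosts):
    # Wait for each reply before sending the next request.
    return [await fetch(h) for h in hosts]


async def overlapped(hosts):
    # Issue all requests at once and wait for them together.
    return await asyncio.gather(*(fetch(h) for h in hosts))


def timed(coro_fn, hosts):
    """Run one of the strategies above and return (replies, elapsed seconds)."""
    start = time.perf_counter()
    replies = asyncio.run(coro_fn(hosts))
    return replies, time.perf_counter() - start


hosts = ["a", "b", "c", "d"]
seq_replies, seq_time = timed(sequential, hosts)
par_replies, par_time = timed(overlapped, hosts)
print(f"sequential: {seq_time:.2f}s, overlapped: {par_time:.2f}s")
```

The sequential version takes roughly the sum of the delays, while the overlapped version takes roughly the longest single delay, because the caller is not idle while each reply is in flight.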
