Distributed Systems Fall 2011
Introduction
2
Distributed Systems?
3
“A distributed system is one in which components located at networked computers communicate and coordinate their actions by passing messages.” (Coulouris, Dollimore, Kindberg, 2005)
4
“A distributed system is one in which nodes communicate and coordinate their actions by passing messages.” (Larsson, 2010)
5
Outline
• Staff presentation
• Course presentation
• Lessons from last year
• This year's course
• Basics and challenges of distributed systems
• The big assignment
6
Staff
• Nalin Ranasinghe ([email protected])
• Daniel Espling ([email protected])
• Lars Larsson ([email protected])
• Questions about the assignment?
  – Send to [email protected]
• Questions about lectures?
  – Send email to the appropriate teacher!
7
Assistance
• Email us if you need us!
  – [email protected]
• We will either respond by mail or come to D420 or whatever lab you’re currently in
  – Most days between 13:00 and 14:30
  – Priority / FIFO queue
8
Course presentation
• Theoretical part (4.5 ECTS)
  – Theory, methods, algorithms, and principles
• Practical part (3 ECTS)
  – Practical obligatory assignments
9
Course presentation
• Students should obtain:
  – Knowledge of theoretical models for distributed systems
  – Knowledge of problems and solutions in the design and implementation of distributed systems
10
Course presentation
• The course covers:
  – Architectural models of distributed systems
  – Client-server, peer-to-peer, transactions, transparency, naming, error handling, resource management, and synchronization … and much more!
  – Computer security in a broad perspective
  – Distributed programming and middleware
11
Lessons from last year
• Students were very happy with the staff and the amount of help they got
• Good disposition, good assignments
• More about security!
• Assignment too hard; don’t let this course be the first one for newly arrived master's students
12
Effort vs. utility
          Number of students   Avg. #hours per week   Course quality
Group 1   11                   12.5                   3.7
Group 2   5                    23.4                   4.6
Group 3   7                    21.6                   4.7
13
This year's course
• Keep up the good work:
  – Assistance with assignment
  – Comment box on web site
  – Teaching
  – etc.
14
About the book…
1. Buy the book.
2. No, seriously. Buy it!
3. Which edition? 4 or 5?
15
The big assignment
• GCom – group communication middleware
• Apply concepts from theory
  – Group handling
  – Message ordering
  – (Reliable) Multicast of messages
  – Not security, however
16
What to learn?
• Book is dense with information
  – See reading guide on web page – it is actually accurate
  – Extremely good, but no easy read
• Start now! You will be busy later...
• Understand the problems and solutions
  – Learn the general ideas of algorithms and how/why they work, not every minute step
• Definitions are very important!
17
Benefits of distributed systems
• Resource sharing
  – CPU, storage, attached equipment, networking (e.g. NAT routing)
• Functional distribution
  – Separation of concerns
• Security enforcement
• Load balancing
• Bridging physical separation
• Economics
18
Properties of distributed systems
• No global clock
  – Processes cannot be perfectly synchronized (use logical time instead)
• No global state
  – A process can never be aware of a single global state of the system
• Independent failures
  – A process can fail at any time
  – Can you detect this?
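The "use logical time instead" point can be illustrated with a Lamport-style logical clock. This is a minimal sketch, not course-provided code; the class and method names (`LamportClock`, `tick`, `receive`) are made up for the illustration.

```python
class LamportClock:
    """Logical clock: counts events instead of reading wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance the counter."""
        self.time += 1
        return self.time

    def receive(self, sent_timestamp):
        """Message receipt: move past both our own time and the sender's."""
        self.time = max(self.time, sent_timestamp) + 1
        return self.time


# Two processes with no shared clock (hypothetical scenario).
p, q = LamportClock(), LamportClock()
p.tick()                    # local event at p
stamp = p.tick()            # p sends a message carrying this timestamp
q_time = q.receive(stamp)   # q's clock jumps past the send timestamp
```

The receive event always gets a larger timestamp than the matching send, so timestamps respect "happened before" even though the two processes never compare physical clocks.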
19
So many failures!
• Omission failures
  – Process crashes, failed message deliveries
• Timing failures
  – Too slow networks, laggy processes
• Arbitrary failures
  – Buggy processes, buggy networks
  – These are the worst…
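One common way to try to detect a crashed process is a timeout on periodic heartbeat messages. The sketch below assumes such a scheme; `HeartbeatDetector` and its methods are hypothetical names, and time is passed in explicitly so the example is deterministic.

```python
class HeartbeatDetector:
    """Suspects a process if no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}   # process name -> time of latest heartbeat

    def heartbeat(self, process, now):
        """Record that a heartbeat from `process` arrived at time `now`."""
        self.last_seen[process] = now

    def suspected(self, now):
        """Return the processes that have been silent for too long."""
        return sorted(p for p, seen in self.last_seen.items()
                      if now - seen > self.timeout)


detector = HeartbeatDetector(timeout=3.0)
detector.heartbeat("A", now=0.0)
detector.heartbeat("B", now=0.0)
detector.heartbeat("A", now=2.5)          # A is still talking, B went quiet
suspects = detector.suspected(now=4.0)    # only B exceeds the timeout
```

Note that a suspected process may only be slow (a timing failure) rather than crashed (an omission failure); a timeout alone cannot tell the two apart.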
20
Design challenges
• Failure handling
  – Detection, masking, redundancy, dependability
• Resource heterogeneity
  – Networks, hardware, software stacks, design patterns
• Security
• Scalability and QoS
  – Performance, bottlenecks, resource integrity, caching
21
More design challenges
• Failure handling
  – Detection, masking, redundancy
• Concurrency
  – Interleaving sessions, locks
• Openness
  – Standards, competitors
• Transparency
  – Users shouldn’t have to know!
22
System models
• (relatively) “smart clients”
• (relatively) “dumb clients” and n-tier services
• Stateless clients (e.g. HTTP)
• Peer-to-Peer (P2P)
  – BitTorrent, Freenet, Direct Connect, …
• Combinations: multiple servers, mobile code, mobile agents, thin clients
23
Middleware
• Distributed systems often utilize middleware to aid development
• Offers layer of abstraction
• Extends upon traditional programming models:
  – Local procedure call → Remote procedure call
  – OOP → Remote Method Invocation
  – Event-based programming model
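The step from a local procedure call to a remote one can be sketched with a client-side stub that turns each method call into a request message. This is an illustration only: the "network" is a plain function call inside one process, and `Calculator`, `Stub`, and `handle_request` are hypothetical names, not part of any real middleware.

```python
import json


class Calculator:
    """The 'remote' object; in this sketch it lives in the same process."""

    def add(self, a, b):
        return a + b


def handle_request(target, raw_request):
    """Server side: decode the request, invoke the method, encode the reply."""
    request = json.loads(raw_request)
    method = getattr(target, request["method"])
    return json.dumps({"result": method(*request["args"])})


class Stub:
    """Client side: looks like a local object, but each call becomes a message."""

    def __init__(self, send):
        self._send = send   # function that carries a request and returns the reply

    def __getattr__(self, name):
        # Called only for attributes the stub does not have, i.e. remote methods.
        def remote_call(*args):
            reply = self._send(json.dumps({"method": name, "args": list(args)}))
            return json.loads(reply)["result"]
        return remote_call


server_object = Calculator()
calc = Stub(lambda raw: handle_request(server_object, raw))
total = calc.add(2, 3)   # reads like a local call, travels as a JSON message
```

Real middleware replaces the lambda with an actual transport and has to deal with lost messages and failures, which is where the invocation semantics discussed later come in.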
24
Middleware
Applications, Services
Middleware layers:
  RMI, RPC
  Request/Reply protocol
  Marshalling, Unmarshalling
UDP, TCP
25
Operation invocation
• Data structures must be “flattened” and serialized (marshaled) for transport
  – External formats, e.g. XML, JSON, Java Object Serialization, ...
• Use interface
  – Procedures having either input, output, or both
  – No pointers
  – Service interface: provided services
  – Remote interface: operations accessible from other processes
  – Cross-language/platform interfaces: IDL, WSDL
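As a small illustration of marshalling, the sketch below flattens a nested object into JSON bytes and rebuilds an equal copy on the "other side". The `Group` and `Member` types and the `marshal`/`unmarshal` helpers are invented for the example; JSON is just one of the external formats listed above.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Member:
    name: str
    port: int


@dataclass
class Group:
    name: str
    members: list


def marshal(group):
    """Flatten the object graph into bytes suitable for transport."""
    return json.dumps(asdict(group), sort_keys=True).encode("utf-8")


def unmarshal(data):
    """Rebuild an equivalent object from the received bytes."""
    fields = json.loads(data.decode("utf-8"))
    members = [Member(**m) for m in fields["members"]]
    return Group(name=fields["name"], members=members)


original = Group("chat", [Member("alice", 5001), Member("bob", 5002)])
wire_bytes = marshal(original)
copy = unmarshal(wire_bytes)   # equal in value, but a distinct object
```

The receiver gets a copy, never a reference to the sender's object, which is why remote interfaces cannot pass pointers.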
26
Semantics
• (Local call = exactly once)
• Maybe once
  – Omission failures (lost packets, crashes)
• At-least-once
  – Crash failures, arbitrary failures (multiple executions)
  – Used by Sun RPC
• At-most-once
  – Executed exactly once or not at all
  – Used by Java RMI, CORBA
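The difference between at-least-once and at-most-once shows up when a reply is lost and the client retries. The sketch below simulates this in one process; `Server`, `call_with_retry`, and the lost-reply counter are assumptions made for the illustration and do not describe how Sun RPC or Java RMI are implemented.

```python
class Server:
    """Holds a balance; optionally remembers replies to filter duplicates."""

    def __init__(self, filter_duplicates):
        self.filter_duplicates = filter_duplicates
        self.balance = 0
        self.replies = {}   # request id -> reply already sent for it

    def deposit(self, request_id, amount):
        if self.filter_duplicates and request_id in self.replies:
            # Duplicate request: resend the saved reply, do not execute again.
            return self.replies[request_id]
        self.balance += amount
        self.replies[request_id] = self.balance
        return self.balance


def call_with_retry(server, request_id, amount, lost_replies):
    """Resend the same request until a reply gets through."""
    while True:
        reply = server.deposit(request_id, amount)
        if lost_replies > 0:
            lost_replies -= 1   # simulate a reply lost on the network
            continue
        return reply


at_least_once = Server(filter_duplicates=False)
at_most_once = Server(filter_duplicates=True)
call_with_retry(at_least_once, request_id=1, amount=100, lost_replies=2)
call_with_retry(at_most_once, request_id=1, amount=100, lost_replies=2)
```

With two lost replies, the server without duplicate filtering executes the deposit three times, while the filtering server executes it once. At-least-once is therefore only safe for idempotent operations.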
27
Security
• Distributed system = increased exposure
• Client- and Server-authentication
• Client authorization
  – Is the client allowed to perform X?
• Proof of execution
  – Server must be able to prove that something has been executed
  – Also, non-repudiation: it should not be possible to claim that something did not happen if it did
28
Distributed systems: a mess!
• Communication performance variations
  – Latency (delay), bandwidth (throughput), jitter (variation in time)
• Clocks and timing
  – Clock drift
• Interaction models
  – Asynchronous, synchronous
• Event ordering
  – Delays can cause a reply to arrive at some process before the request it answers
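A simple defence against out-of-order arrival is to number messages and hold back any that arrive early. The sketch below covers only the per-sender (FIFO) case; a reply overtaking a request from a different sender is a causal-ordering problem and needs more machinery than shown here. `FifoReceiver` is a hypothetical name.

```python
class FifoReceiver:
    """Delivers each sender's messages in sequence-number order."""

    def __init__(self):
        self.next_expected = {}   # sender -> next sequence number to deliver
        self.held_back = {}       # (sender, seq) -> payload waiting its turn
        self.delivered = []       # payloads in delivery order

    def receive(self, sender, seq, payload):
        expected = self.next_expected.get(sender, 0)
        if seq < expected:
            return   # duplicate of a message that was already delivered
        self.held_back[(sender, seq)] = payload
        # Deliver as many consecutive messages as are now available.
        while (sender, expected) in self.held_back:
            self.delivered.append(self.held_back.pop((sender, expected)))
            expected += 1
        self.next_expected[sender] = expected


receiver = FifoReceiver()
receiver.receive("p", 1, "second")       # arrives early, is held back
held = list(receiver.delivered)          # nothing delivered yet
receiver.receive("p", 0, "first")        # gap filled, both are delivered
```

Arrival order and delivery order are kept separate: the network decides the first, the receiver decides the second.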
29
Distributed systems: a mess!
• Failures
  – Distributed systems are much more likely to fail unexpectedly
  – Lost packets, bit errors, local failures, no response, method does not exist, etc …

If you can write stable programs in spite of these difficulties, you are a great programmer!
30
The big assignment
• Group communication middleware
  – Group membership handling
  – Message ordering guarantees
  – (Reliable) Multicast communication
• Presentation of working implementation at the end of the course
• Deals with theory from the first set of lectures
http://www.cs.umu.se/kurser/5DV020/HT11/assignment.html
31
Rules and grading
• Solved in pairs
• Three levels
  – Bonus points for the exam (if non-bonus points give you ≥ 30p of 60p total)!
    • Valid for this year's exams only
  – Level 1: basic system (no bonus)
  – Level 2: + dynamic groups (3p bonus)
  – Level 3: + tree-based reliable multicast (6p bonus)
32
Levels
• You may change level at any time
• Level 1 is the easiest, but in practice only if you aim for it from the beginning
  – Many problems can be avoided thanks to the greatly lowered fault-tolerance requirements of the system
33
Constraints
• May use any programming language and any tools you like
  – ...as long as they do not provide too big an advantage (check with us!)
  – Currently, we will only help with Java RMI
  – You may absolutely not use plain sockets
• All normal rules apply
  – Thou shalt not cheat, etc.
34
Test and debug application(s)
• Test application
  – A user-level application that shows the functionality of the system
• Debug application
  – Used to demonstrate the correctness of your implementation
• These programs can, and likely will, be one and the same!
  – But make the debug parts non-essential to use the application
  – Must be GUI applications!
35
Deliverables
• Deliverable 1 (project plan) – Dec. 2
  – Your interpretation of the assignment
  – Requirements analysis
  – Project and time plan
  – Basic design of the system
    • Yes, really
• Deliverable 2 (report) – Jan. 12
  – Refers back to Deliverable 1
  – Describe your system
  – ...the usual
  – Make something to be proud of!
    • One of your biggest projects during your time here at CS
36
Live demonstration
• You will demonstrate your system to us at the end of the course
  – Written test protocol
37
Good luck!
• Students have done this before, and succeeded
  – Certainly not easy
  – Hard work, big payoff
  – All students that attempted the entire assignment passed!
• Hints
  – Start on time (this afternoon!)
  – Read the whole specification
  – We know it’s long, but it helps you
38
Next lecture
• Fundamental properties of distributed systems