490dp
Part I: Challenges
Intermezzo: Applications
Part II: Java Object Serialization
Robert Grimm
Post on 19-Dec-2015
Challenges
• Pervasive computing
  – Vision: Focus on users and their tasks
  – Enabled by ubiquitous smart devices
• Central question
  – How can devices get users’ tasks done?
• They need to work together!
Distributed State
“Information retained in one place that describes something, or is determined by something, somewhere else in the system”
• Examples
  – Association between addresses and names
  – Sequence number to identify most recent data
  – File block cached in memory of client
  – List of clients caching a file
Why is Distributed State Good?
• Performance
  – Not going over the network saves time
  – Example: Local cache of files
• Coherency
  – Easier to coordinate based on knowledge
  – Example: Server notification when cache expires
• Reliability
  – Replication makes it possible to tolerate failures
  – Example: Same files stored on two servers
Why is Distributed State Bad?
• Consistency
• Crash sensitivity
• Time and space overheads
• Complexity
Consistency
• Problem: Keep copies consistent
• Approaches
  – Detect stale data on use
    • Treat copy as hint
    • Example: name-to-address map
  – Prevent inconsistency
    • Require exclusive ownership before modifying
    • Example: all operations go through one node
  – Tolerate inconsistency
    • Make window of inconsistency small
    • Example: delays in network games
Crash Sensitivity
• Problem: Mask failures
• Approaches
  – Reconstruct state
    • Example: reopening files in Sprite FS
  – Limit degree of distribution / affected state
    • Example: partition files according to usage
  – Fully replicate state
    • Example: Coda file system
Time and Space Overheads
• Time
  – Go across the network
• Space
  – Distributed copies
  – Tracking distributed copies
• Overheads depend on
  – Degree of sharing
  – Degree of modification
Complexity
• Distributed state requires
  – Maintaining consistency
  – Masking of failures
• Distributed state makes it harder to
  – Debug
  – Tune
No Perfect Solution
• Solution needs to be “good enough”
               NFS       Sprite FS
Consistency    Limited
Availability   Limited   Limited
Scalability    Limited   Limited
               (Baseline OK)
Complexity     Low       High
What Does “Good Enough” Mean?
• Depends on application domain
  – Make an informed trade-off
• Examples
  – Cluster-based services
    • Porcupine
    • Distributed Data Structures
  – Disconnected storage services
    • Epidemic replication
    • Two-tier replication
Porcupine
• Cluster-based email server
• Assumptions
  – Email typically doesn’t get modified
  – Deleted emails may reappear (temporarily)
• Eventual consistency
• But, availability and scalability
[Saito et al. 99]
Distributed Data Structures
• Cluster-based hash table
• Assumptions
  – Network is fast and doesn’t partition
  – Nodes fail infrequently
  – OK to return failure at storage layer
• Consistency, availability, and scalability
[Gribble et al. 00]
Bayou
• Epidemic replication [Demers et al. 87]
  – Two nodes periodically synchronize state
  – Only pair-wise connectivity
• Structured storage (database)
• Eventually consistent
• But, always available
[Petersen et al. 97]
Coda
• First-tier nodes
  – Fully connected
  – Store all data
• Second-tier nodes
  – Often disconnected
  – Store subset of data
• Limited consistency, but greater availability
[Kistler & Satya 92, Mummert et al. 95]
Conflicts
• Caused by competing updates
• Detected “after the fact”
• Need to be resolved automatically
Conflict Resolution Techniques
• Based on data
  – Timestamps
  – Heuristics
  – Programs [Kumar & Satya 95, Reiher et al. 94]
• Part of update: Bayou [Terry et al. 95]
  – Dependency check
  – Merge procedure
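A minimal sketch of the "part of update" idea, assuming a replica can be modeled as a simple map. The class `CheckedWrite`, the meeting-slot data, and all names are hypothetical illustrations and not Bayou's actual interface: each write carries its own dependency check and a merge procedure that runs when the check fails.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical type for illustration; not Bayou's real API.
class CheckedWrite {
    private final Predicate<Map<String, String>> dependencyCheck;
    private final Consumer<Map<String, String>> update;
    private final Consumer<Map<String, String>> mergeProcedure;

    CheckedWrite(Predicate<Map<String, String>> dependencyCheck,
                 Consumer<Map<String, String>> update,
                 Consumer<Map<String, String>> mergeProcedure) {
        this.dependencyCheck = dependencyCheck;
        this.update = update;
        this.mergeProcedure = mergeProcedure;
    }

    // Runs at each replica that receives the write.
    void applyTo(Map<String, String> replica) {
        if (dependencyCheck.test(replica)) {
            update.accept(replica);          // no conflict: apply as requested
        } else {
            mergeProcedure.accept(replica);  // conflict: resolve automatically
        }
    }

    static Map<String, String> demo() {
        Map<String, String> calendar = new TreeMap<>();
        calendar.put("10:00", "bob");        // a competing update arrived first

        CheckedWrite reserve = new CheckedWrite(
            db -> !db.containsKey("10:00"),          // dependency check: slot still free?
            db -> db.put("10:00", "alice"),          // intended update
            db -> db.putIfAbsent("11:00", "alice")); // merge: fall back to another slot
        reserve.applyTo(calendar);
        return calendar;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the check and the merge travel with the write, every replica that applies the same writes in the same order should reach the same result without user intervention.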
Morals
• No perfect solution
  – Need to exploit application domain
• Complexity grows very quickly
  – Beware of special case code (recovery)
Intermezzo: Applications
• Team 1: Cluster-based application
  – Scalable Napster / Gnutella repository
  – Scalable document repository
    • Leased storage
    • Customizable actions when leases expire
• Team 2: Roving application
  – Personal jukebox
  – PIM on steroids
  – Universal inbox
Java Object Serialization
• Problem
  – Turn graph of objects into byte string
  – Turn byte string back into graph of objects
[Figure: two copies of an object graph with nodes A, B, C, and D]
The Basic Idea
• Write a description of each object
• Keep track of each written object
• <1:A <2:B <3:D>> <4:C ref(3)>>
[Figure: object graph with nodes A, B, C, and D, next to a copy in which D appears twice]
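The two steps above can be sketched with a toy writer. This is not how java.io encodes objects; `Node` and `ToyGraphWriter` are hypothetical names, and the textual format simply mirrors the notation on the slide. The point is the identity-based table of already-written objects, which lets the second visit to D become a back reference.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy node type; not part of java.io.
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class ToyGraphWriter {
    // Tracks written objects by identity (==), not by equals().
    private final Map<Node, Integer> handles = new IdentityHashMap<>();
    private int nextHandle = 1;

    String write(Node n) {
        Integer seen = handles.get(n);
        if (seen != null) {
            return "ref(" + seen + ")";      // already written: emit a reference
        }
        int handle = nextHandle++;
        handles.put(n, handle);              // register before descending
        StringBuilder sb = new StringBuilder("<" + handle + ":" + n.name);
        for (Node child : n.children) {
            sb.append(' ').append(write(child));
        }
        return sb.append('>').toString();
    }

    static String demo() {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
        a.children.add(b);
        a.children.add(c);
        b.children.add(d);
        c.children.add(d);                   // D is shared by B and C
        return new ToyGraphWriter().write(a);
    }

    public static void main(String[] args) {
        System.out.println(demo());          // prints <1:A <2:B <3:D>> <4:C ref(3)>>
    }
}
```

Registering the handle before visiting the children also keeps the writer from looping forever on cyclic graphs.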
All Things Serializable
• Not everything is serializable
  – java.lang.Object
  – java.lang.Thread
• Serializable objects implement java.io.Serializable
  – An empty marker interface
Default Serialization
• Writes out all fields (except static and transient ones)
  – Independent of their access controls (private, package private, protected, public)
• Good style to document invariants
  – Use @serial tag
      @serial Must not be <code>null</code>
Default Deserialization
• Allocates memory for new object
  – No constructor of the serializable class is invoked
  – Fields initialized to their default values
• Reads in all fields
  – Independent of their access controls
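A small round trip makes the default behavior visible. This is a sketch under assumptions: `Temperature` and `DefaultSerializationDemo` are hypothetical names, and the static counter exists only to observe constructor calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;
    static int constructorCalls = 0;   // static, so not part of the serialized state
    private final int degrees;         // private, yet written and read by default

    Temperature(int degrees) {
        this.degrees = degrees;
        constructorCalls++;
    }

    int degrees() { return degrees; }
}

class DefaultSerializationDemo {
    // Serializes an object into a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Temperature copy = roundTrip(new Temperature(21));
        System.out.println(copy.degrees());               // the private field survived
        System.out.println(Temperature.constructorCalls); // counts only the explicit new
    }
}
```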
Transient Fields
• Some fields shouldn’t or can’t be serialized
    private Object lock;
• How to prevent default serialization from trying to write them out?
  – Declare such fields as “transient”
      private transient Object lock;
  – Restored to defaults during deserialization
    • null in above example
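The effect can be checked with a short round trip. `Account` and `TransientDemo` are hypothetical names chosen for this sketch; note that the field initializer does not run during deserialization, so the transient field really does come back as null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String owner;
    private transient Object lock = new Object();  // skipped by default serialization

    Account(String owner) { this.owner = owner; }

    String owner() { return owner; }
    boolean hasLock() { return lock != null; }
}

class TransientDemo {
    // Serializes an object into a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account original = new Account("alice");
        Account copy = roundTrip(original);
        System.out.println(original.hasLock()); // true: the initializer ran
        System.out.println(copy.hasLock());     // false: transient field is null
    }
}
```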
Overriding Serialization
• Customize serialization by implementing
    private void writeObject(ObjectOutputStream)
        throws IOException;
• Good style to document customization
  – Use @serialData tag
Overriding Serialization
• Example: Thread-safe serialization
    private void writeObject(ObjectOutputStream out)
        throws IOException {
      synchronized (lock) {
        out.defaultWriteObject();
      }
    }
Overriding Serialization
• Example: Filter elements from a list
  – Declare list to be transient
  – In writeObject()
    • Invoke default serialization
    • Iterate over list, writing filtered elements
        out.writeObject(el);
    • Write end-of-list marker
        out.writeObject(Boolean.FALSE);
    • Alternatively, write length & elements
Overriding Deserialization
• Customize deserialization by implementing
    private void readObject(ObjectInputStream)
        throws IOException, ClassNotFoundException;
Overriding Deserialization
• Example: Restore lock
    private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException {
      in.defaultReadObject();
      lock = new Object();
    }
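Put together, the thread-safe writeObject() and the lock-restoring readObject() might look like the following sketch. `SharedCounter` and `LockDemo` are hypothetical names; only the two private hooks come from the slides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class SharedCounter implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient Object lock = new Object();  // not written to the stream
    private int count;

    void increment() { synchronized (lock) { count++; } }
    int count() { synchronized (lock) { return count; } }

    // Hold the lock so no increment happens while the fields are written.
    private void writeObject(ObjectOutputStream out) throws IOException {
        synchronized (lock) {
            out.defaultWriteObject();
        }
    }

    // Recreate the transient lock, which would otherwise be null.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        lock = new Object();
    }
}

class LockDemo {
    // Serializes an object into a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SharedCounter counter = new SharedCounter();
        counter.increment();
        counter.increment();
        SharedCounter copy = roundTrip(counter);
        copy.increment();                   // works because readObject restored the lock
        System.out.println(copy.count());
    }
}
```

Without the readObject() hook, the first synchronized block on the copy would fail with a NullPointerException.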
Overriding Deserialization
• Example: Restore list
• In readObject()
  – Invoke default deserialization
  – Read filtered elements until end-of-list marker
  – Alternatively, read length & elements
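The filtering writeObject() and the matching readObject() can be sketched as one class. `Outbox`, `FilterDemo`, and the "skip empty strings" filter are assumptions made for illustration. Boolean.FALSE works as the end-of-list marker here only because the list never contains Boolean values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class used only for this illustration.
class Outbox implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String owner;
    private transient List<String> messages = new ArrayList<>(); // written by hand

    Outbox(String owner) { this.owner = owner; }

    void add(String message) { messages.add(message); }
    List<String> messages() { return messages; }
    String owner() { return owner; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();              // writes owner
        for (String el : messages) {
            if (!el.isEmpty()) {               // hypothetical filter: drop empty messages
                out.writeObject(el);
            }
        }
        out.writeObject(Boolean.FALSE);        // end-of-list marker
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();                // reads owner
        messages = new ArrayList<>();
        // equals(), not ==: the marker read back is a different Boolean instance.
        for (Object el = in.readObject(); !Boolean.FALSE.equals(el); el = in.readObject()) {
            messages.add((String) el);
        }
    }
}

class FilterDemo {
    // Serializes an object into a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Outbox box = new Outbox("alice");
        box.add("a");
        box.add("");
        box.add("b");
        System.out.println(roundTrip(box).messages());
    }
}
```

Writing the element count first, as the slides mention, avoids the marker but requires filtering into a temporary list before writing.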
Notes on Customization
• Don’t perform operations that take a long time
  – No I/O besides accessing object stream
• Swing UI elements are serializable
  – But are not designed for long-term storage
  – Declare them transient
  – Restore UI in application logic
The Replacements
• Example: Symbols (there can only be one)
    private Object readResolve()
        throws ObjectStreamException {
      return intern(name);
    }
• Done after object graph has been restored
  – Embedded self references are not replaced!
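A minimal sketch of an interned symbol class, assuming a static table keyed by name. The table, the private constructor, and the names `Symbol` and `SymbolDemo` are assumptions; only readResolve() and the call to intern(name) come from the slide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class used only for this illustration.
final class Symbol implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final Map<String, Symbol> TABLE = new ConcurrentHashMap<>();
    private final String name;

    private Symbol(String name) { this.name = name; }

    // Returns the one canonical Symbol for the given name.
    static Symbol intern(String name) {
        return TABLE.computeIfAbsent(name, Symbol::new);
    }

    // Called by the serialization runtime on the freshly read copy;
    // the returned object replaces that copy in the restored graph.
    private Object readResolve() throws ObjectStreamException {
        return intern(name);
    }
}

class SymbolDemo {
    // Serializes an object into a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Symbol x = Symbol.intern("x");
        System.out.println(roundTrip(x) == x);  // same instance, thanks to readResolve
    }
}
```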
Inheritance
• If a superclass implements Serializable, all subclasses are also serializable
  – Each class in such a hierarchy serializes only its own state
  – Classes can control all state by implementing java.io.Externalizable
• If superclass is not Serializable, a serializable subclass must handle the superclass’s state
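The last case can be sketched as follows. `PlainPoint` and `LabeledPoint` are hypothetical names. The non-serializable superclass needs a no-argument constructor accessible to the subclass, because the runtime invokes it during deserialization; the subclass then writes and reads the inherited fields itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical superclass that is NOT Serializable.
class PlainPoint {
    protected int x;
    protected int y;

    PlainPoint() { }                       // invoked during deserialization
    PlainPoint(int x, int y) { this.x = x; this.y = y; }
}

// Hypothetical serializable subclass that saves the superclass's state by hand.
class LabeledPoint extends PlainPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String label;

    LabeledPoint(int x, int y, String label) {
        super(x, y);
        this.label = label;
    }

    String describe() { return label + "@" + x + "," + y; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();          // writes label only
        out.writeInt(x);                   // superclass state, written explicitly
        out.writeInt(y);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();            // reads label
        x = in.readInt();                  // restore in the same order as written
        y = in.readInt();
    }
}

class InheritanceDemo {
    // Serializes an object into a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new LabeledPoint(3, 4, "p")).describe());
    }
}
```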
Inheritance
• To make a subclass of a serializable class not serializable
    private void writeObject(ObjectOutputStream o)
        throws IOException {
      throw new NotSerializableException(
          getClass().getName());
    }
• This indicates a semantic problem!
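A short sketch of the opt-out in action. `Document`, `OpenDocument`, and `OptOutDemo` are hypothetical names; only the throwing writeObject() comes from the slide. A matching readObject() that throws would presumably be wanted as well, to reject incoming streams.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable base class.
class Document implements Serializable {
    private static final long serialVersionUID = 1L;
    String title = "notes";
}

// Hypothetical subclass that refuses to be serialized.
class OpenDocument extends Document {
    private static final long serialVersionUID = 1L;

    private void writeObject(ObjectOutputStream o) throws IOException {
        throw new NotSerializableException(getClass().getName());
    }
}

class OptOutDemo {
    // Returns true if the object can be written to an object stream.
    static boolean canSerialize(Object candidate) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;                  // the subclass opted out
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Document()));
        System.out.println(canSerialize(new OpenDocument()));
    }
}
```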
Versioning
• Problem
  – Classes can change
  – While instance is in serialized form
• Solution
  – Let classes declare their version
  – Define what are compatible changes
Stream Unique Identifier (SUID)
• Hash of the class
  – Determined by serialver tool
  – Accessible in Java through ObjectStreamClass.getSerialVersionUID()
• Modified version declares same SUID as original version
    private static final long serialVersionUID = …;
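Reading the SUID from Java can be sketched as below. `Settings`, `SuidDemo`, and the value 42L are hypothetical; ObjectStreamClass.lookup() returns the stream descriptor for a serializable class, and getSerialVersionUID() is called on that descriptor.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class with an explicitly declared SUID.
class Settings implements Serializable {
    private static final long serialVersionUID = 42L;  // arbitrary value for the sketch
    int volume;
}

class SuidDemo {
    // Returns the SUID the serialization runtime will use for the class.
    static long suidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            // lookup() returns null for classes that are not serializable
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(suidOf(Settings.class));   // the declared value
    }
}
```

For a class without a declared serialVersionUID, the same call returns the computed hash, which is the value a later version would have to declare to stay compatible.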
Incompatible Changes
• Deleting fields
• Moving classes up or down in the hierarchy
• Changing non-static fields to static
• Changing non-transient fields to transient
• Changing the declared type of a field
• Adding / removing access to default fields from writeObject() / readObject()
• See specification!
Compatible Changes
• Adding fields
• Adding / removing classes
• Adding Serializable
• Adding / removing writeObject() / readObject()
• Changing static fields to non-static
• Changing transient fields to non-transient
Security
• Serialized objects expose their internal state
• If that state is sensitive it must be protected
  – Don’t serialize sensitive state
  – Encrypt sensitive state
  – Encrypt serialized objects
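One possible way to encrypt a serialized object is javax.crypto.SealedObject, which serializes its argument and encrypts the resulting bytes. This is a sketch, not a recommendation from the slides: the class name `SealedStateDemo`, the AES/CBC cipher choice, and the in-memory key are assumptions, and real key management is out of scope.

```java
import java.io.IOException;
import java.io.Serializable;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

class SealedStateDemo {
    // Generates a fresh AES key; a real application would load a managed key.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes and encrypts the state; the result is itself Serializable.
    static SealedObject seal(Serializable state, SecretKey key) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);   // the provider picks a random IV
            return new SealedObject(state, cipher);
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decrypts and deserializes the original object.
    static Object unseal(SealedObject sealed, SecretKey key) {
        try {
            return sealed.getObject(key);
        } catch (GeneralSecurityException | IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        SealedObject sealed = seal("account balance: 100", key);
        System.out.println(unseal(sealed, key));
    }
}
```

The sealed object can be written to an ObjectOutputStream like any other serializable object, so only holders of the key can recover the internal state.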