Type Systems for Distributed Data Sharing
Ben Liblit, Alex Aiken, Kathy Yelick
Post on 21-Dec-2015
Distributed Sharing: Many Uses
• Data location management
• Cache coherence
• Race condition detection
• Program/algorithm documentation
• Consistency model relaxation
• Synchronization elimination
• Autonomous garbage collection
• Security
Distributed Memory Model
• Multiple machines, each with local memory
• Global memory is union of local memories
• Distinguish two types of pointers:
– Local: points to local memory only
– Global: points anywhere (machine, address)
– Different representations & operations
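The two pointer representations above can be sketched in a few lines of Python. This is only an illustration under assumed names: `LocalPtr`, `GlobalPtr`, `Cluster`, `widen`, and `narrow` are hypothetical and do not come from the slides or from Titanium.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalPtr:
    """An address that is only meaningful on the machine holding it."""
    address: int


@dataclass(frozen=True)
class GlobalPtr:
    """A (machine, address) pair that can name memory anywhere."""
    machine: int
    address: int


class Cluster:
    """Global memory modeled as the union of per-machine local memories."""

    def __init__(self, machine_count: int) -> None:
        self.memories = [dict() for _ in range(machine_count)]

    def store_local(self, here: int, ptr: LocalPtr, value) -> None:
        self.memories[here][ptr.address] = value

    def load_local(self, here: int, ptr: LocalPtr):
        # A local load never needs to know which machine owns the data.
        return self.memories[here][ptr.address]

    def load_global(self, ptr: GlobalPtr):
        # A global load must first select the owning machine.
        return self.memories[ptr.machine][ptr.address]


def widen(ptr: LocalPtr, here: int) -> GlobalPtr:
    """Turn a local pointer into a global one by tagging it with its machine."""
    return GlobalPtr(machine=here, address=ptr.address)


def narrow(ptr: GlobalPtr, here: int) -> LocalPtr:
    """Coerce a global pointer to local; valid only if it points to this machine."""
    if ptr.machine != here:
        raise ValueError("global pointer refers to another machine")
    return LocalPtr(ptr.address)
```

Widening always succeeds, while narrowing needs a check. That asymmetry is why the two kinds of pointers support different operations.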
Type Grammar
τ ::= int | boxed ω τ | ⟨τ, τ⟩
ω ::= local | global
• Boxed and unboxed values
• Integers, pointers, and pairs
– Pairs are not assumed boxed
• References to boxes are either local or global
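Review of Global Dereferencing: Standard Approach Unsound
[Figure: x is a global boxed pointer to a cell of type local boxed int, which in turn points to a cell holding 5. The standard rule, which gives the dereference of x the type stored in the box, is unsound here: it would yield a local pointer that is only meaningful on the remote machine.]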
Type Expansion in Detail
pop(int) = int
pop(⟨τ1, τ2⟩) = ⟨pop(τ1), pop(τ2)⟩
pop(boxed global τ) = boxed global τ
expand(int) = int
expand(⟨τ1, τ2⟩) = ⟨pop(τ1), pop(τ2)⟩
expand(boxed ω τ) = boxed global τ
Representation Versus Sharing
• Locally pointed-to data might not be private
– Because of local / global aliasing
[Diagram: a cell containing 5 and an assignment to x]
Representation Versus Sharing
• Locally pointed-to data might not be private
– Because of transitivity + pointer widening
[Diagram: a cell containing 5 and assignments to y]
Representation Versus Sharing
• Globally pointed-to data might not be shared
– What if “y” never actually happens?
[Diagram: a cell containing 5 and assignments to y]
Representation Versus Sharing
• But globally used data must be shared
– If “y” can happen, local pointer cell is shared.
– What about cell containing “5”?
[Diagram: a cell containing 5 and an assignment to y]
Data Sharing as Types
• Shared data allows certain operations
– Access by way of global pointer
• Private data allows other operations
– Optimizations, GC, fast monitors, etc.
• Some form of polymorphism is essential
– Neither subsumes the other
– But we can have a common supertype
Augmented Type Grammar
τ ::= int | boxed ω δ τ | ⟨τ, τ⟩
δ ::= shared | mixed | private
ω ::= local | global
• Allow subtyping of pointers, pairs
– But not across pointers, since we allow assignment
• Allocation is explicitly shared or private
• Question: what can you do with mixed data?
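Late Enforcement: Limited Use of Global Pointers
expand(boxed ω δ τ) = boxed global δ τ
If x : boxed local δ τ, then the dereference of x has type τ.
If x : boxed global shared τ, then the dereference of x has type expand(τ).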
Late Enforcement: Applicability
• Data location management
• Cache coherence
• Race condition detection
• Program/algorithm documentation
• Consistency model relaxation
• Synchronization elimination
• Autonomous garbage collection (in practice)
• Security
Why Garbage Collection Breaks
1. Send out global pointer to my private data
2. Destroy all my local pointers to it
3. GC locally unreachable private data
4. …
5. Get that global pointer back again later
6. It points to my data, so coerce to local
7. Use this local pointer to my private data
Slightly Earlier Enforcement: No Escape of Private Addresses
expand(boxed ω shared τ′) = boxed global shared τ′
If x : boxed local δ τ, then the dereference of x has type τ.
If x : boxed global shared τ, then the dereference of x has type expand(τ).
• Note that τ′ might reference private data
• Autonomous garbage collection: OK
• Security: not OK
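Early Enforcement: Shared is Transitively Closed
If x : τ and allShared(τ), then a shared allocation of x has type boxed local shared τ.
If x : τ, then a private allocation of x has type boxed local private τ.
allShared(int) = true
allShared(⟨τ1, τ2⟩) = allShared(τ1) ∧ allShared(τ2)
allShared(boxed ω δ τ) = (δ = shared)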
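One way to read the allShared predicate and the two allocation rules is sketched below in Python. The type representation (`Int`, `Pair`, `Boxed`) and the helper names `all_shared`, `alloc_shared`, and `alloc_private` are assumptions made for illustration. The sketch also assumes that a boxed type counts as all-shared exactly when its own qualifier is shared, since shared boxes are themselves checked when they are allocated.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Int:
    pass


@dataclass(frozen=True)
class Pair:
    first: "Type"
    second: "Type"


@dataclass(frozen=True)
class Boxed:
    width: str    # "local" or "global"
    sharing: str  # "shared", "mixed", or "private"
    target: "Type"


Type = Union[Int, Pair, Boxed]


def all_shared(t: Type) -> bool:
    """True if every pointer stored directly in a value of type t is shared."""
    if isinstance(t, Int):
        return True
    if isinstance(t, Pair):
        return all_shared(t.first) and all_shared(t.second)
    if isinstance(t, Boxed):
        # Assumption: only the box's own qualifier is inspected here.
        return t.sharing == "shared"
    raise TypeError(f"not a type: {t!r}")


def alloc_shared(contents: Type) -> Boxed:
    """Type of a shared allocation; rejected unless the contents are all shared."""
    if not all_shared(contents):
        raise TypeError("shared box may not hold private or mixed pointers")
    return Boxed("local", "shared", contents)


def alloc_private(contents: Type) -> Boxed:
    """Type of a private allocation; any contents are allowed."""
    return Boxed("local", "private", contents)
```

Checking at allocation time is what keeps the shared universe closed: a shared cell can never be created holding a pointer into private data.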
Recap of Enforcement Strategies
• Late enforcement
– Anything can point to anything
– Restricted global dereference & assignment
[Diagram: cells containing 5 and 3, with an assignment to y]
Recap of Enforcement Strategies
• Slightly earlier enforcement
– Can only reveal shared addresses
– Still restrict global pointer operations
[Diagram: cells containing 5 and 3, with an assignment to y]
Recap of Enforcement Strategies
• Early enforcement
– Shared universe is transitively closed
– Global pointer restrictions trivially satisfied
[Diagram: cells containing 5 and 3, with an assignment to y]
Type Inference: Constraint Generation
• Type structure already known
– Including local / global
• Induce constraints on sharing qualifiers
– δ = shared from global deref / assign
– δ ≤ δ′ from assignments
– δ = δ′ from various other operations
• Stricter enforcement adds more constraints
– δ = shared ⇒ δ′ = shared
Type Inference: Constraint Resolution
• Given constraints:
– private ≤ δ1, δ ≤ δ1
– shared ≤ δ2, δ ≤ δ2
[Diagram: constraint graph over private, shared, δ, δ1, and δ2]
Type Inference: Constraint Resolution
• Two “minimal” solutions
– δ = shared, δ1 = mixed, δ2 = shared
[Diagram: constraint graph labeled δ = shared, δ1 = mixed, δ2 = shared]
Type Inference: Constraint Resolution
• Two “minimal” solutions
– δ = shared, δ1 = mixed, δ2 = shared
– δ = private, δ1 = private, δ2 = mixed
[Diagram: constraint graph labeled δ = private, δ1 = private, δ2 = mixed]
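The two minimal solutions can be checked by brute force. The sketch below assumes that the ordering places mixed above both private and shared, with private and shared unrelated, and that "minimal" means minimal under pointwise comparison. The variable names `d`, `d1`, and `d2` stand in for δ, δ1, and δ2; all function names are illustrative.

```python
from itertools import product

QUALIFIERS = ("private", "shared", "mixed")
VARIABLES = ("d", "d1", "d2")

# The four constraints from the slide, written as (lower, upper) pairs.
CONSTRAINTS = [("private", "d1"), ("d", "d1"), ("shared", "d2"), ("d", "d2")]


def leq(a: str, b: str) -> bool:
    """Assumed ordering: every qualifier is below itself and below mixed."""
    return a == b or b == "mixed"


def value(term: str, assignment: dict) -> str:
    """Look up a variable, or return a constant qualifier unchanged."""
    return assignment.get(term, term)


def satisfies(assignment: dict) -> bool:
    return all(
        leq(value(lower, assignment), value(upper, assignment))
        for lower, upper in CONSTRAINTS
    )


def solutions() -> list:
    """Enumerate all 27 assignments and keep the ones meeting every constraint."""
    found = []
    for values in product(QUALIFIERS, repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        if satisfies(assignment):
            found.append(assignment)
    return found


def pointwise_leq(a: dict, b: dict) -> bool:
    return all(leq(a[v], b[v]) for v in VARIABLES)


def minimal(candidates: list) -> list:
    """Keep solutions that have no other solution pointwise below them."""
    return [
        s for s in candidates
        if not any(t != s and pointwise_leq(t, s) for t in candidates)
    ]
```

Under these assumptions there are five solutions in total, and `minimal(solutions())` returns exactly the two listed on the slide. Neither is below the other, so the solver has to choose a bias.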
Type Inference: Biased Constraint Resolution
1. Push “shared” and “mixed” forward
[Diagram: constraint graph with shared ≤ δ2 marked; δ and δ1 undetermined]
Type Inference: Biased Constraint Resolution
1. Push “shared” and “mixed” forward
2. Identify qualifiers which cannot be private
[Diagram: constraint graph with shared ≤ δ2 marked; δ and δ1 undetermined]
Type Inference: Biased Constraint Resolution
1. Push “shared” and “mixed” forward
2. Identify qualifiers which cannot be private
3. Set all other qualifiers to private
[Diagram: constraint graph labeled δ = private, δ1 = private, shared ≤ δ2]
Type Inference: Biased Constraint Resolution
2. Identify qualifiers which cannot be private
3. Set all other qualifiers to private
4. Push “private” forward
[Diagram: constraint graph labeled δ = private, δ1 = private, shared ≤ δ2, private ≤ δ2]
Type Inference: Biased Constraint Resolution
3. Set all other qualifiers to private
4. Push “private” forward
5. Set remaining qualifiers to “shared” or “mixed”
[Diagram: constraint graph labeled δ = private, δ1 = private, δ2 = mixed]
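The five steps can be sketched as graph reachability. This is a simplified reading, not the Titanium implementation: it assumes every constraint has the form lower ≤ upper with a variable on the right and a variable or constant qualifier on the left, and the names `reachable` and `resolve_biased` are hypothetical.

```python
from collections import defaultdict, deque

CONSTANTS = {"private", "shared", "mixed"}


def reachable(starts, successors):
    """Nodes reachable from any start by following one or more <= edges."""
    seen = set()
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in successors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def resolve_biased(variables, constraints):
    """Solve lower <= upper constraints, preferring private wherever possible."""
    successors = defaultdict(list)
    for lower, upper in constraints:
        if upper in CONSTANTS:
            raise ValueError("this sketch only handles variables on the right")
        successors[lower].append(upper)

    # Steps 1-2: whatever shared or mixed flows into cannot be private.
    from_mixed = reachable(["mixed"], successors)
    not_private = reachable(["shared"], successors) | from_mixed

    # Step 3: every other qualifier becomes private.
    solution = {v: "private" for v in variables if v not in not_private}

    # Step 4: push private forward from the constant and from private variables.
    from_private = reachable(["private", *solution], successors)

    # Step 5: remaining qualifiers are mixed if private or mixed reaches them,
    # otherwise shared.
    for v in variables:
        if v in not_private:
            needs_mixed = v in from_private or v in from_mixed
            solution[v] = "mixed" if needs_mixed else "shared"
    return solution


# The running example: private <= d1, d <= d1, shared <= d2, d <= d2.
example = resolve_biased(
    ["d", "d1", "d2"],
    [("private", "d1"), ("d", "d1"), ("shared", "d2"), ("d", "d2")],
)
```

On the running example this selects δ = private, δ1 = private, δ2 = mixed, which is the private-biased one of the two minimal solutions. Constraints with a constant on the right, such as δ ≤ shared, would additionally need backward propagation that this sketch leaves out.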
Implementation For Titanium
• Java + extensions
– Objects, classes, interfaces, methods
– Multidimensional arrays, templates
– Local / global, communications primitives
• Sharing validation as type checking
• Sharing inference as compiler analysis
– Late or early enforcement
– Whole-program or partial
Experimental Findings: Static Metrics
• How much data is “private”?
– 16% - 75% of all static declaration sites
– 46% overall; 50% on largest benchmark
• Is “mixed” really needed?
– Up to 6% of static sites, but large impact
– Some utility code: could use parametric poly
Experimental Findings: Static Metrics
• Why have “local shared”?
– 24% - 53% of shared data is locally addressed
– Bad idea to force these to global
• Does enforcement policy affect results?
– No change for small benchmarks (<1000 lines)
– 1% - 4% shift for larger codes
Experimental Findings: Consistency Model Relaxation
• Impose sequentially consistent semantics
– Restrict both Titanium & C optimizers
– Relax restrictions for private data
• Performance impact varies widely
– Negligible sequential slowdown: nothing to do
– Sequential slowdown, offset by inference
– Sequential slowdown, better inference needed
Experimental Findings: Other Dynamic Metrics
• Data location management
– 1% - 100% of allocated bytes are private
  • 45% in gas benchmark
– amr: highly sensitive to enforcement policy
  • 74% late / 19% early
• Synchronization elimination
– Statically, one third eliminated
– Dynamically, not significant for these codes
Summary
• “Shared” might not mean what you think
– Related to local/global, but not the same
– Different degrees of privacy to choose from
  • Escape analysis, or several weaker alternatives
– Generalizes on earlier language designs
• Experimental implementation
– Ideas & algorithms scale to real system
– More aggressive clients needed
– Potential for stronger (phase-aware) inference