Distributed Systems, Network Protocols & Applications
Srinivasan Seshan
Computer Science Department
Carnegie Mellon University
2
Three Major Projects
Measurement/analysis of networks
Sensor networks
Distributed virtual reality
3
Measurement/Analysis of Networks
Selfish TCP behavior
Bottleneck discovery
Scaling properties of the Internet
Multihoming
People: Aditya Akella, Jeff Pang, Bruce Maggs, and Srinivasan Seshan; Anees Shaikh (IBM)
4
Measuring the Internet from Everywhere
What could you learn if you could…
  Have a machine in almost every ISP
  Collect routing information (E-BGP/I-BGP) from these ISPs
  Be part of a significant fraction of all Web transfers
  Be queried by almost every DNS server in the world
We have access to such a testbed: Akamai
5
Bottleneck Discovery
Where are bottlenecks in the Internet?
  Ignoring access links
What is the capacity of these bottlenecks?
Initial results
  There is a lot of available bandwidth in the Internet today: > 45 Mbps on 50% of paths!
  Quantified relative benefit of using larger Tier-1 ISPs over smaller ISPs
  Internal ISP links are bottlenecks more often than expected
  Peering between ISPs not as significant a bottleneck as expected
[Figure: stub networks attached to ISP1, ISP2, and more ISPs. Where are the bottlenecks?]
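A minimal sketch of what "bottleneck" means on this slide, assuming the bottleneck of a path is the link with the least available bandwidth. The link names and Mbps figures are made up for illustration and are not measurements from the project.

```python
# Hypothetical path: (link name, available bandwidth in Mbps, is_access_link).
# All values are invented for illustration only.
path = [
    ("stub -> ISP1 access link", 10.0, True),
    ("ISP1 internal link", 45.0, False),
    ("ISP1 <-> ISP2 peering link", 155.0, False),
    ("ISP2 internal link", 622.0, False),
]


def bottleneck(links, ignore_access=True):
    """Return the (name, bandwidth) of the link with the least available bandwidth.

    With ignore_access=True, access links are skipped, matching the slide's
    "ignoring access links" framing.
    """
    candidates = [l for l in links if not (ignore_access and l[2])]
    name, bandwidth, _ = min(candidates, key=lambda link: link[1])
    return name, bandwidth


print(bottleneck(path))                       # bottleneck inside the ISPs
print(bottleneck(path, ignore_access=False))  # bottleneck including access links
```

In this toy path the access link is the narrowest hop overall, which is why the study sets access links aside and looks for the narrowest link inside or between ISPs.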
6
Scaling Properties of the Internet
How will these bottlenecks change over time?
  Analyzing the combination of Internet topology and routing
  Identifying changes that are needed to make the Internet scale with hardware improvements from Moore’s law
Initial results
  Congestion scales poorly in Internet-like graphs
  Policy-routing does not worsen the congestion
  Alleviation possible via simple, straightforward mechanisms
[Figure: congested hot-spots. Uniformly scale all capacities? Scale some links faster? Moore’s-law like scaling sufficient?]
7
Multihoming
How can stub networks like CMU route traffic around bottlenecks?
Using multiple ISPs can…
  Improve performance and reliability of Internet connectivity
  Make Internet routing robust to failures and attacks
But need…
  Techniques for stub domains to choose providers
  Monitoring tools to track changes in Internet performance
  Dynamic control over chosen routes
[Figure: CMU reaches a destination across the Internet through ISP 1 and ISP 2. The effective use of multiple ISPs (multihoming) by stub networks]
8
Multihoming
In a given metro area…
  What maximum performance benefits can multihoming offer?
  How can multihomed networks realize these benefits in practice?
Initial results
  Multihoming helps, but not much beyond 4 providers
  Careful choice necessary
    Cannot just pick top individual performers
    Performance can be 50% worse for a poor choice of providers
Future work
  Reasons for observed performance benefit: can we relate route/ISP selection to bottleneck observations?
  Impact of ISP cost structure: what are the best choices for a given cost?
  How will Internet operation be affected by such “smart” routing?
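The point that one "cannot just pick top individual performers" can be illustrated with a toy model. This is a sketch under stated assumptions, not the study's methodology: a multihomed stub is assumed to send each flow through whichever of its providers is fastest for that destination, and the provider names and latencies are invented.

```python
from itertools import combinations

# Hypothetical latency (ms) from each provider to four destinations.
# All numbers are made up for illustration.
latency_ms = {
    "ISP-A": [10, 10, 50, 50],
    "ISP-B": [12, 12, 48, 52],
    "ISP-C": [52, 50, 12, 14],
    "ISP-D": [60, 60, 60, 60],
}


def mean_best_latency(providers):
    """Mean latency when each destination uses the fastest chosen provider."""
    per_destination = zip(*(latency_ms[p] for p in providers))
    best = [min(values) for values in per_destination]
    return sum(best) / len(best)


def best_subset(k):
    """Exhaustively pick the k providers with the lowest mean best latency."""
    return min(combinations(sorted(latency_ms), k), key=mean_best_latency)


# The two providers that look best when each is judged on its own.
top_individuals = tuple(
    sorted(latency_ms, key=lambda p: mean_best_latency([p]))[:2]
)

print(top_individuals, mean_best_latency(top_individuals))
print(best_subset(2), mean_best_latency(best_subset(2)))
```

In this toy data the two best individual providers are good at the same destinations, so combining them gains little, while a pair with complementary strengths does much better.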
9
Sensor Networks
IrisNet
People: Suman Nath, Yan Ke, and Srinivasan Seshan; Phil Gibbons, Babu Pillai, Rahul Sukthankar (Intel Research)
10
What if Sensors Were Everywhere?
Persistent queries/triggered actions
Characterization of human activity
Is the cafeteria busy?
Where’s Fred?
Person Locator System
Show an image when you hear a honk
Network monitoring
  Packet sniffers as sensors
11
Sensor Services
Need: infrastructure to simplify creation of sensor-enriched services
Remove deployment overhead
  Provide a common shared infrastructure of sensors
Automate common tasks
  Sensor reading collection and storage
  Efficient query processing over readings
  Address privacy concerns of users
12
Sensor Networks                 IrisNet
mote hardware                   PCs/PDAs
TinyOS, TinyDB, etc.            Linux, Java, XML, C++
campus-scale                    Internet-scale
minimal sensor processing       intensive sensor processing
energy is a key concern         powered nodes
scalar sensors                  multimedia sensors
narrowly focused services       wide variety of services
ad hoc wireless connectivity    direct Internet connectivity
IrisNet: Internet-scale Resource-Intensive Sensor Network Services
13
Example: Parking Space Finder
A distributed database maintains
  Spot availability data
  Address of parking spot
  Meter description
  Historical availability data
Query: Where is the cheapest empty parking spot near school?
  Returns driving directions to the best spot
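A rough sketch of the kind of query this slide describes, run against a small hierarchical XML document with Python's standard library. The element names, attributes, prices, and the single-document layout are assumptions made for illustration; they are not IrisNet's actual schema or query interface, and the real database is distributed across sites.

```python
import xml.etree.ElementTree as ET

# Hypothetical parking data; the schema and values are invented.
DOC = """
<city name="Pittsburgh">
  <neighborhood name="University">
    <block name="block-1">
      <spot id="1" available="yes" price="1.00"/>
      <spot id="2" available="no" price="0.25"/>
      <spot id="3" available="yes" price="0.50"/>
    </block>
  </neighborhood>
  <neighborhood name="Downtown">
    <block name="block-2">
      <spot id="4" available="yes" price="0.10"/>
    </block>
  </neighborhood>
</city>
"""

root = ET.fromstring(DOC.strip())


def cheapest_empty_spot(root, neighborhood):
    """Return the cheapest available <spot> in a neighborhood, or None."""
    spots = root.findall(
        f"./neighborhood[@name='{neighborhood}']/block/spot[@available='yes']"
    )
    if not spots:
        return None
    return min(spots, key=lambda spot: float(spot.get("price")))


spot = cheapest_empty_spot(root, "University")
print(spot.get("id"), spot.get("price"))
```

The hierarchy (city, neighborhood, block, spot) lets the query touch only the subtree near the school.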
14
IrisNet Architecture
[Figure: IrisNet architecture. Sensing Agents connect over the Internet to each service's Organizing Agents: Parking Space Finder (University, Downtown, Hill District) and Person Finder (Amy-John, Kim-Steve, Tom-Zoe).]
15
Design Decisions
Sensor feeds processed in application-specific way near source
  Reduces demand on network
  Requires relatively intensive processing on sensor device
Distributed, hierarchical XML database stores readings
  Accommodates frequent updates to different readings
  XML supports hierarchical and heterogeneous/evolving description of data
  Hierarchical organization enables scalability and rich query styles
Challenges in database processing, image processing & distributed systems
16
Distributed Virtual Reality
Distributed multiplayer games
People: Ashwin Bharambe, Jeff Pang, and Srinivasan Seshan
17
What do Multiplayer Games Look Like?
Large shared world
  Composed of map information, textures, etc.
  Populated by active entities: user avatars, computer AIs, etc.
Only parts of world relevant to particular user/player
[Figure: Player 1 and Player 2 in the game world]
18
Individual Player’s View
Interactive environment (e.g. door, rooms)
Live ammo
Monsters
Players
Game state
19
Current Game Architectures
Distributed broadcast-based (e.g., DOOM)
  Every update sent to all participants
  Advantages/disadvantages
    + No central server
    - Waste of bandwidth
    - Synchronized game state: difficult for players to join at arbitrary times
  Do not scale well
Centralized client-server (e.g., Quake)
  Every update sent to server, which maintains “true” state
  Advantages/disadvantages
    + Reduces overall bandwidth requirements
    + State management, cheat proofing much easier
    - Bottleneck for computation and bandwidth: current games limited to about 6000 players
    - Single point of failure
    - Response time limited by client-server latency
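A back-of-the-envelope sketch of why the broadcast design wastes bandwidth. It assumes a simplified model that is not from the slides: in each round every player produces one update, broadcast sends it directly to every other player, and client-server sends it to the server, which returns one combined state message per player.

```python
def broadcast_messages(players):
    """Messages per round if each player sends its update to every other player."""
    return players * (players - 1)


def client_server_messages(players):
    """Messages per round if each player sends one update to the server
    and receives one combined state message back (assumed model)."""
    return 2 * players


for n in (4, 16, 64):
    print(n, broadcast_messages(n), client_server_messages(n))
```

Under this model broadcast traffic grows quadratically with the number of players, while client-server traffic grows linearly but all of it passes through a single machine.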
20
Large-Scale Distributed Games
Need to distribute responsibility for maintaining world state and running computer AIs
Avoid any single point of failure
Efficient use of available bandwidth
  Every player only receives “relevant” updates: subscribes to updates
[Figure: a player in an arena of the virtual world. Interests: x ≥ 50, x ≤ 150, y ≥ 150, y ≤ 250 (the box with corners (150,150) and (50,250)). Events: x = 100, y = 200, the point (100,200).]
Solution: model game with Publish-Subscribe
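The interest box and event on this slide can be written as a small matching routine. This is a minimal sketch; the class name, field names, and the second player are hypothetical, and only the numeric bounds and the (100, 200) event come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interest:
    """A rectangular area of interest: x_min <= x <= x_max, y_min <= y <= y_max."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def matches(self, event):
        """True if the event's (x, y) position falls inside this interest."""
        return (self.x_min <= event["x"] <= self.x_max
                and self.y_min <= event["y"] <= self.y_max)


def relevant_players(interests, event):
    """Return the players whose subscription matches the published event."""
    return [name for name, interest in interests.items()
            if interest.matches(event)]


interests = {
    "player1": Interest(50, 150, 150, 250),  # bounds taken from the slide
    "player2": Interest(200, 300, 0, 100),   # hypothetical second player
}
event = {"x": 100, "y": 200}                 # event from the slide

print(relevant_players(interests, event))
```

Only player1 receives the update, which is the "relevant updates only" behavior that publish-subscribe is meant to provide.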
21
Publish-Subscribe Overview
Key feature: subscription language
  Rich database-like subscription languages (e.g., all publications with stock price > 100)
  Subject/channel-based subscriptions (e.g., all publications on the IBM stock channel)
State of the art
  Centralized designs with rich subscriptions
  Scalable distributed designs with channel-based subscriptions
  Unscalable designs with rich subscriptions
[Figure: publishers produce publications; subscribers register their interests via subscriptions]
23
Publish-Subscribe Critical Components
Subscription language
  Subjects vs. attribute/values
  Exact matches vs. regular expressions?
Routing mechanism
  Where are subscriptions stored in the system?
  How are publications routed so that they “meet” subscriptions?
  How are publications delivered from this rendezvous point to subscribers?
24
Related Systems
Scribe, Herald
  Scalable, but restricted subscription language
Siena, Gryphon
  Flexible subscription language, but poor scalability due to message flooding
Delicate balance between expressiveness of language and scalability of routing
25
MERCURY: Subscription Language
SQL-like but more limited: tradeoff to achieve scalability
  Example: int x ≤ 200
  Enough to support SQL-like range predicates
  Need sortable attribute-values
Sufficient for modeling games
  Game arenas
  Player statistics, etc.
How to support this subscription language scalably?
  Use techniques derived from distributed hash tables (DHT)
  Existing DHT-based designs only support exact-match lookup
    Need range-based lookups
    Eliminate the use of cryptographic hashes: must explicitly handle load-balancing
26
MERCURY: Routing Protocol
Each node responsible for range of attribute values
For each attribute, nodes arranged into circle
Each node compares the value in a message to its own range and routes the message along the circle
[Figure: attribute hub Hx, a circle of nodes responsible for the ranges [0, 80), [80, 160), [160, 240), [240, 320)]
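The routing rule on this slide can be sketched as a walk around one attribute hub. This is a deliberately simplified illustration, not the MERCURY implementation: it assumes each node knows only its clockwise successor, and the `Node`, `build_hub`, and `route` names are hypothetical. The ranges are the ones shown for hub Hx.

```python
class Node:
    """One node of an attribute hub, responsible for values in [low, high)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.successor = None  # next node clockwise on the circle

    def owns(self, value):
        return self.low <= value < self.high


def build_hub(boundaries):
    """Create nodes for consecutive ranges and link them into a circle."""
    nodes = [Node(lo, hi) for lo, hi in zip(boundaries, boundaries[1:])]
    for node, nxt in zip(nodes, nodes[1:] + nodes[:1]):
        node.successor = nxt
    return nodes


def route(start, value):
    """Forward a value around the circle until a node owns it.

    Returns (owner, hops). Raises ValueError if no node in the hub owns it.
    """
    node, hops = start, 0
    while not node.owns(value):
        node = node.successor
        hops += 1
        if node is start:
            raise ValueError("value is outside the hub's attribute range")
    return node, hops


hub_x = build_hub([0, 80, 160, 240, 320])
owner, hops = route(hub_x[3], 100)  # message enters at the [240, 320) node
print((owner.low, owner.high), hops)
```

Because nodes are ordered by attribute value rather than by a hash, a range of values maps to a contiguous run of nodes on the circle.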
27
Routing Illustrated
Send subscription to any one attribute hub
Send publications to all attribute hubs
[Figure: hub Hx with node ranges [0, 80), [80, 160), [160, 240), [240, 320) and hub Hy with node ranges [0, 105), [105, 210), [210, 320). Subscription: 50 ≤ x ≤ 150, 150 ≤ y ≤ 250. Publication: x = 100, y = 200. The node where the two meet is the rendezvous point.]
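The two rules on this slide (a subscription goes to one hub, a publication goes to every hub) can be sketched with the ranges from the figure. This is an illustrative model only: a dictionary stands in for the network, and the `subscribe` and `publish` helpers and the storage layout are assumptions rather than MERCURY's actual interfaces.

```python
# Node ranges [low, high) per attribute hub, as shown in the figure.
HUBS = {
    "x": [(0, 80), (80, 160), (160, 240), (240, 320)],
    "y": [(0, 105), (105, 210), (210, 320)],
}

# stored[(attribute, node_range)] -> subscriptions held by that node
stored = {}


def subscribe(sub, hub_attr):
    """Store the subscription in ONE hub, at every node overlapping its range."""
    lo, hi = sub[hub_attr]  # inclusive bounds on the chosen attribute
    for node in HUBS[hub_attr]:
        if node[0] <= hi and lo < node[1]:
            stored.setdefault((hub_attr, node), []).append(sub)


def publish(pub):
    """Send the publication to the owning node in EVERY hub.

    Returns a list of (rendezvous_node, subscription) pairs that match.
    """
    matches = []
    for attr, nodes in HUBS.items():
        for node in nodes:
            if attr in pub and node[0] <= pub[attr] < node[1]:
                for sub in stored.get((attr, node), []):
                    if all(a in pub and lo <= pub[a] <= hi
                           for a, (lo, hi) in sub.items()):
                        matches.append(((attr, node), sub))
    return matches


sub = {"x": (50, 150), "y": (150, 250)}       # subscription from the slide
subscribe(sub, "x")                           # sent to hub Hx only
matches = publish({"x": 100, "y": 200})       # publication from the slide
print(matches)
```

Sending the publication to all hubs guarantees it reaches whichever single hub the subscriber happened to choose.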
28
Why Not Use DHTs (and Cryptographic Hashing)?
Hashing is good for exact matches, e.g., DHTs
Want to support range queries
Possible approach
  Hash each value in the range
Problems
  Can only be used for discrete-valued attributes
  Too many subscriptions
[Figure: the range subscription int x ≥ 1, int x ≤ 10 expands into the exact-match subscriptions int x = 1, …, int x = 9, int x = 10]
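A short sketch of the "hash each value in the range" approach and its cost. SHA-1 and the 16-bit key space are stand-ins chosen for illustration; the slides do not name a hash function or key size, and `dht_key` and `expand_range` are hypothetical helpers.

```python
import hashlib


def dht_key(attribute, value, ring_bits=16):
    """Map an exact (attribute, value) pair to a position on a DHT key ring."""
    digest = hashlib.sha1(f"{attribute}={value}".encode()).digest()
    return int.from_bytes(digest, "big") % (1 << ring_bits)


def expand_range(attribute, low, high):
    """Turn one integer range subscription into one exact-match entry per value."""
    return [(attribute, v, dht_key(attribute, v)) for v in range(low, high + 1)]


subs = expand_range("x", 1, 10)
for attribute, value, key in subs:
    print(f"int {attribute} = {value} -> ring position {key}")

print(len(expand_range("x", 1, 1000)), "entries for the range 1..1000")
```

The number of stored entries grows with the width of the range, and the approach has no answer for continuous-valued attributes, since there is no finite list of values to hash. A cryptographic hash also gives neighboring values unrelated ring positions, so the entries for one range are generally spread across many nodes.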
29
Future Work
Performance
  Cached pointers reduce number of overlay hops
  Network-aware placement of nodes: delay competitive with centralized systems
Robustness: need to survive node failures
Workload: need system to self-tune to workload
Cheating: detecting various forms of cheating
  Routing, subscriptions, state ownership
30
Future Work
Distributed VR has challenges similar to those of many other distributed applications
Other applications we plan to explore:
  Collaborative applications (whiteboard, shared applications, chat servers, etc.)
  Distributed databases
  Distributed simulation (ns-2)
  …