27/1/2010
Lecture 3: State of the Art
D.Sc. Arto Karila
Helsinki Institute for Information Technology (HIIT)
T-110.6120 – Special Course on Data Communications Software: Publish/Subscribe Internetworking
www.psirp.org
Contents
1. Introduction
2. Guiding Principles
3. Future Internet Architecture
  1. Protocols
  2. Mechanisms
  3. Publish/Subscribe paradigm
4. Design Considerations
  1. Economics
  2. Security
  3. Trust
  4. Privacy
Introduction
The PSIRP project aims to solve some major issues of the current Internet by applying the information-centric publish/subscribe paradigm throughout the layers
In fact, many current applications are inherently pub/sub in nature:
  • Distribution of software and anti-virus updates
  • IPTV
  • BitTorrent
  • RSS feeds
  • and more!
A clean-slate pub/sub architecture could serve such applications very well
Introduction
To succeed, we must know the current state of the art, make use of it, and extend it in many areas of communication
In early 2008 a rather thorough state-of-the-art study was conducted and collected into a report (D2.1)
Development has not stopped there and the wiki used has lived on, but D2.1 presents a snapshot of the situation two years ago
Because of the breadth of the area, we had to focus on promising sub-areas
Guiding Principles
Our vision is based on these concepts:
  • Everything is information, which can be organized hierarchically to build complicated structures from simple elements
  • There are different forms of information reachability on all levels of the design and they can change in real-time
  • Control is given to the recipient of information, fixing the imbalance of powers inherent in TCP/IP
The state-of-the-art study focused on issues that appear to serve these ideas
Scope
The goals were mapped into areas of investigation that seemed to be relevant
Future Internet Architecture
  • Protocols: Naming, Addressing, Routing, Multicast
  • Mechanisms: Compensation, Caching, Security, Network Coding
Scope (cont’d)
Publish/Subscribe
Design considerations
  • Economics
  • Socio-economic aspects
  • Security must be designed into the architecture
  • Trust is an important aspect of networking
  • Privacy is of increasing importance
Methodology
The methodology of the SoA study was dictated by the envisioned scope
The SoA was simply the first step towards understanding the relevant prior work
“A system as complex as the Internet can only be designed effectively if it is based on a core set of design principles, or tenets, that identify points in the architecture where there must be common understanding and agreement” [Cla2003]
Methodology
The original Internet was created by people who shared the common goal of interconnecting their computing equipment
Computers were physically large, with extremely limited resources
  • You kept your data with you and not on the system
Communication was modeled to share resources point-to-point… NOT for many-to-many content sharing and retrieval
As the Internet has grown well out of its envisioned scope, several of its limitations have become apparent
From the socio-economic point of view, solving tussles (conflicts of interest) is one of the key problems facing the future Internet
This leads to design for change [Cla2003] and the requirement of evolvability [Rat05]
The importance of trust (E2E => T2T)
Naming
Currently naming usually happens at the service level: domain names, e-mail addresses, URIs etc.
The Domain Name System (DNS) defines a static, hierarchical namespace organized into a tree, where ICANN manages the top-level domains
The DNS namespace is decoupled from the (also hierarchical) IP address space
Quick Discussion
What is good about DNS?
What is bad about DNS?
Why is DNS insufficient to support host mobility?
Naming
DONA replaces domain names with self-certifying, two-part, hash-based names, naming data (not hosts or interfaces)
[Ram2004a] proposes a new design for name resolution
[Ram2004b] proposes prefix-matching DHT
In [Cal2007] channels are named with unique identifiers without hierarchy or centralized control
[Cro2003] introduces contexts – collections of homogeneous network elements
There are lots of different proposals
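To make the idea of a self-certifying, two-part, hash-based name more concrete, here is a minimal sketch. It assumes a name of the form `<hash of the publisher's public key>:<label>`; the function names, the choice of SHA-256 and the placeholder byte strings standing in for keys are illustrative assumptions, not taken from the DONA specification. A real system would also verify a signature over the data, which is left out here.

```python
import hashlib


def principal_id(public_key: bytes) -> str:
    """Hash of the principal's public key (first part of the name)."""
    return hashlib.sha256(public_key).hexdigest()


def make_name(public_key: bytes, label: str) -> str:
    """Build a two-part name of the form <hash-of-key>:<label>."""
    return f"{principal_id(public_key)}:{label}"


def key_matches_name(name: str, claimed_key: bytes) -> bool:
    """Check that the key presented with the data hashes to the name's first part."""
    key_hash, _, _label = name.partition(":")
    return key_hash == principal_id(claimed_key)


# Placeholder byte strings standing in for real public keys (assumption).
publisher_key = b"example-public-key-of-publisher"
attacker_key = b"some-other-key"

name = make_name(publisher_key, "lecture3.pdf")
print(key_matches_name(name, publisher_key))  # True
print(key_matches_name(name, attacker_key))   # False
```

The point of the sketch is that the name itself tells the receiver which key to trust, so no external naming authority has to be consulted to check who published the data.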
Addressing
Traditionally IP addresses are divided into classes A, B, and C
In 1993 Classless Inter-Domain Routing (CIDR) was introduced, with variable-length prefixes and aggregation of blocks
[And2007] proposes an address structure where the subnet prefix is replaced with a self-certifying Autonomous Domain identifier (AD) and the suffix with a self-certifying Host Identifier (EID), addresses now being of the form AD:EID
ROFL proposes routing on flat labels, in a totally topology-independent way (this does not scale)
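The variable-length prefixes and block aggregation that CIDR introduced can be illustrated with Python's standard `ipaddress` module. The address blocks below are arbitrary examples chosen for this sketch.

```python
import ipaddress

# Four contiguous /24 blocks (example addresses chosen for illustration).
blocks = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4)]

# Aggregation: adjacent blocks collapse into a single shorter prefix.
aggregated = list(ipaddress.collapse_addresses(blocks))
print(aggregated)  # [IPv4Network('192.168.0.0/22')]

# Variable-length prefixes: one address matches prefixes of different lengths.
addr = ipaddress.ip_address("192.168.2.17")
print(addr in ipaddress.ip_network("192.168.0.0/22"))  # True
print(addr in ipaddress.ip_network("192.168.2.0/24"))  # True
```

A router that only needs to reach all four blocks through the same next hop can therefore carry one /22 route instead of four /24 routes.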
Addressing
In [Cal2007] nodes are anonymous and addressed through their incoming channels
In [Cro2003] specific addresses are bound to different addresses in different contexts
[Han2004] proposes seven steps towards an Internet resistant against DoS attacks – the first two calling for separation of client and server addresses and removal of globally reachable client addresses
Inter-Domain Routing
Border Gateway Protocol (BGP) is suffering from serious scaling problems
  • Default-free zone
  • In June 2007, APNIC router in Tokyo had ~225,000 routes!
Any change in a globally visible prefix causes Internet-wide route updates
The number of globally visible prefixes is growing for a number of reasons, such as:
  • Provider-independent addressing
  • Multi-homing of sites
  • Protecting against prefix hijacking
Domain-Level Routing
To tackle BGP’s scaling issues [And2007] proposes to route at the domain level
Removal of path selection from packet-forwarding-level routing has been proposed
Explicit domain-level path construction fits with name-based routing (e.g. TRIAD)
[Lak2006] proposes providing the path selection function as a separate routing service
[Key2006] lets the sending host optimize path selection based on congestion information
NIRA [Yan2007] proposes a separate path discovery protocol for the up-graph, Name-to-Route Lookup Service (NRLS) for the downhill route, and allowing the endpoints to further negotiate end-to-end path selection
Domain-Level Routing
Some of these functionalities are needed by multi-path capable transport protocols, such as the Stream Control Transmission Protocol (SCTP) [Ste2000]
[Fea2004] proposes removing the routing function from routers to allow for better domain-level control of routing policies and allow a more direct domain-level mechanism for inter-domain routing
ROFL uses domain-level source routes as the means to route packets between endpoints – the first packet of a session uses hierarchical DHT routing, but after that the endpoints can use NIRA-like [Yan2007] end-to-end domain-level path control
Compact Routing
Routing table sizes and communication cost of BGP are increasing exponentially with the number of global prefixes [Kri2007]
Routing on AS numbers doesn’t offer a real solution to the growing complexity
Compact Routing aims to decrease the size of routing tables while allowing non-shortest paths to be used
Traditional shortest-path algorithms yield routing tables of size O(n log n) [Gav1996]
Compact Routing
A routing scheme is said to be compact if it produces:
  • Logarithmic address and header sizes
  • Sub-linear routing table sizes
  • Stretch bounded by a constant
A compact routing scheme can be:
  • Specialized or universal (works on all graphs)
  • Name-dependent or name-independent
Two compact routing schemes with small stretch (3) are the non-hierarchical Cowen [Cow1999] and the Thorup-Zwick (TZ) [Tho2001] schemes
[Kri2004] focuses on the TZ scheme with Internet-like graphs
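The stretch mentioned above is commonly defined as the worst-case ratio between the length of the path a routing scheme actually uses and the length of the shortest path. The symbols below are introduced only for this sketch: R is the routing scheme, d_R(u,v) the length of the route R uses from u to v, and d(u,v) the shortest-path distance.

```latex
\[
  \operatorname{stretch}(R) \;=\; \max_{u \neq v} \frac{d_R(u,v)}{d(u,v)}
\]
```

Under this definition a stretch of 3 means that no route is more than three times as long as the corresponding shortest path, while shortest-path routing has stretch 1.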
Overlay Routing
In overlay routing the topology is formed over an underlying (usually IP) network
DHTs are examples of overlay routing
DHT techniques can be utilized e.g. in implementing non-hierarchical rendezvous
An example of DHT-based solutions is the Content Addressable Network (CAN)
CAN is based on a d-dimensional Cartesian space, each node having a coordinate zone that it is responsible for
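The following toy sketch shows how a key could be mapped to a point in a d-dimensional coordinate space and then to the node whose zone contains that point. It only illustrates the key-to-zone mapping; the greedy neighbor-to-neighbor routing and the zone splitting of a real CAN are not modeled. The `Zone` class, the hashing scheme and the three fixed zones are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    node: str
    lo: tuple  # lower corner, inclusive
    hi: tuple  # upper corner, exclusive

    def contains(self, point) -> bool:
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))


def key_to_point(key: str, d: int = 2):
    """Hash a key to a point in the unit cube [0, 1)^d."""
    digest = hashlib.sha256(key.encode()).digest()
    # Use 4 bytes of the digest per coordinate.
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2**32 for i in range(d)
    )


def owner(zones, key: str) -> str:
    """Return the node whose zone contains the point the key hashes to."""
    point = key_to_point(key, d=len(zones[0].lo))
    return next(z.node for z in zones if z.contains(point))


# Three zones that together cover the unit square (d = 2).
zones = [
    Zone("A", (0.0, 0.0), (0.5, 0.5)),
    Zone("B", (0.5, 0.0), (1.0, 0.5)),
    Zone("C", (0.0, 0.5), (1.0, 1.0)),
]
print(owner(zones, "some-file.iso"))
```

Because the zones partition the space, every key has exactly one responsible node.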
Content-Based Pub/Sub Routing
Hosts subscribe to content by specifying filters on the events
The content of the message defines its ultimate destination
Subscribers use interest registration facility which sets up data delivery paths
Pub/sub has been proposed as a replacement for TCP/IP
This would change the economic model too
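A minimal sketch of content-based delivery may help here: subscribers register filters (predicates over event attributes), and the content of each published event decides who receives it. The single in-memory `Broker` class and the event attributes are hypothetical, and a real system would distribute this logic over many routing nodes.

```python
from collections import defaultdict


class Broker:
    """Toy single-broker content-based pub/sub (illustrative only)."""

    def __init__(self):
        self.subscriptions = []             # (subscriber, filter predicate)
        self.delivered = defaultdict(list)  # subscriber -> received events

    def subscribe(self, subscriber, predicate):
        """Register interest by giving a filter over event content."""
        self.subscriptions.append((subscriber, predicate))

    def publish(self, event):
        """Deliver the event to every subscriber whose filter matches it."""
        for subscriber, predicate in self.subscriptions:
            if predicate(event):
                self.delivered[subscriber].append(event)


broker = Broker()
broker.subscribe("alice", lambda e: e["type"] == "stock" and e["price"] > 100)
broker.subscribe("bob", lambda e: e["type"] == "weather")

broker.publish({"type": "stock", "symbol": "XYZ", "price": 120})
broker.publish({"type": "stock", "symbol": "XYZ", "price": 80})
broker.publish({"type": "weather", "city": "Helsinki", "temp_c": -12})

print(dict(broker.delivered))
```

Note that the publisher never names a destination; only the filters and the event content determine delivery.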
Content-Based Pub/Sub Routing
Filter-based event routing – pub/sub servers are organized into an acyclic tree
Multicast-based event routing – a multicast tree is built for every interest group
Kyra [Cao2004] combines the approaches using a two-level hierarchy
  • Within a clique (based on proximity) all nodes know each other
  • On a higher level minimum spanning trees to the cliques are built for various events
Content-Based Pub/Sub Routing
Siena is a classic example of distributed content-based routing implemented in the application layer, coexisting with TCP/IP [Car2001]
Overlay networks allow more complex functionality to be implemented on top of IP
Good overlay routing configuration follows the placement of network-level routers
Multicast
Multicast is vital for the efficient distribution of media (such as video)
IPv4 has class D addresses for multicast
DVMRP is an early multicast routing protocol
The topological map of OSPF allows MOSPF to operate with little overhead
Protocol Independent Multicast (PIM) works with any routing protocol in two modes: sparse (PIM-SM) and dense (PIM-DM)
In the local network, IGMP is used
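The class D range mentioned above corresponds to the prefix 224.0.0.0/4. As a small illustration, membership can be checked with Python's standard `ipaddress` module; the helper name `is_class_d` and the sample addresses are chosen for this sketch only.

```python
import ipaddress

# Class D (IPv4 multicast) covers 224.0.0.0 through 239.255.255.255.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")


def is_class_d(text: str) -> bool:
    """Return True if the IPv4 address lies in the class D multicast range."""
    return ipaddress.ip_address(text) in CLASS_D


for text in ["224.0.0.1", "239.255.255.250", "192.0.2.1"]:
    print(text, is_class_d(text))
```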
Multicast
Multicast is considered valuable but it is not supported in the Internet
The main reasons for this are its security and scalability issues
  • DVMRP and PIM-DM initially flood the network
  • Each multicast router requires a lot of state
  • The sender runs the risk of getting traffic back from a large group of recipients
[PAS1998] provides a summary of approaches emphasizing different goals
Recent Trends in Multicast
There are many proposals for more scalable or more easily deployable multicast
These can be roughly divided into three groups:
  • Router-based
  • Host-based
  • Overlay (DHT)-based
Compensation
Compensation facilitates efficient use of resources by providing the “owner” with some assurance that he will eventually benefit from the use of his resource
Different forms of compensation:
  • Authorization
  • Community membership
  • Resource exchange
  • Sacrifice or evidence of deliberate waste of the user’s own resources
  • Payment or promise of future reimbursement
Compensation
Types of transaction-related costs:
  • Immediate technical costs
  • Information search costs
  • Collateral costs associated with the use
Compare with Transaction Cost Economics:
  • Researching potential suppliers
  • Collecting information on prices
  • Negotiating contracts
  • Monitoring the supplier’s output
  • Legal costs incurred (contract breaches)
Compensation
Weber, Biggard and Delbridge: exchange = voluntary agreement involving the offer of any sort of present, continuing, or future utility in exchange for utilities of any sort offered in return
Four categories of exchange systems:
  • Price System
  • Associative System
  • Moral System
  • Communal System
Caching
[PIT2008] studies caching performance in nodes of a Delay Tolerant Network (DTN), providing ad-hoc communication services within (sparse) mobile user communities when end-to-end IP service is unavailable
The network acts as a distributed cache
Caching is needed to handle heavy traffic
The price of storage is dropping faster than the price of communication => caching is getting more tempting
Storage vs. Transit Price
[Figure: disk space price on a logarithmic scale (from $100/MB down to $0.1/GB) over the years 1985–2010, with curves for raw disk space and Tier-1 Internet transit (2009)]
Scope Security
In pub/sub architectures scopes control the spreading of information
[Fie2004] proposes an extension to a large pub/sub system Rebeca to support scopes
In [Far2002] access control is implemented with attribute certificates (ACs) used to identify nodes and their privileges
Packet Level Authentication (PLA)
Each packet (or PDU at any layer) can be signed and the public key included
The authenticity of the packet can now be determined by any node on its route
This prevents the attacker from consuming a lot of resources with falsified packets
This area will be covered in more detail in the PSIRP Security Architecture lecture
Transparency and Information Accountability
Social rules tend to cause compliance more often than abuse
This is due to the fact that the consequences of compliance usually are more pleasant than those of violation
If we can build this into the architecture, a large-scale system can be made reliable, robust, secure, trusted, and efficient
[Wei2007] introduces transparency and accountability as the attributes of information systems that could result in compliance and collaboration
Network Coding
Communication through an unreliable and unpredictable channel is difficult
Transmission errors can lead to long delays and a large number of retransmissions
Network coding includes Forward Error Correction (FEC) as well as more modern rateless codes – i.e. digital fountain codes
Reed-Solomon Codes
Among the most significant traditional codes is the Reed-Solomon code: an (N,K) code, with q^m symbols in its alphabet, can be decoded after receiving K out of the N symbols sent
The message consists of K original symbols and N-K parity symbols
Fountain Codes
Fountain techniques send randomly all the parts of a message with added redundancy
They are rateless since there is no limit on the number of encoded packets generated from the source message and it can change on the fly
The source can send as many encoded packets as necessary for the destination to decode the data
Among fountain codes are: Random Linear Fountain Code, Tornado Codes and LT Fountain Code
XOR Coding
Intelligent mixing of packets can be used to increase network throughput
An example is the situation where two users of a Wi-Fi base station (router) exchange two messages
Without network coding we need four transmissions
With simple XOR coding we can do with only three transmissions
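The three-transmission exchange can be sketched as follows. The two messages are made-up examples of equal length; a real system would need padding and framing, which this sketch leaves out.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


msg_from_a = b"hello B!"
msg_from_b = b"hi, A..."  # same length as msg_from_a (assumption: no padding)

# Transmissions 1 and 2: A and B each send their message to the base station.
# Transmission 3: the base station broadcasts the XOR of the two messages.
broadcast = xor_bytes(msg_from_a, msg_from_b)

# Each user XORs the broadcast with its own message to recover the other's.
received_by_a = xor_bytes(broadcast, msg_from_a)
received_by_b = xor_bytes(broadcast, msg_from_b)

print(received_by_a == msg_from_b)  # True
print(received_by_b == msg_from_a)  # True
```

Without coding, the base station would have to forward each message separately, giving the four transmissions mentioned above.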
Linear Network Coding
Linear network coding is rather like XOR coding, except that the XOR operation is replaced with a linear combination of data
The recipient can decode the information having received m out of the n messages
Linear coding appears to work well with multicast, which makes it interesting for PSIRP
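One common way to write this down, assuming the m source packets x_1, ..., x_m are treated as symbols over a finite field and the coefficients g_ij are chosen from the same field (notation introduced here for illustration):

```latex
\[
  y_i \;=\; \sum_{j=1}^{m} g_{ij}\, x_j , \qquad i = 1, \dots, n
\]
\[
  \mathbf{x} \;=\; G_S^{-1}\, \mathbf{y}_S
  \quad \text{for any set } S \text{ of } m \text{ received messages whose coefficient rows are linearly independent}
\]
```

Here G_S is the m-by-m matrix of the coefficients of the received messages. XOR coding is the special case in which all arithmetic is done over GF(2).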
Publish/Subscribe Paradigm
The starting point of our work is that event-based computing and the pub/sub paradigm are crucial for future services
RSS feeds can be seen as pub/sub
SIP is an example of event-based computing
Formal modeling of pub/sub systems and correctness of content-based routing protocols are examined in [Müh2002b]
A routing protocol is correct if it satisfies the safety and liveness requirements
Economics
Some key economic issues are:
  • Which aspects of network usage are charged for?
  • Related to above, which are the entities involved?
  • How is charging accomplished?
  • What happens at domain boundaries?
  • What are the objectives of charging?
  • Which economic "fundamentals" limit the architectural choices?
Socio-Economics
The socio-economic aspects include:
  • Value-chain dynamics
  • Bullwhip Effect
  • Overlay Economics
  • Design for Tussle
  • Reductionism vs. Evolution
Security
Designing and building security into the architecture is central to PSIRP
The SoA study was concerned with:
  • Network attacks
  • Threat analysis
  • Solution methodologies
  • Formal methods for modeling security protocols
  • Requirements
  • Operating tactics
DDoS Attacks
Distributed Denial of Service (DDoS) attacks are difficult to protect against
Among them the most difficult are bandwidth-consuming attacks
[And2003] and [Par2007] use a data channel and a small control channel, over which anybody can send packets to a destination asking for permission to send data => proactive filtering
The capability is added to every packet sent and the data channel only needs to handle packets with valid capabilities
Obviously, the control channel now becomes a target
Various computational, memory, and bandwidth puzzles have been proposed to increase real customers’ chances
DDoS Attacks
Filtering really should be done already in the network (compare with PLA)
[Bal2005] proposes proactive filtering based on Bloom filters and source routes
Diffusion, replication and hiding focus on making it harder to concentrate the attack
Pub/sub systems make routing decisions on flexible messages – routing-scope flooding with complex messages consumes lots of resources
Pub/sub routing nodes maintain a lot of state information – false publications can cause DoS
[Wun2007] states that DoS attacks on pub/sub systems might have unpredictable effects
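Since Bloom filters are mentioned above as a building block for proactive filtering, here is a generic Bloom filter sketch. It is not the scheme of [Bal2005]; the filter size, the number of hash functions, the SHA-256-based hashing and the flow labels are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: compact set membership with possible false positives."""

    def __init__(self, size_bits: int = 256, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a single Python integer

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical labels of flows a node has been asked to let through.
allowed = BloomFilter()
for flow in [b"10.0.0.1->10.0.9.9", b"10.0.0.2->10.0.9.9"]:
    allowed.add(flow)

print(b"10.0.0.1->10.0.9.9" in allowed)  # True: added items are always found
```

A lookup for an item that was never added usually returns False but can occasionally return True, which is the price paid for the fixed, small amount of state per filter.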
Threat Analysis and Research
To survey existing attacks, we divide them into three domains of functionality:
  • End-user domain where publishers and subscribers may not trust each other, the pub/sub service or the underlying infrastructure
  • Pub/sub service provision domain where the provider may not trust publishers and subscribers or vice-versa
  • Infrastructure domain whose components (cache elements, label switching routers, forwarding nodes, multicast points, network coders) may not trust each other
Threat Analysis and Research
In the pub/sub service provision domain providers and end-users should have a symbiotic relation
  • Replay attack – intercepting and copying packets containing credentials and using them to masquerade
  • Sybil attack – the attacker presents itself with multiple identities, undermining the redundancy of a distributed system
  • Integrity of service means avoiding service misuse and isolating incidents (e.g. a rogue service broker generating spam)
Threat Analysis and Research
Infrastructure integrity means that the elements performing networking functions are uncorrupted and trustworthy
Possible threats include:
  • Cache Poisoning (bogus caches)
  • Routing Service Attacks (discovery & maintenance)
  • Forwarding Phase Attacks (fast data path)
  • Eclipse Attack (malicious nodes colluding)
  • Amplification (e.g. dormant subscriptions)
  • Resource Consumption Attacks (aka sleep deprivation)
  • Message State Effect (stateful routing nodes)
Service-layer confidentiality – Man-in-the-Middle
Existing Solutions
Existing security solutions include:
Access Control
  • [Bel2003] proposes role-based access control
EventGuard
  • Provides security for content-based pub/sub: authentication, confidentiality and integrity of publications using six guards (subscribe, advertise, publish, unsubscribe, unadvertise, routing)
QUIP
  • Protocol for securing content distribution in pub/sub networks [Cor2007]
Formal Modeling and Analysis of Security Protocols
The analysis of pub/sub crypto protocols is much like that of traditional send/receive
Pub/sub versions of existing protocols rely on explicit channels or pre-agreed names instead of expecting the network to deliver
Unlike in traditional crypto protocols, the sender need not know the identity of the recipient
It may, for example, be enough to know that there is just one peer
Formal Modeling and Analysis of Security Protocols
The Dolev-Yao intruder model, where the intruder can hear, intercept and synthesize any message, largely still pertains but it will need to be extended and enriched
Focus is moving from authenticating principals to various security properties related to the data itself
Group communication changes the nature of many problems
New pub/sub protocols need to be designed – resource control, including issues of fairness, compensation, and authentication
Security Goals
A preliminary set of threats and security goals:
  • Secrecy of security-related entity identities and identity protection.
  • Secrecy of keys and other related information, typically needed for confidentiality and data integrity of the transmitted information.
  • DoS, including unsolicited bulk traffic (spam).
  • Threats to fairness, including mechanisms such as compensation and authorization.
  • Authenticity and accountability of the information, including its integrity and trustworthiness, reputation of the origin, and evidence of past behavior, if available.
  • Privacy and integrity of subscriptions to information.
  • Privacy and integrity of the forwarding state (as a result of subscriptions).
Formal Methods in Security
Burrows, Abadi, Needham – BAN logic – assumes only passive eavesdropping
The Casper/FDR combination provides a dynamic perspective
Casper translates a security protocol description into CSP that can be fed into the FDR model checker
Trust
Trust deals with the intentions and knowledge of parties
A lot of work has been done on analyzing this in the protocol context
In the real world, trust is about our ability to rely on the benevolence and good intentions of people and organizations
Checks and balances have been developed to institutionalize trust
Privacy
Privacy issues are central to any new technology (e.g. RFID)
Privacy can be divided into (overlapping) domains:
  • Physical privacy
  • Information privacy
  • Contextual privacy
Anonymity
Anonymity should be the norm – not the exception
Matt Blaze has done a lot of good work in this area but generally it is neglected
There are several anonymity architectures for preserving different kinds of anonymity