


PIER & PHI
Overview of Challenges & Opportunities

Ryan Huebsch†, Joe Hellerstein†°, Boon Thau Loo†, Sam Mardanbeigi†, Scott Shenker†‡, Ion Stoica†

[email protected]
† UC Berkeley, CS Division
‡ International Computer Science Institute, Berkeley CA
° Intel Research Berkeley

STREAM DAY 5/7/04


PIER: P2P Information Exchange & Retrieval

- A wide-area distributed dataflow engine
  - Outfitted with relational operators
  - Designed to scale to thousands or millions of nodes
- Motivation:
  - It’s an interesting challenge
  - Lowers the barrier of entry for large-scale applications
    - No massive infrastructure for server farms
    - Cost is distributed among participants
  - Provide a viable solution where other options are not socially acceptable
- We are NOT trying to be better than other (centralized) solutions; we are trying to be different.


Challenges

[Figure: system layers annotated with challenges]

Layers: Physical Network, Overlay Network, Query Plan, Declarative Queries

Challenge labels in the figure:
- Query Optimization, Multi-Query Optimization
- Catalogs, Persistent Storage
- Recursion
- Query Dissemination, Replication, Soft-State
- Quality of Service
- Resilience, Route Flapping
- Efficiency
- Security, Privacy, Quality of Service
- General Challenges


Applications & Requirements

- File sharing
  - Flooding works for popular items
  - Need something better for rare items
  - May want ‘triggers’ when a new item matches an old search
- Network Monitoring
  - Aggregation & grouping very common
  - Continuous queries with well defined semantics
- PHI is one use of PIER…
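
The file-sharing 'trigger' idea (notify when a new item matches an old search) can be sketched as a small inverted index over stored searches. This is only an illustration, not how PIER implements it: the `SearchTriggers` class, its method names, and the whitespace keyword matching are all assumptions made for the example.

```python
from collections import defaultdict


class SearchTriggers:
    """Remembers past keyword searches and reports which ones a new item matches.

    Hypothetical helper for illustration; not part of PIER.
    """

    def __init__(self) -> None:
        # keyword -> ids of stored searches that contain that keyword
        self._by_keyword: dict[str, set[int]] = defaultdict(set)
        # search id -> full set of keywords in that search
        self._terms: dict[int, frozenset[str]] = {}

    def register(self, search_id: int, query: str) -> None:
        """Store an old search so later items can be matched against it."""
        terms = frozenset(query.lower().split())
        self._terms[search_id] = terms
        for term in terms:
            self._by_keyword[term].add(search_id)

    def on_new_item(self, name: str) -> list[int]:
        """Return ids of stored searches whose keywords all appear in the item name."""
        words = set(name.lower().split())
        candidates: set[int] = set()
        for word in words:
            candidates |= self._by_keyword.get(word, set())
        return sorted(s for s in candidates if self._terms[s] <= words)
```

Indexing the searches rather than the items means each arriving item only touches the searches that share at least one keyword with it, which matters when rare items are the interesting case.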


PHI: Public Health for the Internet

- Community-based monitoring
- The metaphor:
  - Old way – Treat computers with medicine
    - Virus protection
  - New way – Monitor the community
    - Like the Center for Disease Control
- Global CDC has social implications
  - Central repository, privacy, who controls it, who pays for it…
- PHI wants to create the Center for Disease Control without the Center (of control)
- Motivation is to inform users about the dangers of the Internet


PHI Example

- PIER is currently deployed on 150-300 PlanetLab nodes.
  - ~100 sites
  - Some nodes on DSL, 1 Mbps, 10 Mbps, etc.
  - Very unreliable
- SNORT is the primary data source
  - ~2400 rules
  - 10’s - 1000’s of tuples per day per node
  - Schema: time, rule, source socket, destination socket
- Quick Demo:
  - Shows the top ten sources of events across all of PlanetLab (live), i.e. who are the bad guys?
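
The demo's "top ten sources" query can be pictured as a per-node partial count followed by a merge. The sketch below is not PIER's query plan or API: the `Event` type, the function names, the `"ip:port"` encoding of a socket, and the choice to group by source IP rather than by full socket are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    """One SNORT alert, following the schema on the slide."""
    time: float
    rule: int
    source: str       # source socket, written here as "ip:port" (assumed encoding)
    destination: str  # destination socket, same assumed encoding


def local_counts(events: Iterable[Event]) -> Counter:
    """Partial aggregate computed on one node: number of events per source IP."""
    return Counter(e.source.rsplit(":", 1)[0] for e in events)


def top_sources(partials: Iterable[Counter], k: int = 10) -> list[tuple[str, int]]:
    """Merge per-node partial counts and return the k most frequent sources."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts rather than replacing them
    return total.most_common(k)
```

Counting locally before merging keeps the data shipped between nodes proportional to the number of distinct sources rather than the number of raw tuples.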


What’s next…

- PIER
  - Lots of problems, including the meta-problem of what problem to work on
  - No streaming semantics, no language to describe windows, etc…
  - Additional challenges: Interaction with soft-state, no synchronized clocks, unknown (changing) network latencies
- PHI
  - Create a complete application
    - Gets intrusion data from a variety of sources (including the built-in Windows Firewall)
    - Develop a snazzy visualization
  - Release to the world, first using PlanetLab as the query processor, eventually the world
  - Scale to at least 10,000’s of nodes and explore the design space
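
To see why unsynchronized clocks complicate window semantics, consider a toy example. It assumes fixed-width (tumbling) windows keyed by each node's local timestamp, which is one possible design and not something the slides specify; the function names and the `clock_offset` parameter are hypothetical.

```python
from typing import Iterable


def window_id(timestamp: float, width: float) -> int:
    """Index of the fixed-width window that contains the given timestamp."""
    return int(timestamp // width)


def count_per_window(
    timestamps: Iterable[float], width: float, clock_offset: float = 0.0
) -> dict[int, int]:
    """Count events per window as seen by a node whose clock is off by clock_offset seconds."""
    counts: dict[int, int] = {}
    for t in timestamps:
        w = window_id(t + clock_offset, width)
        counts[w] = counts.get(w, 0) + 1
    return counts


# The same three events, bucketed into 60-second windows by two nodes.
true_times = [58.0, 59.5, 61.0]
accurate_node = count_per_window(true_times, width=60.0)
fast_node = count_per_window(true_times, width=60.0, clock_offset=1.0)
```

With an accurate clock, two events fall in window 0 and one in window 1; with a clock running one second fast, the split becomes one and two. A windowed aggregate over several nodes therefore depends on clock skew unless the window definition accounts for it.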