Understanding KaZaA Jian Liang Rakesh Kumar Keith Ross Polytechnic University Brooklyn, N.Y.


Post on 07-Jan-2016


Page 1: Understanding KaZaA

Understanding KaZaA

Jian Liang, Rakesh Kumar, Keith Ross

Polytechnic University, Brooklyn, N.Y.

Page 2: Understanding KaZaA

KaZaA/FastTrack Operation

• Top file sharing system
  – 3 million active nodes
  – Four clients: KaZaA, KaZaA-lite, Grokster and iMesh
• Good availability and scalability
• Proprietary protocol; signaling traffic encrypted
  – In contrast with Gnutella and eMule

Page 3: Understanding KaZaA

Purpose of Measurement Study

• Try to understand a highly successful file-sharing system
  – Overlay topology and dynamics
  – Peer selection
  – Index management
• Use KaZaA as a test-bed for further research
  – Content pollution research (another paper)

Page 4: Understanding KaZaA

Existing Tools and Projects

• FastTrack encryption algorithm
  – Available from a Web site: http://gift-fasttrack.berlios.de/
• KaZaA Media Desktop (KMD) software architecture
  – http://kazaasearch.narod.ru/

Page 5: Understanding KaZaA

Big Picture of Overlay

• Two-layer hierarchy
  – Ordinary Node (ON)
  – Super Node (SN)
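The two-layer hierarchy can be illustrated with a minimal Python sketch. It assumes each ON uploads file metadata to one parent SN and that SNs forward queries to neighboring SNs instead of exchanging indexes, as the summary slides state. The class, its methods, and the one-hop forwarding rule are hypothetical; the actual FastTrack protocol is proprietary.

```python
from dataclasses import dataclass, field


@dataclass
class SuperNode:
    """Hypothetical SN: indexes files of its child ONs and knows neighbor SNs."""
    name: str
    index: dict = field(default_factory=dict)      # keyword -> set of ON names
    neighbors: list = field(default_factory=list)  # other SuperNode objects

    def register(self, on_name, keywords):
        # An ON uploads metadata for its shared files to its parent SN.
        for kw in keywords:
            self.index.setdefault(kw, set()).add(on_name)

    def query(self, keyword, forward=True):
        # Answer from the local index, then forward one hop to neighbor SNs
        # (one-hop forwarding is an assumption made for this sketch).
        hits = set(self.index.get(keyword, set()))
        if forward:
            for sn in self.neighbors:
                hits |= sn.query(keyword, forward=False)
        return hits


sn_a, sn_b = SuperNode("sn_a"), SuperNode("sn_b")
sn_a.neighbors.append(sn_b)
sn_a.register("on_1", ["song"])
sn_b.register("on_2", ["song", "video"])
print(sorted(sn_a.query("song")))
```

Each SN keeps only its own children's metadata, so a query reaches remote content only through forwarding.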

Page 6: Understanding KaZaA

Measurement Apparatus

• KaZaA Sniffing Platform
• KaZaA Probing Tool

Page 7: Understanding KaZaA

KaZaA Sniffing Platform

• Poly (Ethernet)
• Home (cable modem)

Page 8: Understanding KaZaA

KaZaA Probing Tool

• Campus & home based probing
  – Node list
  – Workload

[Figure: probes at Poly and at Home connected to the KaZaA network of SNs and ONs. Labeled super nodes: SN 128., SN 24., SN 213.]

Page 9: Understanding KaZaA

Signaling Protocol

[Figure: two panels, "ON-SN session initial" and "SN-SN session initial". Recoverable message labels: SN–SN node list fragment 1 [Enc], fragment 2 [Enc], ..., fragment n [Enc].]

Page 10: Understanding KaZaA

TCP Connections Evolution

[Figure: two plots of the number of TCP connections over time, each with the series on-sn and sn-sn. Panel captions: "Poly campus 4 – 6 hour measurement" and "Cable modem 7 – 11 hour measurement".]

Page 11: Understanding KaZaA

SN Workload

[Figure: four plots. Two show TCP connection counts over time with the series on-sn, sn-sn and total; two show workload values over time. Captions: "7 – 11 hours TCP connections evolution" and "7 – 11 hours workload values evolution".]

Page 12: Understanding KaZaA

Signaling Sessions Lifetime

Page 13: Understanding KaZaA

Peer Selection: Node List IP Prefix Match

Page 14: Understanding KaZaA

Peer Selection: Workload & RTT

Page 15: Understanding KaZaA

Index Management: Sharing Content

Page 16: Understanding KaZaA

Port Dynamic and NAT

• 19,637 unique SN addresses collected
• Only 707 SNs (3.6%) use the default port number, 1214.
• 18,887 SNs (96.3%) use non-default port numbers.
• Of 64,834 unique peers (SN + ON), 21,269 peers (ON) use private IP addresses.
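A measurement like the one above could be tallied as in the following Python sketch. This is not the authors' tool: the `(ip, port)` record format, the function names and the sample addresses are hypothetical. Only the default port 1214 and the counts passed to `share` come from the slide.

```python
import ipaddress

DEFAULT_PORT = 1214  # default KaZaA port, per the slide


def classify(peers):
    """Count default-port and private-address peers in (ip, port) pairs."""
    default_port = sum(1 for _, port in peers if port == DEFAULT_PORT)
    # is_private covers the RFC 1918 ranges and also other reserved ranges
    # such as loopback and link-local, so it slightly over-approximates NAT.
    private = sum(1 for ip, _ in peers if ipaddress.ip_address(ip).is_private)
    return default_port, private


def share(part, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * part / total, 1)


# Hypothetical sample records, not measured data.
sample = [("10.0.0.5", 1214), ("192.168.1.7", 3301), ("8.8.8.8", 2745)]
print(classify(sample))
print(share(707, 19637))     # SNs on the default port
print(share(21269, 64834))   # peers with a private address
```

With the slide's counts, the private-address peers come to roughly a third of all unique peers, which suggests that NAT is common among ONs.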

Page 17: Understanding KaZaA

Summary of Results

• 20,000 ~ 40,000 active super nodes
• Each SN connects to approx. 0.1% of other SNs
• Highly dynamic connections: over 35% of SN-SN connections last less than 30 sec.
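As a quick worked check, the first two figures imply a few tens of SN-SN connections per super node. The Python sketch below only does this arithmetic; the function name is hypothetical and the 0.1% figure is the slide's approximation.

```python
SN_CONNECT_FRACTION = 0.001  # "approx. 0.1% of other SNs", per the slide


def expected_sn_neighbors(active_sns):
    """Approximate number of SN-SN connections held by one super node."""
    return round((active_sns - 1) * SN_CONNECT_FRACTION)


for n in (20_000, 40_000):
    print(n, expected_sn_neighbors(n))
```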

Page 18: Understanding KaZaA

Summary of results

• Peer selection uses IP prefix match, workload, RTT and freshness
• No index exchange between SNs but query forwarding
• Skewed content distribution: 20% of peers provide 70% of the metadata for sharing.
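The slides name four peer selection criteria but do not say how they are combined. The Python sketch below shows one possible combination, a lexicographic ranking of candidate SNs. The `Candidate` fields, the ordering of the criteria and the sample values are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Candidate:
    ip: str
    workload: int    # reported workload value (lower assumed better)
    rtt_ms: float    # measured round-trip time
    age_s: float     # seconds since this node list entry was refreshed


def common_prefix_bits(a, b):
    """Number of leading bits that two IPv4 addresses share."""
    x = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - x.bit_length()


def rank(my_ip, candidates):
    # Assumed ordering: longest IP prefix match first, then lower workload,
    # lower RTT, and fresher entries. The real weighting is not documented.
    return sorted(
        candidates,
        key=lambda c: (-common_prefix_bits(my_ip, c.ip),
                       c.workload, c.rtt_ms, c.age_s),
    )


# Hypothetical candidates, not measured data.
cands = [
    Candidate("24.1.1.1", 10, 20.0, 5.0),
    Candidate("128.238.200.9", 50, 80.0, 60.0),
    Candidate("128.238.77.3", 30, 40.0, 10.0),
]
for c in rank("128.238.1.1", cands):
    print(c.ip, common_prefix_bits("128.238.1.1", c.ip))
```

Putting prefix match first makes the ranking favor nearby address blocks even when a distant SN reports a lower workload. A weighted score would trade the criteria off differently.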

Page 19: Understanding KaZaA

Design Principles for Unstructured P2P Overlays

• Distributed design
  – No infrastructure
  – Avoiding legal attacks
• Exploit heterogeneity
  – Hierarchy
  – Self organization
• Load balancing: workload balancing
• Explicit locality awareness
• Shuffle connections in core overlay

Page 20: Understanding KaZaA

Design Principles for Unstructured P2P Overlays

• Properly designed gossip mechanisms
  – Peers have a fresh list of SNs
• Firewall circumvention
  – Dynamic port numbers
  – Improves availability
• NAT circumvention
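One way a gossip mechanism can keep a peer's SN list fresh is sketched below in Python. It assumes node lists are maps from SN address to a last-seen timestamp. The function name, that representation and the `max_size` cap are hypothetical and are not taken from the KaZaA protocol.

```python
def merge_node_lists(local, received, max_size=200):
    """Merge two {sn_address: last_seen} maps, keeping the freshest entries."""
    merged = dict(local)
    for addr, seen in received.items():
        # Keep the most recent sighting of each super node.
        if seen > merged.get(addr, float("-inf")):
            merged[addr] = seen
    # Evict the stalest entries once the list exceeds its size cap.
    freshest = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
    return dict(freshest[:max_size])


# Hypothetical entries: address label -> last-seen time in seconds.
local = {"sn_a": 100, "sn_b": 50}
received = {"sn_b": 120, "sn_c": 10}
print(merge_node_lists(local, received, max_size=2))
```

Because stale entries are evicted first, a peer that rejoins after its parent SN has left still holds recently seen SNs that it can try.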