celliq: real-time cellular network analytics at scale · slide courtesy: joey gonzales can compute...

Post on 15-Jul-2020

0 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

CellIQ: Real-Time Cellular Network Analytics at Scale

Anand Iyer#, Li Erran Li+, Ion Stoica# #UC Berkeley +Bell Labs

Cellular Networks have been seeing exponential growth and become part of our lives

Image courtesy: Alcatel-Lucent

What is needed to solve these issues?

Are some regions in the network hotspots? - Better load balancing

How is user traffic moving in the network? - Better resource provisioning

What are the popular handoff sequences? - Troubleshoot handoff related problems

Cellular Network Analytics Today

Cellular Network Analytics Today

Cellular Network Analytics Today

Problem

Existing cellular network analytic systems do not

support advanced analytic tasks in an efficient manner.

High Velocity Data Continuous Monitoring

Advanced Tasks

Timely Spatio-Temporal Analysis

Challenges

CellIQ is a cellular network analytics system that supports rich analysis

tasks efficiently by leveraging domain-specific optimizations

Cellular Data as Time-Evolving Graphs

Tasks easily expressed in graphs: Hotspot computation è Connected components

Handoff sequences & User traffic è Pregel model

Edge PropertyVertex Property

BS1

UE2

UE1 BS2

UE3

UE4

UE5

Why Not Use a Graph Parallel Framework?

��

��

���

���

���

���

���

���

���

�������� ���������� ������ �����

�����������������������

������������

Fails to produce results!

Domain specific optimizations key for efficient analysis

CellIQ Implementation

*Gonzales. et.al. “GraphX: Graph Processing in a Distributed Dataflow Framework”, OSDI 2014

Implemented as a layer on GraphX* Incorporates several domain specific optimizations

GraphX

Spark

Pregel API

PageRank Connected Comp. K-core Triangle Count

LDA SVD++

CellIQ

Computational Model

BS1UE2

UE1

BS2

UE3

UE4

UE5

Computational Model

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1UE2

UE1

BS2

UE3

UE4

UE5

Computational Model

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1

UE2

UE1 BS2

UE3

UE4

UE5

Computational Model: GStreams

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1

UE2

UE1 BS2

UE3

UE4

UE5

Domain specific graph partitioning Spatial operations

Window operations

Computational Model: GStreams

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1

UE2

UE1 BS2

UE3

UE4

UE5

Domain specific graph partitioning Spatial operations

Window operations

Graph computation frameworks rely on partitioning to minimize communication & balance computation  

B C

A D

FE

A DD

B C

D

E

AA

F Machine 1 Machine 2

A

B

C

D

E

F

Graph Partitioning

Partition geographically close-by entities  

Machine 3 Machine 4

3

B CB C

D

E

A

F

Machine 1 Machine 2

CellIQ Graph Partitioning

G H

2D 1D

?

3 Machine 3 Machine 4

B CB C

D

E

A

F

Machine 1 Machine 2

AB

CD

EF

Graph Partitioning

G HG

H

Random (hashed) partitioning

3 Machine 3 Machine 4

B CB C

D

E

A

F

Machine 1 Machine 2

AB

CD

EF

Graph Partitioning

G HG

H

Random (hashed) partitioning results in poor spatial locality

Machine 3 Machine 4

B CB C

D

E

A

F

Machine 1 Machine 2

CellIQ Graph Partitioning

G H

Uses Hilbert space-filling curves

Machine 3 Machine 4

0 3

2 1 B CB C

D

E

A

F

Machine 1 Machine 2

CellIQ Graph Partitioning

G H

Uses Hilbert space-filling curves Use curve’s distance as the 1-dimensional key

Machine 3 Machine 4

0 3

2 1 B CB C

D

E

A

F

Machine 1 Machine 2

AB C

D

EF

CellIQ Graph Partitioning

G H G H

Uses Hilbert space-filling curves Use curve’s distance as the 1-dimensional key Range partition the key space

0 1

2 3

4 7

6 5

8 11

10 9

14 15

12 13

Machine 3 Machine 4

B CB C

D

E

A

F

Machine 1 Machine 2

AB C

D

EF

CellIQ Graph Partitioning

G H G H

Uses Hilbert space-filling curves Use curve’s distance as the 1-dimensional key Range partition the key space

Computational Model: GStreams

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1

UE2

UE1 BS2

UE3

UE4

UE5

Domain specific graph partitioning Spatial operations Window operations

GeoGraph API

class  GeoGraph[V,  E]  {      //  Broadcast  a  message  to  all        //  vertices  within  a  radius      def  sendMsg(radius)            //  Create  a  spatially  aggregated        //  graph  by  combining  vertices          //  and  edges        def  spatialAG(reduceV:  (V,  V)  =>  V,                                  reduceE:  (E,  E)  =>  E)  }  

Tracking user traffic gradients

Goal: Detect and track direction of movement of user groups

3

B C

A D

F

E

A DD

B C

D

E

AA

F

Tracking user traffic gradients

Base Station

3

B C

A D

F

E

A DD

B C

D

E

AA

F

Tracking user traffic gradients

B C

A D

F

E

A DD

B C

D

E

AA

F

Hop-by-hop propagation

Tracking user traffic gradients

B C

A D

F

E

A DD

B C

D

E

AA

F

Hop-by-hop propagation is inefficient

Tracking user traffic gradients

Tracking user traffic gradients

B C

A D

F

E

A DD

B C

D

E

AA

F

Instead, CellIQ enables radius based broadcast

Part. 2

Part. 1

Vertex Table(RDD)

B C

A D

FE

A D

Routing Table in GraphX enables Multicast

D

B C

D

E

AA

F

Machine 1

Machine 2

Edge Table(RDD)

A B

A C

C D

B C

A E

A F

E F

E D

B

C

D

E

A

F

RoutingTable

(RDD)

B

C

D

E

A

F

1  

2  

1   2  

1   2  

1  

2  

Slide courtesy: Joey Gonzales

RoutingTable

(RDD)

B

C

D

E

A

F

1  

2  

1   2  

1   2  

1  

2  Part. 2

Part. 1

Vertex Table(RDD)

B C

A D

FE

A DD

B C

D

E

AA

F

Machine 1

Machine 2

Edge Table(RDD)

A B

A C

C D

B C

A E

A F

E F

E D

B

C

D

E

A

FSlide courtesy: Joey Gonzales

Can compute destination partitions easily due to the use of geo-partitioner

GeoGraph API

class  GeoGraph[V,  E]  {      //  Broadcast  a  message  to  all        //  vertices  within  a  radius      def  sendMsg(radius)            //  Create  a  spatially  aggregated        //  graph  by  combining  vertices          //  and  edges        def  spatialAG(reduceV:  (V,  V)  =>  V,                                  reduceE:  (E,  E)  =>  E)  }  

B C

A D

F

E

A DD

B C

D

E

AA

F

Spatial Clustering

F E DDB’F

Goal: Combine spatially close-by vertices

Spatial Clustering Two ways to enable spatial aggregation: - Using a (supplied) field in properties - Leverage geo partitioner

00   01  

02  03  

10   13  

12  11  

20   23  

22  21  

32   33  

30  31  

Spatial Clustering Two ways to enable spatial aggregation: - Using a (supplied) field in properties - Leverage geo partitioner

00   01  

02  03  

10   13  

12  11  

20   23  

22  21  

32   33  

30  31  0   3  

2  1  

Computational Model: GStreams

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1UE2

UE1

BS2

UE3

UE4

UE5

BS1

UE2

UE1 BS2

UE3

UE4

UE5

Domain specific graph partitioning Spatial operations

Window operations

Tracking Persistent Hotspots

Goal: Detect and track groups of base stations with high traffic volume

Equivalent to finding connected components

Tracking Persistent Hotspots BS1

BS2 BS3

t1 t2 t3

W

Combining graphs at the end of the window results in many join operations (inefficient)

BS1

BS2

BS1

BS2

Tracking Persistent Hotspots BS1

BS2 BS3

t1 t2 t3

W

BS1

BS2

BS1

BS2

BS1

BS2 BS3

1 1

1

BS1

BS2 BS3

2 1

1

BS1

BS2 BS3

3 1

1

Apply incremental updates to a cumulative graph

Tracking Persistent Hotspots BS1

BS2 BS3

t1 t2 t3

BS1

BS2

BS1

BS2

BS1

BS2 BS3

1 1

1

Apply differential updates to a cumulative graph

BS1

BS3

t4

BS1

BS2 BS3

1 2

1

BS1

BS2 BS3

1 3

1

BS1

BS2 BS3

1 2

0

GStream API

class  GStream[V,  E]  {        def  graphReduceByWindow(          reduceFunc(Graph[V,  E],  Graph[V,  E],                                  fv:  (V,  V)  =>  V,                                  fe:  (E,  E)  =>  E):  Graph[V,  E],            invReduceFunc(Graph[V,  E],  Graph[V,  E],                                  fv:  (V,  V)  =>  V,                                  fe:  (E,  E)  =>  E):  Graph[V,  E],            windowDuration,  slideDuration)  }  

graphReduceByWindow    

•  Implemented using Spark’s cogroupedRDD  •  Two default reduce functions: graph intersection and union •  Further optimizations:

– Co-partition graphs from multiple batches – Reuse indices and routing tables for graphs in the

same window More details in the paper!

How does CellIQ perform?

Evaluation Setup

•  LTE control plane data from a major cellular network operator •  1 million+ subscribers, live network

•  2 TB data from 1 week

– 1 file per minute, 750k records, 100s of fields/line – 10 collection points, 10 hours per day

•  Implemented several analysis tasks

Benefits of Geo-partitioning

��

��

���

���

���

���

���

���

���

�������� ���������� ������ �����

�����������������������

������������

�������������������� ����������������

Benefits of Geo-partitioning

��

��

���

���

���

���

���

���

���

�������� ���������� ������ �����

�����������������������

������������

�������������������� ����������������

Small amount of data, movement not noticeable

Default practitioner fails to produce results

Benefits of Incremental Updates

��

��

���

���

���

���

���

���

���

�������� ���������� ������ �����

�����������������������

������������

������������������������������������

������������������������������������������������������������

Benefits of Incremental Updates

��

��

���

���

���

���

���

���

���

�������� ���������� ������ �����

�����������������������

������������

������������������������������������

������������������������������������������������������������

2 – 5X improvements

Benefits of Incremental Updates

��

��

���

���

���

���

���

���

���

�������� ���������� ������ �����

�����������������������

������������

������������������������������������

������������������������������������������������������������

window size affects performance

Benefits of Differential Updates

��

��

��

��

��

���

�� �� �� �� �� ��� ���

�����������������

����������������

��������������

Benefits of Differential Updates

��

��

��

��

��

���

�� �� �� �� �� ��� ���

�����������������

����������������

��������������

Larger windows see bigger benefits

Graceful degradation in performance

Benefits of Radius-based Broadcast

��

����

����

����

����

����

����

�������� ���������� ������ �����

�����������������

������������

����������������������

��� � ��

���

����������

��� ���

���

Benefits of Radius-based Broadcast

��

����

����

����

����

����

����

�������� ���������� ������ �����

�����������������

������������

����������������������

��� � ��

���

����������

��� ���

���

Larger datasets result in increase in messages exchanges per hop

CellIQ is a cellular network analytics system that uses domain-specific optimizations to achieve 2x to 5x

improvements

CellIQ is a cellular network analytics system that uses domain-specific optimizations to achieve 2x to 5x

improvements Ongoing Work: • Using techniques in CellIQ to perform root-cause

analysis on operational LTE Networks •  Generalized streaming graph analysis techniques

top related