
Post on 02-Jan-2016


Twitter Frenzy  FPGA Data Stream Processing

Cory Kleinheksel (Team Leader), Tim Meyer, David Graziano, Josh Clausman

Project Idea

• Twitter Frenzy - A way to filter tweets as a set of frequencies, using an FPGA to perform packet analysis.

• Accelerate the stream processing of Twitter data queries.

• Specifically, accelerate computationally intensive, long-lived queries that run over short-lived data.

• The design/implementation of a frequency-based query will be the primary focus (interesting application of signal processing).

 

Details

• Input: Live (or simulated) Twitter stream data

  • Java program used to simulate the Twitter feed by reading from a dataset

• Processing:

  1. Extract tweets from input stream

  2. Filter tweets based on query parameters

     • Text Matching

  3. Determine tweet frequency components

     • Frequency Analysis

  4. Apply signal filter (signal processing)

• Output: Tweets matching filter
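The first two processing steps (extraction and text matching) can be sketched in software as follows. This is only a minimal illustration, not the team's simulator, which the slides describe as a Java program. It assumes the dataset holds one JSON object per line with a `text` field; that layout and the names `extract_tweets`, `matches_query`, and `filter_stream` are hypothetical.

```python
import json


def extract_tweets(lines):
    """Yield the text of each tweet from an iterable of JSON lines.

    Assumption: one JSON object per line with a "text" field. The slides
    do not describe the dataset layout.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records instead of aborting the stream
        if not isinstance(record, dict):
            continue
        text = record.get("text")
        if isinstance(text, str):
            yield text


def matches_query(text, keywords):
    """Return True if any keyword occurs in the tweet (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def filter_stream(lines, keywords):
    """Steps 1 and 2: extract tweets, then keep those matching the query."""
    return [t for t in extract_tweets(lines) if matches_query(t, keywords)]
```

A plain substring test stands in for the text-matching block here; the hardware design may match differently.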

Design Issues

• Ability to acquire data from twitter at a useful speed

• Determining packet usefulness (send/drop) in efficient manner

• Managing concurrently arriving packets and multi-fragment packets

• How to calculate frequency and filter corresponding packets
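The send/drop decision and the handling of multi-fragment packets can be illustrated with a small software model. The slides do not describe the packet format, so this sketch assumes each fragment arrives as a hypothetical `(stream_id, payload, is_last)` tuple and that fragments of one stream arrive in order; `FragmentBuffer` and `process` are illustrative names, not part of the project.

```python
class FragmentBuffer:
    """Reassemble tweets whose fragments may interleave across streams."""

    def __init__(self):
        self._pending = {}  # stream_id -> list of payload fragments so far

    def push(self, stream_id, payload, is_last):
        """Buffer one fragment; return the whole tweet once it is complete."""
        parts = self._pending.setdefault(stream_id, [])
        parts.append(payload)
        if not is_last:
            return None
        del self._pending[stream_id]  # free the buffer for this stream
        return "".join(parts)


def process(fragments, keyword):
    """Return the reassembled tweets that contain the keyword (send);
    everything else is dropped."""
    buffer = FragmentBuffer()
    sent = []
    for stream_id, payload, is_last in fragments:
        tweet = buffer.push(stream_id, payload, is_last)
        if tweet is not None and keyword in tweet:
            sent.append(tweet)
    return sent
```

Keying the buffer by stream lets fragments from concurrently arriving packets interleave without being mixed together.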

Implementation Issues

• How to properly buffer and send fragmented tweets

• Time/clock cycles needed to perform frequency calculations

• Time to perform hashing (addressed by creating a lookup-table-based hashing block)

• Modules consuming data at different rates

• Debugging HW
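The slides mention a lookup-table-based hashing block without naming the scheme. One well-known table-driven hash is Pearson hashing, sketched below purely as an illustration; it is an assumption that the hardware used anything similar. Each input byte costs one XOR and one table read. The helper name `make_table` and the seeded shuffle are hypothetical choices for the example.

```python
import random


def make_table(seed=0):
    """Build a 256-entry permutation table (seeded so it is reproducible)."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table


def pearson_hash(data, table):
    """Return an 8-bit Pearson hash of a bytes object.

    Per byte: XOR with the running hash, then one table lookup.
    """
    h = 0
    for byte in data:
        h = table[h ^ byte]
    return h
```

For example, `pearson_hash(b"fpga", make_table())` gives a value between 0 and 255 that is stable for a fixed table.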

System Architecture Diagram

 

Breakdown: Network Data Flow

 

Breakdown: Text Matching

Breakdown: Frequency Analysis

Algorithms

• Hashing

• String Matching

• Frequency Analysis

• Filtering (FIR)
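The frequency analysis and FIR filtering steps can be sketched as below. The slides do not define "frequency" precisely, so this assumes it means the number of matching tweets per fixed time window, with that count series treated as the signal to filter. The function names, the window size, and the tap values are all hypothetical, and timestamps are assumed to be non-negative.

```python
from collections import Counter


def counts_per_window(timestamps, window):
    """Count events in consecutive windows of the given length.

    Returns one count per window, from window 0 up to the last
    window that contains an event.
    """
    if not timestamps:
        return []
    buckets = Counter(int(t // window) for t in timestamps)
    return [buckets.get(i, 0) for i in range(max(buckets) + 1)]


def fir_filter(samples, taps):
    """Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k].

    Samples before the start of the series are treated as zero.
    """
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        output.append(acc)
    return output
```

With taps `[0.5, 0.5]` the filter is a two-point moving average, which smooths short bursts in the count series; other tap sets would emphasize different rates of change.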

Project Results

• Analyzed the problem

• Implemented full simulator in software

• Implemented in VHDL

• Simulated in ModelSim

• Tested on hardware, confirmed results against software implementation

Dataset: JSON_29493.txt

• Processed 29493 tweets

• 192 passed string filter

• 133 passed frequency filter

Software Simulator Example

Demo

References

Berinde, Indyk, Cormode, Strauss. "Space-optimal Heavy Hitters with Strong Error Bounds"

Cormode, Korn, Tirthapura. "Time-Decaying Aggregates in Out-of-order Streams"

Charikar, Chen, Farach-Colton. "Finding Frequent Items in Data Streams"
