memory-efficient regular expression search using state merging department of computer science and...

20
Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C. Authors: Michela Becchi; Srihari Cadambi Publisher: IEEE INFOCOM 2007 Present: Chia-Ming ,Chang Date: 3, 18, 2009 1

Post on 22-Dec-2015

218 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Memory-Efficient Regular Expression Search Using State

Merging

Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C.

Authors: Michela Becchi; Srihari Cadambi

Publisher: IEEE INFOCOM 2007

Present: Chia-Ming ,Chang Date: 3, 18, 2009

1

Page 2: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Outline

1. Introduction 2. State merging 3. Experimental results 4. Conclusion

2

Page 3: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Introduction (1/4)

3

Page 4: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Introduction (2/4)

4

Page 5: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Introduction (3/4)

5

Fig. 3. (a) Rudimentary bitmap-based data structure. The lower part shows the bitmap and its transition table for State 3 from the example in Figure 1.

Fig. 3. (b) More compact bitmap-based data structure using pointer indirection.

ab c

d e

fghi012

Page 6: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Introduction (4/4)

6

Bitmap-based data structures have two other disadvantages,especially when implemented in software. First; some computation(1’s counting) is necessary in order to process thebitmap and obtain an address into the transition table.

Second: fetching the bitmap could require several memory accesses. These issues may be resolved to a certain extent by breaking up a large bitmap into several smaller bitmaps, each annotated with some extra information.

Page 7: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Outline

1. Introduction 2. State merging 3. Experimental results 4. Conclusion

7

Page 8: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (1/8)

8

1

2

3

a

b1_2 3

0

1

a/0,b/1

source

State merging method : source

label

Page 9: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (2/8)

9

1

2

3

a

b1 2_3

0

1

a.0,b.1

destination

State merging method : destination

label

Page 10: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (3/8)

8

10

Page 11: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (4/8)

811

Page 12: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (5/8)

12

abcdef

afgh

01

Page 13: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (2/3)

13

G is the weight graph corresponding to D.

D is the DFA being processed,

e is an edge in G,

h is a heap storing weight graph edges ordered according to edge weights.

two DFA states s and t, metric(s, t) is a measure of the memory savings when s and t are merged.

max labels is the maximum number of labels allowed.

Page 14: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (7/8)

14

Algorithm V.1 shows the core of our procedure, which is a loop that terminates when merging can no longer produce memory savings, or if more than the maximum allowed labels are required.

Algorithm V.2 constructs the weight graph G and stores the edges in heap h. It considers all pairs of states in the given DFA. If the total number of sub-states in the state pair under consideration is less than the number of labels,it evaluates metric(s, t) and assigns it to the weight of theedge connecting states s and t in the weight graph. Recall that metric(s, t) is an exact measure of the memory savings possible when states s and t are merged.

Algorithm V.3 recalculates the metric corresponding to statepairs where one state is connected to the merged state.

Page 15: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Architecture (8/8)

15

Page 16: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Outline

1. Introduction 2. State merging 3. Experimental results 4. Conclusion

16

Page 17: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Experimental results (1/2)

17Merging-128 labels

Page 18: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Experimental results (2/2)

18

Page 19: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Outline

1. Introduction 2. Architecture 3. Experimental results 4. Conclusion

19

Page 20: Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,

Conclusion (1/1)

20

We describe a compact data structure that can representa DFA with merged states and transition labels.

We present a merging and labeling algorithm. We extend the bitmap data structure proposed for string

matching [7] to DFAs, and introduce a modification using pointer indirection, which also reduces memory usage in its own right.

We present an analysis of our scheme, and perform a systematic experimental study comparing state merging to previous compaction techniques.