deterministic memory- efficient string matching algorithms for intrusion detection nathan tuck,...
Post on 20-Dec-2015
TRANSCRIPT
![Page 1: Deterministic Memory- Efficient String Matching Algorithms for Intrusion Detection Nathan Tuck, Timothy Sherwood, Brad Calder, George Varghese Department](https://reader036.vdocument.in/reader036/viewer/2022081603/56649d415503460f94a1c669/html5/thumbnails/1.jpg)
Deterministic Memory-Efficient String Matching Algorithms for Intrusion Detection
Nathan Tuck, Timothy Sherwood, Brad Calder, George Varghese
Department of Computer Science and Engineering, University of California, San Diego
Department of Computer Science, University of California, Santa Barbara
Abstract
• IDSs: Intrusion Detection Systems
• Space- and time-efficient string matching algorithms
• Providing worst-case performance
– Amenable to H/W implementation
• Aho-Corasick
– Memory, performance
Introduction (i)
• Combating attacks at every level
• Automatically monitoring network traffic
• IDS uses a set of rules
– Applied to matching packets
• Edge and core routers
– Stringent worst-case performance bounds
– Tight constraints on memory
Introduction (ii)
• At the heart of IDSs is a string matching algorithm
– In Snort, 70% of total execution time and 80% of instructions executed
• Contributions of this paper
– Characterization
– New Algorithms
– Evaluation
String matching for intrusion detection
Quantifying the Use of String Matching (i)
• Snort: an intrusion detection system
– The rules are generated manually
• Extract relevant content strings from the payload and header of known attacks
– The action can include logging, alerting, ignoring, ……
– Rules are usually added as new vulnerabilities are discovered
Quantifying the Use of String Matching (ii)
• Scalability of the intrusion detection system database
– Beneficial to avoid any algorithm whose run-time is proportional to the length of the rules in the database
– New rules are being added to detect or combat new attacks
Quantifying the Use of String Matching (iii)
• Linearly searching through the set of rules is becoming increasingly infeasible
• The database is growing at a rate that is well within Moore’s Law
• Need a technique whose run-time performance does not depend on the size of the rule database
State of the Art in String Matching (i)
• Single-pattern string matching
– Boyer-Moore, ……
• Multi-pattern string matching
– Aho-Corasick, Wu-Manber, ……
• Imprecise string matching
– Uses hashing and signature-based techniques
– Matches must be reverified using a precise string matching algorithm
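Of the multi-pattern algorithms listed above, Aho-Corasick is the one the rest of the talk builds on. The following is a minimal textbook-style sketch of it, not the Snort implementation or the authors' code. The function names `build_automaton` and `search` and the dictionary-based node layout are assumptions made for illustration.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # patterns recognized on entering each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def search(text, automaton):
    """Return (start_offset, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches

automaton = build_automaton(["he", "she", "his", "hers"])
print(search("ushers", automaton))
# prints [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The input is scanned once regardless of how many patterns are loaded, which is why the run time does not grow with the size of the rule set.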
State of the Art in String Matching (ii)
• Bad Character Heuristics
– Easily exploitable by attackers
• Aho-Corasick
– Uses a data structure that is not optimized for space
• SFKSearch
– Worst-case performance is quite poor
• Wu-Manber
– Memory accesses to the shift and hash tables
Applying IP Lookup Techniques to String Matching (i)
• IP lookup: given a set of patterns to match, find the longest possible match for each of a set of IP addresses that are streaming by
• String matching: given a set of strings to match, find all of the places in the input stream where there is a match
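The difference between the two problems can be sketched with two deliberately naive functions. Both are illustrations only: the names `longest_prefix_match` and `all_matches` are hypothetical, addresses are written as bit strings for readability, and neither function reflects how a router or an IDS would implement the search.

```python
def longest_prefix_match(prefixes, address_bits):
    """IP-lookup style: return the single longest prefix that matches."""
    best = None
    for prefix in prefixes:
        if address_bits.startswith(prefix):
            if best is None or len(prefix) > len(best):
                best = prefix
    return best

def all_matches(patterns, stream):
    """String-matching style: report every (offset, pattern) occurrence."""
    hits = []
    for offset in range(len(stream)):
        for pattern in patterns:
            if stream.startswith(pattern, offset):
                hits.append((offset, pattern))
    return hits

print(longest_prefix_match(["10", "1011", "0"], "10110000"))
# prints 1011
print(all_matches(["ab", "b"], "abab"))
# prints [(0, 'ab'), (1, 'b'), (2, 'ab'), (3, 'b')]
```

IP lookup returns one answer anchored at the start of the address, while string matching must consider every offset in the stream and report every hit.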
Applying IP Lookup Techniques to String Matching (ii)
• Unibit and Multibit Tries
– Waste space on pointers
• Lulea Algorithm
– Uses the concepts of leaf pushing and bitmaps to compress the database
• Eatherton Algorithm
– Internal bitmap and external bitmap
Optimizations for string matching
Bitmap compression (i)
• With 32-bit pointers
• Each Aho-Corasick node has 256 next-state pointers
• Now use a single pointer to the first valid next state, and maintain a 256-bit bitmap
• The next state is found by summing all the bits prior to the input character's bit number and adding the sum to the base next-node pointer
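The bitmap lookup described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `BitmapNode` is hypothetical, a Python integer stands in for the 256-bit bitmap, and the list `next_nodes` stands in for the contiguous block of child nodes that the single base pointer would address in memory.

```python
class BitmapNode:
    def __init__(self, children):
        # children: dict mapping a byte value (0-255) to its child node
        self.bitmap = 0
        self.next_nodes = []  # children stored contiguously, in byte order
        for byte in sorted(children):
            self.bitmap |= 1 << byte
            self.next_nodes.append(children[byte])

    def next_state(self, byte):
        # Bit not set: there is no valid next state for this byte.
        # A full matcher would follow the failure pointer here.
        if not (self.bitmap >> byte) & 1:
            return None
        # Count the set bits below this byte's position (a popcount).
        prior = self.bitmap & ((1 << byte) - 1)
        index = bin(prior).count("1")
        # base pointer + index selects the child
        return self.next_nodes[index]

node = BitmapNode({ord("a"): "A", ord("c"): "C", ord("z"): "Z"})
print(node.next_state(ord("c")))  # prints C
print(node.next_state(ord("b")))  # prints None
```

Only three child slots are stored for this node instead of 256 pointers; the cost is the bit count performed on every lookup.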
Bitmap compression (ii)
• Original optimized Aho-Corasick
– 1028 bytes per node
• Bitmapped version
– Only 44 bytes per node
• Incurs two costs
– Doubles the worst-case amount of work
– Requires summing up to 256 prior bits
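The two node sizes can be reproduced with a short calculation. The per-field breakdown below is an assumption chosen to be consistent with the slide's totals; the slide itself gives only the figures 1028 and 44 and the 32-bit pointer width.

```python
POINTER_BYTES = 4   # 32-bit pointers, as stated on the slide
ALPHABET = 256      # one entry per possible byte value

# Assumed layout: 256 next-state pointers plus one additional pointer.
full_node = ALPHABET * POINTER_BYTES + POINTER_BYTES

# Assumed layout: a 256-bit bitmap plus three pointers
# (next-state base, failure, and rule list).
bitmap_node = ALPHABET // 8 + 3 * POINTER_BYTES

ratio = full_node / bitmap_node
print(full_node, bitmap_node, round(ratio, 1))
# prints 1028 44 23.4
```

Under these assumptions the bitmapped node is roughly 23 times smaller than the full node.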
Path Compression (i)
• The bitmap is largely wasted information at the bottom nodes
• Any path compressed nodes must be equal in size to bitmapped nodes
• Failure pointers must include an offset
Path Compression (ii)
• With 32-bit pointers
– A single path compressed node can contain data equivalent to 4 bitmap compressed nodes
– In practice, achieves a 2.54:1 compression ratio
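The idea of packing a run of non-branching states into one node can be sketched on a plain trie. This is a simplified illustration, not the authors' node layout: the function names are hypothetical, the capacity of 4 is taken from the slide, and failure pointers (which would need a node plus an offset) and rule pointers are left out.

```python
def build_trie(patterns):
    """Nested-dict trie: each node maps a character to its child node."""
    root = {}
    for pat in patterns:
        node = root
        for ch in pat:
            node = node.setdefault(ch, {})
    return root

def count_states(node):
    """Number of trie states, one storage node per state."""
    return 1 + sum(count_states(child) for child in node.values())

def count_compressed(node, capacity=4):
    """Storage nodes needed when up to `capacity` consecutive states with
    at most one child each are packed into a single path node."""
    if len(node) >= 2:
        # Branching state: keeps its own (bitmap) node.
        return 1 + sum(count_compressed(c, capacity) for c in node.values())
    packed = 0
    while len(node) <= 1 and packed < capacity:
        packed += 1
        if not node:
            return 1  # the run ended at a leaf
        node = next(iter(node.values()))
    # The run stopped at a branching state or at the capacity limit.
    return 1 + count_compressed(node, capacity)

trie = build_trie(["abcdefgh"])
print(count_states(trie), count_compressed(trie))
# prints 9 3
```

A single long pattern gets close to the ideal ratio, while branching near the root reduces the benefit, which is consistent with a measured ratio below the 4:1 maximum.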
Results
Intrusion Detection in Hardware
• The number of rules goes up by over a factor of 2.5, whereas the size of memory for our algorithm only goes up by 30%
• Focus our attention on the worst-case performance
Intrusion Detection in Software
• Examine both average-case and worst-case performance
• Wu-Manber is the fastest in the average case because of its hash function
Summary (i)
• Current software IDSs largely rely on common-case optimizations to gain speed
• Only Aho-Corasick has deterministic worst-case lookup times and is friendly enough to use for wire-speed H/W matching
Summary (ii)
• Contribution of this paper
– Apply bitmap node compression and path compression to Aho-Corasick
– Gain both compact storage and worst-case performance