nullcon 2011 - building an intelligence analysis systems using open source tools
DESCRIPTION
Building an intelligence analysis systems using open source tools by Fyodor YarochkinTRANSCRIPT
![Page 1: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/1.jpg)
http://null.co.in/ http://nullcon.net/
Building Intelligence analysis systems
Hands-on with nutch, solr, lucene, maltego/netglub, and
more
![Page 2: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/2.jpg)
http://null.co.in/ http://nullcon.net/
Agenda
• INTRO:Intelligence analysis systems• ARCHITECTURE:components• HANDS-ON:what’s on your VM• MOD01:data scapping• MOD02:data storage• MOD03:data viz (Maltego/Netglub:
building transforms )• FINI:other stuff and Q/A
![Page 3: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/3.jpg)
3http://null.co.in/ http://nullcon.net/
INTRO:intelligence analysis
• IA is a way of reducing ambiguity in highly ambiguous situations (wiki)
![Page 4: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/4.jpg)
4http://null.co.in/ http://nullcon.net/
INTO:Intelligence analysis
• Or rather - making large volumes of information available at your fingertips for your personal joy of owning personal custom google ;-)
![Page 5: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/5.jpg)
5http://null.co.in/ http://nullcon.net/
Bits and bolts of IA system
• What we build:Data
scrappers
Datascrappers
Massive
Data
storage
IndexingAnd
Searchingcapabilities
Some sort of UIAnd vizualization
NLP
![Page 6: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/6.jpg)
6http://null.co.in/ http://nullcon.net/
Scrappers
• HTTP: web crawlers, RSS feed parsers, forum crawlers, social media etc
• IRC bots• ... Yer own ..HANDS ON: on prepared VMYou’ll find some samples, which we areGoing to play with. Roll your sleeves :-)
![Page 7: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/7.jpg)
7http://null.co.in/ http://nullcon.net/
Data storage
• Small amounts of data: files (local, HDFS)
• SQL databases: works but scale poorly
• Non-sql key value storage works too and scales wellHANDS-ON: we’ve got a bit of both
On VM
![Page 8: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/8.jpg)
8http://null.co.in/ http://nullcon.net/
Post-analysis• Language correction
(slang, misspellings etc)• Language translation
(taking chinese/russian/.. Feeds)
• Custom “synonymous” word matching
• Similarity hashing functions and more
![Page 9: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/9.jpg)
9http://null.co.in/ http://nullcon.net/
Post analysis tools
• Hands on:– SOLR– RIAK Search
![Page 10: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/10.jpg)
10
http://null.co.in/ http://nullcon.net/
UI and viz
• A few tools that we are going to play with:– Custom web UI– Maltego (including
building custom transforms)
– Netglub (if have time)
![Page 11: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/11.jpg)
11
http://null.co.in/ http://nullcon.net/
HANDS-ON• So boot VM and lets get started :-)
Some codez!yeh!!!!!!
![Page 12: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/12.jpg)
12
http://null.co.in/ http://nullcon.net/
HANDS-ON
• On your VM:– Instructions in docs folder– MOD01 MOD02 and MOD03 are different– Sections that we are going to play with
– You will need internet connection and some URLz to play with. You’ll get idea
![Page 13: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/13.jpg)
13
http://null.co.in/ http://nullcon.net/
Objectives
• To get the sh* working ;)• To write some code (maybe)
• To exchange ideas
• Did I say beer? ;)
![Page 14: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/14.jpg)
14
http://null.co.in/ http://nullcon.net/
MOD01
• Doing scrappers:– Nutch
• Customization and custom plugins• Custom scrapping and indexing• Data into solr
– Ebot• Data into RIAK storage
– (if we have time, we’ll look into more)
![Page 15: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/15.jpg)
15
http://null.co.in/ http://nullcon.net/
MOD02
• Storage and processing:– SOLR (details in doc)
![Page 16: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/16.jpg)
16
http://null.co.in/ http://nullcon.net/
MOD02.2
• Key-value -> RIAK and ERL– Details in doc
![Page 17: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/17.jpg)
17
http://null.co.in/ http://nullcon.net/
MOD03
• Extracting and making use of data– Custom UI in 3 minutes (doc #1)
– Using maltego client to eat your data– Transforms - custom builds and tweaking
![Page 18: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/18.jpg)
18
http://null.co.in/ http://nullcon.net/
MOD03.2
• Netglub - opensource maltego on drugs
![Page 19: nullcon 2011 - Building an intelligence analysis systems using open source tools](https://reader035.vdocument.in/reader035/viewer/2022062513/5576299dd8b42a4e1c8b53b7/html5/thumbnails/19.jpg)
19
http://null.co.in/ http://nullcon.net/
Other topics of interest
• NLP• Nilsimsa hashing and applications• Language correction algorithms