RMLL 2013 : Build Your Personal Search Engine using Crawlzilla
TRANSCRIPT
![Page 1: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/1.jpg)
Jazz Wang
Yao-Tsung Wang
[email protected]@nchc.org.tw
Build Your Personal Search Engine using Crawlzilla
![Page 2: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/2.jpg)
2/50
WHO AM I ? JAZZ ??
• Speaker :
– Jazz Yao-Tsung Wang / ECE Master
– Associate Researcher , NCHC, NARL, Taiwan
– Co-Founder of Hadoop.TW
• My slides are available on the website
– http://trac.nchc.org.tw/cloud , http://www.slideshare.net/jazzwang
FLOSS User: Debian/Ubuntu, Access Grid, Motion/VLC, Red5, Debian Router, DRBL/Clonezilla, Hadoop
FLOSS Evangelist: DRBL/Clonezilla, Partclone/Tuxboot, Hadoop Ecosystem
FLOSS Developer: DRBL/Clonezilla, Hadoop Ecosystem
![Page 3: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/3.jpg)
3/50
WHAT is Big Data?
![Page 4: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/4.jpg)
4/50
Data Explosion!!
Source : The Expanding Digital Universe, A Forecast of Worldwide Information Growth Through 2010, March 2007, An IDC White Paper - sponsored by EMC
http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf
![Page 5: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/5.jpg)
5/50
Source : Extracting Value from Chaos, June 2011, An IDC White Paper - sponsored by EMC
http://www.emc.com/collateral/about/news/idc-emc-digital-universe-2011-infographic.pdf
According to IDC reports:
2006 161 EB
2007 281 EB
2008 487 EB
2009 800 EB (0.8 ZB)
2010 988 EB (predicted)
2010 1200 EB (1.2 ZB)
2011 1773 EB (predicted)
2011 1800 EB (1.8 ZB)
Data expanded 1.6x each year !!
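The "1.6x each year" figure can be checked against the IDC numbers listed on this slide. The sketch below is only an arithmetic illustration: it uses the figures that are not marked as predictions, and the variable names are made up for the example.

```python
# Digital-universe sizes in exabytes, copied from the slide
# (only the figures that are not marked as predictions).
sizes_eb = {2006: 161, 2007: 281, 2008: 487, 2009: 800, 2010: 1200, 2011: 1800}

years = sorted(sizes_eb)

# Year-over-year growth factor for each consecutive pair of years.
yearly_factors = {
    year: sizes_eb[year] / sizes_eb[year - 1] for year in years[1:]
}

# Compound growth factor per year over the whole 2006-2011 span.
span = years[-1] - years[0]
compound_factor = (sizes_eb[years[-1]] / sizes_eb[years[0]]) ** (1 / span)

for year, factor in yearly_factors.items():
    print(f"{year - 1} -> {year}: x{factor:.2f}")
print(f"compound growth per year: x{compound_factor:.2f}")
```

The individual yearly factors vary, but the compound factor over the five years rounds to the 1.6x quoted on the slide.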
![Page 6: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/6.jpg)
6/50
What is Big Data?!
Data sets that commonly used software tools can't capture, manage and process.
'Big Data' = a few dozen terabytes to petabytes in a single data set.
Multiple files, 20 TB in total
Single database, 20 TB in total
Single file, 20 TB in total
Source : http://en.wikipedia.org/wiki/Big_data
![Page 7: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/7.jpg)
7/50
Gartner Big Data Model ?
The challenge of 'Big Data' is to manage Volume, Variety and Velocity.
Volume (amount of data): TB, PB, EB
Velocity (speed of data in/out): Batch Job, Realtime
Variety (data types, sources): Structured, Semi-structured, Unstructured
Source :
[1] Laney, Douglas. "3D Data Management: Controlling Data Volume, Velocity and Variety" (6 February 2001)
[2] Gartner Says Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data, June 2011
![Page 8: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/8.jpg)
8/50
12D of Information Management?
Big Data is just the beginning of Extreme Information Management
Source: Gartner (March 2011), 'Big Data' Is Only the Beginning of Extreme Information Management, 7 April 2011, http://www.gartner.com/id=1622715
![Page 9: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/9.jpg)
9/50
Possible Applications of Big Data?
Top 1 : Human Genomics – 7000 PB / Year
Top 2 : Digital Photos – 1000 PB+ / Year
Top 3 : E-mail (no Spam) – 300 PB+ / Year
Source: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf
Source: http://lib.stanford.edu/files/see_pasig_dic.pdf
![Page 10: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/10.jpg)
10/50
HOW to deal with Big Data?
![Page 11: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/11.jpg)
11/50
The SMAQ stack for big data
Source : The SMAQ stack for big data , Edd Dumbill , 22 September 2010 , http://radar.oreilly.com/2010/09/the-smaq-stack-for-big-data.html
![Page 12: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/12.jpg)
12/50
The SMAQ stack for big data
Source : The SMAQ stack for big data , Edd Dumbill , 22 September 2010 , http://radar.oreilly.com/2010/09/the-smaq-stack-for-big-data.html
![Page 13: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/13.jpg)
13/50
The SMAQ stack for big data
Source : The SMAQ stack for big data , Edd Dumbill , 22 September 2010 , http://radar.oreilly.com/2010/09/the-smaq-stack-for-big-data.html
![Page 14: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/14.jpg)
14/50
Three Core Technologies of Google ....
• Google shared the design of their web-search engine
– SOSP 2003 :
– “The Google File System”
– http://labs.google.com/papers/gfs.html
– OSDI 2004 :
– “MapReduce : Simplified Data Processing on Large Clusters”
– http://labs.google.com/papers/mapreduce.html
– OSDI 2006 :
– “Bigtable: A Distributed Storage System for Structured Data”
– http://labs.google.com/papers/bigtable-osdi06.pdf
![Page 15: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/15.jpg)
15/50
Open Source Mapping of Google Core Technologies

| Google's Stack | Open Source Projects |
| --- | --- |
| S = Storage: Google File System (to store petabytes of data) | Hadoop Distributed File System (HDFS), Sector Distributed File System |
| MapReduce (to process data in parallel) | Hadoop MapReduce API, Sphere MapReduce API, ... |
| Q = Query: BigTable (a huge key-value datastore) | HBase, Hypertable, Cassandra, ... |
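To make the MapReduce entry on this slide more concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps applied to word counting. It is not the Hadoop MapReduce API: the function names and sample documents are hypothetical, and a real job would spread the map and reduce tasks across cluster nodes.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map: emit one (word, 1) pair per word in a document."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: combine the values of one key into a single result."""
    return word, sum(counts)


# Hypothetical input documents for the illustration.
documents = {
    "doc1": "big data needs big storage",
    "doc2": "search needs an index",
}

mapped = [
    pair
    for doc_id, text in documents.items()
    for pair in map_phase(doc_id, text)
]
grouped = shuffle(mapped)
word_counts = dict(reduce_phase(word, counts) for word, counts in grouped.items())
print(word_counts)
```

Because each map call only sees one document and each reduce call only sees one key, both phases can in principle run in parallel on different machines, which is the property the slide refers to.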
![Page 16: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/16.jpg)
16/50
Hadoop
• http://hadoop.apache.org
• Apache Top Level Project
• Major sponsor is Yahoo!
• Developed by Doug Cutting, with reference to the Google File System
• Written in Java, it provides HDFS and the MapReduce API
• Used at Yahoo since 2006
• It has been deployed to 4000+ nodes at Yahoo
• Designed to process datasets at petabyte scale
• Facebook, Last.fm and Joost are also powered by Hadoop
![Page 17: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/17.jpg)
17/50
WHO will need Hadoop?
Let's take 'Search Engine' as an example
![Page 18: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/18.jpg)
18/50
Search is everywhere in our daily life !!
File
Mail
Pidgin
Database
Web Pages
![Page 19: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/19.jpg)
19/50
To speed up search, we need an “Index”
Keyword
Page Number
![Page 20: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/20.jpg)
20/50
History of Hadoop … 2001~2005
• Lucene
– http://lucene.apache.org/
– A high-performance, full-featured text search engine library written entirely in Java.
– Lucene creates an inverted index of every word in different documents, which improves the performance of text searching.
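The idea of an inverted index can be sketched in a few lines. This is an illustration of the data structure only, not Lucene's implementation or API; the function names and sample notes are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical notes used as sample documents.
notes = {
    "wiki/hadoop": "hadoop provides hdfs and mapreduce",
    "wiki/nutch": "nutch builds on lucene",
    "wiki/lucene": "lucene creates an inverted index",
}

index = build_inverted_index(notes)
print(sorted(search(index, "lucene")))
print(sorted(search(index, "inverted index")))
```

A query only touches the entries for its own words instead of scanning every document, which is why the index speeds up searching.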
![Page 21: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/21.jpg)
21/50
History of Hadoop … 2005~2006
• Nutch
– http://nutch.apache.org/
– Nutch is open source web-search software.
– It builds on Lucene and Solr, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc.
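One of the web-specific pieces listed above is parsing HTML to find the links a crawler should follow next. The sketch below uses only the Python standard library and is not Nutch code; the class name, the sample page and the base URL are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


# Hypothetical page content for the illustration.
page = (
    '<html><body>'
    '<a href="/cloud">Cloud</a> '
    '<a href="http://nutch.apache.org/">Nutch</a>'
    '</body></html>'
)

extractor = LinkExtractor("http://trac.nchc.org.tw/index.html")
extractor.feed(page)
print(extractor.links)
```

The extracted links are what a link-graph database would store as edges between pages.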
![Page 22: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/22.jpg)
22/50
History of Hadoop … 2006 ~ Now
• Nutch encountered performance issues.
• Reference from Google's papers.
• Added DFS & MapReduce implementations to Nutch
• According to user feedback on the Nutch mailing list ....
• Hadoop became a separate project as of Nutch 0.8
• Nutch DFS → Hadoop Distributed File System (HDFS)
• Yahoo hired Doug Cutting to build a web search engine team in 2006.
– Only 14 team members (engineers, clusters, users, etc.)
• Doug Cutting joined Cloudera in 2009.
![Page 23: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/23.jpg)
23/50
Do you like to write notes?
![Page 24: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/24.jpg)
24/50
Tools that I used to write notes
Oddmuse Wiki
http://www.oddmuse.org/
2005~2008
![Page 25: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/25.jpg)
25/50
Tools that I used to write notes
PmWiki
http://www.pmwiki.org/
![Page 26: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/26.jpg)
26/50
Tools that I used to write notes
ScrapBook
https://addons.mozilla.org/zh-TW/firefox/addon/scrapbook/
2005~NOW
![Page 27: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/27.jpg)
27/50
Tools that I used to write notes
Trac = Wiki + Version Control (SVN or GIT)
http://trac.edgewall.org/
2006~NOW
![Page 28: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/28.jpg)
28/50
Tools that I used to write notes
ReadItLater
http://readitlaterlist.com/
2010~NOW
![Page 29: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/29.jpg)
29/50
It's painful to search all my notes!
![Page 30: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/30.jpg)
30/50
How do I search all my notes from different websites?
![Page 31: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/31.jpg)
31/50
Features of Crawlzilla
● Cluster-based
● Chinese Word Segmentation
● Multiple Users with Multiple Index Pools
● Re-Crawl
● Schedule / Crontab
● Display Index Database (Top 50 sites, keywords)
● Support UTF-8 Chinese Search Results
● Web-based User Interface
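Chinese text has no spaces between words, so an indexer needs a segmentation step before it can extract keywords. The slides do not say which segmenter Crawlzilla uses; the sketch below uses overlapping character bigrams, a simple and common baseline, purely as an illustration. The function name and the sample text are assumptions.

```python
import re

# A token candidate is either a run of CJK ideographs or a run of ASCII letters/digits.
TOKEN_RE = re.compile(r"[\u4e00-\u9fff]+|[A-Za-z0-9]+")


def segment(text):
    """Split text into index terms: lowercased ASCII words and CJK character bigrams."""
    tokens = []
    for run in TOKEN_RE.findall(text):
        if "\u4e00" <= run[0] <= "\u9fff":
            if len(run) == 1:
                # A single ideograph is kept as its own term.
                tokens.append(run)
            else:
                # Overlapping bigrams: every pair of adjacent ideographs.
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            tokens.append(run.lower())
    return tokens


print(segment("Crawlzilla 搜尋引擎"))
```

Bigrams over-generate terms compared with a dictionary-based segmenter, but they need no word list, which is why they are often used as a fallback.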
![Page 32: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/32.jpg)
32/50
System Architecture of Crawlzilla
[Diagram. Components shown: Web UI (Crawlzilla Website + Search Engine); Crawlzilla System Management; JSP + Servlet + JavaBean; Tomcat; Nutch; Lucene; Hadoop; PC1, PC2, PC3]
![Page 33: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/33.jpg)
33/50
Crawlzilla Web UI
Users
Index DB
![Page 34: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/34.jpg)
34/50
Comparison with other projects

| | Spidr | Larbin | Jcrawl | Nutch | Crawlzilla |
| --- | --- | --- | --- | --- | --- |
| Install | Ruby package install | Gmake compile and install | Java compile and install | Deploy configuration files | Provides auto installation |
| Crawl website pages | O | O | O | O | O |
| Parse content | X | X | X | O | O |
| Cluster computing | X | X | X | O | O |
| Interface | Command | Command | Command | Command | Web-UI |
| Support Chinese segmentation | X | X | X | X | O |
![Page 35: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/35.jpg)
35/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Step 1 : Registration
http://demo.crawlzilla.info
![Page 36: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/36.jpg)
36/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Step 2 : Acceptance Notification
Wait for notification from the administrator!
![Page 37: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/37.jpg)
37/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Step 3 : Log in with your account
Log in at http://demo.crawlzilla.info
![Page 38: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/38.jpg)
38/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Step 4 : Set up Name, URLs and depth
Set up a new Search Pool
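The "depth" setting in this step limits how far the crawler follows links away from the URLs you enter. The sketch below shows one way such a limit can work, as a breadth-first crawl over a small in-memory link graph. It is not Crawlzilla or Nutch code: the function names, the toy URLs and the convention that seed URLs sit at depth 0 are all assumptions.

```python
from collections import deque


def crawl(seed_urls, max_depth, get_links):
    """Breadth-first crawl: visit the seeds (depth 0) and follow links up to max_depth hops."""
    visited = {}  # url -> depth at which it was first reached
    queue = deque((url, 0) for url in seed_urls)
    while queue:
        url, depth = queue.popleft()
        if url in visited:
            continue
        visited[url] = depth
        if depth < max_depth:
            for link in get_links(url):
                if link not in visited:
                    queue.append((link, depth + 1))
    return visited


# Hypothetical link graph standing in for real web pages.
toy_web = {
    "http://example.org/": ["http://example.org/a", "http://example.org/b"],
    "http://example.org/a": ["http://example.org/a/1"],
    "http://example.org/b": ["http://example.org/"],
    "http://example.org/a/1": ["http://example.org/a/1/x"],
}

pages = crawl(
    ["http://example.org/"],
    max_depth=2,
    get_links=lambda url: toy_web.get(url, []),
)
print(pages)
```

With `max_depth=2` the page three hops away is never fetched, so a larger depth means a larger index and a longer crawl.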
![Page 39: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/39.jpg)
39/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Step 5 : Wait for Crawlzilla to generate the Index DB
Wait for Crawlzilla to generate the index for you
![Page 40: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/40.jpg)
40/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ You can see how long it has taken to generate the Index DB
Index Pools Management
![Page 41: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/41.jpg)
41/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Manually Re-Crawl or Delete Index Database
You can manually 'Re-Crawl' and generate a new Index DB
![Page 42: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/42.jpg)
42/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ In Index Pools Management you can see how much time the current crawl has used so far
You can schedule a re-crawl under 'System Schedule'
![Page 43: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/43.jpg)
43/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Display the basic information of the Index Database
Display the Index Database Basic Information
![Page 44: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/44.jpg)
44/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
▲ Add this HTML to your website
HTML Syntax for Your Personal Search Engine
![Page 45: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/45.jpg)
45/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
Total number of Web pages indexed, and Top 50 URLs
![Page 46: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/46.jpg)
46/50
Multi-user Web Search Cloud Service : Crawlzilla 1.0
It also shows indexed Document Types and Top 50 Keywords
![Page 47: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/47.jpg)
47/50
Crawlzilla Web UI
User
Index DB
ftp:// file:// Skydrive
![Page 48: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/48.jpg)
48/50
Start from Here!
● Crawlzilla Demo Cloud Service
● http://demo.crawlzilla.info
● Crawlzilla @ Google Code Project Hosting
● http://code.google.com/p/crawlzilla/
● Crawlzilla @ Source Forge (Tutorial in English)
● http://sourceforge.net/p/crawlzilla/home/
● Crawlzilla User Group @ Google
● http://groups.google.com/group/crawlzilla-user
● NCHC Cloud Computing Research Group
● http://trac.nchc.org.tw/cloud
![Page 49: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/49.jpg)
49/50
Authors of Crawlzilla
Waue Chen (Left) [email protected]
Rock Kuo (Center) [email protected]
Shunfa Yang (Right) [email protected]
![Page 50: RMLL 2013 : Build Your Personal Search Engine using Crawlzilla](https://reader034.vdocument.in/reader034/viewer/2022051412/54c668b94a7959342b8b45fb/html5/thumbnails/50.jpg)
50/50
Questions?
Slides - http://trac.nchc.org.tw/cloud
Jazz Wang
Yao-Tsung Wang
[email protected]@nchc.org.tw