Building SaaS Solutions for Online Media Using Apache Solr - by Alberto Mijares
![Page 1: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/1.jpg)
Building SaaS solutions with Apache Solr
Alberto Mijares, Canoo Engineering, 26/05/2011
Twitter: @lemaiol
![Page 2: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/2.jpg)
Bullet point time!
![Page 3: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/3.jpg)
What I Will Cover
Practical applications of Apache Solr and Apache Lucene: how to increase the time a user spends on a website and do website “cross-selling”.
Use case: how Canoo helped Axel Springer Switzerland increase page impressions, time on site and traffic in its financial online newspapers.
Key concepts:
• How to achieve this using Lucene & Solr
• How to profit from a SaaS business model
![Page 4: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/4.jpg)
Who I am
Alberto Mijares
Canoo Engineering AG
Background in web applications and standards:
• Participated in W3C Semantic Web interest group (SWEO)
• Led web standards compliance tools development in the past (Web Accessibility and Mobile Web)
• Led enterprise information retrieval projects in the recent past
• Currently coaching Google Web Toolkit development projects
![Page 5: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/5.jpg)
Who is Canoo
People:
• Dirk Koenig: Groovy founder
• Andres Almiray: Griffon project lead and Java Champion
• Hamlet D’Arcy: Groovy committer and enthusiast
• … almost 40 more top software engineers
Products:
• WebTest: framework for web functional testing
• RIA Suite (aka ULC): Java-based RIA framework
• FindIT: information retrieval and search tools
• WMTrans: language analysis tools
![Page 6: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/6.jpg)
Canoo FindIT
http://www.canoo.com/videos/FindIT.html
![Page 7: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/7.jpg)
Stop “bullet-pointing”!
![Page 8: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/8.jpg)
The facts
Axel Springer group is a market leader
Bilanz, Handelszeitung and Stocks
In Switzerland financials are important!
Financial language is German
Online media is the future
![Page 10: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/10.jpg)
The gap
Make the online versions more profitable
Make all newspapers “market leaders”
![Page 12: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/12.jpg)
The how
Workshop
“Related articles”
“Cross-selling”
![Page 14: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/14.jpg)
The analysis
Find a funding model
Use Lucene’s “More like this”
Integrate back the suggestions
Implement a selection mechanism
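A minimal sketch of what a “More like this” request against Solr could look like. The `mlt.*` parameters are Solr's MoreLikeThis parameters; the host, the `/mlt` handler path (it has to be registered in `solrconfig.xml`), the `id` key and the field names `title`, `body` and `url` are assumptions for illustration, not details from the talk.

```python
from urllib.parse import urlencode


def more_like_this_url(solr_base, doc_id, fields, rows=5, min_tf=2, min_df=5):
    """Build a MoreLikeThis request URL for one already-indexed article.

    solr_base, the /mlt path and the field names are hypothetical.
    """
    params = {
        "q": f'id:"{doc_id}"',        # the source article, found by its unique key
        "mlt.fl": ",".join(fields),   # fields whose terms drive the similarity
        "mlt.mintf": min_tf,          # minimum term frequency in the source article
        "mlt.mindf": min_df,          # minimum number of documents containing a term
        "rows": rows,                 # how many related articles to return
        "fl": "id,title,url",         # stored fields returned for each suggestion
        "wt": "json",                 # response format
    }
    return f"{solr_base.rstrip('/')}/mlt?{urlencode(params)}"


url = more_like_this_url("http://localhost:8983/solr", "article-42", ["title", "body"])
print(url)
```

Raising `min_tf` and `min_df` trades recall for precision: fewer but more distinctive terms end up in the similarity query.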
![Page 16: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/16.jpg)
The issues
“More like this” was “experimental”
Works out-of-the-box only in English
Without “semantics”, the suggestions do not always make sense
Indexing full pages produces noise
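One plausible reading of the English-only issue is that default analysis settings (stop words, stemming) are tuned for English, so frequent German function words leak into the “interesting terms”. The toy example below only illustrates that effect; it is not Lucene's analysis chain, and the stop word list and sample sentence are made up.

```python
import re
from collections import Counter

# Tiny, hypothetical German stop word list for illustration only.
GERMAN_STOPWORDS = frozenset(
    {"der", "die", "das", "und", "in", "den", "von", "zu", "mit", "ist"}
)


def top_terms(text, stopwords=frozenset(), n=3):
    """Return the n most frequent lowercase tokens that are not stop words."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    return [term for term, _ in counts.most_common(n)]


text = "Die Aktie der Bank steigt und die Anleger der Bank kaufen die Aktie"

without_filter = top_terms(text)                  # articles dominate the ranking
with_filter = top_terms(text, GERMAN_STOPWORDS)   # content words surface instead
print(without_filter, with_filter)
```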
![Page 18: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/18.jpg)
The key
![Page 20: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/20.jpg)
The functional requirements
Discover and index articles
Extract only content
Simple and flexible query service
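The “extract only content” requirement can be sketched with the Python standard library: keep the text inside the article container and drop navigation, scripts and footers before indexing. The container id `article` and the sample markup are assumptions; the talk does not say how extraction was actually implemented.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect text inside one container element, skipping non-content tags."""

    SKIPPED = {"script", "style", "nav", "aside", "footer"}
    VOID = {"br", "img", "hr", "input", "meta", "link"}  # tags without an end tag

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0       # > 0 while inside the article container
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return  # void elements never change the nesting depth
        if self.depth:
            self.depth += 1
            if tag in self.SKIPPED or self.skip_depth:
                self.skip_depth += 1
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID or not self.depth:
            return
        self.depth -= 1
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.depth and not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_article_text(html, container_id="article"):
    """Return only the text found inside the element with the given id."""
    parser = ArticleTextExtractor(container_id)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    '<html><body><div id="nav">Home | Markets</div>'
    '<div id="article"><h1>Title</h1><p>First<br>paragraph.</p>'
    "<script>track()</script></div>"
    '<div id="footer">Imprint</div></body></html>'
)
print(extract_article_text(page))
```

Only the extracted text would then be sent to the index, so navigation and footer terms cannot skew the similarity ranking.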
![Page 22: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/22.jpg)
The funding model
![Page 23: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/23.jpg)
The business model
SaaS
![Page 24: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/24.jpg)
The “other” requirements
Lucene-based analysis pipeline
Web oriented platform
Multi-application platform
Reliable, fast and scalable
Plan B?
![Page 26: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/26.jpg)
The search
Wraps Lucene in a nice way
It is mature and Open Source
Supports scheduling, REST API, DIH (DataImportHandler), …
Scalability out-of-the-box
Well documented and has professional support
![Page 28: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/28.jpg)
The plan
From POC to PROD in “80 days”
![Page 30: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/30.jpg)
The results
Google Analytics
![Page 32: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/32.jpg)
The conclusions
![Page 33: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/33.jpg)
The Q&A
Thanks!
![Page 34: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/34.jpg)
Sources
Links
• http://people.canoo.com/share
• http://www.canoo.com
• http://www.canoo.net
• http://www.leo.org
• http://www.bilanz.ch
• http://www.handelszeitung.ch
• http://www.stocks.ch
![Page 36: Building SaaS Solutions for Online Media Using Apache Solr - By Alberto Mijares](https://reader034.vdocument.in/reader034/viewer/2022052700/55a274911a28ab13058b4637/html5/thumbnails/36.jpg)
Architecture
Platform: Apache Solr 1.4.1
Architecture:
• Solr container: Springer Solr, Customer 2 Solr, Customer 3 Solr
• Web container: Springer WebApp, Customer 2 WebApp, Customer 3 WebApp
• Diagram labels: External access, Internal access, Requests
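The per-customer layout can be sketched as a small routing table: each customer has its own web application and its own Solr index. The sketch assumes that external requests reach the web applications and that only those talk to Solr internally; the slide does not spell this out. All paths, core names and the internal host are hypothetical.

```python
from dataclasses import dataclass

INTERNAL_SOLR = "http://solr.internal:8983/solr"  # hypothetical internal-only host


@dataclass(frozen=True)
class Tenant:
    name: str
    webapp_path: str  # externally reachable path in the web container
    solr_core: str    # per-customer index in the Solr container


TENANTS = (
    Tenant("Springer", "/springer", "springer"),
    Tenant("Customer 2", "/customer2", "customer2"),
    Tenant("Customer 3", "/customer3", "customer3"),
)


def solr_url_for(request_path):
    """Map an external request path to the internal Solr URL of its tenant."""
    for tenant in TENANTS:
        prefix = tenant.webapp_path
        if request_path == prefix or request_path.startswith(prefix + "/"):
            return f"{INTERNAL_SOLR}/{tenant.solr_core}"
    raise LookupError(f"no tenant configured for {request_path!r}")


print(solr_url_for("/springer/related"))
```

Keeping one index per customer means one customer's articles never appear in another customer's suggestions, and adding a customer is a new entry in the table rather than a change to shared code.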