jdd2014: multitenant search - pablo barros

Upload: proidea

Post on 01-Jul-2015

Category: Software


DESCRIPTION

With search playing such a big part in modern applications, provisioning robust search solutions that provide the proper level of security and low maintenance costs in multitenant applications becomes an entirely new challenge. In this session, we define the requirements for multitenant search and review different patterns and solutions available to tackle this challenge.

TRANSCRIPT

Page 1: JDD2014: Multitenant Search - Pablo Barros
Page 2: JDD2014: Multitenant Search - Pablo Barros

Copyright © 2014 Oracle and/or its affiliates. All rights reserved.

Multitenant Search
JDD 2014, Krakow - PL

Pablo Barros, Applications Architect
October 14, 2014

Page 3: JDD2014: Multitenant Search - Pablo Barros

• The opinions and views expressed in this talk are my own, and do not necessarily reflect the opinions or views of my employer.

Page 4: JDD2014: Multitenant Search - Pablo Barros

About me

Page 5: JDD2014: Multitenant Search - Pablo Barros

About me

Page 6: JDD2014: Multitenant Search - Pablo Barros

Giveaway

• Elasticsearch Server – Second Edition
  – By Rafal Kuc

Page 7: JDD2014: Multitenant Search - Pablo Barros

Agenda

1. Key Concepts and Pitfalls of Multitenancy
2. Designing the Search Index
3. Defining the Cluster Topology
4. Integrating with your Application
5. Q&A

Page 8: JDD2014: Multitenant Search - Pablo Barros

Overview
Key Concepts and Pitfalls of Multitenant Search

Page 9: JDD2014: Multitenant Search - Pablo Barros

Defining Multitenancy

“Single software instance serving multiple customers.”

Page 10: JDD2014: Multitenant Search - Pablo Barros

Benefits

• Sharing of Resources
• Lower Costs
• Easier Horizontal Scaling
• Quicker onboarding of new Customers
• Data Aggregation
• Simpler Release Processes
• “Green”

Page 11: JDD2014: Multitenant Search - Pablo Barros

Pitfalls & Risks

• Resource Sharing Limits
• Requires more Customization capabilities
• Higher Complexity
• Data Security

Page 12: JDD2014: Multitenant Search - Pablo Barros

Topology

[Diagram labels: Your Application; Search Engine; Search Cluster 1..N; Hub/Tribe Node; Read/Write; Read]

Page 13: JDD2014: Multitenant Search - Pablo Barros

Designing the Search Index

Page 14: JDD2014: Multitenant Search - Pablo Barros

Index Logical Granularity

[Diagram comparing two options, labeled "vs."]

Page 15: JDD2014: Multitenant Search - Pablo Barros

Shared Indices

• Schema-less Index
• Pros
  – True Global search
  – Intermixed Results across Customers/Entities
• Cons
  – Cross Tenant Data Security
  – Weaker data separation
  – Index corruption can affect entire Search
  – Ability to index data in parallel is diminished
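On a shared index, the cross-tenant security concern usually comes down to making sure every query carries a tenant restriction. A minimal sketch, assuming each document has a `tenant_id` field (a hypothetical field name, not from the slides) and using the Elasticsearch `bool` query shape with a `term` filter; older Elasticsearch releases expressed the same idea with a `filtered` query. The helper name is also an assumption.

```python
from typing import Any, Dict


def tenant_scoped_query(tenant_id: str, user_query: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a user query so it can only match documents of one tenant.

    The `tenant_id` document field is an assumption made for this sketch.
    """
    if not tenant_id:
        raise ValueError("tenant_id is required on a shared index")
    return {
        "query": {
            "bool": {
                # The user's own query still decides relevance.
                "must": [user_query],
                # The filter restricts the candidate set to one tenant.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }


body = tenant_scoped_query("acme", {"match": {"text": "droids"}})
```

Building the restriction in one place, rather than at each call site, makes it harder to forget the filter on a new query path.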

Page 16: JDD2014: Multitenant Search - Pablo Barros

Dedicated Indices

• Pros
  – Better data separation
  – More modular/portable
  – Better parallel indexing capabilities
• Cons
  – More storage
  – Global search is more limited
    • However, some search engines allow searching across indexes and even across clusters (Elasticsearch Tribe Node)
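With dedicated indices, a search across tenants means naming several indices in one request. Elasticsearch accepts a comma-separated list of index names in the search URL; the sketch below only builds that path. The `tenant-<id>` naming scheme and both function names are assumptions for illustration.

```python
import re
from typing import Iterable


def index_name(tenant_id: str, prefix: str = "tenant") -> str:
    """Derive a per-tenant index name; the `tenant-<id>` scheme is hypothetical."""
    slug = re.sub(r"[^a-z0-9]+", "-", tenant_id.lower()).strip("-")
    if not slug:
        raise ValueError("tenant_id must contain letters or digits")
    return f"{prefix}-{slug}"


def search_path(tenant_ids: Iterable[str]) -> str:
    """Build a search path that covers several tenant indices at once."""
    names = [index_name(t) for t in tenant_ids]
    if not names:
        raise ValueError("at least one tenant is required")
    return "/" + ",".join(names) + "/_search"


path = search_path(["Acme Corp", "Globex"])
```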

Page 17: JDD2014: Multitenant Search - Pablo Barros

Indexing Process & Storage

Page 18: JDD2014: Multitenant Search - Pablo Barros

Indexing Process & Storage

Token   Pointer
Droid   1, 2, 3
Look    2, 3
Rain    1

Doc 1: …
Doc 2: …
Doc 3: …
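The token-to-pointer table is an inverted index: each token maps to the list of documents that contain it. A minimal sketch follows; the three document bodies are hypothetical, already-normalized stand-ins chosen so that the result matches the table, since the slide elides the real text.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical document contents, picked to reproduce the slide's table.
docs = {
    1: "droid rain",
    2: "droid look",
    3: "look droid",
}


def build_inverted_index(documents: Dict[int, str]) -> Dict[str, List[int]]:
    """Map every token to the sorted list of document ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


index = build_inverted_index(docs)
```

A real engine would also run analysis steps such as stemming before building the postings; this sketch only splits on whitespace.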

Page 19: JDD2014: Multitenant Search - Pablo Barros

Indexing with Storage Enabled

"total": 2,
"hits": [
  {
    "id": 1,
    "text": "These are not the <b>droids</b> you are looking for.",
    "date": "2014/10/8"
  },
  {
    "id": 2,
    "text": "However, those are the <b>droids</b> you are looking for.",
    "date": "2014/10/8"
  }
]

Page 20: JDD2014: Multitenant Search - Pablo Barros

Indexing with Storage Disabled

"total": 2,
"hits": [
  {
    "id": 1
  },
  {
    "id": 2
  }
]

Page 21: JDD2014: Multitenant Search - Pablo Barros

Storing and Retrieving Original Indexed Data

Full Document
  Pros
  • Avoid hitting database on your application
  • Snippet highlighting
  Cons
  • Extra storage on Search Engine file system
  • Access control needs to be built in the index

IDs Only
  Pros
  • Storage on Search Engine file system is light
  • Small response payload
  Cons
  • Reliance on database for reading data to show users
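With the IDs-only option, the application turns search hits into displayable rows by reading them from its own database, which is also a natural place to apply access control. A minimal sketch, assuming a dictionary as a stand-in for the database and a hypothetical `owner` column; none of these names come from the slides.

```python
from typing import Dict, List

# Stand-in for the application database; contents are illustrative only.
DATABASE: Dict[int, Dict[str, str]] = {
    1: {"text": "These are not the droids you are looking for.", "owner": "tenant-a"},
    2: {"text": "However, those are the droids you are looking for.", "owner": "tenant-b"},
}


def hydrate(hit_ids: List[int], tenant: str) -> List[Dict[str, str]]:
    """Load full records for search hits, keeping only rows the tenant owns."""
    rows = []
    for doc_id in hit_ids:
        row = DATABASE.get(doc_id)
        # Hits that were deleted or belong to another tenant are skipped.
        if row is not None and row["owner"] == tenant:
            rows.append({"id": str(doc_id), "text": row["text"]})
    return rows


visible = hydrate([1, 2], tenant="tenant-a")
```

The trade-off is the one listed in the table: every results page costs a database read.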

Page 22: JDD2014: Multitenant Search - Pablo Barros

Parent-child Relationships

• Defines 1-to-many relationship between entries in different indices
• Convenient when pushing relational data into Index
• Parent can be updated without re-indexing children

[Diagram: Customer (1) to Order (0..*)]
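The point about updating a parent without re-indexing its children can be illustrated with a toy in-memory model. This is not how a search engine stores the relationship; it is a sketch assuming children keep only a reference to their parent, and every name in it is hypothetical.

```python
from typing import Dict, List

customers: Dict[str, Dict[str, str]] = {"c1": {"name": "Acme"}}
orders: Dict[str, Dict[str, str]] = {
    "o1": {"parent": "c1", "item": "droid"},
    "o2": {"parent": "c1", "item": "rain cover"},
}


def update_customer(customer_id: str, **fields: str) -> None:
    """Change the parent only; child entries are left untouched."""
    customers[customer_id].update(fields)


def orders_for_customer_named(name: str) -> List[str]:
    """Find children by a property of their parent, resolved at query time."""
    parent_ids = {cid for cid, c in customers.items() if c["name"] == name}
    return sorted(oid for oid, o in orders.items() if o["parent"] in parent_ids)


update_customer("c1", name="Acme Corp")
```

Had the customer name been copied into each order, the rename would have required rewriting every order entry.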

Page 23: JDD2014: Multitenant Search - Pablo Barros

Multiple Languages

• Leverage language auto-detection
• Leverage stop words
• Limit amount of stemming
• Option:
  – Single entry in multiple languages
    • Merge value in different languages into single field
    • Pro: Simple implementation. Search can be performed in any language
    • Con: Match might include homonym in other languages
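The single-field option and its homonym drawback can be shown in a few lines. A minimal sketch, assuming a naive whole-word matcher in place of the search engine; the function names and sample values are illustrative. The German word "Gift" means poison, so an English search for "gift" hits it.

```python
from typing import Dict


def merge_languages(values: Dict[str, str]) -> str:
    """Concatenate the per-language values into one searchable field."""
    return " ".join(values[lang] for lang in sorted(values))


def matches(field: str, term: str) -> bool:
    """Naive whole-word match, standing in for the search engine."""
    return term.lower() in field.lower().split()


# One entry whose value exists in English and German.
merged = merge_languages({"en": "poison", "de": "Gift"})

# An English-speaking user looking for presents still matches this entry.
false_friend = matches(merged, "gift")
```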

Page 24: JDD2014: Multitenant Search - Pablo Barros

Defining the Search Cluster Topology

Page 25: JDD2014: Multitenant Search - Pablo Barros

Shards

[Diagram: shards 1 to 5 distributed across Node 1 and Node 2]
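Sharding needs a stable rule that sends each document to exactly one shard. A minimal sketch of hash-based routing follows; CRC32 is used purely for illustration, and real engines use their own hash function. Passing a tenant identifier as the routing key, as in the usage line, is an assumption that would keep one tenant's documents together.

```python
import zlib


def shard_for(routing_key: str, num_shards: int) -> int:
    """Map a routing key to a shard number in the range 1..num_shards."""
    if num_shards < 1:
        raise ValueError("num_shards must be positive")
    # CRC32 is deterministic across processes, unlike Python's built-in hash().
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards + 1


shard = shard_for("tenant-a", 5)
```

Because the result depends on `num_shards`, changing the shard count moves documents, which is one reason the number of shards is usually fixed up front.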

Page 26: JDD2014: Multitenant Search - Pablo Barros

Replicas

[Diagram: shards 1 to 5 and their replicas 1R to 5R distributed across Node 1 and Node 2]
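A replica only helps if it lives on a different node than its primary. The sketch below uses a simple round-robin rule to guarantee that; it is an assumed placement strategy for illustration and does not reproduce the exact layout of the diagram or any engine's allocator.

```python
from typing import Dict, List


def place_shards(num_shards: int, nodes: List[str]) -> Dict[str, List[str]]:
    """Round-robin primaries, putting each replica on the next node over."""
    if len(nodes) < 2:
        raise ValueError("a replica needs a second node to live on")
    layout: Dict[str, List[str]] = {node: [] for node in nodes}
    for shard in range(1, num_shards + 1):
        primary = nodes[(shard - 1) % len(nodes)]
        replica = nodes[shard % len(nodes)]  # always differs from primary
        layout[primary].append(str(shard))
        layout[replica].append(f"{shard}R")
    return layout


layout = place_shards(5, ["node-1", "node-2"])
```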

Page 27: JDD2014: Multitenant Search - Pablo Barros

Cluster

• Approach depends on what your framework has to offer
• Elasticsearch provides a lot of support out of the box
• Considerations:
  – Cluster Segmentation (Few Smaller vs Single Large?)
  – Geographical Distribution
  – Searching across Clusters
  – Write/Read Ratio

Page 28: JDD2014: Multitenant Search - Pablo Barros

Hub

• Aware of All Clusters
• Maintains map of Tenant -> Cluster
• Serves as discovery mechanism for the Client Application
• Able to create/pause/move/delete Tenants
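The hub's responsibilities map naturally onto a small registry. A minimal in-memory sketch, assuming the class and method names below (they are not from the talk) and leaving out persistence and the actual data migration that a move would involve.

```python
from typing import Dict, Set


class Hub:
    """In-memory tenant registry; a real hub would persist this map."""

    def __init__(self, clusters: Set[str]) -> None:
        self._clusters = set(clusters)
        self._tenant_cluster: Dict[str, str] = {}
        self._paused: Set[str] = set()

    def _require_cluster(self, cluster: str) -> None:
        if cluster not in self._clusters:
            raise KeyError(f"unknown cluster: {cluster}")

    def create(self, tenant: str, cluster: str) -> None:
        self._require_cluster(cluster)
        if tenant in self._tenant_cluster:
            raise ValueError(f"tenant already exists: {tenant}")
        self._tenant_cluster[tenant] = cluster

    def discover(self, tenant: str) -> str:
        """Return the cluster a client should talk to for this tenant."""
        if tenant in self._paused:
            raise RuntimeError(f"tenant is paused: {tenant}")
        return self._tenant_cluster[tenant]

    def pause(self, tenant: str) -> None:
        if tenant not in self._tenant_cluster:
            raise KeyError(tenant)
        self._paused.add(tenant)

    def move(self, tenant: str, cluster: str) -> None:
        self._require_cluster(cluster)
        if tenant not in self._tenant_cluster:
            raise KeyError(tenant)
        # Only the mapping changes here; copying the data is out of scope.
        self._tenant_cluster[tenant] = cluster

    def delete(self, tenant: str) -> None:
        del self._tenant_cluster[tenant]
        self._paused.discard(tenant)


hub = Hub({"eu-1", "us-1"})
hub.create("acme", "eu-1")
hub.move("acme", "us-1")
```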

Page 29: JDD2014: Multitenant Search - Pablo Barros

Hub
Tenant Discovery Service

Page 30: JDD2014: Multitenant Search - Pablo Barros

Integrating with your Application

Page 31: JDD2014: Multitenant Search - Pablo Barros

Indexing vs Querying

• Expected load on writes/reads
• Depends on Problem Domain of Client Application
• Writes are expensive!
  – Especially if not done in bulk
• Reads are fairly cheap

Page 32: JDD2014: Multitenant Search - Pablo Barros

Initial Data Load/Full Re-index

• Perform Actions in Bulk
  – Minimize overall number of Lucene Commits
• Consider enabling External “Versioning”
  – Safely parallelize indexing requests
• Keep track of documents that failed to index
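The three points combine into one loading loop: send documents in batches, let an externally supplied version decide whether a write is applied, and collect whatever was not indexed. The sketch below uses a toy in-memory index in place of a real engine; all names are hypothetical. It follows the rule used by external versioning in Elasticsearch, where a write is accepted only if its version is greater than the stored one.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Doc = Tuple[str, int, dict]  # (document id, external version, body)


def batches(docs: Iterable[Doc], size: int) -> Iterator[List[Doc]]:
    """Group documents so each request carries at most `size` of them."""
    batch: List[Doc] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


class VersionedIndex:
    """Toy index that accepts a write only if its version is newer."""

    def __init__(self) -> None:
        self.docs: Dict[str, Tuple[int, dict]] = {}

    def bulk(self, batch: List[Doc]) -> List[str]:
        """Apply one batch and return the ids that were rejected."""
        rejected = []
        for doc_id, version, body in batch:
            current = self.docs.get(doc_id)
            if current is not None and version <= current[0]:
                rejected.append(doc_id)  # stale write, keep the newer copy
            else:
                self.docs[doc_id] = (version, body)
        return rejected


def load(index: VersionedIndex, docs: Iterable[Doc], batch_size: int = 500) -> List[str]:
    """Index everything in batches and report the ids that did not apply."""
    failed: List[str] = []
    for batch in batches(docs, batch_size):
        failed.extend(index.bulk(batch))
    return failed


index = VersionedIndex()
failed = load(index, [("a", 2, {"v": "new"}), ("a", 1, {"v": "old"}), ("b", 1, {})], batch_size=2)
```

Because the version travels with the document, two workers can send updates for the same id in any order and the newer one still wins.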

Page 33: JDD2014: Multitenant Search - Pablo Barros

Incremental Indexing

Page 34: JDD2014: Multitenant Search - Pablo Barros

Incremental Indexing

• Monitor Indexing requests delay
• Message customers accordingly
  – e.g. “Search Results might not include recently updated entries.”
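Monitoring the delay and messaging customers can be tied together with a threshold. A minimal sketch, assuming the application knows the timestamp of the oldest change still waiting to be indexed; the function names and the 30-second default are illustrative, not values from the talk.

```python
from datetime import datetime, timedelta
from typing import Optional

NOTICE = "Search Results might not include recently updated entries."


def indexing_delay(oldest_pending: Optional[datetime], now: datetime) -> timedelta:
    """Age of the oldest change that has not reached the index yet."""
    if oldest_pending is None:
        return timedelta(0)  # nothing is waiting
    return max(now - oldest_pending, timedelta(0))


def notice_for(delay: timedelta, threshold: timedelta = timedelta(seconds=30)) -> Optional[str]:
    """Return the customer-facing notice once the delay passes the threshold."""
    return NOTICE if delay > threshold else None


now = datetime(2014, 10, 14, 12, 0, 0)
delay = indexing_delay(datetime(2014, 10, 14, 11, 58, 0), now)
message = notice_for(delay)
```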

Page 35: JDD2014: Multitenant Search - Pablo Barros

Disaster Recovery

• Take advantage of what your Framework offers you
  – e.g. Replication in Elasticsearch
• Nightly Backups + Replay of changes since Backup creation
• Avoid starting from scratch!
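The backup-plus-replay approach can be sketched as a restore function that starts from the backup and applies only the changes recorded after it. This assumes the application keeps an ordered change log with sequence numbers, which is an assumption of the sketch rather than something the slides specify.

```python
from typing import Dict, List, Optional, Tuple

# (sequence number, document id, new body or None for a delete)
Change = Tuple[int, str, Optional[dict]]


def restore(backup: Dict[str, dict], backup_seq: int, change_log: List[Change]) -> Dict[str, dict]:
    """Rebuild the index from a backup plus the changes recorded after it."""
    index = dict(backup)
    for seq, doc_id, body in sorted(change_log, key=lambda change: change[0]):
        if seq <= backup_seq:
            continue  # already contained in the backup
        if body is None:
            index.pop(doc_id, None)
        else:
            index[doc_id] = body
    return index


nightly = {"a": {"text": "old"}, "b": {"text": "keep"}}
log = [
    (9, "a", {"text": "older"}),   # before the backup, skipped
    (11, "a", {"text": "new"}),
    (12, "b", None),
    (13, "c", {"text": "added"}),
]
recovered = restore(nightly, backup_seq=10, change_log=log)
```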

Page 36: JDD2014: Multitenant Search - Pablo Barros

Final Thoughts

• Recent Open-source Tooling (*cough* Elasticsearch) makes it easy
• Consider Carefully:
  – Design and granularity of your tenant in the Search engine
  – Define Entities and their Relationships
  – Sharding and Replication Schemes
  – Clustering Distribution
    • e.g. per application Installation, geographically, etc.
  – High Availability Mechanisms

Page 37: JDD2014: Multitenant Search - Pablo Barros

Q&A
Thank you!

Page 38: JDD2014: Multitenant Search - Pablo Barros


Page 39: JDD2014: Multitenant Search - Pablo Barros