Couchbase usage at Symantec
About Me
Gaurav Chandna
At Symantec for 11 years.
Worked on AntiSpyware, behavioral malware, and Norton; now building platforms.
Previously an Engineering Manager
@xyxxy
Copyright © 2014 Symantec Corporation
Environment
• Service Oriented Architecture
• Message Oriented Middleware
• Multi-tenant
• Multiple data centers
Elasticsearch
• Document oriented search engine
• Based on Apache Lucene project
• JSON based
• Distributed
• Scale out
• Multi-tenancy
• RESTful
Elasticsearch
• Structured search
• Unstructured search
• Analytics
• Combine all of the above
Well… kind of
• Elasticsearch doesn’t recommend using it as a primary data store
  – Due to write coalescing
• Limited support for replication between data centers
  – Especially across unreliable networks
New pipeline
Message Broker → Service → Couchbase → Elasticsearch → Query & Visualization
Just one more thing…
• Documents are stored in Couchbase
• After X days, a document gets purged
• XDCR propagates that purge to Elasticsearch as well
• Forked the Couchbase – Elasticsearch connector to disable deletion
• Looking to contribute those changes back to Couchbase for community use
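A minimal sketch of the idea behind disabling deletion, not the actual connector code: replicated changes are applied to the search index, while deletions are dropped so that purged documents stay searchable. The `Mutation` type, the `apply_to_index` function, and the `propagate_deletes` flag are hypothetical names used only for illustration, and a plain dict stands in for the index.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mutation:
    """One replicated change (hypothetical shape, for illustration only)."""
    doc_id: str
    body: Optional[dict]  # None marks a deletion or expiry


def apply_to_index(index: dict, mutation: Mutation, propagate_deletes: bool = False) -> None:
    """Apply a replicated change to an in-memory stand-in for the search index."""
    if mutation.body is None:
        # Deletion: only remove from the index when deletes are propagated.
        if propagate_deletes:
            index.pop(mutation.doc_id, None)
        return
    index[mutation.doc_id] = mutation.body


index: dict = {}
apply_to_index(index, Mutation("event-1", {"type": "event", "message": "login"}))
apply_to_index(index, Mutation("event-1", None))  # purge in the primary store
print("event-1" in index)  # the document is still in the index
```

With `propagate_deletes=True` the same purge would also remove the document from the index, which is the default behavior the fork turned off.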
Relationships are everywhere
• Engineering
  – Gaurav: iPhone, Mac
  – John: PC, Android
You get information through queries
• Get all members of a particular group
• All groups ordered by number of people in the group
• List all members of a group, order by location
• List all groups whose users have an iPhone, ordered by total number of people
• And so on…
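As a rough illustration of what these queries compute, here is an in-memory Python sketch. It says nothing about how the queries ran against Couchbase. Gaurav, John, and their devices come from the relationships slide; the Sales group, Alice, and every location value are invented sample data, and all function names are hypothetical.

```python
from collections import Counter

# Sample data; the Sales group, Alice, and all locations are invented.
PEOPLE = [
    {"name": "Gaurav", "group": "Engineering", "location": "B", "devices": ["iPhone", "Mac"]},
    {"name": "John", "group": "Engineering", "location": "A", "devices": ["PC", "Android"]},
    {"name": "Alice", "group": "Sales", "location": "A", "devices": ["iPhone"]},
]


def members_of(group, order_by="name"):
    """All members of a group, ordered by the given field."""
    members = [p for p in PEOPLE if p["group"] == group]
    return [p["name"] for p in sorted(members, key=lambda p: p[order_by])]


def groups_by_size():
    """All groups with their head count, largest first."""
    return Counter(p["group"] for p in PEOPLE).most_common()


def groups_with_device(device):
    """Groups with at least one member who has the device, largest group first."""
    matching = {p["group"] for p in PEOPLE if device in p["devices"]}
    return [(group, size) for group, size in groups_by_size() if group in matching]


print(members_of("Engineering", order_by="location"))
print(groups_by_size())
print(groups_with_device("iPhone"))
```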
Engineering considerations
• Ability to add new types
  – No big changes should be needed when something like the iWatch comes out
• Ability to extend the types
• Add new relationships quickly
Table stakes
• Cross data center replication
• Performance
• Horizontal scaling
• Searching
• Querying capability
Initial implementation
{
  "type": "person",
  "name": "Gaurav",
  "modified": "2015-04-17T00:39:45.468Z",
  "_id": "79014cbb-69ce-4459-8f4a-6edaf4b8313a",
  // other information
}
{
  "type": "department",
  "name": "Engineering",
  "_id": "79014cbb-69ce-4459-8f4a-6edaf4b8313a",
  // other information
}
Initial version
• Store different objects with types as documents
• Create views in Couchbase
  – Type
  – Relationships between types
  – Specialize on special types
• Ensure persistence during operations
• Works well… to a point
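To make the view-based approach concrete, here is a small Python imitation of how a view works. Couchbase views are defined with JavaScript map functions, so this only mirrors the idea: a map function emits key/value rows per document, and the rows are kept sorted by key. The `department_id` field, the document ids, and all function names are assumptions for illustration; the slides do not show how relationships were stored.

```python
# Sample documents; ids and the department_id field are hypothetical.
DOCS = [
    {"_id": "dept-1", "type": "department", "name": "Engineering"},
    {"_id": "p-2", "type": "person", "name": "John", "department_id": "dept-1"},
    {"_id": "p-1", "type": "person", "name": "Gaurav", "department_id": "dept-1"},
]


def map_by_type(doc):
    """Emit (type, name) so documents can be listed per type."""
    yield (doc["type"], doc["name"]), None


def map_members(doc):
    """Emit (department_id, name) for each person that belongs to a department."""
    if doc["type"] == "person" and "department_id" in doc:
        yield (doc["department_id"], doc["name"]), None


def build_view(docs, map_fn):
    """Run the map function over every document and sort the rows by key."""
    rows = [(key, value, doc["_id"]) for doc in docs for key, value in map_fn(doc)]
    return sorted(rows, key=lambda row: row[0])


# "Get all members of a particular group" becomes a range over one key prefix.
members = [key[1] for key, _, _ in build_view(DOCS, map_members) if key[0] == "dept-1"]
print(members)
```

Each new relationship or special type needs its own map function, which is one way to read "works well… to a point".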
Second iteration
{
  "_id": "Well_crafted_id"
  // Other meta information
  // Other associated information
}
• Relationship information embedded in the _id
• Takes advantage of key access to document by id
• Can still build views on these documents
• Does require more hand crafting to get better performance
• Maintain flexibility
• Need to play around with different APIs to see which work best for use case
• More generic code
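One way to picture a "well crafted" id is sketched below. The slides do not show the real id format, so the `::` separator, the alternating type/name layout, and the helper names are all assumptions. The point is only that when the relationship is encoded in the key, a related document can be fetched by computing its id instead of querying a view; a dict stands in for the key-value store.

```python
SEPARATOR = "::"  # hypothetical delimiter


def make_id(*parts):
    """Join alternating type/name parts into one document id."""
    if len(parts) % 2:
        raise ValueError("expected alternating type/name parts")
    for part in parts:
        if SEPARATOR in part:
            raise ValueError(f"part may not contain {SEPARATOR!r}: {part!r}")
    return SEPARATOR.join(parts)


def parse_id(doc_id):
    """Split an id back into (type, name) pairs."""
    parts = doc_id.split(SEPARATOR)
    if len(parts) % 2:
        raise ValueError(f"malformed id: {doc_id!r}")
    return list(zip(parts[0::2], parts[1::2]))


# A dict stands in for the key-value store.
store = {
    make_id("department", "engineering"): {"name": "Engineering"},
    make_id("department", "engineering", "person", "gaurav"): {"name": "Gaurav"},
}

# Direct key access: no view needed when the relationship is known.
doc_id = make_id("department", "engineering", "person", "gaurav")
print(doc_id, store[doc_id])
print(parse_id(doc_id))
```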
Use Elasticsearch for search!
• Couchbase is primary data store
• Use Couchbase XDCR for one-direction replication
  – Like time series data
  – But don’t disable deletes
Message Queue → Service → Couchbase → Elasticsearch
But…
• Elasticsearch prefers types to be well defined for indexing
• Code is generic for extensibility
• The Couchbase – Elasticsearch plugin supports only one configuration
• Different services have different needs
• More modifications to the Elasticsearch – Couchbase connector
• Custom code to deduce types for compatibility with Elasticsearch
• Per-bucket configuration, so deletes can be disabled for time series data but not for relational data
• Keep dynamically defined types and the code generic
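A hedged sketch of what per-bucket settings and type deduction could look like. The slides do not describe the actual connector changes, so the bucket names, the settings layout, the `deduce_type` rule (use the document's own `type` field, otherwise a fallback), and the `generic` fallback name are all assumptions for illustration.

```python
# Hypothetical per-bucket settings: keep purged time series documents
# searchable, but let deletes flow through for relational data.
BUCKET_SETTINGS = {
    "timeseries": {"propagate_deletes": False},
    "relations": {"propagate_deletes": True},
}
DEFAULT_SETTINGS = {"propagate_deletes": True}


def settings_for(bucket):
    """Look up the settings for a bucket, falling back to the defaults."""
    return BUCKET_SETTINGS.get(bucket, DEFAULT_SETTINGS)


def deduce_type(doc, fallback="generic"):
    """Pick an index type for a document without hard-coding known types."""
    doc_type = doc.get("type")
    if isinstance(doc_type, str) and doc_type:
        return doc_type.lower()
    return fallback


print(settings_for("timeseries")["propagate_deletes"])
print(deduce_type({"type": "Person", "name": "Gaurav"}))
print(deduce_type({"name": "no type field"}))
```

Because the type comes from the document rather than from a fixed list, new types can appear without changing the replication code.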
Final thoughts
• Data modeling is extremely important
• JSON Parser – experiment for your use case
• Experiment with the different API options for your needs
• Don’t be afraid to customize for your needs!
• Polyglot persistence – Matt’s presentation
Thank you!
Copyright © 2014 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.
Gaurav [email protected]
Twitter: @xyxxy
We are hiring: www.symantec.com/careers