Amazing Speed: Elasticsearch for the .NET Developer - Adrian Carr, CodeStock 2015
A little about me…
• Software Engineer & Team Lead
o The White Stone Group in Knoxville, TN.
• Previously
o Jewelry Television
o Alltel/Fidelity
o One terrible non-profit in Boone, NC that I don’t like to talk about.
• Several small startups
Experience:
• Full stack developer: C#, Ruby/Rails, JavaScript, Java, etc.
“Business Problem Solver”
Really slow sometimes…
• Some legacy database decisions that weren’t so good back in the day.
• Worked just fine when volume was lower.
• Re-architecting would mean changing too many apps.
• Maybe you just have too much data to search efficiently, or it isn’t structured for search.
I’ve seen this before…
• Jewelry Television
• At the time, a lot of the database design wasn’t very good, but it worked.
o ~$200 million when I started in 2005
o ~$550 million three years later
I found:
Elasticsearch is used here:
http://blog.wikimedia.org/2014/01/06/wikimedia-moving-to-elasticsearch/
And here:
https://www.elastic.co/assets/bltdfb74654a34935fa/case-study-github.pdf
And here:
http://meta.stackexchange.com/questions/160100/a-new-search-engine-for-stack-exchange
What is Elasticsearch?
Elasticsearch is a distributed, open source search and analytics engine, designed for horizontal scalability, reliability, and easy management.
Terminology
• Node: DB instance. (Java process running Elasticsearch)
• Cluster: Database cluster. One or more nodes with the same cluster name
• Index: Database, logical grouping of tables
• Type: Database table
• Document: Like a row. JSON, key-value pairs
• Fields: Columns
• Shard: A slice of an index’s data, spread across nodes; think of shards as the worker processes. (Elasticsearch mostly handles these automagically.)
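To make the index / type / document mapping concrete, here is a minimal sketch of how one document is addressed over REST in the Elasticsearch versions of this era: /{index}/{type}/{id}. The names "store", "product", and the sample fields are hypothetical, and the helper document_url is made up for illustration. Later Elasticsearch versions removed types, so treat this as a sketch for the terminology above, not current API guidance.

```python
from urllib.parse import quote

def document_url(index, doc_type, doc_id, host="http://localhost:9200"):
    """Build the REST URL for one document: /{index}/{type}/{id}."""
    parts = [quote(str(p), safe="") for p in (index, doc_type, doc_id)]
    return host + "/" + "/".join(parts)

# Hypothetical names: index "store" (database), type "product" (table), id 42 (row key).
url = document_url("store", "product", 42)

# The document itself is plain JSON; its keys are the fields ("columns").
document = {"name": "Sapphire ring", "price": 199.0}
```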
Tools
• Elasticsearch (Obviously)
o Demo of how to install and run. (It’s easy.)
• Browser: http://localhost:9200/, etc.
• Sense: a fantastic Chrome plugin
• C# with Nest (multiple demos of indexing and searching)
How to Load & Synchronize Data?
• It depends…
This is what I did:
• SQL Server triggers on data that matters
• Windows Service
o Nest, C#, Elastic’s Bulk API (I found 10,000 rows at a time to be the sweet spot for insertion speed)
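The batching idea can be sketched without any client library. The Bulk API takes a newline-delimited body: one action line, then one source line, per document, with a trailing newline at the end. The sketch below is in Python rather than the C#/Nest used in the talk, and the index name "store", type name "product", and row shape are all assumptions. The 10,000 batch size is the figure from the slide.

```python
import json

def chunked(rows, size=10_000):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

def bulk_body(rows, index, doc_type):
    """Build one newline-delimited Bulk API body from a batch of row dicts."""
    lines = []
    for row in rows:
        # Action line: which index/type/id the next line should be indexed into.
        action = {"index": {"_index": index, "_type": doc_type, "_id": row["id"]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(row))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical data: 25,000 changed rows picked up by the sync service.
rows = [{"id": i, "name": f"item {i}"} for i in range(25_000)]
batches = [bulk_body(batch, "store", "product") for batch in chunked(rows)]
```

Each string in batches would be sent as the body of one POST to the _bulk endpoint. The right batch size depends on document size and hardware, so it is worth measuring rather than assuming.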
Gotchas
• Elasticsearch.net vs Nest
• Documentation on Elasticsearch is fantastic. Documentation on Nest is very sparse.
More Gotchas
• It’s a different way of thinking.
o Not RDBMS
o Joins? Nope
• Case sensitivity
• Index size: disk space can grow a lot
• Integration with existing application. Are your existing users going to have their cheese moved?
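One common way case sensitivity bites (an assumption about what the slide refers to): the default analyzer lowercases text at index time, while an exact "term" style lookup does not analyze the search input, so searching for "Sapphire" finds nothing even though the document contains it. The toy inverted index below is a simplified stand-in, not Elasticsearch's actual implementation, and all names in it are made up for illustration.

```python
import re

def analyze(text):
    """Toy stand-in for an analyzer: split on non-alphanumerics, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

def build_index(docs):
    """Map each analyzed token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_search(index, term):
    """Exact lookup, like a term query: the input is NOT analyzed."""
    return index.get(term, set())

def match_search(index, text):
    """Like a match query: the input goes through the same analyzer first."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits

docs = {1: "Sapphire Ring", 2: "Gold necklace"}
index = build_index(docs)

no_hits = term_search(index, "Sapphire")    # exact term with a capital letter
term_hits = term_search(index, "sapphire")  # exact term, already lowercased
match_hits = match_search(index, "Sapphire")  # analyzed before lookup
```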
There is much more to know…
• Analyzers: These can be very complex
• Scaling: Elastic has a lot built in, but it can be tuned quite a bit.
• Field weighting: Relevance of blog post vs body vs comments
• This is just a start.
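Field weighting can be sketched as a multi_match query in which each field name carries a ^boost suffix. The field names "title", "body", and "comments" and the weights below are assumptions chosen to mirror the slide, and weighted_query is a hypothetical helper. Good weights have to be tuned against real searches.

```python
import json

def weighted_query(text, boosts):
    """Build a multi_match query body; `boosts` maps field name -> weight."""
    fields = [f"{field}^{weight}" for field, weight in boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}

# Hypothetical weighting: a hit in the title counts more than one in the
# body, and comments count least.
body = weighted_query("sapphire ring", {"title": 3, "body": 1, "comments": 0.5})
request_json = json.dumps(body, indent=2)
```

The resulting JSON is what would be posted to an index's _search endpoint, for example from Sense.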
Questions?
• Code samples will eventually be at http://adriancarr.com
• Contact:
o https://discuss.elastic.co/
Thank You!