Amazing Speed: Elasticsearch for the .NET Developer - Adrian Carr, CodeStock 2015

Amazing Speed: Elasticsearch for the .NET Developer 4:05 PM / 301-D

Upload: adrian-carr

Post on 13-Aug-2015



Adrian Carr

With special guest: Shaunak Kashyap, Developer Advocate, Elastic.co

A little about me…

• Software Engineer & Team Lead
o The White Stone Group in Knoxville, TN.

• Previously
o Jewelry Television
o Alltel/Fidelity
o One terrible non-profit in Boone, NC that I don’t like to talk about.

• Several small startups

Experience:
• Full stack developer, C#, Ruby/Rails, JavaScript, Java, etc.

“Business Problem Solver”

A little background…

• We have some large clients, with lots of patients.

Really slow sometimes…

• Some legacy database decisions that weren’t so good back in the day.

• Worked just fine when volume was lower.
• Re-architecting would mean changing too many apps.
• Maybe you just have too much data to search efficiently, or it isn’t structured for search.

I’ve seen this before…

• Jewelry Television
• At the time, a lot of the database design wasn’t very good, but it worked.
o ~$200 million when I started in 2005
o ~$550 million three years later

The JTV Solution:


• It was awesome!

Business Problem Solver…

A little research…

I found Elasticsearch, built on top of Lucene.

I found:

http://blog.wikimedia.org/2014/01/06/wikimedia-moving-to-elasticsearch/

Elasticsearch is used here:

What is Elasticsearch?

Elasticsearch is a distributed, open source search and analytics engine, designed for horizontal scalability, reliability, and easy management.

So, I did a proof of concept

Demo here: results in SQL vs. results in Elasticsearch

Terminology

• Node: DB instance (a Java process running Elasticsearch)
• Cluster: Database cluster. One or more nodes with the same cluster name
• Index: Database, a logical grouping of tables
• Type: Database table
• Document: Like a row. JSON, key-value pairs
• Fields: Columns
• Shard: A slice of an index that does the actual work; each shard is its own Lucene index. (Elasticsearch mostly handles these automagically.)
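To make the database analogy concrete, here is a minimal sketch of how index, type, and document show up in a REST request. The names `clinic`, `patient`, and the sample fields are hypothetical, picked only for illustration. It assumes the index/type/id URL layout Elasticsearch used at the time of this talk; later major versions removed types.

```python
import json


def document_path(index, doc_type, doc_id):
    """Build the REST path for one document: /{index}/{type}/{id}."""
    return "/{}/{}/{}".format(index, doc_type, doc_id)


# Hypothetical names, chosen only for illustration.
path = document_path("clinic", "patient", 42)

# A document is just JSON: the keys play the role of columns (fields).
document = {"first_name": "Ada", "last_name": "Lovelace", "city": "Knoxville"}
body = json.dumps(document, sort_keys=True)

# This is what you would send to a node to store the document.
print("PUT " + path)
print(body)
```

The index works like the database name, the type like the table, and the id like a primary key. Nothing has to be declared up front for a first experiment.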

Tools

• Elasticsearch (obviously)
o Demo of how to install and run. (It’s easy.)

• Browser http://localhost:9200/, etc.

• Sense: a fantastic Chrome plugin

• C# with Nest <Multiple Demos of Indexing and Searching>
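The live demos are not part of this transcript, so as a rough stand-in here is a sketch of the kind of request you would type into Sense for a basic full-text search. The index, type, and field names are hypothetical. The body is a standard `match` query, but this is not the code shown in the talk.

```python
import json


def match_search(index, doc_type, field, text, size=10):
    """Return (request line, JSON body) for a basic full-text match query."""
    request_line = "GET /{}/{}/_search".format(index, doc_type)
    body = {"size": size, "query": {"match": {field: text}}}
    return request_line, json.dumps(body, indent=2)


# Hypothetical index, type, and field names.
request_line, body = match_search("clinic", "patient", "last_name", "lovelace")

# Paste the two parts into Sense, or send them to http://localhost:9200.
print(request_line)
print(body)
```

A client library such as Nest ultimately produces a request like this one, so reading the raw JSON is a good way to debug what the C# code is doing.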

How to Load & Synchronize Data?

• It depends…

This is what I did:
• SQL Server triggers on data that matters
• Windows Service
o Nest, C#, Elasticsearch’s Bulk API (found 10,000 rows at a time to be the sweet spot for insertion speed)
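The batching idea can be sketched without a running cluster. The example below is an illustration, not the Windows Service from the talk: it splits rows into batches of 10,000 and builds the newline-delimited JSON body that the Bulk API (`POST /_bulk`) expects. The row shape and the `clinic` and `patient` names are hypothetical, and `_type` applies only to Elasticsearch versions that still have types.

```python
import json


def batches(rows, size=10000):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def bulk_body(rows, index, doc_type, id_field="id"):
    """Build the newline-delimited JSON body for POST /_bulk."""
    lines = []
    for row in rows:
        # Each document takes two lines: an action line, then the source.
        action = {"index": {"_index": index, "_type": doc_type, "_id": row[id_field]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows standing in for what the SQL Server triggers flagged.
rows = [{"id": n, "last_name": "Patient{}".format(n)} for n in range(25000)]
payloads = [bulk_body(batch, "clinic", "patient") for batch in batches(rows)]
print(len(payloads), "bulk requests")
```

The best batch size depends on document size and hardware, so treat 10,000 as a starting point to measure from, not a rule.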

Gotchas

• Elasticsearch.Net vs Nest
• Documentation on Elasticsearch is fantastic. Documentation on Nest is very sparse.

More Gotchas

• It’s a different way of thinking.
o Not an RDBMS
o Joins? Nope
• Case sensitivity
• Index size: disk space can grow a lot
• Integration with existing applications. Are your existing users going to have their cheese moved?
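Since there are no joins, a common workaround is to denormalize: fold the related rows into one document before indexing it. The sketch below assumes two hypothetical tables, patients and visits, and is only one way to do it. It is not necessarily what was done in the project described here.

```python
import json
from collections import defaultdict


def denormalize(patients, visits):
    """Fold each patient's visit rows into one self-contained document."""
    visits_by_patient = defaultdict(list)
    for visit in visits:
        visits_by_patient[visit["patient_id"]].append(
            {"date": visit["date"], "reason": visit["reason"]}
        )
    # Patients without visits get an empty list.
    return [
        dict(patient, visits=visits_by_patient[patient["id"]])
        for patient in patients
    ]


# Hypothetical rows, as they might come out of two SQL tables.
patients = [{"id": 1, "last_name": "Lovelace"}, {"id": 2, "last_name": "Hopper"}]
visits = [
    {"patient_id": 1, "date": "2015-07-01", "reason": "checkup"},
    {"patient_id": 1, "date": "2015-07-09", "reason": "follow-up"},
]
documents = denormalize(patients, visits)
print(json.dumps(documents, indent=2))
```

The trade-off is duplicated data and a bigger index, which ties in with the disk space gotcha above.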

There is much more to know…

• Analyzers: these can be very complex
• Scaling: Elastic has a lot built in, but it can be tuned quite a bit.
• Field weighting: relevance of blog post vs body vs comments
• This is just a start.
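As a small taste of field weighting, here is a sketch of a `multi_match` query where a hit in one field counts more than a hit in another. The field names `title`, `body`, and `comments` and the boost values are assumptions for illustration. The `field^boost` syntax is the standard way to weight a field in the query DSL.

```python
import json


def weighted_search(text, weights):
    """Build a multi_match query; `field^boost` is how a field is weighted."""
    fields = ["{}^{}".format(field, boost) for field, boost in weights]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# Hypothetical fields and boosts: title hits count most, comments least.
query = weighted_search(
    "elasticsearch", [("title", 3), ("body", 1), ("comments", 0.5)]
)
print(json.dumps(query, indent=2))
```

Boost values are relative, so the right numbers usually come from trying real searches and looking at what ranks first.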

Questions?

• Code samples will eventually be at http://adriancarr.com
• Contact:

o [email protected]

o [email protected]

o https://discuss.elastic.co/

Thank You!