![Page 1: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/1.jpg)
All The Things We Didn’t Do
Kresten Krab Thorup, Humio CTO
![Page 2: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/2.jpg)
A Tale in Three Parts
• Part 1: About Logging and Metrics Tools
• Part 2: Product Team Practices
• Part 3: Careful Engineering (Data Processing Engine)
![Page 3: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/3.jpg)
Log Analytics: And Why You Should Care
Part 1
![Page 4: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/4.jpg)
![Page 5: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/5.jpg)
![Page 6: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/6.jpg)
Record Logs, Monitor & Respond
Log Aggregation & Analytics Engine
Metrics/Monitoring: Dashboards/Alerts
Incident response: Log Search, Drill-down
![Page 7: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/7.jpg)
Dimensions in Tooling
Logs vs. Metrics
Historic vs. Real-Time
Cloud vs. On-Prem
Schema vs. Ad-Hoc
![Page 8: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/8.jpg)
Logs vs. Metrics
Logs are events; metrics are aggregates of events
Logs have high dimensionality; metrics have low dimensionality
Logs tend to be unstructured; metrics are structured
Logs support drill-down and analysis; metrics lean towards dashboards and alerting
Logs vary in volume; metrics have a fixed volume rate
Logs tend to be high volume; metrics tend to be low volume
![Page 9: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/9.jpg)
![Page 10: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/10.jpg)
Historic vs Real Time
Real-time processing lets you generate alerts and dashboards
Historic processing is great for incident response and audits
Real-time addresses known issues to look out for
Historic searches let you look for unknown issues
Real-time needs only CPU processing
Historic data may require a lot of disk storage
![Page 11: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/11.jpg)
![Page 12: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/12.jpg)
Cloud vs On-Premises
Cloud-based systems may have privacy and security concerns
On-premises deployments are often required in health-care and banking applications
With cloud systems you can pay-as-you-go
On-prem systems require dedicated hardware
With a cloud solution you don’t need to manage it
An on-prem solution requires you to consider ease-of-operations
![Page 13: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/13.jpg)
![Page 14: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/14.jpg)
Schema vs Ad-Hoc based Search
Schema-based systems address known issues to look out for
With ad-hoc searching, you can dig into new, unknown issues
Setting up schemas is often a job for the DBA or administrator
Everyone can use free-text search and learn things about the system
schema ≠ index, but they often go hand in hand
Keeping indexes around increases disk-storage requirements
Lack of indexes slows down searching
![Page 15: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/15.jpg)
[Figure: trade-off chart with axes “effort / query” and “effort / insert”]
![Page 16: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/16.jpg)
![Page 17: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/17.jpg)
Log Analytics Sweet Spot
• Record everything: TBs of data per day
• Generate metrics from the logs in real time
• Interactive/ad-hoc search on historic data: hundreds of TB
• Can be installed on-premises (privacy / security)
• Affordable: TCO (hardware, license, operations)
![Page 18: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/18.jpg)
![Page 19: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/19.jpg)
Record Events, Monitor & Respond
Humio
Metrics/Monitoring: Dashboards/Alerts
Incident response: Log Search, Drill-down
![Page 20: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/20.jpg)
Humio—Product Team Practices
Part 2
![Page 21: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/21.jpg)
Be The Customer
• Design target was an on-premises solution
• Co-locate with first customer
• Provide a hosted service ⇒ “eat our own dog food”
![Page 22: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/22.jpg)
Safe Environment
• “It takes all kinds”
• Be open about strengths and weaknesses
• Be open to learn (and teach) new practices
• Experienced team initially to set practices and culture
![Page 23: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/23.jpg)
Be in doubt!
• Discuss trade offs — not do’s and don’ts
• Leave time to wonder
• No one knows “what’s best”
![Page 24: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/24.jpg)
High BUS factor
• We depend on people. Period.
• Don’t try to make them replaceable
• Everyone is responsible
![Page 25: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/25.jpg)
Choosing Scala
• I ❤ Erlang
• Knowing what Erlang can do for you, coordination code is painful to write and manage in Scala (threadpools, futures, async).
• Use “scala, the good parts”.
![Page 26: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/26.jpg)
Choosing Elm
• Elm is similar to React (functional JavaScript), but with proper syntax and static type checking.
• Tooling and libraries are less mature.
• Takes time for new devs to learn
• Upside is that it is “cool” — we give talks and contribute to the community.
![Page 27: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/27.jpg)
Take small steps — but look up!
• Running a SaaS with frequent deployments teaches you to take small steps.
• Define design goals and discuss tradeoffs. Keep those in mind and work towards them.
• Avoid long-running side projects. Feature-flag new work.
![Page 28: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/28.jpg)
Manage critical dependencies
• Own all critical components
• It is tempting (and easy) to pull in 200+ Apache libraries
• We use Docker for delivery (reduces the customer’s dependencies)
• Two outside dependencies: HighCharts and Kafka
![Page 29: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/29.jpg)
Don’t waste hardware
“The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer hardware industry.”
Henry Petroski
![Page 30: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/30.jpg)
Humio—Data Processing Engine
Part 3
![Page 31: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/31.jpg)
Record Events, Monitor & Respond
Humio
Metrics/Monitoring: Dashboards/Alerts
Incident response: Log Search, Drill-down
![Page 32: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/32.jpg)
State Machine
Event Store
Query: /error/i | count()
State Machine
count: 473
count: 243,565
![Page 33: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/33.jpg)
Query Language State Machine
filter … | aggregate()
event: Map[String,String]
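The `filter … | aggregate()` shape can be sketched in a few lines of Python. This is only an illustration, not Humio's implementation: events are modelled as string-to-string dicts (standing in for `Map[String,String]`), and the `message` field and the function names `regex_filter` and `run_count_query` are hypothetical.

```python
import re
from typing import Callable, Dict, Iterable

Event = Dict[str, str]  # stands in for Map[String,String]


def regex_filter(pattern: str, field: str = "message") -> Callable[[Event], bool]:
    """Build a predicate keeping events whose `field` matches `pattern`, ignoring case."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return lambda event: compiled.search(event.get(field, "")) is not None


def run_count_query(events: Iterable[Event], keep: Callable[[Event], bool]) -> int:
    """Stream events through the filter and fold the survivors into a count."""
    count = 0
    for event in events:
        if keep(event):
            count += 1
    return count


events = [
    {"host": "web1", "message": "GET /index.html 200"},
    {"host": "web1", "message": "Error: upstream timed out"},
    {"host": "web2", "message": "disk ERROR on /dev/sda"},
]
# Roughly the query `/error/i | count()` from the earlier slide.
error_count = run_count_query(events, regex_filter("error"))
```

The aggregate only ever sees one event at a time, which is what lets the same query run over a live stream or over stored data.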
![Page 34: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/34.jpg)
Aggregates State Machine

| Function | State | Step | Merge | Result |
|---|---|---|---|---|
| count | N | N+1 | N1+N2 | N |
| avg | (N, s) | (N+1, s+value) | (N1+N2, s1+s2) | s/N |
| stddev | (N, s, q) | (N+1, s+value, q+value²) | (N1+N2, s1+s2, q1+q2) | √(N·q − s²)/N |
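As a minimal sketch, the State / Step / Merge / Result columns translate directly into small immutable classes. The class names, the `fold` helper and the sample values are assumptions made for illustration; the formulas are the ones from the slide, with stddev being the population standard deviation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Count:
    n: int = 0

    def step(self, value: float) -> "Count":
        return Count(self.n + 1)

    def merge(self, other: "Count") -> "Count":
        return Count(self.n + other.n)

    def result(self) -> int:
        return self.n


@dataclass(frozen=True)
class Avg:
    n: int = 0
    s: float = 0.0  # running sum

    def step(self, value: float) -> "Avg":
        return Avg(self.n + 1, self.s + value)

    def merge(self, other: "Avg") -> "Avg":
        return Avg(self.n + other.n, self.s + other.s)

    def result(self) -> float:
        return self.s / self.n  # undefined (raises) for an empty state


@dataclass(frozen=True)
class StdDev:
    n: int = 0
    s: float = 0.0  # running sum
    q: float = 0.0  # running sum of squares

    def step(self, value: float) -> "StdDev":
        return StdDev(self.n + 1, self.s + value, self.q + value * value)

    def merge(self, other: "StdDev") -> "StdDev":
        return StdDev(self.n + other.n, self.s + other.s, self.q + other.q)

    def result(self) -> float:
        # sqrt(N*q - s^2) / N; clamp at zero to absorb rounding error.
        return math.sqrt(max(0.0, self.n * self.q - self.s ** 2)) / self.n


def fold(state, values):
    """Apply `step` once per value, starting from `state`."""
    for value in values:
        state = state.step(value)
    return state


# Two partial states, e.g. computed on different segments, merged afterwards.
left = fold(StdDev(), [2, 4, 4, 4])
right = fold(StdDev(), [5, 5, 7, 9])
merged = left.merge(right)
```

Because Merge is associative and commutative, partial states can be computed independently and combined in any order.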
![Page 35: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/35.jpg)
GroupBy(host, function=count())
State: Map[String,State2]
Step(G,e): key = e[“host”]; G[key] = Step2(G[key], e)
Merge(G1,G2): ∀key in G1,G2 => result[key] = Merge2(G1[key], G2[key])
Result(G): ∀key in G => result[key] = Result2(G[key])
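A group-by can be written as a wrapper that lifts any inner aggregate (the `State2` / `Step2` / `Merge2` / `Result2` of the slide) to a per-key map. The sketch below is one possible reading, with hypothetical names (`Aggregate`, `COUNT`, `group_by`); missing keys are assumed to start from the inner aggregate's empty state.

```python
from typing import Any, Callable, Dict, NamedTuple

Event = Dict[str, str]


class Aggregate(NamedTuple):
    zero: Callable[[], Any]            # fresh empty state
    step: Callable[[Any, Event], Any]  # state x event -> state
    merge: Callable[[Any, Any], Any]   # state x state -> state
    result: Callable[[Any], Any]       # state -> final value


COUNT = Aggregate(
    zero=lambda: 0,
    step=lambda n, _event: n + 1,
    merge=lambda n1, n2: n1 + n2,
    result=lambda n: n,
)


def group_by(field: str, inner: Aggregate) -> Aggregate:
    """Lift `inner` to a map from the value of `field` to an inner state."""

    def step(groups: dict, event: Event) -> dict:
        key = event.get(field, "")
        # Updates the map in place; fine for a single-threaded sketch.
        groups[key] = inner.step(groups.get(key, inner.zero()), event)
        return groups

    def merge(g1: dict, g2: dict) -> dict:
        return {
            key: inner.merge(g1.get(key, inner.zero()), g2.get(key, inner.zero()))
            for key in g1.keys() | g2.keys()
        }

    def result(groups: dict) -> dict:
        return {key: inner.result(state) for key, state in groups.items()}

    return Aggregate(zero=dict, step=step, merge=merge, result=result)


by_host = group_by("host", COUNT)
part1 = by_host.zero()
for event in [{"host": "a"}, {"host": "b"}, {"host": "a"}]:
    part1 = by_host.step(part1, event)
part2 = by_host.step(by_host.zero(), {"host": "b"})
totals = by_host.result(by_host.merge(part1, part2))
```

Since `group_by` returns an `Aggregate` itself, it can be nested or merged across partial results like any other aggregate.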
![Page 36: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/36.jpg)
Time Boxing
groupby( time − time % bucket_size )
[Figure: events along a time axis, counted per time bucket]
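The bucketing expression `time − time % bucket_size` maps every timestamp to the start of its bucket, so time boxing is just a group-by on that derived key. A small sketch, assuming integer timestamps in seconds and hypothetical helper names:

```python
from collections import Counter
from typing import Dict, Iterable


def bucket_start(timestamp: int, bucket_size: int) -> int:
    """Round a timestamp down to the start of its bucket."""
    return timestamp - timestamp % bucket_size


def count_per_bucket(timestamps: Iterable[int], bucket_size: int) -> Dict[int, int]:
    """Count events per time bucket, keyed by bucket start."""
    return dict(Counter(bucket_start(t, bucket_size) for t in timestamps))


# Six events, one-minute buckets.
per_minute = count_per_bucket([0, 5, 59, 60, 61, 125], bucket_size=60)
```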
![Page 37: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/37.jpg)
Query Language State Machine
filter … | aggregate()
![Page 38: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/38.jpg)
Event Store Design
• Build minimal index and compress data
Store an order of magnitude more events
• Fast “grep” for filtering events
Filtering and time/metadata selection reduce the problem space
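To make the idea concrete, here is a rough sketch of a minimal index: each segment carries only a time range and a set of metadata tags, and a query first discards segments that cannot match before scanning what is left. The `Segment` layout, tag strings and function names are assumptions for illustration, and segment contents are kept as plain lines rather than compressed data.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Segment:
    start_time: int
    end_time: int
    tags: FrozenSet[str]
    lines: Tuple[str, ...]  # stand-in for the compressed segment body


def select_segments(index: List[Segment], query_start: int, query_end: int,
                    required_tags: FrozenSet[str]) -> List[Segment]:
    """Keep segments whose time range overlaps the query and that carry all required tags."""
    return [
        seg for seg in index
        if seg.end_time >= query_start
        and seg.start_time <= query_end
        and required_tags <= seg.tags
    ]


def grep(segments: List[Segment], needle: str) -> List[str]:
    """Brute-force scan of the selected segments only."""
    return [line for seg in segments for line in seg.lines if needle in line]


index = [
    Segment(0, 99, frozenset({"#ds1", "#web"}), ("GET / 200", "GET /x 500 error")),
    Segment(100, 199, frozenset({"#ds1", "#web"}), ("GET /y 500 error",)),
    Segment(100, 199, frozenset({"#ds1", "#app"}), ("worker error",)),
]
selected = select_segments(index, 100, 150, frozenset({"#web"}))
hits = grep(selected, "error")
```

The index stays tiny because it holds one small record per segment, not one entry per event.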
![Page 39: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/39.jpg)
Event Store
10GB (start-time, end-time, metadata)
10GB (start-time, end-time, metadata)
10GB (start-time, end-time, metadata)
10GB (start-time, end-time, metadata)
. . .
![Page 40: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/40.jpg)
Event Store
1GB (start-time, end-time, metadata)
1GB (start-time, end-time, metadata)
1GB (start-time, end-time, metadata)
1GB (start-time, end-time, metadata)
. . .
compress
1 month x 30GB/day ingest ⇒ 90GB data, <1MB index
1 month x 1TB/day ingest ⇒ 4TB data, <1MB index
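The compression ratios implied by these two figures can be worked out directly. The sketch assumes a 30-day month and decimal units (1TB = 1000GB); neither is stated on the slide.

```python
DAYS_PER_MONTH = 30  # assumption: the slide only says "1 month"


def compression_ratio(ingest_gb_per_day: float, stored_gb: float,
                      days: int = DAYS_PER_MONTH) -> float:
    """Raw bytes ingested over the period divided by bytes kept on disk."""
    return ingest_gb_per_day * days / stored_gb


small = compression_ratio(30, 90)      # 900GB raw kept as 90GB
large = compression_ratio(1000, 4000)  # 30TB raw kept as 4TB
```

Under those assumptions the first example corresponds to roughly 10x compression and the second to roughly 7.5x.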
![Page 41: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/41.jpg)
Query
[Figure: 1GB segments laid out along a time axis, in rows by metadata (#ds1, #web; #ds1, #app; #ds2, #web)]
![Page 42: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/42.jpg)
Query
[Figure: the same grid of 1GB segments by time and metadata (#ds1, #web; #ds1, #app; #ds2, #web), with labels “10GB” and “State Machine”]
![Page 43: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/43.jpg)
Filter 1GB data
![Page 44: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/44.jpg)
![Page 45: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/45.jpg)
![Page 46: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/46.jpg)
![Page 47: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/47.jpg)
![Page 48: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/48.jpg)
Brute Force: Grep at 30x
• Streaming disk access, use async file I/O
• Compress data at rest (and in OS-level cache)
• Run one JVM per NUMA node
• Critical search code is sticky: one thread per core
• Reduce context switching (explicit scheduling)
• Localize data access (each core works on 64k chunks)
Go and find videos and blog posts about “Mechanical Sympathy” (Martin Thompson, LMAX) and “Why KDB+ is fast”
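Two of these ideas, keeping data compressed at rest and scanning it in small localized chunks, can be illustrated in Python with the standard `zlib` and `re` modules. This is a toy sketch, not the engine described in the talk: it ignores NUMA placement, thread pinning and async I/O, and the function names and block-packing scheme are assumptions.

```python
import re
import zlib
from typing import Iterable, Iterator, List

BLOCK_SIZE = 64 * 1024  # mirrors the 64k chunk size mentioned on the slide


def pack_blocks(lines: Iterable[str], block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Pack single-line records into independently compressed blocks.

    Blocks end on line boundaries and hold at most `block_size` raw bytes,
    unless a single line is itself larger than that.
    """
    blocks, current, current_size = [], [], 0
    for line in lines:
        encoded = line.encode("utf-8") + b"\n"
        if current and current_size + len(encoded) > block_size:
            blocks.append(zlib.compress(b"".join(current)))
            current, current_size = [], 0
        current.append(encoded)
        current_size += len(encoded)
    if current:
        blocks.append(zlib.compress(b"".join(current)))
    return blocks


def grep_blocks(blocks: Iterable[bytes], pattern: str) -> Iterator[str]:
    """Decompress one block at a time and yield the lines matching `pattern`."""
    compiled = re.compile(pattern.encode("utf-8"), re.IGNORECASE)
    for block in blocks:
        for line in zlib.decompress(block).splitlines():
            if compiled.search(line):
                yield line.decode("utf-8")


log_lines = [f"request {i} ok" for i in range(5000)] + ["request 5000 ERROR timeout"]
blocks = pack_blocks(log_lines, block_size=1024)
hits = list(grep_blocks(blocks, "error"))
```

Because each block is compressed on its own, only a small working set is decompressed at any moment and blocks could be handed to separate workers.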
![Page 49: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/49.jpg)
Event Processing
• “Materialized views” for relevant metrics.
• Processed when data is in-memory anyway.
• Fast response times for “known” queries.
Brute-Force Search
• Shift CPU load to query time
• Data compression
• Allows ad-hoc queries
• Requires “Full stack” ownership
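The contrast between the two approaches can be sketched with a toy store that does both: registered “known” queries are updated at ingest time while the event is in memory anyway, and anything else is answered by scanning stored events at query time. The class `TinyLogStore`, its methods and the `message` field are hypothetical names used only for this illustration.

```python
import re
from typing import Dict, List

Event = Dict[str, str]


class TinyLogStore:
    def __init__(self) -> None:
        self.events: List[Event] = []
        self.live_counts: Dict[str, int] = {}  # the "materialized views"
        self._live_patterns: Dict[str, "re.Pattern[str]"] = {}

    def register_live_count(self, name: str, pattern: str) -> None:
        """Register a known query; it only counts events ingested from now on."""
        self._live_patterns[name] = re.compile(pattern, re.IGNORECASE)
        self.live_counts[name] = 0

    def ingest(self, event: Event) -> None:
        # Event processing: update every live view at insert time.
        message = event.get("message", "")
        for name, compiled in self._live_patterns.items():
            if compiled.search(message):
                self.live_counts[name] += 1
        self.events.append(event)

    def adhoc_count(self, pattern: str) -> int:
        # Brute-force search: pay the CPU cost at query time instead.
        compiled = re.compile(pattern, re.IGNORECASE)
        return sum(1 for e in self.events if compiled.search(e.get("message", "")))


store = TinyLogStore()
store.register_live_count("errors", "error")
for message in ["GET / 200", "ERROR: timeout talking to db", "error: disk full"]:
    store.ingest({"message": message})

known = store.live_counts["errors"]   # answered without scanning
adhoc = store.adhoc_count("timeout")  # answered by scanning stored events
```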
![Page 50: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/50.jpg)
[Figure: trade-off chart with axes “effort / query” and “effort / insert”]
![Page 51: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/51.jpg)
Log Analytics
• Part 1: Logging / Metrics Landscape
• Part 2: Product Team Practices & User Engagement
• Part 3: Careful Engineering
![Page 52: All The Things We Didn’t Do · Schema vs Ad-Hoc based Search Schema-based systems addresses known issues to look out for With ad-hoc searching, you can dig into new, unknown issues](https://reader034.vdocument.in/reader034/viewer/2022050606/5fadd45cf03b011a073967eb/html5/thumbnails/52.jpg)
Thanks for your time.
Kresten Krab Thorup, Humio CTO