Fluentd and AWS at Classmethod
Presented at http://connpass.com/event/5222/
Mar 21, 2014
www.treasuredata.com/
Fluentd & AWS!
Masahiro Nakagawa, Treasure Data, Inc
Who are you?
• Masahiro Nakagawa
• @repeatedly
• Treasure Data, Inc
• Senior Software Engineer
• Fluentd, td-agent, etc...
• Dlang, MessagePack, ...
Treasure Data on AWS
Frontend
Queue
Worker
Hadoop
Fluentd
Applications push metrics to Fluentd (via local Fluentd)
Librato Metrics for realtime analysis
Treasure Data for historical analysis
Fluentd sums up data per minute (partial aggregation)
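The partial aggregation step can be illustrated with a minimal sketch. This is not Fluentd code; the `(unix_time, name, value)` event layout and the function name `aggregate_by_minute` are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_by_minute(events):
    """Sum metric values per (metric name, minute) bucket.

    `events` is an iterable of (unix_time, name, value) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    sums = defaultdict(float)
    for unix_time, name, value in events:
        minute_start = int(unix_time) // 60 * 60  # truncate to the minute
        sums[(name, minute_start)] += value
    return dict(sums)


events = [
    (1395360001, "api.requests", 1),
    (1395360030, "api.requests", 2),
    (1395360065, "api.requests", 4),
]
partial = aggregate_by_minute(events)
```

Here the first two events fall into the same minute and are summed before being sent on, so the downstream service receives two data points instead of three.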
Backend overview
Impala
Presto
Hadoop
Used AWS products
• RDS
  • Store service data
  • Queue / Scheduler
• S3
  • Columnar storage
• EC2
  • Clusters: Hadoop, Workers, APIs, etc…
Separate Storage and Processor!
Classmethod use case!
Fluentd (Treasure Agent)
Structured logging
Reliable forwarding
Pluggable architecture
http://fluentd.org/
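As an illustration of structured logging, a Fluentd event consists of a tag, a time, and a record of key-value pairs. The sketch below only shows that shape; the helper `to_event`, the tag, and the field names are hypothetical.

```python
import json


def to_event(tag, unix_time, record):
    """Build an event as [tag, time, record]; the record is a plain dict."""
    return [tag, unix_time, dict(record)]


# Hypothetical access-log entry expressed as structured data
event = to_event(
    "apache.access",
    1395360000,
    {"method": "GET", "path": "/", "code": 200},
)
serialized = json.dumps(event)
```

Because the record is already key-value data, downstream systems can filter or aggregate on fields such as `code` without parsing text lines again.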
Collect Store Process Visualize
Data source
Reporting
Monitoring
Data Processing
Related Products
Store Process
Cloudera
Hortonworks
Treasure Data
Collect Visualize
Tableau
Excel
R
easier & shorter time
???
Before…
Application
・・・
Server2
Application
・・・
Server3
Application
・・・
Server1
Fluent
Log Server
High latency! Must wait for a day...
Divide & Conquer & Retry
error retry
error retry retry
retry
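The "divide, then retry on error" idea can be sketched as follows. This is a simplified illustration, not Fluentd's implementation: the names `split_into_chunks` and `send_with_retry`, the retry limit, and the doubling wait are assumptions for the example.

```python
import time


def split_into_chunks(events, chunk_size):
    """Divide a list of events into chunks of at most chunk_size items."""
    return [events[i:i + chunk_size] for i in range(0, len(events), chunk_size)]


def send_with_retry(send, chunk, max_retries=3, base_wait=1.0, sleep=time.sleep):
    """Call send(chunk); on IOError wait and retry, doubling the wait each time."""
    wait = base_wait
    for attempt in range(max_retries + 1):
        try:
            return send(chunk)
        except IOError:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(wait)
            wait *= 2


# Hypothetical destination that fails on its first two calls
calls = []


def flaky_send(chunk):
    calls.append(list(chunk))
    if len(calls) in (1, 2):
        raise IOError("destination unavailable")
    return len(chunk)


waits = []  # record the waits instead of actually sleeping
chunks = split_into_chunks(["e1", "e2", "e3", "e4", "e5"], chunk_size=2)
delivered = [send_with_retry(flaky_send, c, sleep=waits.append) for c in chunks]
```

Only the first chunk is resent after the two errors; the remaining chunks go through on the first attempt, so a failure never forces the whole data set to be retransmitted.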
After!
Application
・・・
Server2
Application
・・・
Server3
Application
・・・
Server1
Fluentd Fluentd Fluentd
Fluentd Fluentd
In streaming!
Lambda Architecture
http://www.drdobbs.com/database/applying-the-big-data-lambda-architectur/240162604
In short
• Open source log collector written in Ruby
• Customization is essential: small core + many plugins
Fluentd is a robust log collector designed for processing data streams
Core
• Divide & Conquer
• Buffering & Retrying
• Error handling
• Message routing
• Parallelize
Plugins
• read / receive data
• write / send data
M x N → M + N
Nagios
MongoDB
Hadoop
Alerting
Amazon S3
Analysis
Archiving
MySQL
Apache
Frontend
Access logs
syslogd
App logs
System logs
Backend
Databases
buffer / buffer / routing
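The "M x N → M + N" claim is simple arithmetic: connecting every source directly to every destination needs one integration per pair, while routing everything through a central collector needs one per endpoint. A small sketch, using source and destination names taken from the diagram (the two function names are hypothetical):

```python
def point_to_point_links(sources, destinations):
    """Each source connects directly to each destination: M x N integrations."""
    return len(sources) * len(destinations)


def hub_links(sources, destinations):
    """Each endpoint connects once to a central collector: M + N integrations."""
    return len(sources) + len(destinations)


sources = ["Access logs", "App logs", "System logs", "Databases"]
destinations = ["Nagios", "MongoDB", "Hadoop", "Amazon S3"]

direct = point_to_point_links(sources, destinations)
via_collector = hub_links(sources, destinations)
```

With four sources and four destinations this gives 16 direct integrations versus 8 through the collector, and the gap widens as either side grows.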
Pluggable Architecture
Input
> Forward
> HTTP
> File tail
> dstat
> ...
Buffer
> File
> Memory
Output
> Forward
> File
> MongoDB
> ...
Engine
Output
> rewrite
> ...
Pluggable Pluggable
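The input, buffer, and output roles in the diagram can be pictured with a toy pipeline. This is a sketch under stated assumptions, not Fluentd's plugin API: the class names `ListInput`, `MemoryBuffer`, `CollectOutput` and the function `run_engine` are all hypothetical.

```python
class ListInput:
    """Stand-in for an input plugin: yields (tag, record) events from a list."""

    def __init__(self, events):
        self.events = events

    def read(self):
        yield from self.events


class MemoryBuffer:
    """Stand-in for a buffer plugin: groups events into fixed-size chunks."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.pending = []

    def append(self, event):
        self.pending.append(event)
        if len(self.pending) >= self.chunk_size:
            return self.flush()
        return None

    def flush(self):
        chunk, self.pending = self.pending, []
        return chunk


class CollectOutput:
    """Stand-in for an output plugin: remembers every chunk it is given."""

    def __init__(self):
        self.written = []

    def write(self, chunk):
        self.written.append(chunk)


def run_engine(source, buffer, sink):
    """Move events from the input through the buffer to the output."""
    for event in source.read():
        chunk = buffer.append(event)
        if chunk:
            sink.write(chunk)
    rest = buffer.flush()  # flush whatever is left at shutdown
    if rest:
        sink.write(rest)


events = [("app.log", {"n": i}) for i in range(5)]
sink = CollectOutput()
run_engine(ListInput(events), MemoryBuffer(chunk_size=2), sink)
```

Because the engine only relies on `read`, `append`/`flush`, and `write`, any of the three parts can be swapped without touching the others, which is the point of the pluggable design.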
Next release
• Fluentd v0.10.45
  • in_tail supports multiline and * watch
  • in_exec supports json / msgpack
  • several fixes
• td-agent 1.1.19
AWS use cases
Collecting instance logs
• A sign of Immutable Infrastructure
• Hard to manage stateful instances
• Almost all instances should be disposable
• Excluding DB, Master, etc...
• How to manage logs from such instances?
• Common problem in cloud environments
• Start Fluentd at the launch phase
• It is also useful for Docker / other containers
• Include metadata or the host name to identify the source
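Attaching the host to each record, as suggested in the last bullet, might look like the sketch below. It is plain Python rather than a Fluentd plugin; the function `add_host_metadata`, the `host` field name, and the sample values are assumptions for illustration.

```python
import socket


def add_host_metadata(record, hostname=None, extra=None):
    """Return a copy of the record with a 'host' field and optional metadata."""
    enriched = dict(record)  # leave the caller's record untouched
    enriched["host"] = hostname or socket.gethostname()
    if extra:
        enriched.update(extra)
    return enriched


original = {"path": "/", "code": 200}
enriched = add_host_metadata(
    original,
    hostname="web-01",              # hypothetical host name
    extra={"role": "frontend"},     # hypothetical instance metadata
)
```

Once an instance is terminated, the `host` and metadata fields are the only way to tell which machine produced a given log line, so they should be added before the record leaves the instance.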
Collecting using Fluentd
Collector Aggregator
AWS Plugins
http://fluentd.org/plugin/
• s3
• dynamodb
• redshift
• rds
• elb
• cloudwatch
• sns
• sqs
• ses
• kinesis (soon!)