CASSANDRA AT GENERAL SENTIMENT
Architecture
Internet
Spidering
NLP
Cassandra
UI
RESTful API
Hadoop
Cassandra at General Sentiment
Schema
Batch insertions from Hadoop
Hosted on EC2
Monitoring
Schema
Row key is entity name
Column Family
  Sentiment and volume counts
  Co-reference counts
  Entity name inverted index
Column name is a date
Column value is a serialized data structure
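The layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a nested dict stands in for one column family, JSON stands in for the serialization format (the slides do not say which format is used), and the function names and count fields are hypothetical.

```python
import json
from datetime import date


def put_daily_counts(cf, entity, day, counts):
    """Store one day's counts for an entity as a single serialized column.

    cf is a plain dict standing in for a column family:
    {row key (entity name): {column name (ISO date): serialized value}}
    """
    row = cf.setdefault(entity, {})
    # ISO dates (YYYY-MM-DD) sort lexicographically in chronological order.
    row[day.isoformat()] = json.dumps(counts, sort_keys=True)


def get_range(cf, entity, start, end):
    """Return the deserialized columns whose date name lies in [start, end]."""
    row = cf.get(entity, {})
    lo, hi = start.isoformat(), end.isoformat()
    return {
        name: json.loads(value)
        for name, value in sorted(row.items())
        if lo <= name <= hi
    }
```

Because the column name is a date, reading an entity's history over a period becomes a range read over one row, and each column holds everything needed for that day.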
Batch Insertions
From Hadoop
No compaction during insertions
No hinted handoffs
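One way to picture the batching side of this is a small buffer that groups columns by row key and hands them off in chunks. The sketch below is an assumption-laden illustration, not the pipeline from the talk: `BatchWriter`, `send_batch`, and `max_columns` are hypothetical names, and `send_batch` is a placeholder for whatever client call actually writes to the cluster.

```python
from collections import defaultdict


class BatchWriter:
    """Buffer columns per row key and send them in fixed-size batches."""

    def __init__(self, send_batch, max_columns=500):
        # send_batch is a hypothetical callable taking
        # {row_key: {column_name: value}}; it is not a real client API.
        self.send_batch = send_batch
        self.max_columns = max_columns
        self.pending = defaultdict(dict)
        self.count = 0

    def add(self, row_key, column, value):
        row = self.pending[row_key]
        if column not in row:
            self.count += 1  # count distinct buffered columns only
        row[column] = value
        if self.count >= self.max_columns:
            self.flush()

    def flush(self):
        """Send whatever is buffered; a no-op when the buffer is empty."""
        if self.count:
            self.send_batch(dict(self.pending))
            self.pending = defaultdict(dict)
            self.count = 0
```

A Hadoop task would call `add` for each output record and `flush` once at the end, so the final partial batch is not lost.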
Cassandra on EC2
Instance types: m1.large vs m1.xlarge
Instance disks vs EBS
RAID-0 + xfs
Scribe for logging
Monitoring
Monit
Ganglia