Silicon Valley Code Camp 2010: Social Platforms: What goes on under the hood


Upload: manish-pandit

Post on 15-Jan-2015

DESCRIPTION

In this session I will share the design, architecture and implementation of some of the most common elements of any social platform: Open API, profiles, searches, lists and activity streams. These "pillars" of a social platform bear most of the weight behind a jazzy UI, and scaling them has its own challenges. I will also talk about how we built the Social Platform at IGN from the ground up, including not-so-unique challenges like integration with legacy systems.

TRANSCRIPT

Page 1: Silicon Valley Code Camp 2010: Social Platforms : What goes on under the hood

Social Platforms: What goes on under the hood

Manish Pandit

Silicon Valley Code Camp 2010

Page 2

Agenda

• Components of Social Platform
• Challenges
• Technology
• API
• Engineering Process

Page 3

Components

• User Profile
• Relationships
• Activity Streams

Page 4

Components::User Profile

• Basic user information
• Extended information based on the context
  – Player Cards
• Bragging rights
  – Points, levels
  – Achievements, badges
• Activities

Page 5

Components::Relationships

• Friending
  – Unidirectional (the Twitter model)
  – Bidirectional (the Facebook model)
• Association
  – Comments
  – Ratings, Thumbs
  – Bookmarking/favoriting
• Recommendations
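The two friending models can be sketched with a small in-memory graph. This is only an illustration under assumed names (`SocialGraph`, `follow`, `befriend` are hypothetical); the slides do not say how relationships were modeled in code.

```python
from collections import defaultdict


class SocialGraph:
    """In-memory follow graph; a stand-in for whatever store a platform uses."""

    def __init__(self):
        # Maps a user to the set of users they follow.
        self._following = defaultdict(set)

    def follow(self, follower, followee):
        """Unidirectional edge: no approval is needed from the followee."""
        self._following[follower].add(followee)

    def befriend(self, a, b):
        """Bidirectional edge, stored here as two directed edges."""
        self.follow(a, b)
        self.follow(b, a)

    def followers(self, user):
        """Everyone with a directed edge pointing at `user`."""
        return {u for u, out in self._following.items() if user in out}

    def friends(self, user):
        """Only users connected to `user` in both directions."""
        outgoing = self._following.get(user, set())
        return {u for u in outgoing if user in self._following.get(u, set())}


graph = SocialGraph()
graph.follow("alice", "bob")      # Twitter-style: alice follows bob
graph.befriend("alice", "carol")  # Facebook-style: mutual relationship
```

Storing a friendship as two directed edges lets one structure serve both models; a real system might instead keep a separate table or document per relationship type.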

Page 6

Components::Activity Streams

• Who is doing what and when
• All about Actors, Actions, Objects and Targets
• Activitystrea.ms standard vs. OpenSocial
• Commentable
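As a rough illustration of the actor/action/object/target shape, an activity could be represented as a small JSON document. The field names follow the general Activity Streams vocabulary (where the action is called a "verb"); the identifiers and the `make_activity` helper are made up for this sketch and are not taken from the talk.

```python
import json
from datetime import datetime, timezone


def make_activity(actor, verb, obj, target=None, published=None):
    """Build one activity: who (actor) did what (verb) to which object,
    optionally in the context of a target, and when (published)."""
    when = published or datetime.now(timezone.utc)
    activity = {
        "actor": {"id": actor},
        "verb": verb,
        "object": {"id": obj},
        "published": when.isoformat(),
    }
    if target is not None:
        activity["target"] = {"id": target}
    return activity


# Hypothetical identifiers, for illustration only.
activity = make_activity(
    "user:mpandit",
    "post",
    "review:123",
    target="game:456",
    published=datetime(2010, 10, 9, tzinfo=timezone.utc),
)
print(json.dumps(activity, indent=2))
```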

Page 7

Challenges

• Authentication
• Performance
  – ActivityStreams
• Integration
• Flexibility
• Testing

Page 8

Challenges::Authentication

• People are tired of creating accounts on every site
• Need to support existing login method if the platform caters to an existing audience
  – Existing auth may not work well with Open API initiatives
• Open API and OAuth
  – 2-legged: Service to Service
  – 3-legged: User to App to Service
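The slides do not say which OAuth version or signature method was used. Assuming OAuth 1.0 with HMAC-SHA1, a 2-legged (service to service) request is signed with the consumer secret alone, leaving the token secret empty. A minimal sketch, with a hypothetical URL, consumer key and secret:

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode everything except RFC 3986 unreserved characters."""
    return quote(str(value), safe="~")


def sign_two_legged(method, url, params, consumer_secret):
    """Return a base64 HMAC-SHA1 signature using an empty token secret."""
    # Encode each name and value, then sort the encoded pairs.
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    base_string = "&".join([method.upper(), pct(url), pct(normalized)])
    # Key is "<consumer secret>&<token secret>"; 2-legged has no token secret.
    key = f"{pct(consumer_secret)}&"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# All values below are placeholders for illustration.
params = {
    "oauth_consumer_key": "demo-app",
    "oauth_nonce": "abc123",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1286582400",
    "oauth_version": "1.0",
}
signature = sign_two_legged(
    "GET", "http://api.example.com/people/mpandit/@self", params, "demo-secret"
)
```

In the 3-legged flow the same signing applies, but the key also includes the secret of the token the user granted to the app.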

Page 9

Challenges::Performance

• Identify the bottlenecks
• Measure everything
• Use CDNs for all static content
• Front end optimization via async loading
• Database optimization via indexes, sharding
• Caching
• Scaling the sorts
• Scaling up vs. Scaling out
  – CAP theorem
  – Relational vs. NoSQL storage
  – Read vs. Write heaviness

Page 10

Challenges::ActivityStreams

• Query vs. Propagation
  – Queries are read heavy
  – Propagation is write heavy
  – Deletion is a pain with propagation
• Activity Aggregation
  – Aggregation on actor vs. object
• Normalized vs. Denormalized storage
  – Comments
• Decorating the activities on each request
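The query vs. propagation trade-off can be shown with two toy feed implementations. This is a sketch using plain dictionaries in place of a real store; the function names and sample users are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical relationship data: who follows whom.
followers = {"alice": ["bob", "carol"]}           # author -> their followers
following = {"bob": ["alice"], "carol": ["alice"]}  # reader -> authors followed

# Query model: one write per activity, work happens at read time.
authored = defaultdict(list)


def publish_query_model(actor, activity):
    authored[actor].append(activity)


def read_query_model(user):
    # Read heavy: gather activities from every followed author on each request.
    return [a for actor in following.get(user, []) for a in authored[actor]]


# Propagation model: work happens at write time, reads are a single lookup.
inboxes = defaultdict(list)


def publish_propagation_model(actor, activity):
    # Write heavy: one copy per follower.
    for follower in followers.get(actor, []):
        inboxes[follower].append(activity)


def read_propagation_model(user):
    return list(inboxes[user])


def delete_propagated(actor, activity):
    # Deletion must chase down every copy that was fanned out.
    for follower in followers.get(actor, []):
        if activity in inboxes[follower]:
            inboxes[follower].remove(activity)


publish_query_model("alice", "alice posted a review")
publish_propagation_model("alice", "alice posted a review")
```

Both models return the same feed for a reader; they differ in where the cost lands and in how many places a delete has to touch.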

Page 11

Challenges::Integration

• Integration with legacy touchpoints
• Opening up the API
  – More channels like Mobile
  – More independent applications
• Rate limiting and access control
• Don’t forget existing data
  – Data outlives code
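The slides do not describe how rate limiting was implemented. One common approach is a token bucket per API consumer, sketched below; the class name, the rate and the capacity are illustrative assumptions.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity   # start with a full bucket
        self.clock = clock       # injectable so the bucket can be tested
        self.updated = clock()

    def allow(self):
        """Return True and consume a token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage with a fake clock: 1 request/second, bursts of up to 2.
fake_time = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_time[0])
first_burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_time[0] = 1.0  # one second later, one token has been refilled
after_wait = [bucket.allow(), bucket.allow()]
```

In a multi-node deployment the bucket state would need to live in a shared store rather than in process memory.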

Page 12

Challenges::Flexibility

• Flexibility in the code to adapt to changing requirements quickly and seamlessly
  – Good design
  – DRY, SoC
• Flexibility in the infrastructure to adapt to changing traffic and behavior
  – Virtualization
  – Heavy replication
• Flexibility in the team to respond to changes
  – Process

Page 13

Challenges::Testing

• Automated Testing wherever possible
• Developer Focus on test coverage (80+%)
• Continuous Integration and Deployment
  – Cucumber + Hudson
• Cross browser testing (yes, including IE)

Page 14

Technology

• Services: Java, Ruby
• Caching: Memcached
• Persistence: MySQL, MongoDB (and Voldemort for a while)
• Messaging: RabbitMQ

Page 15

Technology::Services

• Java services
  – Tomcat with Shindig 1.1, 4 nodes
  – REST/JSON
• Ruby
  – Rails Admin App for moderation and points/levels
  – Migration Scripts
  – Twitter bot for routing #myign tweets to the platform
  – Misc. scripts to invalidate memcached keys and test service endpoints

Page 16

Technology::Caching

• Memcached
  – Extremely trivial to set up and maintain
  – Almost never dies
  – Massive scale out
  – Careful with
    • Cache hotspots
    • Concurrent writes
    • On the fly scale-out
    • Key/Value size limits
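One reason on-the-fly scale-out needs care: if clients pick a cache node with a simple modulo of the key hash, adding a node remaps most keys and empties the cache in effect. Consistent hashing is a common mitigation; the slides do not say which strategy IGN's clients used. The sketch below compares the two with hypothetical node names.

```python
import bisect
import hashlib


def h(value):
    """Stable integer hash of a string."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def modulo_node(key, nodes):
    """Naive placement: node index is hash modulo node count."""
    return nodes[h(key) % len(nodes)]


class HashRing:
    """Consistent hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (h(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, h(key)) % len(self._points)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(1000)]
old_nodes = ["cache1", "cache2", "cache3", "cache4"]
new_nodes = old_nodes + ["cache5"]

moved_modulo = sum(
    modulo_node(k, old_nodes) != modulo_node(k, new_nodes) for k in keys
)

old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
moved_keys = [k for k in keys if old_ring.node_for(k) != new_ring.node_for(k)]
moved_ring = len(moved_keys)

print(f"modulo: {moved_modulo} of {len(keys)} keys moved")
print(f"ring:   {moved_ring} of {len(keys)} keys moved")
```

With the ring, the only keys that change owner are the ones taken over by the new node, so the existing nodes keep their warm entries.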

Page 17

Technology::Persistence::MySQL

• MySQL
  – Proven, cheap to develop and operate
  – Maslow’s hammer
  – Easy scale out
  – Hard to store (and retrieve) network graphs
  – Write scaling with single master
  – Not the best choice for activitystreams
  – Schema changes lock the table(s)

Page 18

Technology::Persistence::MongoDB

– Awesome write scaling
  • Great for activity propagation model
– In place updates
  • Using $push and $set
– Excellent for storing social relationships as documents
– Very easy to cluster
  • We are running replica pairs, plan to move to replica sets
– Schema-less
  • No need to run alter scripts on 18M-row table
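To make the in-place update idea concrete, the snippet below shows an update document using `$set` and `$push`, applied by a toy local function. `apply_update` is not MongoDB or driver code; it only mimics the two operators on top-level fields so the effect can be seen without a server. The document fields are invented for illustration.

```python
import copy


def apply_update(document, update):
    """Toy emulation of $set and $push on top-level fields only."""
    result = copy.deepcopy(document)
    for field, value in update.get("$set", {}).items():
        result[field] = value                       # overwrite or create field
    for field, value in update.get("$push", {}).items():
        result.setdefault(field, []).append(value)  # append to an array field
    return result


# Hypothetical profile document with an embedded followers list.
profile = {"_id": "mpandit", "level": 3, "followers": ["alice"]}

# The same shape of update document would be sent to the database.
update = {"$set": {"level": 4}, "$push": {"followers": "bob"}}

updated = apply_update(profile, update)
print(updated)
```

Because the operators describe a change rather than a whole new document, the client does not have to read the document, modify it and write it back.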

Page 19

Technology::Persistence::MongoDB

• Queryable
  – Rich Query language ($in, $size, $exists, $slice)
  – MapReduce for heavy data crunching
• Supports Indexing
  – You can even index collections inside a document
• Storage
  – ~4x storage compared to relational data
• Emerging technology
  – Index defragmentation
  – $or and indexing (to be supported in 1.7)
  – Load balancing support in the driver (coming soon)

Page 20

Technology::Messaging

• RabbitMQ for messaging
  – Ease of clustering
  – Written in Erlang for high performance and availability
  – Used for
    • Propagation of activities
    • Sending out email alerts
    • Indexing data in Solr
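The pattern behind these uses is to hand slow work to a background consumer so the API request can return quickly. The sketch below uses Python's in-process `queue.Queue` purely as a stand-in for a RabbitMQ queue; the task names and payloads are hypothetical, and a real deployment would publish to the broker instead.

```python
import queue
import threading

tasks = queue.Queue()  # stand-in for a broker queue
sent_emails = []       # stand-in for an email gateway
indexed = []           # stand-in for a search index


def worker():
    """Consume tasks until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        kind, payload = task
        if kind == "email":
            sent_emails.append(payload)
        elif kind == "index":
            indexed.append(payload)
        tasks.task_done()


consumer = threading.Thread(target=worker, daemon=True)
consumer.start()

# The request handler only enqueues; the consumer does the slow part.
tasks.put(("email", "alice@example.com"))
tasks.put(("index", {"id": "review:123"}))
tasks.put(None)  # tell the worker to stop
tasks.join()     # wait until every task has been processed
```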

Page 21

API

• Person
  – GET @self, @friends, @followers, @all, PUT/POST @self, @friends
• Activities
  – GET @global, @self, @friends, POST @self
• MediaItems
  – GET @self, @all and POST @self
• AppData
  – For applications to store/retrieve data as key-value pairs GET/POST @self
• Status
  – GET @friends, @self, @followers, POST @self
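The method and selector combinations listed above can be captured in a small lookup table. The URL layout below (`/{resource}/{user_id}/{selector}`) loosely follows OpenSocial REST conventions, but the host, path prefix and resource path names are assumptions; the slides do not show actual endpoints.

```python
BASE_URL = "https://api.example.com/social/rest"  # hypothetical host and prefix

# Method/selector combinations as listed on the slide.
# The resource path names (left-hand keys) are assumed.
ALLOWED = {
    "people": {
        "GET": {"@self", "@friends", "@followers", "@all"},
        "PUT": {"@self", "@friends"},
        "POST": {"@self", "@friends"},
    },
    "activities": {"GET": {"@global", "@self", "@friends"}, "POST": {"@self"}},
    "mediaitems": {"GET": {"@self", "@all"}, "POST": {"@self"}},
    "appdata": {"GET": {"@self"}, "POST": {"@self"}},
    "status": {"GET": {"@friends", "@self", "@followers"}, "POST": {"@self"}},
}


def build_request(method, resource, user_id, selector):
    """Return (method, url) if the combination appears in the table."""
    verb = method.upper()
    if selector not in ALLOWED.get(resource, {}).get(verb, set()):
        raise ValueError(f"{verb} {resource}/{selector} is not a listed operation")
    return verb, f"{BASE_URL}/{resource}/{user_id}/{selector}"


request = build_request("GET", "people", "mpandit", "@friends")
print(request)
```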

Page 22

Monitoring::New Relic

• Must have for any Java/Ruby webapp
• Monitoring and troubleshooting
• Save a ton of $ and time with efficient root cause analysis tools
• Agents for Ruby and Java
  – IGN Engineers helped write PHP and Memcached agents

Page 23

Engineering Process

• Social Applications and community
  – Check the pulse of the community
    • UserVoice (http://ign.uservoice.com)
  – Less is more
  – Distinguish yourself and focus on your niche
• Be Agile - Release early, release often
  – Do not shock your audience
  – Announce the changes/features on a blog
• Eat your own dog food
  – http://people.ign.com/ign-labs

Page 24

Let's talk numbers

• Released July 2010 as beta
• Daily API requests ~25M
• Daily page views ~30K
• Daily Uniques ~12K
• 6ms response times
• Expected traffic 8-10x with more integration and mobile platform
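For a sense of scale, the daily API figure can be converted into an average request rate. This is back-of-the-envelope arithmetic on the numbers above; it assumes traffic spread evenly over the day, so real peaks would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_api_requests = 25_000_000          # ~25M per day, from the slide
growth_factors = (8, 10)                 # expected 8-10x traffic

average_rps = daily_api_requests / SECONDS_PER_DAY
projected_rps = [average_rps * factor for factor in growth_factors]

print(f"average today: ~{round(average_rps)} requests/second")
print(
    f"projected: ~{round(projected_rps[0])} to "
    f"~{round(projected_rps[1])} requests/second"
)
```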

Page 25

About.me

Manish Pandit
Engineering Manager, Social Platform at IGN

Email: pandit.manish-at-gmail.com
Twitter: @lobster1234
LinkedIn: http://www.linkedin.com/in/mpandit
Blog: http://contrarianwisdom.blogspot.com
MyIGN: http://people.ign.com/mpanditign

Page 26

We’re hiring!

• http://corp.ign.com
• http://labs.ign.com
• http://my.ign.com
• http://people.ign.com/ign-labs