

Copyright Edmunds.com, Inc. (the “Company”). All rights reserved. Edmunds®, Edmunds.com®, the Edmunds.com car design, Inside Line℠, CarSpace℠ and AutoObserver® are proprietary trademarks of the Company. This document contains proprietary and/or confidential information of the Company. No part of this document or the information it contains may be used, or disclosed to any person or entity, for any purpose other than advancing the best interests of the Company, and any such disclosure requires the express approval of the Company.

Prepared by Gregory Rokita

Edmunds’ Pomelo: Automobile Dealership Analytics in Real Time using MongoDB
April 3rd, 2012
Greg Rokita, Sharat Nair
Edmunds.com, Inc.

Assumptions
o Understanding of MongoDB
o Experience with Java
o Basic understanding of serialization protocols, e.g. Thrift, Protocol Buffers
o Basic understanding of messaging protocols, e.g. JMS

Agenda
o Edmunds
    o Scale of Big Data operations
    o Use case for Pomelo Application
o System Overview & Design
    o Real time integration with MongoDB
    o Real time data creation for MongoDB
o Implementation
    o MongoDB Consumer
    o MongoDB REST service
o Q&A

Edmunds.com and Scale
o Premier online resource for automotive information, launched in 1995 as the first automotive information Web site
o 15 million unique visitors
o 210 million page views
o 1 million+ new inventory items per day
o 2 TB of new data every month
o 40 node Hadoop cluster aggregating logs, transactions, calls, referrals, advertising, vehicle, pricing, inventory and other data sets

Pomelo Application
o Analytics tool for Automotive Dealers and Edmunds’ Dealer Sales
o Performance measurement for Edmunds traffic and its correlation to calls & referrals
o iPad, HTML5, Sencha Touch & Charts

Unifying data for MongoDB

Processing data for MongoDB - Oozie

Populating MongoDB - Publishing System

Targeting MongoDB - Producer-Consumer matching

[Diagram; labels recovered from the slide]
Generic Thrift Producer
    I am: Prod, LAX, Edmunds, GTP
    Send To: Prod, Test; Lax, EC2; Edmunds; MongoDB
MongoDB Consumer
    I am: Test, EC2, Edmunds, MongoDB
    Receive From: Prod; LAX, EC2; Edmunds; GTP
Broker
    BrokerDestinationInterceptor
    PublishDealerMetrics
    DealerMetrics Virtual Topic
    DealerMetricsQueue
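
The "I am", "Send To" and "Receive From" labels on this slide suggest that a producer and a consumer are paired only when each one's identity falls inside the other's accepted set. The following is a minimal sketch of that idea in plain Java, not the Edmunds implementation. The class name `EndpointMatchSketch`, the attribute names (`env`, `site`, `company`, `app`) and the grouping of the slide's values under those names are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch of attribute-based producer/consumer matching.
 * Attribute names (env, site, company, app) are assumptions, not from the slides.
 */
final class EndpointMatchSketch {

    /** An endpoint's identity ("I am") and the peer attributes it accepts. */
    static final class Endpoint {
        final Map<String, String> iAm;
        final Map<String, Set<String>> accepts;

        Endpoint(Map<String, String> iAm, Map<String, Set<String>> accepts) {
            this.iAm = iAm;
            this.accepts = accepts;
        }

        /** True when every acceptance rule is satisfied by the peer's identity. */
        boolean acceptsPeer(Endpoint peer) {
            for (Map.Entry<String, Set<String>> rule : accepts.entrySet()) {
                String value = peer.iAm.get(rule.getKey());
                if (value == null || !rule.getValue().contains(value)) {
                    return false;
                }
            }
            return true;
        }
    }

    /** A message flows only when both sides accept each other. */
    static boolean matches(Endpoint producer, Endpoint consumer) {
        return producer.acceptsPeer(consumer) && consumer.acceptsPeer(producer);
    }

    /** Producer as labelled on the slide; "accepts" plays the role of "Send To". */
    static Endpoint sampleProducer() {
        return new Endpoint(
                Map.of("env", "Prod", "site", "LAX", "company", "Edmunds", "app", "GTP"),
                Map.of("env", Set.of("Prod", "Test"), "site", Set.of("LAX", "EC2"),
                        "company", Set.of("Edmunds"), "app", Set.of("MongoDB")));
    }

    /** Consumer as labelled on the slide; "accepts" plays the role of "Receive From". */
    static Endpoint sampleConsumer() {
        return new Endpoint(
                Map.of("env", "Test", "site", "EC2", "company", "Edmunds", "app", "MongoDB"),
                Map.of("env", Set.of("Prod"), "site", Set.of("LAX", "EC2"),
                        "company", Set.of("Edmunds"), "app", Set.of("GTP")));
    }

    public static void main(String[] args) {
        System.out.println(matches(sampleProducer(), sampleConsumer())); // prints true
    }
}
```

Requiring acceptance in both directions means either side can narrow the pairing on its own, for example a test consumer that opts in to production data without the producer being reconfigured.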

Integration with MongoDB - layered architecture for transport
o ActiveMQ: message persistence, durability and failover
o Camel: retries and error handling
o Thrift: type safety, versioning and service
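
To make the "retries and error handling" layer concrete, here is a minimal plain-Java sketch of bounded retries around a delivery step. It only illustrates the idea; it does not use or reproduce Camel's API, and the names `RetrySketch` and `withRetries` are hypothetical.

```java
import java.util.function.Supplier;

/** Hypothetical illustration of a bounded-retry layer; not Camel's API. */
final class RetrySketch {

    /**
     * Runs the task up to maxAttempts times and returns the first successful result.
     * If every attempt fails, the last failure is rethrown to the caller.
     */
    static <T> T withRetries(int maxAttempts, Supplier<T> task) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated delivery that fails twice before succeeding.
        String result = withRetries(3, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("broker unavailable");
            }
            return "delivered";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Keeping retries in their own layer means the serialization code and the consumer code do not each need their own failure handling.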

Preparing data for MongoDB - summary
Structured and Unstructured Data (Logs, Calls, Referrals, etc.)
    -> Map-Reduce
    -> Source Specific Thrift Objects in HBase
    -> Map-Reduce
    -> Application Specific Thrift Object in HBase
    -> Generic Thrift Producer
    -> Broker
    -> MongoDB Consumer
    -> MongoDB

Thrift IDL definition

Mongo Connection

<bean id="mongo" class="com.edmunds...MongoDBConnectionFactory">
    <property name="address" value="pl1db470.media.edmunds.com:27017,pl1db471.media.edmunds.com:27017"/>
</bean>
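
The `address` property carries a comma-separated list of `host:port` pairs. The source of `MongoDBConnectionFactory` is not shown in the slides, so the following is only a sketch of how such a string could be split into seed addresses before being handed to a driver. The class name `SeedListSketch` is hypothetical, and falling back to MongoDB's default port 27017 when no port is given is an assumption about the desired behavior.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits "host:port,host:port" into seed addresses. */
final class SeedListSketch {

    private static final int DEFAULT_MONGO_PORT = 27017;

    static List<InetSocketAddress> parse(String address) {
        List<InetSocketAddress> seeds = new ArrayList<>();
        for (String entry : address.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_MONGO_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing configuration.
            seeds.add(InetSocketAddress.createUnresolved(host, port));
        }
        return seeds;
    }

    public static void main(String[] args) {
        for (InetSocketAddress seed : parse("db1.example.com:27017, db2.example.com")) {
            System.out.println(seed.getHostString() + " " + seed.getPort());
        }
    }
}
```

Listing more than one host lets a client still find the replica set when one of the listed members is down.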

Mongo Connection - cont’d

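@Autowired
public MongoDbDealerMetricsConsumer(Mongo mongo) {
    collection = mongo.getDB(DB_NAME).getCollection(COLLECTION_NAME);
    // Descending index on the last-active date, so the most recent date can be found quickly.
    // ensureIndex is the 2.x Java driver call; later driver versions use createIndex instead.
    collection.ensureIndex(new BasicDBObject(LAST_ACTIVE_DATE, -1));
}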

Mongo consumer

private void processDealerMetrics(DealerMetrics dealerMetrics) throws TException {
    String cddId = dealerMetrics.getCddDealershipId();
    BasicDBObject query = new BasicDBObject();
    query.put(CDD_ID, cddId);
    DBObject dmObj = (DBObject) JSON.parse(serializeToJson(dealerMetrics));
    /*
     * findAndModify arguments, in order:
     *   query     - query to match
     *   fields    - fields to be returned
     *   sort      - sort to apply before picking first document
     *   remove    - if true, document found will be removed
     *   update    - update to apply
     *   returnNew - if true, the updated document is returned, otherwise the old
     *               document is returned (or it would be lost forever)
     *   upsert    - do upsert (insert if document not present)
     */
    collection.findAndModify(query, null, null, false, dmObj, true, true);
}
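
The effect of calling `findAndModify` with `upsert` and `returnNew` both true can be pictured with an in-memory stand-in. This is a simplified sketch and not the MongoDB driver: the collection is modelled as a `HashMap` keyed by dealership id, and the names `UpsertSketch` and `upsertAndReturnNew` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of "replace by key, insert if absent". */
final class UpsertSketch {

    private final Map<String, Map<String, Object>> byCddId = new HashMap<>();

    /**
     * Stores the replacement document under cddId, inserting it when no document
     * exists yet, and returns the stored (new) document rather than the old one.
     */
    Map<String, Object> upsertAndReturnNew(String cddId, Map<String, Object> replacement) {
        Map<String, Object> stored = new HashMap<>(replacement); // defensive copy
        byCddId.put(cddId, stored);
        return stored;
    }

    /** Returns the stored document, or null when the dealership is unknown. */
    Map<String, Object> find(String cddId) {
        return byCddId.get(cddId);
    }

    int size() {
        return byCddId.size();
    }

    public static void main(String[] args) {
        UpsertSketch collection = new UpsertSketch();
        Map<String, Object> first = new HashMap<>();
        first.put("calls", 10);
        collection.upsertAndReturnNew("dealer-1", first);   // inserted
        Map<String, Object> second = new HashMap<>();
        second.put("calls", 12);
        collection.upsertAndReturnNew("dealer-1", second);  // replaced, not duplicated
        System.out.println(collection.size() + " " + collection.find("dealer-1").get("calls"));
    }
}
```

Because each message replaces the whole document for its dealership, redelivering the same message leaves the stored state unchanged, which fits a transport layer that retries.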

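Public interface to Mongo - Dealer

public List<DBObject> getDocument(String cddId) {
    final BasicDBObject query = new BasicDBObject();
    query.put(CDD_ID, cddId);
    final DBObject object = collection.findOne(query);
    if (object == null) {
        // findOne returns null when nothing matches; return an empty list
        // instead of failing with a NullPointerException below.
        return newArrayList();
    }
    // Strip internal fields before exposing the document.
    object.removeField(OBJECT_ID);
    object.removeField(LAST_ACTIVE_DATE);
    return newArrayList(object);
}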

Public interface to Mongo - Active list

public List<DBObject> getActiveList() {
    // Match dealers whose last-active date is the most recent one, within the allowed DMAs.
    final BasicDBObject query = new BasicDBObject();
    query.put(LAST_ACTIVE_DATE, getActiveDate());
    query.put(DMA_NAME, getDmaCriteria());
    // Projection: drop _id, return only the dealership id and name.
    final BasicDBObject keys = new BasicDBObject();
    keys.put(OBJECT_ID, 0);
    keys.put(CDD_ID, 1);
    keys.put(DEALERSHIP_NAME, 1);
    return collection.find(query, keys).toArray();
}

// Most recent last-active date in the collection (uses the descending index).
private Object getActiveDate() {
    return collection.find().sort(getSortCriteria()).limit(1).next().get(LAST_ACTIVE_DATE);
}

private BasicDBObject getSortCriteria() {
    return new BasicDBObject(LAST_ACTIVE_DATE, -1);
}

private BasicDBObject getDmaCriteria() {
    return new BasicDBObject("$in", DMAS);
}
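
The active-list query combines three steps: find the most recent last-active date, keep only documents with that date whose DMA is in the allowed set, and project two fields. The sketch below replays those steps over plain Java maps so the logic can be followed without a database. It is not driver code; the class name and the field names (`lastActiveDate`, `dmaName`, `cddId`, `dealershipName`) are assumptions, since the slides show only the constants and not their values, and dates are modelled as `Long` for simplicity.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical in-memory replay of the active-list query; field names are assumed. */
final class ActiveListSketch {

    static List<Map<String, Object>> activeList(List<Map<String, Object>> docs, Set<String> dmas) {
        // Step 1: most recent date, the equivalent of sorting descending and taking the first.
        Optional<Long> latest = docs.stream()
                .map(d -> (Long) d.get("lastActiveDate"))
                .filter(Objects::nonNull)
                .max(Comparator.naturalOrder());
        if (!latest.isPresent()) {
            return Collections.emptyList();
        }
        // Steps 2 and 3: filter on date and DMA, then project id and name only.
        return docs.stream()
                .filter(d -> latest.get().equals(d.get("lastActiveDate")))
                .filter(d -> d.get("dmaName") != null && dmas.contains(d.get("dmaName")))
                .map(d -> {
                    Map<String, Object> out = new LinkedHashMap<>();
                    out.put("cddId", d.get("cddId"));
                    out.put("dealershipName", d.get("dealershipName"));
                    return out;
                })
                .collect(Collectors.toList());
    }

    /** Convenience builder for a sample document. */
    static Map<String, Object> doc(String cddId, String name, String dma, long date) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("cddId", cddId);
        d.put("dealershipName", name);
        d.put("dmaName", dma);
        d.put("lastActiveDate", date);
        return d;
    }
}
```

A dealer whose document was not refreshed in the latest load carries an older date and therefore drops out of the list, which appears to be the reason for filtering on the most recent date.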

Rest service

@GET
@Path("{id}")
@Produces(MediaType.APPLICATION_JSON)
public List<DBObject> get(@PathParam("id") String cddId) {
    return dealerMetricsMongoDao.getDocument(cddId);
}

@GET
@Path("list")
@Produces(MediaType.APPLICATION_JSON)
public List<DBObject> getDealerList() {
    return dealerMetricsMongoDao.getActiveList();
}

Q&A

Greg Rokita, grokita@edmunds.com
Sharat Nair, snair@edmunds.com
