AWS September Webinar Series - Getting Started with DynamoDB Streams
TRANSCRIPT
dynamodb-pm@
Event Driven Computing Enabled by DynamoDB Streams
Launch update on Cross-region replication and Database Triggers with AWS Lambda Integration
© 2015 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
• What are DynamoDB Streams?
• How to build with DynamoDB Streams
• Why DynamoDB Streams?
What are DynamoDB Streams?
• Time ordered and partitioned change log
• Provides a stream of updates, inserts, deletes
• Guaranteed to be delivered only once
• Use Kinesis Client Library (KCL), Lambda, or API to query pre-image, post-image, key, timestamp
• Scales with your table
[Diagram: DynamoDB and DynamoDB Streams]
Use Cases for DynamoDB Streams
Cross-region replication
[Diagram: cross-region replication via DynamoDB Streams; regions shown: US East (N. Virginia), Asia Pacific (Tokyo), EU (Ireland) replica]
Cross region replication library
• Bootstrapping
• Horizontal scaling with KCL Workers
• Load balancing
• Fault tolerance with checkpointing
Cross region replication
[Diagram: US East (N. Virginia) and EU (Ireland); a DynamoDB Table with Partitions 1-2, a DynamoDB Stream with Shards 1-3, and three KCL Workers]
Cross region replication
[Diagram: US East (N. Virginia) and EU (Ireland); a DynamoDB Table with Partitions 1-5, a DynamoDB Stream with Shards 1-4, and four KCL Workers]
Consuming Streams (KCL)
// The Streams adapter lets the KCL read a DynamoDB stream as if it were a Kinesis stream.
AmazonDynamoDBStreamsAdapterClient adapterClient = new AmazonDynamoDBStreamsAdapterClient(updateStreamsCredentials, .. );
..
AmazonDynamoDBClient dynamoDBClient = new AmazonDynamoDBClient(dynamoDBCredentials, ..);
..
// Read up to 100 records per call, starting from the oldest record still in the stream.
KinesisClientLibConfiguration workerConfig = new KinesisClientLibConfiguration(.., streamId, updateStreamsCredentials, ..)
    .withMaxRecords(100)
    .withInitialPositionInStream(InitialPositionInStream.TRIM_HORIZON);
Worker worker = new Worker(recordProcessorFactory, workerConfig, adapterClient, dynamoDBClient, ..);
// Run the worker on its own thread.
Thread t = new Thread(worker);
t.start();
Full code available online
Processing streams (KCL)
public class StreamsRecordProcessor implements IRecordProcessor {
    ..
    @Override
    public void processRecords(List<Record> records, .. ) {
        for (Record record : records) {
            // Records delivered through the Streams adapter wrap the DynamoDB Streams record.
            if (record instanceof RecordAdapter) {
                com.amazonaws.services.dynamodbv2.model.Record ddbStreamRecord =
                        ((RecordAdapter) record).getInternalObject();
                switch (ddbStreamRecord.getEventName()) {
                    case "INSERT":
                    case "MODIFY":
                        // Inserts and updates: write the new image to the destination table.
                        DemoHelper.putItem(dynamoDBClient, tableName,
                                ddbStreamRecord.getDynamodb().getNewImage());
                        break;
                    case "REMOVE":
                        // Deletes: remove the item with the same key from the destination table.
                        DemoHelper.deleteItem(dynamoDBClient, tableName,
                                ddbStreamRecord.getDynamodb().getKeys().get("Id").getN());
                        break;
                }
...
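The record-processing logic above can be tried out without AWS. The following is a minimal in-memory sketch, not the presenters' code: `ReplicaApplier` and `Change` are hypothetical names, a `HashMap` stands in for the replica table, and each change carries only an event name, a key, and a new image. It shows that applying INSERT/MODIFY as puts and REMOVE as deletes, in stream order, leaves the replica with the latest state.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of applying stream changes to a replica.
class ReplicaApplier {
    // Simplified stand-in for a DynamoDB Streams record (assumption, not the SDK type).
    static final class Change {
        final String eventName;             // "INSERT", "MODIFY" or "REMOVE"
        final String id;                    // value of the key attribute
        final Map<String, String> newImage; // item after the change; null for REMOVE

        Change(String eventName, String id, Map<String, String> newImage) {
            this.eventName = eventName;
            this.id = id;
            this.newImage = newImage;
        }
    }

    // Applies changes in the order given; the replica map is keyed by item id.
    static void apply(List<Change> changes, Map<String, Map<String, String>> replica) {
        for (Change change : changes) {
            switch (change.eventName) {
                case "INSERT":
                case "MODIFY":
                    replica.put(change.id, change.newImage);
                    break;
                case "REMOVE":
                    replica.remove(change.id);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown event: " + change.eventName);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> replica = new HashMap<>();
        apply(Arrays.asList(
                new Change("INSERT", "John", Collections.singletonMap("Destination", "Tokyo")),
                new Change("MODIFY", "John", Collections.singletonMap("Destination", "Mars")),
                new Change("MODIFY", "John", Collections.singletonMap("Destination", "Pluto"))),
            replica);
        System.out.println(replica); // prints {John={Destination=Pluto}}
    }
}
```

Because each put overwrites the whole item, the result depends on changes for a given key being applied in order.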
DynamoDB Triggers
• Trigger AWS Lambda functions
• Example – Validate address, send notifications
DynamoDB Streams and AWS Lambda
Triggers
[Diagram labels: Lambda Function, Notify change, Aggregate tables, External views (CloudSearch, ElastiCache), Lambda Function]
Real-Time Voting
Write-heavy items
Requirements for voting
• Allow each person to vote only once
• No changing votes
• Real-time aggregation
• Voter analytics, demographics
Real-time voting architecture
[Diagram: Voters, Voting App, RawVotes Table, AggregateVotes Table]
Partition 1: 1000 WCUs
Partition K: 1000 WCUs
Partition M: 1000 WCUs
Partition N: 1000 WCUs
Votes Table
Candidate A, Candidate B
Scaling bottlenecks
50,000/sec
70,000/sec
Voters
Provision 200,000 WCUs
Write sharding
[Diagram: a Voter and the Votes Table, with items Candidate A_1 to Candidate A_8 and Candidate B_1 to Candidate B_8]
Write sharding
[Diagram: a Voter and the Votes Table, with items Candidate A_1 to Candidate A_8 and Candidate B_1 to Candidate B_8]
UpdateItem: "CandidateA_" + rand(0, 10)
ADD 1 to Votes
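The write-sharding update on this slide can be sketched in memory. This is an illustration under stated assumptions, not the presenters' code: `ShardedVoteCounter` is a hypothetical class, a `ConcurrentHashMap` stands in for the Votes table, `merge` stands in for an UpdateItem with ADD, and the shard count of 10 follows `rand(0, 10)` on the slide.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical in-memory stand-in for a write-sharded Votes table.
class ShardedVoteCounter {
    private final int shardCount;
    // Key: "<candidate>_<shard>", value: votes recorded on that shard item.
    private final Map<String, Long> votesTable = new ConcurrentHashMap<>();

    ShardedVoteCounter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Picks a random shard so writes for one candidate spread over many items.
    String shardKeyFor(String candidate) {
        int shard = ThreadLocalRandom.current().nextInt(shardCount); // 0 .. shardCount-1
        return candidate + "_" + shard;
    }

    // Similar in spirit to: UpdateItem on the shard key, ADD 1 to Votes.
    void vote(String candidate) {
        votesTable.merge(shardKeyFor(candidate), 1L, Long::sum);
    }

    // Returns a copy of the per-shard counts.
    Map<String, Long> snapshot() {
        return new HashMap<>(votesTable);
    }

    public static void main(String[] args) {
        ShardedVoteCounter counter = new ShardedVoteCounter(10);
        for (int i = 0; i < 1000; i++) {
            counter.vote("CandidateA");
        }
        System.out.println(counter.snapshot());
    }
}
```

No single item receives every write for a candidate, at the price of having to sum the shard items to get a total.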
Shard aggregation
[Diagram: Votes Table with items Candidate A_1 to Candidate A_8 and Candidate B_1 to Candidate B_8, a Periodic Process, and a Voter]
Periodic Process
1. Sum
2. Store
Candidate A Total: 2.5M
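The two steps of the periodic process (sum, then store) can be sketched as follows. The class `ShardAggregator`, the `"<candidate>_<n>"` key convention, and the use of plain maps in place of DynamoDB tables are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical periodic aggregation over write-sharded counters.
class ShardAggregator {
    // Step 1 (Sum): add up all shard items named "<candidate>_<n>" per candidate.
    static Map<String, Long> sumShards(Map<String, Long> shardCounts) {
        Map<String, Long> totals = new HashMap<>();
        for (Map.Entry<String, Long> entry : shardCounts.entrySet()) {
            String key = entry.getKey();
            int separator = key.lastIndexOf('_');
            String candidate = separator < 0 ? key : key.substring(0, separator);
            totals.merge(candidate, entry.getValue(), Long::sum);
        }
        return totals;
    }

    // Step 2 (Store): overwrite the stored totals with the freshly computed ones.
    static void store(Map<String, Long> totals, Map<String, Long> aggregateTable) {
        aggregateTable.putAll(totals);
    }

    public static void main(String[] args) {
        Map<String, Long> shards = new HashMap<>();
        shards.put("Candidate A_1", 23L);
        shards.put("Candidate A_2", 25L);
        shards.put("Candidate B_1", 14L);
        shards.put("Candidate B_2", 12L);

        Map<String, Long> aggregate = new HashMap<>();
        store(sumShards(shards), aggregate);
        System.out.println(aggregate);
    }
}
```

In a long-running process, one option is to schedule these two calls with `ScheduledExecutorService.scheduleAtFixedRate`; readers then see totals that lag by at most one period.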
Correctness in voting
RawVotes Table
UserId  Candidate  Date
Alice   A          2013-10-02
Bob     B          2013-10-02
Eve     B          2013-10-02
Chuck   A          2013-10-02
AggregateVotes Table
Segment  Votes
A_1      23
B_2      12
B_1      14
A_2      25
Voter
1. Record vote and de-dupe; retry
2. Increment candidate counter
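The two-step write on this slide can be sketched in memory. This is a hedged illustration: `VoteRecorder` is a hypothetical class, `putIfAbsent` stands in for a conditional put that fails when the UserId already exists, and `merge` stands in for the counter increment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory version of "record vote and de-dupe, then increment".
class VoteRecorder {
    private final Map<String, String> rawVotes = new ConcurrentHashMap<>();     // UserId -> Candidate
    private final Map<String, Long> aggregateVotes = new ConcurrentHashMap<>(); // Candidate -> Votes

    // Returns true if the vote was counted, false if this user had already voted.
    boolean castVote(String userId, String candidate) {
        // Step 1: record the vote only if no vote exists for this user (de-dupe).
        String existing = rawVotes.putIfAbsent(userId, candidate);
        if (existing != null) {
            return false; // duplicate or attempted change of vote: ignored
        }
        // Step 2: increment the candidate counter.
        aggregateVotes.merge(candidate, 1L, Long::sum);
        return true;
    }

    long votesFor(String candidate) {
        return aggregateVotes.getOrDefault(candidate, 0L);
    }

    public static void main(String[] args) {
        VoteRecorder recorder = new VoteRecorder();
        System.out.println(recorder.castVote("Alice", "A")); // prints true
        System.out.println(recorder.castVote("Alice", "B")); // prints false
        System.out.println(recorder.votesFor("A"));          // prints 1
    }
}
```

The two steps are not atomic: if the writer fails after step 1 and before step 2, the raw vote exists but the counter misses it, and a retry is then rejected as a duplicate.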
Correctness in aggregation?
RawVotes Table
UserId  Candidate  Date
Alice   A          2013-10-02
Bob     B          2013-10-02
Eve     B          2013-10-02
Chuck   A          2013-10-02
AggregateVotes Table
Segment  Votes
A_1      23
B_2      12
B_1      14
A_2      25
Voter
Real-time voting architecture (improved)
[Diagram: Voters, Voting App, RawVotes Table, RawVotes DynamoDB Stream, Your Amazon Kinesis-Enabled App, AggregateVotes Table, Amazon Redshift, Amazon EMR]
Real-time voting architecture
Handle any scale of election
Real-time voting architecture
Vote only once, no changing votes
Real-time voting architecture
Real-time, fault-tolerant, scalable aggregation
Real-time voting architecture
Voter analytics, statistics
Analytics with DynamoDB Streams
• Collect and de-dupe data in DynamoDB
• Aggregate data in-memory and flush periodically
• Important when: Performing real-time aggregation and analytics
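The "aggregate in-memory and flush periodically" pattern can be sketched as follows. Everything here is an assumption for illustration: `BufferedAggregator` is a hypothetical class, `onInsert` would be called once per INSERT record read from the stream, a plain map stands in for the aggregate table, and the flush is triggered by record count rather than by a timer to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stream consumer that buffers counts and flushes them in batches.
class BufferedAggregator {
    private final Map<String, Long> pending = new HashMap<>(); // not yet flushed
    private final Map<String, Long> aggregateTable;            // stand-in for the aggregate table
    private final int flushEvery;
    private int recordsSinceFlush = 0;

    BufferedAggregator(Map<String, Long> aggregateTable, int flushEvery) {
        if (flushEvery <= 0) {
            throw new IllegalArgumentException("flushEvery must be positive");
        }
        this.aggregateTable = aggregateTable;
        this.flushEvery = flushEvery;
    }

    // Called for each de-duplicated vote seen on the stream.
    void onInsert(String candidate) {
        pending.merge(candidate, 1L, Long::sum);
        recordsSinceFlush++;
        if (recordsSinceFlush >= flushEvery) {
            flush();
        }
    }

    // Adds the buffered counts to the aggregate table and clears the buffer.
    void flush() {
        for (Map.Entry<String, Long> entry : pending.entrySet()) {
            aggregateTable.merge(entry.getKey(), entry.getValue(), Long::sum);
        }
        pending.clear();
        recordsSinceFlush = 0;
    }

    public static void main(String[] args) {
        Map<String, Long> table = new HashMap<>();
        BufferedAggregator aggregator = new BufferedAggregator(table, 3);
        aggregator.onInsert("A");
        aggregator.onInsert("B");
        aggregator.onInsert("A"); // third record triggers a flush
        System.out.println(table);
    }
}
```

Batching trades a short delay in the visible totals for far fewer writes to the aggregate table.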
DynamoDB writes and Streams
Data: {Name:Destination}
Operation #  DynamoDB Operation     Data in DynamoDB  Data in Streams
1            PUT: {John:Tokyo}      {John:Tokyo}      PUT John Tokyo
2            UPDATE: {John:Mars}    {John:Mars}       UPDATE John Mars
3            UPDATE: {John:Pluto}   {John:Pluto}      UPDATE John Pluto
View types
View Type                   Destination
Old Image – Before update   Name = John, Destination = Mars
New Image – After update    Name = John, Destination = Pluto
Old and New Images          Name = John, Destination = Mars; Name = John, Destination = Pluto
Keys Only                   Name = John
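The effect of each view type on what a stream record carries can be sketched in memory. The class `ViewTypeDemo`, its `ViewType` enum, and the map-based record layout are assumptions for illustration and do not use the AWS SDK types; the example reuses the slide's update of John's destination from Mars to Pluto.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of which images each view type includes.
class ViewTypeDemo {
    enum ViewType { KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES }

    // Builds a simplified stream record for one update of an item.
    static Map<String, Object> streamRecord(ViewType viewType, String keyAttribute,
                                            Map<String, String> oldItem,
                                            Map<String, String> newItem) {
        Map<String, Object> record = new LinkedHashMap<>();
        // The key is included for every view type.
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put(keyAttribute, newItem.get(keyAttribute));
        record.put("Keys", keys);
        if (viewType == ViewType.OLD_IMAGE || viewType == ViewType.NEW_AND_OLD_IMAGES) {
            record.put("OldImage", oldItem); // item before the update
        }
        if (viewType == ViewType.NEW_IMAGE || viewType == ViewType.NEW_AND_OLD_IMAGES) {
            record.put("NewImage", newItem); // item after the update
        }
        return record;
    }

    static Map<String, String> item(String name, String destination) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("Name", name);
        item.put("Destination", destination);
        return item;
    }

    public static void main(String[] args) {
        Map<String, String> before = item("John", "Mars");
        Map<String, String> after = item("John", "Pluto");
        for (ViewType viewType : ViewType.values()) {
            System.out.println(viewType + " -> " + streamRecord(viewType, "Name", before, after));
        }
    }
}
```

A consumer that only needs to know which item changed can use keys only, while a consumer that computes deltas needs both images.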
Features of DynamoDB Streams
Streams Characteristics
• Each item update appears exactly once
• Records are strictly ordered by time
• Streams are asynchronous
Durability & high availability
High throughput consensus protocol
Replicated across multiple AZs
Managed Streams
Simply enable streams
Elasticity
Adjusts to table throughput
Performance
Designed for sub-second latency
Durability
Records available for 24 hours
How much does it cost?
• Free to turn it on
• First 2.5 million reads per month are free
• $0.20 per million reads after that
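The pricing on this slide works out as a simple calculation. The sketch below uses only the two figures quoted on the slide (2.5 million free reads per month, then $0.20 per million); the class and method names are hypothetical, and current AWS pricing may differ.

```java
// Hypothetical helper that applies the pricing figures quoted on the slide.
class StreamsReadCost {
    static final long FREE_READS_PER_MONTH = 2_500_000L;
    static final double PRICE_PER_MILLION_READS = 0.20; // US dollars

    // Returns the monthly charge in dollars for the given number of stream reads.
    static double monthlyCost(long readsPerMonth) {
        long billableReads = Math.max(0L, readsPerMonth - FREE_READS_PER_MONTH);
        return billableReads / 1_000_000.0 * PRICE_PER_MILLION_READS;
    }

    public static void main(String[] args) {
        // 12.5 million reads: 10 million are billable at $0.20 per million.
        System.out.println(monthlyCost(12_500_000L));
    }
}
```

For example, 12.5 million reads in a month leave 10 million billable reads, or about $2.00.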
What Customers Are Saying
Mapbox
Problem: Making mapping data highly available, even faster.
DynamoDB Streams use case: Cross-region replication
In their own words: “DynamoDB Streams unlocks cross-region replication - a critical feature that enabled us to fully migrate to DynamoDB. Cross-region replication allows us to distribute data across the world for redundancy and speed.” - Jake Pruitt, Software Developer, Mapbox
TOKYU HANDS
Problem: Augmenting Point of Sale system to react in real time to inventory and customer data
DynamoDB Streams use case: DynamoDB Triggers (DynamoDB Streams + AWS Lambda)
In their own words: “TOKYU HANDS is running in-store Point Of Sales system backed by DynamoDB and various AWS services. We really like full-managed services such as DynamoDB. I believe DynamoDB Streams would help us making the system more sophisticated and more automated.” - Yamazaki-san, Cloud Architect, TOKYU HANDS.
The local version of DynamoDB
• Desktop installable
• Development & testing
• Publicly available at DynamoDB.com
Now supports DynamoDB Streams
Cross-region replication app: http://tinyurl.com/DynamoDBCrossRegionReplication
Open sourced cross-region library is available: http://tinyurl.com/DynamoDBReplicationLibrary
Thank you!