Large-Scale ETL Data Flows With Data Pipeline and Dataduct
![Page 1: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/1.jpg)
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Sourabh Bajaj, Software Engineer, Coursera
October 2015
BDT404
Large-Scale ETL Data Flows With Data Pipeline and Dataduct
![Page 2: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/2.jpg)
What to Expect from the Session
● Learn about:
• How we use AWS Data Pipeline to manage ETL at Coursera
• Why we built Dataduct, an open source framework from Coursera for running pipelines
• How Dataduct enables developers to write their own ETL pipelines for their services
• Best practices for managing pipelines
• How to start using Dataduct for your own pipelines
![Page 3: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/3.jpg)
Coursera
![Page 6: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/6.jpg)
Education at Scale
● 15 million learners worldwide
● 1300 courses
● 120 partners
● 2.5 million course completions
![Page 7: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/7.jpg)
Data Warehousing at Coursera
Amazon Redshift
167 Amazon Redshift users
1200 EDW tables
22 source systems
6 dc1.8xlarge instances
30,000,000 queries run
![Page 8: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/8.jpg)
Data Flow
[Diagram: data flows among Amazon RDS, Cassandra, event feeds, Amazon EC2, Amazon EMR, Amazon S3, Amazon Redshift, BI applications, and third-party tools, coordinated by AWS Data Pipeline]
![Page 13: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/13.jpg)
ETL at Coursera
● 150 active pipelines
● 44 Dataduct developers
![Page 14: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/14.jpg)
Requirements for an ETL system
● Fault tolerance
● Scheduling
● Dependency management
● Resource management
● Monitoring
● Easy development
![Page 21: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/21.jpg)
Dataduct
![Page 23: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/23.jpg)
Dataduct
● Open source wrapper around AWS Data Pipeline
● It provides:
• Code reuse
• Extensibility
• Command line interface
• Staging environment support
• Dynamic updates
![Page 24: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/24.jpg)
Dataduct
● Repository
• https://github.com/coursera/dataduct
● Documentation
• http://dataduct.readthedocs.org/en/latest/
● Installation
• pip install dataduct
![Page 25: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/25.jpg)
Let’s build some pipelines
![Page 26: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/26.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
● Let’s start with a simple pipeline that pulls data from a relational store into Amazon Redshift
[Diagram: Amazon RDS, Amazon EC2, Amazon S3, and Amazon Redshift, coordinated by AWS Data Pipeline]
![Page 27: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/27.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
![Page 28: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/28.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
● Definition in YAML
● Steps
● Shared Config
● Visualization
● Overrides
● Reusable code
![Page 29: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/29.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift (Steps)
● Extract RDS
• Fetch data from Amazon RDS and output to Amazon S3
![Page 30: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/30.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift (Steps)
● Create-Load-Redshift
• Create table if it doesn’t exist and load data using COPY.
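The create-then-load idea can be sketched as two SQL statements built in Python. This is a minimal illustration, not Dataduct's implementation: the function name, the table, the S3 path, and the IAM role ARN are all hypothetical, and the sketch assumes CSV input and role-based credentials.

```python
def build_create_load_statements(table, columns, s3_path, iam_role):
    """Return (create_sql, copy_sql) for a create-if-missing, then COPY, load.

    All names here are illustrative assumptions, not Dataduct's API.
    """
    # Render "name type" pairs for the column list.
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    # Only creates the table on the first run; later runs are a no-op.
    create_sql = f"CREATE TABLE IF NOT EXISTS {table} ({column_sql});"
    # Redshift COPY reads the extracted files directly from Amazon S3.
    copy_sql = f"COPY {table} FROM '{s3_path}' IAM_ROLE '{iam_role}' FORMAT AS CSV;"
    return create_sql, copy_sql


create_sql, copy_sql = build_create_load_statements(
    table="staging.courses",  # hypothetical table
    columns=[("id", "INTEGER"), ("name", "VARCHAR(256)")],
    s3_path="s3://example-bucket/extracts/courses/",  # hypothetical path
    iam_role="arn:aws:iam::123456789012:role/example-redshift-role",  # hypothetical role
)
print(create_sql)
print(copy_sql)
```

The statements would then be run against the warehouse by whatever database client the pipeline already uses.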
![Page 31: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/31.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
● Upsert
• Update and insert into the production table from staging.
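One common way to get update-and-insert semantics is to delete the production rows that also appear in staging, then insert every staging row, all in one transaction. The sketch below shows that pattern with an in-memory SQLite database standing in for the warehouse. The table names and the helper are hypothetical, and Dataduct's own upsert step may work differently.

```python
import sqlite3


def upsert_from_staging(conn, prod, staging, key):
    """Replace prod rows whose key appears in staging, then insert all staging rows."""
    with conn:  # single transaction: commit on success, roll back on error
        conn.execute(
            f"DELETE FROM {prod} WHERE {key} IN (SELECT {key} FROM {staging})"
        )
        conn.execute(f"INSERT INTO {prod} SELECT * FROM {staging}")


# SQLite stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE courses_staging (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO courses VALUES (?, ?)", [(1, "old"), (2, "keep")])
conn.executemany(
    "INSERT INTO courses_staging VALUES (?, ?)", [(1, "new"), (3, "added")]
)

upsert_from_staging(conn, "courses", "courses_staging", "id")
rows = conn.execute("SELECT id, name FROM courses ORDER BY id").fetchall()
print(rows)
```

Row 1 is updated, row 2 is left alone, and row 3 is inserted.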
![Page 32: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/32.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift (Tasks)
● Bootstrap
• Fully automated
• Fetch latest binaries from Amazon S3 for Amazon EC2 / Amazon EMR
• Install any updated dependencies on the resource
• Make sure that the pipeline would run the latest version of code
![Page 33: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/33.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
● Quality assurance
• Primary key violations in the warehouse.
• Dropped rows: By comparing the number of rows.
• Corrupted rows: By comparing a sample set of rows.
• Automatically done within UPSERT
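The three checks above can be sketched in plain Python over rows represented as dictionaries. This is only an illustration of the idea: the function names are hypothetical, and in the talk these checks run inside the warehouse as part of the upsert rather than in application code.

```python
from collections import Counter


def primary_key_violations(rows, key):
    """Return the key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, count in counts.items() if count > 1)


def dropped_row_count(source_count, destination_count):
    """Return how many rows are missing from the destination (never negative)."""
    return max(source_count - destination_count, 0)


def corrupted_rows(source_sample, destination_rows, key):
    """Return sampled source rows that differ from, or are absent in, the destination."""
    by_key = {row[key]: row for row in destination_rows}
    return [row for row in source_sample if by_key.get(row[key]) != row]


source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
destination = [{"id": 1, "name": "a"}, {"id": 2, "name": "X"}]

duplicates = primary_key_violations([{"id": 1}, {"id": 1}, {"id": 2}], "id")
dropped = dropped_row_count(len(source), len(destination))
corrupted = corrupted_rows(source[:2], destination, "id")
print(duplicates, dropped, corrupted)
```

Note that `corrupted_rows` also flags sampled rows that are missing entirely, so it overlaps with the dropped-row check when the sample includes a dropped row.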
![Page 34: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/34.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
● Teardown
• Amazon SNS alerting for failed tasks
• Logging of task failures
• Monitoring
  • Run times
  • Retries
  • Machine health
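A rough sketch of how retries, failure logging, run-time tracking, and alerting fit together is shown below. The `alert` callback is a stand-in for an Amazon SNS publish call, and every name in the block is hypothetical. AWS Data Pipeline handles retries and notifications itself, so this only illustrates the behavior, not how Dataduct wires it up.

```python
import logging
import time

logger = logging.getLogger("etl")


def run_with_retries(task, max_retries, alert, delay_seconds=0.0):
    """Run task(), retrying on failure; alert once when retries are exhausted.

    Returns (result, attempts, elapsed_seconds) on success.
    """
    start = time.monotonic()
    for attempt in range(1, max_retries + 2):  # first try plus max_retries retries
        try:
            result = task()
            return result, attempt, time.monotonic() - start
        except Exception as error:
            logger.warning("attempt %d failed: %s", attempt, error)
            if attempt > max_retries:
                # In a real pipeline this would publish to an SNS topic.
                alert(f"task failed after {attempt} attempts: {error}")
                raise
            time.sleep(delay_seconds)


alerts = []
calls = {"n": 0}


def flaky():
    """Fail twice, then succeed, to exercise the retry path."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient")
    return "ok"


result, attempts, elapsed = run_with_retries(flaky, max_retries=2, alert=alerts.append)
print(result, attempts, alerts)
```

Returning the attempt count and elapsed time alongside the result gives a monitoring system the run-time and retry numbers listed above.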
![Page 35: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/35.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift (Config)
● Visualization
• Automatically generated by Dataduct
• Allows easy debugging
![Page 36: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/36.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
● Shared Config
• IAM roles
• AMI
• Security group
• Retries
• Custom steps
• Resource paths
![Page 37: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/37.jpg)
Pipeline 1: Amazon RDS → Amazon Redshift
● Custom steps
• Open-sourced steps can easily be shared across multiple pipelines
• You can also create new steps and add them using the config
![Page 38: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/38.jpg)
Deploying a pipeline
● Command line interface for all operations
usage: dataduct pipeline activate [-h] [-m MODE] [-f] [-t TIME_DELTA] [-b] pipeline_definitions [pipeline_definitions ...]
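For scripted deployments, the command can be assembled in Python before being handed to a process runner. The sketch below uses only the `-m MODE` option and the positional `pipeline_definitions` argument from the usage line above. The helper name, the mode value `staging`, and the definition path are assumptions for illustration.

```python
import shlex


def build_activate_command(pipeline_definitions, mode=None):
    """Build the argument list for `dataduct pipeline activate`."""
    if not pipeline_definitions:
        raise ValueError("at least one pipeline definition is required")
    command = ["dataduct", "pipeline", "activate"]
    if mode is not None:
        command += ["-m", mode]  # -m MODE, as shown in the usage line
    return command + list(pipeline_definitions)


command = build_activate_command(
    ["pipelines/rds_to_redshift.yaml"],  # hypothetical definition file
    mode="staging",  # hypothetical mode name
)
print(shlex.join(command))  # shlex.join needs Python 3.8+
```

Passing the resulting list to `subprocess.run(command, check=True)` would run it on a machine where Dataduct is installed and configured.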
![Page 39: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/39.jpg)
Pipeline 2: Cassandra → Amazon Redshift
[Diagram: Cassandra, Amazon S3, Amazon EMR (Aegisthus), Amazon S3, Amazon EMR (Scalding), and Amazon Redshift, coordinated by AWS Data Pipeline]
![Page 43: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/43.jpg)
Pipeline 2: Cassandra → Amazon Redshift
● Shell command activity to the rescue
● Priam backups of Cassandra to Amazon S3
● Aegisthus to parse SSTables into Avro dumps
● Scalding to process Aegisthus output
• Extend the base steps to create more patterns
![Page 44: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/44.jpg)
Pipeline 2: Cassandra → Amazon Redshift
![Page 45: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/45.jpg)
Pipeline 2: Cassandra → Amazon Redshift
● Custom steps
• Aegisthus
• Scalding
![Page 46: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/46.jpg)
Pipeline 2: Cassandra → Amazon Redshift
● EMR-Config overrides the defaults
![Page 47: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/47.jpg)
Pipeline 2: Cassandra → Amazon Redshift
● Multiple output nodes from transform step
![Page 48: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/48.jpg)
Pipeline 2: Cassandra → Amazon Redshift
● Bootstrap
• Save every pipeline definition
• Fetch the new JAR for the Amazon EMR jobs
• Specify the same Hadoop / Hive metastore installation
![Page 49: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/49.jpg)
Data products
● We’ve talked about getting data into the warehouse
● Common pattern:
• Wait for dependencies
• Computation inside Amazon Redshift to create derived tables
• Amazon EMR activities for more complex processing
• Load back into MySQL / Cassandra
• Product feature queries MySQL / Cassandra
● Used in recommendations, dashboards, and search
![Page 50: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/50.jpg)
Recommendations
● Objective
• Connecting the learner to the right content
● Use cases:
• Recommendations email
• Course discovery
• Reactivation of users
![Page 51: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/51.jpg)
Recommendations
● Computation inside Amazon Redshift to create derived tables for co-enrollments
● Amazon EMR job for model training
● Model file pushed to Amazon S3
● Prediction API uses the updated model file
● Contract between the prediction and the training layer is via model definition.
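To make the co-enrollment derived table concrete, the sketch below counts how many learners are enrolled in each pair of courses. In the talk this computation runs as SQL inside Amazon Redshift. The in-memory Python version, the function name, and the sample data are illustrative assumptions only.

```python
from collections import Counter
from itertools import combinations


def co_enrollment_counts(enrollments):
    """Count learners per course pair from (learner, course) records."""
    courses_by_learner = {}
    for learner, course in enrollments:
        # A set ignores duplicate enrollment records for the same course.
        courses_by_learner.setdefault(learner, set()).add(course)

    counts = Counter()
    for courses in courses_by_learner.values():
        # Sorting makes each pair canonical, so (a, b) and (b, a) are one key.
        for pair in combinations(sorted(courses), 2):
            counts[pair] += 1
    return counts


enrollments = [
    ("u1", "ml"), ("u1", "stats"),
    ("u2", "ml"), ("u2", "stats"), ("u2", "algo"),
    ("u3", "algo"),
]
counts = co_enrollment_counts(enrollments)
print(counts.most_common())
```

A table of pair counts like this is the kind of input a model-training job could consume downstream.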
![Page 52: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/52.jpg)
Internal Dashboard
● Objective
• Serve internal dashboards to create a data-driven culture
● Use cases:
• Key performance indicators for the company
• Track results for different A/B experiments
![Page 53: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/53.jpg)
Internal Dashboard
![Page 58: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/58.jpg)
Learnings
● Do:
• Monitoring (run times, retries, deploys, query times)
• Code should live in a library instead of scripts being passed to every pipeline
• Test environment in staging should mimic prod
• Shared library to democratize writing of ETL
• Use read replicas and backups
![Page 62: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/62.jpg)
Learnings
● Don’t:
• Same code passed to multiple pipelines as a script
• Non-version-controlled pipelines
• Huge pipelines instead of small, modular pipelines with dependencies
• Not catching resource timeouts or load delays
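The preference for small, modular pipelines with dependencies can be illustrated with a dependency graph and a topological sort, which yields an order in which every pipeline runs after the pipelines it depends on. The pipeline names are hypothetical, and the sketch uses Python's standard `graphlib` module (Python 3.9+) rather than anything from Dataduct or AWS Data Pipeline.

```python
from graphlib import CycleError, TopologicalSorter

# Each pipeline maps to the set of pipelines it must wait for (hypothetical names).
dependencies = {
    "load_enrollments": set(),
    "load_courses": set(),
    "co_enrollments": {"load_enrollments", "load_courses"},
    "recommendations_email": {"co_enrollments"},
}


def run_order(dependencies):
    """Return an order in which each pipeline follows all of its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(dependencies)
print(order)

# A circular dependency is reported instead of silently hanging.
try:
    run_order({"a": {"b"}, "b": {"a"}})
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

Keeping each pipeline small means a failure only blocks its own dependents rather than one large job.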
![Page 63: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/63.jpg)
Dataduct
● Code reuse
● Extensibility
● Command line interface
● Staging environment support
● Dynamic updates
![Page 64: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/64.jpg)
Dataduct
● Repository
• https://github.com/coursera/dataduct
● Documentation
• http://dataduct.readthedocs.org/en/latest/
● Installation
• pip install dataduct
![Page 65: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/65.jpg)
Questions?
Also, we are hiring! https://www.coursera.org/jobs
![Page 66: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/66.jpg)
Remember to complete your evaluations!
![Page 67: Large-Scale ETL Data Flows With Data Pipeline and Dataduct](https://reader034.vdocument.in/reader034/viewer/2022042723/586f77f61a28ab10258b69a9/html5/thumbnails/67.jpg)
Thank you!
Also, we are hiring! https://www.coursera.org/jobs
Sourabh Bajaj sb2nov @sb2nov