Betterment Engineering - Bootstrapping Data Intelligence - Agile R Dashboards
Post on 22-May-2015
Lightweight Dashboarding and Reporting Workflows with R
And the GREAT stack...
Yuriy Goldman, Jon Mauney
http://www.meetup.com/BigOnData
http://www.meetup.com/NYC-Open-Data
http://www.meetup.com/FinTech
@betterment #bootstrapbi
Team Polaris @ Betterment
Yuriy, Jon, Avi, Nick, Andrew
https://www.betterment.com/blog/2014/03/07/bootstrap-data-team/
Bootstrapping Business Intelligence
Get Here
Walk before you run...
Leverage existing skillset
Minimally Viable Product
Lean and Efficient
Agenda
• GREAT Stack in Layers
• Exercise a workflow for Development, Staging, Deployment
• Teamwork or Mingle - we will build an “almost realtime” Dashboard in R
GREAT Stack
GitHub: source control for scaffolding and scripts
R-Language: RStudio, knitr, rCharts, YAML
Engineering Elbow Grease: enabling automation, QA
AWS: Amazon Web Services - S3, EC2, RDS
Travis CI: continuous builds and deployments
Engineer Tested, Analyst Approved
Workflow Overview
AUTHORING
STAGING
DEPLOYING
Local Environment
project
YAML, MySQL
R-scripts, system scripts, deployment-fu
network file storage: s3::rwizflowy-bucket
/mnt/rwizflowy-bucket
(A) Set up Git
(B) Open project in RStudio
(C) Mount S3 Bucket and Symlink
(D) Test DB Connection
WiFi: BettermentGuest: guest, welcome to betterment
Complete environment for R development (But you knew that already)
Collaborative Source Code Management
Continuous Integration hooks
Post Deployment processing
AWS S3
Like Dropbox, but gives you dependable, static URLs to the files you save there (images, HTML pages).
ExpanDrive
Access AWS S3 Bucket as a local drive
Symlink /Volume/rwizflowy-bucket to /mnt/rwizflowy-bucket
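Before running any script, it can help to confirm from R that the mounted bucket is visible and writable. The sketch below uses only base R. The helper name `bucket_ready` is hypothetical, and the default path is the mount point named on the slide.

```r
# Hypothetical helper: TRUE when the bucket mount point exists and is writable.
bucket_ready <- function(mount_point = "/mnt/rwizflowy-bucket") {
  # file.access() returns 0 on success; mode = 2 asks for write permission.
  dir.exists(mount_point) && file.access(mount_point, mode = 2) == 0
}

if (!bucket_ready()) {
  message("Bucket is not mounted or not writable; check the mount and the symlink.")
}
```

A check like this at the top of a script fails early, instead of partway through rendering a plot.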
Sample data will come from MySQL.
YAML is syntax for a config file our R scripts will load at runtime. It can tell us how to connect to MySQL or where to output our plots. Any settings that can change between your Local environment and Server environment should be defined here.
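One way to keep Local and Server settings apart is a config file with one block per environment, with the script picking a block at runtime. The sketch below is an assumption about layout, not the project's actual config: the file name `config.yml`, the `R_ENV` environment variable, and the keys `db_host` and `output_dir` are all hypothetical. Reading the file relies on the `yaml` package's `yaml.load_file()`.

```r
# Pick one environment's settings out of a parsed config.
# `config` is a named list with one entry per environment (hypothetical layout).
settings_for <- function(config, env = Sys.getenv("R_ENV", unset = "local")) {
  if (!env %in% names(config)) {
    stop("No settings defined for environment: ", env)
  }
  config[[env]]
}

# Read the YAML file from disk, then select the active environment.
# Requires the 'yaml' package; the file name is an assumption.
load_settings <- function(path = "config.yml",
                          env = Sys.getenv("R_ENV", unset = "local")) {
  settings_for(yaml::yaml.load_file(path), env)
}

# Stand-in for what yaml.load_file() would return for a two-environment file.
example_config <- list(
  local  = list(db_host = "localhost",
                output_dir = "/mnt/rwizflowy-bucket/my-team"),
  server = list(db_host = "db.example.internal",
                output_dir = "/mnt/rwizflowy-bucket/my-team")
)

settings <- settings_for(example_config, "server")
settings$db_host
```

Because the mount point is the same in both environments, only the database settings need to differ in this example.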
Google Sites
Dashboards and reports are assembled in Google Sites, but any wiki will do as long as it supports IMG tags and iframes.
Team Exercise #1: Authoring
Set up Local Environment
Exercise a sample script
Output to S3
1. Get Code
2. Team Name
3. Connect to S3 and MySQL
4. Run Code, output to S3
Find a team captain for the Authoring Challenge!
● Reconvene in 20 minutes
● Take one of the samples and come up with an original graphic
● Team with the best ‘custom’ content that is web accessible (in the S3 bucket) gets t-shirts!
https://s3.amazonaws.com/rwizflowy-bucket/${team-name}
Staging
Add R output to a Wiki
https://s3.amazonaws.com/rwizflowy-bucket/${TEAMNAME}/${FILENAME.png}
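The URL pattern above can be generated in the same script that writes the image, so the link pasted into the wiki always matches the file on disk. This is a minimal sketch in base R. `s3_url` and `publish_plot` are hypothetical helper names, and it assumes the bucket is mounted at /mnt/rwizflowy-bucket and that its objects are publicly readable.

```r
# Build the public URL for a file saved under a team's folder in the bucket.
s3_url <- function(team, filename, bucket = "rwizflowy-bucket") {
  sprintf("https://s3.amazonaws.com/%s/%s/%s", bucket, team, filename)
}

# Hypothetical helper: render a plot to a PNG inside the mounted bucket and
# return the URL to embed in the wiki. `draw` is a function that draws the plot.
publish_plot <- function(draw, team, filename, mount = "/mnt/rwizflowy-bucket") {
  out_dir <- file.path(mount, team)
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  png(file.path(out_dir, filename), width = 800, height = 600)
  on.exit(dev.off())  # close the graphics device even if draw() fails
  draw()
  s3_url(team, filename)
}

s3_url("my-team", "signups.png")
# "https://s3.amazonaws.com/rwizflowy-bucket/my-team/signups.png"
```

For example, `publish_plot(function() hist(rnorm(100)), "my-team", "hist.png")` would write the histogram to the mounted bucket and return the address to use in an IMG tag.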
Local Environment
project
YAML, MySQL
R-scripts, system scripts, deployment-fu
network file storage: s3::rwizflowy-bucket
/mnt/rwizflowy-bucket
Server Environment
project
YAML, MySQL
R-scripts, system scripts, deployment-fu
network file storage: s3::rwizflowy-bucket
/mnt/rwizflowy-bucket via S3Fuse
cron scheduler
Integration Environment
Local: git push
hook, pull
build and deploy
Server
Travis-CI
Continuous integration and deployment tool. It connects to your GitHub account and listens for changes to your branches. We tell it what to do via the .travis.yml file in our project. Travis can execute unit and integration tests, and if all is A-OK, it can push to EC2. Awesomeness!
Deploying
Commit to Master
Sit back and enjoy the show...
Server Environment
YAML, MySQL
network file storage: s3::rwizflowy-bucket
/mnt/rwizflowy-bucket via S3Fuse
cron scheduler
Wrap Up, Q&A
AUTHORING
STAGING
DEPLOYING
https://github.com/ygoldman/rwizflowy
http://www.betterment.com/jobs
https://www.betterment.com/blog
Get a Betterment account: https://www.betterment.com/fintech