pfurlani@teamdrg - talend real-time open source data ...taming big data in healthcare paolo furlani....
TRANSCRIPT
Agenda
➢ Overview of DRG
➢ Initial Challenges
➢ The Platform that We Needed
➢ Overview of Snowflake and Talend
➢ Our way forward
Our mission:
Bringing together real-world data streams for algorithmically-driven responsiveness
Initial challenges
➢ Infrastructure was not ready for big data
➢ Building a big data team was costly and time-consuming
➢ Existing big data solutions were complicated & hard to integrate
➢ Pressure to move fast
The platform we needed
➢ A mature SQL Engine that works with big data
➢ Supports multi-terabyte data volumes
➢ Hosted in the Cloud
… But does it exist?
Data Warehouse Built for the Cloud...
The glue that holds it all together:
Talend!
• Quick to get up and running
• Scalable compute performance
• Minimal coding involved
• Keeping data gurus focused on building workflows instead of coding
Our current architecture
Talend Integration Cloud
Our way forward
➢ Snowflake: our big data engine
➢ AWS S3: Data Lake + Disaster Recovery
➢ Talend: Connecting the pieces together
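As a rough illustration of how the S3 data lake and Snowflake might connect, the sketch below builds a Snowflake `COPY INTO` statement that loads files from an external stage into a table. The helper name, table name, stage name, and path are all hypothetical. The slides do not show the actual load logic, which in this architecture is handled by Talend jobs rather than hand-written code.

```python
import re

# Hypothetical helper, not from the talk. It only builds the SQL text;
# running the statement would need a Snowflake connection and an external
# stage that already points at the S3 bucket.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*)*$")
_FILE_TYPES = {"CSV", "JSON", "PARQUET"}


def build_copy_into(table: str, stage: str, path: str = "", file_type: str = "CSV") -> str:
    """Return a COPY INTO statement that loads staged files into a table."""
    # Reject anything that is not a plain (optionally dotted) identifier,
    # so the names can be placed in the SQL text safely.
    if not _IDENT.match(table) or not _IDENT.match(stage):
        raise ValueError("table and stage must be plain SQL identifiers")
    file_type = file_type.upper()
    if file_type not in _FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type}")
    prefix = path.strip("/")
    location = f"@{stage}/{prefix}" if prefix else f"@{stage}"
    return f"COPY INTO {table} FROM {location} FILE_FORMAT = (TYPE = {file_type})"


if __name__ == "__main__":
    # Assumed names for illustration only.
    print(build_copy_into("claims_raw", "s3_data_lake", "claims/2017/"))
```

Keeping the raw files in S3 and loading them through a stage means the same bucket can serve as both the data lake and the recovery copy, which matches the roles listed above.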
Next up: Streamlining advanced machine learning with AWS EMR Spark Clusters