pfurlani@teamdrg - talend real-time open source data ...taming big data in healthcare paolo furlani....


Post on 22-May-2020

Page 1: pfurlani@teamDRG - Talend Real-Time Open Source Data ...Taming Big Data in Healthcare Paolo Furlani. Agenda Overview of DGR ... Infrastructure was not ready for big data Building a

[email protected]

Taming Big Data in Healthcare

Paolo Furlani

Agenda

➢ Overview of DRG

➢ Initial Challenges

➢ The Platform that We Needed

➢ Overview of Snowflake and Talend

➢ Our way forward

Our mission:

Bringing together real-world data streams for algorithmically-driven responsiveness

Initial challenges

➢ Infrastructure was not ready for big data

➢ Building a big data team was costly and time-consuming

➢ Existing big data solutions were complicated & hard to integrate

➢ Pressure to move fast

The platform we needed

➢ A mature SQL Engine that works with big data

➢ Supports multi-terabyte data volumes

➢ Hosted in the Cloud

… But does it exist?

Data Warehouse Built for the Cloud...

The glue that holds it all together:

Talend!

• Quick to get up and running

• Scalable compute performance

• Minimal coding involved

• Keeping data gurus focused on building workflows instead of coding

Our current architecture

Talend Integration Cloud

Our way forward

➢ Snowflake: our big data engine

➢ AWS S3: Data Lake + Disaster Recovery

➢ Talend: Connecting the pieces together
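
As a rough illustration of how these three pieces could fit together, the sketch below builds the kind of Snowflake COPY INTO statement that an integration job might issue to load CSV files from an S3-backed external stage. The slides do not show the actual jobs, so everything here is an assumption: the function name build_copy_statement, the table claims_raw, the stage s3_lake_stage, and the path layout are all hypothetical, and the sketch only produces the SQL text rather than connecting to Snowflake.

```python
import re

# Hypothetical helper: none of these names come from the slides.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")
_PATH = re.compile(r"[A-Za-z0-9_./-]*")


def build_copy_statement(table: str, stage: str, prefix: str,
                         skip_header: int = 1) -> str:
    """Return a Snowflake COPY INTO statement for CSV files on an external stage.

    Assumes the S3 data lake bucket is already registered in Snowflake as an
    external stage called `stage`.
    """
    # Reject anything that is not a plain identifier, since the values are
    # interpolated directly into SQL text.
    for name in (table, stage):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    path = prefix.strip("/")
    if not _PATH.fullmatch(path):
        raise ValueError(f"unsafe stage path: {prefix!r}")

    location = f"@{stage}/{path}/" if path else f"@{stage}/"
    return (
        f"COPY INTO {table} "
        f"FROM {location} "
        f"FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = {int(skip_header)})"
    )


if __name__ == "__main__":
    # Example: load one month of (hypothetical) files from the data lake.
    print(build_copy_statement("claims_raw", "s3_lake_stage", "claims/2020/05"))
```

In a setup like the one on this slide, S3 would hold the raw files, Snowflake would run the load and the downstream SQL, and Talend would schedule and sequence the steps; the statement above stands in for just one such step.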

Next up: Streamlining advanced machine learning with AWS EMR Spark Clusters
