Why you should care about synthetic data

Post on 24-Jan-2017

Category: Technology

WHY YOU SHOULD CARE ABOUT SYNTHETIC DATA

Presented by Real Impact Analytics

QUESTIONS? #SYNTHETICRIA

SYNTHETIC DATA: OVERVIEW

What is synthetic data?
Why use it?
How to create it?
Who creates it?
Conclusion

WHAT IS SYNTHETIC DATA?

Generic and artificial data used to mimic real-world data sets.

Protect people’s privacy: it substitutes for real data that contains personal information.

Test robustness and accuracy during software development.

Create an artificial base with features similar to those of real data sets.

WHY USE IT?

Use of actual data sets is no longer allowed, to protect everyone’s right to privacy.

To develop big data tools, we need realistic data sets for testing algorithms and easy data visualization.

Synthetic data, which is similar to real data sets and can be shared with the public, acts as a substitute without invading anyone’s privacy.

HOW TO CREATE IT?

1. DRAWING NUMBERS

OR

2. AGENT-BASED MODELLING

1. DRAWING NUMBERS

Observe the real-world statistical distributions in the original data, then reproduce artificial bases by drawing numbers from those distributions.
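As a minimal sketch of the "drawing numbers" idea, the snippet below fits an empirical distribution to a handful of observed values and draws new values from it by inverse-transform sampling. The function names (`fit_quantiles`, `draw_synthetic`) and the sample call durations are hypothetical and only illustrate the principle; the slides do not describe the method actually used.

```python
import random

def fit_quantiles(observed):
    """Sort the observed values; position in the list encodes the empirical quantile."""
    if len(observed) < 2:
        raise ValueError("need at least two observations")
    return sorted(observed)

def draw_synthetic(quantiles, n, seed=0):
    """Draw n synthetic values by inverse-transform sampling with linear interpolation."""
    rng = random.Random(seed)
    last = len(quantiles) - 1
    out = []
    for _ in range(n):
        pos = rng.random() * last   # uniform position along the sorted sample
        lo = int(pos)               # index of the observation just below that position
        hi = min(lo + 1, last)      # index of the observation just above it
        frac = pos - lo
        out.append(quantiles[lo] + frac * (quantiles[hi] - quantiles[lo]))
    return out

# Hypothetical "real" call durations in seconds (illustrative values only).
observed_durations = [12, 35, 35, 48, 60, 75, 120, 180, 240, 600]
synthetic = draw_synthetic(fit_quantiles(observed_durations), n=1000)
```

The synthetic values stay within the observed range and follow roughly the same shape, yet none of them has to correspond to an actual record.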

EXAMPLE: TELECOM DATA

Observe the real temporal distributions of texts and phone calls in CDR data (call detail records).

Create an artificial base of customers.

Simulate texts and phone calls with time stamps following the distributions. The goal is to simulate CDRs so they follow the same distribution as real CDRs.
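The three steps of the telecom example could be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the observed events, the record fields (`caller`, `callee`, `type`, `timestamp`) and the function `simulate_cdrs` are all hypothetical, and real CDRs carry far more detail.

```python
import random
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical observed CDR events as (event_type, hour_of_day); illustrative only.
observed = [("call", 9), ("sms", 9), ("call", 12), ("sms", 18),
            ("call", 18), ("sms", 20), ("sms", 20), ("call", 21)]

def simulate_cdrs(observed, n_customers, n_events, day, seed=0):
    """Simulate CDR-like records whose (type, hour) frequencies follow the observed ones."""
    rng = random.Random(seed)
    counts = Counter(observed)                  # step 1: observed temporal distribution
    keys = list(counts)
    weights = [counts[k] for k in keys]
    customers = [f"cust_{i:04d}" for i in range(n_customers)]  # step 2: artificial base
    records = []
    for _ in range(n_events):                   # step 3: draw events from the distribution
        event_type, hour = rng.choices(keys, weights=weights)[0]
        caller, callee = rng.sample(customers, 2)   # two distinct artificial customers
        ts = day + timedelta(hours=hour, seconds=rng.randrange(3600))
        records.append({"caller": caller, "callee": callee,
                        "type": event_type, "timestamp": ts})
    records.sort(key=lambda r: r["timestamp"])
    return records

cdrs = simulate_cdrs(observed, n_customers=50, n_events=500, day=datetime(2017, 1, 24))
```

With enough simulated events, the share of texts and calls per hour approaches the observed shares, while every customer identifier is artificial.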

2. AGENT-BASED MODELLING

Create physical models that explain the observed behaviour, then use these models to generate generic, random data.

EXAMPLE: TELECOM DATA

Analyze real data from texts and phone calls, identifying temporal and behavioural patterns.

Create a physical model based on those observations and on how they evolve over time.

This model simulates texts and phone calls over time as they would occur in real life.
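A toy agent-based model in this spirit might look like the sketch below. Every modelling choice here is an assumption made for illustration: the `Agent` fields, the day/night rhythm in `hourly_rhythm`, the fixed contact lists and the hourly time step are not taken from the slides.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Agent:
    agent_id: int
    activity: float      # baseline probability of contacting someone in a given hour
    text_share: float    # fraction of this agent's events that are texts
    contacts: list = field(default_factory=list)

def hourly_rhythm(hour):
    """Assumed daily pattern: quiet at night, busier in the daytime and evening."""
    if hour < 7:
        return 0.1
    if hour < 18:
        return 1.0
    return 1.5 if hour < 22 else 0.5

def simulate(n_agents=20, n_hours=48, seed=0):
    """Step the population hour by hour and record (hour, source, target, kind) events."""
    rng = random.Random(seed)
    agents = [Agent(i, rng.uniform(0.05, 0.3), rng.uniform(0.3, 0.8))
              for i in range(n_agents)]
    for a in agents:
        others = [b.agent_id for b in agents if b.agent_id != a.agent_id]
        a.contacts = rng.sample(others, min(5, len(others)))
    events = []
    for t in range(n_hours):
        for a in agents:
            p = min(1.0, a.activity * hourly_rhythm(t % 24))
            if a.contacts and rng.random() < p:
                kind = "sms" if rng.random() < a.text_share else "call"
                events.append((t, a.agent_id, rng.choice(a.contacts), kind))
    return events

events = simulate()
```

Unlike drawing numbers from a fixed distribution, the temporal pattern here emerges from each agent's behaviour rules, so changing a rule (for example the evening weight) changes the generated data accordingly.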

WHO CREATES IT?

IN-HOUSE DEVELOPMENT

OR

AD-HOC DEVELOPMENT

DEPENDING ON THE COMPLEXITY OF THE DATA SET

CONCLUSION

Your ability to generate realistic synthetic data is essential to developing algorithms and software that will maximize the value of your big data tools, without transgressing privacy laws.

info@realimpactanalytics.com

@RIAnalytics

realimpactanalytics.com

@RealImpactAnalytics

Real Impact Analytics

Real Impact Analytics (RIA) taps into rich telecom data to capture its value. The data is turned into action with big data apps that ease our clients’ day-to-day work.

RIA provides guided and predictive analytics through proprietary software. Five of the top ten global telecom operators trust us to enhance customer experience through Customer Value Management, and optimize daily operations with our Commercial Excellence apps.

To learn how Real Impact Analytics can create the same value for you, visit realimpactanalytics.com.

About Us
