cedric.cnam.fr - f. camillo–i. d’attoma integration of...

Post on 19-Jul-2020

f. camillo – i. d’attoma

Integration of different data collection techniques using a multivariate counterfactual approach

Furio Camillo

Alma Mater Studiorum

Università di Bologna

Column label   Description                                    Comments
id             identifier of each sales point                 4517 sales points
pr1            sales value of product category n.1            in millions of euros
pr2            sales value of product category n.2            in millions of euros
pr3            sales value of product category n.3            in millions of euros
pr4            sales value of product category n.4            in millions of euros
pr5            sales value of product category n.5            in millions of euros
pr6            sales value of product category n.6            in millions of euros
pr7            sales value of product category n.7            in millions of euros
pr8            sales value of product category n.8            in millions of euros
pr9            sales value of product category n.9            in millions of euros
treat          treatment indicator of a marketing campaign    1 = treated; 2 = not treated
outcome        economic return                                in millions of euros
x1             structural variable n.1                        categorical variable
x2             structural variable n.2                        categorical variable
x3             structural variable n.3                        categorical variable
x4             structural variable n.4                        categorical variable
x5             structural variable n.5                        categorical variable

Y (outcome) = f(Pr1-Pr9 (products), X1-X5 (structural), T (treatment)) + error

Data exploration (1)

Data exploration (2)

Treated / Not treated

Simulated effect = 32 million euros
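As a point of comparison for the simulated effect, the simplest estimate is the raw difference in mean outcome between treated and untreated sales points. The sketch below assumes records keyed by the `treat` and `outcome` columns of the data dictionary; the function name `naive_effect` and all the values are hypothetical and are not taken from the slides. Such a difference matches the true effect only when the two groups are comparable on the covariates, which is the reason for checking balance before comparing.

```python
from statistics import mean

# Hypothetical records: treat is 1 = treated, 2 = not treated,
# outcome is in millions of euros. Values are invented for illustration.
records = [
    {"id": 1, "treat": 1, "outcome": 120.0},
    {"id": 2, "treat": 1, "outcome": 110.0},
    {"id": 3, "treat": 1, "outcome": 130.0},
    {"id": 4, "treat": 2, "outcome": 90.0},
    {"id": 5, "treat": 2, "outcome": 100.0},
    {"id": 6, "treat": 2, "outcome": 80.0},
]


def naive_effect(rows):
    """Difference in mean outcome: treated minus not treated."""
    treated = [r["outcome"] for r in rows if r["treat"] == 1]
    untreated = [r["outcome"] for r in rows if r["treat"] == 2]
    if not treated or not untreated:
        raise ValueError("both groups must contain at least one unit")
    return mean(treated) - mean(untreated)


print(naive_effect(records))
```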

Hypothesis, method and available data about a web panel

X: pre-treatment information
• Information about the past
• Information on the family
• Information on the social class
• Geo-demographic information
• Value system and lifestyle

T: treatment
• 2 different data collection tools: CATI and CAWI

Y: post-treatment variables (outcomes), the variables of interest in the survey
• Opinions
• Motivations
• Aspirations
• Needs
• Behaviours

The Data Mining approach

• Researchers and analysts do not need any a priori hypothesis about the distribution of the variables
• High-dimensional data can be analyzed easily
• DM algorithms aim to minimize the complexity, time and cost of processing
• It generates results that are easy to understand

The data miner produces a “black box”, that is, an automatic tool that aims to meet decision makers' daily requirements, but in a flexible way (U. Fayyad, 2001)

The main reference

Type of job-contract: oral list (CATI) or written list (CAWI)?

Where c is the generic cluster of the (multivariate) DM approach

The Web in the future: different access points, different access tools

A web survey about the Italian identity: 15 items (1-10 point scale)

Listed below are a number of issues that characterize Italy. For each of them, please tell me how representative you think it is of the national unity of our country, Italy.

To answer, use a rating from 1 to 10, where 1 means it does not represent the national unity of our country at all and 10 means it very much represents the national identity of our country.

The golden question (active)

1. Artistic and cultural heritage
2. …..
3. The “mafia”
4. …
15. The opera

A non-linear re-coding method (MG-Strategy), endogenous for each respondent

[Figure: original scale (min, mean, max) against recoded scale (-1, 0, +1)]

Ref: F. Camillo, MicroMacro Marketing, 1999/1, Il Mulino
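The slide gives only the anchor points of the re-coding, so the following is a minimal sketch under an explicit assumption: each respondent's own minimum, mean and maximum ratings are mapped to -1, 0 and +1, with linear interpolation on either side of the mean. The piecewise-linear form and the function name `recode_respondent` are illustrative choices, not the published MG-Strategy definition.

```python
from statistics import mean


def recode_respondent(ratings):
    """Recode one respondent's ratings using that respondent's own scale.

    Assumed mapping (not confirmed by the source): min -> -1, mean -> 0,
    max -> +1, linear in between on each side of the mean.
    """
    lo, hi, mid = min(ratings), max(ratings), mean(ratings)
    recoded = []
    for r in ratings:
        if r > mid:
            # r > mid implies hi > mid, so the denominator is non-zero.
            recoded.append((r - mid) / (hi - mid))
        elif r < mid:
            # r < mid implies lo < mid, so the denominator is non-zero.
            recoded.append((r - mid) / (mid - lo))
        else:
            # Covers ratings equal to the mean, including a flat profile.
            recoded.append(0.0)
    return recoded


# Hypothetical answers of one respondent on the 1-10 scale.
print(recode_respondent([1, 5, 10, 4]))
```

Because the anchors are computed per respondent, two people who use different parts of the 1-10 scale end up on a common -1 to +1 range, which is the sense in which the re-coding is endogenous.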

The outcome variable: a p-cluster segmentation

Covariate sub-space   T1: smartphone   T2: usual PC   Balancing
sub-space 1                                           yes
sub-space 2                                           yes
sub-space 3                                           no
sub-space 4                                           yes
sub-space 5                                           yes
sub-space 6                                           no
...                                                   ...
sub-space n                                           yes

Cluster distribution by treatment, for sub-space 1:

      Cluster 1   Cluster 2   Cluster 3   ...   Cluster p
T1    0.5         0.3         0.1         ...   0.03
T2    0.47        0.2         0.09        ...   0.1

By comparing the two distributions with a chi-square test, it is possible to evaluate, within each sub-space, the impact of the treatment (smartphone use versus no smartphone use).
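The comparison within one sub-space can be sketched as a Pearson chi-square test of homogeneity on the treatment-by-cluster table. The test needs counts rather than proportions, and the slide does not report group sizes, so the counts below are hypothetical; the function names are illustrative as well. The code uses only the standard library and assumes every cluster contains at least one respondent.

```python
import math


def chi_square_stat(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the cluster distribution were the same in both groups.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), starting from
    Q(1, z) = exp(-z) for even df or Q(1/2, z) = erfc(sqrt(z)) for odd df.
    """
    z = x / 2.0
    if df % 2 == 0:
        q, s = math.exp(-z), 1.0
    else:
        q, s = math.erfc(math.sqrt(z)), 0.5
    while s < df / 2.0:
        q += z ** s * math.exp(-z) / math.gamma(s + 1.0)
        s += 1.0
    return q


# Hypothetical counts for one sub-space: rows are T1 (smartphone) and
# T2 (usual PC), columns are clusters of the outcome segmentation.
counts = [[50, 30, 20],
          [40, 25, 35]]
stat, df = chi_square_stat(counts)
print(stat, df, chi_square_sf(stat, df))
```

A small upper-tail probability in a balanced sub-space suggests that the cluster distribution differs between the two access tools there; in unbalanced sub-spaces the comparison is not interpretable as a treatment effect.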

Benefits of the proposed strategy in “The Internet age”

• Why will the Internet be important in the future?
• Data-driven approach
• Semi-automatic (massive use)
• “Multivariate” use of the information
• Work in progress: SAS software, qualitative and quantitative covariates (Rubin's science matrix), application of Bozdogan's ICOMP approach for a more automatic definition of the stopping rule.

furio.camillo@unibo.it – ida.dattoma2@unibo.it
