Automating the Formalization of Product Comparison Matrices (ASE 2014)
![Page 1: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/1.jpg)
Automating the Formalization of
Product Comparison Matrices
Guillaume Bécan, Nicolas Sannier, Mathieu Acher,
Olivier Barais, Arnaud Blouin, Benoit Baudry
![Page 2: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/2.jpg)
Product lines everywhere
Automating the Formalization of Product Comparison Matrices - 2
![Page 3: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/3.jpg)
Product Comparison Matrices (PCMs)
![Page 4: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/4.jpg)
Services on top of PCMs
Edit
Compare
Visualize
Filter
Rank
Merge
Configure
Multi-objective optimization
![Page 5: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/5.jpg)
Problem
Edit
Compare
Visualize
…
Information is:
• Uncontrolled
• Heterogeneous
• Ambiguous
[Sannier et al., ASE 2013]
| [[Acer Inc.|Acer]]
| [[Acer beTouch E110|beTouch E110]]
| {{dts|format=dmy|2010|2|15}}
| 1.5
| [[320x240|320x240 QVGA]]
| {{convert|2.8|in|mm|abbr=on}}
| Touch, accelerometer
|
* [[GSM]]/GPRS/[[Enhanced Data Rates for GSM Evolution|EDGE]]
* [[Universal Mobile Telecommunications System|UMTS]] 850 1900
* CSD
![Page 6: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/6.jpg)
Problem
Common language
Transformation
Edit
Compare
Visualize
…
• How to formalize data contained in natural language PCMs?
• How to automate the formalization of PCMs?
• What tools and services can be built on top of this formalization?
![Page 7: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/7.jpg)
Contributions
1. Design of a metamodel for product comparison matrices
2. Automated techniques for formalizing raw data into a formalized product comparison matrix model
3. Evaluation on 30,000+ cells from Wikipedia
![Page 8: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/8.jpg)
Metamodeling driven by (lots of) data
Designed for data (lots of examples + personal experience)
Designed for applications (edit, compare, visualize…)
Objectives:
• A metamodel that can contain every PCM of Wikipedia
• A metamodel for building services on top of these PCMs
Categorization of patterns (ASE 2013)
Refinement of the patterns
Realization of the metamodel (2 intensive weeks)
Formalizing some examples to adjust the metamodel
Driven by statistics and manual review of lots of PCMs
[Diagram labels: New concept, Statistics, Brainstorming]
Working on the metamodel since February 2013
300+ PCMs – 300,000 cells
Numerous domains
Manual review of 50 PCMs (thousands of cells)
Statistics on all PCMs
Analysis of Wikipedia syntax for tables
Automated transformation of all PCMs to PCM models
![Page 9: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/9.jpg)
PCM metamodel
![Page 10: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/10.jpg)
PCM metamodel
Structure of a PCM
![Page 11: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/11.jpg)
PCM metamodel
Feature/Product oriented
![Page 12: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/12.jpg)
PCM metamodel
Formalized interpretation of a cell
Data types: Boolean, Integer, Real
Special values: Unknown, Empty, Inconsistent, Partial
[Figure labels: raw string, formalized integer]
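The cell structure listed on this slide could be sketched roughly as follows. This is an illustrative Python sketch, not the authors' metamodel: the names `Cell`, `raw`, `interpretation`, `Special`, and the `*Value` classes are assumptions; only the data types and special values come from the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Special(Enum):
    """Special values named on the slide (modeled here as an enum, by assumption)."""
    UNKNOWN = "Unknown"
    EMPTY = "Empty"
    INCONSISTENT = "Inconsistent"
    PARTIAL = "Partial"


@dataclass(frozen=True)
class BooleanValue:
    value: bool


@dataclass(frozen=True)
class IntegerValue:
    value: int


@dataclass(frozen=True)
class RealValue:
    value: float


# A formalized interpretation is either a typed value or a special value.
Interpretation = Union[BooleanValue, IntegerValue, RealValue, Special]


@dataclass
class Cell:
    raw: str                        # text as written by the contributor
    interpretation: Interpretation  # formalized reading of that text


# Hypothetical example: a raw string formalized as an integer.
cell = Cell(raw="100", interpretation=IntegerValue(100))
```

Keeping the raw string next to its interpretation means a wrong interpretation can be corrected later without losing what the contributor originally wrote.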
![Page 13: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/13.jpg)
Contributions
1. Design of a metamodel for product comparison matrices
2. Automated techniques for formalizing raw data into a formalized product comparison matrix model
3. Evaluation on 30,000+ cells from Wikipedia
![Page 14: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/14.jpg)
Approach
Parsing: transform a PCM artefact into a PCM model
[Figure: processing pipeline. PCM → parsing → PCM model → preprocessing → PCM model → extracting information → PCM model → exploiting → services, with the PCM metamodel underlying the PCM models]
| [[Acer Inc.|Acer]]
| [[Acer beTouch E110|beTouch E110]]
| {{dts|format=dmy|2010|2|15}}
| 1.5
| [[320x240|320x240 QVGA]]
| {{convert|2.8|in|mm|abbr=on}}
| Touch, accelerometer
|
* [[GSM]]/GPRS/[[Enhanced Data Rates for GSM Evolution|EDGE]]
* [[Universal Mobile Telecommunications System|UMTS]] 850 1900
* CSD
Enable the development of a generic formalization process
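To make the parsing step concrete, here is a minimal Python sketch of how one wikitext row like the one above might be split into cells and its links reduced to their display text. It is not the authors' parser: `split_row` and `link_text` are hypothetical helpers, the input is assumed to be a single row without `|-` separators or `||` inline cells, and templates such as `{{dts|...}}` are deliberately left untouched.

```python
import re

# [[target|label]] -> label, [[target]] -> target
WIKILINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]")


def link_text(cell: str) -> str:
    """Replace wiki links in a cell by their visible text."""
    return WIKILINK.sub(r"\1", cell).strip()


def split_row(row_source: str) -> list:
    """Split one wikitext table row into raw cell strings.

    A line starting with '|' opens a new cell; any other line
    (for example a '*' list item) continues the previous cell.
    """
    cells = []
    for line in row_source.splitlines():
        if line.startswith("|"):
            cells.append(line[1:].strip())
        elif cells:
            cells[-1] = (cells[-1] + "\n" + line).strip()
    return cells


row = "| [[Acer Inc.|Acer]]\n| 1.5\n|\n* [[GSM]]\n* CSD"
cells = [link_text(c) for c in split_row(row)]
```

In this sketch `cells` holds three entries: the manufacturer name, the number `1.5` as text, and a multi-line list cell.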
![Page 15: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/15.jpg)
Approach
Preprocessing:
Contributors cannot be trusted: missing cells, headers everywhere
We have to normalize the matrix and identify headers
Default strategy: first line and first column are headers
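The normalization and the default header strategy might look like the sketch below. It is an assumption-laden illustration rather than the authors' implementation: `normalize` and `split_headers` are hypothetical names, missing cells are padded with empty strings, and the sample data is made up.

```python
def normalize(rows, filler=""):
    """Pad short rows so that every row has the same number of cells."""
    width = max((len(r) for r in rows), default=0)
    return [list(r) + [filler] * (width - len(r)) for r in rows]


def split_headers(matrix):
    """Default strategy: the first line and the first column are headers."""
    column_headers = matrix[0][1:]
    row_headers = [row[0] for row in matrix[1:]]
    body = [row[1:] for row in matrix[1:]]
    return column_headers, row_headers, body


# Made-up matrix with a missing cell in the second row.
raw = [
    ["Model", "Year", "Screen"],
    ["E110", "2010"],
    ["X1", "2011", "3.5"],
]
matrix = normalize(raw)
column_headers, row_headers, body = split_headers(matrix)
```

Whether the column headers are features or products depends on the orientation of the PCM, so the sketch keeps the neutral names `column_headers` and `row_headers`.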
[Figure: processing pipeline (parsing → preprocessing → extracting information → exploiting → services)]
![Page 16: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/16.jpg)
Approach
Extracting information:
• Identify features and products
• Interpret cells based on a set of syntactic rules (regex)
[Figure: processing pipeline (parsing → preprocessing → extracting information → exploiting → services)]
List of rules:
…
"\d+" => Integer
…
match Integer(100)
Same process as the metamodel for creating the rules
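A rule list of this kind can be sketched in Python as an ordered table of regular expressions, each paired with a constructor for the interpreted value. Only the `\d+` to Integer rule appears on the slide; the other patterns (empty text, `?`, `yes`/`no`, decimals) and the names `RULES` and `interpret` are assumptions added for illustration.

```python
import re

# Ordered list of (pattern, builder) pairs; the first full match wins.
RULES = [
    (re.compile(r"\s*"), lambda m: ("Empty", None)),                  # assumed rule
    (re.compile(r"\?|unknown|n/a", re.I), lambda m: ("Unknown", None)),  # assumed rule
    (re.compile(r"yes|no", re.I),
     lambda m: ("Boolean", m.group().lower() == "yes")),              # assumed rule
    (re.compile(r"\d+"), lambda m: ("Integer", int(m.group()))),      # rule from the slide
    (re.compile(r"\d+\.\d+"), lambda m: ("Real", float(m.group()))),  # assumed rule
]


def interpret(raw):
    """Return (type name, value) for a raw cell, or None if no rule matches."""
    text = raw.strip()
    for pattern, build in RULES:
        match = pattern.fullmatch(text)
        if match:
            return build(match)
    return None


result = interpret("100")
```

Using `fullmatch` keeps a cell such as `1.5` from being read as the integer `1`; cells that match no rule return `None` so that they can be flagged for manual review.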
![Page 17: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/17.jpg)
Contributions
1. Design of a metamodel for product comparison matrices
2. Automated techniques for formalizing raw data into a formalized product comparison matrix model
3. Evaluation on 30,000+ cells from Wikipedia
![Page 18: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/18.jpg)
Evaluation
Experimental settings:
• 75 PCMs from Wikipedia
• Headers specified manually
• Automated extraction of information
[Figure: processing pipeline (parsing → preprocessing → extracting information → exploiting → services), annotated with RQ1, RQ2, and RQ3]
![Page 19: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/19.jpg)
Evaluation
Task: check interpretation of each cell (30,000+)
• Validate
• Correct it with an existing concept
• Correct it with a new concept
• I don’t know / there is no interpretation
20 evaluators
Online editor
![Page 20: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/20.jpg)
Evaluation
Metrics:
• Number of valid cells
• Number of cells corrected with concepts from the metamodel
• Number of cells corrected with new concepts
• List of new concepts
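The counting metrics above could be computed from the evaluators' verdicts along these lines. This is a hedged sketch: the verdict labels, the `summarize` function, and the ten-cell sample are hypothetical and are not the study's data.

```python
from collections import Counter

# Assumed labels for the four possible evaluator answers.
VERDICTS = ("valid", "corrected_existing", "corrected_new", "unknown")


def summarize(verdicts):
    """Return {verdict: (count, percentage of all evaluated cells)}."""
    counts = Counter(verdicts)
    total = sum(counts.values())
    return {
        v: (counts[v], 100.0 * counts[v] / total if total else 0.0)
        for v in VERDICTS
    }


# Made-up sample of ten evaluated cells.
sample = ["valid"] * 8 + ["corrected_existing", "corrected_new"]
summary = summarize(sample)
```

The list of new concepts would be collected separately, for instance by gathering the free-text proposals attached to the cells corrected with a new concept.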
![Page 21: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/21.jpg)
Evaluation
RQ1: To what extent can PCMs be formalized?
93.11% of the cells are valid
2.61% are corrected with concepts from the metamodel
4.28% are invalid and the evaluators proposed a new concept
• Dates
• Dimensions and units
• Versions
Solution:
• Add corresponding data types to the
metamodel
• Create new rules for interpreting cells
![Page 22: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/22.jpg)
Evaluation
RQ2: To what extent can the formalization be automated?
93.11% of the cells are correctly formalized
Formalization errors may arise from 4 main areas:
• Overlapping concepts (e.g. what does an empty cell mean?)
• Missing concepts (e.g. dates, versions…)
• Missing interpretation rules
• Bad rules
![Page 23: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/23.jpg)
Evaluation
RQ3: What services can be built on top of formalized PCMs?
Editing and formalizing PCMs
Warnings during editing (inconsistent cells)
Filtering capabilities
Translate PCMs to variability models
The metamodel provides
• Feature/product oriented perspective
• Clear semantics
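As an illustration of why typed cells help with filtering, the sketch below selects products by a numeric feature. It assumes a simplified, product-oriented view of a formalized PCM as nested dictionaries; `filter_products`, the feature names, and the three products are all hypothetical.

```python
from typing import Callable, Dict, List, Union

# None stands in for Unknown or Empty cells in this simplified view.
Value = Union[bool, int, float, None]


def filter_products(pcm: Dict[str, Dict[str, Value]],
                    feature: str,
                    keep: Callable[[Value], bool]) -> List[str]:
    """Return the names of products whose value for `feature` satisfies `keep`."""
    return [name for name, features in pcm.items() if keep(features.get(feature))]


def at_least_3_inches(value: Value) -> bool:
    # Booleans are excluded explicitly because bool is a subclass of int.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and value >= 3.0


# Made-up formalized PCM.
pcm = {
    "A": {"screen_in": 2.8, "touch": True},
    "B": {"screen_in": 4.0, "touch": True},
    "C": {"screen_in": None, "touch": False},
}
large_screens = filter_products(pcm, "screen_in", at_least_3_inches)
```

On raw strings the same query would first have to decide what values like an empty cell mean; with formalized values that decision is already made when the cell is interpreted.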
![Page 24: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/24.jpg)
Results of the evaluation
We now have a common language for PCMs
• validated by humans
• validated by transformation
• validated by the editor
A large proportion of the formalization can be automated
BUT a human is still necessary
Good news: the editor can help formalize the data
![Page 25: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/25.jpg)
Future work
Universal editor
Support large datasets
Community of PCM contributors
Synchronization with Wikipedia
![Page 26: Automating the Formalization of Product Comparison Matrices](https://reader034.vdocument.in/reader034/viewer/2022052505/556225f1d8b42ad44d8b5154/html5/thumbnails/26.jpg)
Questions?
[Figure: processing pipeline (parsing → preprocessing → extracting information → exploiting → services)]