![Page 1: 1 Solving ILP Problems in the EELA infrastructure Inês Dutra Departamento de Ciência de Computadores Universidade do Porto, Portugal](https://reader036.vdocument.in/reader036/viewer/2022062317/5a4d1af37f8b9ab05997fca3/html5/thumbnails/1.jpg)
Solving ILP Problems in the EELA infrastructure
Inês Dutra
Departamento de Ciência de Computadores
Universidade do Porto, Portugal
Outline
• Introduction
  – ILP
  – Examples
  – Motivation
• Experiments
• Conclusions
• Future Work
Introduction
• EELA selected application
• Task 3.3: additional applications
Introduction
• What is ILP?
  – It is NOT Instruction Level Parallelism
  – It is NOT Integer Linear Programming
• So, what is it?
• .......
Introduction
• It is Inductive Logic Programming
  – Data mining
  – Machine learning
  – Knowledge/information extraction
• Where:
  – Given:
    • Set of observations (positive and negative)
    • Background knowledge (descriptions)
    • Language bias
  – Find:
    • A hypothesis (in a first-order language) that best explains the positive observations and covers none of the negatives.
Introduction
• Advantages:
  – Use of an understandable description language
  – Relational knowledge
Introduction: example
[Figure: TRAINS GOING EAST vs. TRAINS GOING WEST]
Introduction: example
short(car_12). closed(car_12).
long(car_11). long(car_13). short(car_14).
open_car(car_11). open_car(car_13). open_car(car_14).
shape(car_11,rectangle). shape(car_12,rectangle).
shape(car_13,rectangle). shape(car_14,rectangle).
load(car_11,rectangle,3). load(car_12,triangle,1).
load(car_13,hexagon,1). load(car_14,circle,1).
wheels(car_11,2). wheels(car_12,2).
wheels(car_13,3). wheels(car_14,2).
has_car(east1,car_11). has_car(east1,car_12).
has_car(east1,car_13). has_car(east1,car_14).
Introduction: example
eastbound(T) IF has_car(T,C) AND short(C) AND closed(C)
[Figure: TRAINS GOING EAST vs. TRAINS GOING WEST]
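To make the rule concrete, here is a minimal Python sketch that checks it against the facts listed earlier for train east1. Python is not used in the talk; the set and dictionary encoding and the function name `eastbound` are assumptions made only for illustration, and an ILP system such as Aleph would evaluate the rule directly in Prolog.

```python
# Facts for train east1, transcribed from the Prolog facts on the earlier slide.
has_car = {"east1": ["car_11", "car_12", "car_13", "car_14"]}
short = {"car_12", "car_14"}
closed = {"car_12"}


def eastbound(train):
    """eastbound(T) IF has_car(T,C) AND short(C) AND closed(C)."""
    return any(car in short and car in closed for car in has_car.get(train, []))


# Cars of east1 that satisfy the body of the rule.
covering_cars = [car for car in has_car["east1"] if car in short and car in closed]

print(eastbound("east1"), covering_cars)
```

The rule covers east1 because a single car (car_12) is both short and closed; a train with no such car is not covered.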
Another less “toyish” example: extracting knowledge from mammograms
is_malignant(A) if 'BIRADS_category'(A,b5), 'MassPAO'(A,present), 'Age'(A,age6570), previous_finding(A,B,C), 'MassesShape'(B,none), 'Calc_Punctate'(B,notPresent), previous_finding(A,C), 'BIRADS_category'(C,b3).
This rule states that finding (A) IS malignant IF it is:
classified as BI-RADS 5 AND
had a mass present
in a patient who:
  was between the ages of 65 and 70
  had two prior mammograms (B, C)
and prior mammogram (B):
  had no mass shape described
  had no punctate calcifications
and prior mammogram (C) was classified as BI-RADS 3
Introduction: Motivation
• Applications:
  – Link discovery
  – Social Network Analysis
  – Equivalent identities
  – Drug design
  – Protein unfolding
  – Protein metabolism
  – Why not? Classifying grid failures
  – And... many others!
Introduction: Motivation
• Why does ILP need a grid?
  – Search space can become large very quickly
  – Need many experiments to have statistically significant results
    • Cross-validation
    • Training, tuning, testing
  – Can combine classifiers: ensembles
Introduction: Motivation
• Assume we want to run a task for one domain: find a “good” hypothesis that describes the positive examples
• Assume we run 5x4-fold cross-validation
• Assume we have 100 classifiers per fold
• # of experiments: 5 × 4 × 100 = 2,000
Introduction: Motivation
• Now assume each experiment takes 1 hour to run
• How long would it take to generate the 2,000 classifiers to be combined?
  ~ 83 days (2,000 hours, run one after another)!!!
• If we also vary learning parameters and learning algorithms, this number can get really big!!
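The arithmetic behind these two slides can be checked with a short Python sketch. The variable names are illustrative, and `days_with_workers` is a hypothetical helper that assumes perfect parallelism with no scheduling or submission overhead, which a real grid will not achieve.

```python
repetitions = 5              # the "5x" in 5x4-fold cross-validation
folds = 4                    # folds per repetition
classifiers_per_fold = 100   # ensemble size assumed on the slide
hours_per_experiment = 1     # the slide's assumption

experiments = repetitions * folds * classifiers_per_fold
sequential_days = experiments * hours_per_experiment / 24


def days_with_workers(n_workers):
    """Idealised wall-clock days when n_workers run experiments in parallel."""
    return sequential_days / n_workers


print(experiments, round(sequential_days, 1))
```

Under these assumptions, 100 machines working in parallel would bring the 83 days down to under a day, which is the motivation for using a grid.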
Experiment
• Predict carcinogenicity in rodents
  – Difficult task
  – Large search space!
  – Important problem
• Phase 1:
  – Tuning using 5x4-fold cross-validation
  – Generating ensembles of up to 100 classifiers
• Aleph: well-known ILP system
• YAP: Yet Another Prolog
Experiment: one of the classifiers
active(A) if atom(A,_,n,32,B), B ≤ -0.401, has_property(A,cytogen_sce,n), methyl(A,_).
Sister Chromatid Exchange (SCE)
SCE is used to determine mutagenicity
Experiment
• 2 submissions:
  – From LA
  – From EU
Submitting jobs from LA....
Experiment
EELA resources utilised

| Resource | # of jobs |
|---|---|
| CERN | 1,160 |
| CIEMAT | 279 |
| CETA-CIEMAT | 173 |
| UniCan | 98 |
| LIP | 10 |
| INFN | 38 |
| UNAM | 16 |
| BIOF.UFRJ | 159 |
| IF.UFRJ | 8 |
| UFCG | 28 |
| Total | 1,969 |

~ 300 resources in LA
211 jobs in LA
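As a sanity check on the figures above, the following Python sketch recomputes the totals from the per-site counts. The grouping of UNAM, BIOF.UFRJ, IF.UFRJ and UFCG as the LA sites is an assumption (it is the grouping that reproduces the slide's 211 jobs), and the 2,000 submitted jobs come from the earlier motivation slides.

```python
# Jobs per resource, transcribed from the slide.
jobs_per_site = {
    "CERN": 1160, "CIEMAT": 279, "CETA-CIEMAT": 173, "UniCan": 98,
    "LIP": 10, "INFN": 38, "UNAM": 16, "BIOF.UFRJ": 159,
    "IF.UFRJ": 8, "UFCG": 28,
}
# Assumed Latin American sites (not labelled as such on the slide).
la_sites = ["UNAM", "BIOF.UFRJ", "IF.UFRJ", "UFCG"]

submitted = 2000  # 5 x 4 folds x 100 classifiers
total = sum(jobs_per_site.values())
la_jobs = sum(jobs_per_site[site] for site in la_sites)
missing = submitted - total
failure_rate = missing / submitted

print(total, la_jobs, missing, f"{failure_rate:.2%}")
```

The 31 jobs that do not appear in the table are the ones that did not complete.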
Experiments
• Why 1,969 out of 2,000?
• 2 reasons:
  – Proxy expiration:
    • On submission (which takes a long time!)
    • On execution
  – Use of dynamic libraries
• Submitting jobs from EU...
• From a non-EELA site, BUT
• Using the EELA VO:
  – Jobs run only on EU resources...
• Reasons:
  – Misconfiguration?
  – Closer brokers with more machines?
Conclusions
• Happiness: EELA is working!!!
• We can run thousands of experiments!
• Frida is happy!!! (see the Condor introductory tutorials if you are curious about Frida)
• Experiment showed good utilization of EELA resources in LA and EU
• Low failure rate (31 of 2,000 jobs, roughly 1.5%)
• Failures caused by:
  – Dynamic libs not available on the remote machine
  – Proxy expiration
Future work
• More detailed analysis of jobs and logs
• Full ILP experiment
• More domains
• Other kinds of experiments based on Statistical Relational Learning
• And, do not forget: ILP can help to model and diagnose errors in the grid environment!
Collaborators
• Fernando Silva (DCC-UPorto)
• Vítor Santos Costa (DCC-UPorto)
• Rui Camacho (FE-UPorto)
• Nuno Fonseca (IBMC/IBMEC, Porto)
• Beth Burnside (UW-Madison hospital)
• David Page (UW-Madison)
• Jesse Davis (UWashington)
Thanks!!!
Questions??