MIND Models in decision making & data @nalysis Enza Messina and Francesco Archetti


Post on 12-Jan-2016


Page 1: MIND

MIND: Models in decision making & data @nalysis

Enza Messina and Francesco Archetti

Page 2: MIND

Main Activities

Research Areas
o Machine Learning Algorithms
o Probabilistic and Relational Models
o Optimization Under Uncertainty

Applicative Domains
o World Wide Web
o Life Sciences
o Ambient Intelligence
o Finance

Faculty: Francesco Archetti, Enza Messina, Guglielmo Lulli

Post Doc: Elisabetta Fersini, Daniele Toscani

PhD Students: Ilaria Giordani, Cristina Elena Manfredotti

Others: Gaia Arosio, Irene Sberna, Francesca Bargna

Page 3: MIND

Statistical Learning and Relational Data

- Traditional learning methods are consistent with the classical statistical inference problem formulation, in which instances are independent and identically distributed (i.i.d.)

- but this does not reflect the real world! We need a solution able to deal with relationships and with uncertainty in more general terms

[Diagram: Statistical Learning (SL) combines Probabilistic Models and Learning Techniques; Statistical Relational Learning (SRL) adds a Relational Representation to them.]

Page 4: MIND

Machine Learning and Relational Data

Traditional learning approaches work well with flat representations:
– fixed-length attribute-value vectors
– assume an independent (IID) sample

[Diagram: the relational tables Patient and Contact are flattened into a single table.]

Problems with flattening:
– introduces statistical skew
– loses relational structure
  • incapable of detecting link-based patterns
– must fix attributes in advance
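The statistical skew caused by flattening can be illustrated with a small sketch. The Patient and Contact tables below, their columns and their values are hypothetical and only loosely mirror the slide's diagram; the point is that a one-to-many join repeats a patient once per contact.

```python
from collections import Counter

# Hypothetical relational data: one record per patient, many contacts per patient.
patients = {
    "p1": {"age": 34, "infected": True},
    "p2": {"age": 61, "infected": False},
}
contacts = [
    ("p1", "household"),
    ("p1", "work"),
    ("p1", "work"),
    ("p2", "household"),
]

def flatten(patients, contacts):
    """Join every contact with its patient, producing fixed-length rows."""
    return [
        {"patient": pid, "contact_type": ctype, **patients[pid]}
        for pid, ctype in contacts
    ]

rows = flatten(patients, contacts)

# Patient p1 now appears once per contact, so its attributes are over-represented.
rows_per_patient = Counter(row["patient"] for row in rows)
infected_rate_relational = sum(p["infected"] for p in patients.values()) / len(patients)
infected_rate_flat = sum(row["infected"] for row in rows) / len(rows)

print(rows_per_patient)
print(infected_rate_relational, infected_rate_flat)
```

In this toy data the infection rate computed on the flattened rows differs from the rate computed per patient, purely because of row duplication.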

Page 5: MIND

Machine Learning and Relational Data

Bayesian nets use a propositional representation. The real world has objects, related to each other.

[Diagram: the template Intelligence, Difficulty → Grade is instantiated as
Intell_Jane, Diffic_CS101 → Grade_Jane_CS101
Intell_George, Diffic_Geo101 → Grade_George_Geo101
Intell_George, Diffic_CS101 → Grade_George_CS101
with the grade values A, B and C shown in the figure.]

These “instances” are not independent

Daphne Koller, 2003
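A minimal sketch of why the ground instances are not independent: instantiating the template once per registration produces Grade variables that share parents. The variable names follow the slide; the registration list and helper functions are illustrative assumptions, not part of the original material.

```python
from itertools import combinations

# Registrations (student, course) mirroring the slide's example.
registrations = [("Jane", "CS101"), ("George", "Geo101"), ("George", "CS101")]

def ground(registrations):
    """Instantiate the template Intelligence, Difficulty -> Grade per registration.

    Returns a dict mapping each ground Grade variable to its set of parents.
    """
    parents = {}
    for student, course in registrations:
        grade = f"Grade_{student}_{course}"
        parents[grade] = {f"Intell_{student}", f"Diffic_{course}"}
    return parents

def coupled_pairs(parents):
    """Pairs of Grade variables that share at least one parent variable."""
    return sorted(
        (a, b, sorted(parents[a] & parents[b]))
        for a, b in combinations(sorted(parents), 2)
        if parents[a] & parents[b]
    )

network = ground(registrations)
for a, b, shared in coupled_pairs(network):
    print(f"{a} and {b} share {shared}")
```

Any two grades with a common parent are dependent until that parent is observed, which is exactly what an i.i.d. learner ignores.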

Page 6: MIND

Probabilistic Relational Models

- Integrate uncertainty with relational model
- Convenient language for specifying complex models
- “Web of influence”: subtle & intuitive reasoning
- Framework for incorporating heterogeneous data by connecting related entities (consider also relation uncertainty)

New problems: Relational clustering, Collective classification

Open Problems: Inference and Learning

[Figure: panels “Heterogeneous Information” and “Inference”, with nodes Level, Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Exp. cluster, Exp. type.]

Page 7: MIND

Some Applications
- Document Analysis
- Life Sciences
- Ambient Intelligence

Page 8: MIND

Document Analysis

Page 9: MIND

Document Analysis: The Web Case

Enhancing the document representation used to induce traditional learning algorithms

Relational instances representation for enhancing:
- Web Document Classification
- Web Document Ranking

[Figure: the formula labels from the slide graphics are not recoverable from this transcript.]

Page 10: MIND

Document Analysis: The Web Case

Learning Models for Relational Data

Relational Clustering:
1. Constraint Learning
2. Objective Function Adaptation

Relational Classification: Probabilistic Relational Models with Relational Uncertainty

[Schema diagram: Document (document_id, class); Link (#origin_ref, #destination_ref), where both references point to Document.]

Page 11: MIND

Document Analysis: E-Forensics

JUdicial MAnagement by Digital Libraries Semantics

- Information Extraction
- Emotion Recognition
- Hearing Summarization

Information Extraction (example record):

Proceedings n°: ……..
Accused Name: XXXXXX
Witness Name: KKKKKK
Prosecutor Name: -
Lawyer Name: YYYYYY ZZZZZZ
Meeting Date: 1989
Meeting Location: Civitanova Marche

Page 12: MIND

Journal Papers

E. Fersini, E. Messina, F. Archetti, A probabilistic relational approach for web document clustering, to appear in Journal of Information Processing and Management.

E. Fersini, E. Messina, F. Archetti, Enhancing Web Page Classification using Visual Block Analysis, to appear in Journal of Information Processing and Management.

Conference Papers

F. Archetti, G. Arosio, E. Fersini, E. Messina, Emotion recognition in judicial domain: a multilayer SVM approach, Lecture Notes in Artificial Intelligence, Machine Learning and Data Mining, Leipzig, 2009.

E. Fersini, E. Messina, F. Archetti, Probabilistic relational models with relational uncertainty: an early study in web page classification, IEEE WI-IAT Workshop, 2009.

F. Archetti, G. Arosio, E. Fersini, E. Messina, Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain, Proc. ICT4JUSTICE, 1st Int. Conf. on ICT Solutions for Justice, Greece, 2008.

F. Archetti, E. Fersini, E. Messina, Granular modeling of web document: impact on information retrieval systems, Tenth International Workshop on Web Information and Data Management – WIDM 2008

F. Archetti, E. Fersini, P. Campanelli, E. Messina, "A Hierarchical Document Clustering Environment Based on the Induced Bisecting k-Means" LNCS Flexible Query Answering Systems, 2006.

Recent Publications

Page 13: MIND

Life Sciences

Page 14: MIND

Relational clustering

Find a partition of a given set of instances using additional information coming from the relationships among instances.

SEMI-SUPERVISED LEARNING METHOD, where relations can be represented by pair-wise constraints on some of the instances (specifying whether two instances should be in the same or in different clusters)

• Constraint Learning
• Modify distance measure in clustering objective function
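One way to read "pair-wise constraints" plus "modify the clustering objective" is a k-means variant whose assignment cost adds a penalty for every violated must-link or cannot-link pair. The sketch below follows that idea, in the spirit of penalised pairwise-constrained k-means; it is an illustrative assumption and not necessarily the method used by the authors. The function names, the penalty `weight` and the toy points are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def penalty(i, cluster, labels, must_link, cannot_link, weight):
    """Cost of the constraints violated if instance i joins `cluster`."""
    cost = 0.0
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] is not None and labels[other] != cluster:
            cost += weight
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] == cluster:
            cost += weight
    return cost

def constrained_kmeans(points, centroids, must_link=(), cannot_link=(),
                       weight=10.0, iters=20):
    """k-means whose assignment step minimises distance plus constraint penalties."""
    labels = [None] * len(points)
    centroids = [tuple(c) for c in centroids]
    for _ in range(iters):
        changed = False
        for i, p in enumerate(points):
            best = min(
                range(len(centroids)),
                key=lambda k: sq_dist(p, centroids[k])
                + penalty(i, k, labels, must_link, cannot_link, weight),
            )
            if best != labels[i]:
                labels[i], changed = best, True
        # Recompute each centroid as the mean of its current members.
        for k in range(len(centroids)):
            members = [points[i] for i, lab in enumerate(labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
        if not changed:
            break
    return labels, centroids

points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
init = [(0.0, 0.0), (5.0, 0.0)]
print(constrained_kmeans(points, init)[0])
print(constrained_kmeans(points, init, cannot_link=[(0, 1)], weight=100.0)[0])
```

The penalty weight controls how strongly the relational information overrides plain distance; with a small weight the constraints act only as soft hints.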

Page 15: MIND

Systems Biology Applications

Learning gene regulatory networks

[Diagram labels: Regulatory modules, TF, Gene (Coding, Control), DNA, RNA single strand, Transcription +]

Modelling the pharmacology of cancer

[Diagram labels: Human cancer, Gene expression, Drug Activity]

Gene drug interaction: identification of a drug treatment for a given cell line based both on drug activity pattern and gene expression profile

Collaborations

Page 16: MIND

Journal Papers

E. Messina, M. Sanguineti eds., Special Issue on OR and data mining for biological data, Computers and OR, to appear.

F. Archetti, I. Giordani, L. Vanneschi, Genetic Programming for Anticancer Therapeutic Response Prediction using the NCI-60 Dataset, to appear in Computers and Operations Research, 2009.

L. Vanneschi, F. Archetti, M. Castelli, I. Giordani, Classification of Oncologic Data with Genetic Programming, to appear in Journal of Artificial Evolution and Applications, 2009.

G. Lulli, M. Romauch: A Mathematical Program to Refine Gene Regulatory Networks, Discrete Applied Mathematics, 157 (10), 2009.

F. Archetti, S. Lanzeni, E. Messina, Graph Models and Mathematical Programming in Biochemical Networks Analysis and Metabolic Engineering Design, Computers & Mathematics with Applications, Vol. 55, n. 5, pp. 970-983, 2008.

S. Lanzeni, E. Messina, F. Archetti, Towards metabolic networks phylogeny using Petri Net-based expansional analysis, BMC Systems Biology 2007.

Conference Papers

F. Archetti, I. Giordani, D. Mari, E. Messina, G. Ogliari, A Systems Biology Approach to oral anticoagulation therapy, SysBioHealth Symposium, 2008.

I. Giordani, L. Vanneschi, E. Fersini. “Modelling the Relationship between the Microarray Data of the NCI-60 Anticancer Dataset with Therapeutic Responses by Genetic Programming”. SysBioHealth Symposium (ISBN: 978-88-903154-0-4), 2007.

E. Fersini, C. Manfredotti, E. Messina, F. Archetti. “Relational Clustering for Gene Expression Profiles and Drug Activity Pattern Analysis”. SysBioHealth Symposium (ISBN: 978-88-903154-0-4), 2007.

F. Archetti, S. Lanzeni, E. Messina, L. Vanneschi, Genetic Programming for Computational Pharmacokinetics in Drug Discovery and Development. Genetic Programming and Evolvable Machines, vol 8 (4), 2007.

F. Archetti, S. Lanzeni, E. Messina, L. Vanneschi, "Genetic Programming and other Machine Learning approaches to predict Median Oral Lethal Dose (LD50) and Plasma Protein Binding levels (%PPB) of drugs", Lecture Notes in Computer Science, EvoBIO 2007.

Submitted Papers

Archetti, Giordani, Messina, Mauri, A new clustering approach for learning transcriptional regulatory networks, submitted to Int. Journal of Data Mining and Bioinformatics.

F. Archetti, S. Lanzeni, G. Lulli, E. Messina A mathematical model for optimal functional disruption of biochemical networks, submitted to Journal of Mathematical Modelling and Algorithms.

E. Fersini, C. Manfredotti, E. Messina, F. Archetti Relational K-Means for Gene Expression Profiles and Drug Activity Pattern Analysis, submitted to Int. Journal of Mathematical Modelling and Algorithms.

Recent Publications

Page 17: MIND


Pharmacogenomics Application: Predict drug response to oral anticoagulation therapy (OAT)

Grouping (profiling) patients based on their clinical and genotypic features in order to suggest the correct drug dosage

[Diagram labels: Haemorrhagic risk, Thrombotic risk]

Data on more than 1000 patients:
- Clinical and therapeutical data: personal patient data, medical diagnosis, therapy, INR and dosage measurements
- Genetic data: polymorphisms of two genes, CYP2C9 and VKORC1, that contribute to differences in patients’ response.

In collaboration with

Page 18: MIND

Inference and Decision Problems

Tracking the (hidden) state of a system as it evolves over time from sequentially arriving (noisy or ambiguous) observations

Dynamic State Space Model
- State: a vector of variables, some of which are not observable
- Transition Model: $p(x_t \mid x_{t-1}, a_t)$
- Observation Model: $p(z_t \mid x_t)$
- A set of possible actions, given a belief state distribution

[Diagram: an observation enters State Estimation, which outputs a belief; Action Selection maps the belief to an action.]
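The state-estimation and action-selection loop of this slide can be sketched for a discrete state space as a Bayes filter followed by an expected-reward choice. The two-state model, its probabilities and the reward table are hypothetical; the slide specifies only the transition model, the observation model and a set of actions, so the reward-based selection rule is an assumption made here for illustration.

```python
def predict(belief, transition, action):
    """Prediction step: push the belief through p(x_t | x_{t-1}, a_t)."""
    states = list(belief)
    return {
        x: sum(transition[action][prev][x] * belief[prev] for prev in states)
        for x in states
    }

def update(predicted, observation_model, z):
    """Correction step: weight by p(z_t | x_t) and renormalise."""
    unnorm = {x: observation_model[x][z] * p for x, p in predicted.items()}
    total = sum(unnorm.values())
    return {x: v / total for x, v in unnorm.items()}

def select_action(belief, reward):
    """Pick the action with the highest expected reward under the belief."""
    return max(reward, key=lambda a: sum(belief[x] * reward[a][x] for x in belief))

# Hypothetical model: transition[action][previous_state][next_state].
transition = {
    "wait": {"ok": {"ok": 0.9, "faulty": 0.1}, "faulty": {"ok": 0.0, "faulty": 1.0}},
    "repair": {"ok": {"ok": 1.0, "faulty": 0.0}, "faulty": {"ok": 0.8, "faulty": 0.2}},
}
# observation_model[state][observation].
observation_model = {
    "ok": {"normal": 0.8, "alarm": 0.2},
    "faulty": {"normal": 0.3, "alarm": 0.7},
}
# reward[action][state], an assumption used only for action selection.
reward = {
    "wait": {"ok": 1.0, "faulty": -5.0},
    "repair": {"ok": -1.0, "faulty": 2.0},
}

belief = {"ok": 0.5, "faulty": 0.5}
belief = update(predict(belief, transition, "wait"), observation_model, "alarm")
print(belief, select_action(belief, reward))
```

The same predict/update recursion underlies Kalman filters and particle filters; only the representation of the belief changes.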

Page 19: MIND

Ambient Intelligence

Page 20: MIND

Multi-target tracking

Multi-target tracking: finding the tracks of an unknown number of moving targets from noisy observations.

Track: the sequence of “states” travelled by a target, which needs to be estimated (we’ll deal with on-line problems).

Requires data association: particle filters (PF) that track objects individually lack a consistent way to resolve the ambiguities that arise in associating objects with measurements.

Exploiting relations can improve the efficiency of the tracker. Monitoring relations can be a goal in itself.

We model the transition probability of the system with a Relational Dynamic Bayesian Network (RDBN).

In collaboration with

Page 21: MIND


The main research topics we propose:

A new representation modelling not only objects but also their relations (i.e. exploiting relations can improve the efficiency of the tracker).

A new computational strategy based on a family of Sequential Monte Carlo methods called Relational Particle Filter

Statistical techniques for the detection of anomalous behaviours
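For readers unfamiliar with Sequential Monte Carlo methods, the following is a minimal bootstrap particle filter for a single one-dimensional target. It is a plain textbook filter, not the Relational Particle Filter proposed on this slide; the random-walk motion model, the Gaussian observation noise and all parameter values are assumptions chosen for illustration.

```python
import math
import random

def particle_filter_step(particles, z, rng, motion_std=1.0, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: sample each particle from the transition model (random walk).
    moved = [x + rng.gauss(0.0, motion_std) for x in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((z - x) / obs_std) ** 2) for x in moved]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # degenerate case: fall back to uniform
    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))

def estimate(particles):
    """Posterior mean approximated by the particle average."""
    return sum(particles) / len(particles)

rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
true_x = 3.0
for _ in range(20):
    true_x += 0.5                     # hidden target drifts to the right
    z = true_x + rng.gauss(0.0, 1.0)  # noisy observation of the target
    particles = particle_filter_step(particles, z, rng)

print(round(true_x, 1), round(estimate(particles), 2))
```

A relational variant would additionally carry the state of the relations between targets in each particle and use it in the prediction step.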

Page 22: MIND

Wireless Sensor Networks

Bayesian abstractions for virtual sensing through low cost data aggregation and net-wide anomaly detection

- Modelling Cluster Heads as nodes of a BN
- Inference to know sensor values also in presence of temporary faults:
  - Lack of communication (sensor failure or sleep)
  - Outlier due to sensor malfunctioning

[Diagram: a WSN with cluster heads CH1 to CH5 and a sink, mapped onto a BN.]
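As a rough illustration of virtual sensing with a Bayesian network over cluster heads, the sketch below infers the value of a silent cluster head from its neighbours and flags an implausible reading as an outlier. The chain structure CH1 → CH2 → CH3, the discretisation into "low"/"high", the probability tables and the threshold are all hypothetical and not taken from the slides.

```python
def infer_silent_node(p_mid_given_up, p_down_given_mid, upstream, downstream):
    """Posterior over a silent middle node in a chain upstream -> mid -> downstream."""
    unnorm = {
        mid: p_mid_given_up[upstream][mid] * p_down_given_mid[mid][downstream]
        for mid in p_mid_given_up[upstream]
    }
    total = sum(unnorm.values())
    return {mid: v / total for mid, v in unnorm.items()}

def is_outlier(posterior, reading, threshold=0.05):
    """Flag a reading whose posterior probability falls below the threshold."""
    return posterior[reading] < threshold

# Hypothetical conditional probability tables: table[parent_value][child_value].
p_ch2_given_ch1 = {
    "low": {"low": 0.9, "high": 0.1},
    "high": {"low": 0.2, "high": 0.8},
}
p_ch3_given_ch2 = {
    "low": {"low": 0.85, "high": 0.15},
    "high": {"low": 0.25, "high": 0.75},
}

# CH2 is asleep or unreachable; CH1 and CH3 both report "high".
posterior_ch2 = infer_silent_node(p_ch2_given_ch1, p_ch3_given_ch2, "high", "high")
print(posterior_ch2)

# If CH2 later reports "low", that reading is improbable given its neighbours.
print(is_outlier(posterior_ch2, "low"))
```

The same posterior serves both purposes named on the slide: it fills in a missing value and it provides the reference against which a suspicious reading is judged.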

Page 23: MIND

Transportation & Logistics

In collaboration with:

[Diagram: Data, Models, Decisions. The model panel shows inequality constraints in variables w indexed by f and t; the formulas are not recoverable from this transcript.]

Page 24: MIND

Journal Papers

F. Archetti, M. Frigerio, E. Messina, D. Toscani, IKNOS - Inference and Knowledge in Networks of Sensors, to appear in Int. Journal of Sensor Networks, 2009.

F. Chiti, R. Fantacci, F. Archetti, E. Messina, D. Toscani, An integrated Communications Framework for Context aware Continuous Monitoring with Body Sensor Networks, IEEE Journal on Selected Areas in Communications, Vol.27, No.4, pp. 379-386, 2009.

P. Dell’Olmo, A. Iovanella, G. Lulli, B. Scoppola, Exploiting Incomplete Information to manage multiprocessor tasks with variable arrival rates, Computers and Operations Research, Vol. 35, no 5, 2008.

G. Andreatta, G. Lulli, A Multi-period TSP with Stochastic Regular and Urgent Demands, European Journal of Operational Research, 2008.

D. Bertsimas, G. Lulli, A. Odoni, The ATFM Problem: An Integer Optimization Approach, Integer Programming and Combinatorial Optimization, LNCS 5035, 2008.

K.F. Doerner, W. J. Gutjahr, R.F. Hartl, G. Lulli, Stochastic Local Search Procedures for the Probabilistic Two-Day Vehicle Routing Problem, Advances in Computational Intelligence in Transportation and Logistics (A. Fink, F. Rothlauf Eds. )- Springer Series on Studies in Computational Intelligence, pp. 153-168, 2008.

G. Lulli, S. Sen, A Heuristic Algorithm for Stochastic Integer Programs with Complete Recourse, European Journal of Operational Research, 2006.

Conference Papers

C. Manfredotti, Modeling and Inference with RDBNs, Canadian Artificial Intelligence Conference, Graduate Student Symposium, May 2009.

C. Manfredotti, E. Messina, F. Archetti, Improving Multiple Target Tracking with RDBNs, working paper presented at AIROWinter 2009, International Conference of the Italian Operations Research Society, January 2009.

F. Archetti, E. Messina, D. Toscani, M. Frigerio, KOINOS - Knowledge from observations and inference in networks of sensors, Proceedings of IASTED International Conference on Sensor Networks, 2008.

F. Archetti, C. Manfredotti, M. Matteucci, E. Messina and D. G. Sorrenti, Multiple Hypothesis Markov Chains For On-Line Anomaly Detection in Traffic Video Surveillance, Proceedings ICDP 2006: Imaging for Crime Detection and Prevention, 13-14 June 2006.

F. Archetti, C.E. Manfredotti, E. Messina, and D. G. Sorrenti, Foreground-to-ghost Discrimination in Single-difference Pre-processing, Lecture Notes in Computer Science: Advanced Concepts for Intelligent Vision Systems, ACIVS’06, 263-274, 2006.

Submitted Papers

D. Toscani, F. Archetti, E. Messina, M. Frigerio, F. Chiti, R. Fantacci. SIFNOS – Statistical Inference and Filtering in Networks of Sensors. Submitted to IEEE Journal on Selected Areas in Communications - Simple WSN Solutions, 2009.

Recent Publications

Page 25: MIND

Ambient Intelligence: Currently Active Projects

LENVIS - Localised environmental and health information services for all (EU-FP7)

INSYEME – Integrated Systems for Emergencies (MIUR - FIRB)

GREIS - Gestione del Risparmio Energetico attraverso Informazioni di Sicurezza (MIUR)

In collaboration with SAL Lab.

LIMNOS - Logistics and Informatics for Mobility and Network OptimiSation (MIUR)

H-CIM - Health Care through Intelligent Monitoring (MIUR)

In collaboration with NOMADIS Lab.

Page 26: MIND

Financial Time Series

Page 27: MIND

Dynamic State Space Models for Scenario Generation

Regime Switching Models: (Autoregressive) Hidden Markov Model

- Hidden variable: regime $x_t$
- Observations: prices $S_t$
- Transition Model: $p(x_t \mid x_{t-1})$, a Markov Chain
- Observation Model: $p(z_t \mid x_t)$, a Mixture of Gaussians (Autoregressive Process)

[Diagram: the model unrolled over t=1, t=2, t=3, t=4.]

Recent Publications

Messina, E., Toscani, D., Hidden Markov models for scenario generation, IMA Journal of Management Mathematics, Vol. 4, pp. 379-401, 2008.
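Scenario generation with a regime-switching model can be sketched as sampling a hidden regime path from a Markov chain and then drawing a price move from the distribution attached to the current regime. The sketch below is a simplification under stated assumptions: it uses a single Gaussian return per regime instead of a mixture or an autoregressive process, and the regime names, transition probabilities, return parameters and starting price are all hypothetical.

```python
import random

def generate_scenario(transition, emission, start, s0, horizon, rng):
    """Sample one price path from a regime-switching (hidden Markov) model."""
    regime, price = start, s0
    regimes, prices = [], []
    for _ in range(horizon):
        # Transition model p(x_t | x_{t-1}): draw the next hidden regime.
        nxt = list(transition[regime])
        regime = rng.choices(nxt, weights=[transition[regime][r] for r in nxt])[0]
        # Observation model p(z_t | x_t): Gaussian return given the regime.
        mean, std = emission[regime]
        price *= 1.0 + rng.gauss(mean, std)
        regimes.append(regime)
        prices.append(price)
    return regimes, prices

# Hypothetical two-regime model.
transition = {
    "bull": {"bull": 0.9, "bear": 0.1},
    "bear": {"bull": 0.2, "bear": 0.8},
}
emission = {"bull": (0.01, 0.01), "bear": (-0.01, 0.03)}  # (mean, std) of returns

rng = random.Random(42)
scenarios = [
    generate_scenario(transition, emission, "bull", 100.0, 4, rng)
    for _ in range(5)
]
for regimes, prices in scenarios:
    print(regimes, [round(p, 2) for p in prices])
```

Each sampled path is one scenario over four periods; a set of such paths can then feed a stochastic optimisation model.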

Page 28: MIND

Perspectives

- Extend state space models to more general Relational Dynamic Bayesian Networks, to account not only for prices but also for “exogenous” economic factors and unstructured information
- Algorithms for risk management and portfolio tracking that use all available evidence and take all uncertainties into account
- Markets are good at gathering information from many heterogeneous sources and combining it appropriately; we would expect the same from models

Projects & Collaborations

- PRIN 2007 “Probabilistic Models for representing uncertainty in portfolio optimization problems” (with Università di Bergamo and Università della Calabria)
- Collaboration with Brunel University and CARISMA Research Centre.

Page 29: MIND

A cooperation network for research projects and student mobility

University of Toronto

Massachusetts Institute of Technology

Norwegian University of Science and Technology

Brunel University

Centre of Research and Technology Hellas

Aachen University

Hungarian Academy of Sciences

- TXT e-Solutions
- Siemens
- Project Automation
- Aegate Ltd
- OptiRisk
- Astra Zeneca
- DELOS
- Comerson

CARISMA Research Center