MIND: Models in decision making & data @nalysis
Enza Messina and Francesco Archetti
Main Activities
Research Areas
o Machine Learning Algorithms
o Probabilistic and Relational Models
o Optimization Under Uncertainty
Applicative Domains
o World Wide Web
o Life Sciences
o Ambient Intelligence
o Finance
Faculty: Francesco Archetti, Enza Messina, Guglielmo Lulli
Post Doc: Elisabetta Fersini, Daniele Toscani
PhD Students: Ilaria Giordani, Cristina Elena Manfredotti
Others: Gaia Arosio, Irene Sberna, Francesca Bargna
Statistical Learning and Relational Data
- Traditional learning methods are consistent with the classical statistical inference problem formulation: instances are assumed to be independent and identically distributed (i.i.d.)
- but this assumption does not reflect the real world! We need a solution able to deal with relationships and with uncertainty in more general terms
[Figure: SL = Probabilistic Models + Learning Techniques; SRL = Probabilistic Models + Relational Representation + Learning Techniques]
Traditional learning approaches work well with flat representations
– fixed length attribute-value vectors
– assume independent (IID) sample
[Figure: Patient and Contact tables, with a "flatten" step]
Problems:
– introduces statistical skew
– loses relational structure
  • incapable of detecting link-based patterns
– must fix attributes in advance
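To make the "statistical skew" problem concrete, here is a minimal sketch on made-up Patient and Contact records. All names and values are illustrative assumptions, not data from the slides. Flattening through a join repeats each patient once per contact, so patients with many contacts dominate any statistic computed on the flat table.

```python
# Hypothetical toy data: p1 has three contacts, p2 has one.
patients = {"p1": {"age": 30}, "p2": {"age": 70}}
contacts = [("p1", "c1"), ("p1", "c2"), ("p1", "c3"), ("p2", "c4")]


def flatten(patients, contacts):
    """Join Patient with Contact: one flat row per (patient, contact) pair."""
    return [
        {"patient": p, "contact": c, "age": patients[p]["age"]}
        for p, c in contacts
    ]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


flat_rows = flatten(patients, contacts)

# Mean age computed over patients (the quantity we actually care about).
true_mean_age = mean(p["age"] for p in patients.values())
# Mean age computed over flat rows: p1 is counted three times.
flat_mean_age = mean(row["age"] for row in flat_rows)

print(true_mean_age, flat_mean_age)
```

The two means differ only because of the duplication introduced by the join, which is one way the flat representation can mislead a learner that assumes i.i.d. rows.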
Machine Learning and Relational Data
Bayesian nets use propositional representation
Real world has objects, related to each other
[Figure: template Intelligence, Difficulty → Grade, unrolled into Intell_Jane, Diffic_CS101 → Grade_Jane_CS101; Intell_George, Diffic_Geo101 → Grade_George_Geo101; Intell_George, Diffic_CS101 → Grade_George_CS101; example grade values A, B, C]
These “instances” are not independent
(Daphne Koller, 2003)
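The point that the ground instances are not independent can be sketched by unrolling the Intelligence, Difficulty → Grade template over the three registrations named in the figure. The helper names below are hypothetical and only serve the illustration.

```python
from itertools import combinations

# (student, course) pairs taken from the variable names in the figure.
registrations = [("Jane", "CS101"), ("George", "Geo101"), ("George", "CS101")]


def ground_network(registrations):
    """Unroll the template into ground variables.

    Returns a dict mapping each ground Grade variable to its set of parents.
    """
    parents = {}
    for student, course in registrations:
        grade = f"Grade_{student}_{course}"
        parents[grade] = {f"Intell_{student}", f"Diffic_{course}"}
    return parents


def dependent_pairs(parents):
    """Pairs of Grade variables that share a parent.

    When the shared parent is unobserved, the two grades are correlated.
    """
    return sorted(
        (a, b, sorted(parents[a] & parents[b]))
        for a, b in combinations(sorted(parents), 2)
        if parents[a] & parents[b]
    )


net = ground_network(registrations)
pairs = dependent_pairs(net)
for a, b, shared in pairs:
    print(a, b, shared)
```

Two of the three pairs share a parent (George's intelligence, or the difficulty of CS101), so treating the three grades as an i.i.d. sample ignores exactly the structure a relational model is meant to capture.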
Probabilistic Relational Models
Integrate uncertainty with relational model
Convenient language for specifying complex models
“Web of influence”: subtle & intuitive reasoning
Framework for incorporating heterogeneous data by connecting related entities (consider also relation uncertainty)
New problems: Relational clustering, Collective classification
Open Problems: Inference and Learning
[Figure: inference over heterogeneous information. Labels: Gene Cluster, Lipid, HSF, Endoplasmatic, GCN4, Exp. cluster, Exp. type, Level]
Some Applications
- Document Analysis
- Life Sciences
- Ambient Intelligence
Document Analysis
Document Analysis: The Web Case
Enhancing document representation for inducing traditional learning algorithm
Relational instances representation for enhancing:
- Web Document Classification
- Web Document Ranking
[Figure: document representation diagram; symbol labels not recoverable from the extracted text]
Learning Models for Relational Data: Relational Clustering
[Figure: relational schema. Link (#origin_ref, #destination_ref) connects Document (document_id, class) to Document (document_id, class)]
Document Analysis: The Web Case
1. Constraint Learning
2. Objective Function Adaptation
Relational Classification: Probabilistic Relational Models with Relational Uncertainty
Document Analysis: E-Forensics
JUdicial MAnagement by Digital Libraries Semantics
- Information Extraction
- Emotion Recognition
- Hearing Summarization
[Figure: information extraction example]
Proceedings n°: ……..
Accused Name: XXXXXX
Witness Name: KKKKKK
Prosecutor Name: -
Lawyer Name: YYYYYYZZZZZZ
Meeting Date: 1989
Meeting Location: Civitanova Marche
Journal Papers
E. Fersini, E. Messina, F. Archetti, A probabilistic relational approach for web document clustering, to appear in Journal of Information Processing and Management.
E. Fersini, E. Messina, F. Archetti, Enhancing Web Page Classification using Visual Block Analysis, to appear in Journal of Information Processing and Management.
Conference Papers
F. Archetti, G. Arosio, E. Fersini, E. Messina, Emotion recognition in judicial domain: a multilayer SVM approach, Lecture Notes in Artificial Intelligence, Machine Learning and Data Mining, Leipzig 2009.
E. Fersini, E. Messina, F. Archetti, Probabilistic relational models with relational uncertainty: an early study in web page classification, IEEE WI-IAT Workshop, 2009.
F. Archetti, G. Arosio, E. Fersini, E. Messina, Audio-based Emotion Recognition for Advanced Automatic Retrieval in Judicial Domain, Proc. ICT4JUSTICE, 1st Int. Conf. on ICT Solutions for Justice, Greece, 2008.
F. Archetti, E. Fersini, E. Messina, Granular modeling of web document: impact on information retrieval systems, Tenth International Workshop on Web Information and Data Management – WIDM 2008
F. Archetti, E. Fersini, P. Campanelli, E. Messina, "A Hierarchical Document Clustering Environment Based on the Induced Bisecting k-Means" LNCS Flexible Query Answering Systems, 2006.
Recent Publications
Life Sciences
Relational clustering
Find a partition of a given set of instances using additional information coming from instance relationships.
SEMI-SUPERVISED LEARNING METHOD where relations can be represented by pair-wise constraints on some of the instances (specifying whether two instances should be in the same or in different clusters)
• Constraint Learning
• Modify distance measure in clustering objective function
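The pair-wise constraint idea can be sketched with a COP-KMeans-style assignment step, in which each instance goes to the nearest centroid that violates no must-link or cannot-link constraint. This is a generic illustration of constrained clustering, not necessarily the method used by the group; the data, the initial centroids and all function names are assumptions.

```python
def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def violates(i, cluster, assign, must_link, cannot_link):
    """True if putting instance i into `cluster` breaks a constraint
    with an instance that has already been assigned."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and assign.get(other, cluster) != cluster:
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and assign.get(other) == cluster:
            return True
    return False


def cop_kmeans(points, centroids, must_link=(), cannot_link=(), iters=10):
    centroids = [tuple(c) for c in centroids]
    assign = {}
    for _ in range(iters):
        assign = {}
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest; keep the first feasible one.
            order = sorted(range(len(centroids)),
                           key=lambda k: sq_dist(p, centroids[k]))
            for k in order:
                if not violates(i, k, assign, must_link, cannot_link):
                    assign[i] = k
                    break
            else:
                raise ValueError(f"no feasible cluster for instance {i}")
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, c in enumerate(centroids):
            members = [points[i] for i in assign if assign[i] == k]
            if members:
                new_centroids.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                new_centroids.append(c)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return [assign[i] for i in range(len(points))], centroids


# Hypothetical 1-D instances and initial centroids.
points = [(0.0,), (1.0,), (2.0,), (9.0,), (10.0,)]
init = [(0.0,), (10.0,)]

free_labels, _ = cop_kmeans(points, init)
constrained_labels, _ = cop_kmeans(points, init, cannot_link=[(1, 2)])
print(free_labels, constrained_labels)
```

With the cannot-link constraint between instances 1 and 2, instance 2 is pushed into the other cluster even though it is geometrically closer to the first one, which shows how relational information can override pure distance.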
Systems Biology Applications
Learning gene regulatory networks
[Figure: Regulatory modules. Labels: TF, Gene (Control, Coding), DNA, RNA single strand, Transcription +]
Modelling the pharmacology of cancer
[Figure: Human cancer. Labels: Gene expression, Drug Activity]
Gene-drug interaction: identification of a drug treatment for a given cell line based both on drug activity pattern and gene expression profile
Collaborations
Journal Papers
E. Messina, M. Sanguineti eds, Special Issue on OR and data mining for biological data, Computers and OR, to appear.
F. Archetti, I. Giordani, L. Vanneschi, Genetic Programming for Anticancer Therapeutic Response Prediction using the NCI-60 Dataset, to appear in Computers and Operations Research, 2009.
L. Vanneschi, F. Archetti, M. Castelli, I. Giordani, Classification of Oncologic Data with Genetic Programming to appear in Journal of Artificial Evolution and Applications, 2009.
G. Lulli, M. Romauch: A Mathematical Program to Refine Gene Regulatory Networks, Discrete Applied Mathematics, 157 (10), 2009.
F. Archetti, S. Lanzeni, E. Messina, Graph Models and Mathematical Programming in Biochemical Networks Analysis and Metabolic Engineering Design, Computers & Mathematics with Applications, Vol. 55, n. 5, pp. 970-983, 2008.
S. Lanzeni, E. Messina, F. Archetti, Towards metabolic networks phylogeny using Petri Net-based expansional analysis, BMC Systems Biology 2007.
Conference Papers
F. Archetti, I. Giordani, D. Mari, E. Messina, G. Ogliari, A Systems Biology Approach to oral anticoagulation therapy, SysBioHealth Symposium, 2008.
I. Giordani, L. Vanneschi, E. Fersini. “Modelling the Relationship between the Microarray Data of the NCI-60 Anticancer Dataset with Therapeutic Responses by Genetic Programming”. SysBioHealth Symposium (ISBN: 978-88-903154-0-4), 2007.
E. Fersini, C. Manfredotti, E. Messina, F. Archetti. “Relational Clustering for Gene Expression Profiles and Drug Activity Pattern Analysis”. SysBioHealth Symposium (ISBN: 978-88-903154-0-4), 2007.
F. Archetti, S. Lanzeni, E. Messina, L. Vanneschi, Genetic Programming for Computational Pharmacokinetics in Drug Discovery and Development. Genetic Programming and Evolvable Machines, vol 8 (4), 2007.
F. Archetti, S. Lanzeni, E. Messina, L. Vanneschi "Genetic Programming and other Machine Learning approaches to predict Median Oral Lethal Dose (LD50) and Plasma Protein Binding levels (%PPB) of drugs" Lecture Notes in Computer Sciences, EvoBIO 2007.
Submitted Papers
Archetti, Giordani, Messina, Mauri, A new clustering approach for learning transcriptional regulatory networks, submitted to Int. Journal of Data Mining and Bioinformatics.
F. Archetti, S. Lanzeni, G. Lulli, E. Messina A mathematical model for optimal functional disruption of biochemical networks, submitted to Journal of Mathematical Modelling and Algorithms.
E. Fersini, C. Manfredotti, E. Messina, F. Archetti Relational K-Means for Gene Expression Profiles and Drug Activity Pattern Analysis, submitted to Int. Journal of Mathematical Modelling and Algorithms.
Recent Publications
Pharmacogenomics Application: Predict drug response to oral anticoagulation therapy (OAT)
Grouping (profiling) patients based on their clinical and genotypic features in order to suggest the correct drug dosage
[Figure labels: Haemorrhagic risk, Thrombotic risk]
Data on more than 1000 patients:
- Clinical and therapeutic data: personal patient data, medical diagnosis, therapy, INR and dosage measurements
- Genetic data: polymorphisms of two genes, CYP2C9 and VKORC1, that contribute to differences in patients’ response.
In collaboration with
[Figure: State Estimation and Action Selection loop, with labels belief, action, observation]
Dynamic State Space Model
State: a vector of variables, some of which are not observable
Transition Model: $p(x_t \mid x_{t-1}, a_t)$
Observation Model: $p(z_t \mid x_t)$
A set of possible actions, given a belief state distribution
Inference and Decision Problems
Tracking the (hidden) state of a system as it evolves over time from sequentially arriving (noisy or ambiguous) observations
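How a belief state is tracked from sequential observations can be sketched with a discrete Bayes filter built on a transition model p(x_t | x_{t-1}, a_t) and an observation model p(z_t | x_t). The two-state system, its probabilities and the threshold rule for action selection are all hypothetical values chosen for illustration.

```python
# Hypothetical system: a device that is either "ok" or in "fault".
TRANSITION = {  # TRANSITION[action][x_prev][x] = p(x | x_prev, action)
    "wait":   {"ok": {"ok": 0.9, "fault": 0.1},
               "fault": {"ok": 0.0, "fault": 1.0}},
    "repair": {"ok": {"ok": 0.95, "fault": 0.05},
               "fault": {"ok": 0.95, "fault": 0.05}},
}
OBSERVATION = {  # OBSERVATION[x][z] = p(z | x)
    "ok":    {"normal": 0.8, "alarm": 0.2},
    "fault": {"normal": 0.3, "alarm": 0.7},
}


def predict(belief, action):
    """Prediction: sum over x_prev of p(x | x_prev, action) * belief(x_prev)."""
    return {
        x: sum(TRANSITION[action][x_prev][x] * belief[x_prev]
               for x_prev in belief)
        for x in belief
    }


def update(belief, z):
    """Correction: multiply by p(z | x) and normalise."""
    unnorm = {x: OBSERVATION[x][z] * p for x, p in belief.items()}
    total = sum(unnorm.values())
    return {x: v / total for x, v in unnorm.items()}


def select_action(belief, threshold=0.5):
    """Illustrative rule: repair when a fault is more likely than `threshold`."""
    return "repair" if belief["fault"] > threshold else "wait"


belief = {"ok": 0.5, "fault": 0.5}
belief = predict(belief, "wait")   # state estimation, prediction step
belief = update(belief, "alarm")   # state estimation, correction step
action = select_action(belief)     # action selection from the belief
print(belief, action)
```

The same predict/update cycle underlies Kalman filters and particle filters; only the representation of the belief changes.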
Ambient Intelligence
Multi-target tracking: finding the tracks of an unknown number of moving targets from noisy observations.
Track: the sequence of “States” travelled by a target, which needs to be estimated (we’ll deal with on-line problems).
Requires Data Association: particle filters (PF) that track objects individually lack a consistent way to resolve the ambiguities that arise in associating objects with measurements
Exploiting relations can improve the efficiency of the tracker
Monitoring relations can be a goal in itself
We model the transition probability of the system with an RDBN.
In collaboration with
The main research topics we propose:
A new representation modelling not only objects but also their relations (i.e. exploiting relations can improve the efficiency of the tracker).
A new computational strategy based on a family of Sequential Monte Carlo methods called Relational Particle Filter
Statistical techniques for the detection of anomalous behaviours
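As background for the Sequential Monte Carlo strategy, here is a minimal sketch of a plain bootstrap particle filter tracking a single target in one dimension. It is not the Relational Particle Filter itself: there are no relations and no data association, and the motion model, noise levels and particle count are assumptions.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def particle_filter_step(particles, z, rng,
                         velocity=1.0, process_std=0.5, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # 1. Propagate every particle through the (assumed) transition model.
    moved = [x + velocity + rng.gauss(0.0, process_std) for x in particles]
    # 2. Weight each particle by the likelihood of the observation z.
    weights = [gaussian_pdf(z, x, obs_std) for x in moved]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0 / len(moved)] * len(moved)
    else:
        weights = [w / total for w in weights]
    # 3. Point estimate: weighted mean of the particles.
    estimate = sum(w * x for w, x in zip(weights, moved))
    # 4. Resample with replacement, proportionally to the weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate


rng = random.Random(0)
particles = [rng.uniform(-5.0, 5.0) for _ in range(500)]
true_x = 0.0
estimate = 0.0
for _ in range(20):
    true_x += 1.0                      # the target moves one unit per step
    z = true_x + rng.gauss(0.0, 1.0)   # noisy observation of its position
    particles, estimate = particle_filter_step(particles, z, rng)

print(round(true_x, 2), round(estimate, 2))
```

A relational variant would replace the independent per-target transition in step 1 with a transition model that also depends on relations between targets.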
Wireless Sensor Networks
Bayesian abstractions for virtual sensing through low cost data aggregation and net-wide anomaly detection
Modelling Cluster Heads as nodes of a BN
Inference to know sensor values also in presence of temporary faults:
- Lack of communication (sensor failure or sleep)
- Outlier due to sensor malfunctioning
[Figure: WSN with cluster heads CH1 to CH5 and a sink, mapped to a BN]
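The two fault cases can be illustrated with the smallest possible Bayesian network: two cluster heads joined by a linear-Gaussian dependency. The coefficients, the noise level and the three-sigma rule below are assumptions made for this sketch, not parameters from the slides.

```python
# Hypothetical linear-Gaussian link: CH2 ~ Normal(A * CH1 + B, SIGMA^2).
A, B, SIGMA = 0.9, 2.0, 0.5


def virtual_reading(ch1_value):
    """Expected CH2 value given CH1, used when CH2 is silent (failure or sleep)."""
    return A * ch1_value + B


def is_outlier(ch1_value, ch2_value, k=3.0):
    """Flag CH2 readings more than k standard deviations from the value CH1 predicts."""
    return abs(ch2_value - virtual_reading(ch1_value)) > k * SIGMA


def fuse(ch1_value, ch2_value=None):
    """Return (value, source) for CH2, covering both temporary-fault cases."""
    if ch2_value is None:                      # lack of communication
        return virtual_reading(ch1_value), "inferred"
    if is_outlier(ch1_value, ch2_value):       # sensor malfunction
        return virtual_reading(ch1_value), "replaced outlier"
    return ch2_value, "measured"


print(fuse(20.0, 20.3))   # plausible reading, kept as measured
print(fuse(20.0, 35.0))   # implausible reading, replaced by the inferred value
print(fuse(20.0))         # missing reading, inferred from CH1
```

In a network with more cluster heads, the same idea applies with the conditional distribution of a node given all of its observed neighbours.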
Transportation & Logistics
In collaboration with:
Data Models Decisions
[Equations: inequality constraints of an optimization model; not recoverable from the extracted text]
Journal Papers
F. Archetti, M. Frigerio, E. Messina, D. Toscani, IKNOS - Inference and Knowledge in Networks of Sensors, to appear in Int. Journal of Sensor Networks, 2009.
F. Chiti, R. Fantacci, F. Archetti, E. Messina, D. Toscani, An integrated Communications Framework for Context aware Continuous Monitoring with Body Sensor Networks, IEEE Journal on Selected Areas in Communications, Vol.27, No.4, pp. 379-386, 2009.
P. Dell’Olmo, A. Iovanella, G. Lulli, B. Scoppola, Exploiting Incomplete Information to manage multiprocessor tasks with variable arrival rates, Computers and Operations Research, Vol. 35, no 5, 2008.
G. Andreatta, G. Lulli, A Multi-period TSP with Stochastic Regular and Urgent Demands, European Journal of Operations Research, 2008.
D. Bertsimas, G. Lulli, A. Odoni, The ATFM Problem: An Integer Optimization Approach, Integer Programming and Combinatorial Optimization, LNCS 5035, 2008.
K.F. Doerner, W. J. Gutjahr, R.F. Hartl, G. Lulli, Stochastic Local Search Procedures for the Probabilistic Two-Day Vehicle Routing Problem, Advances in Computational Intelligence in Transportation and Logistics (A. Fink, F. Rothlauf Eds. )- Springer Series on Studies in Computational Intelligence, pp. 153-168, 2008.
G. Lulli, S. Sen, A Heuristic Algorithm for Stochastic Integer Program with Complete Recourse, European Journal of Operations Research, 2006.
Conference Papers
C. Manfredotti, Modeling and Inference with RDBNs, Canadian Artificial Intelligence Conference, Graduate Student Symposium, May, 2009.
C. Manfredotti, E. Messina, F. Archetti, Improving Multiple Target Tracking with RDBNs, working paper presented at AIROWinter 2009, International Conference of the Italian Operations Research Society, January, 2009.
F. Archetti, E. Messina, D. Toscani, M. Frigerio, KOINOS - Knowledge from observations and inference in networks of sensors, Proceedings of IASTED International Conference on Sensor Networks, 2008.
F. Archetti, C. Manfredotti, M. Matteucci, E. Messina and D. G. Sorrenti, Multiple Hypothesis Markov Chains For On-Line Anomaly Detection in Traffic Video Surveillance, Proceedings ICDP 2006: Imaging for Crime Detection and Prevention, 13-14 June 2006.
F. Archetti, C.E. Manfredotti, E. Messina, and D. G. Sorrenti, Foreground-to-ghost Discrimination in Single-difference Pre-processing, Lecture Notes in Computer Science: Advanced Concepts for Intelligent Vision Systems, ACIVS’06, 263-274, 2006.
Submitted Papers
D. Toscani, F. Archetti, E. Messina, M. Frigerio, F. Chiti, R. Fantacci. SIFNOS – Statistical Inference and Filtering in Networks of Sensors. Submitted to IEEE Journal on Selected Areas in Communications - Simple WSN Solutions, 2009.
Recent Publications
Ambient Intelligence: Currently Active Projects
LENVIS - Localised environmental and health information services for all (EU-FP7)
INSYEME – Integrated Systems for Emergencies (MIUR - FIRB)
GREIS - Gestione del Risparmio Energetico attraverso Informazioni di Sicurezza (MIUR)
In collaboration with SAL Lab.
LIMNOS Logistics and Informatics for Mobility and Network OptimiSation (MIUR)
H-CIM Health Care through Intelligent Monitoring (MIUR)
In collaboration with NOMADIS Lab.
Financial Time Series
Dynamic State Space Models for Scenario Generation
Hidden var.: Regime $x_t$
Observations: prices $S_t$
Transition Model: $p(x_t \mid x_{t-1})$ (Markov Chain)
Observation Model: $p(z_t \mid x_t)$ (Mixture of Gaussians / Autoregressive Process)
(Autoregressive) Hidden Markov Model
Regime Switching Models
[Figure: model unrolled over time steps t=1, t=2, t=3, t=4]
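Scenario generation with a regime switching model can be sketched as follows: a hidden Markov chain picks the regime at each step, and the price moves according to a Gaussian log-return whose parameters depend on that regime. The regime names, transition probabilities and return parameters are illustrative assumptions, not estimates from the cited work.

```python
import math
import random

# Hypothetical two-regime model.
TRANSITION = {
    "calm":      {"calm": 0.95, "turbulent": 0.05},
    "turbulent": {"calm": 0.10, "turbulent": 0.90},
}
RETURNS = {  # (mean, std) of the one-step log-return in each regime
    "calm": (0.0005, 0.01),
    "turbulent": (-0.001, 0.03),
}


def next_regime(regime, rng):
    """Sample the next hidden regime from the Markov chain p(x_t | x_{t-1})."""
    states = list(TRANSITION[regime])
    probs = [TRANSITION[regime][s] for s in states]
    return rng.choices(states, weights=probs, k=1)[0]


def generate_scenario(s0, regime0, horizon, rng):
    """Simulate one price path and its hidden regime path."""
    prices, regimes = [s0], [regime0]
    for _ in range(horizon):
        regime = next_regime(regimes[-1], rng)
        mean, std = RETURNS[regime]
        # Observation model: Gaussian log-return conditioned on the regime.
        prices.append(prices[-1] * math.exp(rng.gauss(mean, std)))
        regimes.append(regime)
    return prices, regimes


rng = random.Random(42)
scenarios = [generate_scenario(100.0, "calm", 4, rng) for _ in range(5)]
for prices, regimes in scenarios:
    print([round(p, 2) for p in prices], regimes)
```

Each call produces one scenario over four time steps, matching the t=1 to t=4 horizon of the slide; a set of such paths can then feed a stochastic optimization model.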
Messina, E., Toscani, D., Hidden Markov models for scenario generation, IMA Journal of Management Mathematics, Vol. 4, pp. 379-401, 2008.
Recent Publications
Perspectives
Extend state space models to more general Relational Dynamic Bayesian Networks to account not only for prices but also for “exogenous” economic factors and unstructured information
Algorithms for managing risk tracking portfolio using all available evidence and taking into account all uncertainties
Markets are good at gathering information from many heterogeneous sources and combining it appropriately; we would expect the same from models
PRIN 2007 “Probabilistic Models for representing uncertainty in portfolio optimization problems” (with Università di Bergamo and Università della Calabria)
Collaboration with Brunel University and CARISMA Research Centre.
Projects & Collaborations
A cooperation network for research projects and student mobility
University of Toronto
Massachusetts Institute of Technology
Norwegian University of Science and Technology
Brunel University
Centre of Research and Technology Hellas
Aachen University
Hungarian Academy of Sciences
- TXT e-Solutions
- Siemens
- Project Automation
- Aegate Ltd
- OptiRisk
- Astra Zeneca
- DELOS
- Comerson
CARISMA Research Center