L’età della parola (The Age of the Word)
Giuseppe Attardi
Dipartimento di Informatica
Università di Pisa
ESA SoBigData, Pisa, 24 February 2015
Natural Language Learning
Children learn to speak naturally, by talking with others
Teach computers to learn language in a similarly natural way
Statistical Machine Learning
Training on large document collections
Requires the ability to process Big Data
If we had used the same algorithms 10 years ago, they would still be running
The Unreasonable Effectiveness of Big Data
Example: Machine Translation
Arabic to English, five-gram language models, of varying size
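As a rough illustration of what an n-gram language model estimates, the sketch below computes a maximum-likelihood probability of a word given its preceding context from raw counts. It is a toy example under stated assumptions: the function names `ngram_counts` and `ngram_probability` are hypothetical, the corpus is a made-up sentence, and no smoothing is applied, which real five-gram models trained on large collections would need.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) occurring in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_probability(tokens, context, word):
    """Maximum-likelihood estimate of P(word | context), with no smoothing.

    The n-gram order is len(context) + 1, so a four-word context
    corresponds to a five-gram model.
    """
    context = tuple(context)
    counts = ngram_counts(tokens, len(context) + 1)
    # Occurrences of the context followed by the requested word.
    joint = counts[context + (word,)]
    # Occurrences of the context followed by any word.
    total = sum(c for gram, c in counts.items() if gram[:-1] == context)
    return joint / total if total else 0.0


# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat the cat ran".split()
p = ngram_probability(corpus, ("the", "cat"), "sat")
```

With more text, more of the possible contexts are observed at least once, which is one way to read the claim that larger models translate better.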
Deep Learning Breakthrough: 2006
Output layer: prediction of target
Hidden layers: learn more abstract representations
Input layer: raw input
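The layered structure can be sketched as a forward pass through a small feedforward network: raw input goes in, each hidden layer produces a new representation, and the output layer turns the last representation into a prediction. This is a minimal sketch, not the architecture from the talk: the `tanh` activation, the softmax output, the function names, and the toy weights are all assumptions chosen for illustration.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one row of weights and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(raw_input, hidden_layers, output_layer):
    """Propagate the raw input through the hidden layers to a prediction."""
    activation = raw_input
    for weights, biases in hidden_layers:
        # Each hidden layer re-represents the output of the layer below it.
        activation = [math.tanh(z) for z in dense(activation, weights, biases)]
    weights, biases = output_layer
    return softmax(dense(activation, weights, biases))


# Toy weights (assumed values): 3 inputs, one hidden layer of 2 units, 2 classes.
hidden = [([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1])]
output = ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])
prediction = forward([1.0, 0.5, -1.0], hidden, output)
```

In practice the weights are learned from data rather than written by hand; the sketch only shows how information flows from the input layer to the output layer.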
Lots of Unlabeled Data
Language Model
Corpus: 2 B words
Dictionary: 130,000 most frequent words
4 weeks of training
Parallel + CUDA algorithm: 2 hours
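Restricting the dictionary to the most frequent words can be sketched as follows. This is an illustration under assumptions rather than the procedure used in the talk: the `<unk>` placeholder for out-of-dictionary words, the function names, and the tiny corpus are hypothetical, and a real run would use a size such as the 130,000 words mentioned above.

```python
from collections import Counter

UNKNOWN = "<unk>"  # hypothetical placeholder for words outside the dictionary


def build_dictionary(tokens, max_size):
    """Map the max_size most frequent words to integer ids; id 0 is reserved."""
    index = {UNKNOWN: 0}
    for word, _ in Counter(tokens).most_common(max_size):
        if word not in index:
            index[word] = len(index)
    return index


def encode(tokens, dictionary):
    """Replace each token by its id, falling back to the unknown id."""
    unknown_id = dictionary[UNKNOWN]
    return [dictionary.get(token, unknown_id) for token in tokens]


# Toy corpus (an assumption for illustration only).
corpus = "a a a b b c".split()
dictionary = build_dictionary(corpus, max_size=2)
ids = encode(["a", "c", "b"], dictionary)
```

Capping the dictionary keeps the input and output layers of the model at a fixed, manageable size, at the cost of treating all rare words alike.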
A Unified Deep Learning Architecture for NLP
NER (Named Entity Recognition)
POS tagging
Chunking
Parsing
SRL (Semantic Role Labeling)
Sentiment Analysis
Deep Text Analysis
Parsing
Word Sense Disambiguation
Anaphora Resolution
Information Extraction
Sentiment Analysis
Text Entailment
Question Answering
Biomedical Text Analysis
QA on Alzheimer's Disease
the γ-secretase inhibitor Semagacestat failed to slow cognitive decline
Annotation labels: disorder (SNOMED: C0236848), protein, drug, substance
QA on Alzheimer Competition
Dependency relations: SUBJ, OBJ, APPO, OBJ, ROOT
Correlation Symptoms-Diseases
Big data, Big Brain
Google DistBelief
Cluster capable of simulating 100 billion connections
Used to learn unsupervised image classification
Used to produce tiny ASR model
Similar basic capability for processing image, audio and language
European FET Brain project