using machine learning to predict temporal orientation of search engine queries in the temporalia...

Upload: mfilannino

Post on 04-Feb-2018


TRANSCRIPT

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    1/98

    Tokyo, 11/12/2014

    presentation: NTCIR-11 Temporalia

    Using machine learning to predict temporal orientation of search engine queries

    in the Temporalia challenge

    Michele Filannino, Goran Nenadic

  • 3/98

    temporal intent of queries (TIQ)

    Given a user query and its submission time, can a system predict its temporal intent?

      • input: queries & submission date
      • output: temporal intent (PAST, RECENCY, FUTURE or ATEMPORAL)
      • easy for people
      • hard for machines

  • 4/98

    Source: https://www.google.co.uk/search?q=google+stock+price

    TQI: recency

  • 5/98

    Source: https://www.google.co.uk/search?q=weather+forecast+manchester

    TQI: future

  • 6/98

    Source: https://www.google.co.uk/search?q=who+was+eliminated+on+dancing+with+the+stars

    TQI: past

  • 7/98

    Source: https://www.google.co.uk/search?q=who+was+eliminated+on+dancing+with+the+stars

    TQI: atemporal

  • 8/98

    the data

      • training set: 80 instances + 20 instances (released as preliminary test set)
      • test set: 300 instances

  • 9/98

    proposed approach

      • data-driven rather than rule-based
      • low-sparsity attributes
      • external resources:
          TempoWordNet [1], a temporal lexical KB
          ManTIME, a temporal expression extraction system
          NLTK

    [1] G. H. Dias, M. Hasanuzzaman, S. Ferrari, and Y. Mathet. TempoWordNet for sentence time tagging. In Proceedings of the 23rd International Conference on World Wide Web Companion, pages 833-838, Republic and Canton of Geneva, Switzerland, 2014.

  • 10/98

    ManTIME [1] usage

      • an ML-based temporal expression extraction system

    madden 2014 release date

    drudge report 2013 september

    [1] M. Filannino, G. Brown, and G. Nenadic. ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge. In Proceedings of SemEval 2013, pages 53-57, Atlanta, USA, June 2013. ACL.

  • 11/98

    trigger classes

      • Feature selection: RELIEF algorithm
      • BOW representation
      • 4 dictionaries (1 per class)

    PAST (21 triggers): ancient, days, death, did, history, last, months, ...
    RECENCY (44 triggers): actual, cost, costs, current, daily, day, direction, ...
    FUTURE (27 triggers): agenda, calendar, chance, coming, dates, forecast, forthcoming, ...
    ATEMPORAL (2 triggers): chords, lyrics

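The two trigger-based attributes used later in the deck (most frequent trigger class and trigger classes footprint) can be sketched as below. This is only an illustration under stated assumptions: the dictionaries contain just the trigger words visible on the slide, not the full lists, the function names are hypothetical, and the footprint is assumed to be the sequence of matched classes in order of appearance.

```python
from collections import Counter

# Partial trigger dictionaries: only the words visible on the slide.
TRIGGERS = {
    "PAST": {"ancient", "days", "death", "did", "history", "last", "months"},
    "RECENCY": {"actual", "cost", "costs", "current", "daily", "day", "direction"},
    "FUTURE": {"agenda", "calendar", "chance", "coming", "dates", "forecast", "forthcoming"},
    "ATEMPORAL": {"chords", "lyrics"},
}


def trigger_classes(query):
    """Return the trigger class of each matching token, in query order."""
    classes = []
    for token in query.lower().split():
        for label, words in TRIGGERS.items():
            if token in words:
                classes.append(label)
    return classes


def most_frequent_trigger_class(query):
    """Most common trigger class in the query, or None when no trigger fires."""
    counts = Counter(trigger_classes(query))
    return counts.most_common(1)[0][0] if counts else None


def trigger_footprint(query):
    """Hyphen-joined trigger classes in order of appearance (assumed definition)."""
    return "-".join(label.lower() for label in trigger_classes(query))
```

With these partial dictionaries, the query "how did hitler die" fires only the PAST trigger "did".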
  • 12/98

    attributes

    Sparsity is measured on the full data set: training + test

    #   Attribute description                        Sparsity  Example: input (query/time) → attribute value
    1   Is it a Wikipedia page title?                2         New York Times → YES
    2   Does it contain a temporal expression?       2         june 2013 movies → YES
    3   Submission's term                            3         Feb 28, 2013 GMT+0 → B
    4   Submission's trimester                       4         Aug 26, 2013 GMT+0 → M2
    5   Timing                                       4         Movies 2012, Feb 28, 2013 → past
    6   Most frequent trigger class                  5         peso dollar exchange rate → present
    7   Wh type                                      5         how did hitler die → how
    8   Most frequent TempoWordNet class             5         current stock prices → present
    9   Most frequent POS tag tense                  7         what is stop kony 2012 → VBZ
    10  Most frequent coarse-grained POS tag         8         kony 2012 fake → N
    11  Trigger classes footprint                    11        what was I thinking lyrics → past-atemporal
    12  Temporal Δ between submission and query      16        fathers day 2010, Feb 28, 2013 → 36.0
    13  Tenses footprint                             18        when does fall start → VBZ-VB
    14  Ordered TempoWordNet classes                 18        the last song → past-future-present-
    15  Most frequent fine-grained POS tag           21        kony 2012 fake → NN
    16  Coarse-grained POS tag ordered footprint     119       when is labour day → N-W-V
    17  Fine-grained POS tag ordered footprint       202       when is labour day → NN-WRB-VBZ
    18  Coarse-grained POS tag footprint             204       when is labour day → W-V-N-N
    19  Fine-grained POS tag footprint               265       when is labour day → WRB-VBZ-NN-NN
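As a rough illustration of the "Timing" attribute (row 5), the sketch below compares a year mentioned in the query with the submission year. It is an assumption-laden stand-in: the deck relies on ManTIME to find temporal expressions, whereas this sketch uses a plain four-digit-year regex, and the function name and the "present"/"none" values are hypothetical.

```python
import re
from datetime import date


def timing(query, submitted):
    """Compare the first four-digit year in the query with the submission year."""
    match = re.search(r"\b([12][0-9]{3})\b", query)
    if match is None:
        return "none"  # no explicit year found in the query
    year = int(match.group(1))
    if year < submitted.year:
        return "past"
    if year > submitted.year:
        return "future"
    return "present"
```

For the table's example, `timing("Movies 2012", date(2013, 2, 28))` gives "past", which agrees with the value shown on the slide.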

  • 13/98

    run 1: minimal

      • classifier: SVM with polynomial kernel

    #   Attribute description                        Sparsity
    2   Does it contain a temporal expression?       2
    5   Timing                                       4
    6   Most frequent trigger class                  5
    9   Most frequent POS tag tense                  7
    11  Trigger classes footprint                    11

    * default parameters (C and gamma)

  • 14/98

    run 2: intermediate

      • classifier: SVM with polynomial kernel

    #   Attribute description                        Sparsity
    1   Is it a Wikipedia page title?                2
    2   Does it contain a temporal expression?       2
    5   Timing                                       4
    7   Wh type                                      5
    9   Most frequent POS tag tense                  7
    10  Most frequent coarse-grained POS tag         8
    11  Trigger classes footprint                    11
    12  Temporal Δ between submission and query      16
    13  Tenses footprint                             18
    15  Most frequent fine-grained POS tag           21

    * default parameters (C and gamma)

  • 15/98

    run 3: full

      • classifier: Random Forests

    #   Attribute description                        Sparsity
    1   Is it a Wikipedia page title?                2
    2   Does it contain a temporal expression?       2
    3   Submission's term                            3
    4   Submission's trimester                       4
    5   Timing                                       4
    6   Most frequent trigger class                  5
    7   Wh type                                      5
    8   Most frequent TempoWordNet class             5
    9   Most frequent POS tag tense                  7
    10  Most frequent coarse-grained POS tag         8
    11  Trigger classes footprint                    11
    12  Temporal Δ between submission and query      16
    13  Tenses footprint                             18
    14  Ordered TempoWordNet classes                 18
    15  Most frequent fine-grained POS tag           21
    16  Coarse-grained POS tag ordered footprint     119
    17  Fine-grained POS tag ordered footprint       202
    18  Coarse-grained POS tag footprint             204
    19  Fine-grained POS tag footprint               265

    * 1000 random trees

  • 16/98

    results (submitted runs)

    [Bar chart: accuracy (0-100) for the Full, Intermediate and Minimal runs; reference line: 1st ranked system]

  • 17/98

    results: 5 x 10-fold cross-validation

    [Bar chart: accuracy (0-100) for the Full, Intermediate and Minimal runs; reference line: 1st ranked system]
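The "5 x 10" protocol named in this slide title can be sketched with the standard library alone: ten folds, reshuffled and repeated five times. The function name, the seed, and the choice of 100 items (the 80 + 20 training queries) are assumptions made for this illustration; the deck does not say which tool produced the splits.

```python
import random


def repeated_kfold(n_items, k=10, repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_items))
        rng.shuffle(indices)  # a fresh shuffle for every repetition
        folds = [indices[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield r, f, train_idx, test_idx


# 5 repetitions x 10 folds over 100 training queries (assumed size).
splits = list(repeated_kfold(100))
```

Each of the 50 splits trains on 90 queries and tests on the remaining 10; accuracy would then be averaged over all splits.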

  • 18/98

    a posteriori fix

    [Bar chart: accuracy (0-100) for the Full, Intermediate, Minimal and Minimal fixed runs; labelled value: 72.33%]

    best combination of attributes

  • 19/98

    how to reach the peak

  • 20/98

    confusion matrix (minimal run)

                 Classified as
                 Recency  Past  Future  Atemporal
    Recency      43       0     21      11
    Past         3        60    6       6
    Future       38       0     35      2
    Atemporal    6        5     3       61
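Overall accuracy and per-class precision and recall can be read off the matrix as sketched below. The numbers are copied from the slide; treating rows as gold classes and columns as predictions is an assumption, supported by every row summing to 75 (a balanced 300-query test set).

```python
LABELS = ["Recency", "Past", "Future", "Atemporal"]

# Rows: gold class; columns: predicted class (assumed orientation).
MATRIX = [
    [43, 0, 21, 11],
    [3, 60, 6, 6],
    [38, 0, 35, 2],
    [6, 5, 3, 61],
]

total = sum(sum(row) for row in MATRIX)
correct = sum(MATRIX[i][i] for i in range(len(LABELS)))
accuracy = correct / total


def recall(i):
    """Share of gold instances of class i that were predicted as class i."""
    return MATRIX[i][i] / sum(MATRIX[i])


def precision(i):
    """Share of predictions of class i that were actually class i."""
    return MATRIX[i][i] / sum(row[i] for row in MATRIX)
```

The diagonal holds 199 of the 300 queries, an accuracy of about 66.33%. Most errors come from Recency and Future being confused with each other (21 and 38 queries).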


  • 22/98

    difficult queries

      • iPhone 5 release date
          it can be FUTURE or PAST according to the submission time
          keywords don't help here
      • 2061: Odyssey Three
          keywords can lie!
      • season 2 dexter
          use of external sources of knowledge
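The first case above depends on the submission time rather than on keywords. A minimal sketch, assuming a small hand-made lookup table as the external source (the table and function name are hypothetical; the iPhone 5 went on sale on 21 September 2012):

```python
from datetime import date

# Hypothetical external knowledge base mapping entities to release dates.
RELEASE_DATES = {"iphone 5": date(2012, 9, 21)}


def release_intent(entity, submitted):
    """FUTURE if the query was submitted before the release, otherwise PAST."""
    release = RELEASE_DATES[entity.lower()]
    return "FUTURE" if submitted < release else "PAST"
```

The same query text therefore yields different labels for different submission dates, which no keyword dictionary can capture.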

  • 23/98

    difficult queries

      • iPhone 5 release date
          it can be FUTURE or PAST
          keywords don't help here
      • Ventura Stern 2016
          keywords could possibly lie
      • season 2 dexter
          use of external sources of knowledge

  • 24/98

    temporal footprint

    a continuous period on the time-line that temporally defines the existence of a particular concept.

    Source: Filannino, M., Nenadic G. Mining temporal footprints from Wikipedia. Proceedings of the First AHA!-Workshop on Information Discovery in Text. (COLING 2014) (Dublin, Ireland, August 2014), ACL.

  • 25/98

    online material

    Source: http://www.cs.man.ac.uk/~filannim/projects/temporalia/

  • 26/98

    Thank you

  • 27/98

    QUESTIONS?

    Contact:

    [email protected]

  • 28/98

    Natural Language Processing

    Linguistics

    Parallel computing

    Semi-structured data

    Statistics

    Machine Learning

    Text Mining

  • 29/98

    the task

      • source: written texts
      • goal: a (machine-understandable) temporal representation of the texts
      • easy for people
      • hard for machines

    Temporal aspects of events provide a natural mechanism for organising information

  • 30/98

    linguistic key concepts

      • temporal expressions: phrases denoting a temporal entity such as an interval or a time point
          01/05/2014, March 15, the next week, Saturday, at that time, yesterday, 5 o'clock, 3 days, every 4 hours
      • events: phrases denoting eventuality and states
          inflected verbs and nouns: spoken, deliver, will be published
      • links: temporal relation between two phrases
          BEFORE, AFTER, INCLUDES, ENDS, DURING, BEGINS

    Source: ISO-TimeML (ISO/TC37/SC 4 N412), rev. 12, 2007

  • 31/98

    example

    Yesterday, Deutsche Bank released a note saying that China's current economic policies would result in an enormous surge in coal consumption over the next decade.

    Source: CNN news article published on 28th February 2010.

  • 32/98

    example: temporal expressions

    Yesterday(T), Deutsche Bank released a note saying that China's current economic policies would result in an enormous surge in coal consumption over the next decade(T).

    Yesterday → value: 2010-02-27, type: DATE
    the next decade → value: P10Y, type: DURATION

    Source: CNN news article published on 28th February 2010.
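The DATE value above follows from anchoring the deictic expression "Yesterday" to the article's publication date. A minimal sketch of that single step (the offset table and function name are hypothetical and cover only three expressions; real normalisers such as ManTIME handle far more):

```python
from datetime import date, timedelta

# Hypothetical offsets, in days, for a few deictic day expressions.
DEICTIC_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def normalise_deictic(expression, utterance):
    """Resolve a deictic day expression against the utterance date (ISO 8601)."""
    offset = DEICTIC_OFFSETS[expression.lower()]
    return (utterance + timedelta(days=offset)).isoformat()
```

With the utterance date 28 February 2010, "Yesterday" resolves to 2010-02-27, the value shown on the slide.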

  • 33/98

    example: events

    Yesterday(T), Deutsche Bank released(E) a note saying(E) that China's current economic policies would result(E) in an enormous surge(E) in coal consumption over the next decade(T).

    released → class: OCCURRENCE
    saying → class: REPORTING
    result → class: OCCURRENCE
    surge → class: OCCURRENCE

    Source: CNN news article published on 28th February 2010.

  • 34/98

    example: links

    Yesterday(T), Deutsche Bank released(E) a note saying(E) that China's current economic policies would result(E) in an enormous surge(E) in coal consumption over the next decade(T).

    Link labels shown: is included, is included, after, is included

    Source: CNN news article published on 28th February 2010.

  • 35/98

    example: ISO-TimeML output

    nyt_20100228_china_pollution

    Yesterday, Deutsche Bank released a note saying that China's current economic policies would result in an enormous surge in coal consumption over the next decade.

  • 36/98

    visual representation

    Utterance time: 28th February 2010.

    [Timeline: 27 Feb. 2010 (released, saying), now, 2020 (surge)]

  • 37/98

    TempEval-3 results

    Legend: Rule-based / Machine learning-based

    Research group                               Identification        Normalisation  Overall
                                                 Prec.  Rec.   F1      accuracy       score
    The University of Heidelberg                 0.93   0.88   0.9     0.86           0.776
    US Naval Academy                             0.89   0.91   0.9     0.79           0.71
    The University of Manchester                 0.95   0.85   0.9     0.77           0.69
    Stanford University                          0.89   0.91   0.9     0.75           0.674
    AT&T Lab Research                            0.98   0.75   0.85    0.77           0.656
    University of Colorado Boulder               0.94   0.87   0.9     0.72           0.647
    Jadavpur University                          0.93   0.8    0.86    0.74           0.638
    Katholieke Universiteit Leuven               0.93   0.76   0.84    0.75           0.63
    Joint Research Centre European Commission    0.9    0.8    0.85    0.68           0.582
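The F1 column is the harmonic mean of the precision and recall columns. The sketch below recomputes it from the rounded values in the table, so small differences from the official figures are possible; the function name is our own.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

For example, the Manchester row (precision 0.95, recall 0.85) gives roughly 0.897, which the table rounds to 0.9.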

  • 38/98

    model selection

    93 features, 4 models:

      • M1: morpho-lexical only
      • M2: morpho-lexical + syntactic
      • M3: morpho-lexical + gazetteers
      • M4: morpho-lexical + gazetteers + WordNet

    * 5x10-fold cross validation

    Source: Filannino, M., and Nenadic G. ManTIME: Temporal expression extraction with systematic feature type selection and a posteriori label adjustment. Journal of Information Processing and Management: Special Issue on Time and Information Retrieval, (2014), Elsevier. (under review)

  • 39/98

    Better software, better research

  • 40/98

    temporal footprint

    A temporal footprint is a continuous period on the time-line that temporally defines the existence of a particular concept.

    Source: Filannino, M., Nenadic G. Mining temporal footprints from Wikipedia. Proceedings of the First AHA!-Workshop on Information Discovery in Text. (COLING 2014) (Dublin, Ireland, August 2014), ACL.

  • 41/98

    evaluation

      • subjects: people
      • lived from 1000 AD to 2014
          text from Wikipedia web pages
          year of birth and death from DBpedia
      • 228,824 people collected
      • simple definition of temporal footprint
          birth and death dates

  • 42/98

    results

      • Galileo Galilei (1564-1642), prediction: 1556-1654

    Error: 0.204
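The slide does not define the error measure. The reported 0.204 is consistent with one minus the overlap-over-union (Jaccard) of the gold and predicted intervals, so the sketch below adopts that reading as an assumption; the function name is hypothetical.

```python
def interval_error(gold, predicted):
    """One minus the Jaccard overlap of two (start_year, end_year) intervals."""
    overlap = max(0, min(gold[1], predicted[1]) - max(gold[0], predicted[0]))
    gold_len = gold[1] - gold[0]
    pred_len = predicted[1] - predicted[0]
    union = gold_len + pred_len - overlap
    return 1 - overlap / union


galileo = interval_error((1564, 1642), (1556, 1654))
```

For Galileo the overlap is 78 years and the union 98 years, giving 1 - 78/98 ≈ 0.204.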

  • 43/98

    results

      • Computer (1940-today), prediction: 1882-1982

    Source: http://www.cs.man.ac.uk/~filannim/projects/temporal_footprints/

  • 44/98

    application?

    Source: http://start.csail.mit.edu/answer.php?query=

  • 45/98

    i2b2 shared task '12

    ADMISSION DATE: 2011-02-06; DISCHARGE DATE: 2011-02-08; HISTORY OF PRESENT ILLNESS: Mr. Pohl is a 53-year-old male with history of alcohol use and hypertension. Blood alcohol level was 383. Agitated in emergency room requiring 4 leather restraints, received 5 mg of Haldol, 2 mg of Ativan. He became hypotensive in the emergency room with a systolic blood pressure in the 80's and had decreased respiratory rate. He received a normal saline bolus of 2 litres of good blood pressure response. The patient was then admitted to the medical Intensive Care Unit for observation and then transferred to our service on medicine when the blood pressures remained stable overnight...

    [Timeline: 06/02/2011, 07/02/2011, 08/02/2011; lanes: General, Tests, Treatments, Problems]

    Timeline entries: admission, discharge, BAL 383, Haldol 4mg, Ativan 2mg, hypotensive, SBP ~80, decreased respiratory rate, Saline bolus 2l, transfer, stable, SBP stable, hands tremor improved, blood pressure medications

    Source: Kovačević, A., Dehghan, A., Filannino, M., Keane, J. A., and Nenadic, G. Combining rules and machine learning for extraction of temporal expressions and events from clinical narratives. Journal of the American Medical Informatics Association (2013).

  • 46/98

    clinical data

      • disease progression modelling
      • analysis of the effectiveness of treatments
      • extraction of patients' clinical pathway

  • 47/98

    1st year backup

  • 48/98

    identification techniques

    [Timeline: 2000-2013]

    TimeML (standard)
    ACE-2004 dev & eval (TERN2004 corpus)
    TimeBank (corpus)
    Hand grammar approach (rule-based)
    TempEval Task #15 (in SemEval'07)
    TempEval-2 Task #13 (in SemEval'10)
    TempEval-3 Task #1 (in SemEval'13)
    Markov logic network (machine learning)
    SVM (machine learning)
    Maximum Entropy Class. (machine learning)
    Conditional Random Fields (machine learning)

  • 49/98

    scientific interest

    [Chart: yearly counts (y-axis 0-70), 2000-2011, for the query "temporal expressions AND clinical"]

    Source: Google Scholar (27/02/2012)

  • 50/98

    conferences & journals

      • SemEval: Evaluation Exercises on Semantic Evaluation
          TempEval: Temporal Evaluation Task
      • TIME: Time International Symposium Series
      • JAMIA: Journal of the American Medical Informatics Association
      • COLING: Computational Linguistics Conference
      • IJHI: International Journal of Health Informatics

  • 51/98

    ISO-TimeML

      • DATE
          [YYYY-MM-DD]
      • TIME
          [date]T[hh:mm:ss]
      • SET
          P[[n][Y/M/D/w/h/m/s]]
      • DURATION
          R[n][set]

  • 52/98

    temporal forms

      • time or date references: 11pm, February 14th
      • time references that anchor on another time: one hour after midnight
      • durations: two days, five years
      • recurring times: twice in the hour
      • context-dependent times: today, last year
      • vague references: the near future
      • times indicated by an event: the day after Silvio Berlusconi resigned

    Source: J. Poveda, M. Surdeanu, and J. Turmo, An analysis of Bootstrapping for the Recognition of Temporal Expressions, 2009

  • 53/98

    temporal binding

      • fully-qualified: no reference to any other temporal entity
          March 15, 2001
      • deictic: reference to the time of utterance
          today, yesterday, three weeks ago, last Thursday
      • anaphoric: reference to a temporal expression previously evoked in the text
          March 15, the next week, Saturday, at that time

    Source: D. Ahn, S. Fissaha Adafre, and M. de Rijke, Towards Task-Based Temporal Extraction and Recognition, 2005

  • 54/98

    NorMA architecture

      • design, implement and evaluate a novel:
          identification architecture
          normalisation architecture
      • investigate the difference between general and clinical domain
      • investigate the use of the proposed frameworks to the general domain
      • suggest a more temporally-aware error measure for normalisation phase

  • 55/98

    clinical NorMA architecture

  • 56/98

    clinical NorMA pipeline

  • 57/98

    example of clinical rule

    import re

    # Excerpt from the body of a rule function; add_date and reference_date
    # are defined elsewhere and are not shown on the slide.
    pattern = re.findall(
        r'^(?:the |her |his |their )?([0-9][0-9]?)(?:st|nd|rd|th) '
        r'(?:post-|post|day)? ?(?:pod|operative|op|hospital|hsp|day|hd)(?:ly)? '
        r'(?:day|night|afternoon)?$', raw_expression)
    if pattern:
        value = add_date(reference_date, int(pattern[0]))
        return expression, 'DATE', value, 'postoperative_literals3'

    Returned fields: expression, temporal expression type, ISO-8601 representation (value), rule name
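A self-contained variant of such a rule is sketched below so that the flow (match, offset the reference date, return the four fields) can be run end to end. It is not the rule from the slide: the pattern is deliberately simplified, and `add_date`, `normalise` and the rule name are hypothetical stand-ins written for this illustration.

```python
import re
from datetime import date, timedelta

# Simplified pattern written for this sketch; not the rule shown on the slide.
POSTOP_DAY = re.compile(
    r"^(?:the |her |his |their )?([0-9]{1,2})(?:st|nd|rd|th) "
    r"(?:post-?operative|post-?op|hospital) day$"
)


def add_date(reference, days):
    """Hypothetical helper: shift the reference date, return an ISO 8601 string."""
    return (reference + timedelta(days=days)).isoformat()


def normalise(raw_expression, reference_date):
    """Return (expression, type, value, rule name), or None if the rule does not fire."""
    match = POSTOP_DAY.match(raw_expression.lower())
    if match is None:
        return None
    value = add_date(reference_date, int(match.group(1)))
    return raw_expression, "DATE", value, "postoperative_day_sketch"
```

For instance, `normalise("the 3rd postoperative day", date(2011, 2, 6))` yields the value "2011-02-09", while an expression the pattern does not cover returns None.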

  • 58/98

    Rules activation distribution

  • 60/98

    example: raw text

    Admission Date :
    02/01/2002

    Discharge Date :
    02/08/2002

    HISTORY OF PRESENT ILLNESS :
    Saujule Study is a 77-year-old woman with a history of obesity and hypertension who presents with increased shortness of breath x 5 days. Her shortness of breath has been progressive over the last 2-3 years. On admission , she was diuresed with Lasix and was negative 1-2 liters per day for several days.

    Source: i2b2 2012 clinical corpus

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    61/98

    11/12/2014, Tokyo

    presentation: NTCIR-11 Temporalia

    /25Source: i2b2 2012 clinical corpus

    example: identification

    Unisys must pay about $100 million in interest every quarter, on

    top of $27 million in dividends on preferred stock.

    61

    Admission Date :

    02/01/2002

    Discharge Date :

    02/08/2002

    HISTORY OF PRESENT ILLNESS :

    Saujule Study is a 77-year-old woman with a history of obesity and

    hypertension who presents with increased shortness of breath x 5

    days. Her shortness of breath has been progressive over the last2-3 years. On admission , she was diuresed with Lasix and was

    negative 1-2 liters per day for several days.


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    62/98

    Source: i2b2 2012 clinical corpus

    example: normalisation

    02/01/2002

    02/08/2002

    5 days

    2-3 years

    several days
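As an illustration of what a normalisation step produces, the sketch below maps the expressions listed above to TIMEX3-style values. The rule set, the function name `normalise` and the month/day/year reading of the dates are assumptions made for this example; they are not taken from the slides.

```python
import re

# Hypothetical unit table for this illustration only.
UNITS = {"day": "D", "week": "W", "month": "M", "year": "Y"}

def normalise(expression):
    """Return a (type, value) pair in TIMEX3 style, or None if no rule fires."""
    text = expression.strip().lower()
    # Dates such as 02/01/2002, read as month/day/year (an assumption).
    m = re.fullmatch(r"(\d{2})/(\d{2})/(\d{4})", text)
    if m:
        month, day, year = m.groups()
        return ("DATE", f"{year}-{month}-{day}")
    # Exact durations such as "5 days".
    m = re.fullmatch(r"(\d+) (day|week|month|year)s?", text)
    if m:
        return ("DURATION", f"P{m.group(1)}{UNITS[m.group(2)]}")
    # Vague durations such as "several days": X marks the unknown amount.
    m = re.fullmatch(r"(several|few|many) (day|week|month|year)s?", text)
    if m:
        return ("DURATION", f"PX{UNITS[m.group(2)]}")
    return None

for expr in ["02/01/2002", "02/08/2002", "5 days", "2-3 years", "several days"]:
    print(expr, "->", normalise(expr))
```

The range "2-3 years" is deliberately left without a rule here, since how a range should be represented is a design decision the sketch does not try to settle.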


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    63/98


    n ear backu


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    64/98


    ml-driven identification phase

    " Conditional Random Fields

    Features: harvested from the literature

    Tagging scheme: BIO (beginning, inside, outside)

    Factor graph:
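The BIO tagging scheme named above can be illustrated with a small sketch. The helper `bio_tags`, the example tokens and the annotated span are hypothetical and chosen only to show how B, I and O labels are assigned.

```python
def bio_tags(tokens, spans):
    """Label tokens with B/I/O given (start, end) token spans, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"            # first token of an expression
        for i in range(start + 1, end):
            tags[i] = "I"            # remaining tokens of the same expression
    return tags

tokens = "progressive over the last 2-3 years .".split()
# Hypothetical annotation: "the last 2-3 years" is one temporal expression.
spans = [(2, 6)]
for token, tag in zip(tokens, bio_tags(tokens, spans)):
    print(f"{token}\t{tag}")
```

A sequence labeller such as a CRF is then trained to predict one of these three labels per token.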


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    65/98

    Source: Richard P. Feynman's page

    factor graph

    ... was | discovered | in | 1977 | , | Feynman | immediately ...

    w-3 w-2 w-1 w0 w+1 w+2 w+3
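The position labels above suggest a window of three tokens on either side of the current word. A minimal sketch of such a window, assuming a hypothetical helper `window_features` and a `<PAD>` symbol for positions outside the sentence:

```python
def window_features(tokens, index, size=3):
    """Collect the words at offsets -size..+size around tokens[index]."""
    features = {}
    for offset in range(-size, size + 1):
        # Keys follow the slide's labels: w-3 ... w0 ... w+3.
        key = "w0" if offset == 0 else f"w{offset:+d}"
        position = index + offset
        if 0 <= position < len(tokens):
            features[key] = tokens[position]
        else:
            # Padding for positions outside the sentence (an assumption).
            features[key] = "<PAD>"
    return features

tokens = ["was", "discovered", "in", "1977", ",", "Feynman", "immediately"]
print(window_features(tokens, tokens.index("1977")))
```

With "1977" as the current word, the window covers exactly the seven tokens shown in the example sentence.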


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    66/98


    unique values per feature

    [Bar chart: number of unique values per feature; vertical axis from 0 to 12,000. Features on the horizontal axis, in the order shown: _word, _word_preprocessed, lex_lemma, lex_porter_stem, lex_treetagger_lemma, lex_prefix, lex_lancaster_stem, lex_suffix, lex_token_with_no_letters, lex_extended_pattern, lex_vocal_pattern, lex_pattern, lex_treetagger_pos, lex_token_with_no_letters_and_numbers, lex_tense, gaz_countries, gaz_iso_countries, gaz_nationalities, gaz_uscities, lex_polarity, TIMEX3(class), gaz_female_names, gaz_festivities, gaz_male_names, gaz_stopword, lex_first_upper, lex_has_digit, lex_has_symbols, lex_is_all_caps_and_dots, lex_is_all_digits_and_dots, lex_is_alnum, lex_is_alpha, lex_is_decimal, lex_is_digit, lex_is_lower, lex_is_numeric, lex_is_title, lex_is_upper, lex_last_s, lex_unusual, temp_cardinal, temp_compound, temp_digit, temp_festivity, temp_future_ref, temp_fuzzy_quantifier, temp_literal_number, temp_modifier, temp_month, temp_number, temp_ordinal, temp_past_ref, temp_period, temp_pod, temp_present_ref, temp_season, temp_signal, temp_temporal_adjectives, temp_temporal_adverbs, temp_temporal_co-reference, temp_temporal_conjunctives, temp_temporal_prepositions, temp_time, temp_weekday, temp_year, lex_chunk, lex_is_space, lex_pnp, phon_first_phoneme, phon_form, phon_last_phoneme, phon_length.]

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    67/98


    Post-processing analysis


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    68/98


    Temporal

    " ManTIME

    " wikipedia pages

    " using dates only

    " gaussian fit

    to be improved

    Galileo Galilei (1564-1642)

    Dante (1265-1321)
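The "gaussian fit" step listed above can be sketched as follows: fit a normal distribution to the years mentioned on a page and read an interval off it. The function `footprint`, the choice of mean plus or minus one standard deviation, and the list of years are all assumptions for illustration; the slides do not say how the interval is derived from the fit.

```python
from statistics import mean, stdev

def footprint(years, width=1.0):
    """Fit a normal distribution to the years and return (start, end).

    The interval mean +/- width * standard deviation is an assumption
    of this sketch, not a documented detail of the original system.
    """
    mu = mean(years)
    sigma = stdev(years)
    return round(mu - width * sigma), round(mu + width * sigma)

# Made-up years, standing in for the dates extracted from a page.
years = [1564, 1581, 1589, 1592, 1609, 1610, 1616, 1632, 1633, 1642]
print(footprint(years))
```

A wider or narrower `width` trades coverage of the true lifespan against over-extension beyond it.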

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    69/98


    ManTIME architecture


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    70/98


    feature type selection

    " 93 features

    morpho-lexical, syntactic, gazetteers and WordNet

    " 4 models

    M1: morpho-lexical only

    M2: morpho-lexical + syntactic

    M3: morpho-lexical + gazetteers

    M4: morpho-lexical + gazetteers + WordNet

    " model selection


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    71/98

    Silver + Gold, 5x10-fold cross validation

    model selection result

    That means Unisys must pay about $100 million in interest every

    quarter, on top of $27 million in dividends on preferred stock.

    " M1: morpho-lexical only

    " M2: morpho-lexical + syntactic

    " M3: morpho-lexical + gazetteers

    " M4: morpho-lexical + gazetteers + WordNet
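The evaluation protocol named on this slide, "5x10-fold cross validation", is read here as five repetitions of 10-fold cross validation with a fresh shuffle each time. That reading, the function name `repeated_kfold` and the seed are assumptions; the sketch only shows how the train/test splits could be generated.

```python
import random

def repeated_kfold(n_items, k=10, repeats=5, seed=0):
    """Yield (train, test) index lists for `repeats` shuffled k-fold rounds."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_items))
        rng.shuffle(indices)
        # Slicing with a stride of k spreads the items over k folds.
        folds = [indices[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
            yield train, test

splits = list(repeated_kfold(n_items=100))
print(len(splits))  # 50 train/test splits: 5 repetitions of 10 folds
```

Each candidate model (M1 to M4) would be trained and scored on every split, and the scores averaged before comparing models.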

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    72/98

    Source: TempEval-3 challenge; Corpora released in October 2012 (except the eval).

    TempEval-3

    " temporal information extraction challenge

    " organised every 3 years in SemEval (ACL)

    Corpus              # documents   # words   Annotation source   Purpose
    AQUAINT                      73    33,973   experts             training
    TimeBank                    183    61,418   experts             training
    TempEval-3 silver         2,452   666,309   systems             training
    TempEval-3 eval              20     6,375   experts             testing


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    73/98

    Silver + Gold; 4x10-fold cross validation

    identification post-processing

    " Probabilistic correction module

    " BIO fixer

    " Threshold-based label switcher

    BIO fixer | TbLS

    BIO fixer | PCM | CRFs
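The slide names a "BIO fixer" without describing its behaviour. One plausible reading, sketched below under that assumption, is a pass that repairs label sequences that are invalid under the BIO scheme, such as an I that does not follow a B or another I. The function `fix_bio` is hypothetical.

```python
def fix_bio(tags):
    """Rewrite an "I" that does not continue an expression as a "B"."""
    fixed = []
    previous = "O"
    for tag in tags:
        if tag == "I" and previous == "O":
            tag = "B"  # an expression cannot start with I
        fixed.append(tag)
        previous = tag
    return fixed

print(fix_bio(["O", "I", "I", "O", "B", "I"]))
```

After this pass every expression starts with exactly one B, which keeps span extraction unambiguous.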

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    74/98

    TempEval-3: results (Task A)

    " investigate semi-supervised techniques

    " approach the normalisation phase in a novel way

    " investigate the differences between general and clinical domain

    " investigate the use of the proposed framework to other domains

    " suggest a more temporally-aware error measure in the normalisation

    Training data        Identification                            Normalisation   Overall
    (post-processing)    strict matching      lenient matching     accuracy        score
                         Prec   Rec    F1     Prec   Rec    F1     Type   Value
    Human&Silver (no)    0.79   0.64   0.7    0.97   0.79   0.87   0.89   0.77     0.672
    Human&Silver (yes)   0.8    0.66   0.72   0.97   0.8    0.88   0.87   0.76     0.667
    Human (no)           0.76   0.64   0.7    0.95   0.8    0.87   0.87   0.77     0.675
    Human (yes)          0.79   0.7    0.74   0.95   0.85   0.9    0.86   0.77     0.69
    Silver (no)          0.78   0.63   0.7    0.97   0.8    0.87   0.89   0.77     0.672
    Silver (yes)         0.82   0.66   0.73   0.98   0.79   0.88   0.91   0.78     0.683

    Source: M. Filannino, G. Brown, and G. Nenadic. ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge. Proceedings of the Seventh International Workshop on

    Semantic Evaluation (SemEval 2013), pages 53-57, Atlanta, Georgia, USA, June 2013. ACL.

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    75/98


    Source: Naushad UzZaman, Hector Llorens, Leon Derczynski, James Allen, Marc Verhagen, and James Pustejovsky. SemEval-2013 Task 1: TempEval-3: Evaluating time expressions,

    events, and temporal relations. Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), pages 1-9, Atlanta, Georgia, USA, June 2013. ACL.

    TempEval-3: ranking (Task A)

    System               Identification                            Normalisation   Overall
    (best run only)      strict matching      lenient matching     accuracy        score
                         Prec   Rec    F1     Prec   Rec    F1     Type   Value
    HeidelTime           0.84   0.79   0.81   0.93   0.88   0.9    0.91   0.86     0.776
    NavyTime             0.79   0.8    0.8    0.89   0.91   0.9    0.89   0.79     0.71
    ManTIME              0.79   0.7    0.74   0.95   0.85   0.9    0.86   0.77     0.69
    SUTime               0.79   0.8    0.8    0.89   0.91   0.9    0.89   0.75     0.674
    ATT                  0.91   0.7    0.79   0.98   0.75   0.85   0.91   0.77     0.656
    ClearTK              0.86   0.8    0.83   0.94   0.87   0.9    0.93   0.72     0.647
    JU-CSE               0.82   0.7    0.75   0.93   0.8    0.86   0.87   0.74     0.638
    KUL                  0.77   0.63   0.69   0.93   0.76   0.84   0.89   0.75     0.63
    FSS-TimEx            0.52   0.46   0.49   0.9    0.8    0.85   0.81   0.68     0.582
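The columns of the ranking can be related to each other with a short calculation. F1 is the harmonic mean of precision and recall; the overall score is assumed here to be the lenient F1 multiplied by the value accuracy, which agrees with the tabulated figures up to rounding. The function names are hypothetical.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def overall(lenient_f1, value_accuracy):
    """Assumed overall score: lenient F1 weighted by value accuracy."""
    return lenient_f1 * value_accuracy

# ManTIME row: strict Prec 0.79, Rec 0.70; lenient F1 0.90; Value 0.77.
print(round(f1(0.79, 0.70), 2))
print(round(overall(0.90, 0.77), 3))
```

Because the table reports rounded inputs, recomputed values can differ from the published ones in the last digit.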


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    82/98


    r ear backu


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    83/98

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    why is it challenging?

    1. Matt exercised during his lunch break.

    2. He stretched, lifted weights, and ran.

    3. He showered, got dressed and returned to work.


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    84/98

    1. Matt exercised(E) during his lunch break(E).

    2. He stretched(E), lifted(E) weights, and ran(E).

    3. He showered(E), got dressed(E) and returned(E) to work.

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    linguistic knowledge


    exercised

    lunch break


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    85/98

    1. Matt exercised(E) during his lunch break(E).

    2. He stretched(E), lifted(E) weights, and ran(E).

    3. He showered(E), got dressed(E) and returned(E) to work.

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    linguistic knowledge


    stretch, lift, run

    lunch break

    exercised


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    86/98

    1. Matt exercised(E) during his lunch break(E).

    2. He stretched(E), lifted(E) weights, and ran(E).

    3. He showered(E), got dressed(E) and returned(E) to work.

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    linguistic knowledge


    shower, dress, return

    lunch break

    exercised

    stretch, lift, run


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    87/98

    1. Matt exercised(E) during his lunch break(E).

    2. He stretched(E), lifted(E) weights, and ran(E).

    3. He showered(E), got dressed(E) and returned(E) to work.

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    common sense knowledge


    shower, dress, return

    lunch break

    exercised

    stretch, lift, run


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    88/98

    1. Matt exercised(E) during his lunch break(E).

    2. He stretched(E), lifted(E) weights, and ran(E).

    3. He showered(E), got dressed(E) and returned(E) to work.

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    common sense knowledge


    lunch break

    exercised shower, dress, return

    stretch, lift, run


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    89/98

    1. Matt exercised(E) during his lunch break(E).

    2. He stretched(E), lifted(E) weights, and ran(E).

    3. He showered(E), got dressed(E) and returned(E) to work.

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    common sense knowledge


    lunch break

    exercised shower dress return

    stretch, lift, run


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    90/98

    1. Matt exercised(E) during his lunch break(E).

    2. He stretched(E), lifted(E) weights, and ran(E).

    3. He showered(E), got dressed(E) and returned(E) to work.

    Source: Temporal Information Extraction and Shallow Temporal Reasoning, D. Roth et al. 2012

    domain knowledge

    stretch lift run stretch

    lunch break

    exercised shower dress return

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    91/98

    Temporal footprint

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    92/98

    E: 0.159

    results

    " Robin Williams (1951 - 2014), prediction: 1953-2006


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    93/98

    Prediction: 1366-2057 (1451-1506), E: 0.92

    other types of temporal footprint?

    " Christopher Columbus will die in 2057?!


    AHA!


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    94/98

    physical existence vs. social coverage

    " Anne Frank's footprint is shifted into the future

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    95/98

    Temporalia

  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    96/98


    data

    " training set: 100 queries

    " benchmark test set: 300 queries

    Query                       Submission date   CLASS
    Movies 2012                 Feb 28, 2013      past
    Upcoming Movies in 2013     Jan 1, 2013       future
    2013 MLB Playoff Schedule   Jan 1, 2013       future
    current price of gold       Feb 28, 2013      present
    Amazon Deal of the Day      Feb 28, 2013      present
    Number of Neck Muscles      Feb 28, 2013      atemporal
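To make the role of the submission date concrete, the sketch below derives two simple signals from a query: whether it mentions an explicit year, and how that year relates to the submission year. The function `year_features` and the regular expression are hypothetical and far simpler than a full temporal expression extractor.

```python
import re
from datetime import date

def year_features(query, submitted):
    """Return (has_year, delta) for the first four-digit year in the query.

    delta is the mentioned year minus the submission year; None if no year.
    """
    match = re.search(r"\b(19|20)\d{2}\b", query)
    if match is None:
        return False, None
    return True, int(match.group(0)) - submitted.year

print(year_features("Movies 2012", date(2013, 2, 28)))
print(year_features("current price of gold", date(2013, 2, 28)))
```

A negative delta, as for "Movies 2012", points towards the past class, while a delta of zero is ambiguous on its own and needs further cues such as "Upcoming".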


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    97/98


    attributes

    ID Query Submitted runs
    Minimal Intermediate Full
    1 Is it a Wikipedia page title? ! !
    2 Does the query contain a temporal expression? ! ! !
    3 Submission's term !
    4 Submission's trimester !
    5 Timing ! ! !
    6 Most frequent trigger class ! !
    7 Wh type ! !
    8 Most frequent TempoWordNet class !
    9 Most frequent POS tag tense ! ! !
    10 Most frequent coarse-grained POS tag ! !
    11 Trigger classes footprint ! ! !
    12 Temporal Δ between submission and query ! !
    13 Tenses footprint ! !
    14 Ordered TempoWordNet classes !
    15 Most frequent fine-grained POS tag ! !
    16 Coarse-grained POS tag ordered footprint !
    17 Fine-grained POS tag ordered footprint !
    18 Coarse-grained POS tag footprint !
    19 Fine-grained POS tag footprint !


  • 7/21/2019 Using machine learning to predict temporal orientation of search engine queries in the Temporalia challenge

    98/98

    error measure

    union overlap

    gold

    prediction
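The labels above (union, overlap, gold, prediction) suggest an interval-based error. A sketch assuming E = 1 - overlap / union reproduces the E values quoted on the earlier results slides (0.159 for Robin Williams, 0.92 for Christopher Columbus). The function name `footprint_error` is hypothetical, and intervals are assumed to have non-zero length.

```python
def footprint_error(gold, prediction):
    """1 - overlap/union for two (start, end) year intervals."""
    overlap = max(0, min(gold[1], prediction[1]) - max(gold[0], prediction[0]))
    # Span from the earliest start to the latest end; this equals the
    # union whenever the two intervals overlap.
    union = max(gold[1], prediction[1]) - min(gold[0], prediction[0])
    return 1 - overlap / union

print(round(footprint_error((1951, 2014), (1953, 2006)), 3))  # 0.159
print(round(footprint_error((1451, 1506), (1366, 2057)), 2))  # 0.92
```

An error of 0 means the predicted interval matches the gold one exactly, and an error of 1 means the two intervals do not overlap at all.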