
  • Semantic Knowledge Discovery, Organization and Use

    Warren Weaver Hall, New York University

    November 14 and 15, 2008

    NSF Sponsored Symposium

    Warren Weaver Hall, New York University Semantic Knowledge Discovery, Organization and Use

  • Next...

    P-1 The Development of a Shared Dataset for Predictive Analysis in the Behavioral Sciences

    Kai R. Larsen, Jintae Lee, Eliot Rich

    U. Colorado

    On Deck: P-2 Catherine Havasi

    Double Deck: P-3 Iryna Gurevych

  • P-1

    The Development of a Shared Dataset for Predictive Analysis in the Behavioral Sciences

    Kai Larsen, Jintae Lee, and Eliot Rich

    [Figure: chart of known and unknown relationships; vertical axis "Relationships (in Thousands)", 0 to 1,400,000; legend: Unknown, Known]

    Setting: A large portion of behavioral research focuses on very distinct knowledge constructs and their relationships.

    Problem: For every behavioral paper published…

    • known relationships increase linearly
    • unknown relationships increase exponentially

    [Figure: constructs X, Y and Z; horizontal axis 0 to 10000]

    Solution:

    • Collect large dataset of behavioral constructs and their relationships
    • Make this available to the community of knowledge discovery researchers
    • By automatically figuring out 1% of the relationships, a researcher could contribute more to science than 100,000 behavioral researchers could do in their lifetimes.

  • Next...

    P-2 Discovering Semantic Relations Using Singular Value Decomposition Based Techniques

    Catherine Havasi

    Brandeis University

    On Deck: P-3 Iryna Gurevych

    Double Deck: P-4 Roy Bar-Haim

  • P-2

    Acquiring and Using Common Sense

    • Acquire Common Sense
      – From human volunteers
      – From inference
      – From corpora

    • Using Dimensionality Reduction
      – To learn more common sense
      – To add common sense intuition to domain-specific data
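
    The idea of using dimensionality reduction "to learn more common sense" can be illustrated with a small sketch. It is not the poster's implementation: the concept/feature matrix, the names `top_singular_triplet` and `rank_one_approximation`, and the choice of a rank-1 truncation computed by power iteration are all assumptions made for illustration. The point is only that a truncated singular value decomposition gives a nonzero score to an assertion that was never entered (here, "cat has fur") because a similar concept has it.

```python
import math

def top_singular_triplet(matrix, iterations=200):
    """Approximate the largest singular value and its vectors by power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        # One step of power iteration on A^T A: u = A v, then w = A^T u.
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return sigma, u, v

def rank_one_approximation(matrix):
    """Keep only the strongest singular component of the matrix."""
    sigma, u, v = top_singular_triplet(matrix)
    return [[sigma * ui * vj for vj in v] for ui in u]

# Hypothetical toy data: rows are concepts, columns are features.
concepts = ["dog", "cat", "car"]
features = ["is a pet", "has fur", "has wheels"]
assertions = [
    [1.0, 1.0, 0.0],  # dog: is a pet, has fur
    [1.0, 0.0, 0.0],  # cat: is a pet ("has fur" was never entered)
    [0.0, 0.0, 1.0],  # car: has wheels
]

smoothed = rank_one_approximation(assertions)
cat_has_fur = smoothed[1][1]  # positive: inferred from the similarity to "dog"
car_has_fur = smoothed[2][1]  # stays near zero
print(round(cat_has_fur, 2), round(car_has_fur, 2))
```

    A real system would keep more than one singular component and work with a much larger, sparse matrix; the single component here just keeps the arithmetic easy to follow.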


  • Next...

    P-3 Putting the Wisdom-of-Crowds to Use in NLP: Collaboratively Constructed Semantic Resources on the Web

    Iryna Gurevych

    Technical University of Darmstadt, Germany

    On Deck: P-4 Roy Bar-Haim

    Double Deck: P-5 Derrick Higgins

  • P-3

    Putting the “Wisdom-of-Crowds” to Use in NLP: Collaboratively Constructed Semantic Resources on the Web

    Iryna Gurevych, Ubiquitous Knowledge Processing Lab

    • Information Extraction (Ruiz-Casado et al., 2005)
    • Information Retrieval (Gurevych et al., 2007)
    • Named Entity Recognition (Bunescu & Pasca, 2006)
    • Question Answering (Ahn et al., 2004)
    • Text Categorization (Gabrilovich & Markovitch, 2006)

    • Semantic Relatedness (Zesch et al., 2008)
    • Information Retrieval (Müller and Gurevych, 2008)

    Wikipedia, Wiktionary, WordNet, GermaNet, ...

    JWPL, JWKTL, JWNL, GN API, ...

    Mapping

    Information Extraction, Information Retrieval, Lexical Chains, Lexical Graphs, Named Entity Recognition, Question Answering, Semantic Relatedness, Text Categorization, Text Segmentation, Text Summarization, Word Sense Disambiguation, ...

    Unified access

    Entity
    • Part of Speech
    • Lexeme / Sense pairs

    Lexical Relations
    • Synonymy
    • Antonymy

    Semantic Relation
    • Hypernymy
    • Hyponymy
    • …

    Explicit lexical-semantic relations

    Advantages of collaborative construction

    Abbreviations, Antonyms, Categories, Collocations, Derived Terms, Etymology, Examples, Glosses, Hypernyms, Hyponyms, Morphology, Part-of-speech, Pronunciation, Quotations, Related terms, Synonyms, Translations, Troponyms, Word senses

    Big, Multi-lingual, Cheap, Up-to-date

    Open issues with “the user-contributed information”

    • incompleteness of information
    • inconsistent structure of entries
    • uneven coverage
    • vagueness of concepts
    • insufficient quality of information

    This work is funded by the German Research Foundation (DFG GU 798/1-2, 798/1-3, and 798/3-1) and the Volkswagen-Foundation (I/82806)

    Wikipedia & Wiktionary API http://www.ukp.tu-darmstadt.de/software

  • Next...

    P-4 Efficient Semantic Inference over Language Expressions

    Roy Bar-Haim

    Bar-Ilan University

    On Deck: P-5 Derrick Higgins

    Double Deck: P-6 Jung-Wei Fan

  • P-4


  • Next...

    P-5 Length-independent vector-space document similarity measures

    Derrick Higgins

    Educational Testing Service

    On Deck: P-6 Jung-Wei Fan

    Double Deck: P-7 Peter Clark

  • P-5

    Length-independent vector-space document similarity measures

    Derrick Higgins, ETS

    The Problem: Similarity and Length

    The similarity between two texts, as estimated by vector-based methods (CVA, LSA, RI, ...), depends not only on their congruence of meaning, but also on the lengths of the texts compared.

    • Documents on similar topics will converge to similar representation vectors as their length increases.
    • Longer documents are more likely to appear similar than shorter ones.
    • Even documents on different topics may exhibit some increase in similarity scores with increasing length.

    [Figure: two panels plotting CVA Similarity and RI Similarity (vertical axis, −0.2 to 1.0) against gTypes (horizontal axis, 500 to 2000)]

    One simple way to remove the effect of text length is to subtract an estimate of similarity based on length, leaving the residual.
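
    A minimal sketch of that residual idea follows. The slide does not say how the length-based estimate is obtained, so the least-squares straight line used here is an assumption, as are the function names and the toy numbers; any other length model could be substituted for `fit_line`.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual_similarity(lengths, similarities):
    """Subtract the similarity predicted from length alone, keeping the residual."""
    slope, intercept = fit_line(lengths, similarities)
    return [s - (slope * n + intercept) for n, s in zip(lengths, similarities)]

# Hypothetical measurements: text length and raw vector-space similarity.
lengths = [500, 1000, 1500, 2000]
raw = [0.25, 0.40, 0.65, 0.70]
residuals = residual_similarity(lengths, raw)
print([round(r, 3) for r in residuals])
```

    Pairs whose residual is positive are more similar than their length alone would suggest; pairs with a negative residual are less similar.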


  • Next...

    P-6 Semantic reclassification of ontology concepts using contextual and lexical features

    Jung-Wei Fan, Carol Friedman

    Columbia University

    On Deck: P-7 Peter Clark

    Double Deck: P-8 Alexander Yates

  • P-6

    Semantic Reclassification of Ontological Concepts using Contextual and Lexical Features

    Jung-Wei Fan, MS, MPhil; Carol Friedman, PhD

    Department of Biomedical Informatics, Columbia University, New York

    The problem

    Example: Semantic Type "Finding"

    • Disease-related concepts: Progressive renal failure, Hyperkalemia, etc.
    • Function-related concepts: Nitrogen balance, Mitotic activity, etc.
    • Procedure-related concepts: Appendico-vesicostomy, etc.
    • General finding concepts: Unemployment, Beer drinker, etc.

    Methods

    [Diagram labels. Training phase: training corpus, training lexicon, contexts for the classes, bag of words for the classes, distributional classifier, Naïve Bayes classifier. Classifying phase: Hyperkalemia → Disorder]
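
    The Naïve Bayes half of the method can be sketched as a bag-of-words classifier. This is an illustration only, not the authors' system: the training examples, the class names other than Disorder, and the add-one smoothing are assumptions, and the poster's actual features and corpus are not shown on the slide.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (class_label, list_of_words) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for label, words in examples:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary

def classify(model, words):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in words:
            if word in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: words observed around concepts of each class.
training = [
    ("Disorder", ["patient", "developed", "severe"]),
    ("Disorder", ["diagnosed", "with", "acute"]),
    ("Procedure", ["underwent", "surgical", "repair"]),
    ("Function", ["measured", "normal", "rate"]),
]
model = train_naive_bayes(training)
predicted = classify(model, ["patient", "diagnosed", "with", "severe"])
print(predicted)
```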


  • Next...

    P-7 Semantic Knowledge Discovery, Organization and Use: Some Ongoing Research at Boeing

    Peter Clark and Phil Harrison

    Boeing Phantom Works

    On Deck: P-8 Alexander Yates

    Double Deck: P-9 Yutaka Matsuo

  • P-7

    Knowledge Discovery, Organization and Use: Some Ongoing Research at Boeing

    Peter Clark, Boeing Phantom Works

    1. Developing WordNet (with Princeton and ISI)
       – 30,000 additional links, glosses in logic, core theories

    2. Extracting Commonsense Knowledge from Text
       – database of 55 million Schubert-style "tuples"
       – e.g., “planes can be bought”, “pilots can fly to places”, …

    3. Recognizing Textual Entailment
       – use of world knowledge, using WordNet and DIRT
       – logical reasoning and explainable decisions

    4. Machine Reading
       – integration of semantic representations from multiple texts

  • Next...

    P-8 ShopSmart: Product Recommendations through Technical Specifications and User Reviews

    Alexander Yates

    Temple University

    On Deck: P-9 Yutaka Matsuo

    Double Deck: P-10 Hiroyuki TODA

  • P-8

    ShopSmart: Making Recommendations based on Technical Specifications and User Feedback

    Alexander Yates¹, James Joseph¹, Ana-Maria Popescu²

    ¹ Computer and Information Sciences, Temple University, Philadelphia, PA, USA

    ² Yahoo! Labs, Santa Clara, CA, USA

  • Next...

    P-9 Social Network Mining from the Web

    Yutaka Matsuo, Danushka Bollegala, Hironori Tomobe, YingZi Jin, Junichiro Mori, Keigo Watanabe, Taiki Honma, Masahiro Hamasaki, Kotaro Nakayama, and Mizuki Oka

    Tokyo University

    On Deck: P-10 Hiroyuki TODA

    Double Deck: P-11 Atsushi Fujita

  • P-9

    Social Network Mining from the Web

    Yutaka Matsuo and his colleagues, University of Tokyo, Japan

    At a conference: “Nice to meet you” and ... ?

    • Who is he?
    • Who are his colleagues?
    • What is he presenting?
    • What are his publications?
    • How is he connected with his colleagues?

    Our solution: POLYPHONET

    Network View

  • Next...

    P-10 Geographic Information Retrieval against Immediate Surroundings

    Hiroyuki TODA, Norihito YASUDA, Yumiko MATSUURA, and Ryoji KATAOKA

    NTT Cyber Solutions Laboratories

    On Deck: P-11 Atsushi Fujita

    Double Deck: P-12 Saif Mohammad

  • P-10

    Geo-Information Retrieval Against Immediate Surroundings

    Hiroyuki Toda, Norihito Yasuda, Yumiko Matsuura, Ryoji Kataoka (NTT Cyber Solutions Labs.)

    • What is Geographic Information Retrieval (GIR):
      – Document retrieval method using a content query (keyword) and a geographic query.
      – Utilizes geographic expressions in each document.

    • Goal of our GIR:
      – Realize searches for spots or services in our immediate surroundings via mobile communication devices.

    • Problems and our propositions:
      – Ranking:
        • Estimate the relevancy of each doc against the geo-query and prioritize the docs describing restricted areas related to the geo-query.
        => Ranking method which considers extents implied by place names.
      – Result representation:
        • Represent the search results with consideration of geo-constraints and enable users to decide easily whether to read docs, even if the screen size is restricted.
        => Query-biased summarization, which utilizes place name expressions related to the geo-query, for GIR result snippets.

  • Next...

    P-11 Toward Automatic Compilation of Phrasal Thesaurus

    Atsushi Fujita, Satoshi Sato

    Nagoya University

    On Deck: P-12 Saif Mohammad

    Double Deck: P-13 Fabio Massimo Zanzotto

  • P-11

    Toward Automatic Compilation of Phrasal Thesaurus

    Atsushi Fujita and Satoshi Sato (Nagoya Univ., JAPAN)

    Phrasal thesaurus

    • Beyond the word-based semantic computing
    • Deals with various phrasal paraphrases

    Productive / Non-productive

    Generate!! / Collect!!

    Paraphrase examples shown on the poster:

    • X wrote Y / X is the author of Y
    • X solves Y / X deals with Y
    • X show a A Y / X v(Y) adv(A)
    • X V Y / X’s V-ing of Y
    • X V Y / Y be V-PP by X
    • burst into tears / cried
    • comfort / console

  • Next...

    P-12 Towards Antonymy-Aware Natural Language Applications

    Saif Mohammad, Bonnie Dorr, and Graeme Hirst

    University of Maryland, University of Toronto

    On Deck: P-13 Fabio Massimo Zanzotto

    Double Deck: P-14 Nitin Madnani

  • P-12

    Towards Antonymy-Aware NL Applications

    Saif Mohammad, Bonnie Dorr, Graeme Hirst

    • Scope
      – Clear opposites: wet-dry, promoted-demoted
      – Contrasting word pairs: cold-warm, promoted-censured
    • Method:
      – Identify contrasting word pairs using seed antonym pairs and thesaurus categories.
      – Determine degree of antonymy using distributional distance and tendency to co-occur.
    • Evaluation:
      – 950 GRE-style closest-opposite questions.
    • Results:
      – F score = .70 (baselines: .20 and .22).
    • Applications:
      – detecting incompatibles (contradictions, sentiment), generating paraphrases, detecting humor, improving distributional thesauri.

  • Next...

    P-13 Combining Semi-Unsupervised Acquisition of Corpora and Supervised Learning of Textual Entailment Rules

    Fabio Massimo Zanzotto

    University of Rome “Tor Vergata”, Italy

    On Deck: P-14 Nitin Madnani

    Double Deck: P-15 Justin Betteridge

  • P-13

    Combining Semi-Unsupervised Acquisition of Corpora and Supervised Learning of Textual Entailment Rules

    Fabio Massimo Zanzotto, Marco Pennacchiotti, Alessandro Moschitti

    University of Rome “Tor Vergata”

    F. M. Zanzotto, Saarbrücken, 14/6/2007

    The Problem: To determine if

    “Kessler's team conducted 60,643 face-to-face interviews with adults in 14 countries”

    entails

    “Kessler's team interviewed more than 60,000 adults in 14 countries”

    we need

    • the equivalence between “X conducted Y interviews with Z” and “X interviewed Y Z”
    • the implication rule that says “X” implies “more than Y” if “X is bigger than Y”

  • Next...

    P-14 Applying Automatically Generated Semantic Knowledge: A Case Study in Machine Translation

    Nitin Madnani, Philip Resnik, Bonnie Dorr and Richard Schwartz

    University of Maryland

    On Deck: P-15 Justin Betteridge

    Double Deck: P-16 Karin Verspoor

  • P-14

    • No single correct answer for MT
    • Need multiple correct (human) answers to tune MT system
    • Expensive to have humans create multiple translations

    This Leads To Reference Sparsity!
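
    Why an extra reference matters can be seen in the clipped n-gram precision at the core of BLEU-style metrics: a candidate word counts as correct if any reference contains it. The sketch below is a simplification (unigrams only, no brevity penalty) and is not the tuning setup used on the poster. The two references are the original/paraphrase pair shown on this slide, while the system output is a hypothetical sentence invented for the example.

```python
from collections import Counter

def clipped_unigram_precision(candidate, references):
    """Fraction of candidate words matched in some reference, with clipped counts."""
    candidate_counts = Counter(candidate.lower().split())
    max_reference_counts = Counter()
    for reference in references:
        for word, count in Counter(reference.lower().split()).items():
            max_reference_counts[word] = max(max_reference_counts[word], count)
    matched = sum(min(count, max_reference_counts[word])
                  for word, count in candidate_counts.items())
    return matched / sum(candidate_counts.values())

human_reference = "we must bear in mind the community as a whole"
paraphrase_reference = "we must remember the wider community"
system_output = "we must remember the community as a whole"  # hypothetical MT output

one_reference = clipped_unigram_precision(system_output, [human_reference])
two_references = clipped_unigram_precision(
    system_output, [human_reference, paraphrase_reference])
print(one_reference, two_references)
```

    With the single human reference the word "remember" is penalized even though it is an acceptable translation; adding the paraphrased reference removes that penalty.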

    Automatic Paraphrasing as E-to-E translation

    Artificial “Reference” Translations (O: original, P: our paraphrase)

    O: We must bear in mind the community as a whole.
    P: We must remember the wider community.

    O: France sent its proposal in the form of a “non-official paper”.
    P: French transmits its recommendations to serve as a “non-official document”.

    O: They should be better coordinated and more effective.
    P: They should improve the coordination and efficacy.

    O: Thirdly, the implications of enlargement for the union’s regional policy cannot be overlooked.
    P: Finally, the impact of enlargement for EU regional policy cannot be ignored.

    Tuning Refs    Newswire            Web
                   BLEU     TER        BLEU     TER
    1H             37.65    56.39      15.17    70.32
    1H+1P          39.32    54.69      15.92    69.94

    Significant improvements when using even a single additional artificial reference for tuning

    Applying Automatically Generated Semantic Knowledge: A Case Study in Machine Translation

    Nitin Madnani, Philip Resnik, Bonnie Dorr & Richard Schwartz

  • Next...

    P-15 Continuous Discovery of Semantic Knowledge

    Justin Betteridge, Andrew Carlson, Sue Ann Hong, Estevam R. Hruschka Jr., Edith L. M. Law, Tom M. Mitchell, and Sophie H. Wang

    CMU

    On Deck: P-16 Karin Verspoor

    Double Deck: P-18 Svetlana Stoyanchev

  • P-15

    Toward Continuous Discovery of Semantic Knowledge

    Justin Betteridge, Andrew Carlson, Sue Ann Hong, Estevam R. Hruschka Jr., Edith L. M. Law, Tom M. Mitchell and Sophie H. Wang. Carnegie Mellon University

    Goal: Never-ending language learning

    • Domain: learning semantic classes of NPs

    SubGoal considered here:

    • Achieving high semi-supervised learning accuracy by coupling the learning of many categories

    Coupling learning of functions f(x), g(x):

    1. Propagate initial labeled examples of f(x) to g(x)
    2. Propagate self-labeled examples
    3. Use learned instances/patterns of f(x) to assess patterns/instances of g(x)

    Multi-task learning with explicit relationships between learning tasks

    • subset(organization(x), university(x))
    • exclusive(university(x),person(x))
    • inverse(parentOf(x,y),childOf(x,y))
    • childOf(x,y) => person(x) ^ person(y)
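
    One way to picture how such relationships couple the learning tasks is as a consistency filter on self-labeled examples. The sketch below is an assumption-laden illustration rather than the CMU system: `apply_coupling` is a hypothetical helper, `subset(organization, university)` is read here as "every university is an organization", and only the subset and exclusive relations from the slide are handled.

```python
def apply_coupling(beliefs, subset_pairs, exclusive_pairs):
    """Propagate subset relations, then reject phrases that break exclusivity.

    beliefs: dict mapping a noun phrase to the set of categories proposed for it.
    subset_pairs: (superclass, subclass) pairs.
    exclusive_pairs: pairs of categories that may not share an instance.
    """
    accepted, rejected = {}, []
    for phrase, labels in beliefs.items():
        labels = set(labels)
        changed = True
        while changed:  # repeat until no subset relation adds a new label
            changed = False
            for superclass, subclass in subset_pairs:
                if subclass in labels and superclass not in labels:
                    labels.add(superclass)
                    changed = True
        if any(a in labels and b in labels for a, b in exclusive_pairs):
            rejected.append(phrase)
        else:
            accepted[phrase] = labels
    return accepted, rejected

# Hypothetical candidate labels produced by separately trained extractors.
candidates = {
    "New York University": {"university"},
    "Tom Mitchell": {"person", "university"},  # contradictory guess
}
accepted, rejected = apply_coupling(
    candidates,
    subset_pairs=[("organization", "university")],
    exclusive_pairs=[("university", "person")],
)
print(accepted, rejected)
```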

    Bootstrap learning accuracy: iteratively labeling 110 new examples from 8M web pages

    Coupling    country    city    company    univ.    mean
    1           93.6       99.1    100.0      79.1     93.0
    1,2,3       89.1       98.2    100.0      97.3     96.2

  • Next...

    P-16 The Colorado OpenDMAP system: Building on Community Ontologies and a Community Platform for Biomedical Natural Language Processing

    Karin Verspoor, William Baumgartner, Kevin Cohen, Helen Johnson, and Larry Hunter

    University of Colorado Denver

    On Deck: P-18 Svetlana Stoyanchev

    Double Deck: P-19 Jordan Boyd-Graber

  • P-16

    The Colorado OpenDMAP system

    Karin Verspoor, William Baumgartner, K. Bretonnel Cohen, Helen Johnson, Larry Hunter

    An ontology-driven integrated concept recognition system with proven applicability to biomedical information extraction problems.

    [Diagram labels: free text, ontology, patterns, OpenDMAP, extracted information]

    free text:

    Cyclin E2 interacts with Cdk2 in a functional kinase complex.

    extracted information:

    protein protein interaction:
         interactor1: cyclin E2
         interactor2: cdk2

    PROTÉGÉ ONTOLOGY

    CLASS: protein protein interaction
         SLOT: interactor1     TYPE: molecule
         SLOT: interactor2     TYPE: molecule

    PATTERNS

    {c-interact} := [interactor1] interacts with [interactor2]
    {c-interact} := [interactor1] is bound by [interactor2]
    …
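
    The way a slot pattern turns free text into a filled frame can be imitated with regular expressions. This is a toy sketch and not OpenDMAP's pattern engine or API: `compile_pattern` and `extract_interaction` are hypothetical names, slots match plain word sequences instead of ontology-typed molecules, and a slot at the very end of a pattern captures only one word.

```python
import re

def compile_pattern(pattern):
    """Turn '[slot] literal text [slot]' into a regex with one named group per slot."""
    pieces = re.split(r"\[(\w+)\]", pattern)
    regex = ""
    for index, piece in enumerate(pieces):
        if index % 2 == 1:
            # Slot name: match one or more words, as few as possible.
            regex += "(?P<" + piece + r">\w+(?: \w+)*?)"
        else:
            # Literal text between slots.
            regex += re.escape(piece)
    return re.compile(regex)

PATTERNS = [
    compile_pattern("[interactor1] interacts with [interactor2]"),
    compile_pattern("[interactor1] is bound by [interactor2]"),
]

def extract_interaction(sentence):
    """Return the slot fillers of the first matching pattern, or None."""
    for pattern in PATTERNS:
        match = pattern.search(sentence)
        if match:
            return match.groupdict()
    return None

result = extract_interaction(
    "Cyclin E2 interacts with Cdk2 in a functional kinase complex.")
print(result)
```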


  • Next...

    P-18 Automatic Feature Discovery for Predicting Content of User Utterances in Dialogs

    Svetlana Stoyanchev

    SUNY, Stony Brook

    On Deck: P-19 Jordan Boyd-Graber

    Double Deck: P-20 Breck Baldwin

  • P-18

    Predicting Content of User Utterances in Dialog

    Svetlana Stoyanchev and Amanda Stent

    SUNY, Stony Brook

    Problem

    Goal: build dialog systems that allow users to speak freely. Automatic speech recognition (ASR) is a big issue (typical ASR error rate in dialog ~30%).

    Approach

    Two-pass ASR approach:

    1. Predict presence of task-relevant concepts in user utterances using:
       1. lexical features recognized by the first pass of the ASR
       2. dialog history features
       3. prosodic features from the user’s speech
    2. Adapt language model to the predicted content

    Result

    We achieve statistically significant (but small) improvements in second-pass ASR accuracy for one dialog context; plan to expand to others.

    Today: Performance of different methods of choosing lexical features

  • Next...

    P-19 Syntactic Topic Models

    Jordan Boyd-Graber and David M. Blei

    Princeton University

    On Deck: P-20 Breck Baldwin

    Double Deck: P-21 Cliff Joslyn

  • P-19

    Syntactic Topic Models

    Jordan Boyd-Graber and David Blei

    Princeton University

    [Figure: graphical model with variables α, α_T, β, π_k, τ_k, θ_d, α_D and σ, and plates ∞ and M]

    Documents are collections of parse trees.

    [Figure: example parse trees with latent classes z1 to z5 and words START, slept, they, ran]

    The latent class depends on the parent node and the document's topic distribution.

    Syntactic Topic Models

    Jordan Boyd-Graber and David M. Blei

    Princeton University, Department of Computer Science

    {jbg,blei}@princeton.edu

    Both syntactic models and topic models are active, fruitful areas of research. One captures local patterns, and the other captures trends across many documents. To illustrate these different but complementary views, consider the following incomplete sentence from a travel brochure, “In a week, you could go to ____.” A syntactic model such as the infinite tree with independent children [1] tells us what words could be an object of a preposition (e.g., “bed,” “school,” “debt”), and a topic model such as the hierarchical Dirichlet process (HDP) [5] could tell us what words fit with a travel theme (“vacation,” “relax,” “exotic,” etc.). In this work, we develop a model that can combine the constraints of both syntax and semantics to build categories of words that are consistent with both.

    To do this, we build a model called the syntactic topic model (STM). Using a corpus composed of dependency parse trees collected into documents, the STM learns “topics” that are both thematically and syntactically consistent. These topics, like the parts of speech in syntactic models or the syntactically-uninformed topics in topic models, are distributions over the lexicon.

    To incorporate syntax and semantics, the STM combines the per-document distributions over topics (as in topic models) with the part of speech transition probabilities (as in syntactic models). It does this by taking the point-wise product of these distributions and then selecting a latent class for each word from this new, renormalized distribution, similar to the product of experts model [2]. More formally, the full generative model of the corpus is:

    1. Choose global topic weights $\beta \sim \mathrm{GEM}(\alpha)$
    2. For each topic index $k = \{1, \ldots\}$:
       (a) Choose topic $\tau_k \sim \mathrm{Dir}(\sigma \rho_u)$
       (b) Choose topic transition distribution $\pi_k \sim \mathrm{DP}(\alpha_T, \beta)$

    [Figure 1, panel (a) Overall Graphical Model; panel (b) Sentence Graphical Model, with latent classes z1 to z9 and words START, lay, phrase, in, some, mind, for, his, year]

    Figure 1: Graphical model for a syntactic topic model (left); in greater detail is the graphical model for each sentence (right).

    [Figure 2: graph of learned topics with transition weights between them. Topic word lists:
    • START
    • his, their, other, us, its, last, one, all
    • they, who, he, there, one, we, also, if
    • says, could, can, did, do, may, does, say
    • mr, inc, co, president, corp, chairman, vice, analyst
    • policy, gorbachev, mikhail, leader, soviet, restructuring, software
    • garden, visit, having, aid, prime, despite, minister, especially
    • television, public, australia, cable, host, franchise, service
    • europe, eastern, protection, corp, poland, hungary, chapter, aid
    • shares, quarter, market, sales, earnings, interest, months, yield]

    Figure 2: On hand-parsed documents, the STM discovered two categories of topics. Some topics (shaded with grey) were shared across almost all documents and filled the role of a generic part of speech, not reflecting any thematic specification. Other topics, however, are selected by a document’s semantic constraints.

    3. For each document $d = \{1, \ldots, M\}$:
       (a) Choose topic weights $\theta_d \sim \mathrm{DP}(\alpha_D, \beta)$
       (b) For each sentence root node:
           i. Choose topic assignment $z_0 \propto \theta_d \pi_{\mathrm{start}}$
           ii. Choose root word $w_{d,0} \sim \mathrm{mult}(1, \tau_{z_0})$
       (c) For each additional word $w_{d,n}$ and parent $p_n$, $n \in \{1, \ldots, d_n\}$:
           i. Choose topic assignment $z_{d,n} \propto \theta_d \pi_{z_{p(d,n)}}$
           ii. Choose word $w_{d,n} \sim \mathrm{mult}(1, \tau_{z_{d,n}})$
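
    The topic-assignment step, a point-wise product of the document's topic weights and the parent's transition distribution followed by renormalization, can be sketched for a finite number of topics. The truncation to a fixed-length list and the names `combine_distributions` and `sample_topic` are assumptions for illustration; the model itself is nonparametric and is fit by variational inference, not by this sampler.

```python
import random

def combine_distributions(doc_topic_weights, transition_weights):
    """Point-wise product of two distributions over topics, renormalized to sum to 1."""
    product = [t * p for t, p in zip(doc_topic_weights, transition_weights)]
    total = sum(product)
    return [value / total for value in product]

def sample_topic(doc_topic_weights, transition_weights, rng=random):
    """Draw a topic index for a word given its document and its parent's topic."""
    probabilities = combine_distributions(doc_topic_weights, transition_weights)
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]

# Hypothetical two-topic example: theta_d favours topic 0, the parent's
# transition distribution favours topic 1.
theta_d = [0.6, 0.4]
pi_parent = [0.25, 0.75]
combined = combine_distributions(theta_d, pi_parent)
print(combined)
```

    A topic receives high probability only when both the document and the syntactic context support it, which is what makes the learned classes consistent with both views.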

    To discover the best configuration of these unobserved variables in our generative process we use variational inference for nonparametric Bayesian models [3]. This process uncovers the best top-level weights, topic transitions, per-document topic distributions, topic assignments, and topics.

    We fit the STM to the Penn Treebank [4]. Instead of grouping all nouns into a single topic, some parts of speech (such as nouns and adjectives) are divided into specialized syntactic groups that appear in similar documents (Figure 2), but other parts of speech such as verbs and prepositions are shared across many documents. Quantitatively, the STM also did better in predicting words on held-out data; its perplexity on held-out documents was better (lower) than the HDP or the infinite tree.

    References

    [1] J. R. Finkel, T. Grenager, and C. D. Manning. The infinite tree. In ACL, pages 272–279, Prague, Czech Republic, June 2007. Association for Computational Linguistics.

    [2] G. Hinton. Products of experts. In Proceedings of the Ninth International Conference on Artificial Neural Networks, pages 1–6, Edinburgh, Scotland, 1999. IEEE.

    [3] P. Liang, S. Petrov, M. Jordan, and D. Klein. The infinite PCFG using hierarchical Dirichlet processes. In HLT, pages 688–697, 2007.

    [4] M. P. Marcus, B. Santorini, and M. A. Marcinkiewicz. Building a large annotated corpus of English: The Penn treebank. Computational Linguistics, 19(2):313–330, 1994.

    [5] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. Hierarchical Dirichlet processes. JASA, 101(476):1566–1581, December 2006.

    Learned topics are consistent with both syntax and theme.


  • Next...

    P-20 Is Semantics Just Picking the Right Syntax for the Context from Multiple Possibilities?

    Breck Baldwin

    Alias-i

    On Deck: P-21 Cliff Joslyn

    Double Deck: P-23 Eiman Tamah Al-Shammari

  • P-20

    Is Semantics Just Picking the Right Syntax for the Context from Multiple Possibilities?

    Breck Baldwin

    Alias-i

  • Next...

    P-21 Semantic Hierarchies: Induction, Measurement, and Management

    Cliff Joslyn, Michelle Gregory, Liam McGrath, Patrick Paulson, Karin Verspoor

    Pacific Northwest National Laboratory, University of Colorado Denver

    On Deck: P-23 Eiman Tamah Al-Shammari

    Double Deck: P-24 Kimiaki Shirahama

  • P-21

    Semantic Hierarchies: Induction, Measurement, and Management

    Semantic Hierarchies:
    • Cores of ontologies
    • 80-90% of links in real-world ontologies
    • Becoming large: 10^4 to 10^6 nodes
    • WordNet, Gene Ontology

    Need for algorithms and measures
    • Induction from text
    • Visualization, annotation
    • Alignment, matching

    Concept lattices
    • Semantic hierarchies from relational data
    • Capture implication relations dually between objects, attributes
    • Unbiased, graphical, visual representation

    Mathematical order theory
    • Metrics: Distances and similarities based on semi-modular valuation functions
    • Ranks: Structure of vertical levels
    • Morphisms: Mappings and linkages

    Issues for knowledge systems
    • Enable robust use of multiple inheritance: beyond trees!
    • Avoid risks of pure graph theory, path-counting methods
    • Proper use of vertical levels

  • Next...

    P-23 Syntactical Knowledge Usage to Reduce Arabic/English Stemming Errors

    Eiman Tamah Al-Shammari

    Kuwait University, George Mason University

    On Deck: P-24 Kimiaki Shirahama

    Double Deck: P-25 Kazuhiro Seki

  • P-23

    Syntactical Knowledge Usage to Reduce Arabic/English Stemming Errors

    Eiman Tamah Al-Shammari

    Kuwait University, George Mason University

  • Next...

    P-24 Characteristics of Textual Information in Video Data from the Perspective of Natural Language Processing

    Kimiaki Shirahama, Akihito Mizui and Kuniaki Uehara

    Kobe University

    On Deck: P-25 Kazuhiro Seki

    Double Deck: P-28 Marine Carpuat

  • P-24

    Characteristic of Textual Information in Video DataCharacteristic of Textual Information in Video Datafrom the Perspective of Natural Language Processingfrom the Perspective of Natural Language Processing

    Topic detection in videos using utterances obtained by ASR method (ASR transcripts)→ Efficient search and browsing of a video archivePurposePurpose

    Video

    Audio

    Text documentText document VideoVideoAll the semantic contents are conveyedonly through a text medium.

    Semantic contents are conveyed through synchronizedvideo and audio media in a complementary manner.Synergy between video and audio media

    Pattern of word occurrences

    Preliminary examination of whether NLP methods can appropriatelyPreliminary examination of whether NLP methods can appropriately process ASR transcriptsprocess ASR transcripts

    Trigger pair extraction → NLP methods cannot treat temporal distributions of spoken words.Topic extraction by LDA → The same words are commonly spoken in different words.The same word is not spoken so many times. → Burst detection based on character’s appearance

    Trigger pair….. President Kennedy had embarked on a tour of Texas in an effort to raise campaign funds and to unite party members. The President, accompanied by Vice-President Lyndon B. Johnson, Texas Governor John Connally, ….. The motorcade started a few minutes late but managed to proceed close to its schedule. The crowds were exuberant, encroaching on every vantage point along the route. ….. Incidents such as that, the clearing weather, the bright warm sun, and the tremendous and loudly cheering crowds were exactly what the president needed. ….. The Kennedy magic was at its best. Then, more than halfway along the route through Dallas, and just as the motorcade broke through the heaviest street crowds, ….. Shots echoed through Dealey Plaza. President Kennedy was mortally wounded, Governor Connally was seriously wounded. …..

    Burst

    time

    Oh my god! President, please shake with me.

    Kennedy, it’s fine today.

    Yeah. Sure. I’m happy many people come to this campaign at Dallas.

    Oh no! Jesus Christ!

    Discuss how to improve NLP methods for ASR transcript processing!
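
As a starting point for that discussion, here is a minimal sketch of a word-count burst detector over a time-stamped transcript. It is not the authors' method: the slide proposes bursts based on a character's appearance, and notes that the same word is rarely spoken many times, which is exactly where a text-style detector like this one struggles. The function name `find_bursts`, the window length, the threshold, and the timestamps are all assumptions made for illustration.

```python
from collections import Counter

def find_bursts(tokens, window=10.0, min_count=3):
    """Return {word: [start time of each bursty window]}.

    tokens: iterable of (time_in_seconds, word) pairs from a transcript.
    A window is "bursty" for a word if the word occurs at least
    min_count times inside it (a hypothetical, hand-picked threshold).
    """
    per_window = Counter()
    for t, word in tokens:
        # Bucket each token into a fixed-length window, ignoring case.
        per_window[(word.lower(), int(t // window))] += 1

    bursts = {}
    for (word, index), count in sorted(per_window.items()):
        if count >= min_count:
            bursts.setdefault(word, []).append(index * window)
    return bursts

# Made-up timestamps for illustration only.
tokens = [(1.0, "Kennedy"), (2.5, "kennedy"), (4.0, "Kennedy"),
          (3.0, "crowd"), (35.0, "Kennedy")]
bursts = find_bursts(tokens)
print(bursts)
```

With so few repetitions per word in real speech, most windows fall below any reasonable threshold, which is one way to see why the slide looks beyond word counts.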


  • Next...

    P-25: Biomedical Association Discovery via Complementary TDM

    Kazuhiro Seki and Kuniaki Uehara, Kobe University

    On Deck: P-28 Marine Carpuat
    Double Deck: P-29 Rion Snow


  • P-25

    K. Seki & K. Uehara at Kobe University (P-25)

    Text data mining (TDM)

    • Explicit information: IR, IE, classification, summarization, etc.
    • Implicit information: hypothesis discovery or literature-based discovery

    Gene Ontology annotation; genetic association discovery

    [Figure labels: article; genes g1, g2, …; gene functions f1, f2, f3, …, fm-1, fm; GO (BP, MF, CC); phenotypes p1, p2, …, pn; disease d; positive / negative annotation; repeat for each gene]


  • Next...

    P-28: Word Sense Disambiguation for Statistical Machine Translation

    Marine Carpuat, Columbia University

    On Deck: P-29 Rion Snow
    Double Deck: P-30 Delip Rao


  • P-28

    Word Sense Disambiguation for Statistical Machine Translation

    Marine Carpuat

    Columbia University Center for Computational Learning Systems

    • Most SMT systems do not explicitly use WSD
    – static translation probabilities, not sensitive to context

    • But using WSD for SMT first gave confusing results
    – WSD for SMT hurts BLEU score!? [Carpuat & Wu ACL-2005]
    – But WSD should help SMT… [Carpuat & Wu IJCNLP-05]

    • Generalizing WSD to Phrase Sense Disambiguation for SMT [Carpuat & Wu, 2007]
    – PSD is fully phrasal just like conventional SMT lexicons
    – PSD predictions are fully integrated in SMT decoding
    – PSD models are trained on the same parallel data as SMT lexicons

    PSD improves translation quality consistently on 8 metrics and 4 tasks


  • Next...

    P-29: Crowdsourcing Annotations for Natural Language Tasks: An Evaluation

    Rion Snow, Stanford University

    On Deck: P-30 Delip Rao
    Double Deck: P-31 James Mayfield


  • P-29

    Crowdsourcing Annotations for Natural Language Tasks: An Evaluation

    • What would you do if you had an on-demand army of thousands of annotators?

    • 10,000 labels / day
    • 1,000 labels / dollar
    • Expert-quality labeling or better (with some tricks)

    • Results on five natural language tasks

    Cheap and Fast - But is it Good? Snow et al., EMNLP-2008
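
The slide does not say which "tricks" are meant. One common way to get more reliable labels from many cheap annotators, assumed here purely for illustration, is to collect several redundant labels per item and keep the majority. The function names and the toy annotations below are hypothetical.

```python
from collections import Counter, defaultdict

def majority_vote(annotations):
    """annotations: iterable of (item_id, worker_id, label).

    Returns {item_id: most frequent label for that item}.
    """
    votes = defaultdict(Counter)
    for item, _worker, label in annotations:
        votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}

def accuracy(predicted, gold):
    """Fraction of gold items whose predicted label matches."""
    return sum(predicted.get(item) == label for item, label in gold.items()) / len(gold)

# Made-up labels from three workers on two items.
annotations = [
    ("s1", "w1", "pos"), ("s1", "w2", "pos"), ("s1", "w3", "neg"),
    ("s2", "w1", "neg"), ("s2", "w2", "neg"), ("s2", "w3", "neg"),
]
labels = majority_vote(annotations)
print(labels, accuracy(labels, {"s1": "pos", "s2": "neg"}))
```

Using an odd number of labels per item avoids ties in the binary case; with ties, `most_common` simply returns the label that was seen first.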


  • Next...

    P-30: Bootstrapping Extraction Patterns from Wikipedia

    Delip Rao, JHU

    On Deck: P-31 James Mayfield
    Double Deck: D-1 Daniel Tunkelang


  • P-30

    P-30: Bootstrapping Extraction Patterns from Wikipedia

    Delip Rao, JHU


  • Next...

    P-31: Knowledge Base Evaluation for Semantic Knowledge Discovery

    James Mayfield, Bonnie Dorr, Tim Finin, Douglas Oard and Christine Piatko

    Human Language Technology Center of Excellence

    On Deck: D-1 Daniel Tunkelang
    Double Deck: D-3 David Nadeau


  • P-31

    Mayfield, Dorr, Finin, Oard, Piatko NYU Symposium on Semantic Knowledge Discovery, Organization and Use

    Knowledge Base Evaluation for Semantic Knowledge Discovery

    • Key idea: evaluate knowledge base, not extraction output
    • Six evaluation axes

    – Accuracy
    – Usefulness
    – Augmentation
    – Explanation
    – Adaptation
    – Temporal Qualification

    • This approach has many advantages!

    KB: Structured Knowledge (Entities, Events, Relations)

    PERSON
    Ali Hassan al-Majid (علي حسن عبد المجيد التكريتي)
    DOB: 1941
    Citizenship: Iraq
    Position: Defense Minister

    ORGANIZATION
    Jihaz al-Mukhabarat al-Amma
    AKA: Jihaz al-Khas
    Country: Iraq

    Evaluate


  • Next...

    D-1: Unsupervised Annotation and Exploratory Search

    Daniel Tunkelang, Endeca

    On Deck: D-3 David Nadeau
    Double Deck: D-4 Gregory Marton


  • D-1


  • Next...

    D-3: Demo of Semi-Supervised Named Entity Recognition at OpenPlaces

    David Nadeau, Openplaces

    On Deck: D-4 Gregory Marton
    Double Deck: D-5 Mona Diab


  • D-3

    openplaces™

    Semantic Search Engine for the Travel domain

    Semi-supervised Named Entity Recognition

    - Minimal human input

    - Web page wrapper induction

    - Travel ontology

    - 1 trillion ‘relations'


  • Next...

    D-4: Procedure Discovery for Time Expression Understanding

    Gregory Marton, MIT

    On Deck: D-5 Mona Diab
    Double Deck: D-6 Michael Paul


  • D-4

    Procedure Discovery for Time Expression Understanding

    Gregory Marton
    [email protected]

    Existing Lexicon

    "tomorrow" : (λ.t (.add t 1 'day))
    "May Day" : (λ.t (.near t #:month 5 #:day 1))
    "Thursday" : (λ.t (.near t #:day-of-week 4))
    ...

    Unseen Word

    "Thrusday"
    "Earth Day"

    Distributionally Similar Words

    "Thursday", "Thruway", "Tuesday", ...
    "World AIDS Day", "Veterans Day", "May Day", "Thanksgiving", ...

    Source Semantics

    (λ.t (.near t #:month 11 #:day 11))
    (λ.t (.near t #:month 5 #:day 1))
    (λ.t (.near t #:month 11 #:day-of-week 4 #:nth 4))

    Learned Semantics

    "Thrusday" : (λ.t (.near t #:day-of-week 4))
    "Earth Day" : (λ.t (.near t #:month 4 #:day 22))    VAL="2003-04-22"

    Unseen Meaning

    "9/11"    VAL="2001-09-11"
    "9/11" : (λ.t (.set-value t "2001-09-11"))

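
The lexicon on the D-4 slide maps each time expression to a small procedure over a reference time. The sketch below mimics that idea in Python under stated assumptions: `add_days` and `near_month_day` are hypothetical stand-ins for the slide's `.add` and `.near`, the brute-force parameter search in `fit_month_day` is an invented illustration rather than the system's actual discovery procedure, and the reference date paired with `VAL="2003-04-22"` is made up.

```python
from datetime import date, timedelta

def add_days(ref, n):
    """Stand-in for (.add t n 'day)."""
    return ref + timedelta(days=n)

def near_month_day(ref, month, day):
    """Stand-in for (.near t #:month m #:day d): closest such date to ref."""
    candidates = [date(year, month, day)
                  for year in (ref.year - 1, ref.year, ref.year + 1)]
    return min(candidates, key=lambda d: abs((d - ref).days))

LEXICON = {
    "tomorrow": lambda t: add_days(t, 1),
    "May Day": lambda t: near_month_day(t, 5, 1),
}

def fit_month_day(examples):
    """Return every (month, day) whose procedure reproduces all examples.

    examples: list of (reference date, annotated value) pairs.
    """
    fits = []
    for month in range(1, 13):
        for day in range(1, 32):
            try:
                if all(near_month_day(ref, month, day) == val
                       for ref, val in examples):
                    fits.append((month, day))
            except ValueError:  # impossible calendar date such as Feb 30
                continue
    return fits

# One annotated use of the unseen word "Earth Day"; the reference date is assumed.
examples = [(date(2003, 4, 20), date(2003, 4, 22))]
month, day = fit_month_day(examples)[0]
LEXICON["Earth Day"] = lambda t: near_month_day(t, month, day)
print(LEXICON["Earth Day"](date(2008, 11, 14)))
```

Reusing the `near` template from a similar word such as "May Day" and refitting only its parameters is the part of the slide this sketch tries to capture.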

  • Next...

    D-5: SALAMCAT: Sense Assignment Leveraging Alignments, Monolingual Contexts And Translations

    Mona Diab and Weiwei Guo, Columbia University

    On Deck: D-6 Michael Paul
    Double Deck: D-7 Emily Jamison


  • D-5

    D-5: SALAMCAT: Sense Assignment Leveraging Alignments, Monolingual Contexts And Translations

    Mona Diab and Weiwei Guo, Columbia University


  • Next...

    D-6: AIRTA: An Automatic Inter-disciplinary Research Topic Advisor - Where are We and Where do We Go -

    Michael Paul and Roxana Girju, University of Illinois at Urbana-Champaign

    On Deck: D-7 Emily Jamison
    Double Deck: D-8 Toru Hirano


  • D-6

    Michael Paul† and Roxana Girju‡

    Departments of Computer Science († ‡) and Linguistics (‡), Beckman Institute († ‡)

    University of Illinois at Urbana-Champaign

    {mjpaul2, girju}@illinois.edu

    Introduction

    We believe that, like other disciplines, computational linguistics will drastically benefit from an inter-disciplinary perspective.

    Our tool is designed to foster interdisciplinary research in order to make breakthrough predictions for future directions.

    This is accomplished by analysing trends within and across relevant fields and then automatically suggesting new research directions and topics.

    Some fields motivating research in computational linguistics

    Trends/Analysis

    Because our data is categorized and labelled by year, we can see how research in certain fields rises and declines over time.

    We can use this information to gauge which topics are important and which areas are saturated.

    We also look for correlations in trends in similar fields across different disciplines.
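
As an illustration of this kind of trend comparison, the sketch below counts papers per year for one topic in two fields and computes the Pearson correlation between the two series. The record layout, the field and topic names, and the counts are all assumptions; the poster does not describe how AIRTA measures correlation.

```python
from collections import Counter
from math import sqrt

def yearly_counts(papers, field, topic, years):
    """Number of papers per year for one (field, topic) pair."""
    counts = Counter(p["year"] for p in papers
                     if p["field"] == field and p["topic"] == topic)
    return [counts[y] for y in years]

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Made-up corpus: a topic that grows in both fields over three years.
years = [2005, 2006, 2007]
papers = []
for i, year in enumerate(years, start=1):
    papers += [{"field": "CL", "topic": "dialogue", "year": year}] * i
    papers += [{"field": "education", "topic": "dialogue", "year": year}] * (2 * i)

cl_series = yearly_counts(papers, "CL", "dialogue", years)
edu_series = yearly_counts(papers, "education", "dialogue", years)
r = pearson(cl_series, edu_series)
print(cl_series, edu_series, round(r, 3))
```

A steadily rising series suggests a growing topic, while a high correlation across fields flags a shared trend worth a closer look.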

    Next Step

    The next phase of this project (the final goal) will be to generate new topics. The key is to discover topics that are important in one discipline but have been studied little in another. These suggestions will be useful to professionals who would like to engage in research discussions with other parties, but who are not familiar with those areas. It will be beneficial to students looking for novel research topics.
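
One simple reading of "important in one discipline but studied little in another" is a comparison of topic shares. The sketch below is only that reading, not the planned AIRTA algorithm: the thresholds `high` and `low`, the function names, and the toy paper list are assumptions.

```python
from collections import Counter

def topic_shares(papers):
    """Map each discipline to {topic: fraction of that discipline's papers}."""
    by_discipline = {}
    for discipline, topic in papers:
        by_discipline.setdefault(discipline, Counter())[topic] += 1
    return {d: {t: c / sum(counts.values()) for t, c in counts.items()}
            for d, counts in by_discipline.items()}

def suggest_topics(papers, source, target, high=0.3, low=0.1):
    """Topics with share >= high in source and share <= low in target."""
    shares = topic_shares(papers)
    src, tgt = shares[source], shares.get(target, {})
    return sorted(t for t, s in src.items()
                  if s >= high and tgt.get(t, 0.0) <= low)

# Made-up (discipline, topic) pairs for illustration.
papers = ([("education", "language acquisition")] * 4
          + [("education", "assessment")] * 6
          + [("computational linguistics", "parsing")] * 19
          + [("computational linguistics", "language acquisition")] * 1)

suggestions = suggest_topics(papers, "education", "computational linguistics")
print(suggestions)
```

Shares rather than raw counts keep a large corpus from dominating a small one, which matters when the disciplines contribute very different numbers of papers.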

    Back End

    We currently have a database with:

    4,700 papers from computational linguistics conferences

    2,300 papers from linguistics journals

    1,700 papers from education/educational psychology journals

    We will enlarge our corpus as we continue to work on this project.

    Classification

    We categorized these papers mostly using Latent Dirichlet Allocation (LDA) with words from titles, abstracts, and full text when available.

    AIRTA: An Automatic Interdisciplinary Research Topic Advisor - Where are We and Where do We Go -

    Industry

    Linguistics

    Machine Learning

    Cognitive Psychology

    Education

    Lexical Semantics: lexical entries semantic word idioms words lexicon
    Morphology: morphological word morphology lexical level forms
    MT Evaluation: evaluation score human scores sentence automatic
    Named Entities: entity names named entities ne information person
    Multimodal NLP: multimodal speech gesture user language input

    Each dot represents a paper in the “Dialogue Systems” category. The coloring shows how papers can span multiple categories.

    A sample of categories and the top keywords associated with them

    Language-related topics comprise the bulk of research in education and are steadily increasing in prominence.


  • Next...

    D-7: CACTUS: A User-friendly Toolkit for Semantic Categorization and Clustering in the Open Domain

    Emily Jamison, The Ohio State University

    On Deck: D-8 Toru Hirano


  • D-7

    CACTUS: A User-friendly Toolkit for Semantic Categorization and Clustering in the Open Domain

    Emily K. Jamison

    CACTUS: A 1-Slide Introduction

    • Open-domain: near-universal coverage
    • No training required: Internet as knowledge source
    • Easy-to-use GUI, or command-line interface


  • Next...

    D-8: Aggregating Knowledge of Named Entity Relations

    Toru Hirano, Yoshihiro Matsuo, and Genichiro Kikui, NTT Cyber Space Laboratories


  • D-8

    D-8: Aggregating Knowledge of Named Entity Relations

    [Figure labels: Web, Extractor, Wikipedia, Geographic database, Relational Database]

    Example: “George Bush is the President of the U.S” → [ President, George Bush, the U.S. ]

    Relationship | NE1:String  | NE1:ID             | NE2:String | NE2:ID
    President    | George Bush | George W. Bush-001 | the U.S.   | United States of America-001
    Speech       | Bush        | George W. Bush-001 | NY         | New York City-010

    8 million records from 14 million web pages
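
The table on the D-8 slide suggests that different surface strings ("George Bush", "Bush") are resolved to one entity ID before records are aggregated. A minimal sketch of that step follows, assuming a hand-written alias table; the `ALIASES` dictionary, the `aggregate` function, and the second "President" triple are hypothetical and are not taken from the NTT system.

```python
from collections import defaultdict

# Hypothetical alias table: surface string -> entity ID (IDs as shown on the slide).
ALIASES = {
    "George Bush": "George W. Bush-001",
    "Bush": "George W. Bush-001",
    "the U.S.": "United States of America-001",
    "NY": "New York City-010",
}

def aggregate(triples, aliases):
    """Group (relationship, NE1 string, NE2 string) triples by resolved IDs.

    Strings missing from the alias table are kept as their own key.
    """
    grouped = defaultdict(list)
    for relationship, ne1, ne2 in triples:
        key = (relationship, aliases.get(ne1, ne1), aliases.get(ne2, ne2))
        grouped[key].append((ne1, ne2))
    return dict(grouped)

triples = [
    ("President", "George Bush", "the U.S."),
    ("President", "Bush", "the U.S."),   # invented variant mention
    ("Speech", "Bush", "NY"),
]
records = aggregate(triples, ALIASES)
for key, mentions in records.items():
    print(key, len(mentions))
```

Keeping the original strings alongside each ID-level record makes it possible to trace an aggregated relation back to the text it came from.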
