AQUAINT Phase II 6-month Workshop, Tampa, October 6-8, 2004

HITIQA-2: Intelligence Analyst’s Assistant in High-Quality, Interactive Question Answering
Research Progress Report, October 8, 2004


DESCRIPTION

HITIQA-2 Intelligence Analyst’s Assistant in High-Quality, Interactive Question Answering: Research Progress Report. AQUAINT Phase II 6-month Workshop, Tampa, October 6-8, 2004. HITIQA Research Team, SUNY Albany: Prof. Tomek Strzalkowski (PI/PM), Prof. Boris Yamrom (co-PI).

TRANSCRIPT

Page 1: AQUAINT Phase II 6-month Workshop Tampa, October 6-8, 2004

October 8, 2004 AQUAINT 6-month Mtg 1

HITIQA-2 Intelligence Analyst’s Assistant in High-Quality, Interactive Question Answering

Research Progress Report

AQUAINT Phase II 6-month Workshop

Tampa, October 6-8, 2004

Page 2: HITIQA Research Team

• SUNY Albany:
  – Prof. Tomek Strzalkowski, PI/PM
  – Prof. Boris Yamrom, co-PI
  – Ms. Sharon Small, Research Scientist
  – Ms. Hilda Hardy, Research Scientist
  – Mr. Sean Ryan, Research Assistant
  – Graduate students
• Rutgers:
  – Prof. Paul Kantor, co-PI
  – Prof. K.B. Ng
  – Prof. Nina Wacholder
  – Graduate students

Page 3: HITIQA Research Objectives

• HITIQA is an Analytical QA System
  – “Scenario” QA: look for facts and events in context
  – Not a factoid system, but factoids are complementary
  – Semantics is central: data-driven, knowledge-based
• QA is Dialogue with Information
  – Analytical task: topic + context (time, recipient, purpose)
  – Evolving analytical strategy: line of questions & actions
  – Detect, follow, anticipate, negotiate strategy turns and shifts
• HITIQA Approach
  – Phase I: Create basic end-to-end capabilities + validate
  – Phase II: Build up knowledge + sustain productive dialogue
  – Phase III+: Augment the analytical process through active assistance

Page 4: HITIQA Deployment

• HITIQA has been deployed at
  – MITRE: AQUAINT test-bed
  – RDEC/SAIC: 2 large servers, local/VPN access
    • Used in tomorrow’s exercise
  – PNNL: local, Metrics Challenge Workshop
  – Albany: local & on-line
• HITIQA on-line
  – Currently accessible from unsecured internet locations (Albany)
  – Also within VPN (SAIC)
  – Unlimited access over firewalls: in progress

Page 5: Tryouts and Evaluations

• On-site workshops with USNR and other analysts
  – Two workshops conducted in Phase I with USNR
  – ARDA Metrics Challenge summer workshop: 2.5 weeks, 8 analysts
  – Future workshops (Spring 2005): SAIC, Albany
• On-line evaluations with USNR
  – Monthly weekend drills for 6 months
  – Longitudinal studies: extended drill scenarios
  – Status: firewall problems; working around these
• Formal Evaluations
  – Dry run based on the results of the Metrics Challenge?
• Testing Facilities
  – PNNL and MITRE installs
  – SAIC large-scale installation
  – AFRL Rome

Page 6: HITIQA-2 Task Structure

• Task 1: QA
• Task 2: Dialogue
• Task 3: Knowledge Acquisition
• Task 4: System Adaptation
• Task 5: Qualities and Aspects
• Task 6: Answer Generation
• Task 7: Visual Interface
• Task 8: Tryouts & Evaluations

Page 7: Extended QA Capabilities

• Expanded Scenario Support
  – Persistent memory of interactions
    • Questions + clarifications and offers
    • Questions and answers within a scenario
  – Follow-through questions
    • Handling of drill-down questions
    • Handling of variant questions
  – Composite Answer Space model
    • Expandable Answer for the entire scenario
    • New information found anywhere during scenario updates previous answers

Page 8: Exploring Answer Space via Dialogue

[Figure: regions of the answer space]
• EXACT QUESTION MATCH
• NEAR MISSES, ALTERNATIVE INTERPRETATIONS
• POSSIBLE DISCARD

Anticipating related information

Page 9: Scenario Structure

• Scenario = analytical problem
  – A series of questions asked by analyst
    • What is the history of the nuclear arms program between Russia and Iraq?
    • Who has helped finance the nuclear arms program in Iraq?
• (Composite) Question
  – A question posed by analyst +
  – Any follow-on by either HITIQA or analyst
    • A: How has al-Qaida conducted its efforts to acquire weapons of mass destruction?
    • H: We have this information referring to bin Laden but no mention of al-Qaida. Are you interested?

Page 10: Scenario-level answer structure

[Figure: questions within a scenario and the parts of the answer space they touch]
Q0: What is sarin’s potency? → potency
  • Botulin 100K times more toxic than sarin
  • Persists for 30 minutes in clothes
Q1: sarin development? → Develop(X,sarin)
Q2: nerve agents? → nerve agents
Q0: sarin’s impact on community?

Page 11: Composite Question Structure

Q = Q0 + Q1 + Q2 + …

Components of a composite question:
• Original question posed by analyst
• Clarification/offer by HITIQA
• Visual panel action by analyst
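
A composite question can be pictured as an ordered list of turns. The Python sketch below is illustrative only: the class and field names (`Turn`, `CompositeQuestion`, `kind`, `actor`) are assumptions rather than HITIQA's internal representation, and the text of the third turn is made up.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical turn kinds, named after the three components listed on the slide.
KINDS = ("question", "offer", "panel_action")

@dataclass
class Turn:
    kind: str   # one of KINDS
    actor: str  # "analyst" or "HITIQA"
    text: str

@dataclass
class CompositeQuestion:
    turns: List[Turn] = field(default_factory=list)

    def add(self, kind: str, actor: str, text: str) -> None:
        if kind not in KINDS:
            raise ValueError(f"unknown turn kind: {kind}")
        # Assumption: Q0 is always the analyst's original question.
        if not self.turns and kind != "question":
            raise ValueError("a composite question must start with a question")
        self.turns.append(Turn(kind, actor, text))

    @property
    def original(self) -> Turn:
        return self.turns[0]

q = CompositeQuestion()
q.add("question", "analyst",
      "How has al-Qaida conducted its efforts to acquire weapons of mass destruction?")
q.add("offer", "HITIQA",
      "We have this information referring to bin Laden but no mention of al-Qaida. "
      "Are you interested?")
# Made-up panel action, for illustration only.
q.add("panel_action", "analyst", "accepts the offered frames in the visual panel")
```

Keeping the turns in one ordered list means later processing can treat Q as a unit while still seeing who contributed each part.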

Page 12: Events and Relationships

• Events are basic information units in HITIQA
  – Generic events
  – Typed events
  – Domain-grounded events
• Represented internally as frames:
  – Event type: e.g., transfer, attack, …
  – Attributes: e.g., people, locations, …
  – Roles: e.g., agent, target, destination, …
• Frames are grouped into topics & “swarms”
  – Attribute & keyword overlap → topical clusters
  – Shared frame types & roles → event clusters
  – Effect, affect, sequence, … → event “swarms”
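
The frame description above maps naturally onto a small record type. This is a minimal sketch under assumed names (`Frame`, `shares_type_and_role`); the slide gives only the three kinds of field, not a concrete schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Frame:
    event_type: str                                                  # e.g. "transfer", "attack"
    attributes: Dict[str, List[str]] = field(default_factory=dict)   # e.g. people, locations
    roles: Dict[str, str] = field(default_factory=dict)              # e.g. agent, target, destination

    def shares_type_and_role(self, other: "Frame") -> bool:
        """True if both frames have the same type and agree on at least one role filler."""
        if self.event_type != other.event_type:
            return False
        return any(other.roles.get(role) == filler for role, filler in self.roles.items())

a = Frame("transfer", {"locations": ["France", "Iraq"]},
          {"source": "France", "destination": "Iraq"})
b = Frame("transfer", {"locations": ["Iraq"]}, {"destination": "Iraq"})
c = Frame("attack", {"locations": ["Iraq"]}, {"target": "Iraq"})
```

Here `a` and `b` would be candidates for the same event cluster, while `c` shares only a location attribute with them.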

Page 13: Event Frames

Source passage:
… Iraq possesses a few working centrifuges and the blueprints to build them. Iraq imported centrifuge materials from Nukem of the FRG and from other sources. One decade ago, Iraq imported 27 pounds of weapons-grade uranium from France, for Osirak nuclear research center. In 1981, Israel destroyed the Osirak nuclear reactor. In November 1990, the IAEA inspected Iraq and found all material accounted for. Peter Clausen, director of research at the Union of Concerned Scientists, said scientists are divided on whether one nuclear bomb can be made by Iraq from the 27 pounds of weapons-grade uranium. Marvin Miller, senior nuclear scientist at MIT in the US, said a crude Iraqi nuclear bomb couldn't fit on a missile, but could be carried in a large aircraft.

EXTRACT → generic frame:
TOPIC: imported
SUB-TOPIC: uranium
LOCATION: France, Iraq, Persian Gulf, US, Israel
ORGANIZATION: Administration, Emerging Nuclear Suppliers Project, FRGT, IAEA, MIT
DISEASES:
PEOPLE: Bush, George Bush, Leonard S. Spector, William Potter, Leonard Spector, Nukem, Peter Clausen, Marvin Miller
VALUE:
WEAPON: missile, nuclear bomb, uranium
DATES: November 1990

ASSIGN ROLES & SPECIALIZE → typed frame:
FRAME TYPE: TRANSFER WMDTransfer
TRANSFER TYPE (TOPIC): imported
TRANSFER DEST (LOCATION): Iraq
TRANSFER SOURCE (LOCATION): France
TRANSFER OBJECT (WEAPON): uranium
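
To make the "assign roles & specialize" step concrete, here is a toy Python sketch. It assumes a single hand-written seed rule (`IMPORT_RULE`) and a helper named `specialize`; both are hypothetical and much simpler than whatever HITIQA actually uses. The rule only accepts role fillers that the generic frame has already extracted.

```python
import re

# Hypothetical seed rule: "<DEST> imported <cargo> from <SOURCE>".
IMPORT_RULE = re.compile(
    r"(?P<dest>[A-Z]\w+) imported (?P<cargo>.+?) from (?P<source>[A-Z]\w+)"
)

def specialize(generic, sentence):
    """Turn a generic frame into a typed TRANSFER frame, or return None."""
    m = IMPORT_RULE.search(sentence)
    if m is None:
        return None
    dest, source = m.group("dest"), m.group("source")
    # Both endpoints must already be known locations in the generic frame.
    if dest not in generic["LOCATION"] or source not in generic["LOCATION"]:
        return None
    # The transferred object must be one of the extracted WEAPON attributes.
    cargo = [w for w in generic["WEAPON"] if w in m.group("cargo")]
    if not cargo:
        return None
    return {
        "FRAME TYPE": "TRANSFER",
        "TRANSFER TYPE": generic["TOPIC"],
        "TRANSFER DEST": dest,
        "TRANSFER SOURCE": source,
        "TRANSFER OBJECT": cargo[0],
    }

GENERIC = {
    "TOPIC": "imported",
    "LOCATION": ["France", "Iraq", "Persian Gulf", "US", "Israel"],
    "WEAPON": ["missile", "nuclear bomb", "uranium"],
}

typed = specialize(
    GENERIC,
    "One decade ago, Iraq imported 27 pounds of weapons-grade uranium from France, "
    "for Osirak nuclear research center.",
)
rejected = specialize(
    GENERIC,
    "Iraq imported centrifuge materials from Nukem of the FRG and from other sources.",
)
```

With this rule, the uranium sentence yields the typed frame shown on the slide, while the Nukem sentence is rejected because "Nukem" is not among the extracted locations.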

Page 14: Event Clusters and Swarms

• Groups of closely related events → clusters
  – Shared types & roles → event clusters (imports of uranium)
  – Attributes & text similarity → topical clusters (nerve agents)
• Events may be parts of larger topics or stories
  – One event makes another event likely → swarming links
    • If missile exports by North Korea are of interest, then the status of missile development in North Korea is likely to be relevant as well.
• Provide guidance for dialogue & exploration
  – May facilitate hypothesis formation (by analyst)
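
The two clustering criteria can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the frame layout, the choice of the `object` role as the grouping key, Jaccard overlap as the similarity measure, and the 0.5 threshold. The four frames are toy data, not HITIQA output.

```python
from collections import defaultdict

def event_clusters(frames):
    """Group frames that share an event type and the filler of the 'object' role."""
    groups = defaultdict(list)
    for f in frames:
        groups[(f["type"], f["roles"].get("object"))].append(f["id"])
    return dict(groups)

def jaccard(a, b):
    """Overlap of two keyword lists: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def topically_related(f, g, threshold=0.5):
    """Attribute/keyword overlap at or above an (arbitrary) threshold."""
    return jaccard(f["keywords"], g["keywords"]) >= threshold

# Toy frames, invented for this example.
frames = [
    {"id": 1, "type": "transfer", "roles": {"object": "uranium", "dest": "Iraq"},
     "keywords": ["uranium", "import", "Iraq"]},
    {"id": 2, "type": "transfer", "roles": {"object": "uranium", "dest": "Libya"},
     "keywords": ["uranium", "import", "Libya"]},
    {"id": 3, "type": "develop", "roles": {"object": "sarin"},
     "keywords": ["sarin", "nerve agents", "production"]},
    {"id": 4, "type": "transfer", "roles": {"object": "sarin"},
     "keywords": ["sarin", "nerve agents", "import"]},
]

clusters = event_clusters(frames)
```

Frames 1 and 2 land in one event cluster (imports of uranium). Frames 3 and 4 differ in type but overlap enough in keywords to count as topically related (nerve agents).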

Page 15: Multiple Views of Answer Space

[Figure: frames labeled TRF, DEV, and GEN, shown in the views listed below; link endpoints are labeled "target"]
• Topical cluster: e.g., nerve agents
• Event cluster: e.g., import of sarin
• Swarming links

Page 16: HITIQA Dialogue capabilities

• System actions: offers, suggestions, clarifications
  – Offering closely related and/or swarming events:
    • Imports from Korea? Missile developments there?
  – Suggesting additional events in a cluster with relevant events
    • Sarin production in Iran? In Pakistan?
• Analyst’s actions affecting answer space
  – Accept, partially accept, or reject offers
  – Strategy changes, diversions
• Support hypothesis formation
  – HITIQA can affect analyst’s actions through:
    • Explicit suggestions and offers
    • Indirectly through: direction of dialogue, answer presentation
  – Effects can be direct/immediate or indirect/delayed
    • What information to save (direct)
    • What question to ask next (indirect)

Page 17: Example (from a HITIQA drill)

Analyst: What is the history of the nuclear arms program between Russia and Iraq?

HITIQA: Do you want to see material on cooperation on development of weapons between Iraq and Russia?

Analyst: Yes … Who has helped finance the nuclear arms program in Iraq?

HITIQA: Would you be interested in information on weapons in Iraq?

Analyst: Yes … What type of nuclear weapons does Iraq possess?

HITIQA: Would you be interested in information on dollars in Russia relative to your question?

Analyst: Yes … What is the finance debt that Iraq owes Russia?

Page 18: Knowledge Acquisition Process

• Template development: frame mining
  – PropBank/Verb bank/TimeBank, etc.
  – Corpus mining of frequently occurring event types
• Roles and relationships
  – Develop frames for selected event types:
    • Determine entities in the frame & role for each entity
    • Determine seed rules for frame/role extraction
  – Classify modal and other relationships
    • e.g., denials, threats, allegations, …

Page 19: Knowledge Acquisition Process

• Bootstrapping over text corpora
  – Develop feature set (context elements)
  – Bootstrapping exploits duality of lexical and pattern space
    • Expand from seed rules to high recall extraction
• Frame acquisition from structured data
  – Statistical structure-to-text alignment
    • Use CNS/WMD database
  – Extract seed rules from aligned corpora
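
The "duality of lexical and pattern space" can be illustrated with a generic bootstrapping loop: known pairs suggest patterns, and patterns suggest new pairs. The sketch below is a textbook-style toy, not HITIQA's procedure; the function name `bootstrap`, the two-word pair format, and the four sentences are all assumptions made for the example.

```python
import re

def bootstrap(sentences, seed_pairs, rounds=2):
    """Alternate between lexical space (known pairs) and pattern space (contexts)."""
    pairs = set(seed_pairs)
    patterns = set()
    for _ in range(rounds):
        # Lexical -> pattern: the text between a known pair becomes a pattern.
        for s in sentences:
            for left, right in pairs:
                m = re.search(re.escape(left) + r" (.+?) " + re.escape(right), s)
                if m:
                    patterns.add(m.group(1))
        # Pattern -> lexical: each pattern proposes new pairs from its contexts.
        for s in sentences:
            for p in patterns:
                m = re.search(r"(\w+) " + re.escape(p) + r" (\w+)", s)
                if m:
                    pairs.add((m.group(1), m.group(2)))
    return pairs, patterns

# Made-up corpus for illustration.
corpus = [
    "Iraq imported uranium",
    "Libya imported centrifuges",
    "Libya acquired centrifuges",
    "Iran acquired sarin",
]

pairs, patterns = bootstrap(corpus, {("Iraq", "uranium")}, rounds=2)
```

Starting from one seed pair, the first round learns the pattern "imported" and a second pair; the second round learns "acquired" and a third pair. A real system would also score patterns and filter noisy ones, which this toy omits, so recall grows at the cost of unchecked drift.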

Page 20: Preliminary event frames

Event Types (selected)                                      Attributes
ATTACK: attack, invade, bomb, destroy                       TARGET, AGENT, TYPE, INSTR, LOC, TIME
ASSIST: help, support, assist, aid                          TARGET, AGENT, TYPE, INSTR, LOC, TIME
TRANSFER: acquire, obtain, seize, steal, export, smuggle    CARGO, TYPE, SOURCE, DEST, CONV, TIME
DEVELOP: construct, develop, manufacture, deploy, design    PROD, AGENT, TYPE, QUANT, LOC, TIME
AGREE: treaty, agreement, sign, ratify                      PARTIES, TYPE, INSTR, LOC, TIME
THREAT: threaten, fear, menace, endanger                    TARGET, AGENT, TYPE, INSTR, LOC, TIME

Page 21: Preliminary event frames, cont’d

Event Types (selected)                                      Attributes
POLITICAL: election, appoint, resign, hire, vote            TARGET, AGENT, TYPE, POSITION, LOC, TIME
LEGAL: inspect, impose embargo, pass, detain                TARGET, AGENT, TYPE, CHARGE, LOC, TIME
FINANCE: fund, finance, pay                                 TARGET, SOURCE, TYPE, QUANT, LOC, TIME
CAPABLE: possess, control, capable                          AGENT, TYPE, INSTR, QUANT, LOC, TIME

• Modal Attributes:
  – Polarity: positive, negative, actual, probable, future, …
  – Manner: say, claim, threaten, allege, advise, refute, …
  – Source, if known
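
The two frame tables can be read as a lookup from trigger words to event types and their attribute slots. The sketch below copies the words and slots from the slides; the function names (`candidate_frames`, `empty_frame`) and the exact-match lookup are assumptions. Exact matching ignores inflection, so "imported" or "financed" would not match without stemming.

```python
# Trigger words and attribute slots copied from the "Preliminary event frames" slides.
EVENT_FRAMES = {
    "ATTACK": (["attack", "invade", "bomb", "destroy"],
               ["TARGET", "AGENT", "TYPE", "INSTR", "LOC", "TIME"]),
    "ASSIST": (["help", "support", "assist", "aid"],
               ["TARGET", "AGENT", "TYPE", "INSTR", "LOC", "TIME"]),
    "TRANSFER": (["acquire", "obtain", "seize", "steal", "export", "smuggle"],
                 ["CARGO", "TYPE", "SOURCE", "DEST", "CONV", "TIME"]),
    "DEVELOP": (["construct", "develop", "manufacture", "deploy", "design"],
                ["PROD", "AGENT", "TYPE", "QUANT", "LOC", "TIME"]),
    "AGREE": (["treaty", "agreement", "sign", "ratify"],
              ["PARTIES", "TYPE", "INSTR", "LOC", "TIME"]),
    "THREAT": (["threaten", "fear", "menace", "endanger"],
               ["TARGET", "AGENT", "TYPE", "INSTR", "LOC", "TIME"]),
    "POLITICAL": (["election", "appoint", "resign", "hire", "vote"],
                  ["TARGET", "AGENT", "TYPE", "POSITION", "LOC", "TIME"]),
    "LEGAL": (["inspect", "impose embargo", "pass", "detain"],
              ["TARGET", "AGENT", "TYPE", "CHARGE", "LOC", "TIME"]),
    "FINANCE": (["fund", "finance", "pay"],
                ["TARGET", "SOURCE", "TYPE", "QUANT", "LOC", "TIME"]),
    "CAPABLE": (["possess", "control", "capable"],
                ["AGENT", "TYPE", "INSTR", "QUANT", "LOC", "TIME"]),
}

def candidate_frames(word):
    """Return the event types whose trigger list contains the word (exact match)."""
    w = word.lower()
    return [etype for etype, (triggers, _) in EVENT_FRAMES.items() if w in triggers]

def empty_frame(event_type):
    """Build an unfilled frame with one slot per attribute of the event type."""
    _, slots = EVENT_FRAMES[event_type]
    frame = {"FRAME TYPE": event_type}
    frame.update({slot: None for slot in slots})
    return frame
```

For example, `candidate_frames("smuggle")` selects TRANSFER, and `empty_frame("TRANSFER")` gives the slots an extractor would then try to fill.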

Page 22: Answer generation

• Current: frame-based summaries
  – Frames in the answer space rendered into headlines
  – Passages sorted by “threads”: target, time, location, etc.
• Build more coherence into the answer
  – Use elements of Discourse Structure Theory
    • Applied to passages from multiple documents
  – Passages are output to form a more logical flow
• Maximize lucidity of the answer
  – Use dialogue history to structure the answer → folders
  – Compute rhetorical relations between answer elements
    • justification, elaboration, evidence, contradiction, etc.
• Answer Summaries
  – Summarize answer passages using XDoX Summarizer

Page 23: Answer Organization Approach

Frames: semantic relations, shared attributes, swarming links
  → mapped onto →
Passages: rhetorical relations between text passages

Page 24: Answer Structuring Options

Passage 1: We also believe that Bin Ladin was seeking to acquire or develop a nuclear device. Al-Qa'ida may be pursuing a radioactive dispersal device, what some call a dirty bomb.
  Frame Type: Transfer; Type: acquire; Source: (empty); Destin: Bin Laden, Al-Qaida; Cargo: nuc dev., dirty bomb

Passage 2: Israeli military intelligence sources reported that Bin Laden paid over 2 mil pounds sterling to a middle-man in Kazakhstan, who promised to deliver a dirty bomb to Bin Laden within two years.
  Frame Type: Transfer; Type: deliver; Source: mid-man in KZ; Destin: Bin Laden; Cargo: dirty bomb

Passage 3: The Saudi-owned, London-based Arabic newspaper, Al-Hayat, declared that Bin Laden had obtained nuclear weapons.
  Frame Type: Transfer; Type: obtain; Source: (empty); Destin: Bin Laden; Cargo: nuclear weapons

Passage 4: Osama bin Laden probably does not have a nuclear weapon, but likely has chemical or biological weapons, Defense Secretary Donald H. Rumsfeld said.
  Frame Type: ~Capable; Type: possess; Agent: Bin Laden; Instr: nuclear weapons

Link labels between frames: more, effect, negation
Connectives between passages: “However,” “Specifically,” “In fact,”
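
The idea of turning frame-level links into passage-level connectives can be sketched as follows. This is speculative: the slide lists the link labels and the connectives but does not say which label goes with which connective, so the pairing in `CONNECTIVE` and the heuristic in `link` are assumptions made purely for illustration. The `objects` field is also a simplification that merges the Cargo and Instr slots.

```python
# ASSUMPTION: illustrative pairing only; the slide does not state it.
CONNECTIVE = {"more": "Specifically,", "effect": "In fact,", "negation": "However,"}

def link(prev, cur):
    """Guess a link label between two consecutive frames (hypothetical heuristic)."""
    if cur["frame_type"].startswith("~"):
        return "negation"  # a negated frame type contradicts what came before
    same_type = cur["frame_type"] == prev["frame_type"]
    if same_type and set(cur["objects"]) & set(prev["objects"]):
        return "more"      # same kind of event about the same object: more detail
    return "effect"        # fallback label, chosen arbitrarily for this sketch

def structure(items):
    """items: list of (frame, passage). Prefix each later passage with a connective."""
    out = [items[0][1]]
    for (prev, _), (cur, text) in zip(items, items[1:]):
        out.append(f"{CONNECTIVE[link(prev, cur)]} {text}")
    return out

items = [
    ({"frame_type": "Transfer", "objects": ["nuc dev.", "dirty bomb"]},
     "Bin Ladin was seeking to acquire or develop a nuclear device."),
    ({"frame_type": "Transfer", "objects": ["dirty bomb"]},
     "a middle-man in Kazakhstan promised to deliver a dirty bomb to Bin Laden."),
    ({"frame_type": "Transfer", "objects": ["nuclear weapons"]},
     "Al-Hayat declared that Bin Laden had obtained nuclear weapons."),
    ({"frame_type": "~Capable", "objects": ["nuclear weapons"]},
     "Osama bin Laden probably does not have a nuclear weapon, Rumsfeld said."),
]

answer = structure(items)
```

The passage texts are shortened from the slide. The point is only that relations computed over frames, not over raw text, decide how the passages are stitched together.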

Page 25: Integrated Visual/Language Interface

• Visual navigational context for dialogue
  – Visual representation for event frames, answer spaces, and links between answer spaces
  – Multi-level views: scenario, question, frame
  – Visual interactions integrated with QA process
• Integrated Visual/QA interface
  – Question/Answer actions immediately reflected on visual
  – Folders reflecting user/system dialogue focus
  – Visual alerts for system updates

Page 26: Integrated Visual interface

Page 27: Dialogue focus tracking

• Animation used to:
  – Center clusters and folders in focus
  – Form new folders when system’s offers are accepted
  – Break up and reorganize old folders
  – Change colors when relevance decisions are made

Page 28: Folder manipulation via dialogue

HITIQA: Would you be interested in concealment activities in Iraq?

Page 29: Visual Interface: frame view

Page 30: Representing Content Aspects

• Properties of information “orthogonal” to content
  – Type of topic: e.g., political, scientific, military, …
  – Type of content: e.g., historical, biographical
  – Type of communication: e.g., human characteristics
• “Metadata” in answer and dialogue
  – To help organize the answer space
  – To extend dialogue beyond content only
• Acquired through machine learning
  – Learn over large framed corpora
  – When trained, insert into HITIQA interface

Page 31: CA research progress

• Identify reliable textual & linguistic indicators
  – Lists of words; named entities
  – Information from frames (events, roles)
  – Textual information (number of words, vocab. size)
  – Anaphora resolution
• Incorporate into HITIQA
  – Frames
  – Interface
• Evaluate
  – Reliability of indicators
  – Usefulness of selected CAs for analysts
  – Impact on HITIQA system: increment in finding useful information and on end-to-end performance

Page 32: Some preliminary results

• 100 words in List T as indicators
• 5 sample sets, each consisting of 240 documents for training and 60 for testing
• Accuracy = correctly classified documents / total documents
• Results are extremely good: 90% or better on every sample, sometimes much better

Correctly classified / total test documents per class; % is over the 60 test docs in each sample.

          ALL WORDS                        STEPWISE
Sample    Bio     Sci     Pol     %        Bio     Sci     Pol     %
1         23/23   15/15   17/22   92%      23/23   15/15   17/22   92%
2         25/27   19/19   14/14   97%      23/27   19/19   14/14   93%
3         16/17   20/23   18/20   90%      15/17   21/23   18/20   90%
4         18/18   19/19   20/23   95%      18/18   19/19   21/23   97%
5         15/15   18/24   21/21   90%      14/15   22/24   20/21   90%
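
The accuracy definition on this slide can be checked against the per-class counts with a few lines of Python. The counts below are copied from the table; the dictionary and function names are made up for the example.

```python
# (correct, total) per class in the order Bio, Sci, Pol, copied from the table.
ALL_WORDS = {
    1: [(23, 23), (15, 15), (17, 22)],
    2: [(25, 27), (19, 19), (14, 14)],
    3: [(16, 17), (20, 23), (18, 20)],
    4: [(18, 18), (19, 19), (20, 23)],
    5: [(15, 15), (18, 24), (21, 21)],
}
STEPWISE = {
    1: [(23, 23), (15, 15), (17, 22)],
    2: [(23, 27), (19, 19), (14, 14)],
    3: [(15, 17), (21, 23), (18, 20)],
    4: [(18, 18), (19, 19), (21, 23)],
    5: [(14, 15), (22, 24), (20, 21)],
}

def accuracy(counts):
    """Accuracy in percent: correctly classified documents / total documents."""
    correct = sum(c for c, _ in counts)
    total = sum(t for _, t in counts)
    return 100 * correct / total

all_words_pct = [round(accuracy(ALL_WORDS[s])) for s in sorted(ALL_WORDS)]
stepwise_pct = [round(accuracy(STEPWISE[s])) for s in sorted(STEPWISE)]
```

Nine of the ten recomputed percentages agree with the table. The exception is sample 5 under STEPWISE: the counts as transcribed sum to 56/60, which rounds to 93% rather than the listed 90%. The transcript does not show whether the counts or the percentage is off.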

Page 33: Intelligence Value of Information

• What makes high-quality information?
  – Accuracy, reliability, significance, depth, etc.
  – Detail level, bias, opinion/viewpoint, objectivity, …
• How to recognize high-quality information?
  – Textual and contextual indicators
    • We can compute some qualities (e.g. depth, viewpoints)
  – Individualized quality models for users
    • while other qualities appear highly personalized
• What qualities matter to analysts?
  – Why is some information better than other information?
  – Possible panel discussion: Thursday 6-7:30pm