Global Analytics: Text, Speech, Sentiment, and Sense

Global Analytics: Text, Speech, Sentiment, and Sense Seth Grimes Alta Plana Corporation @sethgrimes December 4, 2014

Upload: seth-grimes

Post on 07-Jul-2015

TRANSCRIPT

Page 1: Global Analytics: Text, Speech, Sentiment, and Sense

Global Analytics: Text, Speech, Sentiment, and Sense

Seth Grimes, Alta Plana Corporation

@sethgrimes

December 4, 2014

Page 2: Global Analytics: Text, Speech, Sentiment, and Sense

Global Analytics: Text, Speech, Sentiment, and Sense

LT-Accelerate – 4 December, 2014

2

Thus the Orb he roam'd

With narrow search; and with inspection deep

Consider'd every Creature, which of all

Most opportune might serve his Wiles.

-- John Milton, Paradise Lost

“Reading from Text is a Hard Problem”

Eugène Delacroix, St. Michael Defeats the Devil

Page 3: Global Analytics: Text, Speech, Sentiment, and Sense


Data Space, Indexing

Search

Analysis

Intent, Goals

Context

Page 4: Global Analytics: Text, Speech, Sentiment, and Sense


Page 5: Global Analytics: Text, Speech, Sentiment, and Sense


Analytics is the systematic application of algorithmic methods that derive and deliver information, typically expressed quantitatively, whether in the form of indicators, tables, visualizations, or models.

• Systematic means formal & repeatable.

• Algorithmic contrasts with heuristic.

Analytics creates and/or applies models.

Page 6: Global Analytics: Text, Speech, Sentiment, and Sense


http://www.tropicalisland.de/NYC_New_York_Brooklyn_Bridge_from_World_Trade_Center_b.jpg

$$x(t) = t, \qquad y(t) = \tfrac{1}{2}\, a \left(e^{t/a} + e^{-t/a}\right) = a \cosh(t/a)$$

http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg

Models make the unstructured computable.
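
As a small illustration of how a model makes a problem computable, the Königsberg bridges referenced above can be reduced to a multigraph and tested with Euler's degree-parity rule. This is a minimal sketch, not part of the original slides; the vertex labels A to D and the function names are assumptions chosen for the example.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# Labels are hypothetical: A is the central island, B and C are the
# two river banks, D is the second island to the east.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def has_euler_walk(edges):
    """Euler's rule for a connected multigraph: a walk that crosses
    every edge exactly once exists only if 0 or 2 vertices have odd
    degree. Connectivity is assumed here, not checked."""
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return odd in (0, 2)


if __name__ == "__main__":
    print(dict(degrees(BRIDGES)))
    print(has_euler_walk(BRIDGES))
```

Once the city is modeled as vertices and edges, the question "can every bridge be crossed exactly once?" becomes a count of odd-degree vertices. All four land masses have odd degree, so no such walk exists.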

Page 7: Global Analytics: Text, Speech, Sentiment, and Sense


Sixty+ years of analysis & modelling progress:

Text

Numbers

Patterns & Insights

Connections

Interactions

Page 8: Global Analytics: Text, Speech, Sentiment, and Sense
Page 9: Global Analytics: Text, Speech, Sentiment, and Sense

Document input and processing

Knowledge handling is key

Desk Set (1957): computer engineer Richard Sumner (Spencer Tracy), television network librarian Bunny Watson (Katharine Hepburn), and the "electronic brain" EMERAC.

Hans Peter Luhn

“A Business Intelligence System”

IBM Journal, October 1958

Page 10: Global Analytics: Text, Speech, Sentiment, and Sense


“Statistical information derived from word frequency and distribution is used by the machine to compute a relative measure of significance, first for individual words and then for sentences. Sentences scoring highest in significance are extracted and printed out to become the auto-abstract.”

H.P. Luhn, The Automatic Creation of Literature Abstracts, IBM Journal, 1958.
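
The procedure in the quotation can be sketched in a few lines of Python. This is a simplified illustration, not Luhn's exact significance measure: it assumes a small hand-picked stop list, a naive sentence splitter, and a sentence score equal to the summed frequency of its significant words divided by sentence length. The function names and the stop list are hypothetical.

```python
import re
from collections import Counter

# Illustrative stop list (an assumption for this sketch).
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "in", "is", "it", "of", "on", "or", "that", "the", "to", "was", "with",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())


def auto_abstract(text, n_sentences=1):
    """Return the n highest-scoring sentences, in document order."""
    sentences = split_sentences(text)
    # Word significance: frequency of each non-stopword in the document.
    freq = Counter(
        w for s in sentences for w in tokenize(s) if w not in STOPWORDS
    )

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        # Sentence significance: summed word significance per token.
        return sum(freq[w] for w in tokens if w not in STOPWORDS) / len(tokens)

    ranked = sorted(
        range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True
    )
    keep = sorted(ranked[:n_sentences])  # restore original order
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    sample = (
        "Nerve cells send signals. "
        "Signals travel along nerve fibers between nerve cells. "
        "The weather was pleasant."
    )
    print(auto_abstract(sample, n_sentences=2))
```

As in Luhn's description, the sketch scores words first and sentences second, and it ignores grammar and meaning entirely.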

Luhn’s analysis of Messengers of the Nervous System, a Scientific American article

http://wordle.net, applied to a Luhn-cited NY Times article

Page 11: Global Analytics: Text, Speech, Sentiment, and Sense


“This rather unsophisticated argument on ‘significance’ avoids such linguistic implications as grammar and syntax... No attention is paid to the logical and semantic relationships the author has established.”

-- H.P. Luhn

~ 2004-5

Page 12: Global Analytics: Text, Speech, Sentiment, and Sense
Page 13: Global Analytics: Text, Speech, Sentiment, and Sense
Page 14: Global Analytics: Text, Speech, Sentiment, and Sense
Page 15: Global Analytics: Text, Speech, Sentiment, and Sense


Patterns, Insights & Connections

~ 2009-12

Page 16: Global Analytics: Text, Speech, Sentiment, and Sense


… also commonly explored via dashboards.

Page 17: Global Analytics: Text, Speech, Sentiment, and Sense


What information?

Do you currently need (or expect to need) to extract or analyze...

• Events: current 33%, expect 21%
• Semantic annotations: current 31%, expect 24%
• Other entities – phone numbers, part/product …: current 34%, expect 23%
• Metadata such as document author,…: current 47%, expect 23%
• Concepts, that is, abstract groups of entities: current 51%, expect 28%
• Named entities – people, companies, …: current 56%, expect 25%
• Relationships and/or facts: current 47%, expect 33%
• Sentiment, opinions, attitudes, emotions,…: current 54%, expect 28%
• Topics and themes: current 66%, expect 22%

Source: Text Analytics 2014, http://altaplana.com/TA2014

Page 18: Global Analytics: Text, Speech, Sentiment, and Sense


Emotion and outcomes

Page 19: Global Analytics: Text, Speech, Sentiment, and Sense


“The share rise in users who selected Arabic…coincided with much of the civil unrest… in Middle Eastern countries.”

http://bits.blogs.nytimes.com/2014/03/09/the-languages-of-twitter-users/

Page 20: Global Analytics: Text, Speech, Sentiment, and Sense


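Non-English language support?

• Arabic: current 10%, within 2 years 17%
• Bahasa Indonesia or Malay: current 1%, within 2 years 3%
• Chinese: current 16%, within 2 years 28%
• Dutch: current 9%, within 2 years 7%
• French: current 36%, within 2 years 17%
• German: current 34%, within 2 years 24%
• Greek: current 2%, within 2 years 2%
• Hindi, Urdu, Bengali, Punjabi, or other…: current 2%, within 2 years 10%
• Italian: current 18%, within 2 years 11%
• Japanese: current 7%, within 2 years 15%
• Korean: current 4%, within 2 years 8%
• Polish: current 3%, within 2 years 4%
• Portuguese: current 13%, within 2 years 17%
• Russian: current 8%, within 2 years 21%
• Scandinavian or Baltic: current 7%, within 2 years 3%
• Spanish: current 38%, within 2 years 20%
• Turkish or Turkic: current 3%, within 2 years 4%
• Other African: current 2%, within 2 years 0%
• Other Arabic script (including Urdu,…: current 3%, within 2 years 1%
• Other East Asian: current 2%, within 2 years 1%
• Other European or Slavic/Cyrillic: current 5%, within 2 years 2%
• Other: current 9%, within 2 years 0%

Source: Text Analytics 2014, http://altaplana.com/TA2014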

Page 21: Global Analytics: Text, Speech, Sentiment, and Sense


Beyond text

Audio including speech

Images

Video

IoT

http://www.geekosystem.com/facebook-face-recognition/

http://www.sciencedirect.com/science/article/pii/S0167639312000118

http://flylib.com/books/en/2.495.1.54/1/

Page 22: Global Analytics: Text, Speech, Sentiment, and Sense


http://searchuserinterfaces.com/

“It is convenient to divide the entire information access process into two main components: information retrieval through searching and browsing, and analysis and synthesis of results. This broader process is often referred to in the literature as sensemaking.

Sensemaking refers to an iterative process of formulating a conceptual representation from a large volume of information.”

– Marti Hearst, 2009

Sensemaking

Page 23: Global Analytics: Text, Speech, Sentiment, and Sense


Challenges

Context

Interaction

Narrative and discourse

Correlation, integration, and synthesis

Sentiment++: Mood, opinions, emotions, intent

Question answering

Dialog, storytelling

Cross-lingual / “omni-channel” implementation

Prescription, autonomy

Singularity?

Page 24: Global Analytics: Text, Speech, Sentiment, and Sense


Opportunity enablers

The API economy

I.e., on-demand, via-API Web services

Cloud deployment and service delivery

…enabling rapid deployment

Data aggregation and enrichment

Examples: Gnip, DataSift, Spinn3r, and Moreover

Growth hacking

Knowledge graphs

Machine learning

Supervised, unsupervised, active, deep

Open source

Platforms and frameworks

Examples: UIMA, GATE… Salesforce, QlikView… Python, R

Page 25: Global Analytics: Text, Speech, Sentiment, and Sense


Where to?

Page 26: Global Analytics: Text, Speech, Sentiment, and Sense

Global Analytics: Text, Speech, Sentiment, and Sense

Seth Grimes, Alta Plana Corporation

@sethgrimes

December 4, 2014