IBM Watson Knowledge Studio

Watson Knowledge Studio Cognitive Solutions

Upload: paul-godby

Post on 16-Apr-2017



TRANSCRIPT

Page 1: IBM Watson Knowledge Studio

Watson Knowledge Studio

Cognitive Solutions

Page 2: IBM Watson Knowledge Studio

Unstructured data

• Every day we produce 2.5 quintillion bytes of new data

• Most of this new data is unstructured data

• Books, journals, health records, e-mails, blogs, tweets, etc.

• Holds many possibilities, but is vastly underutilized due to challenges in understanding and using the data

• Typical organizations leverage only 8% of this data!

Page 3: IBM Watson Knowledge Studio

Extracting value from unstructured data

• Natural Language Processing (NLP) is a

core function for parsing and identifying

significant words in language.

• Most organizations need to mine unstructured text for specific

information that is unique to their industry or business needs

• Organizations must have the ability to customize the NLP model in order

to realize the full value/benefit of mining the unstructured text

• Helps organizations generate business insights

Page 4: IBM Watson Knowledge Studio

Introducing IBM Watson Knowledge Studio

• Software-as-a-Service (SaaS) offering available exclusively through the

IBM Cloud Marketplace

• Intended to accelerate the training and adaptation of Watson with specific

industry and organizational domain knowledge

• Leverages state-of-the-art supervised machine learning techniques that

allow you to create machine-learning models that understand the

linguistic nuances, meaning, and relationships specific to your industry

• http://ibm.biz/ibmwatsonknowledgestudio

Page 5: IBM Watson Knowledge Studio

Watson Knowledge Studio

• Enables developers and domain experts to collaborate on the creation of

custom annotator components that can be used to identify mentions and

relations in unstructured text

[Diagram: Watson Knowledge Studio shown with Watson Explorer, AlchemyLanguage on WDC, and Analytics Exchange. Target users: SME and DEV.]

Page 6: IBM Watson Knowledge Studio

Domain Adaptation with Knowledge Studio

Challenges

• Expensive: The process of training machines to extract information from a new domain is fragmented, making it expensive

• Isolated: Isolated development environments make it challenging for domain experts & developers to work together

• Complex: The ambiguous nature of natural language makes it complex for people to program machines

IBM Watson Knowledge Studio

• Collaborative: SMEs work together to infuse domain knowledge in cognitive applications

• Intuitive: Use a guided experience to teach Watson nuances of natural language without writing a single line of code

• Cost Effective: Create and deploy domain knowledge infused annotators faster than ever before using an integrated development environment

Page 7: IBM Watson Knowledge Studio

Example: Auto manufacturer

• Use case: Identify safety defects using traffic incident reports

• Solution: Create an NLP model that understands relationships between

manufacturer, make, model, type of incident, and date of incident

Page 8: IBM Watson Knowledge Studio

Watson Knowledge Studio terminology

• An Annotator adds annotations (metadata) to text that appears in natural

language content. Used by applications to analyze and process text.

• A Type System is an inventory of everything we want WKS to

understand about the unstructured text

•Mentions = any span of text relevant to the current domain

– Example: airbag, child restraint system, etc.

•Entities = group of Mentions that refer to the same thing

– Example: CarMake, AccidentLocation

•Relation = a binary relationship between two entities

– Example: occurredAt defines a relationship between CarMake and

AccidentLocation
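The terminology above can be made concrete with a small data-structure sketch. This is not how WKS stores a type system internally; the class and method names (`TypeSystem`, `RelationType`, `allows`) are hypothetical, and only the entity and relation names (CarMake, AccidentLocation, occurredAt) come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationType:
    """A binary relation between two entity types."""
    name: str
    source: str  # entity type of the first argument
    target: str  # entity type of the second argument


@dataclass
class TypeSystem:
    """Inventory of the entity and relation types a model should recognise."""
    entity_types: set = field(default_factory=set)
    relation_types: dict = field(default_factory=dict)

    def add_entity_type(self, name):
        self.entity_types.add(name)

    def add_relation_type(self, name, source, target):
        # A relation may only connect entity types that are already defined.
        for entity in (source, target):
            if entity not in self.entity_types:
                raise ValueError(f"unknown entity type: {entity}")
        self.relation_types[name] = RelationType(name, source, target)

    def allows(self, relation, source, target):
        """Return True if `relation` may link a `source` entity to a `target` entity."""
        rel = self.relation_types.get(relation)
        return rel is not None and (rel.source, rel.target) == (source, target)


# Example taken from the slide: occurredAt relates CarMake to AccidentLocation.
type_system = TypeSystem()
type_system.add_entity_type("CarMake")
type_system.add_entity_type("AccidentLocation")
type_system.add_relation_type("occurredAt", "CarMake", "AccidentLocation")
```

Treating relations as directed pairs of entity types means a relation such as occurredAt is accepted in one direction only, which matches the "binary relationship between two entities" wording on the slide.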

Page 9: IBM Watson Knowledge Studio

Annotation example

John Smith works for IBM. He has been with Big Blue for 20 years.

Entity: PERSON

John Smith

Entity: ORG

IBM Corp

Relation: employedBy Relation: employedBy
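One way to read the annotation example is as mentions anchored to character offsets, grouped into entities and linked by relations. The sketch below assumes that reading: grouping "He" with "John Smith" and "Big Blue" with "IBM" is an interpretation of the figure, and the `Mention` class and `find_mention` helper are hypothetical names, not part of WKS.

```python
from dataclasses import dataclass

TEXT = "John Smith works for IBM. He has been with Big Blue for 20 years."


@dataclass(frozen=True)
class Mention:
    start: int        # inclusive character offset
    end: int          # exclusive character offset
    entity_type: str

    def surface(self, text):
        """Return the text covered by this mention."""
        return text[self.start:self.end]


def find_mention(text, phrase, entity_type):
    """Locate the first occurrence of `phrase` and wrap it as a Mention."""
    start = text.index(phrase)
    return Mention(start, start + len(phrase), entity_type)


john = find_mention(TEXT, "John Smith", "PERSON")
ibm = find_mention(TEXT, "IBM", "ORG")
he = find_mention(TEXT, "He", "PERSON")
big_blue = find_mention(TEXT, "Big Blue", "ORG")

# Entities group mentions that refer to the same thing (assumed grouping).
entities = {
    "John Smith": [john, he],
    "IBM Corp": [ibm, big_blue],
}

# One employedBy relation per sentence, each linking a PERSON to an ORG mention.
relations = [
    ("employedBy", john, ibm),
    ("employedBy", he, big_blue),
]
```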

Page 10: IBM Watson Knowledge Studio

Creating an Annotator

• Knowledge curation (performed outside of WKS)

•Collect and maintain content relevant to a specific domain

• Ground truth generation

•Produce a collection of vetted data to train Watson on a specific domain

• Annotator component development

•Human annotations used to further train Watson

• Annotator component evaluation

•Determine which documents are promoted to ground truth

• Annotator component deployment

•Export model into machine-learning runtime environments

Page 11: IBM Watson Knowledge Studio

1 – Create a project

• Defines the resources required to create a machine-learning annotator

• training documents, type system, dictionaries, human annotations

Page 12: IBM Watson Knowledge Studio

2 – Create a type system

• Inventory of everything you want WKS to understand in unstructured text

•Mentions, entities, relations

Page 13: IBM Watson Knowledge Studio

3 – Add documents

• Documents that are representative of your domain content (i.e., the corpus)

• Create document sets and assign to human annotators

Page 14: IBM Watson Knowledge Studio

4 – Pre-annotate using dictionaries

• IBM Bluemix Analytics Exchange provides industry-specific dictionaries

that can be used to automatically annotate documents before human annotators work on them

• https://console.ng.bluemix.net/data/exchange
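As a rough illustration of dictionary-based pre-annotation, the sketch below marks every occurrence of a dictionary term in a document. It is a simplified stand-in, not the WKS pre-annotator: the `pre_annotate` function, the `CarPart` entity type, and the sample sentence are assumptions, while "airbag" and "child restraint system" are the mention examples from the terminology slide.

```python
import re


def pre_annotate(text, dictionary):
    """Return sorted (start, end, surface, entity_type) tuples for dictionary hits.

    `dictionary` maps an entity type to a list of surface forms.
    Matching is case-insensitive and restricted to whole words.
    """
    hits = []
    for entity_type, terms in dictionary.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                hits.append((match.start(), match.end(), match.group(), entity_type))
    return sorted(hits)


dictionary = {"CarPart": ["airbag", "child restraint system"]}
document = "The airbag failed and the child restraint system came loose."
hits = pre_annotate(document, dictionary)
```

Human annotators would then confirm, correct, or extend these automatic hits rather than starting from unmarked text.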

Page 15: IBM Watson Knowledge Studio

5 – Annotate documents

• Human annotators use the Ground Truth Editor to apply type system

labels to unstructured text

• Multiple users will perform this task across document sets

Page 16: IBM Watson Knowledge Studio

6 – Analyze results

• Inter-Annotator Agreement (IAA) scores can be used to determine

whether humans are annotating overlapping documents consistently

• Documents with a passing score are promoted to ground truth

Page 17: IBM Watson Knowledge Studio

7 – Create a machine learning annotator

• Select document sets that will be used to train the annotator

• Can only train using documents that have been promoted to ground truth
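The slides do not say how the Inter-Annotator Agreement score is computed or what counts as a passing score. The sketch below assumes an exact-match F1 between two annotators' span sets and a hypothetical threshold of 0.8, simply to show how agreement could gate which documents reach ground truth and therefore become eligible for training.

```python
def pairwise_f1(annotations_a, annotations_b):
    """F1 agreement between two annotators' sets of (start, end, label) spans.

    Annotator A is treated as the reference, so precision is measured
    against B's spans and recall against A's spans.
    """
    a, b = set(annotations_a), set(annotations_b)
    if not a and not b:
        return 1.0  # both annotators agree there is nothing to annotate
    matched = len(a & b)
    if matched == 0:
        return 0.0
    precision = matched / len(b)
    recall = matched / len(a)
    return 2 * precision * recall / (precision + recall)


PASSING_SCORE = 0.8  # hypothetical threshold, not a documented WKS default


def promote_to_ground_truth(doc_scores, threshold=PASSING_SCORE):
    """Return the ids of documents whose agreement score meets the threshold."""
    return [doc for doc, score in doc_scores.items() if score >= threshold]


annotator_a = {(0, 10, "PERSON"), (21, 24, "ORG")}
annotator_b = {(0, 10, "PERSON"), (21, 24, "ORG"), (26, 28, "PERSON")}
score = pairwise_f1(annotator_a, annotator_b)
```

Exact span matching is the strictest choice; a more forgiving variant could count overlapping spans with the same label as agreement.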

Page 18: IBM Watson Knowledge Studio

End-to-end domain adaptation

Page 19: IBM Watson Knowledge Studio

Watson Knowledge Studio trial

• http://ibm.biz/ibmwatsonknowledgestudio

• Free 30-day trial

– 5 authorized users

– 10 projects

– Leverage artifacts from IBM Analytics Exchange

– Deploy models directly to the Watson Developer Cloud

Page 20: IBM Watson Knowledge Studio

AlchemyLanguage

• A collection of APIs that offer text analysis using NLP

• Helps you understand sentiment, keywords, entities,

concepts, and more

• Available via the Watson Developer Cloud

• Knowledge domains

•By default, uses a public IBM provided model

– Trained using billions of English websites and news content

•May use custom domain models created using WKS

Page 21: IBM Watson Knowledge Studio

AlchemyLanguage: custom domain model

• In Bluemix:

•Create AlchemyLanguage service instance using Advanced plan

•Obtain AlchemyAPI key

• In Watson Knowledge Studio:

•Create and train custom annotator model

•Deploy model to AlchemyLanguage service instance using API key

• In your application

•Based on the desired text analysis, choose the API(s) relevant to your app

•Append the following parameter/payload to API request(s):

model=name_of_model
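A minimal sketch of appending the `model` parameter to a request follows. Only `model=name_of_model` comes from the slide; the base URL is a placeholder, and the other parameter names (`apikey`, `text`, `outputMode`) are assumptions that should be checked against the AlchemyLanguage API reference. No request is actually sent.

```python
from urllib.parse import urlencode

# Placeholder endpoint: substitute the real AlchemyLanguage URL for the chosen API.
BASE_URL = "https://alchemy.example.invalid/entities"


def build_request_url(text, api_key, model=None):
    """Build a query URL, adding `model` only when a custom model is requested."""
    params = {
        "apikey": api_key,     # assumed parameter name
        "text": text,          # assumed parameter name
        "outputMode": "json",  # assumed parameter name
    }
    if model is not None:
        params["model"] = model  # parameter named on the slide
    return BASE_URL + "?" + urlencode(params)


public_url = build_request_url("Ford recalled the airbag", "YOUR_API_KEY")
custom_url = build_request_url("Ford recalled the airbag", "YOUR_API_KEY",
                               model="name_of_model")
```

Leaving `model` out corresponds to the default public model; supplying it selects the custom model deployed from Watson Knowledge Studio.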

Page 22: IBM Watson Knowledge Studio

AlchemyLanguage API demo

• URL: https://alchemy-language-demo.mybluemix.net/

{
  "count": "2",
  "emotions": {
    "anger": "0.396646",
    "disgust": "0.602397",
    "fear": "0.502285",
    "joy": "0.020129",
    "sadness": "0.074501"
  },
  "sentiment": {
    "score": "-0.185101",
    "type": "negative"
  },
  "text": "Ford",
  "type": "MANUFACTURER"
}

• Review API results using different models

•Public: understands general entities such as “automobile”

•Custom: trained on traffic incident reports and will find specific entities such as part of car, accident outcome, or impact

Sentiment analysis on custom entities
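The entity record in the demo output encodes every score as a string, so an application has to convert them before comparing values. The sketch below parses that record with the standard `json` module and picks the strongest emotion; the variable names are illustrative, and the payload is copied from the slide.

```python
import json

# Entity record as shown in the demo output on this slide.
RESPONSE = """
{
  "count": "2",
  "emotions": {
    "anger": "0.396646",
    "disgust": "0.602397",
    "fear": "0.502285",
    "joy": "0.020129",
    "sadness": "0.074501"
  },
  "sentiment": {"score": "-0.185101", "type": "negative"},
  "text": "Ford",
  "type": "MANUFACTURER"
}
"""

entity = json.loads(RESPONSE)

# Scores arrive as strings, so convert them to floats before comparing.
emotions = {name: float(score) for name, score in entity["emotions"].items()}
dominant_emotion = max(emotions, key=emotions.get)
sentiment_score = float(entity["sentiment"]["score"])

print(f'{entity["text"]} ({entity["type"]}): '
      f'{entity["sentiment"]["type"]} sentiment, strongest emotion {dominant_emotion}')
```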

Page 23: IBM Watson Knowledge Studio

Watson Knowledge Studio

http://ibm.biz/ibmwatsonknowledgestudio