IntelliSemantic - Second generation semantic technologies for patents


Upload: alberto-ciaramella

Post on 15-Apr-2017


MyIntelliPatent

Second generation semantic technologies for patent analysis

Alberto Ciaramella - IntelliSemantic

Marco Ciaramella - IntelliSemantic

PATINFO 2015 - 10/6/2015 Ilmenau

This presentation

Semantic technologies in patent solutions are sometimes controversial.

The first part of this presentation provides a framework and a fair overview of what exists now, and anticipates some coming evolutions, which belong to second generation semantic technologies.

The second part of this presentation provides IntelliSemantic specific examples.


Content

IntelliSemantic: an introduction

Patent tasks, phases and technologies

Second generation semantic technologies

Semantic technologies demos

Embedded in MyIntelliPatent

TOPAS originated technologies

Conclusions and follow-up


Status

Introduction (ok)

Patent tasks (almost ok, but still requires 2 slides to conclude)

Second generation semantic technologies (the most significant section: it has to be substantially rewritten and simplified)

MyIntelliPatent with semantics and demo (ok)

Other semantic technologies and demo (not difficult to do)

Conclusions (not difficult to do)

Speakers

Alberto Ciaramella background:

Intellisemantic CEO/founder in 2005.

Before that:

Researcher and Research Manager at CSELT, the research branch of Telecom Italia, for speech and Natural Language Processing.

Competitive Intelligence Manager at Loquendo SpA, the CSELT spin-off for speech and language processing.

Marco Ciaramella background:

Intellisemantic product manager since 2009.

Before that:

Project officer at Engineering.

Technology consultant at HP.

IntelliSemantic

This slide has been produced initially for the PDG meeting, but it is now more precise

IntelliSemantic

The company

Solutions

The patent information challenge

IntelliSemantic

Founded in 2005, in Torino

in the incubator of the Politecnico di Torino.

Competences: natural language processing.

Research activities:

partner of the FP7 co-funded TOPAS project (Tool Platform for Patent Analysis and Summarization).

R&D internal activities for MyIntelliPatent.

partner of some Piemonte or Veneto region cofunded research projects for open data and NLP.

IntelliSemantic

I think it is polite to introduce ourselves. It is not easy to summarize a company in one slide, but we will try: this is IntelliSemantic in a slide.

The obvious difficult questions are about the company size, the market adoption of IntelliSemantic solutions and so on: be ready to answer!

IntelliSemantic solutions


On the information side, the number of patents worldwide is continuously increasing, and so is the effort required for any kind of patent-related task.

On the user side, the number of companies whose business can be affected by patent information is increasing and now also includes a significant percentage of SMEs, which can be even tighter on costs.

But if patent analyses are performed less frequently or less deeply than required, a company can incur:

higher costs, if it fails to spot in time a competitor that can invalidate its research efforts.

fewer benefits, if it does not have the time to extract hidden suggestions from the patent literature.

The patent information challenge

Since the number of patents to monitor is continuously increasing, the effort required for any kind of patent analysis is increasing as well. Another factor which increases this effort is the growing variety of relevant languages, with an increasing number of patents available only in languages other than English. This is a supplier-side factor. On the user side, an increasing number of companies, even SMEs, need to perform these analyses.

A solution to this challenge is to deliver smarter tools which allow professionals to concentrate on the higher value-added part of their work.

Smarter tools can include features as:

Patent specific knowledge management, to:

learn, accumulate, and reuse the knowledge of the company's professionals.

provide a structured approach for different use cases.

Intelligent language technologies to automatically extract the knowledge embedded in the text, such as the most relevant entities and passages, and to identify the patent document structure as well.

How to solve this challenge

The focus is clearly on company professionals. In any case, we have to be ready for the likely question: what about external consultants?

Patent tasks, phases and technologies

Tasks

Phases

Technologies

Semantic technologies in more detail

Patent informatics solutions

Patent informatics solutions can be categorized according to three different dimensions:

tasks.

interaction phases.

technologies used.

This framework is useful:

to compare solutions.

to identify the potential benefits of a new technology on different tasks and interaction phases.


Tasks

monitoring:

new published applications, status evolutions of already known documents.

searches:

Prior art, validity, freedom to operate.

analyses:

Technologies, competitors.


Interaction phases

query:

by metadata, by text, by reference.

patent set results analysis:

extracts distributions (e.g. by applicant).

identifies correlations.

ranks documents to analyze in more detail.

single patent analysis:

identifies main sections.

identifies main topics.

navigates through topics and sections.


Phases: some conclusions

the query and the patent set results analysis are characterized by recall and precision:

recall = (relevant results found / total relevant results in the database).

precision = (relevant results found / total results found).

recall and precision typically trade off: a broader query raises recall but tends to lower precision, and vice versa.

a safe strategy is to maximize the recall of the query, then use precise and efficient technologies to analyze results.

the single patent analysis can become more efficient by using suitable technologies.
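The two measures defined above can be sketched in a few lines of Python. This is only an illustration: the function name and the patent identifiers are hypothetical and do not come from MyIntelliPatent.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one result set.

    retrieved: ids returned by the query
    relevant:  ids of all relevant documents in the database
    """
    retrieved, relevant = set(retrieved), set(relevant)
    found = retrieved & relevant  # relevant results found
    recall = len(found) / len(relevant) if relevant else 0.0
    precision = len(found) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical patent ids, for illustration only.
relevant = {"P1", "P2", "P3", "P4"}
narrow_query = ["P1", "P2"]
broad_query = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]

print(recall_precision(narrow_query, relevant))  # (0.5, 1.0)
print(recall_precision(broad_query, relevant))   # (1.0, 0.5)
```

In this toy case the broad query reaches full recall at the price of halved precision, which is the situation in which a precise analysis of the result set pays off.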


Tasks and phases: requirements


Technology generations (1)

based on metadata only:

e.g. querying by IPC and applicant.

text-based, the most popular of which are:

boolean, e.g. querying by "speaker recognition" AND "hidden Markov models". Results are either included or not.

vector based, i.e. by comparing the term vector of the query with the term vector of each result. Results are ranked.

vector based with term dependencies. A noteworthy example is Latent Semantic Analysis. Results can be clustered.
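The difference between the boolean and the vector-based approach can be illustrated with a minimal sketch. It assumes plain whitespace tokenization, single-word AND terms and raw term counts compared by cosine similarity; the documents are invented, and real systems use more refined weighting.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def boolean_and(query_terms, doc_tokens):
    """Boolean AND: the document is either included or not."""
    return all(term in doc_tokens for term in query_terms)


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented documents, for illustration only.
docs = {
    "D1": "speaker recognition with hidden markov models",
    "D2": "speaker recognition with neural networks",
    "D3": "engine piston cooling",
}
query = "speaker recognition markov"

q_terms = tokenize(query)
included = [d for d, text in docs.items()
            if boolean_and(q_terms, set(tokenize(text)))]

q_vec = Counter(q_terms)
ranked = sorted(docs,
                key=lambda d: cosine(q_vec, Counter(tokenize(docs[d]))),
                reverse=True)

print(included)  # ['D1']
print(ranked)    # ['D1', 'D2', 'D3']
```

The boolean query discards D2 because one term is missing, while the vector-based ranking still returns it in second position.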


Some more details are provided in http://en.wikipedia.org/wiki/Information_retrieval , which actually lists 3*3 cases. In any case, in our slide we have included only the three most popular methods.

Technology generations (2)

vector-based technologies with term interdependencies have been called semantic technologies in patent informatics, since Latent Semantic Analysis (LSA) is the most popular in this class.

LSA clusters co-occurring terms, hence it simulates an intelligent behaviour.

these technologies are more typically focused on recall.


Technology generations (3)

second generation semantic solutions can be defined as those having at least one of these characteristics:

to be user controllable, e.g. by relying on user-defined lexicons.

to include patent-specific algorithms, e.g. patent segmentation.

these technologies are more typically focused on precision.


Technologies: conclusions

We ordered technologies by time, without implying that a technology is superior to others simply because it is the most recent, or that an older technology is to be deprecated.

Technologies of different generations can coexist in the same applications:

for different tasks and phases

for different objectives, like to increase the recall or to increase the precision.

Define your requirements first, then select a technology, but:

A new technology can suggest you new requirements.


Second generation semantic technologies

A taxonomy

Technologies and tasks enabled


A high level functions taxonomy


entities extraction:

Generic entities or tags.

Qualified entities: i.e. entities of a specific type only, such as measurements, substances, or methods.

entities relationships identification:

short range: to relate entities in the same sentence.

long range: to relate claims and description.

patent structure identification:

the patent is a structured text.

the role of an entity is section specific, hence different in prior art or in claims.


Technologies and application


The technologies mentioned in the previous slide can be applied very differently, since they can be used:

for different phases.

stand alone or in combination.

for enhancing a manual or an automatic process.

The most important issue for selecting these technologies is:

to figure out their advantages on the application side.

to select the most appropriate combination of application and technology.

Generic entity (or tag) extractor

a tag is a word (e.g. inductor) or a sequence of words (e.g. speaker verification) having a well defined meaning.

from the implementation point of view we have to distinguish two phases:

the off line annotation.

the real time user interactions with annotated documents.

this also applies to other technologies mentioned in the following.
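The two implementation phases can be sketched as follows. This is a minimal illustration, assuming exact, case-insensitive matching of tags against the text; the function names, documents and vocabulary are hypothetical and are not the MyIntelliPatent implementation.

```python
import re
from collections import defaultdict


def annotate(documents, vocabulary):
    """Off-line phase: build an index tag -> set of document ids."""
    # Word boundaries avoid matching a tag inside a longer word.
    patterns = {
        tag: re.compile(r"\b" + re.escape(tag) + r"\b", re.IGNORECASE)
        for tag in vocabulary
    }
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for tag, pattern in patterns.items():
            if pattern.search(text):
                index[tag].add(doc_id)
    return index


def documents_with(index, tag):
    """Real-time phase: a plain lookup, no text processing."""
    return sorted(index.get(tag, set()))


# Invented documents and vocabulary, for illustration only.
docs = {
    "P1": "A speaker verification system using an inductor.",
    "P2": "The inductor is coupled to a capacitor.",
    "P3": "Speaker Verification by voice.",
}
index = annotate(docs, ["inductor", "speaker verification"])

print(documents_with(index, "inductor"))              # ['P1', 'P2']
print(documents_with(index, "speaker verification"))  # ['P1', 'P3']
```

The costly text scan happens once, off line; the user interaction then only reads the index, which is why it can run in real time.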


Examples of applications enabled

to build up topic-specific vocabularies from topic-specific patent set collections.

for queries: to extract the most relevant topics in a patent and suggest them to the user in tasks like validity search and prior art search.

for patent set analyses:

to identify patents citing the same topics.

to score patents by topics richness.

to identify topics distribution (by applicant, by year).

for single patent analysis: to navigate a patent document through the same topic.
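Once patents are annotated with tags, the patent set analyses listed above reduce to simple operations on the annotations. The sketch below assumes that "topic richness" is just the number of distinct tags in a patent, which is one possible definition among others; the data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical annotated patent set: id -> (applicant, set of tags).
patents = {
    "P1": ("Acme", {"inductor", "capacitor", "filter"}),
    "P2": ("Acme", {"inductor"}),
    "P3": ("Beta", {"speaker verification", "filter"}),
}


def citing_same_topic(patents, tag):
    """Patents that mention the given tag."""
    return sorted(pid for pid, (_, tags) in patents.items() if tag in tags)


def rank_by_richness(patents):
    """Order patents by number of distinct tags (assumed richness score)."""
    return sorted(patents, key=lambda pid: len(patents[pid][1]), reverse=True)


def tag_distribution_by_applicant(patents, tag):
    """Number of patents per applicant that mention the given tag."""
    return Counter(app for app, tags in patents.values() if tag in tags)


print(citing_same_topic(patents, "filter"))                      # ['P1', 'P3']
print(rank_by_richness(patents))                                 # ['P1', 'P3', 'P2']
print(dict(tag_distribution_by_applicant(patents, "inductor")))  # {'Acme': 2}
```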


Qualified entities

Measurements, which can include:

physical unit (e.g. volt) and prefix (e.g. milli).

numbers (e.g. 10) and numerals (e.g. ten).

closed intervals (e.g. between 1 and 2 nm).

open intervals (e.g. up to 1 nm).

tolerance values.

Citations of patents and non patent literature.

Substances, such as aluminium.

Processes, such as redundancy control.

Technical qualities, such as piston speed.
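A measurement extractor covering some of the cases listed above (numbers and numerals, prefix and unit, closed and open intervals) can be sketched with a regular expression. This is a toy example under strong assumptions: only a handful of numerals, prefixes and units are listed, tolerance values are not handled, and it is not the TOPAS implementation.

```python
import re

NUMERALS = {"one": 1, "two": 2, "five": 5, "ten": 10}  # illustrative subset
NUM = r"\b(?:\d+(?:\.\d+)?|" + "|".join(NUMERALS) + r")"
PATTERN = re.compile(
    r"(?:between\s+(?P<low>" + NUM + r")\s+and\s+|(?P<upto>up\s+to)\s+)?"
    r"(?P<value>" + NUM + r")\s*"
    r"(?P<prefix>[mkn])?(?P<unit>V|m|A)\b"  # assumed subset of prefixes and units
)


def to_number(token):
    """Convert a digit string or a known numeral to a float."""
    return float(NUMERALS[token]) if token in NUMERALS else float(token)


def extract_measurements(text):
    """Return a list of (kind, low, high, prefix, unit) tuples."""
    results = []
    for m in PATTERN.finditer(text):
        value = to_number(m.group("value"))
        if m.group("low"):
            kind, low, high = "closed", to_number(m.group("low")), value
        elif m.group("upto"):
            kind, low, high = "open", None, value
        else:
            kind, low, high = "single", value, value
        results.append((kind, low, high, m.group("prefix") or "", m.group("unit")))
    return results


text = "a thickness between 1 and 2 nm, a gap of up to 1 nm, and a supply of ten mV"
for measurement in extract_measurements(text):
    print(measurement)
```

Keeping the prefix and the unit as separate fields leaves the conversion to a common scale to a later step, so that patents can be compared on ranges rather than on strings.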


Examples of applications enabled

for patent set analyses:

to identify patents mentioning specific measurements and ranges.

to categorize patents as more related to substances, to methods, or to a combination of both.


Structure identification functions

to identify the structure of the description:

first level: such as technical field, background art, summary of invention, description of drawings, preferred embodiment.

second level: such as preferred embodiments.

third level: such as objective, advantages.

to identify the structure of claims:

inter-claim: such as dependent and independent claims.

intra-claim: such as preamble, transition, aspects.
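Two of these functions, first-level segmentation of the description and inter-claim structure, can be sketched as follows. The sketch assumes that section headings appear on their own line with a known wording and that dependent claims refer to another claim with a few fixed phrases; real patents vary much more, and the names below are hypothetical.

```python
import re

# Assumed first-level headings; actual wording and language vary by patent.
HEADINGS = ["technical field", "background art", "summary of invention",
            "description of drawings", "preferred embodiment"]


def segment_description(text):
    """Split a description into {heading: body}, using heading lines as boundaries."""
    sections, current = {}, None
    for line in text.splitlines():
        key = line.strip().lower()
        if key in HEADINGS:
            current = key
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return {heading: " ".join(body) for heading, body in sections.items()}


CLAIM_REF = re.compile(r"\b(?:according to|as claimed in|of) claim (\d+)",
                       re.IGNORECASE)


def claim_dependencies(claims):
    """Map claim number -> referenced claim number, or None if independent."""
    deps = {}
    for number, text in claims.items():
        match = CLAIM_REF.search(text)
        deps[number] = int(match.group(1)) if match else None
    return deps


description = (
    "Technical field\n"
    "The invention relates to filters.\n"
    "\n"
    "Background art\n"
    "Known filters are bulky.\n"
    "They are also costly.\n"
)
claims = {
    1: "A filter comprising an inductor.",
    2: "The filter according to claim 1, wherein the inductor is planar.",
    3: "The filter of claim 2, further comprising a capacitor.",
}

print(segment_description(description))
print(claim_dependencies(claims))  # {1: None, 2: 1, 3: 2}
```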


Examples of applications enabled

patent segmentation only:

patent sets analyses: to select specific patent sections, as background art, and compare them.

single patent analysis: to build a patent document directory, which can facilitate the patent document navigation.

combined with entity extraction:

these technologies combine naturally, since the meaning of an entity can depend on the patent segment.


MyIntelliPatent and semantics

MyIntelliPatent

Structured interaction

Tags

Tasks supported

Demo


MyIntelliPatent

A smart solution for patent intelligence tasks.

MyIntelliPatent includes the company-specific knowledge, since it is provided as a password-protected Software as a Service and repository. A company can build and access its specific vocabularies, patent sets and patent annotations.

MyIntelliPatent supports structured interactions, as detailed in the following.

MyIntelliPatent includes intelligent language technologies, as detailed in the following.

Fine. In any case, it is the main slide and it could be partially rewritten.

Rather than as a personalized solution, it is better to present it as a knowledge management system enhanced with natural language features.

Structured interaction

Queries by metadata, by a reference patent, by a reference text or even by a patent list.

A first level results analysis through QuickView.

A second level analysis and statistics including metadata through Search/Statistics.

A third level analysis and statistics including tags through Tag and Search/Statistics.

This slide is nice, but a little too complicated here. It is better to replace it with the table in the quick user manual.


Second level analysis by Search/Statistics

The Search/Statistics page allows the user to identify the most relevant patents (by family size, by citations) or the most interesting ones (by applicant), to extract different kinds of result tables, to order these results by different criteria, and to extract statistics. The examples shown here are based on metadata only. Tags allow more refined analyses, as shown in the following slide.

This is just a slide presenting a screenshot of a very interesting, although specific, feature, which emphasizes the importance of tagging, an important feature of MyIntelliPatent.

Other more general screenshots could be provided, such as the screenshot presenting the ordered list of results. In any case we prefer not to add other screenshots, since:

They can evolve with the evolution of the product, hence we could have an additional problem in the maintenance of this presentation.

This presentation is typically followed by the product presentation, in which these details are more appropriate.

Linguistic intelligence: Tags

A tag is a word (e.g. inductor) or a sequence of words (e.g. speaker verification) having a well defined meaning.

Tags are a distinguishing feature in MyIntelliPatent.

MyIntelliPatent can:

suggest a topic specific vocabulary from a set of topic specific patents.

allow the user to edit this suggested vocabulary.

apply the finally edited vocabulary to all collections, in such a way that vocabulary tags in a patent become new text-specific metadata.

different topic specific vocabularies can be present in the same platform.
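The suggest, edit and apply cycle described above can be sketched as follows. The scoring is an assumption made for this illustration only: single words are ranked by how much more often they occur in the topic-specific set than in a hypothetical background set. The presentation does not state how MyIntelliPatent computes its suggestions, and multi-word tags are not covered here.

```python
import re
from collections import Counter


def suggest_vocabulary(topic_texts, background_texts, top_n=2):
    """Suggest words far more frequent in the topic set than in a background set."""
    def doc_freq(texts):
        counts = Counter()
        for text in texts:
            counts.update(set(re.findall(r"[a-z]+", text.lower())))
        return counts

    topic, background = doc_freq(topic_texts), doc_freq(background_texts)
    # Assumed score: topic document frequency, discounted by background frequency.
    scores = {term: n / (1 + background[term]) for term, n in topic.items()}
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_n]


def apply_vocabulary(collection, vocabulary):
    """Attach matching tags to each document as new text-derived metadata."""
    return {
        doc_id: sorted(tag for tag in vocabulary
                       if re.search(r"\b" + re.escape(tag) + r"\b", text.lower()))
        for doc_id, text in collection.items()
    }


# Invented texts, for illustration only.
topic_texts = [
    "the inductor is wound on a core",
    "an inductor and a capacitor form the filter",
    "the filter uses an inductor",
]
background_texts = [
    "the engine has a piston",
    "a piston and the valve",
    "an engine and a valve",
]

suggested = suggest_vocabulary(topic_texts, background_texts)
vocabulary = set(suggested) | {"capacitor"}  # user edit: one tag added by hand

collection = {
    "P1": "An inductor and a capacitor form the filter.",
    "P2": "The engine has a piston.",
}
print(suggested)                                 # ['inductor', 'filter']
print(apply_vocabulary(collection, vocabulary))
```

The manual edit step is what makes the vocabulary user controllable: the suggestion is only a starting point, and the validated list is what gets applied to every collection.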


Motivate why tagging adds intelligence

Some examples of tags use


This slide will be rewritten merging this information with the information of the table in the quick introduction

Extracting a tags vocabulary

The Edit & Tag page allows the user to extract the most relevant tags from a set of patents, to analyze these suggested tags, to edit them, and to confirm the user-validated vocabulary of tags. The user can also copy and paste his/her own vocabulary.

First level analysis by QuickView


This level of analysis provides a quick view of the patent applicant, title, summary and extracted tags, which is a good proxy for how interesting the patent is for the user. In case of doubt, he/she can access the whole document directly from this page. This level of analysis can be enough for some tasks, such as quick prior art searches.


Tags in third level analysis: an example

Tags make it possible to identify the most relevant concepts in a patent and to extend the analysis based on metadata. This table summarizes the number of patents per year using a specific tag, and makes it possible to identify the first patents using a concept and the most popular concepts now.
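A table of this kind can be derived from annotated patents with a few lines of code. The sketch below uses invented data and hypothetical function names; it only illustrates the idea of counting patents per tag and year and of finding the earliest patents using a tag.

```python
from collections import defaultdict

# Hypothetical annotated patents: (id, publication year, tags).
patents = [
    ("P1", 2009, {"inductor"}),
    ("P2", 2012, {"inductor", "filter"}),
    ("P3", 2012, {"filter"}),
    ("P4", 2014, {"filter"}),
]


def tag_year_table(patents):
    """Return {tag: {year: number of patents}} with years in ascending order."""
    table = defaultdict(lambda: defaultdict(int))
    for _, year, tags in patents:
        for tag in tags:
            table[tag][year] += 1
    return {tag: dict(sorted(years.items())) for tag, years in table.items()}


def first_patents(patents, tag):
    """Ids of the earliest patents using the tag (empty list if none)."""
    tagged = [(year, pid) for pid, year, tags in patents if tag in tags]
    if not tagged:
        return []
    first_year = min(year for year, _ in tagged)
    return sorted(pid for year, pid in tagged if year == first_year)


print(tag_year_table(patents)["filter"])  # {2012: 2, 2014: 1}
print(first_patents(patents, "filter"))   # ['P2', 'P3']
```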

This slide can be retained as a useful example.


TOPAS demo

The TOPAS project, participants and results

A demo with Patent description and Measurements extraction

IntelliSemantic planned exploitation

TOPAS demo


This demo exemplifies some second generation semantic technologies not yet integrated in MyIntelliPatent.

This demo was developed by IntelliSemantic for the FP7 research project TOPAS (Tool Platform for Patent Analysis and Summarization), which is summarized in the following.

The EU co-funded research project TOPAS (Tool Platform for Patent Analysis and Summarization) studied, prototyped and tested some second generation semantic technologies for English, German and French.

TOPAS was a 24-month FP7 Capacities project, under grant agreement number FP7-SME-2011 286639, from October 2011 to September 2013.

TOPAS research project


The 5 TOPAS participants were Bruegman Software, IALE, IntelliSemantic, University of Stuttgart and University Pompeu Fabra.

Universities transferred all rights to the technologies.

SMEs have full ownership of TOPAS technologies and are mutually independent in their exploitation.

TOPAS had qualified advisors to provide feedback on the application side; among them we can mention EPO, Fraunhofer and some companies and consultants.

TOPAS participants


TOPAS prototyped and tested solutions for:

Qualified entities extraction.

Entities relationship identification.

Patent segmentation.

Patent summarization (not detailed here)

In English, German and French

An overview of the project results was recently published in WPI magazine, March 2015, in the paper "Towards content oriented patent document processing: intelligent patent analysis and summarization".

TOPAS results


Patent description first level

This screenshot exemplifies the first level patent segmentation used to analyze a patent set, e.g. to focus the analysis of results on specific sections, in this case the background art.


Large-grain patent description segmentation is performant and efficient as well, and it has to be pushed.

Measurement extraction


This screenshot exemplifies the measurement extraction used to analyze a single patent, i.e. to retrieve patent sections citing measurements and to extract the meanings of these measurements.


IntelliSemantic has further developed some of the TOPAS technologies and is ready to exploit them:

Integrated in new MyIntelliPatent releases, e.g. to extend patent set analyses and single patent analysis.

As technology engines to be integrated into the customer platform, to extend it with features like patent segmentation and qualified entities extraction:

this last solution is more suitable for advanced users, such as patent offices and big companies.

IntelliSemantic TOPAS exploitation


For more information

Visit us at stand 4 for more details about

MyIntelliPatent.

Other semantic technologies.

And/or:

Contact IntelliSemantic

e-mail [email protected]

tel. +39 011 9550 380

for a Web Conference presentation.


Status: ok

Comment.

This slide motivates the audience to visit the IntelliSemantic stand, since the solution is more feature rich than presented here, and at the same time it provides a German telephone number (not in this slide, since the German number presently costs too much).


Licence

This work is licenced under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence.

To view a copy of this licence visit:

http://creativecommons.org/licenses/by-nc-sa/3.0/
