Artificial Intelligence Research Center
Pereslavl-Zalessky, Russia
Program Systems Institute, RAS
Lines of research:
Knowledge-based Dynamic Systems
Computer Linguistics: Information Extraction, Information Retrieval, Text Categorization
Image Analysis of Data
Nested Petri Nets
Miracle PS
A program system providing tools for designing intelligent systems
System Architecture
Control over docking of a space vehicle with the orbital station
Control System Model: docking parameters (restrictions); analytical description of control zones; ship conditions database; ship model; station model; a set of goals; a system of rules; planned trajectory.
Main control zones and the boundaries between them
Main goals: approaching; divergence; contact with the station with minimal damage.
Subgoals: finding the station; approaching; hovering; flyby.
Interface
Research Prototype
Visualization Module
SIRIUS
Intelligent Meta-Search System
Sirius is a meta-search system with a multi-agent environment for distributed computation and a powerful linguistic module for text analysis.
Features of the Sirius system
Extension of standard keyword search mechanisms
Query input in natural language
Use of semantic text processing methods
Automatic inclusion of new information sources
Increased search accuracy
Use of parallel computation
Example of a search query
Query = “The President has arrived to Bruxelles”
The semantic relation DIR(X, Y) states that Y is the direction of movement of X (the role of X is «subject», the role of Y is «directive»):
DIR(President, Bruxelles)
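One way to hold such a relation in a program is as a small record that keeps the relation name, its arguments, and the role of each argument. The sketch below is only an illustration: the class `SemanticRelation` and the helper `direction` are hypothetical names, not part of Sirius.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SemanticRelation:
    """A named relation over arguments, each argument carrying a role (hypothetical type)."""
    name: str
    args: Tuple[str, ...]
    roles: Tuple[str, ...]

    def role_of(self, arg: str) -> str:
        # Look up the role of an argument by its position in the relation.
        return self.roles[self.args.index(arg)]

    def __str__(self) -> str:
        return f"{self.name}({', '.join(self.args)})"


def direction(subject: str, directive: str) -> SemanticRelation:
    """Build DIR(X, Y): Y is the direction of movement of X."""
    return SemanticRelation("DIR", (subject, directive), ("subject", "directive"))


rel = direction("President", "Bruxelles")
print(rel)                        # DIR(President, Bruxelles)
print(rel.role_of("Bruxelles"))   # directive
```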
Relevance calculation
Relevance is calculated from:
Semantic roles
Semantic connections
Keywords
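The slides do not say how the three components are combined. As a rough sketch only, assume each component is scored by the fraction of query items found in the document and the scores are combined as a weighted sum. The weights, the `overlap` measure, and the dictionary layout below are all assumptions made for illustration, not the Sirius formula.

```python
# Hypothetical weights for the three relevance components.
WEIGHTS = {"roles": 0.4, "connections": 0.4, "keywords": 0.2}


def overlap(query_items, doc_items):
    """Fraction of query items that also occur in the document (0.0 for an empty query)."""
    query_items = set(query_items)
    if not query_items:
        return 0.0
    return len(query_items & set(doc_items)) / len(query_items)


def relevance(query, doc, weights=WEIGHTS):
    """Weighted sum of per-component overlaps (an assumed combination rule)."""
    return sum(w * overlap(query[key], doc[key]) for key, w in weights.items())


query = {
    "roles": {("President", "subject"), ("Bruxelles", "directive")},
    "connections": {("DIR", "President", "Bruxelles")},
    "keywords": {"president", "arrive", "bruxelles"},
}
# A document that matches on roles, connections and keywords.
doc_semantic = dict(query)
# A document that shares only two of the three keywords.
doc_keywords_only = {
    "roles": set(),
    "connections": set(),
    "keywords": {"president", "bruxelles"},
}

score_semantic = relevance(query, doc_semantic)
score_keywords = relevance(query, doc_keywords_only)
print(score_semantic > score_keywords)  # True
```

Under these assumed weights, a document that only shares keywords with the query scores well below one that also matches the semantic roles and connections.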
INEX: Tools for Information Extraction
Artificial Intelligence Research Centre
Program Systems Institute
Russian Academy of Sciences
152020 Pereslavl-Zalessky
Russia
+7 08535
[email protected]
Information extraction
Objective: extract meaningful information of a pre-specified type from (typically large amounts of) texts for further analytical purposes
Output: data structures of a pre-specified format (filled scenario templates)
Possible IE application scenarios:
inference of new information (knowledge acquisition)
query formulation and answering in human-computer systems
automatic generation of abstracts and summaries
visualization of document content, etc.
Named entity recognizer
identifies proper names
assigns semantic features to certain items
Information extraction rules
a domain knowledge representation formalism (scenario templates)
a set of patterns to identify template elements in a text (covering the many possible ways to talk about the target event elements)
IE pattern includes:
a set of rules that define how to retrieve this pattern in a text
a set of constraints imposed on textual elements to fit into a particular slot of the target template
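The two parts of a pattern can be sketched as a retrieval rule plus per-slot constraints. In the toy example below the rule is a regular expression and the constraint is a lookup in a small place list. The class `IEPattern`, the "arrival" scenario, and the `KNOWN_PLACES` list are hypothetical and stand in for whatever rule formalism INEX actually uses.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class IEPattern:
    rule: str                                      # how to find the pattern in a text
    constraints: Dict[str, Callable[[str], bool]]  # what a slot filler must satisfy

    def apply(self, text: str) -> Optional[Dict[str, str]]:
        """Return the filled slots, or None if the rule or a constraint fails."""
        match = re.search(self.rule, text)
        if match is None:
            return None
        slots = match.groupdict()
        for slot, check in self.constraints.items():
            if not check(slots[slot]):
                return None
        return slots


# Hypothetical list of place names used as a slot constraint.
KNOWN_PLACES = {"Bruxelles", "Moscow"}

arrival = IEPattern(
    rule=r"(?:The\s+)?(?P<person>[A-Z]\w+)\s+(?:has\s+)?arrived\s+(?:to|in|at)\s+(?P<place>[A-Z]\w+)",
    constraints={"place": lambda value: value in KNOWN_PLACES},
)

print(arrival.apply("The President has arrived to Bruxelles"))
# {'person': 'President', 'place': 'Bruxelles'}
print(arrival.apply("The Minister arrived in Atlantis"))
# None: the rule matches, but the place constraint rejects the filler
```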
Coreference Resolver
recognizes different occurrences of the same entity in a text
Merging partial results
merging partially filled templates to produce a final, maximally filled template
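A minimal sketch of this step, assuming templates are dictionaries from slot names to values, that an unfilled slot is `None`, and that all partial templates are already known to describe the same event. The function name and the "first value wins" rule for conflicts are assumptions for illustration.

```python
def merge_templates(partials):
    """Merge partially filled templates into one maximally filled template.

    A slot takes the first non-empty value seen; slots that are never
    filled stay None.
    """
    merged = {}
    for template in partials:
        for slot, value in template.items():
            if merged.get(slot) is None:
                merged[slot] = value
    return merged


partials = [
    {"person": "President", "place": None, "date": None},
    {"person": None, "place": "Bruxelles", "date": None},
]
merged = merge_templates(partials)
print(merged)
# {'person': 'President', 'place': 'Bruxelles', 'date': None}
```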
Text categorization system
The goal of text categorization is to classify documents into a certain number of predefined categories, or classes. Each document may fall into one category, more than one, or none at all. When machine learning is used for text categorization, the goal is to train classifiers on a training set (a set of category-labeled documents).
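The multi-label setting can be illustrated with one binary scorer per category, trained on a tiny labeled set. The scoring rule below (difference of document frequencies in positive and negative examples) is a deliberately crude stand-in chosen for brevity, not the classifier used by the system, and the training texts are invented.

```python
from collections import Counter


def train(labeled_docs):
    """Learn one term-weight table per category from (text, labels) pairs."""
    categories = set().union(*(labels for _, labels in labeled_docs))
    models = {}
    for cat in categories:
        pos, neg = Counter(), Counter()
        n_pos = n_neg = 0
        for text, labels in labeled_docs:
            terms = set(text.lower().split())
            if cat in labels:
                pos.update(terms)
                n_pos += 1
            else:
                neg.update(terms)
                n_neg += 1
        # Weight = share of positive docs containing the term minus share of negative docs.
        models[cat] = {
            t: pos[t] / max(n_pos, 1) - neg[t] / max(n_neg, 1)
            for t in set(pos) | set(neg)
        }
    return models


def classify(models, text, threshold=0.0):
    """Return every category whose summed term weights exceed the threshold."""
    terms = set(text.lower().split())
    return {
        cat for cat, weights in models.items()
        if sum(weights.get(t, 0.0) for t in terms) > threshold
    }


training_set = [
    ("stocks fell on the market", {"finance"}),
    ("the team won the match", {"sport"}),
    ("the club sold shares on the market", {"finance", "sport"}),
    ("rain is expected tomorrow", set()),
]
models = train(training_set)
print(classify(models, "market stocks fell"))  # {'finance'}
print(classify(models, "rain tomorrow"))       # set()
```

Because each category is decided independently, a document can receive one label, several labels, or none.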
Features
Both one-word and multi-word terms are used for text categorization. Extraction of multi-word terms is based on partial syntactic analysis of texts.
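As a rough illustration of term extraction by partial syntactic analysis, the sketch below takes tokens that are already part-of-speech tagged and collects maximal adjective/noun runs that end in a noun. The tag names, the chunking rule, and the hand-tagged input are assumptions; the actual analysis used by the system is not described in the slides.

```python
def extract_terms(tagged_tokens):
    """Collect maximal ADJ/NOUN runs ending in a noun as candidate terms."""
    terms, current = [], []
    # A sentinel token closes the last run.
    for token, tag in list(tagged_tokens) + [("", "END")]:
        if tag in ("ADJ", "NOUN"):
            current.append((token, tag))
            continue
        # The run is over: drop trailing adjectives so the term ends on a noun.
        while current and current[-1][1] != "NOUN":
            current.pop()
        if current:
            terms.append(" ".join(tok for tok, _ in current))
        current = []
    return terms


# Hand-tagged input; a real system would obtain the tags from a tagger.
tagged = [
    ("partial", "ADJ"), ("syntactic", "ADJ"), ("analysis", "NOUN"),
    ("of", "ADP"), ("texts", "NOUN"),
]
terms = extract_terms(tagged)
print(terms)  # ['partial syntactic analysis', 'texts']
```

The same pass yields both a multi-word term and a one-word term.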
Conventional statistics-based term weighting is enhanced by taking into account different types of term occurrence in a document.
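The slides do not list the occurrence types. As an assumed example, the sketch below distinguishes occurrences in the title from occurrences in the body and lets a title occurrence count more in an otherwise standard TF-IDF weight. The field names and the multipliers in `OCCURRENCE_WEIGHTS` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical multipliers for different types of term occurrence.
OCCURRENCE_WEIGHTS = {"title": 3.0, "body": 1.0}


def weighted_tf(doc):
    """Term frequency in which each occurrence counts according to where it appears."""
    tf = Counter()
    for field, text in doc.items():
        for term in text.lower().split():
            tf[term] += OCCURRENCE_WEIGHTS[field]
    return tf


def tf_idf(docs):
    """Occurrence-weighted TF multiplied by log(N / document frequency)."""
    n = len(docs)
    tfs = [weighted_tf(doc) for doc in docs]
    df = Counter(term for tf in tfs for term in tf)
    return [
        {term: weight * math.log(n / df[term]) for term, weight in tf.items()}
        for tf in tfs
    ]


docs = [
    {"title": "petri nets", "body": "nested petri nets model agents"},
    {"title": "text search", "body": "search agents index text"},
]
weights = tf_idf(docs)
# "petri" occurs once in the title and once in the body of the first document,
# so it outweighs "nested", which occurs only once in the body.
print(weights[0]["petri"] > weights[0]["nested"])  # True
```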