ire major project - reasoning over knowledge base

Team 38
Mentor: Ganesh J
Akanksha Singh, 201505614
Vikram Ahuja, 201256040
Vishal Thamizharasan, 201302061
IRE Major Project: Reasoning over Knowledge Base

Problem Statement

The goal of the project is to introduce a model that can predict the likely truth of additional facts based on the existing facts in a knowledge base. Some facts are not explicitly stated in the knowledge base, yet they can be inferred from what is there and are often overlooked. The task is to surface these implicit facts.

What is Knowledge Base Representation?

A knowledge base is a representation of factual knowledge, traditionally in a graph-like structure. In a knowledge base, entities may be represented as nodes and relations as edges.

And The Problem Is?

Knowledge bases characteristically suffer from incompleteness in the form of missing edges, i.e., if A -> B and B -> C, then there should also be an edge A -> C. To overcome this, we need a method that predicts the likely truth of new facts with more complicated structure, in effect reasoning over known facts and inferring new ones.

Example

Given that the place of birth is Florence and the profession is historian, our model can accurately predict that Francesco Guicciardini's gender is male and his nationality is Italy. These might be inferred from two pieces of common knowledge: (i) Florence is a city in Italy; (ii) Francesco is a common name among males in Italy.

For the first fact, some relations such as Matteo Rosselli has location Florence and nationality Italy exist in the knowledge base, which might imply the connection between Florence and Italy. For the second fact, many other people, e.g., Francesco Patrizi, are listed as Italian or male in Freebase, which might imply that Francesco is a male or Italian name.

Neural Tensor Networks for Knowledge Base Completion

The goal is to predict whether two entities are related by a relation R. We use Neural Tensor Networks, which differ from regular neural networks in that they relate entities directly to one another through the bilinear tensor product, the core operation of a Neural Tensor Network.

In a Neural Tensor Network the confidence in a directed edge of type R from e1 to e2 is defined as:
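For reference, the scoring function as given in the Neural Tensor Network paper by Socher et al. (2013) is reproduced here. The symbols are assumptions taken from that paper rather than from these slides: e_1 and e_2 are d-dimensional entity vectors, f is tanh applied elementwise, W_R is a d x d x k tensor, V_R is a k x 2d matrix, and u_R and b_R are k-dimensional vectors, all specific to relation R.

$$
g(e_1, R, e_2) = u_R^{\top} \, f\!\left( e_1^{\top} W_R^{[1:k]} e_2 + V_R \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} + b_R \right)
$$

Each of the k slices of W_R yields one bilinear form e_1^T W_R^{[i]} e_2, so the tensor term is a k-dimensional vector.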

The model returns a high score if the two entities are in that relationship and a low score otherwise. This allows any fact, whether implicitly or explicitly mentioned in the database, to be answered with a certainty score.
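A minimal NumPy sketch of such a score may make this concrete. It assumes the standard Neural Tensor Network form from Socher et al. (2013): a bilinear tensor term plus a linear layer, passed through tanh and projected to a scalar. The function name `ntn_score`, the array shapes, and the random parameters are illustrative assumptions, not the project's actual code.

```python
import numpy as np

def ntn_score(e1, e2, W, V, b, u):
    """Score one (e1, R, e2) triplet for a single relation R.

    Assumed shapes: e1, e2 (d,), W (k, d, d), V (k, 2d), b (k,), u (k,).
    """
    # Bilinear tensor product: one value e1^T W[i] e2 per slice i.
    bilinear = np.einsum("i,kij,j->k", e1, W, e2)
    # Standard linear layer on the concatenated entity vectors.
    linear = V @ np.concatenate([e1, e2])
    hidden = np.tanh(bilinear + linear + b)
    return float(u @ hidden)

# Illustrative random parameters (untrained), d = 4 and k = 3.
rng = np.random.default_rng(0)
d, k = 4, 3
W = rng.normal(size=(k, d, d))
V = rng.normal(size=(k, 2 * d))
b = rng.normal(size=k)
u = rng.normal(size=k)
e1 = rng.normal(size=d)
e2 = rng.normal(size=d)
score = ntn_score(e1, e2, W, V, b, u)
```

With untrained parameters the score is meaningless; it only becomes a confidence after the parameters and entity vectors are learned.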

Entity Representation

Entities can be represented as some function of their constituent words, which provides for the sharing of statistical strength between similar entities.

For example, African Elephant and Asian Elephant.

We embed each word ("African", "Asian", "Elephant") and then build the representation of each entity as the average of its constituent word vectors.
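The averaging step can be sketched as follows. The helper `entity_vector` and the three-dimensional toy word vectors are hypothetical, chosen only to show how two entities that share the word "Elephant" end up with overlapping representations.

```python
import numpy as np

def entity_vector(name, word_vectors):
    """Average the vectors of the space-separated words in an entity name."""
    words = name.lower().split()
    return np.mean([word_vectors[w] for w in words], axis=0)

# Toy word embeddings (assumed values, for illustration only).
word_vectors = {
    "african": np.array([1.0, 0.0, 0.0]),
    "asian": np.array([0.0, 1.0, 0.0]),
    "elephant": np.array([0.0, 0.0, 1.0]),
}

african_elephant = entity_vector("African Elephant", word_vectors)
asian_elephant = entity_vector("Asian Elephant", word_vectors)
# The shared "elephant" component gives the two entities a nonzero dot product.
shared = float(african_elephant @ asian_elephant)
```

Because the word "elephant" contributes to both averages, anything learned about one entity through that word also moves the other.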

All models are trained with a contrastive max-margin objective function. The main idea is that each triplet in the training set should receive a higher score than a triplet in which one of the entities is replaced with a random entity.
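A rough sketch of this objective is given here, under stated assumptions: the margin is fixed at 1, the regularization term is omitted, and the corruption always replaces the second entity. The names `corrupt_triplet` and `max_margin_loss` are hypothetical, and the scores are made-up numbers standing in for model outputs.

```python
import numpy as np

def corrupt_triplet(triplet, entities, rng):
    """Replace the second entity with one drawn at random (assumed scheme)."""
    e1, rel, _ = triplet
    return (e1, rel, entities[rng.integers(len(entities))])

def max_margin_loss(true_scores, corrupt_scores, margin=1.0):
    """Sum of hinge terms max(0, margin - g(true) + g(corrupt))."""
    true_scores = np.asarray(true_scores, dtype=float)
    corrupt_scores = np.asarray(corrupt_scores, dtype=float)
    return float(np.sum(np.maximum(0.0, margin - true_scores + corrupt_scores)))

rng = np.random.default_rng(0)
entities = ["Florence", "Italy", "male", "historian"]
corrupted = corrupt_triplet(("Francesco Guicciardini", "nationality", "Italy"),
                            entities, rng)

# First pair already satisfies the margin; the second contributes 0.9.
loss = max_margin_loss([2.0, 0.5], [0.5, 0.4])
```

A pair contributes nothing once the true triplet outscores its corrupted version by at least the margin, so training effort concentrates on the triplets the model still confuses. Note that this simple sampler can occasionally draw the original entity again.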

Instead of using MCMC for inference and learning, we use standard forward propagation and backpropagation techniques modified for the NTN. Lastly, we do not require multiple embeddings for each entity. Instead, we consider the subunits (space separated words) of entity names.

THANK YOU
