LSJ1358 - Facilitating Document Annotation Using Content and Querying Value

Upload: swathi-manthena

Post on 19-Oct-2015


ABSTRACT

A large number of organizations today generate and share textual descriptions of their products, services, and actions. Such collections of textual data contain a significant amount of structured information, which remains buried in the unstructured text. While information extraction algorithms facilitate the extraction of structured relations, they are often expensive and inaccurate, especially when operating on top of text that does not contain any instances of the targeted structured information. We present a novel alternative approach that facilitates the generation of structured metadata by identifying documents that are likely to contain information of interest, where this information will subsequently be useful for querying the database. Our approach relies on the idea that humans are more likely to add the necessary metadata during creation time if prompted by the interface, and that it is much easier for humans (and/or algorithms) to identify the metadata when such information actually exists in the document, instead of naively prompting users to fill in forms with information that is not available in the document. As a major contribution of this paper, we present algorithms that identify structured attributes that are likely to appear within the document, by jointly utilizing the content of the text and the query workload. Our experimental evaluation shows that our approach generates superior results compared to approaches that rely only on the textual content or only on the query workload to identify attributes of interest.

[Figure: Architecture]

EXISTING SYSTEM:

Many systems, though, do not even have the basic attribute-value annotation that would make pay-as-you-go querying feasible. Existing work on query forms can be leveraged in creating the CADS adaptive query forms. One line of work proposes an algorithm to extract a query form that represents most of the queries in the database using the queriability of the columns, and extends it with a discussion of forms customization. Other work uses the schema information to auto-complete attribute or value names in query forms. In yet other work, keyword queries are used to select the most appropriate query forms.

PROPOSED SYSTEM:

In this paper, we propose CADS (Collaborative Adaptive Data Sharing platform), which is an annotate-as-you-create infrastructure that facilitates fielded data annotation. A key contribution of our system is the direct use of the query workload to direct the annotation process, in addition to examining the content of the document. In other words, we are trying to prioritize the annotation of documents towards generating attribute values for attributes that are often used by querying users.
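The idea of prioritizing attributes that querying users rely on can be illustrated with a minimal Java sketch. It assumes each past query is represented simply as the list of attribute names it filters on; the class name `WorkloadAttributeRanker` and its methods are hypothetical and are not taken from the CADS implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the original system.
final class WorkloadAttributeRanker {

    // Counts how often each attribute is mentioned across the query workload.
    // Assumption: a query is modelled as the list of attribute names it uses.
    static Map<String, Integer> countAttributeUse(List<List<String>> workload) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> query : workload) {
            for (String attribute : query) {
                counts.merge(attribute, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Orders attributes from most to least frequently queried.
    // Ties are broken alphabetically so the result is deterministic.
    static List<String> rankByUse(List<List<String>> workload) {
        Map<String, Integer> counts = countAttributeUse(workload);
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return ranked;
    }

    public static void main(String[] args) {
        List<List<String>> workload = List.of(
                List.of("location", "date"),
                List.of("location"),
                List.of("author", "location"));
        // Attributes near the front would be suggested to the annotator first.
        System.out.println(rankByUse(workload));
    }
}
```

In this sketch, an attribute that appears in many queries is suggested first, so the annotation effort goes to fields that searches are most likely to use.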

Modules:

1. Registration

2. Login

3. Document Upload

4. Search Techniques

5. Download Document

Modules Description

Registration:

In this module, an Author (Creator) or User must register first; only then can he/she access the database.

Login:

In this module, any of the above-mentioned persons must log in by giving their email ID and password.

Document Upload:

In this module, the Owner uploads an unstructured document as a file (along with metadata) into the database. With the help of this metadata and the document's contents, the end user can download the file. He/she has to enter content or a query to download the file.

Search Techniques:

Here we use two techniques for searching documents: 1) Content Search, 2) Query Search.

Content Search:

It means that the document is downloaded by giving content that is present in the corresponding document. If the content is present, the corresponding document is downloaded; otherwise it is not.
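A content search of this kind can be sketched in Java as follows. This is a minimal in-memory illustration, assuming documents are held as a map from file name to extracted text; the class `ContentSearch`, the method `findByContent`, and the case-insensitive matching rule are assumptions made for the example, not details given in the project description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the Content Search idea; names are assumed.
final class ContentSearch {

    // Returns the names of documents whose text contains the given phrase.
    // Matching ignores case; a blank phrase matches nothing.
    static List<String> findByContent(Map<String, String> documents, String phrase) {
        List<String> matches = new ArrayList<>();
        if (phrase == null || phrase.trim().isEmpty()) {
            return matches;
        }
        String needle = phrase.trim().toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : documents.entrySet()) {
            if (entry.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, String> documents = new LinkedHashMap<>();
        documents.put("storm.txt", "Hurricane damage reported in Miami on Monday.");
        documents.put("notes.txt", "Meeting notes for the annotation project.");
        // Only documents that contain the phrase would be offered for download.
        System.out.println(findByContent(documents, "miami"));
    }
}
```

In the described system the text would come from the uploaded files in the database rather than an in-memory map.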

Query Search:

It means that the document is downloaded by using a query that is given in the base paper. If the input matches, the document is downloaded; otherwise it is not.

Download Document:

The User downloads the document using the query/content values given in the base paper. He/she enters the data in the text boxes; if the data is correct, the file is downloaded. Otherwise it is not.

System Configuration:-

H/W System Configuration:-

Processor - Pentium III

Speed - 1.1 GHz

RAM - 256 MB (min)

Hard Disk - 20 GB

Floppy Drive - 1.44 MB

Key Board - Standard Windows Keyboard

Mouse - Two or Three Button Mouse

Monitor - SVGA

S/W System Configuration:-

Operating System : Windows 95/98/2000/XP

Application Server : Tomcat 5.0/6.x

Front End : HTML, Java, JSP

Scripts : JavaScript

Server-side Script : JavaServer Pages

Database : MySQL

Database Connectivity : JDBC

Conclusion:

We proposed adaptive techniques to suggest relevant attributes to annotate a document, while trying to satisfy the user querying needs. Our solution is based on a probabilistic framework that considers the evidence in the document content and the query workload. We present two ways to combine these two pieces of evidence, content value and querying value: a model that considers both components conditionally independent and a linear weighted model. Experiments show that, using our techniques, we can suggest attributes that improve the visibility of the documents with respect to the query workload by up to 50%. That is, we show that using the query workload can greatly improve the annotation process and increase the utility of shared data.
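The two combination strategies named above can be sketched in Java under simplifying assumptions. The sketch assumes both the content value and the querying value of an attribute are already available as probabilities in [0, 1], takes their product for the conditionally independent case, and takes a convex combination for the linear weighted case. The class `AttributeScorer`, its method names, and the `weight` parameter are hypothetical; the exact formulas used in the paper are not given in this text.

```java
// Hypothetical scoring helper; the formulas are a simplified reading of the
// two models named in the text, not the paper's exact definitions.
final class AttributeScorer {

    // Conditionally independent model (assumed form): the combined score is
    // the product of the content value and the querying value.
    static double independentScore(double contentValue, double queryingValue) {
        return contentValue * queryingValue;
    }

    // Linear weighted model (assumed form): a convex combination of the two
    // values, where weight is the share given to the content value.
    static double linearScore(double contentValue, double queryingValue, double weight) {
        if (weight < 0.0 || weight > 1.0) {
            throw new IllegalArgumentException("weight must be between 0 and 1");
        }
        return weight * contentValue + (1.0 - weight) * queryingValue;
    }

    public static void main(String[] args) {
        double contentValue = 0.5;   // assumed: evidence from the document text
        double queryingValue = 1.0;  // assumed: evidence from the query workload
        System.out.println(independentScore(contentValue, queryingValue));
        System.out.println(linearScore(contentValue, queryingValue, 0.5));
    }
}
```

Under the product form an attribute needs support from both sources to score well, whereas the weighted form lets strong evidence from one source compensate for weak evidence from the other.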

Contact: 040-40274843, 9703109334 Email id: [email protected], www.logicsystems.org.in