final report acg
8/3/2019 Final Report Acg
Automatic community generator
Project Approval Sheet
The Project entitled
AUTOMATIC COMMUNITY GENERATOR
is hereby approved in partial fulfillment of the requirements for the Bachelor of
Engineering degree in Information Technology and will be carried out by
Name of students: Roll Number
1. Parag Jain 06
2. Saurabh Jain 08
3. Manas Jain 04
4. Nikhil Bhutada 21
5. Yash Bhise 19
(Prof. L.M.R.J Lobo)
HOD I.T. DEPT.
Department of Information Technology
Walchand Institute of Technology, Solapur
Year 2010-2011
Certificate
This is to certify that the Project design entitled
Automatic Community Generator
Has been carried out by
1. Parag Jain 06
2. Saurabh Jain 08
3. Manas Jain 04
4. Nikhil Bhutada 21
5. Yash Bhise 19
of B.E.(Information technology) Class in partial fulfillment for the award of Degree in
Bachelor of Engineering in Information Technology as per requirement of Solapur
University in academic Year 2010-11.
(Prof: L.M.R.J. Lobo)
Head IT Dept.
(Dr. S. A. Halkude)
Principal
Department Of Information Technology,
Walchand Institute of Technology, Solapur
ACKNOWLEDGEMENT
It is with a great sense of gratitude that we acknowledge the support given to us
by our project guide, Prof. L.M.R.J. Lobo. It is difficult to put into words the
confidence and support that he gave us; they proved to be our morale boosters.
We are truly grateful for his confidence in us, which was our strength
throughout our work. His support, both technical and moral, helped us in
completing the first task of our project, the Design Report. We could do this
only because of Prof. L.M.R.J. Lobo's guidance.
INDEX
1. Brief Idea about the Project & Algorithms used
2. Software used
3. Snapshots of screens
4. Results generated
5. Comparison with existing systems
6. Testing and Maintenance
7. References
BRIEF IDEA ABOUT PROJECT
PURPOSE
The purpose of our project is to enhance communication by classifying
users according to their level of activity and their communication. It is
an extension to social networking websites in which communities are
generated automatically on the basis of blogs, forums, etc. Every post from a
user is considered and helps in learning about that user.
This clustering helps a user search for other users who have
knowledge about a topic, so that the user can communicate with them and get
relevant information.
FOCUS
Our project focuses on how data mining can be used to extract patterns and
find a solution to the clustering problem. From a large body of data we extract
the keywords that occur frequently, along with their use by users.
We found this task interesting and challenging because the project needs
thorough knowledge along with special skills to develop web mining tools.
The project also reflects our career interests.
Problem Statement
To build a social networking website with the enhanced feature of
automatic community generation for improved communication.
Idea:
We will build a social networking website with the enhanced feature
of automatic community generation on the basis of personal information,
chats, forums and threads started. These communities will help other users
find information on a topic and also find specialists in that field.
How the Automatic grouping of users is done:
Our objective with this task is to group users into subgroups to facilitate
collaboration among them with the course tools. In our case we have
opted for the clustering algorithm EM (Expectation Maximization). Once
the groups have been formed, if we have some interaction data on a
specific user, the system automatically assigns
him/her to a group. The user will then be advised to contact members of
that group.
Block Diagram
Structure of the Project
1. SQL database
2. Java Program i.e. Main Algorithm
3. JSP pages.
Tables
blogs, communities, communityusers, distance1, forumposts, forumthreads, pendingresource, users, userwords, words
Java Program
Classes and their functions:
DatabaseManager
  int getWordId(String word)
  int getPendingBlogs()
  int getPendingChats()
  int getPendingPosts()
  String getBlogText(int bid)
  String getChatText(int cid)
  String getPostText(int pid)
  void idDelete(int id, String s)
  int getUserid(int id,String s)
  void InsertToUserwords(int uid,int wid)
  void calc_weight()
  int[] getComm(int[][] w)
WordExtractor
  static List extract(String s)
  static int InsertToDatabase(String word)
  static void distance(List words)
  public static void main(String args[])
Function Used:
int getWordId(String word) This function is used to retrieve the wordId for a
given word.
int getPendingBlogs() This function is used to retrieve the ID of pending
blogs from the table pendingresource so that the blog data can be scanned.
int getPendingPosts() This function is used to retrieve the ID of pending
posts from the table pendingresource so that
the forum data can be scanned.
String getBlogText(int bid) Returns the text of the blog whose ID, as returned
by the function getPendingBlogs(), is passed as a
parameter to this function.
String getPostText(int pid) Returns the text of the post whose ID, as returned
by the function getPendingPosts(), is passed as a parameter to this function.
void idDelete(int id, String s) This function deletes the entry from the table
pendingresource once the entry is no longer
pending, that is, after the WordExtractor has been
executed on it.
int getUserid(int id,String s) This function returns the userid corresponding to
the area in which the user is working.
void InsertToUserwords(int uid,int wid) This function inserts a record for the word
with wordid wid used by the user with userid uid, and
a userwordid is assigned.
void calc_weight()
This function uses the formula weight = hitcount * number of users to calculate
the weight of a word, which is used for ranking
purposes.
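The weight formula above can be illustrated with a minimal in-memory sketch. It is not the project's calc_weight() implementation, which works on the words and userwords tables through SQL; the class name, the maps and the sample data (hit counts and user IDs) are hypothetical and chosen only to show how weight = hitcount * number of users produces a ranking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: in-memory stand-in for the words and userwords tables.
class WordWeightSketch {

    /** weight = hitcount * number of users who used the word. */
    static long weight(int hitCount, int userCount) {
        return (long) hitCount * userCount;
    }

    /** Weight of one word, looked up from the two maps (0 if the word is unknown). */
    static long weightOf(String word, Map<String, Integer> hitCounts,
                         Map<String, Set<Integer>> usersByWord) {
        int hits = hitCounts.getOrDefault(word, 0);
        int users = usersByWord.getOrDefault(word, Collections.<Integer>emptySet()).size();
        return weight(hits, users);
    }

    /** Words ordered by descending weight; ties are broken alphabetically. */
    static List<String> rankByWeight(Map<String, Integer> hitCounts,
                                     Map<String, Set<Integer>> usersByWord) {
        List<String> words = new ArrayList<>(hitCounts.keySet());
        words.sort((a, b) -> {
            int byWeight = Long.compare(weightOf(b, hitCounts, usersByWord),
                                        weightOf(a, hitCounts, usersByWord));
            return byWeight != 0 ? byWeight : a.compareTo(b);
        });
        return words;
    }

    public static void main(String[] args) {
        // Invented sample data, for illustration only.
        Map<String, Integer> hitCounts = new HashMap<>();
        hitCounts.put("java", 10);
        hitCounts.put("rmi", 4);
        hitCounts.put("jsp", 6);

        Map<String, Set<Integer>> usersByWord = new HashMap<>();
        usersByWord.put("java", new HashSet<>(Arrays.asList(1, 2, 3)));
        usersByWord.put("rmi", new HashSet<>(Arrays.asList(1, 2)));
        usersByWord.put("jsp", new HashSet<>(Arrays.asList(2, 3)));

        for (String word : rankByWeight(hitCounts, usersByWord)) {
            System.out.println(word + " " + weightOf(word, hitCounts, usersByWord));
        }
    }
}
```

With this sample data the weights are 30 for "java", 12 for "jsp" and 8 for "rmi", so the words are ranked in that order.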
Algorithm 1 for generating community (for single words) calc_weight()
Start
1: connect to the database ocg using ConnectionManager
2: execute the query select count(*) as h from words; // words is a table in ocg
3: print size of words;
4: execute query select wid as w1, hitcount * (select count(*) from userwords
where wid = w1) as uh from words order by uh desc
5: for i=0 to size
6: Goto next record
7: Store wordID UserId into two dimensional array
8: end for
9: for i=0 to size
10: Display the two dimensional array
11: End for
12: int a[] = getComm(words);
13: for i=0 to a.length - 1
14: execute query select word from words where wid = a[i];
15: while rs.next()
16: print word;
17: execute query insert ignore into communities values(0, word, CURRENT_DATE);
18: if st1.executeUpdate(q) != 0 then
19: execute query select LAST_INSERT_ID(cid) from communities order by cid desc limit 1;
20: print id;
21: execute query select uid from userwords where wid = a[i];
22: while rs.next()
23: print userid;
24: execute query insert ignore into communityusers values(0, id, userid);
25: end while
26: end if
27: end while
28: end for
Stop
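The community-building part of Algorithm 1 can be sketched in memory as follows. This is an illustration under assumptions, not the project's code: the report does not describe how getComm() selects words, so the sketch simply takes the top N ranked words (topN is a hypothetical parameter), and the database tables are replaced by plain maps. Skipping a word that already has a community mirrors the effect of "insert ignore".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: one community per selected word, holding the users of that word.
class SingleWordCommunitySketch {

    /**
     * Builds at most topN communities from words that are already ranked by weight.
     * Assumption: "take the top N words" stands in for the undocumented getComm().
     */
    static Map<String, Set<Integer>> buildCommunities(List<String> rankedWords,
            Map<String, Set<Integer>> usersByWord, int topN) {
        Map<String, Set<Integer>> communities = new LinkedHashMap<>();
        for (String word : rankedWords) {
            if (communities.size() >= topN) {
                break;
            }
            if (communities.containsKey(word)) {
                continue; // community already exists, like "insert ignore"
            }
            // Every user who used the word becomes a member of its community.
            Set<Integer> members = new TreeSet<>(
                    usersByWord.getOrDefault(word, Collections.<Integer>emptySet()));
            communities.put(word, members);
        }
        return communities;
    }

    public static void main(String[] args) {
        // Invented sample data, for illustration only.
        Map<String, Set<Integer>> usersByWord = new HashMap<>();
        usersByWord.put("java", new HashSet<>(Arrays.asList(1, 2, 3)));
        usersByWord.put("jsp", new HashSet<>(Arrays.asList(2, 3)));
        usersByWord.put("rmi", new HashSet<>(Arrays.asList(1, 2)));

        List<String> ranked = Arrays.asList("java", "jsp", "rmi");
        System.out.println(buildCommunities(ranked, usersByWord, 2));
    }
}
```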
Algorithm 2 for generating community (Distance method)
distance(List words)
Start
1: get words from string array
2: get connection
3: createStatement st,st1
4: initialize l = words.size();
5: for i=0 to l-1
6: for j=i+1 to l-1
7: select hits from distance1 where wid1 = y[i] and wid2 = y[j]
8: if a record exists
9: update hits in distance1
10: executeUpdate(qq)
11: else insert into distance1
12: executeUpdate(q)
13: String n = "select * from distance1 where hits > 4"
14: while record exists
15: getstring wid1
16: getstring wid2
17: wid3 = wid1 + " " + wid2
18: select * from communities where name = wid3
19: executeQuery(n)
20: if record exists, do nothing
21: else
22: insert into communities values(0, wid3, CURRENT_DATE);
23: executeUpdate(n)
24: initialize new database object dbm
25: get wid1 in id1
26: get wid2 in id2
27: select LAST_INSERT_ID(cid) from communities order by cid desc limit 1;
28: while record exists
29: uu = get first record
30: insert into communityusers values(0, cid, uu);
31: nt.executeUpdate(n);
32: end for
33: end for
Stop
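The pair-counting idea behind Algorithm 2 can be sketched as follows. This is a simplified in-memory illustration, not the project's distance() method: a map stands in for the distance1 table, and two choices are assumptions of the sketch rather than facts from the report, namely that each word counts once per post and that a pair is stored in alphabetical order so that (a, b) and (b, a) share one counter. The threshold of more than 4 hits comes from step 13. Assigning users to the resulting communities is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: counts how often two words occur in the same post.
class WordPairDistanceSketch {

    // Key is "word1 word2"; value is the number of posts containing both words.
    private final Map<String, Integer> hits = new HashMap<>();

    /** Records one post: every pair of distinct words in it gets one more hit. */
    void recordPost(List<String> words) {
        // TreeSet removes duplicates and sorts, so each pair has one canonical key.
        List<String> distinct = new ArrayList<>(new TreeSet<>(words));
        for (int i = 0; i < distinct.size(); i++) {
            for (int j = i + 1; j < distinct.size(); j++) {
                String pair = distinct.get(i) + " " + distinct.get(j);
                hits.merge(pair, 1, Integer::sum);
            }
        }
    }

    /** Names of the communities: pairs whose hit count exceeds the threshold. */
    List<String> communityNames(int threshold) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : hits.entrySet()) {
            if (entry.getValue() > threshold) {
                names.add(entry.getKey());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        WordPairDistanceSketch sketch = new WordPairDistanceSketch();
        // Invented sample posts, for illustration only.
        for (int post = 0; post < 5; post++) {
            sketch.recordPost(Arrays.asList("rmi", "java"));
        }
        sketch.recordPost(Arrays.asList("java", "jsp"));
        System.out.println(sketch.communityNames(4));
    }
}
```

In this sample only the pair "java rmi" appears in more than 4 posts, so it is the only community name returned.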
SOFTWARE USED
Front end: Java and Html
Java entrenched itself on the server side, where it has clear advantages over
other existing technologies. However, just about any application has some form
of user interface and front-end presentation.
Given the simplicity of the HTTP/HTTPS protocols, you are also guaranteed to
enjoy the predictability of programming for various network configurations and
firewalls. But that comes at a price. The
trade-off with HTML is the lack of user interaction and the necessity of making
network trips to the server for every response to a user action. JavaScript is a
great language for adding uncomplicated interactive logic to otherwise static
HTML, but it is not one for developing sophisticated user interfaces.
Java Server Pages (JSP) technology provides a simplified, fast way to create
dynamic web content. JSP technology enables rapid development of web-based
applications that are server and platform-independent. JSP technology lets you
add snippets of servlet code directly into a text-based document. Typically, a
JSP page is a text-based document that contains two types of text:
Static data, which can be expressed in any text-based format, such as
HTML, Wireless Markup Language (WML), or XML
JSP technology elements, which determine how the page constructs dynamic
content
HTML can be easily embedded in JSP.
We have developed our logic in Java and used HTML and JSP technology
to develop web pages.
Middle ware: Apache Tomcat
A Web application runs within a Web container of a Web server. The Web
container provides the runtime environment through components that provide
naming context and life cycle management. Some Web servers may also
provide additional services such as security and concurrency control. A Web
server may work with an EJB server to provide some of those services. A Web
server, however, does not need to be located on the same machine as an EJB
server.
Web applications are composed of web components and other data such as
HTML pages. Web components can be servlets, JSP pages created with the Java
Server Pages technology, web filters, and web event listeners. These
components typically execute in a web server and may respond to HTTP
requests from web clients. Servlets, JSP pages, and filters may be used to
generate HTML pages that are an application's user interface. They may also be
used to generate XML or other format data that is consumed by other
application components.
Back end: MySQL
MySQL is the world's most popular open source and platform independent
database software, with over 100 million copies of its software downloaded or
distributed throughout its history. With its superior speed, reliability, and ease
of use, MySQL has become the preferred choice for Web, Web 2.0, SaaS, ISV,
Telecom companies and forward-thinking corporate IT Managers because it
eliminates the major problems associated with downtime, maintenance and
administration for modern, online applications.
Hardware requirements
Minimum: CPU 2.6 GHz, 2 GB RAM and 160 GB HDD
SCREENSHOTS
Home page (homepage.jsp)
Signup (signup.jsp)
New user registration (registration.jsp)
Home page after login (home.jsp)
Users Blog list (blog.jsp)
New Blog (blog1.jsp)
Forum list (forum.jsp)
Post to forums (replyto.jsp)
Search Community (communityhome.jsp)
Community page after search (community.jsp)
List of users in community (communitypage.jsp)
User details
RESULTS GENERATED
Using Algorithm 1 and Algorithm 2 we have generated communities
automatically. The screenshots of the community pages are shown above (fig
9 & 10).
Fig: Community list. This figure shows the list of all the automatically
generated communities.
Query: Select * from communities
Fig: Users list sorted according to communities
Query: select * from communityusers
COMPARISON WITH EXISTING SYSTEMS:
On today's social networking websites we search for communities,
but if a community is not present we have to create it and then access it.
Hence communities are not generated automatically. Moreover, data is not
organized properly, or it is difficult to find particular data.
For example, if we want information about RMI in Java, we usually go to
the community named java and then try to find relevant results. In this case
we may or may not get the desired result, and it is cumbersome
to browse the entire community. In our system, browsing for the result is
replaced by simply searching for the name of the required community. Because
communities are created automatically, a separate community is created even
for a small field or area, and hence the problem of browsing through a
community is solved.
TESTING AND MAINTENANCE:
Testing
There are two approaches for testing:
1) White Box Testing
2) Black Box Testing
1) White Box Testing: In our project, white box testing methods can be used
to evaluate the completeness of a test suite that was created with black
box testing methods.
In white box testing we check the value of each variable in
every method by using the debugger.
This allows the software team to examine parts of the system that are rarely
tested and ensures that the most important function points have been tested.
We used two common forms of code coverage:
Function coverage, which reports on the functions executed.
Statement coverage, which reports on the number of lines executed to
complete the test.
Both return a code coverage metric, measured as a percentage.
2) Black Box Testing: Black box testing treats our software as a black
box, without any knowledge of the internal implementation. Black box
testing methods include equivalence partitioning, boundary value
analysis, all-pairs testing, fuzz testing, model-based testing, exploratory
testing and specification-based testing.
Specification-based Testing: Specification-based testing aims to test the
functionality of software according to the applicable requirements. Thus,
the tester inputs data into, and only sees the output from, the test object.
This level of testing usually requires thorough test cases to be provided to
the tester, who then can simply verify that for a given input, the output
value (or behavior) either "is" or "is not" the same as the expected value
specified in the test case.
Specification-based testing is necessary, but it is insufficient to
guard against certain risks.
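A specification-based test with boundary values can be sketched as below. The function under test, formsCommunity, is hypothetical and exists only for this example; it encodes the rule from Algorithm 2 that a word pair becomes a community when its hit count is greater than 4. The tester supplies inputs and expected outputs only, including the values 4 and 5 on either side of the boundary, and does not look at the implementation.

```java
// Hypothetical sketch of a specification-based (black box) test with boundary values.
class BoundaryValueTestSketch {

    /** Function under test (assumed for illustration): a pair forms a community if hits > 4. */
    static boolean formsCommunity(int hits) {
        return hits > 4;
    }

    /** Runs the test cases and returns the number of failures. */
    static int runSpecCases() {
        // Each input is paired with the output expected by the specification.
        int[] inputs = {0, 4, 5, 100};
        boolean[] expected = {false, false, true, true};

        int failures = 0;
        for (int i = 0; i < inputs.length; i++) {
            boolean actual = formsCommunity(inputs[i]);
            if (actual != expected[i]) {
                failures++;
                System.out.println("FAIL: hits=" + inputs[i]
                        + " expected=" + expected[i] + " actual=" + actual);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + runSpecCases());
    }
}
```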
Advantages and Disadvantages: The black box tester has no "bonds"
with the code, and a tester's perception is very simple: a code must have
bugs. Using the principle, "Ask and you shall receive," black box testers
find bugs where programmers do not. On the other hand, black box
testing has been said to be "like a walk in a dark labyrinth without a
flashlight," because the tester doesn't know how the software being tested
was actually constructed. As a result, there are situations when
1) A tester writes many test cases to check something that could have
been tested by only one test case, and/or
2) Some parts of the back-end are not tested at all.
Therefore, black box testing has the advantage of "an unaffiliated
opinion", on the one hand, and the disadvantage of "blind exploring", on
the other.
REFERENCES
We have taken the basic idea of grouping users from "Towards web-based
adaptive learning communities" [1]. This paper gives the general idea of
grouping students depending on their requirements. The authors opted for the
clustering algorithm EM (Expectation Maximization).
To identify the contents of web pages, we propose a combined mechanism
which computes the product of term frequency and document frequency and
prioritizes the terms based on the calculation of entropies. We assign any Web
page into a category within domain ontology [2]. Our approach to identifying
associations between a Keyword and a predefined category is to use term-
classification rules compiled by machine learning algorithms.
CONCLUSION
We have built a website with the feature of automatic community generation,
which can be used as an extension to current social networking websites. This
will be helpful in enhancing communication as the clustering is done
automatically. Other algorithms can also be included to increase the efficiency
of clustering and generation of communities.
The program will run on the server at regular intervals, as set by the
administrator, creating communities each time and updating users'
information.