Journal Impact Factor: A Measure of Quality or Popularity?
Deepika J.
Dept. of CSE
Anna University, Chennai-25
Mahalakshmi G.S.
Dept. of CSE
Anna University, Chennai-25
1. Abstract
In the quest for quality, researchers often prefer to cite articles from journals with a higher Impact Factor. Until now, the Journal Impact Factor has been measured as a popularity index, i.e. only the number of citations to a journal’s articles determines its impact factor, in any domain, regardless of how far research in that domain has progressed. However, citing has gradually become a matter of convenience rather than of quality, because many journals have gained higher visibility, for example through open access. Given this dilution of citation practice, assessing a journal by its quality, rather than by its circulation and visibility, is more important. Here, we suggest two parameters, author and article content, for measuring journal quality. The originality of the article, along with other evaluation measures, and the research quantum of the article’s author(s) contribute to the quality of a research paper, which in turn accumulates into the journal quality.
Keywords: Impact Factor, Author Quality, Article Originality,
Plagiarism Detection, Metric
2. Introduction
Journal popularity is often mistaken for journal quality, because journals are assessed for their impact based on their popularity, i.e. citation counts across years. This popularity cannot be accepted as journal quality. Popularity reflects how much peers like or dislike the journal, and is measured by the number of citations over a period of years, whereas quality is a function of various components such as article content, author quality, semantic richness, etc.
In practice, the impact factor of a journal is calculated as a measure reflecting the average number of citations to articles published in that journal. The current-year impact factor of a journal is the average number of citations received in the current year per paper published in that journal during the two preceding years. The problem is that citations alone determine the impact of an article, neglecting other factors. This has drawn criticism [Seglen PO, 1997]. Hence, there is a need to derive a quality-based metric for measuring the Journal Impact Factor (JIF).
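As an illustration of the traditional two-year calculation described above, a minimal sketch follows. The function name, the dictionary layout and the counts are hypothetical and chosen only for illustration; they are not taken from any journal.

```python
def two_year_impact_factor(citations_by_pub_year, papers_by_pub_year, year):
    """Citations received in `year` to papers published in the two preceding
    years, divided by the number of papers published in those two years."""
    preceding = (year - 1, year - 2)
    cites = sum(citations_by_pub_year.get(y, 0) for y in preceding)
    papers = sum(papers_by_pub_year.get(y, 0) for y in preceding)
    if papers == 0:
        raise ValueError("no papers published in the two preceding years")
    return cites / papers


# Hypothetical counts: citations made in 2010, keyed by the cited paper's year.
citations_in_2010 = {2009: 40, 2008: 60}
papers_published = {2009: 25, 2008: 25}
impact_2010 = two_year_impact_factor(citations_in_2010, papers_published, 2010)
```

With these made-up counts, 100 citations to 50 papers give an impact factor of 2.0. Nothing in the calculation looks at the content or originality of the cited papers, which is the limitation the paper addresses.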
An article is said to be of good quality if it possesses better semantics, i.e. the article content should be good. Even when an article is semantically sound, another significant aspect to be considered is its originality, which is where the plagiarism viewpoint comes in. Hence, article quality has to be determined based on these two aspects. Article quality in turn leads to journal quality, i.e. the Journal Impact Factor.
In this paper, our focus is to derive a metric for measuring the journal impact factor from two major factors: author quality and article quality. A knowledge-based impact metric is derived which differs considerably from the traditional impact factor, thereby implicitly encouraging research journals to publish quality articles. This would gradually make research publications better standardized. The traditional method of impact factor calculation is also included for comparison.
3. Literature review
3.1 Impact Factor: “A Prestige Factor”
The impact factor has gained importance over time. Measuring the impact factor as a quality measure by citation count alone is considered flawed [Garry Walter et al, 2003]. Garry Walter et al state the following quality determinants for assessing the value of an article in scientific research.
An article should
• add consequentially to the field through
original, innovative research findings
• expand or challenge current knowledge
• open additional areas for new research
activity
• open a pathway to advance knowledge
• integrate discoveries obtained by different
approaches and/or disciplines through
creative synthesis, thus bringing new
insights to bear on original research and
• reflect critically on research findings to
guide the direction of further research
This common notion of impact factor has given rise to a range of derived impact metrics, such as the scope-adjusted impact factor, discipline-specific impact factor, immediacy index and cited half-life. The fundamental aim is to measure the true impact of journals that publish articles of scientific advancement. The underlying belief is that innovative research will be found only in original articles. However, prestige-based impact metrics were constrained to a specific domain [Hane, Paula J, 2002].
3.2. Other major factors governing Impact Factor
The major factors that decide a journal’s impact factor
include
• Hyperlinks
• Citation measures
• Time frame
• Frequency
• Speed statistics
3.2.1 Link analysis
For a web search engine like Google, which indexes many pages, link analysis is performed using the PageRank algorithm [Sargolzaei P and F. Soleymani, 2010]. The PageRank of a page depends on the number and PageRank of all pages that link to it. Link analysis for closed research publications is based on the number of in-links to a document.
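The dependence of a page’s rank on the ranks of the pages linking to it can be sketched with a small power iteration. This is a textbook form of PageRank written for illustration, not the search engine’s implementation; the damping value 0.85, the iteration count and the handling of pages without out-links are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """`links` maps each page to the list of pages it links to.
    Returns a dict of page -> rank; the ranks sum to 1."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes an equal share of its rank to each page it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no out-links spreads its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-document graph: "a" and "b" both link to "c"; "c" links to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph "c" ends up with the highest rank because it has the most in-links, and "a" outranks "b" because its single in-link comes from a highly ranked page.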
3.2.2 Citations
In open source publications, link analysis is established via citations. An article with more citations need not be the best, because an article with more authors has a greater chance of receiving self-citations [Antonia Andrade, Raúl González-Jonte and Juan Miguel Campanario, 2008].
The traditional impact factor does not take into account the relative share of articles, technical notes and reviews in a journal. Seglen has shown that the citation rate used in Thomson ISI is proportional to the article length [Seglen PO, 1997]. This does not hold in many cases, where the citation rate does not depend on article length. In addition, a journal’s impact factor should be decided by the impact created by individual articles. Seglen [Seglen PO, 1997] identified problems with assessing a journal based only on its citations, and summarizes four ways in which traditional impact factor assessment is misleading:
1. Use of journal impact factors conceals the
difference in article citation rates
2. Most times, impact factors are determined
by technicalities unrelated to the scientific
quality of their articles
3. High impact factors are likely in journals
covering basic research areas
4. Article citation rates determine the journal
impact factor and not vice versa
3.2.3. Time frames
For finding a journal’s impact factor, the time frame for citations is a major criterion. Initially, the time frame for impact factor calculation was assumed to be 2 years. In later years the ‘Eigenfactor’ evolved to address this issue [CDSR, 2009]. However, even in this methodology, self-citation counts are eliminated.
In addition, related indices like the immediacy index, cited half-life and aggregate impact factor are considered. These indices suit the journal as a whole rather than an individual article of the journal. However, a newer index, the h-index, was coined by Hirsch [2009] to address this issue.
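For reference, the h-index is the largest number h such that h of the publications under consideration have at least h citations each. A minimal sketch under that standard definition (the function name and the sample counts are illustrative only):

```python
def h_index(citation_counts):
    """Largest h such that at least h publications have >= h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for five publications.
example_h = h_index([10, 8, 5, 4, 3])
```

Here four publications have at least four citations each, so the h-index is 4.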
4. Problem Description: Journal Quality
Assessment
The journal impact factor has evolved into an accepted indicator of scientific value. From the traditional impact metric to the present day, the impact factor has been measured for journals to grade their value. Though the traditional formula has been reviewed and revised, no common metric has been proposed to measure a journal from the knowledge perspective. The integrated knowledge-based impact metric proposed in this paper addresses this challenge by grading a journal for its quality.
In the proposed idea, the impact assessment of a journal is viewed as the cumulative impact of every article it contains (broadly similar to Thomson). The difference lies in how the impact of each article is assessed, i.e. by quality and not by popularity or citations. In other words, every article’s impact is viewed from two perspectives: author quality and article quality. In this approach, the individual authors of the article are given importance, as the quality of every author contributes to the quality of the respective article. Article quality also carries considerable weight; hence better semantics and document evaluation are emphasized in analyzing it. The two viewpoints in Figure 1, author quality and article quality, are discussed briefly in the following sections.
Figure 1 Journal Quality Assessment System
Each parameter selected for measuring the journal impact factor is integrated into a single metric for assessing journal quality.
5. Author quality: An imperative analysis
5.1 Authorship merits
For an article to be of good quality, its author(s) should possess sound knowledge of the research field. Certain factors influence author quality in an article (Figure 1). An author writing alone or paired with co-authors tends to represent a concept differently, due to the influence of one author over another. Each aspect of an author, from style of writing to contributions in publications, differs among authors.
The following factors are important in authorship analysis.
• Author contribution
• Author bonding
• Originality in writings
• Author writing style including policies,
techniques and blunders
• Methodology of knowledge transfer
• Clarity in work
• Semantic significance
5.2 Impact of Author quality
In the proposed approach, a journal’s impact factor is determined by measuring the impact of its constituent articles (eq. 1). Every article’s impact is assessed by its author quality and article quality, and is taken to be inversely proportional to the number of articles published in that issue: the more articles in the issue, the less the impact of a particular article. Though this idea is interesting, it is still debatable. Here, we assume that the more articles a journal contains, the harder it is for an article to create impact unless it is of commendable quality above its fellow articles in that issue. With this in mind, the article’s impact is calculated as a combination of author quality and article quality as in eq. 2.
$IF_{J_{OSRP}} = \frac{\sum (IF_{Art})}{T_{\#art}}$ - eq. 1
$IF_{Art} = \frac{Q_{Au\_factor} + Q_{Art}}{T_{\#art\_issue}}$ - eq. 2
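A minimal sketch of eq. 1 and eq. 2 follows, reading eq. 2 as the sum of the author quality factor and the article quality divided by the number of articles in the issue, and eq. 1 as the mean of the article impacts. The function names and the sample values are hypothetical.

```python
def article_impact(q_au_factor, q_art, articles_in_issue):
    """eq. 2: (author quality factor + article quality) / articles in the issue."""
    if articles_in_issue <= 0:
        raise ValueError("an issue must contain at least one article")
    return (q_au_factor + q_art) / articles_in_issue


def journal_impact(article_impacts):
    """eq. 1: sum of the article impacts / number of articles considered."""
    if not article_impacts:
        raise ValueError("at least one article impact is required")
    return sum(article_impacts) / len(article_impacts)


# Hypothetical quality values for two articles drawn from issues of 2 and 4 articles.
impacts = [article_impact(0.6, 0.4, 2), article_impact(0.7, 0.3, 4)]
journal_if = journal_impact(impacts)
```

The second article has the same combined quality as the first but sits in a larger issue, so its impact is halved. This is the debatable assumption discussed in the text.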
Though a journal has a diversity of authors, every individual author is worthy and should therefore be graded to assess the quality of the journal. We refer to this as the author quality factor. It is a measure of how the author creates impact irrespective of his/her co-authors in that article. Author Co-citation Analysis (ACA) and Web Co-link Analysis (WCA) are examined as “sister” techniques in the related fields of bibliometrics and webometrics [Alesia Zuccala, 2006]. By these techniques, author centrality can be assessed. An author with high centrality gains more popularity, mainly due to his/her quality. Credits are given to authors based on the positions they occupy in the article [Teja Tscharntke et al., 2007]. The proposed approach (Figure 1) combines the sequence-determines-credit (SDC) and percent-contribution-indicated (PCI) approaches (SDC+PCI).
5.2.1 “Sequence-determines-credit” approach (SDC)
The sequence of authors should reflect the declining importance of their contribution, as suggested by previous authors. Authorship order only reflects relative contribution, whereas evaluation committees often need quantitative measures. We suggest that the first author should get credit for the whole impact (impact factor), the second author half and the third a third; if the author count exceeds three, the Impact Factor is evenly distributed among all the authors of that particular article.
5.2.2 Percent-contribution-indicated approach (PCI)
There is a trend to detail each author’s contribution (following requests of several journals). This should also be used to establish the quantified credit. In this paper, we combine both approaches, and the output measures obtained from the author quality assessment system are based on credit and central tendencies. Thus, by author quality analysis, the individual authors of any research article or journal are assessed for their quality, which contributes strongly to the overall quality assessment of research publications.
5.3 Author quality assessment metrics
The author quality factor for the entire article is obtained as,
$Q_{Au\_factor} = \frac{\sum (Q_{Au})}{T_{\#au}}$ - eq. 3
The individual author quality is obtained using the formula,
$Q_{Au} = \frac{Cit_{Spread} + Cre_{Spread}}{2}$ - eq. 4
Citations play a major role in author quality analysis. Hence the author quality is calculated based on the citation spread and the credit spread, which are given below:
$Cit_{Spread} = \frac{T_{\#cit}}{T_{\#au}}$ - eq. 5
$Cre_{Spread} = \frac{Cr_{val\_avg}}{R_{\#art}}$ - eq. 6
$Cr_{val\_avg} = \frac{Sem\_Capital + Cr_{val}}{2}$ - eq. 6.1
Citation spread is the ratio of total citations to the number of authors of the article. Credit spread is the ratio of the average credit value to the total number of references of the article.
The average credit value (eq. 6.1) is a combination of the position-based author credit and the semantic content of the article. The semantic content of the article as contributed by the author is measured by semantic capital, which is a combination of sole author credit, local cohesion and global cohesion. This is discussed further in Section 6.
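The chain from eq. 6.1 up to eq. 3 can be sketched as a few small functions. This follows the definitions given in the text (citation spread, credit spread, average credit value); the function names and all input values are hypothetical.

```python
def citation_spread(total_citations, num_authors):
    """eq. 5: total citations of the article / number of authors."""
    return total_citations / num_authors


def average_credit_value(semantic_capital, position_credit):
    """eq. 6.1: mean of semantic capital and position-based credit."""
    return (semantic_capital + position_credit) / 2


def credit_spread(avg_credit_value, num_references):
    """eq. 6: average credit value / number of references of the article."""
    return avg_credit_value / num_references


def author_quality(cit_spread, cre_spread):
    """eq. 4: mean of citation spread and credit spread."""
    return (cit_spread + cre_spread) / 2


def author_quality_factor(author_qualities):
    """eq. 3: mean quality over all authors of the article."""
    return sum(author_qualities) / len(author_qualities)


# Hypothetical article: 6 citations, 3 authors, 20 references.
cit = citation_spread(6, 3)
cre = credit_spread(average_credit_value(3.0, 1.0), 20)
q_first_author = author_quality(cit, cre)
```

With these made-up inputs the citation spread is 2.0 and the credit spread is 0.1, so citations dominate the author quality unless the two terms are on comparable scales.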
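$Sem\_Capital = \frac{Cr_{sole\_au} + coh_{loc} + coh_{glo}}{T_{\#au}}$
$Cr_{sole\_au} = \sum_{pub\_cat=1}^{5} (wt \cdot pub\_cat) \times T_{\#sole\_art}$
$coh_{loc} = \sum_{pub\_cat=1}^{5} (wt \cdot pub\_cat) \times T_{\#seed\_co\_art}$
$coh_{glo} = \sum_{pub\_cat=1}^{5} (wt \cdot pub\_cat) \times T_{\#co\_art}$
$pub\_cat = \{\text{proceeding}, \text{collections}, \text{article}, \text{thesis}, \text{book}\}$
$weight = \{0.4, 0.2, 0.8, 0.6, 1\}$ - eq. 7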
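A sketch of how semantic capital might be computed from publication counts per category is given below. It assumes that the five weights in eq. 7 pair positionally with the five publication categories, and that the sum of sole author credit, local cohesion and global cohesion is divided by the number of authors; both are readings of the equation rather than confirmed details. The function names and the counts are hypothetical.

```python
# Assumed positional pairing of the categories and weights listed in eq. 7.
WEIGHTS = {
    "proceeding": 0.4,
    "collections": 0.2,
    "article": 0.8,
    "thesis": 0.6,
    "book": 1.0,
}


def weighted_count(counts_by_category):
    """Sum over publication categories of weight x number of publications."""
    return sum(WEIGHTS[cat] * n for cat, n in counts_by_category.items())


def semantic_capital(sole, local_co, global_co, num_authors):
    """Sole author credit + local cohesion + global cohesion, divided by the
    number of authors (the division is an assumed reading of eq. 7)."""
    total = weighted_count(sole) + weighted_count(local_co) + weighted_count(global_co)
    return total / num_authors


# Hypothetical counts for one author of a two-author seed article.
capital = semantic_capital(
    sole={"article": 2, "book": 1},   # publications written alone
    local_co={"proceeding": 5},       # with co-authors of the seed article
    global_co={"thesis": 1},          # with any other co-authors
    num_authors=2,
)
```

The ordering of the weights is consistent with the remark in the results section that a book carries more weight than a journal article, and a journal article more than conference proceedings.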
The position-based credit value for an author, based on the author’s position in the article, is calculated from,
$Cr_{val} = \frac{Im_{avg}}{Pos_{au}}$, if $Pos_{au} = \{1, 2\}$
$Cr_{val} = \frac{Im_{avg}}{Pos_{au} + 1}$, if $Pos_{au} = \{3\}$
$Cr_{val} = \frac{Im_{avg}}{T_{\#au}}$, if $Pos_{au} > 3$ - eq. 8
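The position rule of eq. 8 can be sketched as a single function. It follows the shares stated for the test bed (100% for the first author, 50% for the second, 25% for the third) and assumes that an author in a position beyond the third receives the impact divided by the total number of authors; the function name and arguments are hypothetical.

```python
def position_credit(avg_impact, position, num_authors):
    """Position-based credit for the author at `position` (1-based)."""
    if not 1 <= position <= num_authors:
        raise ValueError("position must lie between 1 and num_authors")
    if position in (1, 2):
        return avg_impact / position        # 100% and 50%
    if position == 3:
        return avg_impact / (position + 1)  # 25%
    # Assumption: positions beyond the third share by the author count.
    return avg_impact / num_authors


credits = [position_credit(1.0, pos, 5) for pos in range(1, 6)]
```

For a five-author article with unit impact this yields 1.0, 0.5, 0.25, 0.2 and 0.2. An alternative reading, in which every author of an article with more than three authors receives the equal share, would change only the first three values.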
6. Article quality: an overview on its
significance
Every published article tends to present an innovative idea based on research. Article quality is assessed from two major viewpoints: document semantics and document originality. In the former, the semantics of the article are analyzed; in the latter, the article is checked to ensure that it is not a copy of another. Both aspects pave the way for article quality assessment.
The quality of a document can be determined based on its organization, semantic richness, originality, innovation, clarity, grammar, creativity, expendability, significance, relevance, etc. In this approach, document quality is assessed based on its originality, i.e. the document is viewed from the plagiarism viewpoint. The similarity score of the document against the documents across the corpus is measured, thereby contributing to the article originality.
Document evaluation is necessary because an article that is not a copy of other documents may still not possess good semantics. The relevance of the title to the article, the relevance of the references and the relevance of the citations are considered the major factors in document evaluation. Document originality plays a vital role in research, as plagiarism appears to be increasingly common. Though many tools are available to detect similarity between documents, they are not appreciably good at finding similarity that requires reasoning techniques. In this work, a plagiarism detection tool, ‘IdeaPlag’, is highlighted; it incorporates an ontology and WordNet for finding the similarity score between two documents. A domain-specific ontology is created offline and used in document evaluation and plagiarism detection. The semantics of the article relating to those in the ontology are considered for document evaluation. The semantic similarity between documents is determined via the IdeaPlag methodology.
6.1 Article quality metrics
In the impact factor metric, the article quality is mainly based on its similarity score. Since the open source tool Plagiarism Finder is used as a comparative analysis tool, the formula for the similarity score comprises 3 similarity measures: Plagiarism Finder, IdeaPlag and a domain expert. In addition, the semantic similarity is proportional to the length of the article (long, medium or short), as given by its number of words, and also proportional to the number of references. The ratings of the 3 entities are collected and the semantic similarity of an article is calculated using eq. 9. In the absence of a domain expert, the semantic similarity may be computed by eq. 10.
The higher the similarity measure, the lower the article quality. Therefore, the article quality is inversely proportional to the similarity score, as given by eq. 11.
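[Equation layout not recoverable from the source. Terms present: $Sim\_Score$, $\sum Sim_{PF}$, $Sim_{Idea}$, $Sim_{XP}$, $T_{\#res}$, $W_{\#art}$, $R_{\#art}$, divisor 3] - eq. 9
[Equation layout not recoverable from the source. Terms present: $Sim\_Score$, $\sum Sim_{PF}$, $Sim_{Idea}$, $T_{\#res}$, $W_{\#art}$, $R_{\#art}$, divisor 2] - eq. 10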
$Q_{Art} = \frac{1}{Sim\_Score}$ - eq. 11
7. Results and discussions
7.1. Test bed
Research articles from both an open source and a closed source journal are considered for the analysis. The DOAJ (Directory of Open Access Journals) is taken as a source, and 58 articles from an open access journal are handpicked to form the input document corpus of open source publications. Similarly, 30 articles are extracted from the ‘Computer Networks’ journal of Elsevier for analysis [http://www.Elsevier.com//]. Relevant articles are retrieved using the Google search engine for comparison. Every relevant article is tested against its respective seed article to obtain the semantic score. The Digital Bibliography and Library Project (DBLP) is used as the test bed for authorship analysis [www.informatik.uni-tier.de]. CiteSeer [citeseerx.ist.psu.edu] is used to obtain the self and total citations of each article individually.
This is achieved with a handcrafted ontology of about 900 concepts in the computer networks domain, built using Protégé 4.1 Beta. A snapshot of the ‘computer networks’ ontology is shown in Figure 2.
Figure 2. Snapshot of ‘Computer Networks’
Ontology in Protégé 4.1 Beta
Given the input in ‘IEEE format of the reference’, the authors and the title are extracted. The author names are taken as input and their positions are identified. Credit values are given to the authors based on their positions in the article. For up to three authors, the first author receives 100% of the impact factor, the second author 50% and the third author 25%. For articles with more than 3 authors, the impact factor is shared equally among the authors.
7.2. Empirical results
7.2.1. Deriving Citation Measures
The citation counts are obtained as external and self citations from CiteSeer. The counts are zero if the article is not found in CiteSeer or if it does not have any citations. The author quality is calculated based on the citation spread and the credit spread. While citation spread is a measure of the total citations (Figure 3a and 3b) of the article, credit spread (Figure 4a and 4b) measures the effort shared by every author in constructing the article. Therefore, credit spread is measured for every author: overall, 161 authors across the input set in the case of open access journals and 100 authors in the case of closed source journals.
Figure 3a. Total Citations per article of input article
corpus of Open source journal
Figure 3b. Total Citations per article of input article
corpus ‘Computer network’ journal
Figure 4a. Author Credit Spread of authors of input
article corpus of Open source journal
Figure 4b. Author Credit Spread of authors of input
article corpus ‘Computer network’ journal
Credit spread is dependent on the average credit value (Figure 5a and 5b). In other words, we perceive two classes of credit for every author. One is based on their position (Figure 6a and 6b), and the other is based on their contribution to research.
As discussed earlier (eq. 7), we refer to the latter as semantic capital (Figure 7a and 7b). For this, we take the DBLP open repository as a measure. Here, the author names are tracked and submitted to the DBLP archive to find the author contribution, i.e. sole author contribution, contribution with local co-authors (by local, we mean co-authors of the seed article), and contribution with global co-authors. From the DBLP open repository, the categories and frequencies of contributions such as articles, proceedings, collections, PhD theses, master’s theses and books are harvested.
Figure 5a. Average Credit Value for authors of input
article corpus of Open source journal
Figure 5b. Average Credit Value for authors of input
article corpus ‘Computer network’ journal
Figure 6a. Position based credit measure for authors
of input article corpus of Open source journal
Figure 6b. Position based credit measure for authors
of input article corpus ‘Computer network’ journal
Figure 7a. Semantic capital of authors of input article
corpus of Open source journal
Figure 7b. Semantic capital of authors of input article
corpus ‘Computer network’ journal
The author’s reputation is a major concern here. The credits are pre-determined, i.e. a book has a higher weightage than a journal article, a journal article has a higher weightage than conference proceedings, and so on, because of the research content in that contribution and, of course, its readership coverage.
7.2.2. Author Quality Analysis
The author quality factor contributes to the impact of the individual article (eq. 3). This is shown in Figure 8a and 8b.
Figure 8a. Author Quality Factor for input article
corpus of Open source journal
Figure 8b. Author Quality Factor for input article corpus
‘Computer network’ journal
The author quality factor is a collective representative measure of author quality (Figure 9a and 9b) for a given article. It is the average of every constituent author’s quality, which is determined by citation spread (Figure 10a and 10b) and credit spread (Figure 4a and 4b).
Figure 9a. Individual Author Quality Analysis for
input article corpus of Open source journal
Figure 9b. Individual Author Quality Analysis for
input article corpus ‘Computer network’ journal
Figure 10a. Citation Spread for input article corpus of
Open source journal
Figure 10b. Citation Spread for input article corpus
‘Computer network’ journal
7.2.3 Article Quality Analysis
Article quality can be assessed via various factors, as discussed in Section 6. To determine article originality through plagiarism, a plagiarism detection mechanism, ‘Idea Plagiarism’, is applied and the results are compared with those of the online tool ‘Plagiarism Finder’ and with the ‘Expert value’.
In addition, we have assessed the similarity score (Figure 11a and 11b) of the input document corpus.
Figure 11a. Similarity Score of input article corpus of
Open source journal in 2 tools and expert analysis
Figure 11b. Similarity Score of input article corpus
‘Computer network’ journal in 2 tools and expert
analysis
The relation between article quality and the article’s own references is shown in Figure 13a and 13b. This shows that an article with more references cannot be concluded to be of good quality: the quality of an article may not be good even if it has more references, i.e. bibliometric analysis alone is not enough to grade article quality.
Figure 13a. Relation between Article Quality and
references for input article corpus of Open source
journal
Figure 13b. Relation between Article Quality and
references for input article corpus ‘Computer
network’ journal
Figure 14a. Article Impact for input article corpus of
Open source journal
Figure 14b. Article Impact for input article corpus
‘Computer network’ journal
7.2.4 Analysis of proposed impact metric
The impact of the 58 individual articles is shown in Figure 14a and that of the 30 articles in Figure 14b. The impact factor for open source publications, contributed by 58 select articles spread over the past 10 years of journal history, is calculated as 0.466214. This can be compared with the traditional impact factor calculation. The traditional impact factor would be Total citations / Total articles, i.e. 117/58 = 2.017 for the open source publication articles. From the journal quality perspective, this might look impressive but is quite misleading [The Thomson Reuters Impact Factor, 1994]. For comparison, one can verify that the ‘Computer Networks’ journal of Elsevier has a 5-year Impact Factor of 1.610. A well-known journal of DOAJ (an upcoming computer network journal) has certainly created less impact than ‘Computer Networks’ of Elsevier, which is the most established journal in its domain. Therefore, rating the quality of an open access journal as 2.017 using traditional impact measures would be less agreeable. [Figure 15]
Figure 15. Journal IF for open and closed source
publications-A comparative analysis
The above observation shows that the journal IF for the Elsevier ‘Computer Networks’ journal is measured to be 2.103, whereas the traditional impact factor, which is based only on the citation factor, is 1.021 [http://www.elsevier.com/wps/find/journaldescription.cws_home/505606/description#description]. The proposed methodology therefore yields a higher IF value than the existing measure (based only on citations), which is reasonably agreeable. For the open access journal, the impact found using the traditional method is high (2.017) compared with the measure obtained from the proposed methodology (0.466). The traditional value is less agreeable, as an open access journal would not be expected to possess an IF of 2.017. The contradiction here is that a less renowned journal with a higher IF is not reasonable. Hence the proposed methodology seems to fit better for assessing a journal IF. [Figure 15]
8. Evaluation of Idea Plagiarism Detection:
an Exclusive comparison for Semantic
Similarity
Various tools and techniques have evolved for finding the similarity score between two articles. Specific models such as the Boolean model, vector space model and probability model exist for determining the similarity score between documents. Similarly, various metrics exist, such as cosine similarity, Jaccard coefficient, Hsim [Yong Zhang and Ke Deng, 2010], ontology-based similarity metrics [Sridevi.U.K and Nagaveni .N, 2010], matching average, Dice coefficient, dot products and term weight calculations [Mi Islita.com, 2006]. For comparative analysis, two known metrics, cosine similarity and Jaccard coefficient, are computed for the documents. The similarity measures obtained do not involve reasoning.
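For reference, the Jaccard coefficient of two documents is the size of the intersection of their term sets divided by the size of their union. A minimal sketch, with hypothetical term lists (returning 0.0 for two empty documents is a convention chosen here):

```python
def jaccard_coefficient(terms_a, terms_b):
    """|A intersect B| / |A union B| over the sets of terms of two documents."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0  # convention for two empty documents
    return len(a & b) / len(a | b)


# Hypothetical term lists from two networking abstracts.
score = jaccard_coefficient(
    ["router", "packet", "delay"],
    ["packet", "delay", "queue"],
)
```

Two of the four distinct terms are shared, so the coefficient is 0.5. Like cosine similarity, it sees only surface terms, so two documents expressing the same idea in different words score low.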
8.1. Test bed for document similarity illustration
The articles are taken randomly from an open source journal of DOAJ. All these articles are domain-specific, i.e. from the networks domain.
8.2. Document Pre-processing
The text documents are pre-processed before finding the similarity score. General pre-processing steps such as document parsing, stemming and stop word removal are performed to obtain the bag of terms from the input. The metrics are then applied to obtain the similarity score, as dealt with in the following section.
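The pre-processing steps can be sketched as follows. The stop word list and the suffix-stripping stemmer are deliberately tiny stand-ins chosen for illustration; the paper does not state which stop list or stemmer was used, and a real system would more likely use an established stemmer such as Porter’s.

```python
import re
from collections import Counter

# Assumed, deliberately small stop word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "are", "to", "for"}
# Assumed naive suffix list; not a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_terms(text):
    """Parse text into lowercase tokens, drop stop words, stem, and count."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(naive_stem(t) for t in tokens if t not in STOP_WORDS)


bag = bag_of_terms("The routers are routing packets")
```

The resulting bag of terms maps each stemmed term to its frequency, which is the input the term-based similarity metrics operate on.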
8.3. Text Similarity
In order to find the similarity between any two text documents, the well-known cosine similarity metric is used (eq. 12). This metric does not involve any reasoning. It is computed merely from term frequencies in the documents after pre-processing steps such as stop word removal and stemming. The empirical evaluation of the metric is shown in Table 1.
Documents      Cosine similarity
Doc 1 and 2    0.79237
Doc 1 and 3    0.576001
Doc 1 and 4    0.35279
Doc 1 and 5    0.650906
Doc 1 and 6    0.368296
Table 1. Cosine Similarity score for documents
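$\cos(d_1, d_2) = \frac{d_1 \cdot d_2}{\lVert d_1 \rVert \, \lVert d_2 \rVert}$ - eq. 12
where $d_1$ and $d_2$ are the term-frequency vectors of the two documents.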
The similarity measures obtained via this method lie between 0 and 1. A document, Doc 1, is compared with Doc 2, 3, 4, 5 and 6 to obtain the similarity measure between them. Stop word removal and stemming are performed prior to similarity detection.
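A minimal sketch of cosine similarity over term-frequency vectors is shown below. It uses the standard definition (dot product divided by the product of the vector norms); the term counts are hypothetical and the function is not the tool used to produce Table 1.

```python
import math
from collections import Counter


def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention when a document has no terms
    return dot / (norm_a * norm_b)


# Hypothetical term frequencies of two pre-processed documents.
doc_a = Counter({"packet": 2, "delay": 1})
doc_b = Counter({"packet": 1, "queue": 2})
similarity = cosine_similarity(doc_a, doc_b)
```

Because term frequencies are non-negative, the score always lies between 0 and 1, as stated above. Here only "packet" is shared, giving 2 / 5 = 0.4.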
The results obtained from similarity measures such as cosine similarity suggest that documents 1 and 2 possess the highest similarity, i.e. a higher percentage of plagiarism. But the expert opinion [Table 2] on analyzing the documents reveals that documents 1 and 4 are the most similar. This discrepancy is due to the lack of semantic analysis in those measures. Hence, the system is proposed to assess the semantic similarity between documents using reasoning techniques.
Documents      Expert’s opinion
Doc 1 and 2    20%
Doc 1 and 3    15%
Doc 1 and 4    50%
Doc 1 and 5    10%
Doc 1 and 6    20%
Table 2. Expert’s opinion for documents similarity
8.4. Reasoning based similarity detection
Considering the above issue, the article semantics are analyzed in identifying the similarity score among the documents. Reasoning-based similarity detection is used for finding the similarity score between two text documents [Archana. V et al, (2010) communicated]. In this methodology, the ontology and WordNet are used in evaluating the similarity score. The similarity scores between documents involving semantics are shown in Table 3. The results obtained are similar to the expert opinion about document similarity. Hence this semantic approach performs better in document similarity detection. This methodology can be used in assessing article quality involving document evaluation.
Documents      Similarity score
Doc 1 and 2    17.8942%
Doc 1 and 3    13.7247%
Doc 1 and 4    20.3925%
Doc 1 and 5    3.43025%
Doc 1 and 6    17.7092%
Table 3. Document Similarity score involving semantics
The document similarity score so calculated is used in plagiarism detection, thereby contributing to the article quality. The article quality is determined based on its originality, which contributes towards the journal IF calculation.
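The agreement between the three measures can be checked directly from the values reported in Tables 1 to 3. The sketch below only re-ranks those reported values; the dictionary keys are the numbers of the documents compared against Doc 1, and the helper name is hypothetical.

```python
# Scores of Doc 1 against Doc 2..6, copied from Tables 1, 2 and 3.
cosine = {2: 0.79237, 3: 0.576001, 4: 0.35279, 5: 0.650906, 6: 0.368296}
expert = {2: 20, 3: 15, 4: 50, 5: 10, 6: 20}
semantic = {2: 17.8942, 3: 13.7247, 4: 20.3925, 5: 3.43025, 6: 17.7092}


def ranking(scores):
    """Document numbers ordered from most to least similar to Doc 1.
    Ties keep their table order because Python's sort is stable."""
    return sorted(scores, key=scores.get, reverse=True)


cosine_rank = ranking(cosine)
expert_rank = ranking(expert)
semantic_rank = ranking(semantic)
```

Cosine similarity ranks Doc 2 first and Doc 4 last, while the expert and the semantic approach both rank Doc 4 first; with ties kept in table order, the semantic ranking coincides with the expert ranking for all five documents.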
9. Conclusion
The proposed system of calculating the journal impact factor from author quality and article quality analysis therefore offers a more productive and practical view of the impact of a journal in the research area of interest, which is reasonably agreeable. The proposed metrics address the issues stated above and will be of interest to the entire research community, providing a quality-based impact metric for assessing a journal’s contribution.
10. Future work
In assessing author quality, the same author appearing under different names needs to be identified, because in DBLP the same author is found under different names, which needs to be resolved. The author quality measure improves if the author name issue is resolved. The knowledge transition of an author in the article can be identified via various semantic analyses along with the development of a knowledge base. The individual author’s contribution towards publications can be tracked, which measures ‘author citation’.
Article quality assessment can be improved by
adding content analysis, semantic analysis and
knowledge transfer methods. The citation measures
can be improved by providing the levels of hierarchy
for authors for sharing the credits.
The system could be further improved, as it currently does not involve any reasoning over DBLP classes. The fuzzy values given for contributions such as book, article, proceedings, in proceedings and in collections lack a reasoning-based approach. Also, while dealing with DBLP, the positions of the main authors of the seed article who have collaborated with other authors in other contributions are not accounted for.
From the citation perspective, the levels of hierarchy
among authors in social networks can be tracked to
share the article impact. This tracking involves social
network analysis, more precisely knowledge network
analysis, and requires wide coverage that includes
more research domains.