
Journal Impact Factor: A Measure of Quality or Popularity?

Deepika J.

Dept. of CSE

Anna University, Chennai-25

[email protected]

Mahalakshmi G.S.

Dept. of CSE

Anna University, Chennai-25

[email protected]

[email protected]

1. Abstract

In the quest for quality, researchers often prefer to cite articles from journals with a higher Impact Factor. Until now, the Journal Impact Factor has been measured as a popularity index, i.e. only the number of citations to a journal's articles determines the impact factor, whatever the domain and regardless of the quantum of research progressed so far. However, citing has gradually become a matter of convenience rather than of quality, because many journals have obtained higher visibility through channels such as open access. Given this dilution of citation practice, it is more important to assess a journal by its quality than by its circulation and visibility. Here, we suggest two parameters, author and article content, for measuring journal quality. The originality of the article, along with other evaluation measures, and the research quantum of the article’s author(s) contribute to the quality of a research paper, which accumulates into the journal quality.

Keywords: Impact Factor, Author Quality, Article Originality,

Plagiarism Detection, Metric

2. Introduction

Journal popularity is often mistaken for quality, because journals are assessed for their impact on the basis of their popularity, i.e. citation counts across years. This popularity cannot be accepted as journal quality. Popularity reflects how much peers like or dislike the journal and is measured from the number of citations over a period of years, whereas quality is a function of various components such as article content, author quality and semantic richness.

In practice, the impact factor of a journal is calculated as a measure reflecting the average number of citations to articles published in that journal. The current-year impact factor of a journal is the average number of citations received in the current year per paper published in that journal during the two preceding years. The problem is that citations alone determine the impact of an article, neglecting all other factors, and this has drawn several criticisms [Seglen PO, 1997]. Hence, there is a need to derive a quality-based metric for measuring the Journal Impact Factor (JIF).
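For reference, the two-year definition described above can be written compactly as follows. The symbols $C_y(k)$ (citations received in year $y$ by items published in year $k$) and $N_k$ (number of citable items published in year $k$) are notation introduced here for illustration and do not appear in the original.

```latex
\[
JIF_y \;=\; \frac{C_y(y-1) + C_y(y-2)}{N_{y-1} + N_{y-2}}
\]
```

Under this notation, a journal whose items from the two preceding years number 100 in total and were cited 150 times in year $y$ would have $JIF_y = 1.5$.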

An article is said to be of good quality if it possesses good semantics, i.e. its content is good. Even when an article is semantically sound, another significant aspect that has to be taken care of is its originality, which is where the plagiarism viewpoint comes in. Hence, article quality has to be determined from these two aspects. Article quality in turn leads to journal quality, i.e. the Journal Impact Factor.

In this paper, our focus is to derive a metric for measuring the journal impact factor from two major factors: author quality and article quality. The knowledge-based impact metric derived here differs considerably from the traditional impact factor and thereby implicitly encourages research journals to publish quality articles, which gradually makes research publications better standardized. The traditional impact factor calculation is also included so that the results can be compared.

3. Literature review

3.1 Impact Factor: “A Prestige Factor”

The impact factor has gained importance over time. Measuring impact as a quality measure by citation count alone is considered flawed [Garry Walter et al, 2003]. Garry Walter et al state the following quality determinants for assessing the value of an article in scientific research.

An article should

• add consequentially to the field through original, innovative research findings
• expand or challenge current knowledge
• open additional areas for new research activity
• open a pathway to advance knowledge
• integrate discoveries obtained by different approaches and/or disciplines through creative synthesis, thus bringing new insights to bear on original research, and
• reflect critically on research findings to guide the direction of further research

This common notion of impact factor has given rise to a range of derived impact metrics, such as the scope-adjusted impact factor, discipline-specific impact factor, immediacy index and cited half-life. The fundamental aim is to measure the true impact of journals that publish articles of scientific advancement. The underlying belief is that innovative research will be found only in original articles. However, prestige-based impact metrics were constrained to a specific domain [Hane, Paula J, 2002].

3.2. Other major factors governing Impact Factor

The major factors that decide a journal's impact factor include

• Hyperlinks

• Citation measures

• Time frame

• Frequency

• Speed statistics

3.2.1 Link analysis

For a web search engine such as Google, which indexes many pages, link analysis is performed with the PageRank algorithm [Sargolzaei P and F. Soleymani, 2010]. The PageRank of a page depends on the number and the PageRank of all pages that link to it. Link analysis for Closed Research Publications is based on the number of in-links to a document.
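The dependence of a page's rank on the ranks of the pages linking to it can be sketched as below. This is a minimal illustration assuming the standard power-iteration formulation with a damping factor of 0.85; the function name `pagerank`, the dictionary-based graph and the even redistribution of rank from pages without out-links are illustrative choices, not taken from the cited work.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to (hypothetical input format).
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no out-links spreads its rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Tiny example graph: C is linked from A and B, A is linked from C, B has no in-links.
ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["A"]})
```

In this toy graph the page with two in-links (C) ends up ranked above A, and B, which nobody links to, ranks lowest, which is the behaviour the in-link argument in the text relies on.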

3.2.2 Citations

In Open source Publications, link analysis is established via citations. An article with more citations cannot simply be concluded to be the best, because a larger number of authors increases the chance of self-citations [Antonia Andrade, Raúl González-Jonte and Juan Miguel Campanario, 2008].

The traditional impact factor does not take into account a journal’s relative articles, technical notes and reviews. Seglen has shown that the citation rate used in Thomson ISI is proportional to article length [Seglen PO, 1997]. This does not hold in many cases, as citation rate does not depend on article length. In addition, a journal’s impact factor should be decided by the impact created by its individual articles. Seglen [Seglen PO, 1997] stated the problems with assessing a journal based only on its citations and summarizes four ways in which traditional impact factor assessment is misleading:

1. Use of journal impact factors conceals the difference in article citation rates
2. Most times, impact factors are determined by technicalities unrelated to the scientific quality of their articles
3. High impact factors are likely in journals covering basic research areas
4. Article citation rates determine the journal impact factor and not vice versa

3.2.3. Time frames

In finding a journal’s impact factor, the time frame for citations is a major criterion. Initially, the time frame for impact factor calculation was taken as two years. In later years the ‘Eigenfactor’ evolved to address this issue [CDSR, 2009]. However, even in this methodology, self-citation counts are eliminated.

In addition, related indices such as the immediacy index, cited half-life and aggregate impact factor are considered. These indices suit the journal as a whole rather than an individual article of the journal. However, a newer index, the h-index, was coined by Hirsch [2009] to address this issue.

4. Problem Description: Journal Quality Assessment

The journal impact factor has nevertheless evolved into an accepted indicator of scientific value. From the traditional impact metric to the present day, the impact factor has been measured for journals to grade their value. Although the traditional formula has been revisited and revised, no common metric has been proposed to measure a journal from the knowledge perspective. The integrated knowledge-based impact metric proposed in this paper addresses this challenge by grading a journal for its quality. In the proposed idea, the impact of a journal is viewed as the cumulative impact of every article it contains (moderately similar to Thomson). The difference lies in how the impact of each article is assessed: by quality, not by popularity or citations. In other words, every article’s impact is viewed from two perspectives: author quality and article quality. In this approach, the individual authors of the article are given importance, as the quality of every author contributes to the quality of the respective article. Article quality also carries considerable weight, so better semantics and document evaluation are emphasized in analyzing it. The two viewpoints in Figure 1, author quality and article quality, are discussed briefly in the following sections.

Figure 1 Journal Quality Assessment System

Each parameter selected for measuring the journal impact factor is integrated to form a metric for assessing journal quality. Individual authors are given importance because the quality of the author contributes to the quality of the article, which is a reasonable basis for assessing the quality of a journal. Article quality is analyzed with emphasis on better semantics and document evaluation.

5. Author quality: An imperative analysis

5.1 Authorship merits

For an article to be of good quality, its author(s) should possess sound knowledge of the research field. Certain factors influence author quality in an article (Figure 1). An author writing alone or with co-authors tends to represent a concept differently, owing to the influence of one author over another. Every aspect of an author, from writing style to contributions in publications, differs among authors.

The following factors are important in considering authorship analysis.

• Author contribution
• Author bonding
• Originality in writings
• Author writing style including policies, techniques and blunders
• Methodology of knowledge transfer
• Clarity in work
• Semantic significance

5.2 Impact of Author quality

In the proposed approach a journal’s impact factor is determined by measuring the impact of its constituent articles (eq. 1). Every article’s impact is assessed from its author quality and article quality, and is taken to be inversely related to the number of articles published in that issue: the more articles in the issue, the less the impact of a particular article. This idea is interesting but still debatable. Here, we assume that the more articles a journal contains, the harder it is for an article to create impact unless its quality is commendably above that of its fellow articles in that issue. With this in mind, the article’s impact is calculated as a combination of author quality and article quality, as in eq. 2.


$$IF_{J(OSRP)} = \frac{\sum IF_{Art}}{T_{\#art}}$$ - eq. 1

$$IF_{Art} = \frac{Q_{Au\_factor} + Q_{Art}}{T_{\#art\_issue}}$$ - eq. 2
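A minimal sketch of how eq. 1 and eq. 2 fit together, reading eq. 2 as the sum of the author quality factor and the article quality divided by the number of articles in the issue, and eq. 1 as the sum of the article impacts divided by the number of articles considered. The function and parameter names are hypothetical and chosen only for this illustration.

```python
def article_impact(q_au_factor, q_art, articles_in_issue):
    """Eq. 2 (as read here): (author quality factor + article quality) / articles in the issue."""
    if articles_in_issue <= 0:
        raise ValueError("articles_in_issue must be positive")
    return (q_au_factor + q_art) / articles_in_issue


def journal_impact(article_impacts):
    """Eq. 1 (as read here): sum of the article impacts / number of articles."""
    if not article_impacts:
        raise ValueError("at least one article impact is required")
    return sum(article_impacts) / len(article_impacts)


# Illustrative numbers only, not values from the paper's corpus.
impacts = [
    article_impact(q_au_factor=0.6, q_art=0.4, articles_in_issue=10),
    article_impact(q_au_factor=1.2, q_art=0.3, articles_in_issue=5),
]
journal_if = journal_impact(impacts)
```

With these made-up inputs the two article impacts are 0.1 and 0.3, so the journal-level value is their mean, 0.2. The division by the issue size is what makes an article in a large issue count for less.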

Although a journal has a diversity of authors, every individual author is worthy of assessment and should therefore be graded to assess the quality of the journal. We refer to this as the author quality factor. It is a measure of how much impact the author creates alongside his/her co-authors in that article. Author Co-citation Analysis (ACA) and Web Co-link Analysis (WCA) are examined as “sister” techniques in the related fields of bibliometrics and webometrics [Alesia Zuccala, 2006]. With these techniques, author centrality can be assessed. An author with high centrality gains more popularity, mainly due to his/her quality. Credits are given to authors based on the positions they occupy in the article [Teja Tscharntke et al., 2007]. The proposed approach (Figure 1) combines sequence-determines-credit (SDC) and percent-contribution-indicated (PCI) (SDC+PCI).

5.2.1 “Sequence-determines-credit” approach (SDC)

The sequence of authors should reflect the declining importance of their contribution, as suggested by previous authors. Authorship order reflects only relative contribution, whereas evaluation committees often need quantitative measures. We suggest that the first author should get credit for the whole impact (impact factor), the second author half and the third a third; if the author count exceeds three, the Impact Factor is evenly distributed among all the authors of that particular article.

5.2.2 Percent-contribution-indicated approach (PCI)

There is a trend to detail each author’s contribution (following requests of several journals). This should also be used to establish the quantified credit. In this paper, we combine both approaches, and the output measures obtained from the author quality assessment system are based on credit and central tendencies. Through author quality analysis, the individual authors of any Research Publications article or journal are thus assessed for their quality, which contributes strongly to the overall quality assessment of Research Publications.

5.3 Author quality assessment metrics

The Author quality factor for the entire article is obtained as,

$$Q_{Au\_factor} = \frac{\sum Q_{Au}}{T_{\#au}}$$ - eq. 3

The individual author quality is obtained using the formula,

$$Q_{Au} = \frac{Cit_{Spread} + Cre_{Spread}}{2}$$ - eq. 4

Citations play a major role in author quality analysis. Hence the author quality is calculated based on the citation spread and the credit spread, which are given below:

$$Cit_{Spread} = \frac{T_{\#cit}}{T_{\#au}}$$ - eq. 5

$$Cre_{Spread} = \frac{Cr_{val\_avg}}{R_{\#art}}$$ - eq. 6

$$Cr_{val\_avg} = \frac{Sem\_Capital + Cr_{val}}{2}$$ - eq. 6.1

Citation spread is the ratio of total citations to the number of authors of the article. Credit spread is the ratio of the average credit value to the total references of the article. The average credit value (eq. 6.1), in turn, combines the position-based author credit with the semantic content of the article. The semantic content contributed by the author is measured by semantic capital, which is a combination of sole author credit, local cohesion and global cohesion. Section 6 discusses this further.
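The chain from eq. 3 to eq. 6.1 can be sketched as a handful of small functions. This is an illustration under the reading given in the text (citation spread = citations / authors, credit spread = average credit value / references, author quality = mean of the two spreads, author quality factor = mean over the article's authors); all function names and the sample numbers are hypothetical.

```python
def citation_spread(total_citations, num_authors):
    """Eq. 5: total citations of the article divided by its number of authors."""
    return total_citations / num_authors


def average_credit_value(semantic_capital, position_credit):
    """Eq. 6.1: mean of semantic capital and position-based credit."""
    return (semantic_capital + position_credit) / 2


def credit_spread(avg_credit_value, num_references):
    """Eq. 6: average credit value divided by the article's reference count."""
    return avg_credit_value / num_references


def author_quality(cit_spread, cre_spread):
    """Eq. 4: mean of citation spread and credit spread."""
    return (cit_spread + cre_spread) / 2


def author_quality_factor(author_qualities):
    """Eq. 3: mean of the individual author qualities of one article."""
    return sum(author_qualities) / len(author_qualities)


# Made-up article: 12 citations, 3 authors, 25 references.
cit = citation_spread(12, 3)                      # 4.0
avg_credit = average_credit_value(0.5, 1.0)       # 0.75
cre = credit_spread(avg_credit, 25)               # 0.03
q_first_author = author_quality(cit, cre)         # 2.015
q_factor = author_quality_factor([q_first_author, 1.0])
```

Note that citation spread is shared by all authors of an article, while credit spread differs per author, so it is the credit side that separates co-authors of the same paper.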


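$$
\begin{aligned}
Sem\_Capital &= \frac{Cr_{sole\_au} + coh_{loc} + coh_{glo}}{T_{\#au}} \\
Cr_{sole\_au} &= \sum_{pub\_cat=1}^{5} wt(pub\_cat) \times T_{\#sole\_art} \\
coh_{loc} &= \sum_{pub\_cat=1}^{5} wt(pub\_cat) \times T_{\#seed\_co\_art} \\
coh_{glo} &= \sum_{pub\_cat=1}^{5} wt(pub\_cat) \times T_{\#co\_art} \\
pub\_cat &= \{proceeding,\ collections,\ article,\ thesis,\ book\} \\
weight &= \{0.4,\ 0.2,\ 0.8,\ 0.6,\ 1\}
\end{aligned}
$$ - eq. 7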
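A minimal sketch of eq. 7, assuming the category weights are read as proceeding 0.4, collections 0.2, article 0.8, thesis 0.6 and book 1, and that each of the three terms is a weighted count of publications per category (sole-authored, co-authored with authors of the seed article, co-authored with anyone else). The names `PUB_WEIGHTS`, `weighted_count` and `semantic_capital` are hypothetical.

```python
# Assumed reading of the category weights in eq. 7.
PUB_WEIGHTS = {
    "proceeding": 0.4,
    "collections": 0.2,
    "article": 0.8,
    "thesis": 0.6,
    "book": 1.0,
}


def weighted_count(counts):
    """Sum of (category weight x number of publications) over the categories given."""
    return sum(PUB_WEIGHTS[category] * n for category, n in counts.items())


def semantic_capital(sole_counts, local_counts, global_counts, num_authors):
    """(sole author credit + local cohesion + global cohesion) / number of authors."""
    total = (
        weighted_count(sole_counts)
        + weighted_count(local_counts)
        + weighted_count(global_counts)
    )
    return total / num_authors


# Made-up publication record for one author of a two-author seed article.
capital = semantic_capital(
    sole_counts={"article": 2, "book": 1},       # 1.6 + 1.0
    local_counts={"proceeding": 3},              # 1.2
    global_counts={"article": 1, "thesis": 1},   # 0.8 + 0.6
    num_authors=2,
)
```

Here the weighted counts add up to 5.2, giving a semantic capital of 2.6 for this hypothetical author.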

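The position-based credit value for an author, based on the author's position in the article, is calculated from,

$$
Cr_{val} =
\begin{cases}
\dfrac{Im_{avg}}{Pos_{au}}, & \text{if } Pos_{au} \in \{1, 2\} \\[2ex]
\dfrac{Im_{avg}}{Pos_{au} + 1}, & \text{if } Pos_{au} \in \{3\} \\[2ex]
\dfrac{Im_{avg}}{T_{\#au}}, & \text{if } Pos_{au} > 3
\end{cases}
$$ - eq. 8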
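The position-based credit can be sketched as follows. This illustration follows the shares described in Section 7.1 (100%, 50% and 25% for the first three positions) and assumes that an article with more than three authors gives every author an equal share, which is one possible reading of the last case of eq. 8. The function name and arguments are hypothetical.

```python
def position_credit(impact, position, num_authors):
    """Credit for the author at `position` (1-based) of an article with `num_authors` authors."""
    if not 1 <= position <= num_authors:
        raise ValueError("position must lie between 1 and num_authors")
    if num_authors > 3:
        # Assumed reading: with more than three authors the impact is shared equally.
        return impact / num_authors
    if position in (1, 2):
        # First author gets the whole impact, second author half of it.
        return impact / position
    # Third author: impact / (position + 1), i.e. a quarter.
    return impact / (position + 1)


# Credits for a three-author article with impact 1.0.
credits = [position_credit(1.0, pos, 3) for pos in (1, 2, 3)]
```

For a three-author article this yields the 100%, 50%, 25% split, while `position_credit(1.0, 2, 5)` gives each of five authors one fifth.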

6. Article quality: an overview on its significance

Every published article tends to present an innovative idea based on research. Article quality is assessed from two major viewpoints: document semantics and document originality. In the former, the semantics of the article are analyzed; in the latter, the article is checked to ensure that it is not a copy of another. Both aspects pave the way for article quality assessment.

The quality of a document can be determined from its organization, semantic richness, originality, innovation, clarity, grammar, creativity, expandability, significance, relevance, etc. In this approach, document quality is assessed on its originality, i.e. the document is viewed from the plagiarism viewpoint. The similarity score of the document against the documents in the corpus is measured, thereby contributing to the article originality.

Document evaluation is necessary because an article that is not a copy of other documents may still lack good semantics. The relevance of the title to the article, the relevance of the references and the relevance of the citations are considered the major factors in document evaluation. Document originality plays a vital role in research as ‘Plagiarism’ becomes more common. Although many tools are available to detect similarity between documents, they are not appreciably good at finding similarity that can only be established through reasoning. In this work, a plagiarism detection tool, ‘IdeaPlag’, is highlighted, which incorporates an ontology and WordNet to find the similarity score between two documents. A domain-specific ontology is created offline and used in document evaluation and plagiarism detection. The semantics of the article in relation to those in the ontology are considered for document evaluation. The semantic similarity between documents is determined via the IdeaPlag methodology.

6.1 Article quality metrics

In the impact factor metric, article quality is based mainly on the similarity score. Since the open source tool Plagiarism Finder is used for comparative analysis, the formula for the similarity score comprises three similarity measures: Plagiarism Finder, IdeaPlag and a domain expert. In addition, the semantic similarity depends on the length of the article (long, medium or short, as given by its number of words) and on its number of references. The ratings of the three entities are collected and the semantic similarity of an article is calculated using eq. 9. In the absence of a domain expert, the semantic similarity may be computed by eq. 10.

The higher the similarity measure, the lower the article quality. Therefore, the article quality is inversely proportional to the similarity score, as given by eq. 11.

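$$Sim\_Score = \frac{\left(\sum Sim_{PF} + \sum Sim_{Idea} + Sim_{XP}\right) / \left(3 \times T_{\#res}\right)}{W_{\#art} + R_{\#art}}$$ - eq. 9

$$Sim\_Score = \frac{\left(\sum Sim_{PF} + \sum Sim_{Idea}\right) / \left(2 \times T_{\#res}\right)}{W_{\#art} + R_{\#art}}$$ - eq. 10

$$Q_{Art} = \frac{1}{Sim\_Score}$$ - eq. 11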
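The similarity score and the resulting article quality can be sketched as below. This is only one possible reading of eq. 9 to eq. 11: the ratings of the available raters (Plagiarism Finder, IdeaPlag and, optionally, a domain expert) are averaged over the raters and over the compared results, the average is divided by the sum of the article's word count and reference count, and the article quality is the reciprocal of the score. All names, the argument layout and the sample values are assumptions made for illustration.

```python
def similarity_score(sim_pf, sim_idea, num_words, num_references, sim_expert=None):
    """Assumed reading of eq. 9 (with expert) and eq. 10 (without expert).

    sim_pf and sim_idea hold one similarity value per compared result.
    """
    if len(sim_pf) != len(sim_idea) or not sim_pf:
        raise ValueError("need one PF and one IdeaPlag value per compared result")
    num_results = len(sim_pf)
    ratings = [sum(sim_pf), sum(sim_idea)]
    if sim_expert is not None:
        ratings.append(sim_expert)
    # Average over the raters and over the compared results.
    mean_rating = sum(ratings) / (len(ratings) * num_results)
    # Normalise by article length and reference count.
    return mean_rating / (num_words + num_references)


def article_quality(score):
    """Eq. 11: article quality is the reciprocal of the similarity score."""
    if score <= 0:
        raise ValueError("similarity score must be positive")
    return 1.0 / score


# Made-up article: two compared results, 3000 words, 20 references.
with_expert = similarity_score([0.2, 0.4], [0.3, 0.5], 3000, 20, sim_expert=0.3)
without_expert = similarity_score([0.2, 0.4], [0.3, 0.5], 3000, 20)
quality = article_quality(with_expert)
```

Whatever the exact normalisation, the key property is the reciprocal in eq. 11: a higher similarity score always lowers the article quality.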


7. Results and discussions

7.1. Test bed

Research articles from both an open source and a closed source journal are considered for the analysis. The DOAJ (Directory of Open Access Journals) is taken as a source, and 58 articles from an open access journal are handpicked to form the input document corpus of Open source publications. Similarly, 30 articles are extracted from the ‘Computer Networks’ journal of Elsevier for analysis [http://www.Elsevier.com//]. Relevant articles are retrieved with the Google search engine for comparison. Every relevant article is tested against its respective seed article to obtain the semantic score. The Digital Bibliography and Library Project (DBLP) is used as the test bed for authorship analysis [www.informatik.uni-tier.de]. CiteSeer [citeseerx.ist.psu.edu] is used to obtain the self and total citations of each article individually.

This is achieved with a handcrafted ontology of about 900 concepts in the computer networks domain, built using Protégé 4.1 Beta. A snapshot of the ‘computer networks’ ontology is shown in Figure 2.

Figure 2. Snapshot of ‘Computer Networks’ Ontology in Protégé 4.1 Beta

Given a reference in IEEE format as input, the authors and the title are extracted. The author names are taken as input and their positions are identified. Credit values are given to the authors based on their positions in the article. For up to three authors, the first author receives 100% of the impact factor, the second author 50% and the third author 25%. For articles with more than three authors, the impact factor is shared equally among the authors.

7.2. Empirical results

7.2.1. Deriving Citation Measures

The citation counts are obtained as external and self citations from CiteSeer. The counts are zero if the article is not found in CiteSeer or if it does not have any citations. The author quality is calculated from the citation spread and the credit spread. While citation spread is a measure of the total citations of the article (Figure 3a and 3b), credit spread (Figure 4a and 4b) measures the effort shared by every author in constructing the article. Therefore, credit spread is measured for every author: overall, 161 authors across the input set for the Open access journal and 100 authors for the Closed source journal.

Figure 3a. Total Citations per article of input article corpus of Open source journal

Figure 3b. Total Citations per article of input article corpus ‘Computer network’ journal

Figure 4a. Author Credit Spread of authors of input article corpus of Open source journal

Figure 4b. Author Credit Spread of authors of input article corpus ‘Computer network’ journal

Credit spread depends on the average credit value (Figure 5a and 5b). In other words, we perceive two classes of credit for every author: one based on their position (Figure 6a and 6b), and the other based on their contribution to research.

As discussed earlier (eq. 7), we refer to the latter as semantic capital (Figure 7a and 7b). For this, we take the DBLP open repository as a measure. The author names are tracked and submitted to the DBLP archive to find the author's contribution, i.e. sole-author contribution, contribution with local co-authors (by local we mean co-authors of the seed article) and contribution with global co-authors. From the DBLP open repository, the categories and frequencies of contributions such as articles, proceedings, collections, PhD theses, master's theses and books are harvested.

Figure 5a. Average Credit Value for authors of input article corpus of Open source journal

Figure 5b. Average Credit Value for authors of input article corpus ‘Computer network’ journal

Figure 6a. Position based credit measure for authors of input article corpus of Open source journal

Figure 6b. Position based credit measure for authors of input article corpus ‘Computer network’ journal

Figure 7a. Semantic capital of authors of input article corpus of Open source journal

Figure 7b. Semantic capital of authors of input article corpus ‘Computer network’ journal

The author’s reputation is a major concern here. The credits are pre-determined, i.e. a book carries a higher weightage than a journal article, a journal article carries more weightage than conference proceedings and so on, because of the research content in the contribution and, of course, its readership coverage.

7.2.2. Author Quality Analysis

The author quality factor (eq. 3) contributes to the impact of the individual article, as shown in Figure 8a and 8b.

Figure 8a. Author Quality Factor for input article corpus of Open source journal

Figure 8b. Author Quality Factor for input article corpus ‘Computer network’ journal

The author quality factor is a collective, representative measure of author quality (Figure 9a and 9b) for a given article. It is the average of every constituent author’s quality, which is determined by citation spread (Figure 10a and 10b) and credit spread (Figure 4a and 4b).

Figure 9a. Individual Author Quality Analysis for input article corpus of Open source journal

Figure 9b. Individual Author Quality Analysis for input article corpus ‘Computer network’ journal

Figure 10a. Citation Spread for input article corpus of Open source journal

Figure 10b. Citation Spread for input article corpus ‘Computer network’ journal

7.2.3 Article Quality Analysis

Article quality can be assessed via various factors, as discussed in section 6. To determine article originality through plagiarism, a plagiarism detection mechanism, ‘Idea Plagiarism’, is applied and the results are compared with the online tool ‘Plagiarism Finder’ and with an ‘Expert value’.

In addition, we have assessed the similarity score

(Figure 11a and 11b) of the input document corpus.

Figure 11a. Similarity Score of input article corpus of Open source journal in 2 tools and expert analysis

Figure 11b. Similarity Score of input article corpus ‘Computer network’ journal in 2 tools and expert analysis

The relation between article quality and the article's own references is shown in Figure 13a and 13b. This shows that an article with more references cannot be concluded to be of good quality: the quality of an article may be poor even if it has many references, i.e. bibliometric analysis alone is not enough to grade article quality.

Figure 13a. Relation between Article Quality and references for input article corpus of Open source journal

Figure 13b. Relation between Article Quality and references for input article corpus ‘Computer network’ journal

Figure 14a. Article Impact for input article corpus of Open source journal

Figure 14b. Article Impact for input article corpus ‘Computer network’ journal

7.2.4 Analysis of proposed impact metric

The impact of the 58 individual articles is shown in Figure 14a, and that of the 30 articles in Figure 14b. The impact factor for Open source publications, contributed by 58 selected articles spread over the past 10 years of the journal's history, is calculated as 0.466214. This can be compared with the traditional impact factor calculation, Total citations / Total articles, which gives 117/58 = 2.017 for the Open source articles. From a journal quality perspective this value looks impressive but is quite misleading [The Thomson Reuters Impact Factor, 1994]. For comparison, the ‘Computer Networks’ journal of Elsevier has a 5-year Impact Factor of 1.610. A well-known DOAJ journal (an upcoming computer network journal) has certainly created less impact than ‘Computer Networks’ of Elsevier, which is the most established journal in its domain. Therefore, rating the open access journal at 2.017 using the traditional impact measure is hard to accept. [Figure 15]
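The arithmetic behind the traditional measure can be sketched as follows. This is only an illustration of the citations-per-article ratio quoted above; the function name is hypothetical, and the value of the proposed metric is copied from the text rather than recomputed.

```python
def traditional_impact_factor(total_citations: int, total_articles: int) -> float:
    """Citations-per-article ratio used as the traditional impact measure."""
    if total_articles <= 0:
        raise ValueError("total_articles must be positive")
    return total_citations / total_articles


# Figures reported in the text for the Open source corpus.
open_source_if = traditional_impact_factor(117, 58)

# Value reported for the proposed metric; taken from the text, not recomputed.
proposed_if = 0.466214

print(f"traditional: {open_source_if:.3f}, proposed: {proposed_if:.3f}")
```

The ratio depends only on citation counts, which is why it cannot separate a widely visible journal from a high-quality one.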

Figure 15. Journal IF for open and closed source publications: a comparative analysis

The above observation shows that the Journal IF for Elsevier's ‘Computer Networks’ journal is measured to be 2.103, whereas the traditional impact factor, which is based only on the citation factor, is 1.021 [http://www.elsevier.com/wps/find/journaldescription.cws_home/505606/description#description]. The proposed methodology therefore yields a higher IF value than the existing citation-only measure, which is reasonably agreeable. For the Open Access journal, the impact found using the traditional method is high (2.017) compared with the measure obtained from the proposed methodology (0.466). The traditional value is less agreeable, as an Open Access journal of this standing would not be expected to possess an IF of 2.017: it is implausible for a less renowned journal to have the higher IF. Hence the proposed methodology seems to be a better fit for assessing a journal's IF. [Figure 15]

8. Evaluation of Idea Plagiarism Detection: an Exclusive Comparison for Semantic Similarity

Various tools and techniques have evolved for finding the similarity score of two articles. Specific models such as the Boolean model, the vector space model and the probabilistic model exist for determining the similarity score between documents. Similarly, various metrics are available, such as cosine similarity, the Jaccard coefficient, Hsim [Yong Zhang and Ke Deng, 2010], ontology-based similarity metrics [Sridevi U.K and Nagaveni N, 2010], matching average, the Dice coefficient, and dot products and term weight calculations [Mi Islita.com, 2006]. For comparative analysis, two well-known metrics, cosine similarity and the Jaccard coefficient, are computed for the documents. The similarity measures obtained in this way do not involve reasoning.
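As an illustration of the set-based metrics named above, the Jaccard and Dice coefficients can be sketched over the sets of terms in two documents. The term sets below are invented for the example and are not taken from the test corpus; the function names are likewise hypothetical.

```python
def jaccard(terms_a: set, terms_b: set) -> float:
    """Jaccard coefficient: shared terms divided by all distinct terms."""
    if not terms_a and not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)


def dice(terms_a: set, terms_b: set) -> float:
    """Dice coefficient: twice the shared terms over the sum of set sizes."""
    if not terms_a and not terms_b:
        return 0.0
    return 2 * len(terms_a & terms_b) / (len(terms_a) + len(terms_b))


# Hypothetical term sets for two short documents.
doc_a = {"network", "routing", "protocol", "latency"}
doc_b = {"network", "routing", "throughput"}

print(jaccard(doc_a, doc_b), dice(doc_a, doc_b))
```

Both coefficients look only at which terms overlap, so two documents that express the same idea in different words score low. This is the sense in which such measures involve no reasoning.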

8.1. Test bed for document similarity illustration

The articles are taken at random from an Open Source journal listed in DOAJ. All of these articles belong to a single domain, namely networks.

8.2. Document Pre-processing

The text documents are pre-processed before the similarity score is computed. The general pre-processing steps, namely document parsing, stemming and stop word removal, are carried out to obtain the bag of terms from the input. The metrics are then applied to compute the similarity score, as discussed in the following section.
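The pre-processing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the stop word list is a small hand-picked sample, and the suffix-stripping stemmer is a crude stand-in for a proper stemming algorithm. The paper does not say which stop word list or stemmer was actually used.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop word list, not the one used in the paper.
STOP_WORDS = frozenset(
    {"a", "an", "and", "are", "as", "by", "for", "in", "is", "of", "the", "to"}
)

# Assumption: naive suffix stripping stands in for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_terms(text: str) -> Counter:
    """Parse text into lowercase tokens, drop stop words, stem, and count."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)


bag = bag_of_terms("Routing protocols route packets; routed packets reach the network.")
print(bag)
```

The resulting counter maps each stemmed term to its frequency, which is the input that term-frequency metrics such as cosine similarity expect.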

8.3. Text Similarity

In order to find the similarity between any two text documents, the well-known cosine similarity metric is used (eq. 12). This metric does not involve any reasoning: it is computed purely from term frequencies in the documents, after pre-processing steps such as stop word removal and stemming. The empirical evaluation of the metric is shown in Table 1.
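A minimal sketch of cosine similarity over term-frequency vectors is given here, using the standard definition (the dot product of the two vectors divided by the product of their Euclidean norms). The term counts are invented for illustration, and the function name is an assumption rather than the authors' implementation.

```python
import math
from collections import Counter


def cosine_similarity(tf_a: Counter, tf_b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    shared_terms = tf_a.keys() & tf_b.keys()
    dot = sum(tf_a[t] * tf_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical term frequencies after stop word removal and stemming.
doc_a = Counter({"network": 2, "rout": 1, "packet": 1})
doc_b = Counter({"network": 1, "packet": 1, "latency": 1})

score = cosine_similarity(doc_a, doc_b)
print(f"{score:.4f}")
```

Because term frequencies are non-negative, the score always lies between 0 and 1, with 1 reached only when the two vectors point in the same direction.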


Document pair    Cosine similarity
Doc 1 and 2      0.79237
Doc 1 and 3      0.576001
Doc 1 and 4      0.35279
Doc 1 and 5      0.650906
Doc 1 and 6      0.368296

Table 1. Cosine similarity scores for documents

- eq.12

The similarity values obtained via this method lie between 0 and 1. One document, Doc 1, is compared with each of Docs 2, 3, 4, 5 and 6 to obtain the similarity measure for each pair. Stop word removal and stemming are performed prior to similarity detection.

The results obtained from similarity measures such as cosine similarity indicate that documents 1 and 2 possess the highest similarity, i.e. the highest percentage of plagiarism. However, the expert opinion [Table 2] on analyzing the documents reveals that documents 1 and 4 are the most similar. The discrepancy is due to the lack of semantic analysis in those measures. Hence, a system is proposed to assess the semantic similarity between documents using reasoning techniques.

Document pair    Expert-rated similarity
Doc 1 and 2      20%
Doc 1 and 3      15%
Doc 1 and 4      50%
Doc 1 and 5      10%
Doc 1 and 6      20%

Table 2. Expert’s opinion on document similarity
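The disagreement between Table 1 and Table 2 can be made explicit by ranking the document pairs under each measure. The sketch below uses only the values reported in the two tables; the variable and function names are illustrative.

```python
# Values copied from Table 1 (cosine similarity) and Table 2 (expert opinion, %).
cosine_scores = {
    "Doc 1 and 2": 0.79237,
    "Doc 1 and 3": 0.576001,
    "Doc 1 and 4": 0.35279,
    "Doc 1 and 5": 0.650906,
    "Doc 1 and 6": 0.368296,
}
expert_percent = {
    "Doc 1 and 2": 20,
    "Doc 1 and 3": 15,
    "Doc 1 and 4": 50,
    "Doc 1 and 5": 10,
    "Doc 1 and 6": 20,
}


def rank_pairs(scores: dict) -> list:
    """Return document pairs ordered from most to least similar."""
    return sorted(scores, key=scores.get, reverse=True)


cosine_ranking = rank_pairs(cosine_scores)
expert_ranking = rank_pairs(expert_percent)

print("cosine:", cosine_ranking)
print("expert:", expert_ranking)
```

Under cosine similarity the pair Doc 1 and 4 ranks last, while the expert ranks it first, which is the mismatch the text attributes to the absence of semantic analysis.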

8.4. Reasoning based similarity detection

To address the above issue, the semantics of the articles are analyzed when computing the similarity score among the documents. Reasoning-based similarity detection is used to find the similarity score between two text documents [Archana. V et al, (2010) communicated]. In this methodology, an ontology and WordNet are used in evaluating the similarity score. The similarity scores between documents involving semantics are given in Table 3. The results are close to the expert opinion on document similarity. Hence the semantic approach appears better suited to document similarity detection. This methodology can be used in assessing article quality through document evaluation.

Document pair    Semantic similarity score
Doc 1 and 2      17.8942%
Doc 1 and 3      13.7247%
Doc 1 and 4      20.3925%
Doc 1 and 5      3.43025%
Doc 1 and 6      17.7092%

Table 3. Document similarity scores involving semantics
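One way to quantify how close each measure comes to the expert opinion is the mean absolute difference across the five document pairs. The sketch below uses the values from Tables 1, 2 and 3. Scaling the cosine scores by 100 to put them on a percentage scale is an assumption made here for comparison only; the paper does not state how the two scales relate.

```python
# Pair order: Doc 1 and 2, 1 and 3, 1 and 4, 1 and 5, 1 and 6.
expert = [20.0, 15.0, 50.0, 10.0, 20.0]                    # Table 2, percent
cosine = [0.79237, 0.576001, 0.35279, 0.650906, 0.368296]  # Table 1, range 0..1
semantic = [17.8942, 13.7247, 20.3925, 3.43025, 17.7092]   # Table 3, percent


def mean_absolute_difference(xs: list, ys: list) -> float:
    """Average of |x - y| over paired values."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Assumption: cosine scores are rescaled to percent before comparison.
cosine_gap = mean_absolute_difference(expert, [100 * c for c in cosine])
semantic_gap = mean_absolute_difference(expert, semantic)

# Index of the pair each measure rates as most similar (2 is Doc 1 and 4).
top_semantic = semantic.index(max(semantic))
top_expert = expert.index(max(expert))

print(f"cosine gap: {cosine_gap:.2f}, semantic gap: {semantic_gap:.2f}")
```

Under this assumption the semantic scores are closer to the expert ratings on average, and both agree that Doc 1 and 4 is the most similar pair, although the semantic score for that pair is still well below the expert's 50%.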

The document similarity score calculated in this way is used in plagiarism detection, thereby contributing to the assessment of article quality. Article quality is determined based on originality, which in turn contributes to the Journal IF calculation.

9. Conclusion

The proposed system calculates the journal impact factor with a view to author quality and article quality analysis. It thereby gives a more productive and practical account of the impact of a journal in its research area, which is reasonably agreeable. The proposed metrics address the issues stated above and should be of interest to the entire research community, offering a quality-based impact metric for assessing a journal's contribution.

10. Future work

In assessing author quality, the same author appearing under different names needs to be identified. In dealing with DBLP, the same author was found under different names, and this needs to be resolved. The author quality measure improves if the author name issue is resolved. The knowledge transition of an author within an article can be identified via various semantic analyses along with the development of a knowledge base. The individual author's contribution to publications can be tracked to measure ‘author citation’.

Article quality assessment can be improved by adding content analysis, semantic analysis and knowledge transfer methods. The citation measures can be improved by providing levels of hierarchy among authors for sharing the credit.

The system could be improved further, as it does not currently involve any reasoning over DBLP classes. The fuzzy values given for contributions such as book, article, proceedings, inproceedings and incollection lack a reasoning-based approach. Also, while dealing with DBLP, the positions of the main authors of the seed article in their other contributions with other co-authors are not accounted for.

From a citation perspective, the levels of hierarchy of authors in social networks can be tracked to share the article impact. This level of tracking involves social network analysis, more precisely knowledge network analysis, and covers a wide area including more research domains.

11. References

“The Thomson Reuters Impact Factor”, June 20, 1994.

Alonso, S., Cabrerizo, F. J., Herrera-Viedma, E., & Herrera, F., “h-Index: A review focused in its variants, computation and standardization for different scientific fields”, Elsevier, Journal of Informetrics, 3(4), 273–289, April 2009.

Antonia Andrade, Raúl González-Jonte and Juan Miguel Campanario, “Journals that increase their impact factor at least fourfold in a few years: The role of journal self-citations”, Scientometrics, Vol. 80, No. 2 (2009) 517–530, jointly published by Akadémiai Kiadó, Budapest and Springer, May 19, 2008.

Archana V., Bagyalakshmi V., Preethi P. and Mahalakshmi G.S., “Role of WordNet in improving Plagiarism detection in Research Publications”, Int. Journal of Artificial Intelligence, Technomathematics Research Foundations, April 2010 (communicated).

Archana V., Bagyalakshmi V., Preethi P. and Mahalakshmi G.S., “Comparison of NLP based techniques to detect Plagiarism in Research Publication”, Int. Journal of Computer Science and Applications, Technomathematics Research Foundations, April 2010 (communicated).

Benno Stein, Moshe Koppel and Efstathios Stamatatos, “Plagiarism Analysis, Authorship Identification and Near-Duplicate Detection”, PAN'07, ACM SIGIR Forum, Vol. 41, No. 2, pp 70-71, December 2007.

Dirk Schoonbaert and Gilbert Roelants, “Citation analysis for measuring the value of scientific publications: quality assessment tool or comedy of errors?”, Institute of Tropical Medicine, Antwerp, Belgium, volume 1, no. 6, pp 739-752, December 1996, first online Aug 2007.

DOAJ web resource, http://www.doaj.org/doaj?func=loadTempl&templ=about

Garcia, “An Information Retrieval Tutorial on Cosine Similarity Measures, Dot Products and Term Weight Calculations”, www.miislita.com

Garry Walter, Sidney Bloch, Glenn Hunt and Karen Fisher, “Counting on citations: a flawed way to measure quality”, MJA 2003; 178: 280-281, Volume 178, 17 March 2003.

Hirsch, J. E., “An index to quantify an individual's scientific research output”, Proc Natl Acad Sci USA, Vol. 102, No. 46, pp. 16569–16573, Springerlink, Scientometrics, 29 September 2009.

Lutz Bornmann and Hans-Dieter Daniel, “The citation speed index: A useful bibliometric indicator to add to the h index”, Elsevier, Journal of Informetrics 4 (2010) 444–446, 25 March 2010.

Paula J. Hane, “The prestige (factor) is gone”, Information Today, Wednesday, May 1, 2002. http://www2.hawaii.edu/~jacso/extra.

Plagiarism Finder 1.2.2, m4-software.com

Roth and Jean-Philippe Cointet, “Social and semantic coevolution in knowledge networks”, Dynamics of Social Networks, ScienceDirect, Elsevier, Volume 32, Issue 1, Pages 16-29, January 2010.

Sargolzaei P and F. Soleymani, “PageRank Problem, Survey And Future Research Direction”, International Mathematical Forum, no. 19, 937–956, 5 March 2010.

Seglen PO, “Why the impact factor of journals should not be used for evaluating research”, BMJ, Vol. 314, No. 7079, 498-502, 15 Feb 1997.

Sendhil Kumar S and Mahalakshmi G.S, “Context-based citation retrieval”, Int. J. Networking and Virtual Organisations, Elsevier, Inderscience Publications, vol. 8, No. 1/2, 2011.

SriDevi U.K and Nagaveni N, “Ontology based Similarity measure in document ranking”, Second International Conference on Intelligent Human-Machine Systems and Cybernetics, IEEE, pp 125-128, vol 1, no 26, (0975-8887), 2010.

Suber P, “Thinking about prestige, quality and Open Access”, SPARC Open Access Newsletter, Sept. 2008.

Teja Tscharntke, Michael E Hochberg, Tatyana A Rand, Vincent H Resh, and Jochen Krauss, “Author Sequence and Credit for Contributions in Multiauthored Publications”, PLoS Biology, 16 Jan 2007.

WordNet web resource site, http://wordnet.princeton.edu/