Measuring Open Access - Current State of the Art by Éric Archambault, D.Phil. President and CEO, Science-Metrix and 1science ESSS 2015 - Leuven

Upload: 1science

Post on 17-Aug-2015


2

  • The OA revolution is firmly in motion.
  • Librarians can play a key role.
  • Traditional role – percolation
  • New role – diffusion
  • Researchers too – be fruitful and multiply.
  • OA in academic publications: a complex beast.
  • Understanding the OA universe is key to useful measurement.

BACKGROUND

3

  • Definitions
  • Key vantage points
  • Measuring OA
  • Results
  • Conclusions

SYNOPSIS

4

Budapest Open Access Initiative (2002): “The literature that should be freely accessible online is that which scholars give to the world without expectation of payment. Primarily, this category encompasses their peer-reviewed journal articles, but it also includes any unreviewed preprints that they might wish to put online for comment or to alert colleagues to important research findings. There are many degrees and kinds of wider and easier access to this literature. By open access to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.”

DEFINITIONS

5

Green OA: The main idea behind Green is self-archiving. Archiving can be done in institutional and thematic repositories.

Gold OA: The main idea behind Gold is that journal publishers make papers available. There are Gold journals (cover-to-cover), but also Gold papers published in subscription-based journals (a.k.a. “hybrid journals”).

DEFINITIONS

6

Complexity of OA definition and measurement, notably due to:

  • Embargoes
  • Transiency
  • Rights of all kinds (to self-archive, to crawl, to recombine, to use commercially, etc.)
  • Discoverability

DEFINITIONS

7

Rules of involvement in OA

VANTAGE POINTS

8

  • Directories/registries of repositories
  • Directory of OA journals

VANTAGE POINTS

9

Free or open source repository software: DSpace, EPrints, Archimede, DAITSS, Dienst, Enterprise-Wide Digital Repository and Archive, ETD-db, eXtensible Text Framework, Fedora, Greenstone, Invenio, IRPlus, Keystone Digital Library Suite, MOAI, Omeka, OPUS, PubMan, WEKO, PeerLibrary

Source: http://oad.simmons.edu/oadwiki/Free_and_open-source_repository_software

VANTAGE POINTS

10

Key repositories: arXiv.org (the mothership), PubMed Central, Europe PubMed Central

Aggregators: OpenAIRE, BASE, CORE

A typical repository: hosted by the Umeå universitet Library

VANTAGE POINTS

11

Despite, or perhaps because of, all the sources of OA available, it is very difficult to measure the availability of OA. Here we are concerned with the availability of peer-reviewed articles published in scholarly journals. Why? This is what policies and mandates are preoccupied with.

BIBLIOMETRICS – PROPORTION OF OA

12

Bottom-up measurement: One would have to harvest all the sources available and de-duplicate results. The main problem is how to determine reliably that items:

  (1) were published (as opposed to an un-submitted manuscript);
  (2) are peer-reviewed;
  (3) answer the “so what?” question (you found there were 14,325,678 papers, so what?).

BIBLIOMETRICS – PROPORTION OF OA
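The de-duplication step of a bottom-up harvest can be sketched as follows. This is a minimal illustration, not the method used in the study: the record fields (`source`, `doi`, `title`, `year`), the sample records and the matching rule (DOI first, then normalised title plus year) are all assumptions made for the example.

```python
import re
import unicodedata


def normalise_title(title: str) -> str:
    """Lower-case, strip accents and punctuation so near-identical titles collide."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()


def record_key(record: dict) -> str:
    """Prefer the DOI as the identity of a paper; fall back to title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    return "title:{}|{}".format(normalise_title(record.get("title", "")), record.get("year"))


def deduplicate(records: list) -> list:
    """Keep the first record seen for each key, preserving harvest order."""
    seen = {}
    for record in records:
        seen.setdefault(record_key(record), record)
    return list(seen.values())


# Hypothetical harvested records: two copies of one paper and two of another.
harvested = [
    {"source": "repository A", "doi": "10.1234/ABC", "title": "Measuring OA", "year": 2014},
    {"source": "repository B", "doi": "10.1234/abc", "title": "Measuring O.A.", "year": 2014},
    {"source": "repository C", "doi": None, "title": "Étude de cas", "year": 2013},
    {"source": "repository D", "doi": "", "title": "Etude de cas", "year": 2013},
]
unique = deduplicate(harvested)
print(len(unique))
```

Note that de-duplication only yields a count of distinct items; it does not establish that an item was published or peer-reviewed, which is the harder problem raised on this slide.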

13

Top-down measurement: One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers. Main problems:

  (1) there is no such database;
  (2) it is extremely tedious to check all of them;
  (3) how do you actually do that?

BIBLIOMETRICS – PROPORTION OF OA

14

Top-down measurement – sampling: Considering the enormous task at hand, most authors have resorted to using sampling and search engines.

  • Harnad and team sampled articles from the Web of Science.
  • Björk and team sampled articles from Scopus.
  • Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines.

BIBLIOMETRICS – PROPORTION OF OA
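The sampling approach can be sketched as below: draw a simple random sample from the bibliographic database and check each sampled paper for free availability. Everything here is illustrative; the function name `estimate_oa_share`, the toy population and the stand-in availability check are assumptions, and a real check would query repositories or a discovery service.

```python
import random


def estimate_oa_share(population_ids, sample_size, is_available, seed=0):
    """Draw a simple random sample without replacement and return the OA share in it."""
    rng = random.Random(seed)
    sample = rng.sample(population_ids, sample_size)
    found = sum(1 for paper_id in sample if is_available(paper_id))
    return found / sample_size


# Toy population: every identifier divisible by 4 is pretended to be freely available,
# so the true share is 25%. The lambda stands in for a real availability check.
population = list(range(100_000))
share = estimate_oa_share(population, 5_000, lambda paper_id: paper_id % 4 == 0)
print(round(share, 3))
```

The estimate lands near the true share, and how near depends on the sample size, which is what the margin of error later in the deck quantifies.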

15

Dealing with search engines:

  • Use user-friendly meta-search engines such as DuckDuckGo or DogPile.
  • Try to stay below the radar using mainstream search engines.
  • Neither solution feels remotely comfortable.

The other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science).

BIBLIOMETRICS – PROPORTION OF OA

16

Divergence from the real measure is due to:

  • Capacity to design an instrument that provides the true value (a function of recall and retrieval precision)
  • Capacity to increase statistical significance through large samples

SAMPLING AND METROLOGY

17

A true positive (tp), in the present case, is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse of these: a false positive is a paper reported by the instrument that is not actually freely available, and a false negative is an OA paper that the instrument fails to find. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows:

$$\text{Retrieval Precision} = \frac{tp}{tp + fp}$$

Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:

$$\text{Recall} = \frac{tp}{tp + fn}$$

Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects the limits of the instrument. The adjustment made in the previous study is based on the following formula:

$$\text{Adjustment} = \frac{tp + fn}{tp + fp}$$

SAMPLING AND METROLOGY
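The three quantities on this slide can be computed as in the sketch below. The calibration counts and the measured share are invented for illustration, and multiplying the measured share by the adjustment score is an assumption about how the score is "applied to recalibrate the results". Note that the adjustment equals retrieval precision divided by recall.

```python
def retrieval_precision(tp: int, fp: int) -> float:
    """Share of the instrument's positive results that are truly OA."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of the truly OA papers that the instrument finds."""
    return tp / (tp + fn)


def adjustment(tp: int, fp: int, fn: int) -> float:
    """Ratio of truly OA papers (tp + fn) to papers reported as OA (tp + fp)."""
    return (tp + fn) / (tp + fp)


# Hypothetical calibration counts, not taken from the study.
tp, fp, fn = 450, 10, 150

precision_value = retrieval_precision(tp, fp)
recall_value = recall(tp, fn)
adjustment_value = adjustment(tp, fp, fn)

# Assumed use: scale a measured OA share (also hypothetical) by the adjustment score.
measured_share = 0.30
adjusted_share = measured_share * adjustment_value
print(round(precision_value, 3), round(recall_value, 3), round(adjusted_share, 3))
```

With high precision and lower recall, as in these made-up counts, the instrument undercounts OA and the adjusted share is higher than the measured one.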

18

Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population is finite and known (which is the case here, as the population from which we are sampling is the Scopus database), where the population size (N) is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (set for a confidence level of 0.95 in the study), ME is calculated as follows:

$$ME = Z \sqrt{\frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}} + \frac{0.5}{n}$$

SAMPLING AND METROLOGY
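A small sketch of the margin-of-error calculation follows, assuming the formula is read as a normal approximation with a finite population correction, (N - n) / (N - 1), and a continuity correction, 0.5 / n. The default Z of 1.96 is the usual critical value for a 0.95 confidence level and is an assumption here, as are the example figures.

```python
import math


def margin_of_error(p: float, n: int, N: int, z: float = 1.96) -> float:
    """Margin of error for a proportion p from a sample of n drawn from a population of N."""
    finite_population_correction = (N - n) / (N - 1)
    standard_error = math.sqrt(p * (1 - p) / n * finite_population_correction)
    continuity_correction = 0.5 / n
    return z * standard_error + continuity_correction


# Hypothetical figures: half of a 20,000-paper sample found in OA, population of one million.
me = margin_of_error(p=0.5, n=20_000, N=1_000_000)
print(round(me, 4))
```

When the sample is the whole population (n = N), the square-root term vanishes and only the continuity correction remains, which is a quick sanity check on the finite population correction.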

19

The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.

MEASURING THE % OF OA PAPERS
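As an illustration of what harvesting repository metadata can involve, the sketch below parses a response in the OAI-PMH `ListRecords` format with Dublin Core metadata, which many institutional repositories expose. The slide does not say which protocol the Science-Metrix engine uses, so OAI-PMH is an assumption, and the sample response, the host `repository.example.org` and the function name `parse_list_records` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_list_records(xml_text: str) -> list:
    """Extract the OAI identifier, first title and all dc:identifier values per record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        oai_id = record.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        titles = [e.text for e in record.iterfind(".//dc:title", NS) if e.text]
        links = [e.text for e in record.iterfind(".//dc:identifier", NS) if e.text]
        records.append({
            "oai_id": oai_id,
            "title": titles[0] if titles else None,
            "identifiers": links,
        })
    return records


# Made-up response from a hypothetical repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample paper</dc:title>
          <dc:identifier>https://repository.example.org/1.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

records = parse_list_records(SAMPLE)
print(records[0]["title"])
```

A real harvester would also fetch the responses over HTTP and follow resumption tokens across pages; those steps are left out to keep the sketch self-contained.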

20

For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals’ ISSN, E-ISSN and names from Scopus to the relevant records in the sample.

MEASURING THE % OF OA PAPERS
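The journal-matching step might look like the sketch below. It matches on ISSN and E-ISSN only (the name matching mentioned on the slide is omitted), and the `gold_since` field, the journal list and the sample papers are hypothetical stand-ins for data that would come from DOAJ, PubMed Central and Scopus.

```python
def normalise_issn(issn) -> str:
    """Reduce an ISSN to its eight significant characters, e.g. '1234-567x' -> '1234567X'."""
    return "".join(ch for ch in (issn or "").upper() if ch.isdigit() or ch == "X")


def build_gold_index(gold_journals: list) -> dict:
    """Map every known ISSN and E-ISSN to the first year the journal counts as Gold."""
    index = {}
    for journal in gold_journals:
        for issn in (journal.get("issn"), journal.get("eissn")):
            key = normalise_issn(issn)
            if key:
                index[key] = journal["gold_since"]
    return index


def is_gold_paper(paper: dict, index: dict) -> bool:
    """A paper is Gold if its journal is in the index and was Gold in the publication year."""
    for issn in (paper.get("issn"), paper.get("eissn")):
        gold_since = index.get(normalise_issn(issn))
        if gold_since is not None and paper["year"] >= gold_since:
            return True
    return False


# Hypothetical journal list and sampled papers.
gold_index = build_gold_index([
    {"name": "Journal of Examples", "issn": "1234-5678", "eissn": "8765-432X", "gold_since": 2005},
])
sample = [
    {"issn": "12345678", "eissn": None, "year": 2010},   # Gold journal, Gold year
    {"issn": None, "eissn": "8765-432x", "year": 2003},  # same journal, before it was Gold
    {"issn": "1111-2222", "eissn": None, "year": 2010},  # journal not in the Gold list
]
gold_share = sum(is_gold_paper(p, gold_index) for p in sample) / len(sample)
print(round(gold_share, 3))
```

Normalising the ISSN before lookup matters because the same identifier can appear with or without a hyphen and in either case across sources.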

21

Evolution of the proportion of OA scientific papers, as measured in April 2013 and April 2014, 1996–2013

RESULTS

[Figure: % of papers available in OA (y-axis, 0 to 60) by year (x-axis, 1994 to 2014). Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013.]

22

Translation of OA availability between April 2013 and April 2014

RESULTS

[Figure: % of papers available in OA (y-axis, 0 to 60) by year (x-axis, 2003 to 2012). Series: Adjusted OA April 2014; Adjusted OA April 2013. Exponential trendlines: y = 2E-21 e^(0.0234x), R² = 0.976; y = 3E-17 e^(0.0186x), R² = 0.9473.]

23

OA backfilling between April 2013 and April 2014 of papers published in 1996–2011

RESULTS

[Figure: Number of OA papers backfilled between April 2013 and April 2014 (y-axis, 0 to 120,000) by year (x-axis, 1994 to 2014). Exponential trendline: y = 2E-112 e^(0.1335x), R² = 0.9976.]

24

Growth of the number of papers available in OA, as measured in April 2014, 1996–2013

RESULTS

[Figure: Number of papers in OA (y-axis, 0 to 1,000,000) by year (x-axis, 1994 to 2014). Series: Adjusted OA; Measured OA. Exponential trendline: y = 2E-73 e^(0.09x), R² = 0.9971.]
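The exponents of the fitted trendlines can be turned into more intuitive figures. The sketch below assumes the trendlines on slides 23 and 24 read y = a·e^(kx) with k = 0.1335 and k = 0.09 respectively (decimal points restored from the extracted text), with x being the publication year; the function names are illustrative.

```python
import math


def annual_growth_rate(k: float) -> float:
    """Year-over-year growth implied by y = a * exp(k * x), with x in years."""
    return math.exp(k) - 1


def doubling_time(k: float) -> float:
    """Number of years for y = a * exp(k * x) to double."""
    return math.log(2) / k


# Exponents as read from the slides (an assumption about the garbled source).
fits = {
    "papers available in OA (slide 24)": 0.09,
    "OA papers backfilled (slide 23)": 0.1335,
}
for label, k in fits.items():
    print(f"{label}: {annual_growth_rate(k):.1%} per year, "
          f"doubling every {doubling_time(k):.1f} years")
```

Under these readings, an exponent of 0.09 corresponds to growth of roughly 9% per publication year, or a doubling about every 7.7 years.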

25

Scientific impact of OA and non-OA papers published in 1996–2011

RESULTS

[Figure: Average of relative citations (ARC, 1 = world average) (y-axis, 0.0 to 1.8) by year (x-axis, 1996 to 2011). Series: OA; All Papers; Not OA.]

26

Impact contest by OA type, by field, 2009–2011. N.B. Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

Field | 1st place (Type, ARC) | 2nd place (Type, ARC) | 3rd place (Type, ARC) | Least impact (Type, ARC)
Agriculture, Fisheries & Forestry | Green OA 1.57 | Other OA 1.32 | Not OA 0.88 | Gold OA 0.51
Biology | Other OA 1.37 | Green OA 1.30 | Not OA 0.69 | Gold OA 0.47
Biomedical Research | Other OA 1.23 | Green OA 1.10 | Gold OA 0.91 | Not OA 0.65
Built Environment & Design | Green OA 1.56 | Other OA 1.28 | Not OA 0.86 | Gold OA 0.19
Chemistry | Other OA 1.34 | Green OA 1.28 | Not OA 0.95 | Gold OA 0.34
Clinical Medicine | Other OA 1.56 | Green OA 1.08 | Gold OA 0.64 | Not OA 0.63
Communication & Textual Studies | Other OA 1.82 | Green OA 1.51 | Not OA 0.66 | Gold OA 0.73
Earth & Environmental Sciences | Green OA 1.46 | Other OA 1.26 | Gold OA 0.98 | Not OA 0.72
Economics & Business | Green OA 1.46 | Other OA 1.30 | Not OA 0.71 | Gold OA 0.22
Enabling & Strategic Technologies | Green OA 1.68 | Other OA 1.53 | Not OA 0.83 | Gold OA 0.52
Engineering | Green OA 1.84 | Other OA 1.38 | Not OA 0.83 | Gold OA 0.55
General Arts, Humanities & Social Sciences | Green OA 1.74 | Other OA 1.49 | Not OA 0.73 | Gold OA 0.13
General Science & Technology | Green OA 2.56 | Other OA 2.24 | Gold OA 0.69 | Not OA 0.11
Historical Studies | Green OA 2.37 | Other OA 1.61 | Not OA 0.76 | Gold OA 0.37
Information & Communication Technology | Green OA 1.62 | Other OA 1.36 | Gold OA 0.76 | Not OA 0.69
Mathematics & Statistics | Green OA 1.35 | Other OA 1.11 | Not OA 0.75 | Gold OA 0.67
Philosophy & Theology | Green OA 1.72 | Other OA 1.63 | Gold OA 0.86 | Not OA 0.72
Physics & Astronomy | Green OA 1.43 | Gold OA 1.18 | Other OA 1.04 | Not OA 0.73
Psychology & Cognitive Sciences | Other OA 1.35 | Green OA 1.31 | Not OA 0.66 | Gold OA 0.59
Public Health & Health Services | Other OA 1.38 | Green OA 1.30 | Not OA 0.76 | Gold OA 0.71
Social Sciences | Green OA 1.54 | Other OA 1.44 | Not OA 0.76 | Gold OA 0.52
Visual & Performing Arts | Green OA 2.16 | Other OA 1.86 | Not OA 0.77 | Gold OA 0.29
Total | Green OA 1.53 | Other OA 1.36 | Not OA 0.76 | Gold OA 0.61
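The table can be tallied to see in how many fields each OA type lands in the "Least impact" column. The sketch below encodes only that column, as read from the slide for the 22 fields (the Total row is excluded); the punctuation in field names is restored by assumption, and the helper name `fields_where_last` is illustrative.

```python
# OA type shown in the "Least impact" column for each field, 2009-2011 (from the slide).
LEAST_IMPACT = {
    "Agriculture, Fisheries & Forestry": "Gold OA",
    "Biology": "Gold OA",
    "Biomedical Research": "Not OA",
    "Built Environment & Design": "Gold OA",
    "Chemistry": "Gold OA",
    "Clinical Medicine": "Not OA",
    "Communication & Textual Studies": "Gold OA",
    "Earth & Environmental Sciences": "Not OA",
    "Economics & Business": "Gold OA",
    "Enabling & Strategic Technologies": "Gold OA",
    "Engineering": "Gold OA",
    "General Arts, Humanities & Social Sciences": "Gold OA",
    "General Science & Technology": "Not OA",
    "Historical Studies": "Gold OA",
    "Information & Communication Technology": "Not OA",
    "Mathematics & Statistics": "Gold OA",
    "Philosophy & Theology": "Not OA",
    "Physics & Astronomy": "Not OA",
    "Psychology & Cognitive Sciences": "Gold OA",
    "Public Health & Health Services": "Gold OA",
    "Social Sciences": "Gold OA",
    "Visual & Performing Arts": "Gold OA",
}


def fields_where_last(oa_type: str, least_impact: dict = LEAST_IMPACT) -> list:
    """Return the fields in which the given OA type sits in the last column."""
    return sorted(field for field, last in least_impact.items() if last == oa_type)


not_oa_last = fields_where_last("Not OA")
gold_last = fields_where_last("Gold OA")
print(len(not_oa_last), len(gold_last))  # 7 15
```

The seven fields where "Not OA" comes last are the ones the conclusion refers to when it says that publishing in subscription-based journals without self-archiving is the worst possible strategy.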

27

  • OA is a fast-moving phenomenon.
  • It is also quite complex to understand and to measure.
  • Uptake of OA is limited by heterogeneity and challenges in discovery.

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects:

  • Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available.
  • “Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve.

CONCLUSION

29

  • On average, openly accessible papers have a decidedly greater impact.
  • In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, Gold journals surpass in impact publishing in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.
  • It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.

CONCLUSION

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers

Visit Science-Metrix to learn about our evaluation and measurement activities

THANK YOU


2

The OA revolution is firmly in motion Librarians can play a key role

Traditional role ndash percolation New role ndash diffusion

Researchers too ndash be fruitful and multiply OA in academic publications complex beast Understanding the OA universe is key to useful measurement

BACKGROUND

3

Definitions Key vantage points Measuring OA Results Conclusions

SYNOPSIS

4

Budapest Open Access Initiative (2002) ldquoThe literature that should be freely accessible online is that which scholars give to the world without expectation of payment Primarily this category encompasses their peer-reviewed journal articles but it also includes any unreviewed preprints that they might wish to put online for comment or to alert colleagues to important research findings There are many degrees and kinds of wider and easier access to this literature By open access to this literature we mean its free availability on the public internet permitting any users to read download copy distribute print search or link to the full texts of these articles crawl them for indexing pass them as data to software or use them for any other lawful purpose without financial legal or technical barriers other than those inseparable from gaining access to the internet itself The only constraint on reproduction and distribution and the only role for copyright in this domain should be to give authors control over the integrity of their work and the right to be properly acknowledged and citedrdquo

DEFINITIONS

5

Green OA The main idea behind Green is self-archiving Archiving can be done in institutional and thematic repositories

Gold OA The main idea behind Gold is that journal publishers make papers available There are Gold journals (cover-to-cover) but also Gold papers published in subscription-based journals (aka ldquohybrid journalsrdquo)

DEFINITIONS

6

Complexity of OA definition and measurement notably due to

Embargoes Transiency Rights of all kind (to self-archive to crawl to recombine to use commercially etc) Discoverability

DEFINITIONS

7

Rules of involvement in OA

VANTAGE POINTS

8

Directoriesregistries of repositories Directory of OA journals

VANTAGE POINTS

9

Free or open source repository software DSpace EPrints Archimede DAITSS Dienst Enterprise-Wide Digital Repository and Archive ETD-db eXtensible Text Framework Fedora Greenstone Invenio IRPlus Keystone Digital Library Suite MOAI Omeka OPUS PubMan WEKO PeerLibrary

Source httpoadsimmonseduoadwikiFree_and_open-source_repository_software

VANTAGE POINTS

10

Key repositories arXivorg ndash the mothership PubMed Central Europe PubMed Central

Aggregators OpenAire BASE CORE

A typical repository hosted by the Umearing universitet Library

VANTAGE POINTS

11

Despite or perhaps because of all the sources of OA available it is very difficult to measure the availability of OA Here we are concerned about the availability of peer-reviewed articles published in scholarly journals Why ndash this is what policies and mandates are preoccupied with

BIBLIOMETRICS ndash PROPORTION OF OA

12

Bottom-up measurement One would have to harvest all the sources available and de-duplicate results The main problem is how to determine reliably that items

(1) were published (as opposed to an un-submitted manuscript) (2) are peer-reviewed (3) answer the ldquoso whatrdquo question (you found there were 14325678 papers so what)

BIBLIOMETRICS ndash PROPORTION OF OA

13

Top-down measurement One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers Main problems

(1) there is no such database (2) extremely tedious to check all of them (3) how do you actually do that

BIBLIOMETRICS ndash PROPORTION OF OA

14

Top-down measurement - Sampling Considering the enormous task at hand most authors have resorted to using sampling and search engines Harnad and team sampled articles from the Web of Science Bjoumlrk and team sampled articles from Scopus Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines

BIBLIOMETRICS ndash PROPORTION OF OA

15

Dealing with search engines Use user-friendly meta-search engines such as DuckDuckGo or DogPile Try to stay below the radar using mainstream search engines Neither solution feels remotely confortable

Other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science)

BIBLIOMETRICS ndash PROPORTION OF OA

16

Divergence from the real measure is due to Capacity to design instrument that provides true value (function of recall and retrieval precision) Capacity to increase statistical significance through large samples

SAMPLING AND METROLOGY

17

A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project A true negative (tn) is an article which is not available for free and is not found by the instrument False positives and negatives (fp and fn) are the converse of the later Retrieval precision also called positive predictive value provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows

Retrieval Precision = 119905119905119905119905+119891119905

Recall also called true positive rate or sensitivity is the capacity to correctly identify a large proportion of the positive records

Recall = 119905119905119905119905+119891119891

Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score which can then be applied to recalibrate the results to obtain a truer measure one that corrects the limits of the instrument The adjustment made in the previous study is based on the following formula

Adjustment = 119905119905+119891119891119905119905+119891119905

SAMPLING AND METROLOGY

18

Statistical precision can be assessed with the margin of error (ME) For a proportion (p) where the population is finite and known (which is the case here as the population from which we are sampling is the Scopus database) (N) is not systematically much larger than the sample size (n) and in which the values are discrete (for example papers are discrete as one does not publish one third of a paper) given a critical score Z (which will be set at 095 in the study) ME is calculated as follows

119872119872 = 119885 119905 1minus119905 119873minus119891119891 119873minus1

+ 05119891

SAMPLING AND METROLOGY

19

The harvesting engine developed by Science-Metrix searches specific sites including Scielo PubMed Central Research Gate and CiteSeerX It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR Finally and in addition a portion of the harvesting engine works in the cloud and searches for freely available papers

MEASURING THE OF OA PAPERS

20

For Gold Journal OA articles an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central This was done by matching journalsrsquo ISSN E-ISSN and names from Scopus to the relevant records in the sample

MEASURING THE OF OA PAPERS

21

Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014 1996ndash2013

RESULTS

0

5

10

15

20

25

30

35

40

45

50

55

60

1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014

o

f pa

pers

ava

ilabl

e in

OA

Adjusted OA April 2014Adjusted OA April 2013Measured OA April 2014Measured OA April 2013

22

Translation of OA availability between April 2013 and April 2014

RESULTS

y = 2E-21e00234x

Rsup2 = 0976

y = 3E-17e00186x

Rsup2 = 09473

0

5

10

15

20

25

30

35

40

45

50

55

60

2003 2004 2005 2006 2007 2008 2009 2010 2011 2012

o

f pa

pers

ava

ilabl

e in

OA

Adjusted OA April 2014

Adjusted OA April 2013

23

OA backfilling between April 2013 and April 2014 of papers published in 1996ndash2011

RESULTS

y = 2E-112e01335x

Rsup2 = 09976

0

20000

40000

60000

80000

100000

120000

1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014

Num

ber

of

OA p

aper

s ba

ckfil

led

betw

een

Apr

il 20

13 a

nd A

pril

2014

24

Growth of the number of papers available in OA as measured in April 2014 1996ndash2013

RESULTS

y = 2E-73e009x

Rsup2 = 09971

0

100000

200000

300000

400000

500000

600000

700000

800000

900000

1000000

1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014

Num

ber

of p

aper

s in

O

A

Adjusted OA

Measured OA

25

Scientific impact of OA and non-OA papers published in 1996ndash2011

RESULTS

00

02

04

06

08

10

12

14

16

18

1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011

Ave

rage

of re

lativ

e ci

tatio

ns

(ARC 1

= w

orld

ave

rgae

)

OAAll PapersNot OA

26

Impact contest by OA type by field 2009ndash2011 NB Here Gold refers to full-Gold journals not to Gold papers in hybrid journals

RESULTS

1st place 2nd place 3rd place Least impact

Type ARC Type ARC Type ARC Type ARCAgriculture Fisheries amp Forestry Green OA 157 Other OA 132 Not OA 088 Gold OA 051Biology Other OA 137 Green OA 130 Not OA 069 Gold OA 047Biomedical Research Other OA 123 Green OA 110 Gold OA 091 Not OA 065Built Environment amp Design Green OA 156 Other OA 128 Not OA 086 Gold OA 019Chemistry Other OA 134 Green OA 128 Not OA 095 Gold OA 034Clinical Medicine Other OA 156 Green OA 108 Gold OA 064 Not OA 063Communication amp Textual Studies Other OA 182 Green OA 151 Not OA 066 Gold OA 073Earth amp Environmental Sciences Green OA 146 Other OA 126 Gold OA 098 Not OA 072Economics amp Business Green OA 146 Other OA 130 Not OA 071 Gold OA 022Enabling amp Strategic Technologies Green OA 168 Other OA 153 Not OA 083 Gold OA 052Engineering Green OA 184 Other OA 138 Not OA 083 Gold OA 055General Arts Humanities amp Social Sciences Green OA 174 Other OA 149 Not OA 073 Gold OA 013General Science amp Technology Green OA 256 Other OA 224 Gold OA 069 Not OA 011Historical Studies Green OA 237 Other OA 161 Not OA 076 Gold OA 037Information amp Communication Technology Green OA 162 Other OA 136 Gold OA 076 Not OA 069Mathematics amp Statistics Green OA 135 Other OA 111 Not OA 075 Gold OA 067Philosophy amp Theology Green OA 172 Other OA 163 Gold OA 086 Not OA 072Physics amp Astronomy Green OA 143 Gold OA 118 Other OA 104 Not OA 073Psychology amp Cognitive Sciences Other OA 135 Green OA 131 Not OA 066 Gold OA 059Public Health amp Health Services Other OA 138 Green OA 130 Not OA 076 Gold OA 071Social Sciences Green OA 154 Other OA 144 Not OA 076 Gold OA 052Visual amp Performing Arts Green OA 216 Other OA 186 Not OA 077 Gold OA 029Total Green OA 153 Other OA 136 Not OA 076 Gold OA 061

Field

27

OA is a fast-moving phenomenon It is also quite complex to understand and to measure Uptake of OA limited by heterogeneity and challenges in discovery

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects

Organic growth as more publishers researchers and librarians increasingly make freshly published papers freely available ldquoBackfillingrdquo of already published papers by researchers and librarians and dis-embargoing of previously locked papers by publishers contribute to a translation of the availability curve

CONCLUSION

29

On average openly accessible papers have a decidedly greater impact

In 7 fields publishing in subscription-based journals and not self-archiving is the worst possible strategy In these fields Gold journals surpass in impact publishing in subscription-based journals with no self-archiving even if these Gold journals are much younger and less established

No longer adequate to publish and forget papers One has to actively market papers and think of post-publishing communication strategies Considering the high value of the knowledge contained in papers and their high public cost working to maximise diffusion and uptake is the least one can do

CONCLUSION

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers

Visit Science-Metrix to learn about our evaluation and measurement activities

THANK YOU

  • Slide Number 1
  • Background
  • SYNOPSIS
  • Definitions
  • Definitions
  • Definitions
  • Vantage Points
  • Vantage Points
  • Vantage Points
  • Vantage Points
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Sampling and METROLOGY
  • Sampling and METROLOGY
  • Sampling and METROLOGY
  • Measuring the of OA papers
  • Measuring the of OA papers
  • RESULTS
  • RESULTS
  • RESULTS
  • RESULTS
  • RESULTS
  • RESULTS
  • Conclusion
  • Conclusion
  • Conclusion
  • Thank you

3

Definitions Key vantage points Measuring OA Results Conclusions

SYNOPSIS

4

Budapest Open Access Initiative (2002) ldquoThe literature that should be freely accessible online is that which scholars give to the world without expectation of payment Primarily this category encompasses their peer-reviewed journal articles but it also includes any unreviewed preprints that they might wish to put online for comment or to alert colleagues to important research findings There are many degrees and kinds of wider and easier access to this literature By open access to this literature we mean its free availability on the public internet permitting any users to read download copy distribute print search or link to the full texts of these articles crawl them for indexing pass them as data to software or use them for any other lawful purpose without financial legal or technical barriers other than those inseparable from gaining access to the internet itself The only constraint on reproduction and distribution and the only role for copyright in this domain should be to give authors control over the integrity of their work and the right to be properly acknowledged and citedrdquo

DEFINITIONS

5

Green OA The main idea behind Green is self-archiving Archiving can be done in institutional and thematic repositories

Gold OA The main idea behind Gold is that journal publishers make papers available There are Gold journals (cover-to-cover) but also Gold papers published in subscription-based journals (aka ldquohybrid journalsrdquo)

DEFINITIONS

6

Complexity of OA definition and measurement notably due to

Embargoes Transiency Rights of all kind (to self-archive to crawl to recombine to use commercially etc) Discoverability

DEFINITIONS

7

Rules of involvement in OA

VANTAGE POINTS

8

Directoriesregistries of repositories Directory of OA journals

VANTAGE POINTS

9

Free or open source repository software DSpace EPrints Archimede DAITSS Dienst Enterprise-Wide Digital Repository and Archive ETD-db eXtensible Text Framework Fedora Greenstone Invenio IRPlus Keystone Digital Library Suite MOAI Omeka OPUS PubMan WEKO PeerLibrary

Source httpoadsimmonseduoadwikiFree_and_open-source_repository_software

VANTAGE POINTS

10

Key repositories arXivorg ndash the mothership PubMed Central Europe PubMed Central

Aggregators OpenAire BASE CORE

A typical repository hosted by the Umearing universitet Library

VANTAGE POINTS

11

Despite or perhaps because of all the sources of OA available it is very difficult to measure the availability of OA Here we are concerned about the availability of peer-reviewed articles published in scholarly journals Why ndash this is what policies and mandates are preoccupied with

BIBLIOMETRICS ndash PROPORTION OF OA

12

Bottom-up measurement One would have to harvest all the sources available and de-duplicate results The main problem is how to determine reliably that items

(1) were published (as opposed to an un-submitted manuscript) (2) are peer-reviewed (3) answer the ldquoso whatrdquo question (you found there were 14325678 papers so what)

BIBLIOMETRICS ndash PROPORTION OF OA

13

Top-down measurement One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers Main problems

(1) there is no such database (2) extremely tedious to check all of them (3) how do you actually do that

BIBLIOMETRICS ndash PROPORTION OF OA

14

Top-down measurement - Sampling Considering the enormous task at hand most authors have resorted to using sampling and search engines Harnad and team sampled articles from the Web of Science Bjoumlrk and team sampled articles from Scopus Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines

BIBLIOMETRICS ndash PROPORTION OF OA

15

Dealing with search engines Use user-friendly meta-search engines such as DuckDuckGo or DogPile Try to stay below the radar using mainstream search engines Neither solution feels remotely confortable

Other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science)

BIBLIOMETRICS ndash PROPORTION OF OA

16

Divergence from the real measure is due to Capacity to design instrument that provides true value (function of recall and retrieval precision) Capacity to increase statistical significance through large samples

SAMPLING AND METROLOGY

17

A true positive (tp), in the present case, is a paper known to be available in OA that is found by the harvesting instrument developed for the current project. A true negative (tn) is an article that is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse of the latter. Retrieval precision, also called positive predictive value, estimates how frequently the instrument finds correct positive results and is calculated as follows:

\[ \text{Retrieval Precision} = \frac{tp}{tp + fp} \]

Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:

\[ \text{Recall} = \frac{tp}{tp + fn} \]

Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects for the limits of the instrument. The adjustment made in the previous study is based on the following formula:

\[ \text{Adjustment} = \frac{tp + fn}{tp + fp} \]
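
The three ratios on this slide can be sketched in a few lines of Python. This is only an illustration: the counts and the measured share are hypothetical, and applying the adjustment as a simple multiplier on the measured share is an assumption about how the recalibration is done, not something the slide spells out.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # OA papers that the instrument found
    fp: int  # papers flagged as OA that are not actually OA
    fn: int  # OA papers that the instrument missed
    tn: int  # non-OA papers correctly left unflagged

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    def adjustment(self) -> float:
        # (tp + fn) / (tp + fp): papers truly in OA over papers reported as OA
        return (self.tp + self.fn) / (self.tp + self.fp)


# Hypothetical calibration counts, for illustration only.
calib = Confusion(tp=380, fp=20, fn=120, tn=480)

measured_share = 0.40  # hypothetical OA share reported by the instrument
# Assumption: the adjustment is applied as a multiplier on the measured share.
adjusted_share = measured_share * calib.adjustment()

print(calib.precision(), calib.recall(), calib.adjustment(), adjusted_share)
```

With these definitions the adjustment equals precision divided by recall, so an instrument that misses more OA papers than it wrongly flags pushes the adjusted figure above the measured one.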

SAMPLING AND METROLOGY

18

Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population is finite and known (which is the case here, as the population from which we are sampling is the Scopus database), where the population size (N) is not systematically much larger than the sample size (n), and where the values are discrete (papers are discrete, as one does not publish one third of a paper), given a critical score Z (which will be set at 0.95 in the study), ME is calculated as follows:

\[ ME = Z \sqrt{\frac{p(1 - p)}{n} \cdot \frac{N - n}{N - 1}} + \frac{0.5}{n} \]
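
As a rough sketch, the margin of error can be computed as below, reading the formula on this slide as the usual standard error of a proportion with a finite population correction, plus a continuity correction of 0.5/n. The sample size, population size and proportion are hypothetical. The value z = 1.96 is the conventional critical value for a 95% confidence level and is an assumption made for this example; the slide itself only gives the figure 0.95.

```python
import math


def margin_of_error(p: float, n: int, N: int, z: float) -> float:
    """Margin of error for a proportion p from a sample of n out of a population of N."""
    fpc = (N - n) / (N - 1)  # finite population correction
    standard_error = math.sqrt(p * (1 - p) / n * fpc)
    return z * standard_error + 0.5 / n  # 0.5 / n is the continuity correction


# Hypothetical inputs: 1,000 papers sampled from a population of 1,000,000.
me = margin_of_error(p=0.5, n=1000, N=1_000_000, z=1.96)
print(f"margin of error: {me:.4f}")
```

Because n appears in both terms, quadrupling the sample roughly halves the first term, which is the sense in which larger samples buy statistical precision.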

SAMPLING AND METROLOGY

19

The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.

MEASURING THE % OF OA PAPERS

20

For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
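
The matching step described above might look something like the following sketch. It is not the Science-Metrix implementation: the record layout, the field names (`issn`, `eissn`, `name`, `journal`, `year`) and in particular the `gold_since` year attached to each journal are hypothetical, introduced only to show how a journal match can be combined with the publication year.

```python
import re


def normalise_issn(issn: str) -> str:
    """Strip hyphens and spaces so '1234-5678' and '12345678' compare equal."""
    return re.sub(r"[^0-9Xx]", "", issn).upper()


def normalise_name(name: str) -> str:
    """Lower-case a journal title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def build_gold_index(gold_journals):
    """Map each ISSN, E-ISSN and normalised name to the year the journal became Gold."""
    index = {}
    for journal in gold_journals:
        keys = (
            normalise_issn(journal.get("issn", "")),
            normalise_issn(journal.get("eissn", "")),
            normalise_name(journal.get("name", "")),
        )
        for key in keys:
            if key:
                index[key] = journal["gold_since"]
    return index


def is_gold(paper, index) -> bool:
    """A paper counts as Gold if its journal matches and was Gold in the publication year."""
    keys = (
        normalise_issn(paper.get("issn", "")),
        normalise_issn(paper.get("eissn", "")),
        normalise_name(paper.get("journal", "")),
    )
    return any(k in index and index[k] <= paper["year"] for k in keys if k)


# Hypothetical data, for illustration only.
gold_list = [{"name": "Journal of Examples", "issn": "1234-5678", "eissn": "", "gold_since": 2005}]
sample = [
    {"journal": "J. Examples", "issn": "12345678", "eissn": "", "year": 2010},
    {"journal": "Journal of Examples.", "issn": "", "eissn": "", "year": 2003},
    {"journal": "Another Journal", "issn": "8765-4321", "eissn": "", "year": 2010},
]
gold_index = build_gold_index(gold_list)
flags = [is_gold(paper, gold_index) for paper in sample]
gold_share = sum(flags) / len(sample)
print(flags, gold_share)
```

Trying the identifiers first and falling back to the normalised name is a design choice for the sketch; name matching is more error-prone, so a real pipeline would probably review those matches separately.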

MEASURING THE % OF OA PAPERS

21

Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013

RESULTS

[Figure: line chart. Y-axis: % of papers available in OA (0 to 60). X-axis: publication year (1994 to 2014). Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013.]

22

Translation of OA availability between April 2013 and April 2014

RESULTS

[Figure: line chart. Y-axis: % of papers available in OA (0 to 60). X-axis: publication year (2003 to 2012). Series: Adjusted OA April 2014; Adjusted OA April 2013. Exponential trend lines shown on the chart: y = 2E-21 e^(0.0234x), R² = 0.976; y = 3E-17 e^(0.0186x), R² = 0.9473.]

23

OA backfilling between April 2013 and April 2014 of papers published in 1996–2011

RESULTS

[Figure: chart. Y-axis: number of OA papers backfilled between April 2013 and April 2014 (0 to 120,000). X-axis: publication year (1994 to 2014). Exponential trend line shown on the chart: y = 2E-112 e^(0.1335x), R² = 0.9976.]

24

Growth of the number of papers available in OA as measured in April 2014, 1996–2013

RESULTS

[Figure: chart. Y-axis: number of papers in OA (0 to 1,000,000). X-axis: publication year (1994 to 2014). Series: Adjusted OA; Measured OA. Exponential trend line shown on the chart: y = 2E-73 e^(0.09x), R² = 0.9971.]
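
The trend lines on the results charts have the exponential form y = a·e^(bx), with x the publication year. A minimal sketch of how such a curve can be fitted is given below, using ordinary least squares on ln(y). The data are synthetic, generated with a growth coefficient of 0.09 per year (the coefficient read off this slide's trend line, assuming its decimal point was lost in extraction). Nothing here reproduces the actual study data or the tool used to draw the charts.

```python
import math


def fit_exponential(xs, ys):
    """Least-squares fit of ln(y) = ln(a) + b*x; returns (a, b, r_squared on the log scale)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx
    ln_a = mean_l - b * mean_x
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    ss_res = sum((l - (ln_a + b * x)) ** 2 for x, l in zip(xs, logs))
    return math.exp(ln_a), b, 1 - ss_res / ss_tot


# Synthetic series for illustration: 1,000 papers in 1996, growing at b = 0.09 per year.
years = list(range(1996, 2014))
counts = [1000.0 * math.exp(0.09 * (year - 1996)) for year in years]

a, b, r2 = fit_exponential(years, counts)
doubling_time = math.log(2) / b   # years needed for the count to double
annual_growth = math.exp(b) - 1   # year-on-year growth rate

print(f"b = {b:.4f}, R^2 = {r2:.4f}")
print(f"doubling time = {doubling_time:.1f} years, annual growth = {annual_growth:.1%}")
```

A coefficient of 0.09 corresponds to a doubling time of about 7.7 years, since ln(2)/0.09 ≈ 7.7. The very small leading constants on the charts (such as 2E-73) simply reflect that x is a calendar year rather than an offset from the first year.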

25

Scientific impact of OA and non-OA papers published in 1996–2011

RESULTS

[Figure: line chart. Y-axis: average of relative citations (ARC, 1 = world average), 0.0 to 1.8. X-axis: publication year (1996 to 2011). Series: OA; All Papers; Not OA.]
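
For readers unfamiliar with the indicator, the sketch below shows one common way an average of relative citations can be computed: each paper's citation count is divided by the average citation count of all papers in the same field and year, and these ratios are then averaged over a group. The slide only states that 1 is the world average, so the normalisation by field and year, the record layout and the data are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_relative_citations(papers, group_key):
    """Return {group: ARC}, normalising citations by the field-and-year average."""
    baseline = defaultdict(list)
    for paper in papers:
        baseline[(paper["field"], paper["year"])].append(paper["citations"])
    world_avg = {key: mean(values) for key, values in baseline.items()}

    relative = defaultdict(list)
    for paper in papers:
        avg = world_avg[(paper["field"], paper["year"])]
        if avg > 0:  # skip cells where no paper was cited at all
            relative[paper[group_key]].append(paper["citations"] / avg)
    return {group: mean(values) for group, values in relative.items()}


# Hypothetical records, for illustration only.
papers = [
    {"field": "Biology", "year": 2010, "access": "OA", "citations": 6},
    {"field": "Biology", "year": 2010, "access": "OA", "citations": 4},
    {"field": "Biology", "year": 2010, "access": "Not OA", "citations": 2},
    {"field": "Biology", "year": 2010, "access": "Not OA", "citations": 0},
]
arc = average_relative_citations(papers, "access")
print(arc)
```

A group with an ARC above 1 is cited more than the average of its field and year, and a group below 1 is cited less, which is how the per-field values in the next slide can be read.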

26

Impact contest by OA type, by field, 2009–2011. NB: here Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

Field | 1st place (type, ARC) | 2nd place (type, ARC) | 3rd place (type, ARC) | Least impact (type, ARC)
Agriculture, Fisheries & Forestry | Green OA 1.57 | Other OA 1.32 | Not OA 0.88 | Gold OA 0.51
Biology | Other OA 1.37 | Green OA 1.30 | Not OA 0.69 | Gold OA 0.47
Biomedical Research | Other OA 1.23 | Green OA 1.10 | Gold OA 0.91 | Not OA 0.65
Built Environment & Design | Green OA 1.56 | Other OA 1.28 | Not OA 0.86 | Gold OA 0.19
Chemistry | Other OA 1.34 | Green OA 1.28 | Not OA 0.95 | Gold OA 0.34
Clinical Medicine | Other OA 1.56 | Green OA 1.08 | Gold OA 0.64 | Not OA 0.63
Communication & Textual Studies | Other OA 1.82 | Green OA 1.51 | Not OA 0.66 | Gold OA 0.73
Earth & Environmental Sciences | Green OA 1.46 | Other OA 1.26 | Gold OA 0.98 | Not OA 0.72
Economics & Business | Green OA 1.46 | Other OA 1.30 | Not OA 0.71 | Gold OA 0.22
Enabling & Strategic Technologies | Green OA 1.68 | Other OA 1.53 | Not OA 0.83 | Gold OA 0.52
Engineering | Green OA 1.84 | Other OA 1.38 | Not OA 0.83 | Gold OA 0.55
General Arts, Humanities & Social Sciences | Green OA 1.74 | Other OA 1.49 | Not OA 0.73 | Gold OA 0.13
General Science & Technology | Green OA 2.56 | Other OA 2.24 | Gold OA 0.69 | Not OA 0.11
Historical Studies | Green OA 2.37 | Other OA 1.61 | Not OA 0.76 | Gold OA 0.37
Information & Communication Technology | Green OA 1.62 | Other OA 1.36 | Gold OA 0.76 | Not OA 0.69
Mathematics & Statistics | Green OA 1.35 | Other OA 1.11 | Not OA 0.75 | Gold OA 0.67
Philosophy & Theology | Green OA 1.72 | Other OA 1.63 | Gold OA 0.86 | Not OA 0.72
Physics & Astronomy | Green OA 1.43 | Gold OA 1.18 | Other OA 1.04 | Not OA 0.73
Psychology & Cognitive Sciences | Other OA 1.35 | Green OA 1.31 | Not OA 0.66 | Gold OA 0.59
Public Health & Health Services | Other OA 1.38 | Green OA 1.30 | Not OA 0.76 | Gold OA 0.71
Social Sciences | Green OA 1.54 | Other OA 1.44 | Not OA 0.76 | Gold OA 0.52
Visual & Performing Arts | Green OA 2.16 | Other OA 1.86 | Not OA 0.77 | Gold OA 0.29
Total | Green OA 1.53 | Other OA 1.36 | Not OA 0.76 | Gold OA 0.61

27

OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and by challenges in discovery.

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects:

(1) Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available. (2) "Backfilling" of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve.

CONCLUSION

29

On average, openly accessible papers have a decidedly greater impact.

In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, Gold journals surpass in impact publishing in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.

It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.

CONCLUSION

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers

Visit Science-Metrix to learn about our evaluation and measurement activities

THANK YOU

6

Complexity of OA definition and measurement notably due to

Embargoes Transiency Rights of all kind (to self-archive to crawl to recombine to use commercially etc) Discoverability

DEFINITIONS

7

Rules of involvement in OA

VANTAGE POINTS

8

Directoriesregistries of repositories Directory of OA journals

VANTAGE POINTS

9

Free or open source repository software: DSpace, EPrints, Archimede, DAITSS, Dienst, Enterprise-Wide Digital Repository and Archive, ETD-db, eXtensible Text Framework, Fedora, Greenstone, Invenio, IRPlus, Keystone Digital Library Suite, MOAI, Omeka, OPUS, PubMan, WEKO, PeerLibrary

Source: http://oad.simmons.edu/oadwiki/Free_and_open-source_repository_software

VANTAGE POINTS

10

Key repositories:

  • arXiv.org – the mothership
  • PubMed Central
  • Europe PubMed Central

Aggregators: OpenAIRE, BASE, CORE

A typical repository hosted by the Umeå universitet Library

VANTAGE POINTS

11

Despite, or perhaps because of, all the sources of OA available, it is very difficult to measure the availability of OA. Here we are concerned with the availability of peer-reviewed articles published in scholarly journals. Why? Because this is what policies and mandates are preoccupied with.

BIBLIOMETRICS – PROPORTION OF OA

12

Bottom-up measurement

One would have to harvest all the sources available and de-duplicate results. The main problem is how to determine reliably that items:

(1) were published (as opposed to an unsubmitted manuscript)
(2) are peer-reviewed
(3) answer the “so what” question (you found there were 14,325,678 papers; so what?)

BIBLIOMETRICS – PROPORTION OF OA
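
The de-duplication step of a bottom-up harvest can be sketched as follows. This is only an illustration, not the method of any study mentioned in this deck: the record fields (`doi`, `title`, `year`) and the matching rule (DOI when present, otherwise normalised title plus year) are assumptions.

```python
import re
from typing import Iterable


def record_key(record: dict) -> tuple:
    """Build a matching key: the DOI if present, else normalised title + year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Lower-case the title and collapse punctuation/whitespace runs to one space.
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return ("title", title, record.get("year"))


def deduplicate(records: Iterable[dict]) -> list:
    """Keep the first record seen for each key, preserving harvest order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical harvest from four made-up repositories.
harvested = [
    {"source": "repo_a", "doi": "10.1000/XYZ1", "title": "Open access uptake", "year": 2012},
    {"source": "repo_b", "doi": "10.1000/xyz1", "title": "Open Access Uptake.", "year": 2012},
    {"source": "repo_c", "doi": None, "title": "A preprint without DOI", "year": 2013},
    {"source": "repo_d", "doi": "", "title": "A Preprint Without DOI", "year": 2013},
]
unique = deduplicate(harvested)
print(len(unique), [r["source"] for r in unique])
```

Note that a copy carrying a DOI and a copy of the same paper without one would not be merged by this rule. De-duplication also says nothing about whether an item was published or peer-reviewed, which is the harder problem the slide points to.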

13

Top-down measurement

One would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers. Main problems:

(1) there is no such database
(2) it is extremely tedious to check all of them
(3) how do you actually do that?

BIBLIOMETRICS – PROPORTION OF OA

14

Top-down measurement - Sampling

Considering the enormous task at hand, most authors have resorted to using sampling and search engines:

  • Harnad and team sampled articles from the Web of Science
  • Björk and team sampled articles from Scopus
  • Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines

BIBLIOMETRICS – PROPORTION OF OA
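
The sampling idea can be sketched in a few lines. This is a toy illustration under stated assumptions: `estimate_oa_share`, the `is_freely_available` callback and the made-up population of identifiers are all hypothetical, and the availability check stands in for whatever search technique a given study actually used.

```python
import random
from typing import Callable, Sequence


def estimate_oa_share(
    paper_ids: Sequence[str],
    sample_size: int,
    is_freely_available: Callable[[str], bool],
    seed: int = 0,
) -> float:
    """Draw a simple random sample without replacement and return the
    share of sampled papers for which a free copy was found."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = rng.sample(paper_ids, sample_size)
    found = sum(1 for paper_id in sample if is_freely_available(paper_id))
    return found / sample_size


def toy_checker(paper_id: str) -> bool:
    """Stand-in for a real lookup: every fourth paper counts as available."""
    return int(paper_id.split("-")[1]) % 4 == 0


# Made-up population standing in for a bibliographic database.
population = [f"paper-{i}" for i in range(10_000)]
share = estimate_oa_share(population, 500, toy_checker, seed=42)
print(f"Estimated OA share in the sample: {share:.1%}")
```

The estimate is only as good as the availability check, which is why the quality of the instrument and the size of the sample both matter.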

15

Dealing with search engines:

  • Use user-friendly meta-search engines such as DuckDuckGo or DogPile
  • Try to stay below the radar using mainstream search engines

Neither solution feels remotely comfortable.

The other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science).

BIBLIOMETRICS – PROPORTION OF OA

16

Divergence from the real measure is due to:

  • Capacity to design an instrument that provides the true value (a function of recall and retrieval precision)
  • Capacity to increase statistical significance through large samples

SAMPLING AND METROLOGY

17

A true positive (tp), in the present case, is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse: an fp is a paper reported as freely available by the instrument when it is not, and an fn is a freely available paper that the instrument misses. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows:

$$\text{Retrieval precision} = \frac{tp}{tp + fp}$$

Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:

$$\text{Recall} = \frac{tp}{tp + fn}$$

Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects the limits of the instrument. The adjustment made in the previous study is based on the following formula:

$$\text{Adjustment} = \frac{tp + fn}{tp + fp}$$

SAMPLING AND METROLOGY
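
The three ratios on this slide can be computed directly from the four counts. The sketch below uses invented counts, and the final step (multiplying a measured share by the adjustment score) is an assumption about how the score is applied, since the slide only says it is used to recalibrate the results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from checking the instrument against a hand-verified sample."""
    tp: int  # available in OA and found by the instrument
    fp: int  # reported as found, but not actually freely available
    fn: int  # available in OA, but missed by the instrument
    tn: int  # not available and not found


def retrieval_precision(c: Confusion) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def adjustment(c: Confusion) -> float:
    """(tp + fn) / (tp + fp): true positives in the sample divided by the
    positives the instrument reported. Equals precision / recall."""
    return (c.tp + c.fn) / (c.tp + c.fp)


def adjusted_share(measured_share: float, c: Confusion) -> float:
    """Assumed use of the score: scale the measured share by it."""
    return measured_share * adjustment(c)


# Hypothetical validation sample of 200 papers (numbers are invented).
counts = Confusion(tp=80, fp=5, fn=20, tn=95)
print(f"precision  = {retrieval_precision(counts):.3f}")
print(f"recall     = {recall(counts):.3f}")
print(f"adjustment = {adjustment(counts):.3f}")
print(f"adjusted share for a measured 40% = {adjusted_share(0.40, counts):.1%}")
```

When the instrument misses more papers than it wrongly reports (fn greater than fp), the adjustment is above 1 and the adjusted share is higher than the measured one.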

18

Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database) and is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (which will be set at 0.95 in the study), ME is calculated as follows:

$$ME = Z \sqrt{\frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}} + \frac{0.5}{n}$$

SAMPLING AND METROLOGY
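
A minimal sketch of the margin-of-error calculation follows, reading the formula as the usual standard error of a proportion with a finite population correction, (N - n)/(N - 1), plus a continuity correction of 0.5/n. The default `z=1.96` is an assumption: the slide gives "0.95" for the critical score, which is read here as a 95% confidence level. The sample and population sizes are invented.

```python
import math


def margin_of_error(p: float, n: int, N: int, z: float = 1.96) -> float:
    """Margin of error for a proportion p estimated from a sample of n
    drawn without replacement from a finite population of N."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if N < 2 or not 0 < n <= N:
        raise ValueError("need 0 < n <= N and N >= 2")
    # Standard error with the finite population correction (N - n) / (N - 1).
    standard_error = math.sqrt((p * (1.0 - p) / n) * ((N - n) / (N - 1)))
    # The 0.5 / n term is the continuity correction for discrete counts.
    return z * standard_error + 0.5 / n


# Invented example: 1,000 sampled papers out of a population of 1,000,000.
me = margin_of_error(p=0.5, n=1_000, N=1_000_000)
print(f"ME = {me:.4f}")
```

Two properties are easy to check: the margin shrinks as the sample grows, and when the whole population is sampled (n = N) only the continuity term 0.5/n remains.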

19

The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.

MEASURING THE % OF OA PAPERS

20

For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.

MEASURING THE % OF OA PAPERS
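
The journal-matching step described on this slide could look roughly like the sketch below. Everything named here is hypothetical: the `GoldJournal` record, its `gold_since` field (the first year a journal is treated as Gold), the normalisation rules and the example ISSNs are illustrative assumptions, not the actual DOAJ or PubMed Central data layout.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


def normalise_issn(issn: Optional[str]) -> str:
    """Strip hyphens and whitespace so '1234-5678' and '12345678' compare equal."""
    return (issn or "").replace("-", "").strip().upper()


def normalise_name(name: Optional[str]) -> str:
    """Lower-case, unify '&' with 'and', and collapse whitespace."""
    return " ".join((name or "").lower().replace("&", "and").split())


@dataclass(frozen=True)
class GoldJournal:
    name: str
    issn: Optional[str]
    eissn: Optional[str]
    gold_since: int  # hypothetical field: first year treated as Gold


def is_gold_paper(paper: dict, gold_journals: Iterable[GoldJournal]) -> bool:
    """True if the paper's journal matches a Gold journal by ISSN, E-ISSN
    or name, and the paper was published once the journal was Gold."""
    paper_issns = {normalise_issn(paper.get("issn")),
                   normalise_issn(paper.get("eissn"))} - {""}
    paper_name = normalise_name(paper.get("journal"))
    for journal in gold_journals:
        journal_issns = {normalise_issn(journal.issn),
                         normalise_issn(journal.eissn)} - {""}
        same_journal = bool(paper_issns & journal_issns) or (
            paper_name != "" and paper_name == normalise_name(journal.name)
        )
        if same_journal and paper["year"] >= journal.gold_since:
            return True
    return False


# Invented journal list and sample records.
gold_list = [GoldJournal("Journal of Examples & Tests", "1234-5678", "8765-4321", 2008)]
by_issn = {"journal": "J. Ex. Tests", "issn": "12345678", "eissn": None, "year": 2010}
by_name = {"journal": "Journal of Examples and Tests", "issn": None, "eissn": None, "year": 2009}
too_early = {"journal": "J. Ex. Tests", "issn": "1234-5678", "eissn": None, "year": 2005}
unrelated = {"journal": "Another Journal", "issn": "0000-0000", "eissn": None, "year": 2012}
print([is_gold_paper(p, gold_list) for p in (by_issn, by_name, too_early, unrelated)])
```

Matching on the year matters because a journal that converted to Gold partway through its life should not count its earlier, subscription-era papers.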

21

Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013

[Figure. y-axis: % of papers available in OA (0 to 60); x-axis: year (1994 to 2014). Series: Adjusted OA April 2014, Adjusted OA April 2013, Measured OA April 2014, Measured OA April 2013.]

RESULTS

22

Translation of OA availability between April 2013 and April 2014

[Figure. y-axis: % of papers available in OA (0 to 60); x-axis: year (2003 to 2012). Series: Adjusted OA April 2014, Adjusted OA April 2013. Exponential trend lines: $y = 2 \times 10^{-21}\, e^{0.0234x}$, $R^2 = 0.976$; $y = 3 \times 10^{-17}\, e^{0.0186x}$, $R^2 = 0.9473$.]

RESULTS
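
Trend lines of the form y = a·e^(bx), like those on this chart, are commonly obtained by ordinary least squares on ln(y). The slide does not say how its trend lines were fitted, so the sketch below is an assumption about the technique, and the data points are synthetic rather than taken from the figure.

```python
import math
from typing import Sequence, Tuple


def fit_exponential(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Fit y = a * exp(b * x) by least squares on ln(y).

    Returns (a, b, r_squared), where r_squared is computed on the
    log-transformed values.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    logs = [math.log(y) for y in ys]  # requires every y > 0
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (v - mean_log) for x, v in zip(xs, logs))
    b = sxy / sxx
    ln_a = mean_log - b * mean_x
    ss_tot = sum((v - mean_log) ** 2 for v in logs)
    ss_res = sum((v - (ln_a + b * x)) ** 2 for x, v in zip(xs, logs))
    return math.exp(ln_a), b, 1.0 - ss_res / ss_tot


# Synthetic series: a share of 0.30 in 2003 growing by a factor e^0.02 per year.
years = list(range(2003, 2013))
shares = [0.30 * math.exp(0.02 * (year - 2003)) for year in years]
a, b, r_squared = fit_exponential(years, shares)
print(f"a = {a:.3e}, b = {b:.4f}, R^2 = {r_squared:.4f}")
```

Because x is the calendar year itself, the fitted coefficient a is extremely small, which is consistent with the tiny leading coefficients shown on the chart.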

23

OA backfilling between April 2013 and April 2014 of papers published in 1996–2011

[Figure. y-axis: number of OA papers backfilled between April 2013 and April 2014 (0 to 120,000); x-axis: year (1994 to 2014). Exponential trend line: $y = 2 \times 10^{-112}\, e^{0.1335x}$, $R^2 = 0.9976$.]

RESULTS

24

Growth of the number of papers available in OA as measured in April 2014, 1996–2013

[Figure. y-axis: number of papers in OA (0 to 1,000,000); x-axis: year (1994 to 2014). Series: Adjusted OA, Measured OA. Exponential trend line: $y = 2 \times 10^{-73}\, e^{0.09x}$, $R^2 = 0.9971$.]

RESULTS
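
The exponent of an exponential trend line translates directly into a yearly growth rate and a doubling time. The sketch below reads the trend line on this slide as y = 2E-73 · e^(0.09x), with the decimal point in the exponent restored as an assumption, and the helper names are hypothetical.

```python
import math


def annual_growth_rate(b: float) -> float:
    """Relative increase from one year to the next for y = a * exp(b * x)."""
    return math.expm1(b)  # exp(b) - 1


def doubling_time_years(b: float) -> float:
    """Number of years for y to double: ln(2) / b."""
    return math.log(2) / b


exponent = 0.09  # assumed reading of the trend-line exponent on this slide
growth = annual_growth_rate(exponent)
doubling = doubling_time_years(exponent)
print(f"growth per year = {growth:.1%}, doubling time = {doubling:.1f} years")
```

Under that reading, the number of papers available in OA rises by roughly 9.4% from one publication year to the next and doubles about every 7.7 years.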

25

Scientific impact of OA and non-OA papers published in 1996–2011

[Figure. y-axis: average of relative citations (ARC, 1 = world average), 0.0 to 1.8; x-axis: year (1996 to 2011). Series: OA, All Papers, Not OA.]

RESULTS
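
An average of relative citations can be sketched as follows. The slide only states that an ARC of 1 is the world average; dividing each paper's citations by the mean for its field and publication year is an assumption based on common bibliometric practice, and the papers below are invented.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Optional


def average_relative_citations(papers: Iterable[dict]) -> Dict[str, Optional[float]]:
    """Return the ARC of OA and non-OA papers.

    Each paper's citation count is divided by the mean citation count of
    all papers in the same field and year, then the ratios are averaged.
    """
    papers = list(papers)
    # Reference value: mean citations per (field, year) over all papers.
    groups: Dict[tuple, List[int]] = defaultdict(list)
    for paper in papers:
        groups[(paper["field"], paper["year"])].append(paper["citations"])
    baseline = {key: mean(values) for key, values in groups.items()}

    relative: Dict[str, List[float]] = {"OA": [], "Not OA": []}
    for paper in papers:
        reference = baseline[(paper["field"], paper["year"])]
        if reference == 0:
            continue  # no citations at all in this group: ratio undefined
        label = "OA" if paper["is_oa"] else "Not OA"
        relative[label].append(paper["citations"] / reference)
    return {label: (mean(values) if values else None)
            for label, values in relative.items()}


# Invented records: two fields, one publication year.
sample_papers = [
    {"field": "A", "year": 2010, "citations": 12, "is_oa": True},
    {"field": "A", "year": 2010, "citations": 4, "is_oa": False},
    {"field": "A", "year": 2010, "citations": 8, "is_oa": False},
    {"field": "B", "year": 2010, "citations": 3, "is_oa": True},
    {"field": "B", "year": 2010, "citations": 1, "is_oa": False},
]
arc = average_relative_citations(sample_papers)
print(arc)
```

Normalising within field and year keeps a heavily cited field, or an older cohort with more time to gather citations, from dominating the comparison.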

26

Impact contest by OA type, by field, 2009–2011. NB: Here Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

Each cell gives the OA type and its ARC.

Field                                       1st place      2nd place      3rd place      Least impact
Agriculture, Fisheries & Forestry           Green OA 1.57  Other OA 1.32  Not OA 0.88    Gold OA 0.51
Biology                                     Other OA 1.37  Green OA 1.30  Not OA 0.69    Gold OA 0.47
Biomedical Research                         Other OA 1.23  Green OA 1.10  Gold OA 0.91   Not OA 0.65
Built Environment & Design                  Green OA 1.56  Other OA 1.28  Not OA 0.86    Gold OA 0.19
Chemistry                                   Other OA 1.34  Green OA 1.28  Not OA 0.95    Gold OA 0.34
Clinical Medicine                           Other OA 1.56  Green OA 1.08  Gold OA 0.64   Not OA 0.63
Communication & Textual Studies             Other OA 1.82  Green OA 1.51  Not OA 0.66    Gold OA 0.73
Earth & Environmental Sciences              Green OA 1.46  Other OA 1.26  Gold OA 0.98   Not OA 0.72
Economics & Business                        Green OA 1.46  Other OA 1.30  Not OA 0.71    Gold OA 0.22
Enabling & Strategic Technologies           Green OA 1.68  Other OA 1.53  Not OA 0.83    Gold OA 0.52
Engineering                                 Green OA 1.84  Other OA 1.38  Not OA 0.83    Gold OA 0.55
General Arts, Humanities & Social Sciences  Green OA 1.74  Other OA 1.49  Not OA 0.73    Gold OA 0.13
General Science & Technology                Green OA 2.56  Other OA 2.24  Gold OA 0.69   Not OA 0.11
Historical Studies                          Green OA 2.37  Other OA 1.61  Not OA 0.76    Gold OA 0.37
Information & Communication Technology      Green OA 1.62  Other OA 1.36  Gold OA 0.76   Not OA 0.69
Mathematics & Statistics                    Green OA 1.35  Other OA 1.11  Not OA 0.75    Gold OA 0.67
Philosophy & Theology                       Green OA 1.72  Other OA 1.63  Gold OA 0.86   Not OA 0.72
Physics & Astronomy                         Green OA 1.43  Gold OA 1.18   Other OA 1.04  Not OA 0.73
Psychology & Cognitive Sciences             Other OA 1.35  Green OA 1.31  Not OA 0.66    Gold OA 0.59
Public Health & Health Services             Other OA 1.38  Green OA 1.30  Not OA 0.76    Gold OA 0.71
Social Sciences                             Green OA 1.54  Other OA 1.44  Not OA 0.76    Gold OA 0.52
Visual & Performing Arts                    Green OA 2.16  Other OA 1.86  Not OA 0.77    Gold OA 0.29
Total                                       Green OA 1.53  Other OA 1.36  Not OA 0.76    Gold OA 0.61

27

OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and challenges in discovery.

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects:

  • Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available
  • “Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, both of which contribute to a translation of the availability curve

CONCLUSION

29

On average, openly accessible papers have a decidedly greater impact.

In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, publishing in Gold journals yields more impact than publishing in subscription-based journals with no self-archiving, even though these Gold journals are much younger and less established.

It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.

CONCLUSION

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers

Visit Science-Metrix to learn about our evaluation and measurement activities

THANK YOU

  • Slide Number 1
  • Background
  • SYNOPSIS
  • Definitions
  • Definitions
  • Definitions
  • Vantage Points
  • Vantage Points
  • Vantage Points
  • Vantage Points
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Bibliometrics ndash Proportion of OA
  • Sampling and METROLOGY
  • Sampling and METROLOGY
  • Sampling and METROLOGY
  • Measuring the of OA papers
  • Measuring the of OA papers
  • RESULTS
  • RESULTS
  • RESULTS
  • RESULTS
  • RESULTS
  • RESULTS
  • Conclusion
  • Conclusion
  • Conclusion
  • Thank you

9

Free or open-source repository software: DSpace, EPrints, Archimede, DAITSS, Dienst, Enterprise-Wide Digital Repository and Archive, ETD-db, eXtensible Text Framework, Fedora, Greenstone, Invenio, IRPlus, Keystone Digital Library Suite, MOAI, Omeka, OPUS, PubMan, WEKO, PeerLibrary

Source: http://oad.simmons.edu/oadwiki/Free_and_open-source_repository_software

VANTAGE POINTS

10

Key repositories: arXiv.org (the mothership), PubMed Central, Europe PubMed Central

Aggregators: OpenAIRE, BASE, CORE

A typical repository, hosted by the Umeå universitet Library

VANTAGE POINTS

11

Despite, or perhaps because of, all the sources of OA available, it is very difficult to measure the availability of OA. Here we are concerned with the availability of peer-reviewed articles published in scholarly journals. Why? This is what policies and mandates are preoccupied with.

BIBLIOMETRICS – PROPORTION OF OA

12

Bottom-up measurement: one would have to harvest all the sources available and de-duplicate the results. The main problem is how to determine reliably that items

(1) were published (as opposed to being un-submitted manuscripts)
(2) are peer-reviewed
(3) answer the “so what” question (you found there were 14,325,678 papers; so what?)

BIBLIOMETRICS – PROPORTION OF OA
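The de-duplication step of a bottom-up harvest can be sketched as follows. This is only an illustration, not the method of any particular harvester: the record fields (`doi`, `title`, `source`) and the rule of keying on the DOI when present and on a normalised title otherwise are assumptions made for the example. It does nothing to settle the three problems listed above.

```python
import re
import unicodedata


def normalise_title(title):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def record_key(record):
    """Prefer the DOI as identity; fall back to the normalised title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", normalise_title(record["title"]))


def deduplicate(records):
    """Keep the first record seen for each key and collect all its sources."""
    merged = {}
    for record in records:
        key = record_key(record)
        if key not in merged:
            merged[key] = {**record, "sources": [record["source"]]}
        else:
            merged[key]["sources"].append(record["source"])
    return list(merged.values())


# Hypothetical harvested records, invented for illustration only.
harvested = [
    {"doi": "10.1000/xyz123", "title": "A Study of Things", "source": "repo-a"},
    {"doi": "10.1000/XYZ123", "title": "A study of things.", "source": "repo-b"},
    {"doi": None, "title": "Étude: another paper", "source": "repo-a"},
    {"doi": None, "title": "Etude another paper", "source": "repo-c"},
]

unique = deduplicate(harvested)
print(len(unique), "unique items from", len(harvested), "harvested records")
```

A copy that carries a DOI and a copy of the same paper that does not would still be counted twice under this rule, which is one reason why reliable bottom-up counts are hard to obtain.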

13

Top-down measurement: one would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers. Main problems:

(1) there is no such database
(2) it is extremely tedious to check all of them
(3) how do you actually do that?

BIBLIOMETRICS – PROPORTION OF OA

14

Top-down measurement, sampling: considering the enormous task at hand, most authors have resorted to using sampling and search engines.

(1) Harnad and team sampled articles from the Web of Science
(2) Björk and team sampled articles from Scopus
(3) Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines

BIBLIOMETRICS – PROPORTION OF OA

15

Dealing with search engines:

(1) use user-friendly meta-search engines such as DuckDuckGo or DogPile
(2) try to stay below the radar using mainstream search engines

Neither solution feels remotely comfortable.

Another solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science).

BIBLIOMETRICS – PROPORTION OF OA

16

Divergence from the real measure is due to

(1) the capacity to design an instrument that provides the true value (a function of recall and retrieval precision)
(2) the capacity to increase statistical significance through large samples

SAMPLING AND METROLOGY

17

A true positive (tp), in the present case, is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives and false negatives (fp and fn) are the converse: a false positive is a paper the instrument reports as OA although it is not freely available, and a false negative is a freely available paper that the instrument misses. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results, and is calculated as follows:

$$\text{Retrieval precision} = \frac{tp}{tp + fp}$$

Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:

$$\text{Recall} = \frac{tp}{tp + fn}$$

Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects for the limits of the instrument. The adjustment made in the previous study is based on the following formula:

$$\text{Adjustment} = \frac{tp + fn}{tp + fp}$$

SAMPLING AND METROLOGY
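The three quantities defined on this slide can be worked through in a few lines. The sketch below uses invented calibration counts, and it assumes that the adjustment score is applied by multiplying the measured OA share; the slide only says the score is "applied to recalibrate the results", so that step is an interpretation.

```python
def retrieval_precision(tp, fp):
    """Share of the papers flagged as OA that really are OA."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of the truly OA papers that the instrument finds."""
    return tp / (tp + fn)


def adjustment(tp, fp, fn):
    """Ratio of truly OA papers (tp + fn) to papers flagged as OA (tp + fp)."""
    return (tp + fn) / (tp + fp)


# Hypothetical calibration counts, invented for illustration only.
tp, fp, fn = 450, 10, 90

precision_value = retrieval_precision(tp, fp)
recall_value = recall(tp, fn)
factor = adjustment(tp, fp, fn)

# Assumption: the adjustment is applied multiplicatively to the measured share.
measured_share = 0.40
adjusted_share = measured_share * factor

print(f"precision={precision_value:.3f} recall={recall_value:.3f}")
print(f"adjustment={factor:.3f} adjusted share={adjusted_share:.3f}")
```

Note that the adjustment equals retrieval precision divided by recall, so an instrument with high precision but imperfect recall yields a factor above 1 and the adjusted share is higher than the measured one.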

18

Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database), where N is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (set for a confidence level of 0.95 in the study), ME is calculated as follows:

$$ME = Z \sqrt{\frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}} + \frac{0.5}{n}$$

SAMPLING AND METROLOGY
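A minimal sketch of the margin-of-error calculation is given below. It assumes that Z is the two-sided standard-normal critical value for the chosen confidence level (about 1.96 at 0.95), which is one reading of the slide; the sample and population sizes in the example are invented.

```python
import math
from statistics import NormalDist


def margin_of_error(p, n, N, confidence=0.95):
    """Margin of error for a proportion p from a sample of n out of N items.

    Includes the finite population correction (N - n) / (N - 1) and the
    continuity correction 0.5 / n for discrete values.
    """
    # Assumption: Z is the two-sided critical value of the standard normal.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    finite_population_correction = (N - n) / (N - 1)
    standard_error = math.sqrt(p * (1 - p) / n * finite_population_correction)
    return z * standard_error + 0.5 / n


# Hypothetical figures: 1,000 papers sampled from a population of 1,000,000.
me = margin_of_error(p=0.5, n=1_000, N=1_000_000)
print(f"margin of error: {me:.4f}")
```

When the sample covers the whole population (n = N), the square-root term vanishes and only the continuity correction remains, which is the behaviour one would expect from the finite population correction.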

19

The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.

MEASURING THE % OF OA PAPERS

20

For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching papers to journals that were known to be Gold in the year the paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.

MEASURING THE % OF OA PAPERS
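The matching described on this slide might look like the sketch below. The record layout, the `gold_since` field (the first year a journal is treated as Gold) and all journal names and ISSNs are hypothetical; the actual Science-Metrix matching rules are not given in the slides.

```python
def normalise_issn(issn):
    """Return an ISSN as 8 characters without the hyphen, or None if unusable."""
    if not issn:
        return None
    cleaned = issn.replace("-", "").strip().upper()
    return cleaned if len(cleaned) == 8 else None


def normalise_name(name):
    """Lower-case a journal name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def journal_keys(issn, eissn, name):
    """All keys under which a journal can be looked up: ISSNs first, then name."""
    keys = []
    for value in (issn, eissn):
        cleaned = normalise_issn(value)
        if cleaned:
            keys.append(("issn", cleaned))
    keys.append(("name", normalise_name(name)))
    return keys


def build_gold_index(gold_journals):
    """Map every key of every Gold journal to its first Gold year."""
    index = {}
    for journal in gold_journals:
        for key in journal_keys(journal.get("issn"), journal.get("eissn"), journal["name"]):
            index[key] = journal["gold_since"]
    return index


def is_gold(paper, index):
    """True if the paper's journal is listed and was Gold in the paper's year."""
    for key in journal_keys(paper.get("issn"), paper.get("eissn"), paper["journal"]):
        if key in index:
            return paper["year"] >= index[key]
    return False


# Hypothetical Gold journal list and sampled papers, invented for illustration.
gold_journals = [
    {"name": "Journal of Open Examples", "issn": "1234-5678",
     "eissn": "8765-4321", "gold_since": 2005},
]
sample = [
    {"journal": "Journal of Open Examples", "issn": "1234-5678", "year": 2010},
    {"journal": "J. Open Examples", "eissn": "87654321", "year": 2003},
    {"journal": "journal of  open examples", "year": 2006},
    {"journal": "Subscription Letters", "issn": "1111-2222", "year": 2010},
]

index = build_gold_index(gold_journals)
gold_share = sum(is_gold(paper, index) for paper in sample) / len(sample)
print(f"Gold share of the sample: {gold_share:.0%}")
```

The second paper is matched through its E-ISSN but is not counted, because it appeared before the journal's assumed first Gold year.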

21

Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013

RESULTS

[Figure: % of papers available in OA (y-axis, 0–60) by year (x-axis, 1994–2014). Series: Adjusted OA April 2014, Adjusted OA April 2013, Measured OA April 2014, Measured OA April 2013.]

22

Translation of OA availability between April 2013 and April 2014

RESULTS

[Figure: % of papers available in OA (y-axis, 0–60) by year (x-axis, 2003–2012). Series: Adjusted OA April 2014, Adjusted OA April 2013. Exponential trend lines shown on the chart: $y = 2 \times 10^{-21}\, e^{0.0234x}$, $R^2 = 0.976$; $y = 3 \times 10^{-17}\, e^{0.0186x}$, $R^2 = 0.9473$.]

23

OA backfilling between April 2013 and April 2014 of papers published in 1996–2011

RESULTS

[Figure: number of OA papers backfilled between April 2013 and April 2014 (y-axis, 0–120,000) by year (x-axis, 1994–2014). Exponential trend line shown on the chart: $y = 2 \times 10^{-112}\, e^{0.1335x}$, $R^2 = 0.9976$.]

24

Growth of the number of papers available in OA as measured in April 2014, 1996–2013

RESULTS

[Figure: number of papers in OA (y-axis, 0–1,000,000) by year (x-axis, 1994–2014). Series: Adjusted OA, Measured OA. Exponential trend line shown on the chart: $y = 2 \times 10^{-73}\, e^{0.09x}$, $R^2 = 0.9971$.]
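An exponential trend of the form y = a·e^(bx) implies a constant doubling time of ln(2)/b. The short calculation below reads the exponent of the trend line on this slide as b = 0.09 per year; the position of the decimal point is an assumption, since the extracted text lost it, and the coefficient is rounded on the chart.

```python
import math


def doubling_time(rate):
    """Years needed for y = a * exp(rate * x) to double."""
    return math.log(2) / rate


def annual_growth(rate):
    """Year-on-year growth implied by the exponent, as a fraction."""
    return math.exp(rate) - 1


# Assumed reading of the exponent on the chart's trend line (per year).
rate = 0.09

print(f"doubling time: {doubling_time(rate):.1f} years")
print(f"annual growth: {annual_growth(rate):.1%}")
```

Under that reading, the number of papers available in OA grows by a little over 9% per publication year and doubles roughly every 7.7 years.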

25

Scientific impact of OA and non-OA papers published in 1996–2011

RESULTS

[Figure: average of relative citations (ARC; 1 = world average) (y-axis, 0.0–1.8) by publication year (x-axis, 1996–2011). Series: OA, All Papers, Not OA.]

26

Impact contest by OA type, by field, 2009–2011. NB: Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

| Field | 1st place | ARC | 2nd place | ARC | 3rd place | ARC | Least impact | ARC |
|---|---|---|---|---|---|---|---|---|
| Agriculture, Fisheries & Forestry | Green OA | 1.57 | Other OA | 1.32 | Not OA | 0.88 | Gold OA | 0.51 |
| Biology | Other OA | 1.37 | Green OA | 1.30 | Not OA | 0.69 | Gold OA | 0.47 |
| Biomedical Research | Other OA | 1.23 | Green OA | 1.10 | Gold OA | 0.91 | Not OA | 0.65 |
| Built Environment & Design | Green OA | 1.56 | Other OA | 1.28 | Not OA | 0.86 | Gold OA | 0.19 |
| Chemistry | Other OA | 1.34 | Green OA | 1.28 | Not OA | 0.95 | Gold OA | 0.34 |
| Clinical Medicine | Other OA | 1.56 | Green OA | 1.08 | Gold OA | 0.64 | Not OA | 0.63 |
| Communication & Textual Studies | Other OA | 1.82 | Green OA | 1.51 | Not OA | 0.66 | Gold OA | 0.73 |
| Earth & Environmental Sciences | Green OA | 1.46 | Other OA | 1.26 | Gold OA | 0.98 | Not OA | 0.72 |
| Economics & Business | Green OA | 1.46 | Other OA | 1.30 | Not OA | 0.71 | Gold OA | 0.22 |
| Enabling & Strategic Technologies | Green OA | 1.68 | Other OA | 1.53 | Not OA | 0.83 | Gold OA | 0.52 |
| Engineering | Green OA | 1.84 | Other OA | 1.38 | Not OA | 0.83 | Gold OA | 0.55 |
| General Arts, Humanities & Social Sciences | Green OA | 1.74 | Other OA | 1.49 | Not OA | 0.73 | Gold OA | 0.13 |
| General Science & Technology | Green OA | 2.56 | Other OA | 2.24 | Gold OA | 0.69 | Not OA | 0.11 |
| Historical Studies | Green OA | 2.37 | Other OA | 1.61 | Not OA | 0.76 | Gold OA | 0.37 |
| Information & Communication Technology | Green OA | 1.62 | Other OA | 1.36 | Gold OA | 0.76 | Not OA | 0.69 |
| Mathematics & Statistics | Green OA | 1.35 | Other OA | 1.11 | Not OA | 0.75 | Gold OA | 0.67 |
| Philosophy & Theology | Green OA | 1.72 | Other OA | 1.63 | Gold OA | 0.86 | Not OA | 0.72 |
| Physics & Astronomy | Green OA | 1.43 | Gold OA | 1.18 | Other OA | 1.04 | Not OA | 0.73 |
| Psychology & Cognitive Sciences | Other OA | 1.35 | Green OA | 1.31 | Not OA | 0.66 | Gold OA | 0.59 |
| Public Health & Health Services | Other OA | 1.38 | Green OA | 1.30 | Not OA | 0.76 | Gold OA | 0.71 |
| Social Sciences | Green OA | 1.54 | Other OA | 1.44 | Not OA | 0.76 | Gold OA | 0.52 |
| Visual & Performing Arts | Green OA | 2.16 | Other OA | 1.86 | Not OA | 0.77 | Gold OA | 0.29 |
| Total | Green OA | 1.53 | Other OA | 1.36 | Not OA | 0.76 | Gold OA | 0.61 |

27

OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and challenges in discovery.

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects:

(1) organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available
(2) “backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve

CONCLUSION

29

On average, openly accessible papers have a decidedly greater impact.

In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, publishing in Gold journals surpasses in impact publishing in subscription-based journals with no self-archiving, even if these Gold journals are much younger and less established.

It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.

CONCLUSION
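The count of 7 fields can be checked against the "Least impact" column of the impact table on slide 26. The snippet below transcribes that column as it appears in the table (field names with punctuation restored) and counts the fields in which "Not OA" comes last; it relies on the column placement only, not on the ARC values.

```python
# "Least impact" column of the slide 26 table, one entry per field.
least_impact = {
    "Agriculture, Fisheries & Forestry": "Gold OA",
    "Biology": "Gold OA",
    "Biomedical Research": "Not OA",
    "Built Environment & Design": "Gold OA",
    "Chemistry": "Gold OA",
    "Clinical Medicine": "Not OA",
    "Communication & Textual Studies": "Gold OA",
    "Earth & Environmental Sciences": "Not OA",
    "Economics & Business": "Gold OA",
    "Enabling & Strategic Technologies": "Gold OA",
    "Engineering": "Gold OA",
    "General Arts, Humanities & Social Sciences": "Gold OA",
    "General Science & Technology": "Not OA",
    "Historical Studies": "Gold OA",
    "Information & Communication Technology": "Not OA",
    "Mathematics & Statistics": "Gold OA",
    "Philosophy & Theology": "Not OA",
    "Physics & Astronomy": "Not OA",
    "Psychology & Cognitive Sciences": "Gold OA",
    "Public Health & Health Services": "Gold OA",
    "Social Sciences": "Gold OA",
    "Visual & Performing Arts": "Gold OA",
}

# Fields where subscription-only publishing (Not OA) has the least impact.
not_oa_last = sorted(
    field for field, oa_type in least_impact.items() if oa_type == "Not OA"
)

print(len(not_oa_last), "of", len(least_impact), "fields")
for field in not_oa_last:
    print(" -", field)
```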

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers.

Visit Science-Metrix to learn about our evaluation and measurement activities.

THANK YOU


12

Bottom-up measurement: one would have to harvest all the available sources and de-duplicate the results. The main problem is how to determine reliably that items:

(1) were published (as opposed to being unsubmitted manuscripts);
(2) are peer-reviewed;
(3) answer the "so what?" question (you found there were 14,325,678 papers; so what?).

BIBLIOMETRICS – PROPORTION OF OA
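The de-duplication step mentioned on this slide can be illustrated with a minimal sketch. It is not the method used by any of the teams cited; the record fields (`source`, `doi`, `title`), the sample records and the rule "match on DOI, otherwise on a normalised title" are all assumptions made for illustration.

```python
import re
import unicodedata


def normalise_title(title: str) -> str:
    """Lower-case, strip accents and punctuation so near-identical titles collide."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def record_key(record: dict) -> str:
    """Prefer the DOI as identity; fall back to the normalised title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    return "title:" + normalise_title(record.get("title", ""))


def deduplicate(records: list) -> list:
    """Keep the first record seen for each key, preserving harvest order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical harvested records (invented for this example).
harvested = [
    {"source": "repository A", "doi": "10.1000/XYZ123", "title": "Measuring Open Access"},
    {"source": "repository B", "doi": "10.1000/xyz123", "title": "Measuring open access."},
    {"source": "repository C", "doi": None, "title": "Études sur l'accès libre"},
    {"source": "repository D", "doi": None, "title": "Etudes sur l'acces libre"},
]
unique_records = deduplicate(harvested)
```

A rule this simple misses a copy that carries a DOI in one source and only a title in another, and it says nothing about whether an item was published or peer-reviewed, which is the harder problem the slide points to.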

13

Top-down measurement: one would have to find an exhaustive bibliographic database of peer-reviewed articles and verify the availability of all papers. Main problems:

(1) there is no such database;
(2) it is extremely tedious to check all of them;
(3) how do you actually do that?

BIBLIOMETRICS – PROPORTION OF OA

14

Top-down measurement - Sampling: considering the enormous task at hand, most authors have resorted to using sampling and search engines.

  • Harnad and team sampled articles from the Web of Science.
  • Björk and team sampled articles from Scopus.
  • Archambault and team sampled articles from Scopus and used multiple techniques as well as search engines.

BIBLIOMETRICS – PROPORTION OF OA

15

Dealing with search engines:

  • Use user-friendly meta-search engines such as DuckDuckGo or DogPile.
  • Try to stay below the radar using mainstream search engines.

Neither solution feels remotely comfortable.

The other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science).

BIBLIOMETRICS – PROPORTION OF OA

16

Divergence from the real measure depends on:

  • The capacity to design an instrument that provides the true value (a function of recall and retrieval precision).
  • The capacity to increase statistical significance through large samples.

SAMPLING AND METROLOGY

17

A true positive (tp), in the present case, is a paper known to be available in OA which is found by the harvesting instrument developed for the current project. A true negative (tn) is an article which is not available for free and is not found by the instrument. False positives (fp) and false negatives (fn) are the converse cases: a paper reported by the instrument that is not actually available for free, and a paper available in OA that the instrument fails to find. Retrieval precision, also called positive predictive value, provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows:

$$\text{Retrieval Precision} = \frac{tp}{tp + fp}$$

Recall, also called true positive rate or sensitivity, is the capacity to correctly identify a large proportion of the positive records:

$$\text{Recall} = \frac{tp}{tp + fn}$$

Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score, which can then be applied to recalibrate the results to obtain a truer measure, one that corrects for the limits of the instrument. The adjustment made in the previous study is based on the following formula:

$$\text{Adjustment} = \frac{tp + fn}{tp + fp}$$

SAMPLING AND METROLOGY
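As a worked illustration of the three formulas on this slide, the following sketch computes them from a set of validation counts. The counts and the measured share are invented for the example, and multiplying the measured share by the adjustment score is an assumption about how the score is applied; the slide only says that it is used to recalibrate the results.

```python
def retrieval_precision(tp: int, fp: int) -> float:
    """Share of the instrument's positive results that are truly OA."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of the truly OA papers that the instrument finds."""
    return tp / (tp + fn)


def adjustment(tp: int, fp: int, fn: int) -> float:
    """Ratio of truly OA papers (tp + fn) to papers reported as OA (tp + fp)."""
    return (tp + fn) / (tp + fp)


# Hypothetical validation counts (not taken from the study).
tp, fp, fn = 380, 20, 120

precision_value = retrieval_precision(tp, fp)  # 380 / 400
recall_value = recall(tp, fn)                  # 380 / 500
adjustment_value = adjustment(tp, fp, fn)      # 500 / 400

# Assumed use of the score: scale a measured OA share by the adjustment.
measured_share = 0.40                          # hypothetical measured proportion
adjusted_share = measured_share * adjustment_value
```

Note that the adjustment equals retrieval precision divided by recall, so an instrument that misses many OA papers (low recall) but rarely reports wrong ones (high precision) yields an adjustment above 1.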

18

Statistical precision can be assessed with the margin of error (ME). For a proportion (p), where the population (N) is finite and known (which is the case here, as the population from which we are sampling is the Scopus database) and is not systematically much larger than the sample size (n), and in which the values are discrete (for example, papers are discrete, as one does not publish one third of a paper), given a critical score Z (which will be set at 0.95 in the study), ME is calculated as follows:

$$ME = Z \sqrt{\frac{p(1 - p)}{n} \cdot \frac{N - n}{N - 1}} + \frac{0.5}{n}$$

SAMPLING AND METROLOGY
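The margin-of-error formula can be sketched in a few lines. One assumption is made here and should be read as such: the slide states that Z is "set at 0.95", and the sketch interprets this as a 95% confidence level, which corresponds to a critical value of about 1.96. The sample and population sizes in the example are invented.

```python
import math
from statistics import NormalDist


def margin_of_error(p: float, n: int, N: int, confidence: float = 0.95) -> float:
    """Margin of error for a proportion, with a finite population
    correction and a 0.5/n continuity correction."""
    # Assumption: "Z set at 0.95" is read as a two-sided 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    finite_population_correction = (N - n) / (N - 1)
    return z * math.sqrt(p * (1 - p) / n * finite_population_correction) + 0.5 / n


# Hypothetical example: 1,000 sampled papers out of a population of 1,000,000,
# half of which are found to be OA.
me = margin_of_error(p=0.5, n=1_000, N=1_000_000)

# When the whole population is sampled, only the continuity term remains.
me_census = margin_of_error(p=0.5, n=100, N=100)
```

The finite population correction shrinks the margin as the sample approaches the size of the population, which matters when sampling within a small field or a single year.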

19

The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.

MEASURING THE % OF OA PAPERS

20

For Gold journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. The journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.

MEASURING THE % OF OA PAPERS
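The journal-matching step described on this slide might look roughly like the sketch below. Everything in it is hypothetical: the lookup tables, the ISSNs, the journal names, the record fields and the "first year as Gold" rule are invented for illustration and do not reproduce the actual DOAJ, PubMed Central or Scopus data structures.

```python
def normalise_issn(issn) -> str:
    """Drop the hyphen and surrounding spaces so '1234-5678' equals '12345678'."""
    return issn.replace("-", "").strip().upper() if issn else ""


def normalise_name(name: str) -> str:
    """Lower-case and collapse whitespace in a journal name."""
    return " ".join(name.lower().split())


# Hypothetical lookups: identifier -> first year the journal was Gold.
GOLD_BY_ISSN = {"12345678": 2005, "8765432X": 2008}
GOLD_BY_NAME = {"journal of open examples": 2005}


def gold_start_year(paper: dict):
    """Match on ISSN, then E-ISSN, then journal name; None if no match."""
    for field in ("issn", "eissn"):
        key = normalise_issn(paper.get(field))
        if key in GOLD_BY_ISSN:
            return GOLD_BY_ISSN[key]
    return GOLD_BY_NAME.get(normalise_name(paper.get("journal", "")))


def is_gold(paper: dict) -> bool:
    """A paper counts as Gold only if its journal was Gold in its publication year."""
    start = gold_start_year(paper)
    return start is not None and paper["year"] >= start


# Hypothetical random sample of papers.
sample = [
    {"journal": "Journal of Open Examples", "issn": "1234-5678", "eissn": None, "year": 2010},
    {"journal": "Journal of Open Examples", "issn": "1234-5678", "eissn": None, "year": 2003},
    {"journal": "Open Letters", "issn": None, "eissn": "8765-432x", "year": 2009},
    {"journal": "Subscription Review", "issn": "1111-2222", "eissn": None, "year": 2010},
]
gold_flags = [is_gold(paper) for paper in sample]
gold_share = sum(gold_flags) / len(sample)
```

Checking the publication year against the year the journal became Gold keeps papers from a journal's earlier subscription period out of the Gold count.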

21

Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013

RESULTS

[Figure: line chart. Y-axis: % of papers available in OA (0 to 60). X-axis: year (1994 to 2014). Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013.]

22

Translation of OA availability between April 2013 and April 2014

RESULTS

[Figure: line chart with exponential trend lines. Y-axis: % of papers available in OA (0 to 60). X-axis: year (2003 to 2012). Series: Adjusted OA April 2014; Adjusted OA April 2013. Trend lines shown on the chart: $y = 2 \times 10^{-21} e^{0.0234x}$ ($R^2 = 0.976$) and $y = 3 \times 10^{-17} e^{0.0186x}$ ($R^2 = 0.9473$).]

23

OA backfilling between April 2013 and April 2014 of papers published in 1996–2011

RESULTS

[Figure: chart with exponential trend line. Y-axis: number of OA papers backfilled between April 2013 and April 2014 (0 to 120,000). X-axis: year (1994 to 2014). Trend line shown on the chart: $y = 2 \times 10^{-112} e^{0.1335x}$ ($R^2 = 0.9976$).]

24

Growth of the number of papers available in OA as measured in April 2014, 1996–2013

RESULTS

[Figure: chart with exponential trend line. Y-axis: number of papers in OA (0 to 1,000,000). X-axis: year (1994 to 2014). Series: Adjusted OA; Measured OA. Trend line shown on the chart: $y = 2 \times 10^{-73} e^{0.09x}$ ($R^2 = 0.9971$).]

25

Scientific impact of OA and non-OA papers published in 1996–2011

RESULTS

[Figure: line chart. Y-axis: average of relative citations (ARC; 1 = world average), 0.0 to 1.8. X-axis: year (1996 to 2011). Series: OA; All Papers; Not OA.]

26

Impact contest by OA type, by field, 2009–2011. NB: Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

Field | 1st place (Type, ARC) | 2nd place (Type, ARC) | 3rd place (Type, ARC) | Least impact (Type, ARC)
----- | --------------------- | --------------------- | --------------------- | ------------------------
Agriculture, Fisheries & Forestry | Green OA, 1.57 | Other OA, 1.32 | Not OA, 0.88 | Gold OA, 0.51
Biology | Other OA, 1.37 | Green OA, 1.30 | Not OA, 0.69 | Gold OA, 0.47
Biomedical Research | Other OA, 1.23 | Green OA, 1.10 | Gold OA, 0.91 | Not OA, 0.65
Built Environment & Design | Green OA, 1.56 | Other OA, 1.28 | Not OA, 0.86 | Gold OA, 0.19
Chemistry | Other OA, 1.34 | Green OA, 1.28 | Not OA, 0.95 | Gold OA, 0.34
Clinical Medicine | Other OA, 1.56 | Green OA, 1.08 | Gold OA, 0.64 | Not OA, 0.63
Communication & Textual Studies | Other OA, 1.82 | Green OA, 1.51 | Not OA, 0.66 | Gold OA, 0.73
Earth & Environmental Sciences | Green OA, 1.46 | Other OA, 1.26 | Gold OA, 0.98 | Not OA, 0.72
Economics & Business | Green OA, 1.46 | Other OA, 1.30 | Not OA, 0.71 | Gold OA, 0.22
Enabling & Strategic Technologies | Green OA, 1.68 | Other OA, 1.53 | Not OA, 0.83 | Gold OA, 0.52
Engineering | Green OA, 1.84 | Other OA, 1.38 | Not OA, 0.83 | Gold OA, 0.55
General Arts, Humanities & Social Sciences | Green OA, 1.74 | Other OA, 1.49 | Not OA, 0.73 | Gold OA, 0.13
General Science & Technology | Green OA, 2.56 | Other OA, 2.24 | Gold OA, 0.69 | Not OA, 0.11
Historical Studies | Green OA, 2.37 | Other OA, 1.61 | Not OA, 0.76 | Gold OA, 0.37
Information & Communication Technology | Green OA, 1.62 | Other OA, 1.36 | Gold OA, 0.76 | Not OA, 0.69
Mathematics & Statistics | Green OA, 1.35 | Other OA, 1.11 | Not OA, 0.75 | Gold OA, 0.67
Philosophy & Theology | Green OA, 1.72 | Other OA, 1.63 | Gold OA, 0.86 | Not OA, 0.72
Physics & Astronomy | Green OA, 1.43 | Gold OA, 1.18 | Other OA, 1.04 | Not OA, 0.73
Psychology & Cognitive Sciences | Other OA, 1.35 | Green OA, 1.31 | Not OA, 0.66 | Gold OA, 0.59
Public Health & Health Services | Other OA, 1.38 | Green OA, 1.30 | Not OA, 0.76 | Gold OA, 0.71
Social Sciences | Green OA, 1.54 | Other OA, 1.44 | Not OA, 0.76 | Gold OA, 0.52
Visual & Performing Arts | Green OA, 2.16 | Other OA, 1.86 | Not OA, 0.77 | Gold OA, 0.29
Total | Green OA, 1.53 | Other OA, 1.36 | Not OA, 0.76 | Gold OA, 0.61

27

  • OA is a fast-moving phenomenon.
  • It is also quite complex to understand and to measure.
  • Uptake of OA is limited by heterogeneity and by challenges in discovery.

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects:

  • Organic growth, as more publishers, researchers and librarians make freshly published papers freely available.
  • "Backfilling" of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve.

CONCLUSION

29

On average, openly accessible papers have a decidedly greater impact.

In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, publishing in Gold journals yields greater impact than publishing in subscription-based journals with no self-archiving, even though these Gold journals are much younger and less established.

It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.

CONCLUSION

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers

Visit Science-Metrix to learn about our evaluation and measurement activities

THANK YOU

15

Dealing with search engines Use user-friendly meta-search engines such as DuckDuckGo or DogPile Try to stay below the radar using mainstream search engines Neither solution feels remotely confortable

Other solution is to build a dedicated infrastructure to facilitate OA discovery (this is the solution used by 1science)

BIBLIOMETRICS ndash PROPORTION OF OA

16

Divergence from the real measure is due to Capacity to design instrument that provides true value (function of recall and retrieval precision) Capacity to increase statistical significance through large samples

SAMPLING AND METROLOGY

17

A true positive (tp) in the present case is a paper known to be available in OA which is found by the harvesting instrument developed for the current project A true negative (tn) is an article which is not available for free and is not found by the instrument False positives and negatives (fp and fn) are the converse of the later Retrieval precision also called positive predictive value provides an estimation of how frequently the instrument finds correct positive results and is calculated as follows

Retrieval Precision = 119905119905119905119905+119891119905

Recall also called true positive rate or sensitivity is the capacity to correctly identify a large proportion of the positive records

Recall = 119905119905119905119905+119891119891

Knowing the precise characteristics in terms of true and false positives and negatives allows for the computation of an adjustment score which can then be applied to recalibrate the results to obtain a truer measure one that corrects the limits of the instrument The adjustment made in the previous study is based on the following formula

Adjustment = 119905119905+119891119891119905119905+119891119905

SAMPLING AND METROLOGY

18

Statistical precision can be assessed with the margin of error (ME) For a proportion (p) where the population is finite and known (which is the case here as the population from which we are sampling is the Scopus database) (N) is not systematically much larger than the sample size (n) and in which the values are discrete (for example papers are discrete as one does not publish one third of a paper) given a critical score Z (which will be set at 095 in the study) ME is calculated as follows

$$ME = Z \sqrt{\frac{p(1 - p)}{n} \cdot \frac{N - n}{N - 1}} + \frac{0.5}{n}$$
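As a rough sketch, the margin of error can be computed as below, assuming the formula combines a finite population correction, (N − n)/(N − 1), with a continuity correction, 0.5/n. The function name, the input checks and the example values are illustrative. The value `z=1.96` is the conventional critical value for a 95% confidence level and is an assumption here, not a figure from the slides.

```python
import math


def margin_of_error(p: float, n: int, N: int, z: float) -> float:
    """Margin of error for a proportion p estimated from a sample of size n
    drawn from a finite population of size N, with critical score z."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if N < 2 or not 1 <= n <= N:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    # Finite population correction: shrinks the error as n approaches N.
    fpc = (N - n) / (N - 1)
    # The 0.5 / n term is the continuity correction for discrete values.
    return z * math.sqrt(p * (1.0 - p) / n * fpc) + 0.5 / n


# Illustrative values only: 1,000 sampled papers from a population of one million.
me = margin_of_error(p=0.5, n=1_000, N=1_000_000, z=1.96)
```

When the sample covers the whole population (n = N), the square-root term vanishes and only the continuity correction remains.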

SAMPLING AND METROLOGY

19

The harvesting engine developed by Science-Metrix searches specific sites, including SciELO, PubMed Central, ResearchGate and CiteSeerX. It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv. It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR. Finally, a portion of the harvesting engine works in the cloud and searches for freely available papers.

MEASURING THE % OF OA PAPERS

20

For Gold Journal OA articles, an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published. Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central. This was done by matching journals' ISSN, E-ISSN and names from Scopus to the relevant records in the sample.
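The matching step described above might look like the following sketch. It only covers ISSN and E-ISSN matching and leaves out journal-name matching. The record types, the `gold_since` field (the first year a journal is treated as Gold) and the sample identifiers are hypothetical and do not reflect the actual DOAJ, PubMed Central or Scopus data layouts.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


def normalize_issn(issn: Optional[str]) -> Optional[str]:
    """Strip the hyphen and case differences so '1234-567x' matches '1234567X'."""
    if not issn:
        return None
    cleaned = issn.replace("-", "").strip().upper()
    return cleaned if len(cleaned) == 8 else None


@dataclass(frozen=True)
class GoldJournal:
    issn: Optional[str]
    e_issn: Optional[str]
    gold_since: int  # assumed field: first year the journal counts as Gold


@dataclass(frozen=True)
class Paper:
    issn: Optional[str]
    e_issn: Optional[str]
    year: int


def build_index(journals: Iterable[GoldJournal]) -> Dict[str, int]:
    """Map every known ISSN or E-ISSN to the earliest Gold year seen for it."""
    index: Dict[str, int] = {}
    for journal in journals:
        for raw in (journal.issn, journal.e_issn):
            key = normalize_issn(raw)
            if key is not None:
                index[key] = min(journal.gold_since, index.get(key, journal.gold_since))
    return index


def is_gold(paper: Paper, index: Dict[str, int]) -> bool:
    """A paper counts as Gold if its journal was Gold in its publication year."""
    for raw in (paper.issn, paper.e_issn):
        key = normalize_issn(raw)
        if key is not None and key in index and paper.year >= index[key]:
            return True
    return False


# Made-up identifiers, for illustration only.
index = build_index([GoldJournal(issn="1234-5678", e_issn=None, gold_since=2005)])
sample = [
    Paper(issn="12345678", e_issn=None, year=2010),   # Gold journal, Gold year
    Paper(issn="1234-5678", e_issn=None, year=2001),  # published before the journal was Gold
    Paper(issn=None, e_issn="8765-4321", year=2010),  # journal not in the Gold list
]
gold_share = sum(is_gold(p, index) for p in sample) / len(sample)
```

Keying the index on normalised identifiers lets a paper match on either its print or its electronic ISSN, and the year comparison keeps papers published before a journal became Gold out of the count.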

MEASURING THE % OF OA PAPERS

21

Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013

RESULTS

[Figure: line chart. Y-axis: % of papers available in OA (0 to 60). X-axis: publication year (1994 to 2014). Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013.]

22

Translation of OA availability between April 2013 and April 2014

RESULTS

[Figure: line chart with exponential trend lines. Y-axis: % of papers available in OA (0 to 60). X-axis: publication year (2003 to 2012). Series: Adjusted OA April 2014; Adjusted OA April 2013. Trend lines shown on the chart: y = 2E-21·e^(0.0234x), R² = 0.976; y = 3E-17·e^(0.0186x), R² = 0.9473.]

23

OA backfilling between April 2013 and April 2014 of papers published in 1996–2011

RESULTS

[Figure: chart with exponential trend line. Y-axis: number of OA papers backfilled between April 2013 and April 2014 (0 to 120,000). X-axis: publication year (1994 to 2014). Trend line shown on the chart: y = 2E-112·e^(0.1335x), R² = 0.9976.]

24

Growth of the number of papers available in OA as measured in April 2014, 1996–2013

RESULTS

[Figure: chart with exponential trend line. Y-axis: number of papers in OA (0 to 1,000,000). X-axis: publication year (1994 to 2014). Series: Adjusted OA; Measured OA. Trend line shown on the chart: y = 2E-73·e^(0.09x), R² = 0.9971.]
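An exponential trend of the form y = a·e^(bx) can be read as a yearly growth rate and a doubling time. The sketch below assumes the exponent printed on the growth chart is b = 0.09 per publication year, a value read from the chart's trend-line label. The function names are illustrative.

```python
import math


def annual_growth_rate(b: float) -> float:
    """Year-over-year growth implied by y = a * exp(b * x), with x in years."""
    return math.exp(b) - 1.0


def doubling_time(b: float) -> float:
    """Number of years for y = a * exp(b * x) to double; requires b > 0."""
    if b <= 0:
        raise ValueError("doubling time is only defined for a growing trend (b > 0)")
    return math.log(2.0) / b


b = 0.09  # assumed exponent of the trend line on the growth chart
growth = annual_growth_rate(b)  # a little over 9% more OA papers per publication year
years_to_double = doubling_time(b)  # between 7 and 8 years
```

Both quantities depend only on the exponent b, so the very small prefactor a in the fitted equation does not affect them.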

25

Scientific impact of OA and non-OA papers published in 1996–2011

RESULTS

[Figure: line chart. Y-axis: average of relative citations (ARC; 1 = world average), 0.0 to 1.8. X-axis: publication year (1996 to 2011). Series: OA; All Papers; Not OA.]

26

Impact contest by OA type, by field, 2009–2011. NB: Here, Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

| Field | 1st place: Type | ARC | 2nd place: Type | ARC | 3rd place: Type | ARC | Least impact: Type | ARC |
|---|---|---|---|---|---|---|---|---|
| Agriculture, Fisheries & Forestry | Green OA | 1.57 | Other OA | 1.32 | Not OA | 0.88 | Gold OA | 0.51 |
| Biology | Other OA | 1.37 | Green OA | 1.30 | Not OA | 0.69 | Gold OA | 0.47 |
| Biomedical Research | Other OA | 1.23 | Green OA | 1.10 | Gold OA | 0.91 | Not OA | 0.65 |
| Built Environment & Design | Green OA | 1.56 | Other OA | 1.28 | Not OA | 0.86 | Gold OA | 0.19 |
| Chemistry | Other OA | 1.34 | Green OA | 1.28 | Not OA | 0.95 | Gold OA | 0.34 |
| Clinical Medicine | Other OA | 1.56 | Green OA | 1.08 | Gold OA | 0.64 | Not OA | 0.63 |
| Communication & Textual Studies | Other OA | 1.82 | Green OA | 1.51 | Not OA | 0.66 | Gold OA | 0.73 |
| Earth & Environmental Sciences | Green OA | 1.46 | Other OA | 1.26 | Gold OA | 0.98 | Not OA | 0.72 |
| Economics & Business | Green OA | 1.46 | Other OA | 1.30 | Not OA | 0.71 | Gold OA | 0.22 |
| Enabling & Strategic Technologies | Green OA | 1.68 | Other OA | 1.53 | Not OA | 0.83 | Gold OA | 0.52 |
| Engineering | Green OA | 1.84 | Other OA | 1.38 | Not OA | 0.83 | Gold OA | 0.55 |
| General Arts, Humanities & Social Sciences | Green OA | 1.74 | Other OA | 1.49 | Not OA | 0.73 | Gold OA | 0.13 |
| General Science & Technology | Green OA | 2.56 | Other OA | 2.24 | Gold OA | 0.69 | Not OA | 0.11 |
| Historical Studies | Green OA | 2.37 | Other OA | 1.61 | Not OA | 0.76 | Gold OA | 0.37 |
| Information & Communication Technology | Green OA | 1.62 | Other OA | 1.36 | Gold OA | 0.76 | Not OA | 0.69 |
| Mathematics & Statistics | Green OA | 1.35 | Other OA | 1.11 | Not OA | 0.75 | Gold OA | 0.67 |
| Philosophy & Theology | Green OA | 1.72 | Other OA | 1.63 | Gold OA | 0.86 | Not OA | 0.72 |
| Physics & Astronomy | Green OA | 1.43 | Gold OA | 1.18 | Other OA | 1.04 | Not OA | 0.73 |
| Psychology & Cognitive Sciences | Other OA | 1.35 | Green OA | 1.31 | Not OA | 0.66 | Gold OA | 0.59 |
| Public Health & Health Services | Other OA | 1.38 | Green OA | 1.30 | Not OA | 0.76 | Gold OA | 0.71 |
| Social Sciences | Green OA | 1.54 | Other OA | 1.44 | Not OA | 0.76 | Gold OA | 0.52 |
| Visual & Performing Arts | Green OA | 2.16 | Other OA | 1.86 | Not OA | 0.77 | Gold OA | 0.29 |
| Total | Green OA | 1.53 | Other OA | 1.36 | Not OA | 0.76 | Gold OA | 0.61 |

27

OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and challenges in discovery.

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects:

Organic growth, as publishers, researchers and librarians increasingly make freshly published papers freely available.

“Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which together contribute to a translation of the availability curve.

CONCLUSION

29

On average, openly accessible papers have a decidedly greater impact.

In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, papers in Gold journals have a greater impact than papers published in subscription-based journals with no self-archiving, even though these Gold journals are much younger and less established.

It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers, and their high public cost, working to maximise diffusion and uptake is the least one can do.

CONCLUSION

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers

Visit Science-Metrix to learn about our evaluation and measurement activities

THANK YOU





19

The harvesting engine developed by Science-Metrix searches specific sites including Scielo PubMed Central Research Gate and CiteSeerX It also uses a locally hosted version of the metadata of large-scale specialised repositories such as arXiv It systematically harvests metadata from institutional repositories listed in ROAR and OpenDOAR Finally and in addition a portion of the harvesting engine works in the cloud and searches for freely available papers

MEASURING THE OF OA PAPERS

20

For Gold Journal OA articles an estimate of the proportion of papers was made from the random sample by matching the journals that were known to be Gold to the year a paper was published Journals were obtained from the Directory of Open Access Journals (DOAJ) and the list of OA journals in PubMed Central This was done by matching journalsrsquo ISSN E-ISSN and names from Scopus to the relevant records in the sample

MEASURING THE OF OA PAPERS

21

Evolution of the proportion of OA scientific papers as measured in April 2013 and April 2014, 1996–2013

RESULTS

[Figure: line chart. Y-axis: % of papers available in OA (0 to 60). X-axis: year (1994 to 2014). Series: Adjusted OA April 2014; Adjusted OA April 2013; Measured OA April 2014; Measured OA April 2013.]

22

Translation of OA availability between April 2013 and April 2014

RESULTS

[Figure: line chart with exponential trend lines. Y-axis: % of papers available in OA (0 to 60). X-axis: year (2003 to 2012). Series: Adjusted OA April 2014; Adjusted OA April 2013. Trend lines, in the order shown on the slide: $y = 2\times10^{-21}\,e^{0.0234x}$ ($R^2 = 0.976$); $y = 3\times10^{-17}\,e^{0.0186x}$ ($R^2 = 0.9473$).]
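
The slide does not say how the trend lines were fitted. One common way to obtain a curve of the form y = a·e^(bx) together with an R² value is ordinary least squares on log-transformed values, sketched below under that assumption. The data points are synthetic and generated inside the example; they are not the values behind the chart.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y).

    Returns (a, b, r_squared); r_squared is computed in log space.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive")

    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx                 # slope of ln(y) against x
    ln_a = mean_l - b * mean_x    # intercept of ln(y) against x

    ss_res = sum((l - (ln_a + b * x)) ** 2 for x, l in zip(xs, logs))
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return math.exp(ln_a), b, r_squared


# Synthetic series: a share growing by a factor exp(0.02) per year from 0.30.
years = list(range(2003, 2013))
shares = [0.30 * math.exp(0.02 * (year - 2003)) for year in years]

a, b, r_squared = fit_exponential(years, shares)
print(a, b, r_squared)
```

Because x is a calendar year of about 2000, the coefficient a comes out extremely small, which is consistent with the tiny leading factors in the trend-line equations on the slide.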

23

OA backfilling between April 2013 and April 2014 of papers published in 1996–2011

RESULTS

[Figure: chart with exponential trend line. Y-axis: number of OA papers backfilled between April 2013 and April 2014 (0 to 120,000). X-axis: year (1994 to 2014). Trend line: $y = 2\times10^{-112}\,e^{0.1335x}$ ($R^2 = 0.9976$).]

24

Growth of the number of papers available in OA as measured in April 2014, 1996–2013

RESULTS

[Figure: chart with exponential trend line. Y-axis: number of papers in OA (0 to 1,000,000). X-axis: year (1994 to 2014). Series: Adjusted OA; Measured OA. Trend line: $y = 2\times10^{-73}\,e^{0.09x}$ ($R^2 = 0.9971$).]
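
For a trend line of the form y = a·e^(bx), the exponent b implies a yearly growth factor of e^b and a doubling time of ln(2)/b. The short sketch below applies this to the exponents on the Results slides. The values assume that the decimal points lost in the transcript are restored as 0.0234, 0.0186, 0.1335 and 0.09, and the dictionary labels are descriptive names chosen for this example.

```python
import math

# Exponents read from the trend-line equations on slides 22 to 24,
# with decimal points restored (an assumption about the transcript).
FITTED_EXPONENTS = {
    "slide 22, first trend line (share of OA)": 0.0234,
    "slide 22, second trend line (share of OA)": 0.0186,
    "slide 23, backfilled OA papers": 0.1335,
    "slide 24, papers in OA": 0.09,
}


def annual_growth(b):
    """Relative increase per unit of x (one year) implied by exponent b."""
    return math.exp(b) - 1.0


def doubling_time(b):
    """Number of years for y = a * exp(b * x) to double."""
    return math.log(2) / b


for label, b in FITTED_EXPONENTS.items():
    print(f"{label}: {annual_growth(b):.1%} per year, "
          f"doubles in {doubling_time(b):.1f} years")
```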

25

Scientific impact of OA and non-OA papers published in 1996–2011

RESULTS

[Figure: line chart. Y-axis: average of relative citations (ARC, 1 = world average), 0.0 to 1.8. X-axis: year (1996 to 2011). Series: OA; All Papers; Not OA.]
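
The slide only states that ARC is scaled so that 1 equals the world average. A minimal sketch of such an indicator follows, assuming that each paper's citation count is divided by the mean citation count of all papers from the same field and year, and that the resulting scores are then averaged per group. The record layout and the citation counts are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def relative_citations(papers):
    """Score each paper against the mean of its (field, year) group.

    Returns one score per paper, in input order; None if the group mean is 0.
    """
    groups = defaultdict(list)
    for paper in papers:
        groups[(paper["field"], paper["year"])].append(paper["citations"])
    baselines = {key: mean(values) for key, values in groups.items()}

    scores = []
    for paper in papers:
        baseline = baselines[(paper["field"], paper["year"])]
        scores.append(paper["citations"] / baseline if baseline > 0 else None)
    return scores


def arc(papers, scores, keep):
    """Average of relative citations over the papers selected by `keep`."""
    selected = [s for p, s in zip(papers, scores) if s is not None and keep(p)]
    return mean(selected) if selected else None


# Made-up sample, for illustration only.
papers = [
    {"field": "Physics", "year": 2010, "citations": 12, "oa": True},
    {"field": "Physics", "year": 2010, "citations": 4, "oa": False},
    {"field": "Biology", "year": 2010, "citations": 30, "oa": True},
    {"field": "Biology", "year": 2010, "citations": 10, "oa": False},
    {"field": "Biology", "year": 2010, "citations": 20, "oa": False},
]
scores = relative_citations(papers)

arc_oa = arc(papers, scores, lambda p: p["oa"])
arc_not_oa = arc(papers, scores, lambda p: not p["oa"])
arc_all = arc(papers, scores, lambda p: True)
print(arc_oa, arc_not_oa, arc_all)
```

Normalising within field and year before averaging keeps fields with different citation habits comparable, which is why the score for all papers taken together comes out at 1.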

26

Impact contest by OA type, by field, 2009–2011. NB: Here Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

Field | 1st place (type, ARC) | 2nd place (type, ARC) | 3rd place (type, ARC) | Least impact (type, ARC)
Agriculture, Fisheries & Forestry | Green OA 1.57 | Other OA 1.32 | Not OA 0.88 | Gold OA 0.51
Biology | Other OA 1.37 | Green OA 1.30 | Not OA 0.69 | Gold OA 0.47
Biomedical Research | Other OA 1.23 | Green OA 1.10 | Gold OA 0.91 | Not OA 0.65
Built Environment & Design | Green OA 1.56 | Other OA 1.28 | Not OA 0.86 | Gold OA 0.19
Chemistry | Other OA 1.34 | Green OA 1.28 | Not OA 0.95 | Gold OA 0.34
Clinical Medicine | Other OA 1.56 | Green OA 1.08 | Gold OA 0.64 | Not OA 0.63
Communication & Textual Studies | Other OA 1.82 | Green OA 1.51 | Not OA 0.66 | Gold OA 0.73
Earth & Environmental Sciences | Green OA 1.46 | Other OA 1.26 | Gold OA 0.98 | Not OA 0.72
Economics & Business | Green OA 1.46 | Other OA 1.30 | Not OA 0.71 | Gold OA 0.22
Enabling & Strategic Technologies | Green OA 1.68 | Other OA 1.53 | Not OA 0.83 | Gold OA 0.52
Engineering | Green OA 1.84 | Other OA 1.38 | Not OA 0.83 | Gold OA 0.55
General Arts, Humanities & Social Sciences | Green OA 1.74 | Other OA 1.49 | Not OA 0.73 | Gold OA 0.13
General Science & Technology | Green OA 2.56 | Other OA 2.24 | Gold OA 0.69 | Not OA 0.11
Historical Studies | Green OA 2.37 | Other OA 1.61 | Not OA 0.76 | Gold OA 0.37
Information & Communication Technology | Green OA 1.62 | Other OA 1.36 | Gold OA 0.76 | Not OA 0.69
Mathematics & Statistics | Green OA 1.35 | Other OA 1.11 | Not OA 0.75 | Gold OA 0.67
Philosophy & Theology | Green OA 1.72 | Other OA 1.63 | Gold OA 0.86 | Not OA 0.72
Physics & Astronomy | Green OA 1.43 | Gold OA 1.18 | Other OA 1.04 | Not OA 0.73
Psychology & Cognitive Sciences | Other OA 1.35 | Green OA 1.31 | Not OA 0.66 | Gold OA 0.59
Public Health & Health Services | Other OA 1.38 | Green OA 1.30 | Not OA 0.76 | Gold OA 0.71
Social Sciences | Green OA 1.54 | Other OA 1.44 | Not OA 0.76 | Gold OA 0.52
Visual & Performing Arts | Green OA 2.16 | Other OA 1.86 | Not OA 0.77 | Gold OA 0.29
Total | Green OA 1.53 | Other OA 1.36 | Not OA 0.76 | Gold OA 0.61
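
As a quick cross-check of the table, the "Least impact" column can be tallied by OA type. The sketch below copies that column as printed on the slide for the 22 fields (the Total row is left out); the variable names are chosen for this example.

```python
from collections import Counter

# "Least impact" column of the table, one entry per field, as printed.
LEAST_IMPACT = {
    "Agriculture, Fisheries & Forestry": "Gold OA",
    "Biology": "Gold OA",
    "Biomedical Research": "Not OA",
    "Built Environment & Design": "Gold OA",
    "Chemistry": "Gold OA",
    "Clinical Medicine": "Not OA",
    "Communication & Textual Studies": "Gold OA",
    "Earth & Environmental Sciences": "Not OA",
    "Economics & Business": "Gold OA",
    "Enabling & Strategic Technologies": "Gold OA",
    "Engineering": "Gold OA",
    "General Arts, Humanities & Social Sciences": "Gold OA",
    "General Science & Technology": "Not OA",
    "Historical Studies": "Gold OA",
    "Information & Communication Technology": "Not OA",
    "Mathematics & Statistics": "Gold OA",
    "Philosophy & Theology": "Not OA",
    "Physics & Astronomy": "Not OA",
    "Psychology & Cognitive Sciences": "Gold OA",
    "Public Health & Health Services": "Gold OA",
    "Social Sciences": "Gold OA",
    "Visual & Performing Arts": "Gold OA",
}

# Count how often each type sits in last place.
tally = Counter(LEAST_IMPACT.values())

# Fields where subscription-only publishing (Not OA) comes last.
not_oa_fields = sorted(f for f, t in LEAST_IMPACT.items() if t == "Not OA")

print(tally)
print(not_oa_fields)
```

The tally uses the placement shown in the column and does not re-rank the ARC values.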

27

OA is a fast-moving phenomenon. It is also quite complex to understand and to measure. Uptake of OA is limited by heterogeneity and by challenges in discovery.

CONCLUSION

28

Growth of OA should be understood to comprise two main aspects:

Organic growth, as more publishers, researchers and librarians increasingly make freshly published papers freely available.

“Backfilling” of already published papers by researchers and librarians, and dis-embargoing of previously locked papers by publishers, which contribute to a translation of the availability curve.

CONCLUSION
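
The two aspects of growth named on this slide can be separated by comparing two measurement snapshots. The sketch below is a simplified illustration under stated assumptions: each snapshot maps a publication year to the number of OA papers found at measurement time, change within a year present in both snapshots is treated as backfilling, and years that appear only in the later snapshot are treated as organic growth. The function name and all counts are made up.

```python
def split_growth(earlier, later):
    """Compare two snapshots mapping publication year -> OA papers found.

    Returns (backfilled, organic):
      backfilled: change in OA count for years present in both snapshots
      organic:    OA count for years that appear only in the later snapshot
    """
    backfilled = {y: later[y] - earlier[y] for y in later if y in earlier}
    organic = {y: n for y, n in later.items() if y not in earlier}
    return backfilled, organic


# Made-up counts, for illustration only.
snapshot_2013 = {2010: 100, 2011: 110, 2012: 90}
snapshot_2014 = {2010: 112, 2011: 125, 2012: 118, 2013: 95}

backfilled, organic = split_growth(snapshot_2013, snapshot_2014)
print(backfilled)
print(organic)
```

In this toy example the backfilled counts raise the curve for years that were already measured, which is the translation of the availability curve that the slide refers to.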

29

On average, openly accessible papers have a decidedly greater impact.

In 7 fields, publishing in subscription-based journals and not self-archiving is the worst possible strategy. In these fields, papers in Gold journals have more impact than papers published in subscription-based journals with no self-archiving, even though these Gold journals are much younger and less established.

It is no longer adequate to publish and forget papers. One has to actively market papers and think of post-publishing communication strategies. Considering the high value of the knowledge contained in papers and their high public cost, working to maximise diffusion and uptake is the least one can do.

CONCLUSION

30

Visit 1science to learn about our solution to radically facilitate the discovery and use of peer-reviewed open access papers

Visit Science-Metrix to learn about our evaluation and measurement activities

THANK YOU

25

Scientific impact of OA and non-OA papers published in 1996–2011

RESULTS

[Figure: average of relative citations (ARC, 1 = world average) by publication year, 1996–2011; series: OA, All Papers, Not OA; vertical axis from 0.0 to 1.8]

26

Impact contest by OA type, by field, 2009–2011. NB: Here Gold refers to full-Gold journals, not to Gold papers in hybrid journals.

RESULTS

1st place 2nd place 3rd place Least impact





