Christian Geib christian.geib@strath.ac.uk Is Licensing the Answer to Existing Copyright Impediments to Data Mining? Different Licensing Models and their Feasibility


Page 1: Christian Geib christian.geib@strath.ac.uk Is Licensing the Answer to Existing Copyright Impediments to Data Mining? Different Licensing Models and their

Christian Geib

christian.geib@strath.ac.uk

Is Licensing the Answer to Existing Copyright Impediments to Data Mining?

Different Licensing Models and their Feasibility


Outline

I. What is data mining and why is it prima facie infringing?

II. If found to be prima facie infringing, are there available legal exceptions?

III. If the available legal exceptions are insufficient, could licensing be the solution?

IV. Different types of licensing

V. Specific and general problems with licensing

VI. Further research


I. What is data mining?

‘An automatic or semi-automatic process of analysis of large quantities of data in order to discover pattern and rules (Fayyad, p. 28). The process of data mining allows researchers to extract explicit and implicit information from data.’ 1

1 Fayyad et al., ‘From Data Mining to Knowledge Discovery in Databases’ (1996) 39 Communications of the ACM 28


I. What is data mining?

Examples:

A) Swanson’s study on Raynaud’s disease, which found correlations in journal articles (explicit data) that are ‘logically but not bibliographically connected’ 2

B) Target’s use of data mining to determine a pregnancy prediction score (implicit data)

2 Don R Swanson, ‘Two Medical Literatures That Are Logically but Not Bibliographically Connected’ (1987) 38 Journal of the American Society for Information Science 228, 221.


II. Basic potentially copyright-relevant steps for data mining

Data mining essentially involves two steps that are relevant to copyright law:

1st: the scanning of copyrighted works produces a computer image of, e.g., a text and thus creates a copy.

2nd: the application of Optical Character Recognition (OCR) converts the image into a text file. This can be viewed as a second copy (Borghi/Karapapa: “OCR-ing qualitatively different”).

These files are then placed into repositories.


II. Potentially copyright-relevant steps for data mining

More granular:

The obtaining of the sources: e.g. scraping & crawling

The transformation of the data to fit operational needs: e.g. data cleaning (punctuation etc.) & OCR-ing

The loading of the data: storage/repositories

The analysis of the data: actual mining (applying algorithms) 3

3 Triaille/de Meeus d’Argenteuil, Study on the legal framework of text and data mining (TDM) (2014), p. 45; The Hooper Report, p. 31
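The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any real mining tool: the two source sentences are invented toy data standing in for scraped articles, and the names SOURCES, clean, load and mine are hypothetical. Note that each stage produces a new copy or derived version of the text, which is the kind of act the slides flag as potentially copyright-relevant.

```python
import re
from collections import Counter

# Step 1 (obtaining): hypothetical stand-ins for scraped or crawled sources.
# The sentences are invented toy data, not quotations from real articles.
SOURCES = {
    "doc-a": "Fish oil reduces blood viscosity.",
    "doc-b": "High blood viscosity is observed in Raynaud's disease!",
}

def clean(text):
    """Step 2 (transformation): lowercase the text and strip punctuation."""
    return re.sub(r"[^a-z0-9\s]", "", text.lower())

def load(sources):
    """Step 3 (loading): store the cleaned copies in a repository (a dict here)."""
    return {doc_id: clean(text) for doc_id, text in sources.items()}

def mine(repository):
    """Step 4 (analysis): count in how many documents each term occurs."""
    counts = Counter()
    for text in repository.values():
        counts.update(set(text.split()))
    return counts

repository = load(SOURCES)
term_counts = mine(repository)
# Terms that occur in more than one document hint at a link between them.
shared_terms = sorted(term for term, n in term_counts.items() if n > 1)
print(shared_terms)
```

In this toy run the only terms shared by both documents are "blood" and "viscosity"; a real pipeline would of course use far larger corpora and more sophisticated algorithms.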


II. Potentially copyright-relevant categorization of data

Access granted by […] to […]    Types of data

All to all                      www data

Many to many                    Social networks data

One to many                     Proprietary data (publishers [by contract])

One to one 4                    Confidential data

4 Triaille/de Meeus d’Argenteuil, Study on the legal framework of text and data mining (TDM) (2014), p. 20; The Hooper Report, p. 20


II. Rights potentially affected by data mining

1) Reproduction right in Art. 2 InfoSoc Directive; restricted acts in s. 16(1)(a) and s. 17 CDPA 1988; Art. 5(a) Database Directive 96/9/EC; see also the meaning of ‘infringing copy’ in s. 27 CDPA 1988

2) Adaptation right in the copyright part of the Database Directive (“translation, adaptation, arrangement and any other alteration”) (Art. 5(b) 96/9/EC)

3) Extraction & re-utilization right in the sui generis part of the Database Directive (Art. 7(2)(a) & (b) 96/9/EC) & Reg. 16(1) Db. Regs

Cp. British Horseracing Board v William Hill re meaning of ‘substantial’ and Crowson Fabrics Ltd v Rider re meaning of extraction (‘permanent or temporary transfer’ … no requirement that extracted data use)


III. The new data mining exception-s. 29A

s. 29A Copies for text and data analysis for non-commercial research

“(1) The making of a copy of a work by a person who has lawful access to the work does not infringe copyright in the work provided that—

(a) the copy is made in order that a person who has lawful access to the work may carry out a computational analysis of anything recorded in the work for the sole purpose of research for a non-commercial purpose, and

(b) the copy is accompanied by a sufficient acknowledgement (unless this would be impossible for reasons of practicality or otherwise).

(2) Where a copy of a work has been made under this section, copyright in the work is infringed if—

(a) the copy is transferred to any other person, except where the transfer is authorised by the copyright owner, or

(b) the copy is used for any purpose other than that mentioned in subsection (1)(a), except where the use is authorised by the copyright owner.

…

(5) To the extent that a term of a contract purports to prevent or restrict the making of a copy which, by virtue of this section, would not infringe copyright, that term is unenforceable.”


III. Overview exceptions & potential shortcomings


IV. Licensing

1. ‘Licences for Europe’ Stakeholder Dialogue & withdrawal of European research libraries

2. Individually negotiated licenses

3. Standard licenses (pay per view [PPV])

4. Implied licences?

5. Compulsory licenses?

6. Open licenses such as open access publishing (incl. ‘hybrid’ journals), Creative Commons, open data commons, open government licenses


IV. Licensing

What is a compulsory license?

A compulsory licence is an involuntary contract between a right holder and a third party authorised by the Government. 5

5 Gowers Report, s. 4.64


IV. Licensing

What is an implied license?

An implied license is an unwritten license that permits a party (the licensee) to do what would normally require the express permission of another party (the licensor).6

6 Elaine O’Connor, ‘New Ink: The Perils of Superimposing Copyright Law on the Tattoo Industry’, Westminster Law Review, Vol. 3, Issue 1


V. Licensing-General problems

1. Scalability of obtaining numerous permissions (survey: 90% had to chase rights holders for permission; 12.5% of requests for permission to use material were never answered) 7

2. Costs of obtaining permission (especially PPV or special fees and the high margins of the journal publishers) 8

3. Limited discoverability for non-subscribers 9

4. Limited evidence that copyright protection and such a market-based solution provide incentives to create 10

5. (Potential) competition law challenges/unequal bargaining positions 11

6. Contract law challenges in some Member States, e.g. click-wrap and browse-wrap licenses

7 Gowers Report, s. 4.16; FOI request by Peter Murray-Rust: https://www.whatdotheyknow.com/ ; Peter Murray-Rust, ‘Text and Data Mining – Fighting for Our Digital Future (“Peter Murray-Rust Is the Problem”)’ <http://blogs.ch.cam.ac.uk/pmr/2013/10/02/text-and-data-mining-fighting-for-our-digital-future-peter-murray-rust-is-the-problem/>: ‘Licensing destroys Text and Data Mining … Imagine as few as 1000 researchers negotiating licences with 1000 publishers. That’s 1 million licence negotiations’; on the requirement to use the Elsevier API, see Standardisation in the area of innovation and technological development, notably in the field of Text and Data Mining: Report from the Expert Group, European Commission 2014

8 Gowers Report, s. 4.16; Open Access, House of Commons Report, Business, Innovation and Skills Committee, s. 15: ‘Those concerns were based in part on the market leader Reed Elsevier’s then operating profit margin of 34%. By the time of this inquiry, Reed Elsevier’s operating profit margin had increased to 37%.’ (Q6)

9 Interview with Peter Murray-Rust

10 Hargreaves Report, s. 2.6; Mark Lemley, Faith-Based Intellectual Property, 2015; Mark Lemley, IP in a World Without Scarcity, 2015

11 Magill, IMS Health, Microsoft
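The scalability figure quoted from Peter Murray-Rust in footnote 7 is simple arithmetic, and a short sketch makes the growth explicit. The function name bilateral_negotiations is a hypothetical label, and the model assumes, as the quotation does, that every researcher negotiates separately with every publisher.

```python
def bilateral_negotiations(researchers, publishers):
    """Number of separate licence negotiations if every researcher
    has to negotiate individually with every publisher."""
    return researchers * publishers

# The figures used in the quotation: 1000 researchers, 1000 publishers.
print(bilateral_negotiations(1000, 1000))
```

Because the count is a product, adding researchers or publishers multiplies the number of negotiations rather than merely adding to it.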


V. Licensing-Specific problems

1. Compulsory licensing: complex and expensive administrative procedure; limited legal certainty

2. Implied licensing: courts reluctant to imply licenses in the UK/EU12

3. Open access & Creative Commons: ‘Open Access’ is not a trade mark; the Open Access Scholarly Publishers Association (OASPA) does not currently require use of the CC-BY licence 13; concerns about Creative Commons Non-Commercial (NC) or BY-NC-ND; concerns about the privatization of rule making with Creative Commons 14; concerns about hybrid journals (rising costs of Article Processing Charges [APC], discoverability) 15; concerns re: Green and Gold Open Access 16

12 Orvec International Limited v Linfoots Limited [2014] EWHC

13 Standardisation in the area of innovation and technological development, notably in the field of Text and Data Mining: Report from the Expert Group, European Commission 2014, s. 28.

14 Séverine Dusollier, ‘The Master’s Tools v. The Master’s House: Creative Commons v. Copyright’ (2006) 29 Columbia Journal of Law & the Arts 282-283.

15 Interview with Peter Murray-Rust

16 Open Access, House of Commons Report, Business, Innovation and Skills Committee, s. 18: ‘The Finch Report: a U-turn in open access policy’


V. Licensing-Is a Digital Copyright Exchange (DCE) a game-changer?

Role of a DCE. It would allow users to:

1. Look for different types of content across the range of media types

2. Define and agree what uses they wish to make of the chosen content with the licensors

3. Be quoted a price by the licensor for those uses of the specified content that the system is programmed to offer

4. Pay for the rights online within the normal e-commerce framework

5. Have the content delivered to them in the appropriate format

6. Account back to the licensor as to what content was actually used so that the right creators can be paid their shares 17

17 The Hooper Report: Streamlining copyright licensing for the digital age. An independent report by Richard Hooper CBE and Dr Ros Lynch, July 2012


V. Licensing-Is a Digital Copyright Exchange (DCE) a game-changer?

1. Technical shortcomings: While a DCE might lower some of the transaction costs, it is questionable whether it would indeed work as seamlessly as the Domain Name System (DNS) [“Automated Licensing is the Future.” Reed Elsevier submission] 18, especially as the transactions are far more complex than a normal DNS lookup.

2. A DCE changes little about the costs: Even if the system were as seamless as DNS, the question of cost remains for large-scale data mining operations.

18 Hargreaves Report, s. 4.20


VI. Further research

1. More factual data needed: It is hard to establish how difficult it currently is for users to obtain authorizations from the rightholders (i.e. mainly the publishers) to engage in TDM. Statements made on this issue by the research sector and by publishers vary and partly contradict each other [see Licences for Europe discussions] 19

2. A larger body of licenses from EU research grants and licensing agreements needs to be examined for the conditions stipulated therein, for violations and the ‘disciplining’ of researchers, and for whether any licenses have been changed since the introduction of s. 29A CDPA 1988; see https://www.whatdotheyknow.com/

19 Triaille/de Meeus d’Argenteuil, Study on the legal framework of text and data mining (TDM) (2014), p. 45; The Hooper Report, p. 31


Thank you!

Comments and questions most welcome

christian.geib@strath.ac.uk