Similarity Algorithms for Music Plagiarism

Daniel Müllensiefen, Department of Psychology, Goldsmiths, University of London


Page 1: Similarity Algorithms for Music Plagiarismmas03dm/...MusicPlagiarism.pdf · for Music Plagiarism. Structure 1 Introduction 2 Method and Examples 3 Algorithms 4 Empirical Study

Daniel Müllensiefen

Department of Psychology

Goldsmiths, University of London

Similarity Algorithms for Music Plagiarism

Structure

1 Introduction
2 Method and Examples
3 Algorithms
4 Empirical Study
5 Summary – Future Research

Introduction: The Background

Huge public interest:
  Importance for the pop industry, interesting emphasis on tunes (melodic plagiarism)
  Ambiguity surrounding UK law in defining a substantial part

Lack of research. Exceptions:
  Timothy English: Sounds Like Teen Spirit (2008)
  Stan Soocher: They Fought The Law (1999)
  Charles Cronin: Concepts of Melodic Similarity in Music-Copyright (1998)

Introduction: No hard and fast rules!

The 2-second rule
The 1-bar rule
The 5-notes rule

It is more complicated than that!

What has been taken needs to be at least a substantial part of the copyright work (Lord Millett, Designers Guild v. Williams [2000] 1 WLR 2426)

Introduction: Müllensiefen & Pendzich (2009)

The first quantitative study on music plagiarism

Questions:
  Can computer algorithms predict plagiarism incidents from the melodic similarity of tunes?
  What is the predictive power of these algorithms when superimposed on documented case law?
  What is the frame of reference (directionality of comparisons)?
  How can prior musical knowledge be taken into account?

Method: Outline

Selection of 20 cases* from 1970 to 2005 – focus on melodic aspects of copyright infringement
Transcription to monophonic MIDI files
Analysis of judges' written opinions
Reduction of court decisions to two categories:
  “pro plaintiff” = melodic plagiarism
  “contra plaintiff” = no infringement
Use of context: corpus of 14,063 pop songs representing pop music history from 1950 to 2006

*from UCLA/Columbia copyright infringement database: http://cip.law.ucla.edu/

Example 1: Bright Tunes v. Harrisongs (1976, 420 F. Supp)

The Chiffons, “He's So Fine” (1963): No. 1 in US, highest UK position 11

George Harrison, “My Sweet Lord” (published in 1971): No. 1 hit in US, UK and (West) Germany

Example 2: Selle v. Gibb (1983, 567 F.Supp 1173)

Ronald Selle, “Let It End” (1975), unreleased

Bee Gees, “How Deep Is Your Love” (1977)

Algorithms

Statistically informed similarity algorithms: importance of the frequency of melodic elements for similarity assessment

Inspired by computational linguistics (Baayen, 2001) and text processing (Manning & Schütze, 1999; Jurafsky & Martin, 2000)

Conceptual components:
  m-types (aka n-grams) as melodic elements
  Frequency counts: type frequency (TF) and inverted document frequency (IDF)

Melodic elements: m-types

Word type t   Frequency f(t)   Melodic type τ (pitch interval, length 2)   Frequency f(τ)
Twinkle       2                0, +7                                       1
little        1                +7, 0                                       1
star          1                0, +2                                       1
How           1                +2, 0                                       1
I             1                0, -2                                       3
wonder        1                -2, -2                                      1
what          1                -2, 0                                       2
you           1                0, -1                                       1
are           1                -1, 0                                       1
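
The melodic-type counts in the table can be reproduced with a short sketch. It assumes the first two lines of "Twinkle, Twinkle, Little Star" transcribed in C major as MIDI pitch numbers (the slide gives no transcription), and the helper name `m_types` is hypothetical, not taken from the authors' software.

```python
from collections import Counter

# Assumed transcription (MIDI pitches, C major): C C G G A A G F F E E D D C
PITCHES = [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60]


def m_types(pitches, n=2):
    """Return every run of n consecutive pitch intervals (in semitones)."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return [tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)]


# Count how often each interval pair (m-type of length 2) occurs.
counts = Counter(m_types(PITCHES))
for tau, freq in counts.items():
    print(tau, freq)
```

Under this transcription the melody yields 12 m-type tokens of length 2 spread over 9 distinct types, with (0, -2) occurring three times, as in the table.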

Type- and Inverted Document Frequencies

$$\mathrm{TF}(m,\tau) = \frac{f_m(\tau)}{\sum_{i=1}^{T} f_m(\tau_i)}$$

$$\mathrm{IDF}_C(\tau) = \log\frac{|C|}{|m : \tau \in m|}$$

C              Corpus of melodies
m              Melody
τ              Melodic type
T              # different melodic types
|m: τ ∈ m|     # melodies containing τ

Melodic type τ (pitch interval, length 2)   Frequency f(τ)   TF(m, τ)   IDF_C(τ)   TFIDF_{m,C}(τ)
0, +7                                       1                0.083      1.57       0.13031
+7, 0                                       1                0.083      1.36       0.11288
0, +2                                       1                0.083      0.23       0.01909
+2, 0                                       1                0.083      0.28       0.02324
0, -2                                       3                0.25       0.16       0.04
-2, -2                                      1                0.083      0.19       0.01577
-2, 0                                       2                0.167      0.22       0.03674
0, -1                                       1                0.083      0.51       0.04233
-1, 0                                       1                0.083      0.74       0.06142
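
As a worked check of the table, the sketch below recomputes TF from the counts and multiplies it by the tabulated IDF values. The last column of the table appears to be the product TF × IDF (for example 0.083 × 1.57 = 0.13031), so that is what the code assumes. The IDF numbers are copied from the slide because the 14,063-song corpus is not available here. The `idf` helper only illustrates the definition on a toy corpus, and its natural logarithm is an assumption, since the slide does not state the log base.

```python
import math
from collections import Counter

# Counts f(τ) copied from the table.
COUNTS = Counter({(0, 7): 1, (7, 0): 1, (0, 2): 1, (2, 0): 1, (0, -2): 3,
                  (-2, -2): 1, (-2, 0): 2, (0, -1): 1, (-1, 0): 1})

# IDF values copied from the table (not recomputed from a corpus).
IDF_FROM_SLIDE = {(0, 7): 1.57, (7, 0): 1.36, (0, 2): 0.23, (2, 0): 0.28,
                  (0, -2): 0.16, (-2, -2): 0.19, (-2, 0): 0.22,
                  (0, -1): 0.51, (-1, 0): 0.74}


def tf(counts):
    """Relative frequency of each m-type within one melody."""
    total = sum(counts.values())
    return {tau: f / total for tau, f in counts.items()}


def idf(corpus, tau):
    """log(|C| / number of melodies containing tau).

    corpus is a list of sets of m-types; tau must occur in at least one.
    The natural log is an assumption, not taken from the slides.
    """
    containing = sum(1 for melody in corpus if tau in melody)
    return math.log(len(corpus) / containing)


TF = tf(COUNTS)
# Assumed combination: TFIDF = TF * IDF, as the table values suggest.
TFIDF = {tau: TF[tau] * IDF_FROM_SLIDE[tau] for tau in COUNTS}
```

Because the table multiplies the rounded TF (0.083) while the code uses 1/12 exactly, the results agree with the slide only to about three decimal places.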

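TF-IDF Correlation

$$\sigma_C(s,t) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{TFIDF}_{s,C}(\tau) \cdot \mathrm{TFIDF}_{t,C}(\tau)}{\sqrt{\sum_{\tau \in s_n \cap t_n} \left(\mathrm{TFIDF}_{s,C}(\tau)\right)^2 \cdot \sum_{\tau \in s_n \cap t_n} \left(\mathrm{TFIDF}_{t,C}(\tau)\right)^2}}$$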
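
A minimal sketch of the TF-IDF correlation follows. It assumes the usual cosine-style normalisation, that is, a square root over the product of the two sums of squares in the denominator, with all sums restricted to the m-types shared by both melodies. The function name and the choice to return 0 when no m-types are shared are assumptions made for illustration.

```python
import math


def tfidf_correlation(tfidf_s, tfidf_t):
    """Cosine-style correlation over the m-types shared by melodies s and t.

    tfidf_s and tfidf_t map each m-type to its TF-IDF weight in that melody.
    """
    shared = tfidf_s.keys() & tfidf_t.keys()
    if not shared:
        return 0.0  # assumed convention: no shared m-types means no similarity
    numerator = sum(tfidf_s[tau] * tfidf_t[tau] for tau in shared)
    denominator = math.sqrt(sum(tfidf_s[tau] ** 2 for tau in shared)
                            * sum(tfidf_t[tau] ** 2 for tau in shared))
    return numerator / denominator if denominator else 0.0
```

One consequence of summing only over shared m-types is that two melodies sharing a single m-type score 1 regardless of everything else they contain, so in practice the measure is only informative when several m-types overlap.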

Feature-based Similarity

Ratio model (Tversky, 1977): similarity σ(s,t) is related to the number of features s and t have in common and to the salience of the features, f()

$$\sigma(s,t) = \frac{f(s_n \cap t_n)}{f(s_n \cap t_n) + \alpha f(s_n \setminus t_n) + \beta f(t_n \setminus s_n)}, \quad \alpha, \beta \geq 0$$

features => m-types
salience => IDF and TF
different values of α, β to change the frame of reference

Variable m-type lengths (n = 1,…,4), entropy-weighted average
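
The ratio model can be sketched for a single m-type length as below. Salience is taken to be the summed IDF of a set of m-types, which is one reading of "salience => IDF". The function name, the toy IDF values and the zero-denominator convention are hypothetical, and the entropy-weighted average over n = 1,…,4 is not implemented because the slides do not specify it.

```python
def tversky_ratio(s_types, t_types, idf, alpha=1.0, beta=1.0):
    """Tversky ratio model with summed IDF as the salience function f()."""
    s_types, t_types = set(s_types), set(t_types)

    def salience(types):
        return sum(idf[tau] for tau in types)

    common = salience(s_types & t_types)
    denominator = (common
                   + alpha * salience(s_types - t_types)
                   + beta * salience(t_types - s_types))
    return common / denominator if denominator else 0.0


# Toy, made-up IDF weights and m-type sets, for illustration only.
toy_idf = {"a": 1.0, "b": 2.0, "c": 3.0}
s, t = {"a", "b"}, {"b", "c"}
print(tversky_ratio(s, t, toy_idf))                        # alpha = beta = 1
print(tversky_ratio(s, t, toy_idf, alpha=1.0, beta=0.0))   # only s \ t counts
print(tversky_ratio(s, t, toy_idf, alpha=0.0, beta=1.0))   # only t \ s counts
```

Changing α and β shifts the frame of reference: with β = 0 only the material unique to s lowers the score, and with α = 0 only the material unique to t does.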

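Feature-based Similarity

Tversky.equal measure (with α = β = 1)

$$\sigma(s,t) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau)}{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in s_n \setminus t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in t_n \setminus s_n} \mathrm{IDF}_C(\tau)}$$

Tversky.plaintiff.only measure (with α = 1, β = 0)

$$\sigma_{\mathrm{plaintiff.only}}(s,t) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau)}{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in s_n \setminus t_n} \mathrm{IDF}_C(\tau)}$$

Tversky.defendant.only measure (with α = 0, β = 1)

$$\sigma_{\mathrm{defendant.only}}(t,s) = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau)}{\sum_{\tau \in s_n \cap t_n} \mathrm{IDF}_C(\tau) + \sum_{\tau \in t_n \setminus s_n} \mathrm{IDF}_C(\tau)}$$

Tversky.weighted measure with

$$\alpha = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{TF}_s(\tau)}{\sum_{\tau \in s_n} \mathrm{TF}_s(\tau)} \quad \text{and} \quad \beta = \frac{\sum_{\tau \in s_n \cap t_n} \mathrm{TF}_t(\tau)}{\sum_{\tau \in t_n} \mathrm{TF}_t(\tau)}$$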
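
The weighted variant can be sketched as follows, under the reading that α and β are the shares of each melody's TF mass carried by the shared m-types, and that s is the plaintiff's melody and t the defendant's (as the plaintiff.only and defendant.only measures suggest). The function name and the toy numbers are hypothetical.

```python
def tversky_weighted(tf_s, tf_t, idf):
    """Tversky ratio with alpha and beta derived from term frequencies.

    tf_s, tf_t: non-empty dicts mapping m-type -> TF in melody s / melody t.
    idf: dict mapping m-type -> IDF weight, used as salience.
    """
    s_types, t_types = set(tf_s), set(tf_t)
    shared = s_types & t_types
    # Share of each melody's TF mass that lies on the shared m-types.
    alpha = sum(tf_s[tau] for tau in shared) / sum(tf_s.values())
    beta = sum(tf_t[tau] for tau in shared) / sum(tf_t.values())
    common = sum(idf[tau] for tau in shared)
    denominator = (common
                   + alpha * sum(idf[tau] for tau in s_types - t_types)
                   + beta * sum(idf[tau] for tau in t_types - s_types))
    return common / denominator if denominator else 0.0


# Toy, made-up values for illustration only.
toy_idf = {"a": 1.0, "b": 2.0, "c": 3.0}
score = tversky_weighted({"a": 0.5, "b": 0.5}, {"b": 0.25, "c": 0.75}, toy_idf)
```

In this toy case α = 0.5 and β = 0.25, so the score is 2 / (2 + 0.5 × 1 + 0.25 × 3) = 2 / 3.25.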

Evaluation

Ground truth: 20 cases with pro/contra decision (7/13)

Evaluation metrics:
  Accuracy (% correct at optimal cut-off on similarity scale)
  AUC (area under the receiver operating characteristic curve)
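
Both metrics can be computed from a list of similarity scores and the corresponding court decisions, as in the sketch below. The function names are hypothetical and the scores are made-up toy numbers, not results from the study. AUC is computed through its pairwise interpretation, the probability that a randomly chosen "pro plaintiff" case scores higher than a randomly chosen "contra plaintiff" case.

```python
def auc(scores, labels):
    """Area under the ROC curve; labels are 1 (pro plaintiff) or 0 (contra)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_accuracy(scores, labels):
    """Highest accuracy over all cut-offs (predict 1 when score >= cut-off)."""
    cutoffs = sorted(set(scores)) + [float("inf")]
    best = 0.0
    for cutoff in cutoffs:
        correct = sum((s >= cutoff) == (l == 1) for s, l in zip(scores, labels))
        best = max(best, correct / len(scores))
    return best


# Made-up toy data: four cases, two decided for the plaintiff.
toy_scores = [0.9, 0.2, 0.4, 0.3]
toy_labels = [1, 1, 0, 0]
print(auc(toy_scores, toy_labels), best_accuracy(toy_scores, toy_labels))
```

Because the cut-off is chosen on the same cases it is evaluated on, this accuracy is an optimistic figure, which matters with only 20 cases.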

Evaluation

[Figure: bar chart of Accuracy and AUC (scale 0.0 to 1.0) for the algorithms Edit Dist., Sum Com., Ukkonen, TF-IDF correl., Tv.equal, Tv.plaintiff, Tv.defendant, Tv.weighted]

Evaluation: ROC Curves

A Qualitative Look

Observations:
  Decision sometimes based on ‘characteristic motives’ (incl. rhythm)
  High-level form can be important (e.g. call-and-response structure)
  Frame of reference can be different

Ronald Selle, “Let It End”

Bee Gees, “How Deep Is Your Love”

Summary

Court decisions can be related closely to melodic similarity

Plaintiff’s song is often the frame of reference when determining the question of substantial part (“It depends upon its importance to the copyright work. It does not depend upon its importance to the defendants”, per Lord Millett)

Statistical information about the commonness of melodic elements is important

Future Research

Conduct analogous studies with cases from the UK and Germany and utilize additional US cases
Include rhythm in m-types
Use higher-level features of melodies in addition to m-types
Conduct listening experiment to determine cognitive validity

Daniel Müllensiefen

Department of Psychology

Goldsmiths, University of London

Similarity Algorithms for Music Plagiarism