
© 2007 Y. Sasaki, Version: 26th October, 2007

The truth of the F-measure

Yutaka Sasaki
Research Fellow

School of Computer Science, University of Manchester
MIB, 131 Princess Street, Manchester, M1 7DN

Yutaka.Sasaki@manchester.ac.uk

October 26, 2007

Abstract

More than 15 years have passed since the F-measure was first introduced to the evaluation of information extraction technology at the Fourth Message Understanding Conference (MUC-4) in 1992. Recently, I have sometimes seen confusion over the definition of the F-measure, which seems to be triggered by a lack of background knowledge about how the F-measure was derived. Since I was not involved in introducing or devising the F-measure, I might not be the best person to explain this, but I hope this note will be of some help to those who are wondering what the F-measure really is. This introduction aims to provide brief but sufficient information on the F-measure.

1 Overview

Definition of the F-measure

The F-measure is defined as the harmonic mean of precision (P) and recall (R):¹

$$F = \frac{2PR}{P + R}.$$

¹ In biomedicine, precision is called positive predictive value (PPV) and recall is called sensitivity, but to my knowledge there is nothing corresponding to the F-measure in that domain.



If you are satisfied with this definition and need no further information, that's it. However, if you are deeply interested in the definition of the F-measure, you should review the definitions of the arithmetic and harmonic means.

Arithmetic and harmonic means

The arithmetic mean A (an average in the usual sense) and the harmonic mean H are defined as follows.

$$A = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}(x_1 + x_2 + \dots + x_n).$$

$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1} + \dots + \frac{1}{x_n}}.$$

When $x_1 = P$ and $x_2 = R$, A and H will be:

$$A = \frac{1}{2}(P + R).$$

$$H = \frac{2}{\frac{1}{P} + \frac{1}{R}} = \frac{2}{\frac{P + R}{PR}} = \frac{2PR}{P + R}.$$

The harmonic mean is more intuitive than the arithmetic mean when computing a mean of ratios.

Suppose that you have a fingerprint recognition system whose precision and recall are 1.0 and 0.2, respectively. Intuitively, the overall performance of the system should be very low because the system covers only 20% of the registered fingerprints, which means it is almost useless.

The arithmetic mean of 1 and 0.2 is 0.6, whereas their harmonic mean is

$$\frac{2 \times 1 \times \frac{2}{10}}{1 + \frac{2}{10}} = \frac{4}{12} = \frac{1}{3}.$$

As you see in this example, the harmonic mean (0.333...) is a more reasonable score than the arithmetic mean (0.6).
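The comparison can be reproduced numerically. The following is a minimal Python sketch of the two means; the function names `arithmetic_mean` and `harmonic_mean` are illustrative choices, not part of the original note, and the harmonic mean assumes all inputs are strictly positive.

```python
def arithmetic_mean(values):
    """Arithmetic mean A = (x_1 + ... + x_n) / n."""
    return sum(values) / len(values)


def harmonic_mean(values):
    """Harmonic mean H = n / (1/x_1 + ... + 1/x_n); assumes every value > 0."""
    return len(values) / sum(1.0 / x for x in values)


# Fingerprint example from the text: precision 1.0, recall 0.2.
precision, recall = 1.0, 0.2
print(arithmetic_mean([precision, recall]))  # about 0.6
print(harmonic_mean([precision, recall]))    # about 0.333
```

The harmonic mean is dominated by the smaller of the two values, which is why it penalizes the low recall much more strongly than the arithmetic mean does.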


2 Derivation of the F-measure

Some researchers call the definition of the F-measure in the previous section the $F_1$-measure. What is the 1 of $F_1$?

The full definition of the F-measure is given as follows [Chinchor, 1992]:

$$F_\beta = \frac{(\beta^2 + 1)PR}{\beta^2 P + R} \quad (0 \le \beta \le +\infty).$$

$\beta$ is a parameter that controls the balance between P and R. When $\beta = 1$, $F_1$ is equivalent to the harmonic mean of P and R. If $\beta > 1$, F becomes more recall-oriented, and if $\beta < 1$, it becomes more precision-oriented; e.g., $F_0 = P$.
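The effect of the parameter can be seen by evaluating the formula for a few values. This is a minimal sketch, not a reference implementation: the name `f_beta` and the choice to return 0 when the denominator is zero are assumptions made here for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """F_beta = (beta^2 + 1) * P * R / (beta^2 * P + R)."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        # Assumption: define the score as 0 where the formula is 0/0.
        return 0.0
    return (beta ** 2 + 1.0) * precision * recall / denominator


p, r = 1.0, 0.2
print(f_beta(p, r, beta=1.0))  # harmonic mean of P and R, about 0.333
print(f_beta(p, r, beta=0.0))  # reduces to P
print(f_beta(p, r, beta=2.0))  # recall-oriented: moves toward R
print(f_beta(p, r, beta=0.5))  # precision-oriented: moves toward P
```

With P = 1.0 and R = 0.2, increasing the parameter pulls the score down toward the recall, and decreasing it pulls the score up toward the precision.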

While it seems that van Rijsbergen did not define the formula of the F-measure per se, the origin of the definition of the F-measure is van Rijsbergen's E (effectiveness) function [van Rijsbergen, 1979]:

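$$E = 1 - \frac{1}{\alpha \frac{1}{P} + (1 - \alpha)\frac{1}{R}},$$

where $\alpha = \frac{1}{\beta^2 + 1}$. Let's remove $\alpha$ using $\beta$.

$$\begin{aligned}
E &= 1 - \frac{1}{\frac{1}{\beta^2 + 1}\frac{1}{P} + \left(1 - \frac{1}{\beta^2 + 1}\right)\frac{1}{R}} \\
  &= 1 - \frac{PR}{\frac{1}{\beta^2 + 1}R + \frac{\beta^2 + 1 - 1}{\beta^2 + 1}P} \\
  &= 1 - \frac{(\beta^2 + 1)PR}{R + \beta^2 P}.
\end{aligned}$$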

Now you see that $E = 1 - F_\beta$.

Note that F rises if R or P gets better, whereas E becomes smaller if R or P improves. This seems to be the reason why F is more commonly used than E.
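The relation between E and F can also be checked numerically. The sketch below assumes strictly positive P and R; the function names `e_measure` and `f_beta` and the sample values 0.8 and 0.5 are illustrative choices, not taken from the original note.

```python
def e_measure(precision, recall, alpha):
    """van Rijsbergen's E = 1 - 1 / (alpha/P + (1 - alpha)/R); assumes P, R > 0."""
    return 1.0 - 1.0 / (alpha / precision + (1.0 - alpha) / recall)


def f_beta(precision, recall, beta):
    """F_beta = (beta^2 + 1) * P * R / (beta^2 * P + R); assumes P, R > 0."""
    return (beta ** 2 + 1.0) * precision * recall / (beta ** 2 * precision + recall)


p, r = 0.8, 0.5
for beta in (0.5, 1.0, 2.0):
    alpha = 1.0 / (beta ** 2 + 1.0)  # the substitution used in the derivation
    # The last two printed values should agree up to floating-point rounding.
    print(beta, e_measure(p, r, alpha), 1.0 - f_beta(p, r, beta))
```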

Some people use $\alpha$ as a parameter of F.

$$F_\alpha = \frac{1}{\alpha \frac{1}{P} + (1 - \alpha)\frac{1}{R}} \quad (0 \le \alpha \le 1).$$


There is nothing wrong with this definition of F, but its use might cause unnecessary confusion because $F_{\alpha=0.5} = F_{\beta=1}$. Note that the commonly used notation $F_1$ means $F_{\beta=1}$, not $F_{\alpha=1}$.
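The difference between the two parameterizations is easy to see with numbers. The following sketch assumes strictly positive P and R; the names `f_alpha` and `f_beta` and the sample values are illustrative only.

```python
def f_alpha(precision, recall, alpha):
    """F_alpha = 1 / (alpha/P + (1 - alpha)/R), 0 <= alpha <= 1; assumes P, R > 0."""
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


def f_beta(precision, recall, beta):
    """F_beta = (beta^2 + 1) * P * R / (beta^2 * P + R); assumes P, R > 0."""
    return (beta ** 2 + 1.0) * precision * recall / (beta ** 2 * precision + recall)


p, r = 0.8, 0.5
# alpha = 0.5 and beta = 1 give the same score (the harmonic mean).
print(f_alpha(p, r, alpha=0.5), f_beta(p, r, beta=1.0))
# alpha = 1 ignores recall entirely and returns P, which is not the usual F_1.
print(f_alpha(p, r, alpha=1.0))
```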

3 Further investigation of $\beta$

Still, some of you may not be sure why $\beta^2$ is used instead of $\beta$ in $\alpha = \frac{1}{\beta^2 + 1}$. The best way to understand this is to read Chapter 7 of van Rijsbergen's masterpiece [van Rijsbergen, 1979]. However, let me try to explain the reason.

$\beta$ is the parameter that controls the weighting between P and R. Formally, $\beta$ is defined as follows:

$$\beta = R/P, \quad \text{where } \frac{\partial E}{\partial P} = \frac{\partial E}{\partial R}.$$

The motivation behind this condition is that at the point where the gradients of E w.r.t. P and R are equal, the ratio of R to P should be the desired ratio $\beta$.

Please recall that E is defined as follows:

$$E = 1 - \frac{1}{\alpha \frac{1}{P} + (1 - \alpha)\frac{1}{R}} = 1 - \frac{PR}{\alpha R + (1 - \alpha)P}.$$

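Now we calculate $\frac{\partial E}{\partial P}$ and $\frac{\partial E}{\partial R}$. By the quotient rule for derivatives, $(f/g)' = (f'g - fg')/g^2$. For conciseness, let $g = \alpha R + (1 - \alpha)P$.

$$\frac{\partial E}{\partial P} = -\frac{R(\alpha R + (1 - \alpha)P) - PR(1 - \alpha)}{g^2}.$$

$$\frac{\partial E}{\partial R} = -\frac{P(\alpha R + (1 - \alpha)P) - PR\alpha}{g^2}.$$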
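The two partial derivatives of E can be sanity-checked against central finite differences. This is a rough numerical sketch under stated assumptions: the function names, the step size `h`, and the sample point (P = 0.8, R = 0.5, alpha = 0.3) are all chosen here for illustration.

```python
def e_measure(p, r, alpha):
    """E = 1 - P*R / (alpha*R + (1 - alpha)*P); assumes P, R > 0."""
    return 1.0 - p * r / (alpha * r + (1.0 - alpha) * p)


def de_dp(p, r, alpha):
    """Analytic dE/dP from the quotient rule, with g = alpha*R + (1 - alpha)*P."""
    g = alpha * r + (1.0 - alpha) * p
    return -(r * g - p * r * (1.0 - alpha)) / g ** 2


def de_dr(p, r, alpha):
    """Analytic dE/dR from the quotient rule, with the same g."""
    g = alpha * r + (1.0 - alpha) * p
    return -(p * g - p * r * alpha) / g ** 2


def central_difference(func, x, h=1e-6):
    """Numerical derivative of a one-argument function at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


p, r, alpha = 0.8, 0.5, 0.3
numeric_dp = central_difference(lambda x: e_measure(x, r, alpha), p)
numeric_dr = central_difference(lambda x: e_measure(p, x, alpha), r)
print(de_dp(p, r, alpha), numeric_dp)  # the pair should agree closely
print(de_dr(p, r, alpha), numeric_dr)  # the pair should agree closely
```

Both derivatives are negative for positive P and R, which matches the earlier observation that E decreases as either precision or recall improves.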

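Then, $\frac{\partial E}{\partial P} = \frac{\partial E}{\partial R}$ is equivalent to:

$$R(\alpha R + (1 - \alpha)P) - PR(1 - \alpha) = P(\alpha R + (1 - \alpha)P) - PR\alpha,$$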
