f-measure-ys-26oct07[1].pdf
TRANSCRIPT
-
7/27/2019 F-measure-YS-26Oct07[1].pdf
1/5
c 2007 Y. Sasaki, Version: 26th October, 2007 1
The truth of the F-measure
Yutaka SasakiResearch Fellow
School of Computer Science, University of ManchesterMIB, 131 Princess Street, Manchester, M1 7DN
Yutaka.Sasaki a manchester.ac.uk
October 26, 2007
Abstract
It has been past more than 15 years since the F-measure was rstintroduced to evaluation tasks of information extraction technologyat the Fourth Message Understanding Conference (MUC-4) in 1992.Recently, sometimes I see some confusion with the denition of the F-measure, which seems to be triggered by lack of background knowledgeabout how the F-measure was derived. Since I was not involved in theprocess of the introduction or device of the F-measure, I might not bethe best person to explain this but I hope this note would be a littlehelp for those who are wondering what the F-measure really is. Thisintroduction is devoted to provide brief but sufficient information onthe F-measure.
1 Overview
Denition of the F-measure
The F-measure is dened as a harmonic mean of precision (P) and recall(R): 1
F =2P R
P + R .1 In biomedicine, precision is called positive predictive value (PPV) and recall is called
sensitivity but to my knowledge, there is nothing corresponding to the F-measure in thedomain.
-
7/27/2019 F-measure-YS-26Oct07[1].pdf
2/5
If you are satised with this denition and need no further information,thats it. However, if you are deeply interested in the denition of the F-measure, you should recap the denitions of the arithmetic and harmonicmeans.
Arithmetic and harmonic means
The arithmetic mean A (an average in a usual sense) and the harmonic meanH are dened as follows.
A =1n
n
i =1
x i =1n
(x1 + x2 + ... + xn ).
H =n
ni =1
1x i
=n
1x 1 + ... +
1x n
.
When x1 = P and x2 = R, A and H will be:
A =12
(P + R).
H =2
1P +
1R
=2
P +
RP R
=2P R
P + R.
The harmonic mean is more intuitive than the arithmetic mean whencomputing a mean of ratios.
Suppose that you have a nger print recognition system and its precisionand recall be 1.0 and 0.2, respectively. Intuitively, the total performance of the system should be very low because the system covers only 20% of theregistered nger prints, which means it is almost useless.
The arithmetic mean of 1 and 0.2 is 0.6 whereas the harmonic mean of
them is 2 1 210
1 +210
=4
12=
13
.
As you see in this example, the harmonic mean (0.333...) is a morereasonable score than the arithmetic mean (0.6).
2
-
7/27/2019 F-measure-YS-26Oct07[1].pdf
3/5
2 Derivation of the F-measure
Some researchers call the denition of the F-measure in the previous sectionF1 -measure. What is 1 of F 1 ?
The full denition of the F-measure is given as follows.[Chinchor, 1992]
F =( 2 + 1) P R
2 P + R(0 + ).
is a parameter that controls a balance between P and R. When = 1,F1 comes to be equivalent to the harmonic mean of P and R. If > 1,F becomes more recall-oriented and if < 1, it becomes more precision-oriented, e.g., F 0 = P .
While it seems that van Rijsbergen did not dene the formula of theF-measure per se, the origin of the denition of the F-measure is van Rijs-bergens E (effectiveness) function [van Rijsbergen, 1979]:
E = 1 1
1P
+ (1 )1R
,
where = 1 2 +1 .Lets remove using .
E = 1 1
1 2 + 1
1P
+ 1 1
2 + 11R
,
= 1 P R
1 2 +1 R +
2 +1 1 2 +1 P
.
= 1 ( 2 + 1) P R
R + 2 P .
Now you see thatE = 1 F .
Note that F rises if R or P gets better whereas E becomes small if R orP improves. This seems the reason why F is more commonly used than E.
Some people use as a parameter of F.
F =1
1P + (1 )1R
(0 1).
3
-
7/27/2019 F-measure-YS-26Oct07[1].pdf
4/5
There is nothing wrong with this denition of F but use of this denitionmight cause an unnecessary confusion because F =0 .5 =F =1 . An attentionis needed that the commonly used notation F 1 means F =1 , not F =1 .
3 Further investigation in
Still, some of you are not sure why 2 is used instead of in = 1 2 +1 .The best way to understand this is to read Chapter 7 of van Rijsbergensmasterpiece[van Rijsbergen, 1979]. However, let me try to explanation thereason.
is the parameter that controls the weighting between P and R. For-mally, is dened as follows:
= R/P, whereE P
=E R
.
The motivation behind this condition is that at the point where thegradients of E w.r.t. P and R are equal, the ratio of R against P should bea desired ratio .
Please recall that E is dened as follows:
E = 1 1
1
P + (1 )
1
R
,
= 1 P R
R + (1 )P .
Now we calculate E P andE R . By the quotient rule on the derivative
of a composite function, ( f /g ) = ( f g fg )/g 2 . For conciseness, let g =R + (1 )P .
E P
= R(R + (1 )P ) P R (1 )
g2.
E R =
P (R + (1 )P ) P Rg2 .
Then, E P =E R is equivalent to:
R(R + (1 )P ) P R (1 ) = P (R + (1 )P ) PR,
4
-
7/27/2019 F-measure-YS-26Oct07[1].pdf
5/5