f-measure-ys-26oct07[1].pdf


  • 7/27/2019 F-measure-YS-26Oct07[1].pdf

    1/5

    © 2007 Y. Sasaki, Version: 26th October, 2007

    The truth of the F-measure

    Yutaka Sasaki
    Research Fellow

    School of Computer Science, University of Manchester
    MIB, 131 Princess Street, Manchester, M1 7DN

    Yutaka.Sasaki@manchester.ac.uk

    October 26, 2007

    Abstract

    More than 15 years have passed since the F-measure was first introduced to the evaluation of information extraction technology at the Fourth Message Understanding Conference (MUC-4) in 1992. Recently, I have sometimes seen confusion over the definition of the F-measure, which seems to be triggered by a lack of background knowledge about how the F-measure was derived. Since I was not involved in introducing or devising the F-measure, I might not be the best person to explain this, but I hope this note will be of some help to those who are wondering what the F-measure really is. This introduction aims to provide brief but sufficient information on the F-measure.

    1 Overview

    Definition of the F-measure

    The F-measure is defined as a harmonic mean of precision (P) and recall (R):¹

    $$F = \frac{2PR}{P + R}.$$

    ¹ In biomedicine, precision is called positive predictive value (PPV) and recall is called sensitivity, but to my knowledge there is nothing corresponding to the F-measure in that domain.
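The definition above can be written as a short function. This is a minimal sketch: the name `f_measure` and the convention of returning 0 when both precision and recall are 0 are assumptions made for illustration and are not part of the note.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    if precision + recall == 0:
        # Assumed convention (not from the note): avoid 0/0 by returning 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Equal precision and recall give the same value back.
print(f_measure(0.5, 0.5))
```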


    If you are satisfied with this definition and need no further information, that's it. However, if you are deeply interested in the definition of the F-measure, you should recap the definitions of the arithmetic and harmonic means.

    Arithmetic and harmonic means

    The arithmetic mean A (an average in the usual sense) and the harmonic mean H are defined as follows.

    $$A = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n}(x_1 + x_2 + \dots + x_n).$$

    $$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1} + \dots + \frac{1}{x_n}}.$$

    When $x_1 = P$ and $x_2 = R$, A and H will be:

    $$A = \frac{1}{2}(P + R).$$

    $$H = \frac{2}{\frac{1}{P} + \frac{1}{R}} = \frac{2}{\frac{P + R}{PR}} = \frac{2PR}{P + R}.$$

    The harmonic mean is more intuitive than the arithmetic mean when computing a mean of ratios.

    Suppose that you have a fingerprint recognition system whose precision and recall are 1.0 and 0.2, respectively. Intuitively, the total performance of the system should be very low because the system covers only 20% of the registered fingerprints, which means it is almost useless.

    The arithmetic mean of 1 and 0.2 is 0.6, whereas the harmonic mean of them is

    $$\frac{2 \cdot 1 \cdot \frac{2}{10}}{1 + \frac{2}{10}} = \frac{4}{12} = \frac{1}{3}.$$

    As you see in this example, the harmonic mean (0.333...) is a more reasonable score than the arithmetic mean (0.6).
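The fingerprint example can be reproduced with the general n-value definitions of the two means. This is a minimal sketch: the function names are chosen here for illustration, and the harmonic mean is assumed to be applied only to strictly positive values.

```python
def arithmetic_mean(xs):
    """A = (x_1 + ... + x_n) / n."""
    if not xs:
        raise ValueError("need at least one value")
    return sum(xs) / len(xs)


def harmonic_mean(xs):
    """H = n / (1/x_1 + ... + 1/x_n); assumes every x_i > 0."""
    if not xs or any(x <= 0 for x in xs):
        raise ValueError("need at least one value, all strictly positive")
    return len(xs) / sum(1.0 / x for x in xs)


scores = [1.0, 0.2]  # precision and recall of the fingerprint system
print(f"arithmetic mean: {arithmetic_mean(scores):.3f}")  # 0.600
print(f"harmonic mean:   {harmonic_mean(scores):.3f}")    # 0.333
```

The harmonic mean is pulled toward the smaller of the two values, which is why it penalises the low recall far more than the arithmetic mean does.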


    2 Derivation of the F-measure

    Some researchers call the definition of the F-measure in the previous section the $F_1$-measure. What does the "1" in $F_1$ mean?

    The full definition of the F-measure is given as follows [Chinchor, 1992]:

    $$F_\beta = \frac{(\beta^2 + 1) P R}{\beta^2 P + R} \quad (0 \le \beta \le +\infty).$$

    $\beta$ is a parameter that controls the balance between P and R. When $\beta = 1$, $F_1$ is equivalent to the harmonic mean of P and R. If $\beta > 1$, F becomes more recall-oriented, and if $\beta < 1$, it becomes more precision-oriented, e.g., $F_0 = P$.
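The effect of the parameter can be seen by evaluating the weighted formula for a few values of β. This is a minimal sketch: the name `f_beta`, the handling of β = +∞ as the limit value R, and the return value 0 for a zero denominator are assumptions made for illustration.

```python
import math


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F_beta = (beta^2 + 1) P R / (beta^2 P + R)."""
    if beta < 0:
        raise ValueError("beta must be non-negative")
    if math.isinf(beta):
        # Limit of the formula as beta grows without bound.
        return recall
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        # Assumed convention (not from the note): avoid 0/0.
        return 0.0
    return (b2 + 1) * precision * recall / denom


# Fingerprint example from Section 1: P = 1.0, R = 0.2.
for beta in (0.0, 0.5, 1.0, 2.0, math.inf):
    print(f"beta={beta}: {f_beta(1.0, 0.2, beta):.3f}")
# Scores printed, in order: 1.000, 0.556, 0.333, 0.238, 0.200
```

With β = 0 the score equals the precision, and as β grows the score moves toward the recall.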

    While it seems that van Rijsbergen did not define the formula of the F-measure per se, the origin of the definition of the F-measure is van Rijsbergen's E (effectiveness) function [van Rijsbergen, 1979]:

    $$E = 1 - \frac{1}{\alpha \frac{1}{P} + (1 - \alpha) \frac{1}{R}},$$

    where $\alpha = \frac{1}{\beta^2 + 1}$. Let's remove $\alpha$ using $\beta$.

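    $$E = 1 - \frac{1}{\frac{1}{\beta^2 + 1} \frac{1}{P} + \left(1 - \frac{1}{\beta^2 + 1}\right) \frac{1}{R}}$$

    $$= 1 - \frac{P R}{\frac{1}{\beta^2 + 1} R + \frac{\beta^2 + 1 - 1}{\beta^2 + 1} P}$$

    $$= 1 - \frac{(\beta^2 + 1) P R}{R + \beta^2 P}.$$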

    Now you see that $E = 1 - F_\beta$.
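The identity can also be checked numerically. The sketch below evaluates van Rijsbergen's E with α = 1/(β² + 1) and compares it against 1 − F_β on a few arbitrary (P, R) pairs; the function names and the test points are chosen here for illustration.

```python
import math


def f_beta(p: float, r: float, beta: float) -> float:
    """F_beta = (beta^2 + 1) P R / (beta^2 P + R)."""
    return (beta**2 + 1) * p * r / (beta**2 * p + r)


def e_measure(p: float, r: float, alpha: float) -> float:
    """E = 1 - 1 / (alpha / P + (1 - alpha) / R); assumes P > 0 and R > 0."""
    return 1.0 - 1.0 / (alpha / p + (1.0 - alpha) / r)


for beta in (0.5, 1.0, 2.0):
    alpha = 1.0 / (beta**2 + 1)
    for p, r in ((1.0, 0.2), (0.6, 0.9), (0.3, 0.3)):
        # E and 1 - F_beta should agree up to floating-point rounding.
        assert math.isclose(e_measure(p, r, alpha), 1.0 - f_beta(p, r, beta))
print("E == 1 - F_beta at all test points")
```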

    Note that F rises if R or P gets better, whereas E becomes smaller if R or P improves. This seems to be the reason why F is more commonly used than E.

    Some people use $\alpha$ as a parameter of F.

    $$F_\alpha = \frac{1}{\alpha \frac{1}{P} + (1 - \alpha) \frac{1}{R}} \quad (0 \le \alpha \le 1).$$


    There is nothing wrong with this definition of F, but its use might cause unnecessary confusion because $F_{\alpha=0.5} = F_{\beta=1}$. Note that the commonly used notation $F_1$ means $F_{\beta=1}$, not $F_{\alpha=1}$.
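The difference between the two parameterisations is easy to see on the fingerprint example (P = 1.0, R = 0.2). This is a minimal sketch with illustrative function names; it assumes P > 0 and R > 0 so that the reciprocals exist.

```python
import math


def f_alpha(p: float, r: float, alpha: float) -> float:
    """F_alpha = 1 / (alpha / P + (1 - alpha) / R); assumes P > 0 and R > 0."""
    return 1.0 / (alpha / p + (1.0 - alpha) / r)


def f_beta(p: float, r: float, beta: float) -> float:
    """F_beta = (beta^2 + 1) P R / (beta^2 P + R)."""
    return (beta**2 + 1) * p * r / (beta**2 * p + r)


p, r = 1.0, 0.2
print(f"F(alpha=0.5) = {f_alpha(p, r, 0.5):.3f}")  # 0.333
print(f"F(beta=1)    = {f_beta(p, r, 1.0):.3f}")   # 0.333
print(f"F(alpha=1)   = {f_alpha(p, r, 1.0):.3f}")  # 1.000

# alpha = 0.5 corresponds to beta = 1, since alpha = 1 / (beta^2 + 1).
assert math.isclose(f_alpha(p, r, 0.5), f_beta(p, r, 1.0))
```

Reading $F_1$ as $F_{\alpha=1}$ would give the precision alone (1.0 here) instead of the harmonic mean (1/3).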

    3 Further investigation in β

    Still, some of you may not be sure why $\beta^2$ is used instead of $\beta$ in $\alpha = \frac{1}{\beta^2 + 1}$. The best way to understand this is to read Chapter 7 of van Rijsbergen's masterpiece [van Rijsbergen, 1979]. However, let me try to explain the reason.

    $\beta$ is the parameter that controls the weighting between P and R. Formally, $\beta$ is defined as follows:

    $$\beta = R/P, \quad \text{where} \quad \frac{\partial E}{\partial P} = \frac{\partial E}{\partial R}.$$

    The motivation behind this condition is that at the point where the gradients of E w.r.t. P and R are equal, the ratio of R against P should be a desired ratio $\beta$.

    Please recall that E is defined as follows:

    $$E = 1 - \frac{1}{\alpha \frac{1}{P} + (1 - \alpha) \frac{1}{R}} = 1 - \frac{P R}{\alpha R + (1 - \alpha) P}.$$

    Now we calculate $\frac{\partial E}{\partial P}$ and $\frac{\partial E}{\partial R}$. By the quotient rule for the derivative of a composite function, $(f/g)' = (f'g - fg')/g^2$. For conciseness, let $g = \alpha R + (1 - \alpha) P$.

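    $$\frac{\partial E}{\partial P} = -\frac{R(\alpha R + (1 - \alpha) P) - P R (1 - \alpha)}{g^2}.$$

    $$\frac{\partial E}{\partial R} = -\frac{P(\alpha R + (1 - \alpha) P) - P R \alpha}{g^2}.$$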
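The two partial derivatives can be sanity-checked against central finite differences. The sketch below also confirms, at one arbitrary point, that the gradients coincide when R = βP and α = 1/(β² + 1), which is the condition stated in the definition of β above. The function names, step size and test points are assumptions made for illustration.

```python
import math


def e_measure(p: float, r: float, alpha: float) -> float:
    """E = 1 - P R / (alpha R + (1 - alpha) P)."""
    return 1.0 - (p * r) / (alpha * r + (1.0 - alpha) * p)


def de_dp(p: float, r: float, alpha: float) -> float:
    """Analytic partial derivative of E with respect to P."""
    g = alpha * r + (1.0 - alpha) * p
    return -(r * g - p * r * (1.0 - alpha)) / g**2


def de_dr(p: float, r: float, alpha: float) -> float:
    """Analytic partial derivative of E with respect to R."""
    g = alpha * r + (1.0 - alpha) * p
    return -(p * g - p * r * alpha) / g**2


def central_diff(f, x: float, h: float = 1e-6) -> float:
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# 1) Analytic derivatives agree with numerical ones at an arbitrary point.
p, r, alpha = 0.8, 0.4, 0.3
num_dp = central_diff(lambda x: e_measure(x, r, alpha), p)
num_dr = central_diff(lambda x: e_measure(p, x, alpha), r)
assert math.isclose(de_dp(p, r, alpha), num_dp, abs_tol=1e-6)
assert math.isclose(de_dr(p, r, alpha), num_dr, abs_tol=1e-6)

# 2) With alpha = 1 / (beta^2 + 1), the gradients are equal where R = beta * P.
beta = 2.0
alpha_b = 1.0 / (beta**2 + 1)
p0 = 0.3
r0 = beta * p0
assert math.isclose(de_dp(p0, r0, alpha_b), de_dr(p0, r0, alpha_b))
print("derivative checks passed")
```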

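    Then, $\frac{\partial E}{\partial P} = \frac{\partial E}{\partial R}$ is equivalent to:

    $$R(\alpha R + (1 - \alpha) P) - P R (1 - \alpha) = P(\alpha R + (1 - \alpha) P) - P R \alpha,$$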
