

Generalized Information Theory Meets Human Cognition: Introducing a Unified Framework to Model Uncertainty and Information Search

Vincenzo Crupi (1), Jonathan D. Nelson (2,3), Björn Meder (3), Gustavo Cevolani (4), and Katya Tentori (5)

1 Center for Logic, Language, and Cognition, Department of Philosophy and Education, University of Turin, Italy
2 School of Psychology, University of Surrey, Guildford, UK
3 Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development, Berlin, Germany
4 IMT School for Advanced Studies, Lucca, Italy
5 Center for Mind/Brain Sciences, University of Trento, Italy

Author Note

Correspondence concerning this article should be addressed to Vincenzo Crupi, Department of Philosophy and Education, University of Turin, via Sant'Ottavio 20, 10124, Torino (Italy), vincenzo.crupi@unito.it. This research was supported by grants CR 409/1-2, NE 1713/1-2, and ME 3717/2-2 from the Deutsche Forschungsgemeinschaft as part of the priority program New Frameworks of Rationality (SPP 1516). We thank Nick Chater, Laura Martignon, Andrea Passerini, and Paul Pedersen for helpful comments and exchanges.


Abstract

Searching for information is critical in many situations. In medicine, for instance, careful choice of a diagnostic test can help narrow down the range of plausible diseases that the patient might have. In a probabilistic framework, test selection is often modeled by assuming that people's goal is to reduce uncertainty about possible states of the world. In cognitive science, psychology, and medical decision making, Shannon entropy is the most prominent and most widely used model to formalize probabilistic uncertainty and the reduction thereof. However, a variety of alternative entropy metrics (Hartley, Quadratic, Tsallis, Rényi, and more) are popular in the social and the natural sciences, computer science, and philosophy of science. Particular entropy measures have been predominant in particular research areas, and it is often an open issue whether these divergences emerge from different theoretical and practical goals or are merely due to historical accident. Cutting across disciplinary boundaries, we show that several entropy and entropy reduction measures arise as special cases in a unified formalism, the Sharma-Mittal framework. Using mathematical results, computer simulations, and analyses of published behavioral data, we discuss four key questions: How do various entropy models relate to each other? What insights can be obtained by considering diverse entropy models within a unified framework? What is the psychological plausibility of different entropy models? What new questions and insights for research on human information acquisition follow? Our work provides several new pathways for theoretical and empirical research, reconciling apparently conflicting approaches and empirical findings within a comprehensive and unified information-theoretic formalism.

KEYWORDS: Entropy, Uncertainty, Value of information, Information search, Probabilistic models


Generalized Information Theory Meets Human Cognition: Introducing a Unified Framework to Model Uncertainty and Information Search

1. Introduction

A key topic in the study of rationality, cognition, and behavior is the effective search for relevant information or evidence. Information search is also closely connected to the notion of uncertainty. Typically, an agent will seek to acquire information to reduce uncertainty about an inference or decision problem. Physicians prescribe medical tests in order to handle arrays of possible diagnoses. Detectives seek witnesses in order to identify the culprit of a crime. And, of course, scientists gather data in order to discriminate among different hypotheses.

In psychology and cognitive science, most early work on information acquisition adopted a logical, deductive inference perspective. In the spirit of Popper's (1959) influential falsificationist philosophy of science, the idea was that learners should seek information that could help them falsify hypotheses (e.g., expressed as a conditional or a rule; Wason, 1960, 1966, 1968). However, many human reasoners did not seem to believe that information is useful if and only if it can potentially rule out (falsify) a hypothesis. From the 1980s, cognitive scientists started analyzing human information search with a closer look at inductive inference, using probabilistic models to quantify the value of information and endorsing them as normative benchmarks (e.g., Baron, 1985; Klayman & Ha, 1987; Skov & Sherman, 1986; Slowiaczek, Klayman, Sherman, & Skov, 1992; Trope & Bassok, 1982, 1983). This research was inspired by seminal work in philosophy of science (e.g., Good, 1950), statistics (e.g., Lindley, 1956), and decision theory (Savage, 1972). In this view, each outcome of a query could modify an agent's beliefs about the hypotheses being considered, thus providing some amount of information. For instance, the key theoretical point of Oaksford and Chater's (1994, 2003) analysis of Wason's selection task was to conceptualize information acquisition as a piece of probabilistic inductive reasoning, assuming that people's goal is to reduce uncertainty about whether a rule holds or not. In a similar vein, researchers in vision science have used measures of uncertainty reduction to predict visual queries for gathering information (i.e., eye movements; Legge, Klitz, & Tjan, 1997; Najemnik & Geisler, 2005, 2009; Nelson & Cottrell, 2007; Renninger, Coughlan, Verghese, & Malik, 2005), or to guide a robot's eye movements (Denzler & Brown, 2002). Probabilistic models of uncertainty reduction have also been used to predict human query selection in causal reasoning (Bramley, Lagnado, & Speekenbrink, 2015), hypothesis testing (Austerweil & Griffiths, 2011; Navarro & Perfors, 2011; Nelson, Divjak, Gudmundsdottir, Martignon, & Meder, 2014; Nelson, Tenenbaum, & Movellan, 2001), and categorization (Meder & Nelson, 2012; Nelson, McKenzie, Cottrell, & Sejnowski, 2010).

If reducing uncertainty is a major cognitive goal and motivation for information acquisition, a critical issue is how uncertainty and the reduction thereof can be represented in a rigorous manner. A fruitful approach to formalizing uncertainty is to use the mathematical notion of entropy, which in turn generates a corresponding model of the informational utility of an experiment as the expected reduction of entropy (uncertainty), sometimes called expected information gain.

In many disciplines, including psychology and neuroscience (Hasson, 2016), the most prominent model is Shannon (1948) entropy. However, a number of non-equivalent measures of entropy have been suggested, and are being used, in a variety of research domains. Examples include the application of Quadratic entropy in ecology (Lande, 1996), the family of Rényi (1961) entropies in computer science and image processing (Boztas, 2014; Sahoo & Arora, 2004), and Tsallis entropies in physics (Tsallis, 2011). It is currently unknown whether these other entropy models would have potential to address key theoretical and empirical questions in cognitive science. Here, we bring together these different models in a comprehensive theoretical framework, the Sharma-Mittal formalism (from Sharma & Mittal, 1975), which incorporates a large number of prominent entropy measures as special cases. Careful consideration of the formal properties of this family of entropy measures will reveal important implications for modeling uncertainty and information search behavior. Against this rich theoretical background, we will draw on existing behavioral data and novel simulations to explore how different models relate to each other, elucidate their psychological meaning and plausibility, and show how they can generate new testable predictions.


The remainder of this paper is organized as follows. We begin by spelling out what an entropy measure is and how it can be employed to represent uncertainty and the informational value of queries (questions, tests, experiments) (section 2). Subsequently, we review four representative and influential definitions of entropy, namely Quadratic, Hartley, Shannon, and Error entropy (section 3). These models have been, and continue to be, of importance in different areas of research. In the main theoretical section of the paper, we describe a unified formal framework generating a biparametric continuum of entropy measures. Drawing on work in generalized information theory, we show that many extant models of entropy and expected entropy reduction can be embedded in this comprehensive formalism (section 4). We provide a number of new mathematical results in this section. We also address the theoretical meaning of the parameters involved when the target domain of application is human reasoning, with implications for both normative and descriptive approaches. We then further elaborate on the connection with experimental research in several ways. First, we present simulation results from an extensive exploration of information search decision problems in which alternative models provide strongly diverging, empirically testable predictions (section 5). Second, we report and discuss an overarching analysis of the information-theoretic account of the most widely known experimental paradigm for the study of information gathering, i.e., Wason's (1966, 1968) abstract selection task (section 6.1). Then we investigate which models perform better against data from a range of experience-based studies on human information search behavior (Meder & Nelson, 2012; Nelson et al., 2010) (section 6.2). We also point out that some entropy models from this framework offer a potential explanation of human information search behavior in experiments where probabilities are conveyed through words and numbers, which to date have been perplexing to account for theoretically (section 6.3). Finally, we show that new models offer a theoretically satisfying and descriptively adequate unification of disparate results across different kinds of tasks (section 6.4). In the General Discussion (section 7), we outline and assess the prospects of a generalized information-theoretic framework for guiding the study of human inference and decision making.

Part of our discussion relies and elaborates on mathematical analyses, including novel results. Moreover, although a number of the mathematical points in the paper can be found scattered through the mathematics and physics literature, here we bring them together systematically. We provide Supplementary Materials where non-trivial derivations are given according to our unified notation. Throughout each section of the text, statements requiring a mathematical proof are flagged by square brackets [SupplMat], and the proof is then presented in the corresponding subsection of the Supplementary Materials file. Among the formal results provided that are novel to the best of our knowledge, we find the following especially important: the ordinal equivalence of Sharma-Mittal entropy measures of the same order (proof in SupplMat, section 4), the additivity of all Sharma-Mittal measures of expected entropy reduction for sequential tests (again SupplMat, 4), and the distinctive role of the degree parameter in information search tasks such as the Person Game (SupplMat, 5). Further novel results include the subsumption of diverse models such as the Arimoto (1971) and the Power entropies within the Sharma-Mittal framework (SupplMat, 3), and the specification of how a number of different entropy measures can be construed within the general theory of means (Table 4).

2. Entropies, uncertainty, and information search

According to a well-known anecdote, the origins of information theory were marked by a witty joke of John von Neumann. Claude Shannon was doubtful how to call the key concept of his groundbreaking work on the "mathematical theory of communication" (Shannon, 1948). "You should call it entropy," von Neumann suggested. Of course, von Neumann must have been aware of the close connections between Shannon's formula and Boltzmann's definition of entropy in classical statistical mechanics. But the most important reason for his suggestion, von Neumann quipped, was that "nobody knows what entropy really is, so in a debate you will always have the advantage" (see Tribus & McIrvine, 1971). Shannon accepted the advice. Several decades later, von Neumann's remark seems even more pointed, if anything. Influential observers have voiced caution and concern about the proliferation of mathematical analyses of entropy and related notions (Aczél, 1984, 1987). Meanwhile, many applications have been developed, for instance in physics and ecology (see, e.g., Beck, 2009; Keylock, 2005). But recurrent theoretical controversies have arisen, too, along with occasional complaints of conceptual confusion (see Cho, 2002, and Jost, 2006, respectively).


Luckily, these thorny issues will be tangential to our main concerns. Although a given formalization of entropy can be considered for the representation and measurement of different constructs in each of a variety of domains, we focus on one target concept for which entropies can be employed, namely the uncertainty concerning a variable X given a probability distribution P. In this regard, the key question is the following: How much uncertainty is conveyed about variable X by a given probability distribution P? This notion is central to the normative and descriptive study of human cognition.

Suppose, for instance, that an infection can be caused by three different types of virus, and label x1, x2, x3 the corresponding possibilities. Consider two different probability assignments, such as, say:

P(x1) = 0.49, P(x2) = 0.49, P(x3) = 0.02

and

P*(x1) = 0.70, P*(x2) = 0.15, P*(x3) = 0.15

Is the uncertainty about X = {x1, x2, x3} greater under P or under P*? An entropy measure enables us to give precise quantitative values in both cases, and hence a clear answer. Importantly, however, the answer will often be measure-dependent, for different entropy measures convey different ideas of uncertainty and exhibit distinct mathematical properties of theoretical interest. We will see this in detail later on.

Once uncertainty as our conceptual target has been outlined, we can turn to entropy as a mathematical object. Consider a finite set X of n mutually exclusive and jointly exhaustive possibilities x1, ..., xn on which a probability distribution P(X) is defined, so that P(X) = {P(x1), ..., P(xn)}, with P(xi) ≥ 0 for any i (1 ≤ i ≤ n) and $\sum_{x_i \in X} P(x_i) = 1$. The n elements in X = {x1, ..., xn} can be taken as representing different kinds of entities, such as events, categories, or propositions. For our purposes, ent is an entropy measure if it is a function f of the relevant probability values only, i.e.:

$$ent_P(X) = f[P(x_1), \ldots, P(x_n)]$$

and function f satisfies a small number of basic properties (see below). Notice that, in general, an entropy function can be readily extended to the case of a conditional probability distribution given some datum y. In fact, under the conditional probability distribution P(X|y), one has $ent_P(X|y) = f[P(x_1|y), \ldots, P(x_n|y)]$.

Shannon entropy has been so prominent in cognitive science that some readers will ask: why do we not just stick with it? More specific objections in this vein include that Shannon entropy is uniquely axiomatically motivated, that Shannon entropy is already central to psychological theory of the value of information, or that Shannon entropy is optimal in certain applied situations. Each objection can be addressed separately. First, a number of entropy metrics in our generalized framework (not only Shannon) have been or can be uniquely derived from specific sets of axioms (see Csiszár, 2008). Second, although Shannon entropy has a number of intuitively desirable properties, it is not a seriously competitive descriptive psychological model of the value of information in some tasks (e.g., Nelson et al., 2010). Third, several published papers in applied domains report superior performance when other entropy measures are used (e.g., Ramírez-Reyes et al., 2016). Indeed, Shannon's (1948) own view was that although axiomatic characterization can lend plausibility to measures of entropy and information, "the real justification" (p. 393) rests on the measures' operational relevance. A generalized mathematical framework can increase our theoretical understanding of the relationships among different measures, unify diverse psychological findings, and generate novel questions for future research.

Scholars have used different properties as defining an entropy measure (see, e.g., Csiszár, 2008). Besides some usual technical requirements (like non-negativity), a key idea is that entropy should be appropriately sensitive to how even or uneven a distribution is, at least with respect to the extreme cases of a uniform probability function, U(X) = {1/n, ..., 1/n}, or of a deterministic function V(X) where V(xi) = 1 for some i (1 ≤ i ≤ n) and 0 for all other xs. (In the latter case, the distribution actually reflects a truth-value assignment, in logical parlance.) In our setting, U(X) represents the highest possible degree of uncertainty about X, while under V(X) the true value of X is known for sure, and no uncertainty is left. Hence it must hold that, for any X and P(X), $ent_U(X) \geq ent_P(X) \geq ent_V(X)$, with at least one inequality strict. We label this basic and minimal condition evenness sensitivity. It is conveyed by Shannon entropy as well as many others, as we shall see, and it guarantees, for instance, that entropy is strictly higher for, say, a distribution like {1/3, 1/3, 1/3} than for {1, 0, 0}.


Once the idea of an entropy measure is characterized, one can study different measures of expected entropy reduction. This amounts to considering two variables X and Y, and defining the expected reduction of the initial entropy of X across the elements of Y. To illustrate, in the viral infection example mentioned above, X may concern the type of virus actually involved, while Y could be some clinically observable marker (like the result of a blood test) which is informationally relevant for X. Mathematically, given a joint probability distribution P(X,Y) over the combination of two variables X and Y (i.e., their Cartesian product X × Y), the actual change in entropy about X determined by an element y in Y can be represented as $\Delta ent_P(X, y) = ent_P(X) - ent_P(X|y)$. Accordingly, the expected reduction of the initial entropy of X across the elements of Y can be computed in a standard way, as follows:¹

$$R_P(X, Y) = \sum_{y_j \in Y} \Delta ent_P(X, y_j)\, P(y_j)$$

The notation R_P(X, Y) is adapted from work on the foundations of Bayesian statistics, where the expected reduction in entropy is seen as measuring the dependence of variable X on variable Y, or the relevance of Y for X (see, e.g., Dawid & Musio, 2014).
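To fix ideas computationally, here is a minimal sketch of R_P(X, Y) in Python (assuming numpy; the function and variable names are ours, not the authors'). It takes the joint distribution P(X, Y) as a matrix and any entropy measure as a callable:

```python
import numpy as np

def expected_entropy_reduction(joint, ent):
    """R_P(X, Y): expected reduction of the entropy of X across the elements of Y.

    joint[i, j] = P(x_i & y_j); ent maps a probability vector to a number.
    """
    joint = np.asarray(joint, dtype=float)
    p_x = joint.sum(axis=1)                  # prior P(X)
    p_y = joint.sum(axis=0)                  # marginal P(Y), assumed > 0 (see footnote 1)
    prior_ent = ent(p_x)
    # sum over y_j of P(y_j) * [ent_P(X) - ent_P(X | y_j)]
    return sum(py * (prior_ent - ent(joint[:, j] / py))
               for j, py in enumerate(p_y))

# Example with Shannon entropy (formally introduced in section 3.3 below):
shannon = lambda p: sum(pi * np.log(1.0 / pi) for pi in p if pi > 0)
joint = np.array([[0.4, 0.1],                # P(x_1 & y_1), P(x_1 & y_2)
                  [0.1, 0.4]])               # P(x_2 & y_1), P(x_2 & y_2)
print(expected_entropy_reduction(joint, shannon))   # positive: Y is informative about X
```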

Very much as for entropy itself, the expected reduction of entropy remains as general and neutral a notion as possible. R measures, too, can be given different interpretations in different domains. In many contexts, it is plausibly assumed that reduction of the uncertainty is a major dimension of the purely informational (or epistemic) value of the search for more data. We will thus consider a measure R as providing a formal approach to questions of the following kind: Given X as a target of investigation, what is the expected usefulness of finding out about Y from a purely informational point of view? Hence, the notion of uncertainty is tightly coupled to the rational assessment of the expected informational utility of pursuing a given search for additional evidence (performing a query, executing a test, running an experiment). (See Crupi & Tentori, 2014; Nelson, 2005, 2008. For more discussion, also see Evans & Over, 1996; Roche & Shogenji, 2016.)

¹ For technical reasons, we will assume P(yj) > 0 for any j. This is largely a safe proviso for our current purposes. In fact, in our setting with both X and Y finite sets, any zero probability outcome in Y could just be omitted.


Formally, X and Y can just be seen as partitions of possibilities. In this interpretation, however, they play quite different roles in R_P(X, Y). The first argument, X, represents the overall goal of the inquiry, while the second, Y, is supposed to be directly accessible to the information seeker. In a typical application, Y will be a more or less useful test to learn about target X, although unable to conclusively establish what the true hypothesis in X is.

Table 1. Notation employed.

H = {h1, ..., hn} : A partition of n possibilities (or hypothesis space).
P(H) : Probability distribution P defined over the elements of H.
P(H|e) : Probability distribution P defined over the elements of H conditional on e.
U(H) : Uniform probability distribution over the elements of H.
V(H) : A probability distribution such that V(hi) = 1 for some i (1 ≤ i ≤ n) and 0 for all other hs.
H × E : The variable obtained by the combination (Cartesian product) of variables H and E.
P(H,E) : Joint probability distribution over the combination of variables H and E.
H ⊥_P E : Given P(H,E), variables H and E are statistically independent.
H ⊥_P E|F : Given P(H,E,F), variables H and E are statistically independent conditional on each element in F.
ent_P(H) : Entropy of H given P(H).
ent_P(H|e) : Conditional entropy of H on e given P(H|e).
Δent_P(H, e) : Reduction of the initial entropy of H provided by e, i.e., ent_P(H) − ent_P(H|e).
R_P(H,E) : Expected reduction of the entropy of H across the elements of E, given P(H,E).
R_P(H,E|f) : Expected reduction of the entropy of H across the elements of E, given P(H,E|f).
R_P(H,E|F) : Expected value of R_P(H,E|f) across the elements of F, given P(H,E,F).
ln_t(x) : The Tsallis generalization of the natural logarithm (with parameter t).
e_t(x) : The Tsallis generalization of the ordinary exponential (with parameter t).


In general, the occurrence of one particular element y of Y does not need to reduce the initial entropy about X; it might as much increase it, hence making Δent_P(X, y) negative. This quantity can be negative if (for instance) datum y changes probabilities from P(X) = {0.9, 0.1} to P(X|y) = {0.6, 0.4}. But can R_P(X, Y), i.e., the expected informational usefulness of Y for learning about X, be negative? Some R measures are strictly non-negative, but others can in fact be negative in the expectation; this depends on key properties of the underlying entropy measure, as we discuss later on.

To summarize, in the domain of human cognition, probability distributions can be employed to represent an agent's degrees of belief (be they based on objective statistical information or subjective confidence), with entropy ent_P(X) providing a formalization of the uncertainty about X (given P). Relying on the reduction of uncertainty as an informational utility, R_P(X, Y) is then interpreted as a measure of the expected usefulness of a query (test, experiment) Y relative to a target hypothesis space X. From now on, to emphasize this interpretation, we will often use H = {h1, ..., hn} to denote a hypothesis set of interest and E = {e1, ..., em} for a possible search for evidence. Table 1 summarizes our terminology in this respect as well as for the subsequent sections.

3. Four Influential Entropy Models

We will now briefly review four important models of entropy and the corresponding models of expected entropy reduction.

3.1. Quadratic entropy

Entropy/Uncertainty. Some interesting entropy measures were originally proposed long before the exchange between Shannon and von Neumann, when entropy was not yet a scientific term outside statistical thermodynamics. Here is one major instance:

$$ent_P^{Quad}(H) = 1 - \sum_{h_i \in H} P(h_i)^2$$

Labeled Quadratic entropy in Vajda and Zvárová (2007), this measure is widely known as the Gini (or Gini-Simpson) index, after Gini (1912) and Simpson (1949) (also see Gibbs & Martin, 1962). It is often employed as an index of biological diversity (see, e.g., Patil & Taille, 1982) and sometimes spelled out in the following equivalent formulation:

$$ent_P^{Quad}(H) = \sum_{h_i \in H} P(h_i)(1 - P(h_i))$$

The above formula suggests a meaningful interpretation with H amounting to a partition of hypotheses considered by an uncertain agent. In this reading, ent^Quad computes the average (expected) surprise that the agent would experience in finding out what the true element of H is, given 1 − P(h) as a measure of the surprise that arises in case h obtains (see Crupi & Tentori, 2014).²

Entropy reduction/Informational value of queries. Quadratic entropy reduction, namely $\Delta ent_P^{Quad}(H, e) = ent_P^{Quad}(H) - ent_P^{Quad}(H|e)$, has been occasionally mentioned in philosophical analyses of scientific inference (Niiniluoto & Tuomela, 1973, p. 67). In turn, its associated expected reduction measure, $R_P^{Quad}(H, E) = \sum_{e_j \in E} \Delta ent_P^{Quad}(H, e_j)\, P(e_j)$, was applied by Horwich (1982, pp. 127-129), again in formal philosophy of science, and studied in computer science by Raileanu and Stoffel (2004).

3.2. Hartley entropy

Entropy/Uncertainty. Gini's work did not play any apparent role in the development of Shannon's (1948) theory. A seminal paper by Hartley (1928), however, was a starting point for Shannon's analysis. One lasting insight of Hartley was the introduction of logarithmic functions, which have become ubiquitous in information theory ever since. As Hartley also realized, the choice of a base for the logarithm is a matter of conventionally setting a unit of measurement (Hartley, 1928, pp. 539-541). Throughout our discussion, we will employ the natural logarithm, denoted as ln.

² ent^Quad also quantifies the overall expected inaccuracy of probability distribution P(H) as measured by the so-called Brier score (i.e., the squared Euclidean distance from the possible truth-value assignments over H; see Brier, 1950; Leitgeb & Pettigrew, 2010a, b; Pettigrew, 2013; Selten, 1998). Festa (1993, 137 ff.) also gives a useful discussion of Quadratic entropy in the philosophy of science, including Carnap's (1952) classical work in inductive logic.


InspiredbyHartley’s(1928)originalideathattheinformationprovidedbytheobservation

ofoneamongnpossiblevaluesofavariableisincreasinglyinformativethelargernis,and

thatitimmediatelyreflectstheentropyofthatvariable,onecandefinetheHartleyentropyas

follows(Aczél,Forte,andNg,1974):

-./0UPXYZVG(J) = [.\∑ "(ℎ%)]T'∈U ^

Undertheconvention00=0(whichisstandardintheentropyliterature),andgiventhat

P(hi)0=1wheneverP(hi)>0,-./UPXYZVG computesthelogarithmofthenumberofallnon-null

probabilityelementsinH.

Entropy reduction/Informational value of queries. When applied to the domain of reasoning and cognition, the implications of Hartley entropy reveal an interesting Popperian flavor. A piece of evidence e is useful, it turns out, only to the extent that it excludes ("falsifies") at least some of the hypotheses in H, for otherwise the reduction in Hartley entropy, $\Delta ent_P^{Hartley}(H, e) = ent_P^{Hartley}(H) - ent_P^{Hartley}(H|e)$, is just zero. An agent adopting such a measure of informational utility would then only value a test outcome, e, insofar as it conclusively rules out at least one hypothesis in H. If no possible outcome in E is potentially a "falsifier" for some hypothesis in H, then the expected reduction of Hartley entropy, R^Hartley, is also zero, implying that query E has no expected usefulness at all with respect to H.

3.3. Shannon entropy

Entropy/Uncertainty. In many contexts, the notion of entropy is simply and immediately equated to Shannon's formalism. Overall, such special consideration is well-deserved and motivated by countless applications spread over virtually all branches of science. The form of Shannon entropy is fairly well-known:

$$ent_P^{Shannon}(H) = \sum_{h_i \in H} P(h_i)\,\ln\left(\frac{1}{P(h_i)}\right)$$

Concerning the interpretation of the formula, many points made earlier for Quadratic entropy apply to Shannon entropy too, given relevant adjustments. In fact, ln(1/P(h)) is another measure of the surprise in finding out that a state of affairs h obtains, and thus ent^Shannon is its overall expected value relative to H.³

Figure 1. A graphical illustration of Quadratic, Hartley, Shannon, and Error entropy as distinct measures of uncertainty over a binary hypothesis set H = {h, ¬h} as a function of the probability of h.

Entropy reduction/Informational value of queries. The reduction of Shannon entropy, $\Delta ent_P^{Shannon}(H, e) = ent_P^{Shannon}(H) - ent_P^{Shannon}(H|e)$, is sometimes called information gain and it is often considered as a measure of the informational utility of a datum e. Its expected value, also called expected information gain, $R_P^{Shannon}(H, E) = \sum_{e_j \in E} \Delta ent_P^{Shannon}(H, e_j)\, P(e_j)$, is then viewed as a measure of the usefulness of query E for learning about H. (See, e.g., Austerweil & Griffiths, 2011; Bar-Hillel & Carnap, 1953; Lindley, 1956; Oaksford & Chater, 1994, 2003, and Ruggeri & Lombrozo, 2015; also see Benish, 1999, and Nelson, 2005, 2008, for more discussion.)

³ The quantity ln(1/P(h)) also characterizes a popular approach to the measurement of the inaccuracy of probability distribution P(H) when h is the true element in H (so-called logarithmic score), and ent^Shannon can be seen as computing the expected inaccuracy of P(H) accordingly (see Good, 1952; also see Gneiting & Raftery, 2007).




3.4. Error entropy

Entropy/Uncertainty. Given a distribution P(H) and the goal of predicting the true element of H, a rational agent would plausibly select h* such that $P(h^*) = \max_{h_i \in H}[P(h_i)]$, and $1 - \max_{h_i \in H}[P(h_i)]$ would then be the probability of error. Since Fano's (1961) seminal work, this quantity has received considerable attention in information theory. Also known as Bayes's error, we will call this quantity Error entropy:

$$ent_P^{Error}(H) = 1 - \max_{h_i \in H}[P(h_i)]$$

Note that ent^Error is only concerned with the largest value in the distribution P(H), namely $\max_{h_i \in H}[P(h_i)]$. The lower that value, the higher the chance of error were a guess to be made, thus the higher the uncertainty about H.

Entropy reduction/Informational value of queries. Unlike the other models above, Error entropy has seldom been considered in the natural or social sciences. However, it can be taken as a sound basis for the analysis of rational behavior. In the latter domain, it is quite natural to rely on the reduction of the expected probability of error, $\Delta ent_P^{Error}(H, e) = ent_P^{Error}(H) - ent_P^{Error}(H|e)$, as the utility of a datum (often labelled probability gain; see Baron, 1985; Nelson, 2005, 2008) and on its expected value, $R_P^{Error}(H, E) = \sum_{e_j \in E} \Delta ent_P^{Error}(H, e_j)\, P(e_j)$, as the usefulness of a query or test. Indeed, there are important occurrences of this model in the study of human cognition.⁴

4AnearlyexampleisBaron’s(1985,ch,4)presentationof@WXX`X ,followingSavage(1972,ch.6).Experimental

investigationsonwhether@WXX`X canaccountforactualpatternsofreasoningincludeBaron,Beattie,and

Hershey(1988),Bramley,Lagnado,andSpeekenbrink(2015),MederandNelson(2012),Nelson,McKenzie,

Cottrell,andSejnowski(2010),andRusconi,Marelli,D’Addario,Russo,andCherubini(2014),whileCrupi,

Tentori,andLombardi(2009)reliedon@WXX`X intheircriticalanalysisofso-calledpseudodiagnosticity(alsosee

Crupi&Girotto,2014;Tweeney,Doherty,&Kleiter,2010).
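As a compact summary of this section, the following sketch (ours, in Python) implements the four entropy measures just reviewed and evaluates them on the two viral-infection distributions P and P* from section 2, illustrating the measure-dependence of uncertainty comparisons noted there:

```python
import numpy as np

def quadratic(p):                        # Gini-Simpson index, section 3.1
    return 1.0 - sum(pi ** 2 for pi in p)

def hartley(p):                          # log of the number of non-null options, section 3.2
    return np.log(sum(1 for pi in p if pi > 0))

def shannon(p):                          # expected surprise ln(1/P(h)), section 3.3
    return sum(pi * np.log(1.0 / pi) for pi in p if pi > 0)

def error(p):                            # Bayes's error, section 3.4
    return 1.0 - max(p)

P      = [0.49, 0.49, 0.02]
P_star = [0.70, 0.15, 0.15]
for name, ent in [("Quadratic", quadratic), ("Hartley", hartley),
                  ("Shannon", shannon), ("Error", error)]:
    print(f"{name:9s} ent(P) = {ent(P):.3f}  ent(P*) = {ent(P_star):.3f}")
# Quadratic and Error judge P the more uncertain distribution, Shannon judges
# P* the more uncertain one, and Hartley ties them (three live options each).
```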


4. A Unified Framework for Uncertainty and Information Search

The set of models introduced above represents a diverse sample in historical, theoretical, and mathematical terms (see Figure 1 for a graphical illustration). Is the prominence of particular models due to fundamental distinctive properties, or largely due to historical accident? What are the relationships among these models? In this section we show how all of these models can be embedded in a unified mathematical formalism, providing new insight.

4.1. Sharma-Mittal entropies

Let us take Shannon entropy again as a convenient starting point. As noted above, Shannon entropy is an average, more precisely a self-weighted average, displaying the following structure:

$$\sum_{h_i \in H} P(h_i)\, inf[P(h_i)]$$

The label self-weighted indicates that each probability P(h) serves as a weight for the value of function inf having that same probability as its argument, namely, inf[P(h)]. The function inf can be seen as capturing a notion of atomic information (or surprise), assigning a value to each distinct element of H on the basis of its own probability (and nothing else). An obvious requirement here is that inf should be a decreasing function, because a finding that was antecedently highly probable (improbable) provides little (much) new information (an idea that Floridi, 2013, calls the "inverse relationship principle" after Barwise, 1997, p. 491). In Shannon entropy, one has inf(x) = ln(1/x). Given inf(x) = 1 − x, instead, Quadratic entropy arises from the very same scheme above.

A self-weighted average is a special case of a generalized (self-weighted) mean, which can be characterized as follows:

$$g^{-1}\left[\sum_{h_i \in H} P(h_i)\, g\{inf[P(h_i)]\}\right]$$

where g is a differentiable and strictly monotonic function (see Wang & Jiang, 2005; also see Muliere & Parmigiani, 1993, for the fascinating history of these ideas). For different choices of g, different kinds of (self-weighted) means are instantiated. With g(x) = x, the weighted average above obtains once again. For another standard instance, g(x) = 1/x gives rise to the harmonic mean. Let us now consider the form of generalized (self-weighted) means above and focus on the following setting:

$$g(x) = \ln_r[e_t(x)]$$
$$inf(x) = \ln_t(1/x)$$

where

$$\ln_t(x) = \frac{x^{(1-t)} - 1}{1 - t}$$
$$e_t^x = [1 + (1 - t)x]^{\frac{1}{1-t}}$$

are generalized versions of the natural logarithm and exponential functions, respectively, often associated with Tsallis's (1988) work. Importantly, the ln_t function recovers the ordinary natural logarithm ln in the limit for t → 1, so that one can safely equate ln_t(x) = ln(x) for t = 1 and have a nice and smooth generalized logarithmic function.⁵ Similarly, it is assumed that $e_t^x = e^x$ for t = 1, as this is the limit for t → 1 [SupplMat, section 1]. Negative values of parameters r and t will not need concern us here: we'll be assuming r, t ≥ 0 throughout.
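In computational terms, ln_t and e_t can be coded directly from the definitions, with the t = 1 branch handled explicitly so that the ordinary logarithm and exponential are recovered (a sketch of ours, in Python):

```python
import numpy as np

def ln_t(x, t):
    """Tsallis generalized logarithm: (x**(1 - t) - 1) / (1 - t); ln(x) at t = 1."""
    if abs(t - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def e_t(x, t):
    """Tsallis generalized exponential: (1 + (1 - t) * x)**(1 / (1 - t)); exp(x) at t = 1."""
    if abs(t - 1.0) < 1e-12:
        return np.exp(x)
    return (1.0 + (1.0 - t) * x) ** (1.0 / (1.0 - t))

print(ln_t(3.0, 1.0), ln_t(3.0, 1.001))   # both approximately 1.0986: the limit is smooth
print(e_t(ln_t(3.0, 0.5), 0.5))           # 3.0: e_t inverts ln_t for a given t
```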

Once fed into the generalized means equation, these specifications of inf(x) and g(x) yield a two-parameter family of entropy measures of order r and degree t [SupplMat, 2]:

$$ent_P^{SM(r,t)}(H) = \frac{1}{t-1}\left[1 - \left(\sum_{h_i \in H} P(h_i)^r\right)^{\frac{t-1}{r-1}}\right]$$

The label SM refers to Sharma and Mittal (1975), where this formalism was originally proposed (also see Masi, 2005, and Hoffmann, 2008). All functions in the Sharma-Mittal family are evenness sensitive (see section 2 above), thus in line with a basic characterization of entropies [SupplMat, 2]. Also, with ent^{SM(r,t)} one can embed the whole set of four classic measures in our initial list. More precisely [SupplMat, 3]:

⁵ The idea of ln_t is often credited to Tsallis for his work in generalized thermodynamics (see Tsallis, 1988, and 2011). The mathematical point may well go back to Euler, however (see Hoffmann, 2008, p. 7). For more theory, also see Havrda and Charvát (1967), Daróczy (1970), Naudts (2002), Kaniadakis, Lissia, and Scarfone (2004).


– Quadratic entropy can be derived from the Sharma-Mittal family for r = t = 2, that is, $ent_P^{SM(2,2)}(H) = ent_P^{Quad}(H)$;
– Hartley entropy can be derived from the Sharma-Mittal family for r = 0 and t = 1, that is, $ent_P^{SM(0,1)}(H) = ent_P^{Hartley}(H)$;
– Shannon entropy can be derived from the Sharma-Mittal family for r = t = 1, that is, $ent_P^{SM(1,1)}(H) = ent_P^{Shannon}(H)$;
– Error entropy is recovered from the Sharma-Mittal family in the limit for r → ∞ when t = 2, so that we have $ent_P^{SM(\infty,2)}(H) = ent_P^{Error}(H)$.
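The entire two-parameter family takes only a few lines of code. The sketch below (ours, assuming numpy) implements ent^{SM(r,t)} with the r = 1 and t = 1 branches filled in by their limits (the Gaussian and Rényi families of Table 4), and checks the four special cases just listed, using a large finite r to approximate the r → ∞ limit:

```python
import numpy as np

def sm_entropy(p, r, t):
    """Sharma-Mittal entropy of order r and degree t (r, t >= 0)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                                    # convention 0**0 = 0
    if abs(r - 1.0) < 1e-12:                        # order 1: Gaussian family (Table 4)
        s = float(np.sum(p * np.log(1.0 / p)))      # Shannon entropy
        if abs(t - 1.0) < 1e-12:
            return s
        return (1.0 - np.exp((1.0 - t) * s)) / (t - 1.0)
    ps = float(np.sum(p ** r))
    if abs(t - 1.0) < 1e-12:                        # degree 1: Rényi family (Table 4)
        return np.log(ps) / (1.0 - r)
    return (1.0 - ps ** ((t - 1.0) / (r - 1.0))) / (t - 1.0)

p = [0.5, 0.3, 0.2]
print(sm_entropy(p, 2, 2),   1 - sum(x ** 2 for x in p))          # Quadratic
print(sm_entropy(p, 0, 1),   np.log(3))                           # Hartley
print(sm_entropy(p, 1, 1),   sum(x * np.log(1 / x) for x in p))   # Shannon
print(sm_entropy(p, 500, 2), 1 - max(p))                          # ~ Error (large r)
```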

Figure 2. The Sharma-Mittal family of entropy measures is represented in a Cartesian quadrant with values of the order parameter r and of the degree parameter t lying on the x- and y-axis, respectively. Each point in the quadrant corresponds to a specific entropy measure, each line corresponds to a distinct one-parameter generalized entropy function. Several special cases are highlighted. (Relevant references and formulas are listed in Table 4.)

[Figure 2 appears here: the Sharma-Mittal plane with order r on the x-axis and degree t on the y-axis; the Hartley, Shannon, Quadratic, Rényi, Effective number, Power, Error, Tsallis, Gaussian, Arimoto, and Origin entropies are marked as points or curves.]


Figure 3. A graphical illustration of the generalized atomic information function ln_t(1/P(h)) for four different values of the parameter t (0, 1, 2, and 5, respectively, for the curves from top to bottom). Appropriately, the amount of information arising from finding out that h is the case is a decreasing function of P(h). For high values of t, however, such decrease is flattened: with t = 5 (the lowest curve in the figure) finding out that h is true provides almost the same amount of information for a large set of initial probability assignments.

A good deal more can be said about the scope of this approach: see Figures 2 and 3, Table 4, and SupplMat (section 3) for additional material. Here, we will only mention briefly three important further points about R-measures in the Sharma-Mittal framework and their meaning for modelling information search behavior. They are as follows.

Additivity of expected entropy reduction: For any H, E, F and P(H, E, F),

$$R_P^{SM(r,t)}(H, E \times F) = R_P^{SM(r,t)}(H, E) + R_P^{SM(r,t)}(H, F|E)$$

This statement means that, for any Sharma-Mittal R-measure, the informational utility of a combined test E × F for H amounts to the sum of the plain utility of E and the utility of F that is expected considering all possible outcomes of E [SupplMat, 4]. (Formally, $R_P^{SM(r,t)}(H, F|E) = \sum_{e_j \in E} R_P^{SM(r,t)}(H, F|e_j)\, P(e_j)$, while $R_P^{SM(r,t)}(H, F|e_j)$ denotes the expected entropy reduction of H provided by F as computed when all relevant probabilities are conditionalized on ej.) According to Nelson's (2008) discussion, this elegant additivity property of expected entropy reduction is important and highly desirable as concerns the analysis of the rational assessment of tests or queries. Moreover, one can see that the additivity of expected entropy reduction can be extended to any finite chain of queries and thus be applied to sequential search tasks such as those experimentally investigated by Nelson et al. (2014).
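The additivity property is easy to check numerically. The sketch below (ours) draws a random joint distribution P(H, E, F) and compares R(H, E × F) with R(H, E) + R(H, F|E) under Shannon entropy; by the result cited above [SupplMat, 4], substituting any other Sharma-Mittal entropy should yield the same agreement:

```python
import numpy as np
rng = np.random.default_rng(0)

def shannon(p):
    p = np.asarray(p, dtype=float); p = p[p > 0]
    return float(np.sum(p * np.log(1.0 / p)))

def R(j_hx, ent):
    """Expected entropy reduction; j_hx[i, k] = P(h_i & outcome_k) of some test."""
    p_out = j_hx.sum(axis=0)
    prior = j_hx.sum(axis=1)
    return sum(p_out[k] * (ent(prior) - ent(j_hx[:, k] / p_out[k]))
               for k in range(len(p_out)))

joint = rng.random((3, 2, 2)); joint /= joint.sum()         # random P(H, E, F)
lhs = R(joint.reshape(3, 4), shannon)                       # R(H, E x F)
r_e = R(joint.sum(axis=2), shannon)                         # R(H, E)
p_e = joint.sum(axis=(0, 2))                                # P(e_j)
r_f_given_e = sum(p_e[j] * R(joint[:, j, :] / p_e[j], shannon)   # R(H, F | E)
                  for j in range(2))
print(lhs, r_e + r_f_given_e)                               # the two values coincide
```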

Irrelevance: For any H, E and P(H, E), if either E = {e} or H ⊥_P E, then $R_P^{SM(r,t)}(H, E) = 0$.

This statement says that two special kinds of queries can be known in advance to be of no use, that is, informationally inconsequential relative to the hypothesis set of interest. One is the case of an empty test E = {e} with a single possibility that is already known to obtain with certainty, so that P(e) = 1. As suggested vividly by Floridi (2009, p. 26), this would be like consulting the raven in Edgar Allan Poe's famous poem, which is known to give one and the same answer no matter what (it always spells out "Nevermore"). The other case is when variables H and E are unrelated, that is, statistically independent according to P (H ⊥_P E in our notation). In both of these circumstances, $R_P^{SM(r,t)}(H, E) = 0$ simply because the prior and posterior distributions on H are identical for each possible value of E, so that no entropy reduction can ever obtain.

By the irrelevance condition, empty and unrelated queries have zero expected utility, but can a query E have a negative expected utility? If so, a rational agent would be willing to pay a cost just for not being told what the true state of affairs is as concerns E, much like an abandoned lover who wants to be spared being told whether her/his beloved is or is not happy because s/he expects more harm than good. Note, however, that for the lover non-informational costs are clearly involved, while we are assuming queries or tests to be assessed in purely informational terms, bracketing all further factors (see, e.g., Raiffa & Schlaifer, 1961, Meder & Nelson, 2012, and Markant & Gureckis, 2012, for work involving situation-specific payoffs). In this perspective, it is reasonable and common to see irrelevance as the worst-case scenario and exclude the possibility of informationally harmful tests: an irrelevant test (whether empty or statistically unrelated) simply cannot tell us anything of interest, but that is as bad as it can get (see Good, 1967, and Goosens, 1976, for seminal analyses; also see Dawid, 1998).⁶

Interestingly, not all Sharma-Mittal measures of expected entropy reduction are non-negative. Some of them do allow for the controversial idea that there could exist detrimental tests in purely informational terms, such that an agent should rank them worse than an irrelevant search and take active measures to avoid them (despite them having, by assumption, no intrinsic cost). Mathematically, a non-negative measure R_P(H, E) is generated if and only if the underlying entropy measure is a concave function [SupplMat, 4], and the conditions for concavity are as follows:

Concavity: $ent_P^{SM(r,t)}(H)$ is a concave function of {P(h1), ..., P(hn)} just in case t ≥ 2 − 1/r.⁷

In terms of Figure 2, this means that any entropy (represented by a point) below the Arimoto curve is not generally concave (see Figure 4 for a graphical illustration of a strongly non-concave entropy measure). Thus, if the concavity of ent is required (to preserve the non-negativity of R), then many prominent special cases are retained (including Quadratic, Hartley, Shannon, and Error entropy), but a significant bit of the whole Sharma-Mittal parameter space is ruled out. This concerns, for instance, entropies of degree 1 and order higher than 1 (see Ben-Bassat & Raviv, 1978).

⁶ In theories of so-called imprecise probabilities, the notion arises of a detrimental experiment E in the sense that interval probability estimates for each element in a hypothesis set of interest H can be properly included in the corresponding interval probability estimates conditional on each element in E. This phenomenon is known as dilation: one's initial state of credence about H becomes less precise (thus more uncertain, under a plausible interpretation) no matter how an experiment turns out. The strongly unattractive character of this implication has been sometimes disregarded (see Tweeney et al., 2010, for an example in the psychology of reasoning), but the prevailing view is that appropriate moves are required to avoid it or dispel it (for recent discussions, see Bradley & Steele, 2014; Pedersen & Wheeler, 2014).

⁷ This important result is proven in Hoffmann (2008), and already mentioned in Taneja et al. (1989, p. 61), who in turn refer to van der Pyl (1978) for a proof. We did not posit concavity as a defining property of entropies, and that's how it should be, in our opinion. Concavity may definitely be convenient or even required in some applications, but barring non-concave functions would be overly restrictive as concerns the formal notion of entropy. In physics, for instance, concavity is taken as directly relevant for generalized thermodynamics (Beck, 2009, p. 499; Tsallis, 2004, p. 10). In biological applications, on the other hand, concavity was suggested by Lewontin (1972; also see Rao, 2010, p. 71), but seen as having "no intuitive motivation" by Patil and Taille (1982, p. 552).



Table 4. A summary of the Sharma-Mittal framework and several of its special cases, including a specification of their structure in the general theory of means and a key reference for each.

Sharma-Mittal (Sharma & Mittal, 1975); r ≥ 0, t ≥ 0.
Algebraic form: $\frac{1}{t-1}\left[1 - \left(\sum_{h_i \in H} P(h_i)^r\right)^{\frac{t-1}{r-1}}\right]$.
Characteristic function and inverse: $g(x) = \ln_r(e_t^x)$; $g^{-1}(x) = \ln_t(e_r^x)$. Atomic information: $inf(x) = \ln_t(1/x)$.

Effective Numbers (Hill, 1973); r ≥ 0, t = 0.
Algebraic form: $\left(\sum_{h_i \in H} P(h_i)^r\right)^{\frac{1}{1-r}} - 1$.
Characteristic function and inverse: $g(x) = \ln_r(1 + x)$; $g^{-1}(x) = e_r^x - 1$. Atomic information: $inf(x) = \frac{1-x}{x}$.

Rényi (Rényi, 1961); r ≥ 0, t = 1.
Algebraic form: $\frac{1}{1-r}\,\ln\left(\sum_{h_i \in H} P(h_i)^r\right)$.
Characteristic function and inverse: $g(x) = \ln_r(e^x)$; $g^{-1}(x) = \ln(e_r^x)$. Atomic information: $inf(x) = \ln(1/x)$.

Power entropies (Laakso & Taagepera, 1979); r ≥ 0, t = 2.
Algebraic form: $1 - \left(\sum_{h_i \in H} P(h_i)^r\right)^{\frac{1}{r-1}}$.
Characteristic function and inverse: $g(x) = \ln_r\left(\frac{1}{1-x}\right)$; $g^{-1}(x) = 1 - (e_r^x)^{-1}$. Atomic information: $inf(x) = 1 - x$.

Gaussian (Frank, 2004); r = 1, t ≥ 0.
Algebraic form: $\frac{1}{t-1}\left[1 - e^{(1-t)\sum_{h_i \in H} P(h_i)\ln\left(\frac{1}{P(h_i)}\right)}\right]$.
Characteristic function and inverse: $g(x) = \ln(e_t^x)$; $g^{-1}(x) = \ln_t(e^x)$. Atomic information: $inf(x) = \ln_t(1/x)$.

Arimoto (Arimoto, 1971); r ≥ 1/2, t = 2 − 1/r.
Algebraic form: $\frac{r}{r-1}\left[1 - \left(\sum_{h_i \in H} P(h_i)^r\right)^{\frac{1}{r}}\right]$.
Characteristic function and inverse: $g(x) = \ln_r\left[\left(1 + \frac{1-r}{r}\,x\right)^{\frac{r}{1-r}}\right]$; $g^{-1}(x) = \frac{r}{r-1}\left[1 - (e_r^x)^{\frac{1-r}{r}}\right]$. Atomic information: $inf(x) = \frac{r}{r-1}\left(1 - x^{\frac{r-1}{r}}\right)$.

Tsallis (Tsallis, 1988); r = t ≥ 0.
Algebraic form: $\frac{1}{t-1}\left(1 - \sum_{h_i \in H} P(h_i)^t\right)$.
Characteristic function and inverse: $g(x) = x$; $g^{-1}(x) = x$. Atomic information: $inf(x) = \ln_t(1/x)$.

Quadratic (Gini, 1912); r = t = 2.
Algebraic form: $1 - \sum_{h_i \in H} P(h_i)^2$.
Characteristic function and inverse: $g(x) = x$; $g^{-1}(x) = x$. Atomic information: $inf(x) = 1 - x$.

Shannon (Shannon, 1948); r = t = 1.
Algebraic form: $\sum_{h_i \in H} P(h_i)\,\ln\left(\frac{1}{P(h_i)}\right)$.
Characteristic function and inverse: $g(x) = x$; $g^{-1}(x) = x$. Atomic information: $inf(x) = \ln(1/x)$.

Hartley (Hartley, 1928); r = 0, t = 1.
Algebraic form: $\ln\left[\sum_{h_i \in H} P(h_i)^0\right]$.
Characteristic function and inverse: $g(x) = e^x - 1$; $g^{-1}(x) = \ln(1 + x)$. Atomic information: $inf(x) = \ln(1/x)$.


Figure 4. Graphical illustration of the non-concave entropy ent^{SM(20,0)} for a binary hypothesis set H = {h, ¬h} as a function of the probability of h.

4.2. Psychological interpretation of the order and degree parameters

The order parameter r: Imbalance and continuity. What is the meaning of the order parameter in the Sharma-Mittal formalism when entropies and expected entropy reduction measures represent uncertainty and the value of queries, respectively? To clarify, let us consider what happens with extreme values of r, i.e., if r = 0 or r goes to infinity, respectively [SupplMat, 3]:

$$ent_P^{SM(0,t)}(H) = \ln_t\left[\sum_{h_i \in H} P(h_i)^0\right]$$

$$ent_P^{SM(\infty,t)}(H) = \ln_t\left[\frac{1}{\max_{h_i \in H}[P(h_i)]}\right]$$

Given the convention 0⁰ = 0, $\sum_{h_i \in H} P(h_i)^0$ simply computes the number of all elements in H with a non-null probability. Accordingly, when r = 0, entropy becomes an (increasing) function of the mere number of the "live" (non-zero probability) options in H. When r goes to infinity, on the other hand, entropy becomes a (decreasing) function of the probability of a single element in H, i.e., the most likely hypothesis. This shows that the order parameter r is an index of the imbalance of the entropy function, which indicates how much the entropy measure discounts minor (low probability) hypotheses. For order-0 measures, the actual probability distribution is neglected: non-zero probability hypotheses are just counted, as if they were all equally important (see Gauvrit & Morsanyi, 2014). For order-∞ measures, on the other hand, only the most probable hypothesis matters, and all other hypotheses are disregarded altogether. For intermediate values of r, more likely hypotheses count more, but less likely hypotheses do retain some weight. The higher [lower] r is, the more [less] the likely hypotheses are regarded and the unlikely hypotheses are discounted. Importantly, for extreme values of the order parameter, an otherwise natural idea of continuity fails in the measurement of entropy: when r goes to either zero or infinity, it is not the case that small (large) changes in the probability distribution P(H) produce comparably small (large) changes in entropy values.
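The imbalance role of r can be displayed directly by computing degree-1 (Rényi) entropies of a markedly skewed distribution for increasing order, as in this small sketch of ours (sm_entropy as in the section 4.1 sketch):

```python
import numpy as np

def sm_entropy(p, r, t):                  # as in the section 4.1 sketch
    p = np.asarray(p, dtype=float); p = p[p > 0]
    if abs(r - 1.0) < 1e-12:
        s = float(np.sum(p * np.log(1.0 / p)))
        return s if abs(t - 1.0) < 1e-12 else (1.0 - np.exp((1.0 - t) * s)) / (t - 1.0)
    ps = float(np.sum(p ** r))
    return np.log(ps) / (1.0 - r) if abs(t - 1.0) < 1e-12 \
        else (1.0 - ps ** ((t - 1.0) / (r - 1.0))) / (t - 1.0)

skewed = [0.98, 0.01, 0.01]
for r in [0.0, 0.5, 1.0, 2.0, 100.0]:
    print(r, sm_entropy(skewed, r, 1.0))
# r = 0 gives ln(3) = 1.0986 (all live options counted equally); as r grows,
# the value approaches ln(1/0.98) = 0.0202 (only the modal hypothesis matters)
```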

To see better how order-0 entropy measures behave, consider the simplest of them:

$$ent_P^{SM(0,0)}(H) = n_P - 1$$

where $n_P = \sum_{h_i \in H} P(h_i)^0$, so n_P denotes the number of hypotheses in H with a non-null (strictly positive) probability. Given the −1 correction, ent^{SM(0,0)} can be interpreted as the "number of contenders" for each entity in set H, because it takes value 0 when only one element is left. For future reference, we will label ent^{SM(0,0)} Origin entropy because it marks the origin of the graph in Figure 2. Importantly, the expected reduction of Origin entropy is just the expected number of hypotheses in H conclusively falsified by a test E.

To the extent that all details of the prior and posterior probability distribution over H are neglected, computational demands are significantly decreased with order-0 entropies. As a consequence, measures of the expected reduction of an order-0 entropy (and especially Origin entropy) also amount to comparably frugal, heuristic or quasi-heuristic models of information search (see Baron et al.'s model, 1988, p. 106). Lack of continuity, too, is associated with heuristic models, which often rely on discrete elements instead of continuous representations (see Gigerenzer, Hertwig, & Pachur, 2011; Katsikopoulos, Schooler, & Hertwig, 2010). More generally, when the order parameter approaches 0, entropy measures become more and more balanced, meaning that they treat all live hypotheses more and more equally. What happens to the associated expected entropy reduction measures is that they become more and more "Popperian" in spirit. In fact, for order-0 relevance measures, a test E will deliver some non-null expected informational utility about hypothesis set H if and only if some of the possible outcomes of E can conclusively rule out some element in H. Otherwise, the expected entropy reduction will be zero, no matter how large the changes in probability that might arise from E. Cognitively, relevance measures of low order would then describe the information search preferences of an agent who is distinctively eager to prune down the list of candidate hypotheses, an attitude which might prevail in earlier stages of an inquiry, when such a list can be sizable.

Among entropy measures of order infinity, we already know $ent^{SM(\infty,2)} = 1 - \max_{h_i \in H}[P(h_i)]$ as Error entropy. What this illustrates is that, when r goes to infinity, entropy measures become more and more decision-theoretic in a short-sighted kind of way: in the limit, they are only affected by the probability of a correct guess given the currently available information. A notable consequence for the associated measures of expected entropy reduction is that a test E can deliver some non-null expected informational utility only if some of the possible outcomes of E can alter the probability of the modal hypothesis in H. If that is not the case, then the expected utility will be zero, no matter how significant the changes in the probability distribution arising from E. Cognitively, then, R-measures of very high order would describe the information search preferences of an agent who is predominantly concerned with an estimate of the probability of error in an impending choice from set H.

The degree parameter t: Perfect tests and certainty. Let us now consider briefly the meaning of the degree parameter t in the Sharma-Mittal formalism when entropies and relevance measures represent uncertainty and the value of queries, respectively. A remarkable fact about the degree parameter t is that (unlike the order parameter r) it does not affect the ranking of entropy values. Indeed, one can show that any Sharma-Mittal entropy measure is a strictly increasing function of any other measure of the same order r, regardless of the degree (for any hypothesis set H and any probability distribution P) [SupplMat, 4]. Thus, concerning the ordinal comparison of entropy values, only if the order differs can divergences between pairs of SM entropy measures arise. On the other hand, the implications of the degree parameter for measures of expected entropy reduction are significant and have not received much attention.

As a useful basis for discussion, suppose that variables H and E are independent, in the standard sense that for any hi ∈ H and any ej ∈ E, P(hi ∩ ej) = P(hi)P(ej), denoted as H ⊥_P E. Then we have [SupplMat, 4]:

$$R_P^{SM(r,t)}(E, E) - R_P^{SM(r,t)}(H \times E, E) = (t - 1)\, ent_P^{SM(r,t)}(H)\, ent_P^{SM(r,t)}(E)$$

If expected entropy reduction is interpreted as a measure of the informational utility of queries or tests, this equality governs the relationship between the computed utilities of E in case it is a "perfect" (conclusive) test and in case it is not. More precisely, the first term on the left, $R_P^{SM(r,t)}(E, E)$, measures the expected informational utility of a perfect test because the test itself and the target of investigation are the same, hence finding out the true value of E removes all relevant uncertainty. On the other hand, E is not anymore a perfect test in the second term of the equation above, $R_P^{SM(r,t)}(H \times E, E)$, for here a more fine-grained hypothesis set H × E is at issue, thus a more demanding epistemic target; hence finding out the true value of E would not remove all relevant uncertainty. (Recall that, by assumption, H is statistically independent from E, so the uncertainty about H would remain untouched, as it were, after knowing about E.) With entropies of degree 1 (including Shannon), the associated measures of expected entropy reduction imply that E has exactly identical utility in both cases, because t = 1 nullifies the right-hand side of the equation, regardless of the order parameter r. With t > 1 the right-hand side is positive, so E is a strictly more useful test when it is conclusive than when it is not. With t < 1, on the contrary, the right-hand side is negative, so E is a strictly less useful test when it is conclusive than when it is not. Note that these are ordinal relationships (rankings). In comparing the expected informational utility of queries, the degree parameter t can thus play a crucial role. Crupi and Tentori (2014, p. 88) provided some simple illustrations which can be adapted as favoring an entropy with t > 1 as the basis for the R-measure of the expected utility of queries (here, we present an illustration in Figure 5).


Figure 5. Consider a standard 52-card playing deck, with Suit corresponding to the 4 equally probable suits, Value corresponding to the 13 equally probable numbers (or faces) that a card can take (2 through 10, Jack, Queen, King, Ace), and Suit × Value corresponding to the 52 equally probable individual cards in the deck. Suppose that you will be told the suit of a randomly chosen card. Is this more valuable to you if (i) (perfect test case) your goal is to learn the suit, i.e., R_P(Suit, Suit), or (ii) (inconclusive test case) your goal is to learn the specific card, i.e., R_P(Suit × Value, Suit)? What is the ratio of the value of the expected entropy reduction in (i) vs. (ii)? For degree 1, the information to be obtained has equal value in each case. For degrees greater than 1, the perfect test is more useful. For degrees less than one, the inconclusive test is more useful. Interestingly, as the figure shows, the degree parameter uniquely determines the relative value of R_P(Suit, Suit) and R_P(Suit × Value, Suit), regardless of the order parameter. In the figure, values of the order parameter r and of the degree parameter t lie on the x- and y-axis, respectively. Color represents the log of the ratio between the conclusive test and the inconclusive test case in the card example above: black means that the information values of the tests are equal (log of the ratio is 0); warm/cool shades indicate that the conclusive test has a higher/lower value, respectively (log of the ratio is positive/negative).
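The card example can be verified directly. For uniform distributions the computation reduces to entropies of uniform priors: R_P(Suit, Suit) = ent(U_4) for the conclusive test, while R_P(Suit × Value, Suit) = ent(U_52) − ent(U_13), since learning the suit leaves 13 equiprobable cards. In the following sketch of ours (sm_entropy as in section 4.1), the printed ratio depends on t alone, as the caption states:

```python
import numpy as np

def sm_entropy(p, r, t):                  # as in the section 4.1 sketch
    p = np.asarray(p, dtype=float); p = p[p > 0]
    if abs(r - 1.0) < 1e-12:
        s = float(np.sum(p * np.log(1.0 / p)))
        return s if abs(t - 1.0) < 1e-12 else (1.0 - np.exp((1.0 - t) * s)) / (t - 1.0)
    ps = float(np.sum(p ** r))
    return np.log(ps) / (1.0 - r) if abs(t - 1.0) < 1e-12 \
        else (1.0 - ps ** ((t - 1.0) / (r - 1.0))) / (t - 1.0)

def uniform(n):
    return np.full(n, 1.0 / n)

for t in [0.5, 1.0, 2.0]:
    for r in [0.5, 1.0, 2.0]:
        perfect = sm_entropy(uniform(4), r, t)                    # R(Suit, Suit)
        partial = sm_entropy(uniform(52), r, t) - sm_entropy(uniform(13), r, t)
        print(f"r = {r}, t = {t}: ratio = {perfect / partial:.3f}")
# The ratio works out to 13**(t - 1) for every order r: 1 at t = 1, 13 at t = 2,
# and about 0.277 at t = 0.5, matching the pattern described in Figure 5.
```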


The meaning of a high degree parameter is of particular interest in the so-called Tsallis family of entropy measures, obtained from ent^{SM(r,t)} when r = t (see Table 4). Consider Tsallis entropy of degree 30, that is, ent^{SM(30,30)}. With this measure, entropy remains very close to an upper bound value of 1/(t − 1) ≈ 0.0345 unless the probability distribution reflects near-certainty about the true element in the hypothesis set H. For instance, for as uneven a distribution as {0.90, 0.05, 0.05}, ent^{Tsallis(30)} yields entropy 0.0333, still close to 0.0345, while it quickly approaches 0 when the probability of one hypothesis exceeds 0.99. Non-Certainty entropy seems a useful label for future reference, as measure ent^{Tsallis(30)} essentially implies that entropy is almost invariant as long as an appreciable lack of certainty (a "reasonable doubt", as it were) endures. Accordingly, the entropy reduction from a piece of evidence e is largely negligible unless one is led to acquire a very high degree of certainty about H, and it approaches the upper bound of 1/(t − 1) as the posterior probability comes close to matching a truth-value assignment (with P(hi) = 1 for some i and 0 for all other hs). Up to the inconsequential normalizing constant t − 1, the expected reduction of this entropy, R^{Tsallis(30)}, amounts to a smooth variant of Nelson et al.'s (2010) "probability-of-certainty heuristic", where a datum ei ∈ E has informational utility 1 if it reveals the true element in H with certainty and utility 0 otherwise, so that the expected utility of E itself is just the overall probability that certainty about H is eventually achieved by that test. These remarks further illustrate that a larger degree t implies an increasing tendency of the corresponding R-measure to value highly the attainment of certainty or quasi-certainty about the target hypothesis set when assessing a test.
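A quick computation (ours, in plain Python) illustrates the quasi-threshold behavior of Tsallis entropy of degree 30 described above:

```python
def tsallis(p, t):
    # Tsallis entropy (the r = t diagonal of the Sharma-Mittal family)
    return (1.0 - sum(pi ** t for pi in p if pi > 0)) / (t - 1.0)

print(1.0 / 29.0)                               # upper bound 1/(t - 1) for t = 30, ~0.0345
print(tsallis([0.90, 0.05, 0.05], 30))          # ~0.033: still close to the bound
print(tsallis([0.999, 0.0005, 0.0005], 30))     # ~0.001: near-certainty collapses the entropy
```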

5. A systematic exploration of how key information search models diverge

Depending on different entropy functions, two measures R and R* of the expected reduction of entropy as the informational utility of tests may disagree in their rankings. Formally, there exist variables H, E, and F and probability distribution P(H, E, F) such that $R_P(H, E) > R_P(H, F)$ while $R^*_P(H, E) < R^*_P(H, F)$; thus, R-measures are not generally ordinally equivalent. In the following, we will focus on an illustrative sample of measures in the Sharma-Mittal framework and show that such divergences can be widespread, strong, and telling about the specific tenets of those measures. This means that different entropy measures can provide markedly divergent implications in the assessment of possible queries' expected usefulness. Depending on the interpretation of the models, this in turn implies conflicting empirical predictions and/or incompatible normative recommendations.

Our list will include three classical models that are standard at least in some domains, namely Shannon, Quadratic, and Error entropy. It also includes three measures which we previously labelled heuristic or quasi-heuristic in that they largely or completely disregard quantitative information conveyed by the relevant probability distribution P: these are Origin entropy (or the "number of contenders"), Hartley entropy, and Non-Certainty entropy, as defined above. For a wider coverage and comparison, we also include an entropy function lying well below the Arimoto curve in Figure 2, that is, ent^{SM(20,0)}, and thus labelled Non-Concave (see Figure 4).

We ran simulations to identify cases of strong disagreement between our seven measures of expected entropy reduction, on a pairwise basis, about which of two tests is taken to be more useful. In each simulation, we considered a scenario with a threefold hypothesis space H = {h1, h2, h3}, and two binary tests, E = {e, ¬e} and F = {f, ¬f}.⁸ The goal of each simulation was to find a case, that is, a specific joint probability distribution P(H, E, F), where two R-measures strongly disagree about which of two tests is most useful. The ideal scenario here is a case where expected reduction of one kind of entropy (say, Origin) implies that E is as useful as can possibly be found, while F is as bad as it can be, and the expected reduction of another kind of entropy (say, Shannon) implies the opposite, with equal strength of conviction.

The quantification of the disagreement between two R-measures in a given case, for a given P(H, E, F), arises from three steps (also see Nelson et al., 2010). (i) Normalization: for each measure, we divide nominal values of expected entropy reduction (for each of E and F) by the expected entropy reduction of a conclusive test for three equally probable hypotheses, that is, by $R_U(H, H)$. (ii) Preference Strength: for each measure, we compute the simple difference between the (normalized) expected entropy reduction for test E and for test F, that is, $\frac{R_P(H,E)}{R_U(H,H)} - \frac{R_P(H,F)}{R_U(H,H)}$. (iii) Disagreement Strength (DS): if the two measures agree on whether E or F is most useful, DS is defined as zero; if they disagree, DS is defined as the geometric mean of those measures' respective absolute preference strengths in step (ii).

In the simulations, a variety of techniques were involved in order to maximize disagreement strength, including random generation of prior probabilities over H and of likelihoods for E and F, optimization of likelihoods alone, and joint optimization of likelihoods and priors. Each example reported here was found in the attempt to maximize DS for a particular pair of measures. We relied on the simulations largely as a heuristic tool, thus selecting and slightly adapting the numerical examples to make them more intuitive and improve clarity.⁹

For each pair of R-measures in our sample of seven, at least one case of moderate or strong disagreement was found (Table 5). Thus, for each pairwise comparison one can identify probabilities for which the models make diverging claims about which test is more useful. In what follows, we append a short discussion to the cases in which Shannon entropy strongly disagrees with each competing model. Such discussion is illustrative and qualitative, to intuitively highlight the underlying properties of different models. Similar explications could be provided for all other pairwise comparisons, but are omitted for the sake of brevity.

Shannon vs. Non-Certainty Entropy (case 3 in Table 5; DS = 0.30). In its purest form, Non-Certainty entropy equals 0 if one hypothesis in H is known to be true with certainty, and 1 otherwise. As a consequence, the entropy reduction expected from a test E just amounts to the probability that full certainty will be achieved after E is performed. Within the Sharma-Mittal framework, this behavior can often be approximated by an entropy measure such as Tsallis of degree 30, as explained above.¹⁰ One example where the expected reduction of Shannon and Non-Certainty entropy disagree significantly involves a prior P(H) = {0.67, 0.10, 0.23}. The Non-Certainty measure rates very poorly a test E such that P(H|e) = {0.899, 0.100, 0.001}, P(H|¬e) = {0.001, 0.100, 0.899}, and P(e) = 0.74, and strongly prefers a test F such that P(H|f) = {1, 0, 0}, P(H|¬f) = {0.40, 0.18, 0.42}, and P(f) = 0.45, because the probability to attain full certainty from F is sizable (45%). The expected reduction of Shannon entropy implies the opposite ranking, because test E, while unable to provide full certainty, will invariably yield a highly skewed posterior as compared to the prior.

⁹ It is important to note that the procedures we used do not guarantee finding globally maximal solutions; thus, a failure to find a case of strong disagreement does not necessarily entail that no such case exists.

¹⁰ One should note, however, that Tsallis 30, unlike pure Non-Certainty entropy, is a continuous function. As a consequence, the approximation described eventually fails when one gets very close to limiting cases. More precisely, Tsallis 30 entropy rapidly decreases for almost certain distributions such as, say, P(H) = {0.998, 0.001, 0.001}. In fact, Tsallis 30 entropy is sizable and almost constant if P(H) conveys a less-than-almost-certain state of belief, and becomes largely negligible otherwise.

Shannon vs. Origin and Hartley entropy (case 5 in Table 5; DS = 0.56 and DS = 0.48, respectively). The reduction of both Origin and Hartley entropy embodies a similar idea of counting how many hypotheses are conclusively ruled out by the evidence. For example, with prior P(H) = {0.500, 0.499, 0.001}, the expected reduction of either Origin or Hartley entropy assigns value zero to a test E such that P(H|e) = {0.998, 0.001, 0.001}, P(H|¬e) = {0.001, 0.998, 0.001}, and P(e) = 0.501, because no hypothesis is ever ruled out conclusively, and rather prefers a test F such that P(H|f) = {0.501, 0.499, 0}, P(H|¬f) = {0, 0.499, 0.501}, and P(f) = 0.998. The expected reduction of Shannon entropy implies the opposite ranking, because F will almost always yield only a tiny change in overall uncertainty.

Shannon vs. Non-Concave entropy (case 6 in Table 5; DS = 0.26). For non-concave entropies, the expected entropy reduction may turn out to be negative, thus indicating an allegedly detrimental query, that is, a test whose expected utility is lower than that of a completely irrelevant test. This feature yields cases of significant disagreement between the expected reduction of our illustrative Non-Concave entropy, the Sharma-Mittal measure of order 20 and degree 0 (SM(20,0) for short), and of classical concave measures such as Shannon. With a prior P(H) = {0.66, 0.17, 0.17}, the Non-Concave measure rates a test E such that P(H|e) = {1, 0, 0}, P(H|¬e) = {1/3, 1/3, 1/3}, and P(e) = 0.49 much lower than an irrelevant test F such that P(H|f) = P(H|¬f) = P(H). Indeed, the non-concave R-measure assigns a significant negative value to test E. This critically depends on one interesting fact: for Non-Concave entropy, going from P(H) to a completely flat posterior, P(H|¬e), is an extremely aversive outcome (i.e., it implies a very large increase in uncertainty), while the 49% chance of achieving certainty by datum e is not highly valued (a feature of low-degree measures, as we know). The expected reduction of Shannon entropy implies the opposite ranking instead, as it conveys the principle that no test can be informationally less useful than an irrelevant test (such as F).

Shannon vs. Quadratic entropy (case 8 in Table 5; DS = 0.09). Shannon and Quadratic entropies are similar in many ways, yet cases of at least moderate disagreement can be found. One arises with prior P(H) = {0.50, 0.14, 0.36}. Test E is such that P(H|e) = {0.72, 0.14, 0.14}, P(H|¬e) = {0.14, 0.14, 0.72}, and P(e) = 0.62, while with test F one has P(H|f) = {0.5, 0.5, 0}, P(H|¬f) = {0.5, 0, 0.5}, and P(f) = 0.28. Expected Quadratic entropy reduction ranks E over F, as it puts a particularly high value on posterior distributions where one single hypothesis comes to prevail. In comparison, this is less important for the reduction of Shannon entropy, as long as some hypotheses are completely (or largely) ruled out, as occurs with F. Accordingly, the Shannon measure prefers F over E.

Shannon vs. Error entropy (case 9 in Table 5; DS = 0.20). A stronger disagreement arises between Shannon and Error entropy. Consider prior P(H) = {0.50, 0.18, 0.32}, a test E such that P(H|e) = {0.65, 0.18, 0.17}, P(H|¬e) = {0.17, 0.18, 0.65}, and P(e) = 0.69, and a test F such that P(H|f) = {0.5, 0.5, 0}, P(H|¬f) = {0.5, 0, 0.5}, and P(f) = 0.36. The expected reduction of Error entropy is significant with E but zero with F, because the latter will leave the modal probability untouched. (Note that it does not matter that the hypotheses with the maximum probability changed.) However, test F, unlike E, will invariably rule out a hypothesis that was a priori significantly probable, and for this reason is preferred by the Shannon R-measure.


Table 5. Cases of strong disagreement between seven measures of expected entropy reduction. Two binary tests E and F are considered for a ternary hypothesis set H. Preference strength is the difference between (normalized) values of expected entropy reduction for E and F, respectively: it is positive if test E is strictly preferred, negative if F is strictly preferred, and null if they are rated equally. The preference values illustrate that, for each pair of R-measures in our sample of seven, the table includes at least one case of moderate or strong disagreement. For each case, the seven preference strengths refer to the expected reduction of Non-Certainty, Origin, Hartley, Non-Concave, Shannon, Quadratic, and Error entropy, in that order.

Case 1. P(H) = {0.50, 0.25, 0.25}. Test E: P(H|e) = {0.5, 0.5, 0}, P(H|¬e) = {0.5, 0, 0.5}, P(e) = 0.5. Test F: P(H|f) = {1, 0, 0}, P(H|¬f) = {1/3, 1/3, 1/3}, P(f) = 0.25. Preference strengths: -0.250; 0.250; 0.119; 0.250; 0.119; 0; 0.

Case 2. P(H) = {0.67, 0.17, 0.17}. Test E: P(H|e) = {0.82, 0.17, 0.01}, P(H|¬e) = {0.01, 0.17, 0.82}, P(e) = 0.8. Test F: P(H|f) = {1, 0, 0}, P(H|¬f) = {1/3, 1/3, 1/3}, P(f) = 0.49. Preference strengths: -0.487; -0.490; -0.490; 0.394; 0.046; 0.062; 0.240.

Case 3. P(H) = {0.67, 0.10, 0.23}. Test E: P(H|e) = {0.899, 0.1, 0.001}, P(H|¬e) = {0.001, 0.1, 0.899}, P(e) = 0.74. Test F: P(H|f) = {1, 0, 0}, P(H|¬f) = {0.40, 0.18, 0.42}, P(f) = 0.45. Preference strengths: -0.409; -0.450; -0.450; 0.342; 0.218; 0.249; 0.329.

Case 4. P(H) = {0.6, 0.1, 0.3}. Test E: P(H|e) = {1, 0, 0}, P(H|¬e) = {1/3, 1/6, 1/2}, P(e) = 0.4. Test F: P(H|f) = {0.7, 0.3, 0}, P(H|¬f) = {0.55, 0, 0.45}, P(f) = 1/3. Preference strengths: 0.400; -0.100; 0.031; 0.045; 0.051; 0.155; 0.150.

Case 5. P(H) = {0.5, 0.499, 0.001}. Test E: P(H|e) = {0.998, 0.001, 0.001}, P(H|¬e) = {0.001, 0.998, 0.001}, P(e) = 0.501. Test F: P(H|f) = {0.501, 0.499, 0}, P(H|¬f) = {0, 0.499, 0.501}, P(f) = 0.998. Preference strengths: 0.942; -0.500; -0.369; 0.499; 0.617; 0.744; 0.746.

Case 6. P(H) = {0.66, 0.17, 0.17}. Test E: P(H|e) = {1, 0, 0}, P(H|¬e) = {1/3, 1/3, 1/3}, P(e) = 0.49. Test F: P(H|f) = P(H|¬f) = {0.66, 0.17, 0.17}, P(f) = 0.5. Preference strengths: 0.490; 0.490; 0.490; -0.236; 0.288; 0.250; 0.

Case 7. P(H) = {0.53, 0.25, 0.22}. Test E: P(H|e) = {1, 0, 0}, P(H|¬e) = {0.295, 0.375, 0.330}, P(e) = 1/3. Test F: P(H|f) = P(H|¬f) = {0.53, 0.25, 0.22}, P(f) = 0.5. Preference strengths: 0.333; 0.333; 0.333; -0.123; 0.261; 0.249; 0.080.

Case 8. P(H) = {0.50, 0.14, 0.36}. Test E: P(H|e) = {0.72, 0.14, 0.14}, P(H|¬e) = {0.14, 0.14, 0.72}, P(e) = 0.62. Test F: P(H|f) = {0.5, 0.5, 0}, P(H|¬f) = {0.5, 0, 0.5}, P(f) = 0.28. Preference strengths: 0; -0.500; -0.369; 0.293; -0.085; 0.086; 0.330.

Case 9. P(H) = {0.50, 0.18, 0.32}. Test E: P(H|e) = {0.65, 0.18, 0.17}, P(H|¬e) = {0.17, 0.18, 0.65}, P(e) = 0.69. Test F: P(H|f) = {0.5, 0.5, 0}, P(H|¬f) = {0.5, 0, 0.5}, P(f) = 0.36. Preference strengths: 0; -0.180; -0.133; 0.213; -0.179; -0.024; 0.225.

Case 10. P(H) = {0.42, 0.42, 0.16}. Test E: P(H|e) = {0.5, 0.5, 0}, P(H|¬e) = {0, 0, 1}, P(e) = 0.84. Test F: P(H|f) = {0.66, 0.24, 0.10}, P(H|¬f) = {0.10, 0.66, 0.24}, P(f) = 0.57. Preference strengths: 0.160; 0.580; 0.470; -0.146; 0.241; 0.115; -0.120.


6. Model comparison: Prediction and behavior

Now that we have seen examples illustrating the theoretical properties of a variety of Sharma-Mittal relevance measures, we turn to addressing whether the Sharma-Mittal measures can help with psychological or normative theory of the value of information.

6.1. Comprehensive analysis of Wason's abstract selection task

The single most widely studied experimental information search paradigm is Wason's (1966) selection task. In the classical, abstract version, participants are presented with a conditional hypothesis (or "rule"), h = "if A [antecedent], then C [consequent]". The hypothesis concerns some cards, each of which has a letter on one side and a number on the other, for instance A = "the card has a vowel on one side" and C = "the card has an even number on the other side". One side is displayed for each of four cards: one instantiating A (e.g., showing letter E), one instantiating not-A (e.g., showing letter K), one instantiating C (e.g., showing number 4), and one instantiating not-C (e.g., showing number 7). Participants therefore have four information search options in order to assess the truth or falsity of hypothesis h: turning over the A, the not-A, the C, or the not-C card. They are asked to choose which ones they would pick up as useful to establish whether the hypothesis holds or not. All, none, or any subset of the four cards can be selected.

According to Wason's (1966) original, "Popperian" reading of the task, the A and not-C search options are useful because they could falsify h (by possibly revealing an even number and a vowel, respectively), so a rational agent should select them. The not-A and C options, on the contrary, could not provide conclusively refuting evidence, so they are worthless in this interpretation. However, observed choice frequencies depart markedly from these prescriptions. In Oaksford and Chater's (1994, p. 613) meta-analysis, they were 89%, 16%, 62%, and 25% for A, not-A, C, and not-C, respectively. Oaksford and Chater (1994, 2003) devised Bayesian models of the task in which agents treat the four cards as sampled from a larger deck and are assumed to maximize the expected reduction of uncertainty, with Shannon entropy as the standard measure. Oaksford and Chater postulated a foil hypothesis h̄ in which A and C are statistically independent and a target hypothesis h under which C always (or almost always) follows A. In Oaksford and Chater's (1994) "deterministic" analysis, C always followed A under the dependence hypothesis h. A key innovation in Oaksford and Chater (2003, p. 291) was the introduction of an "exception" parameter, such that P(C|A) = 1 − P(exception) under h. The model also requires parameters a and g for the probabilities P(A) and P(C) of the antecedent and consequent of h. We implemented Oaksford and Chater's (2003) model, positing a = 0.22 and g = 0.27 (according to the "rarity" assumption) and a uniform prior on H = {h, h̄}, as suggested in Oaksford and Chater (2003, p. 296). We explored the implications of calculating the expected usefulness of turning over each card, not only according to Shannon entropy reduction, but for the whole set of entropy measures from the Sharma-Mittal framework.11

Empirical data. We first address how well different expected entropy reduction measures correspond to empirical aggregate card selection frequencies in the task, with respect to Oaksford and Chater's (2003) model. For the selection frequencies, we use the abstract selection task data as reported by Oaksford and Chater (1994, p. 613) and mentioned above (89%, 16%, 62%, and 25% for A, not-A, C, and not-C, respectively). Figure 6 (top row) shows the rank correlation between relevance values and empirical selection frequencies for each order and degree value from 0 to 20, in steps of 0.25 (below, R_SM(r,t) denotes the expected reduction of the Sharma-Mittal entropy of order r and degree t). First consider results for the model with P(exception) = 0 (Figure 6, top left subplot). A wide range of measures, including expected reduction of Shannon and Quadratic entropy, of some non-concave entropies (e.g., R_SM(10,1.5)), and of measures with fairly high degree correlate perfectly with the rank of selection frequencies. However, if a high-degree measure with moderate or high order is used, the rank correlation is not perfect.

11 To fit the relevant patterns of responses, we pursued a variety of methods, including optimizing Hattori's "selection tendency function" (which maps expected entropy reduction onto the predicted probability that a card will be selected, see Hattori, 1999, 2002; also see Stringer, Borsboom, & Wagenmakers, 2011), or taking previously reported parameters for Hattori's selection tendency function; Spearman rank correlation coefficients; and Pearson correlations. Similar results were obtained across these methods. Because the rank correlations are simple to discuss, we focus on those here. Full simulation results for these and other measures, model variants with other values of P(exception), and Matlab code, are available from J.D.N.


Consider for instance the Tsallis measure of degree 20 (i.e., R_SM(20,20)). This leads to relevance values for the A, not-A, C, and not-C cards of 0.0281, 0.0002, 0.0008, and 0.0084, respectively. Because the relative ordering of the C and the not-C card is incorrect (from the perspective of observed choices), the rank correlation is only 0.8. The same rank correlation of 0.8 is obtained, but for a different reason, from strongly non-concave relevance measures. R_SM(20,0), for instance, gives values of 1.181, 0.380, 1.054, and 0.372 (again for the A, not-A, C, and not-C cards, respectively), so that the not-A card is deemed more informative than the not-C card by this relevance measure.

Let us now consider expected reduction of Origin entropy, R_SM(0,0), as an example of the 0-order measures. It gives relevance values of 0.527, 0, 0, and 0.159 for the A, not-A, C, and not-C cards, respectively. This is similar to Wason's analysis of the task: only the A and the not-C cards can falsify a hypothesis (namely, the dependence hypothesis h), thus only those two cards have value. The other cards could change the relative plausibility of h vs. h̄; however, according to 0-order measures, no informational value is achieved because no hypothesis is definitely ruled out. In this sense, 0-order measures can be thought of as bringing elements of the original logical interpretation of the selection task into the same unified information-theoretic framework including Shannon and generalized entropies (see below for more on this). Interestingly, this does not imply that the A and the not-C cards are equally valuable: in the model, the A card offers a higher chance of falsifying h than the not-C card, so it is more valuable, according to this analysis. Thus, while incorporating the basic idea of the importance of possible falsification, the 0-order Sharma-Mittal formalization of informational value offers something that the standard logical reading does not: a rationale for assessing the relative value among those queries (the A and the not-C card) providing the possibility of falsifying a hypothesis. The Origin entropy values and the empirical data agree that the A card is most useful and (up to a tie) that the not-A card is least useful, but disagree on virtually everything else; R_SM(0,0)'s rank correlation to empirical card selection frequencies is 0.6325.

What if Oaksford and Chater's (2003) model is combined with exception parameter P(exception) = 0.1, rather than 0? In this case, the empirical selection frequencies perfectly correlate with the theoretical values for an even wider range of measures than for the "deterministic" model (Figure 6, top right plot). For instance, Tsallis of degree 11, i.e., R_SM(11,11), which had a rank correlation of 0.8 with P(exception) = 0, has a perfect rank correlation with 0.1. This is due to the relative ordering of the C and the not-C cards. For the P(exception) = 0 model, the A, not-A, C, and not-C cards had R_SM(11,11) relevance of 0.059, 0.002, 0.012, and 0.016, respectively; with P(exception) = 0.1, the cards' respective relevance values are 0.019, 0.001, 0.007, and 0.005. In addition, a dramatic difference between P(exception) = 0 and P(exception) = 0.1 arises for the 0-order measures. If P(exception) > 0, even if very small, no amount of obtained data can ever lead to ruling out a hypothesis in the model. Therefore, with P(exception) = 0.1 all cards have zero value for 0-order measures, and the correlation with behavioral data is undefined (plotted black in Figure 6).

A probabilistic understanding of Wason's normative indications. Finally, we discuss how well the expected informational value of the cards, as calculated using Oaksford and Chater's (2003) model and various Sharma-Mittal measures, corresponds to Wason's original interpretation of the task. We thus conducted the same analyses as above, but instead of using the human selection frequencies we assumed that the A card was selected with 100%, the not-A card with 0%, the C card with 0%, and the not-C card with 100% probability. The 0-order relevance measures, again within Oaksford and Chater's (2003) model with P(exception) = 0, provide a probabilistic understanding of Wason's normative indications. Like Wason, the 0-order measures deem only the A and the not-C cards to be useful when P(exception) = 0. The rank correlation with theoretical selection frequencies from Wason's analysis is 0.94 (see Figure 6, bottom left plot). Why is the correlation not perfect? The probabilistic understanding proposed, as discussed above, goes beyond the logical analysis: because the A card offers a higher probability of falsification than the not-C card does in the probability model, the 0-order relevance measures value the former more than the latter. Recall that our hypothetical participants always select both cards that entail the possibility of falsifying the dependence hypothesis; thus, the correlation is less than one. The worst correlation with Wason's ranking is from the strongly non-concave measures, such as R_SM(20,0); this correlation is exactly zero.


[Figure 6 appears here. Panel columns: Oaksford and Chater's (2003) model with P(exception) = 0 (left) and with P(exception) = 0.1 (right); panel rows: human data (top) and Wason's theory (bottom).]

Figure 6. Plots of rank correlation values for the expected reduction of various Sharma-Mittal entropies in Oaksford and Chater's (2003) model of the Wason selection task. In the top row, models of expected entropy reduction are compared with empirical aggregate card selection frequencies. In the bottom row, instead, the comparison is with theoretical choices implied by Wason's original analysis of the task. In the left vs. right columns the conditional probability representation of "if vowel, then even number" rules out exceptions or allows for them (with probability 0.1), respectively.


The Wason selection task illustrates the theoretical potential of the Sharma-Mittal framework. Whereas other authors noted the robustness of probabilistic analyses of the task across different measures of informational utility (see Fitelson & Hawthorne, 2010; Nelson, 2005, pp. 985-986; Oaksford & Chater, 2007), the variety of measures involved in those analyses arose in an ad hoc way. We extend those results, and show that even the traditional, allegedly anti-Bayesian reading of the task can be recovered smoothly in one overarching framework. In particular, the implications of Wason's Popperian interpretation can be represented well by the maximization of the expected reduction of an entirely balanced (order-0) Sharma-Mittal measure (such as Origin or Hartley entropy) in a deterministic reading of the task (i.e., with P(exception) = 0). Conversely, this means that adopting a probabilistic approach to Wason's task is not by itself sufficient to account for observed behavior. Even then, in fact, people's choices would still diverge from at least some theoretically viable models of information search.

6.2. Information search in experience-based studies

Is the same expected uncertainty reduction measure able to account for human behavior across a variety of tasks? To explore this issue, we reviewed experimental scenarios employed in experience-based investigations of information search behavior. In this experimental paradigm, participants learn the underlying statistical structure of an environment where items (plankton specimens) are visually displayed and subject to a binary classification (kind A vs. B) for which two binary features (yellow vs. black eye; dark vs. light claw) are potentially relevant. Immediate feedback is provided after each trial in a learning phase, until a performance criterion is reached, indicating adequate mastery of the environmental statistics. In a subsequent information-acquisition test phase of this procedure, both of the two features (eye and claw) are obscured, and participants have to select the most informative/useful feature relative to the target categories (kinds of plankton). (See Nelson et al., 2010, for a detailed description.) In our current terms, these scenarios concern a binary hypothesis space H = {specimen of kind A, specimen of kind B} and two binary tests E = {yellow eye, black eye} and F = {dark claw, light claw}. In each case, the experience-based learning phase conveyed the structure of the joint probability distribution P(H, E, F) to participants. The test phase, in which either feature E or F can be viewed, represents a way to see whether the participants deemed R(H, E) or R(H, F) to be greater.
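As an illustration, a small Python sketch (under the same assumed Sharma-Mittal form as in the earlier sketches) evaluates case 1 of Table 6 below: expected Shannon reduction favors test F, whereas an Arimoto-type measure of order 10 (degree 2 − 1/10 = 1.9) favors test E, the majority choice.

```python
import numpy as np

def sm_entropy(p, r, t):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return ((p ** r).sum() ** ((1 - t) / (1 - r)) - 1) / (1 - t)

def r_measure(prior, post_pos, post_neg, p_pos, r, t):
    # Expected entropy reduction of a binary test.
    return sm_entropy(prior, r, t) - (p_pos * sm_entropy(post_pos, r, t)
                                      + (1 - p_pos) * sm_entropy(post_neg, r, t))

# Case 1 of Table 6 (Nelson et al., 2010, Exp. 1).
prior = [0.7, 0.3]
E = ([0, 1], [0.754, 0.246], 0.072)      # P(H|e), P(H|not-e), P(e)
F = ([1, 0], [0.501, 0.499], 0.399)      # P(H|f), P(H|not-f), P(f)

for label, (r, t) in [("Shannon (approx.)", (1.0001, 1.0001)),
                      ("Arimoto order 10", (10, 1.9))]:
    rE, rF = r_measure(prior, *E, r, t), r_measure(prior, *F, r, t)
    print(label, "prefers", "E" if rE > rF else "F")
# Prints: Shannon prefers F, Arimoto order 10 prefers E (82% chose E).
```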

Table 6. Choices between two binary tests/experiments (E vs. F) for a binary classification problem (H) in experience-based experimental procedures. Cases 1-3 are taken from Nelson et al. (2010, Exp. 1); cases 4-5 from Exp. 3 in the same article; case 6 is an unpublished study using the same experimental procedure; cases 7-8 are from Meder and Nelson (2012, Exp. 1).

Case 1. P(H) = {0.7, 0.3}. Test E: P(H|e) = {0, 1}, P(H|¬e) = {0.754, 0.246}, P(e) = 0.072. Test F: P(H|f) = {1, 0}, P(H|¬f) = {0.501, 0.499}, P(f) = 0.399. Observed choices of E: 82% (23/28).

Case 2. P(H) = {0.7, 0.3}. Test E: P(H|e) = {0, 1}, P(H|¬e) = {0.767, 0.233}, P(e) = 0.087. Test F: P(H|f) = {1, 0}, P(H|¬f) = {0.501, 0.499}, P(f) = 0.399. Observed choices of E: 82% (23/28).

Case 3. P(H) = {0.7, 0.3}. Test E: P(H|e) = {0.109, 0.891}, P(H|¬e) = {0.978, 0.022}, P(e) = 0.320. Test F: P(H|f) = {1, 0}, P(H|¬f) = {0.501, 0.499}, P(f) = 0.399. Observed choices of E: 97% (28/29).

Case 4. P(H) = {0.7, 0.3}. Test E: P(H|e) = {0, 1}, P(H|¬e) = {0.733, 0.267}, P(e) = 0.045. Test F: P(H|f) = {1, 0}, P(H|¬f) = {0.501, 0.499}, P(f) = 0.399. Observed choices of E: 89% (8/9).

Case 5. P(H) = {0.7, 0.3}. Test E: P(H|e) = {0.201, 0.799}, P(H|¬e) = {0.780, 0.220}, P(e) = 0.139. Test F: P(H|f) = {1, 0}, P(H|¬f) = {0.501, 0.499}, P(f) = 0.399. Observed choices of E: 70% (14/20).

Case 6. P(H) = {0.7, 0.3}. Test E: P(H|e) = {0.135, 0.865}, P(H|¬e) = {0.848, 0.152}, P(e) = 0.208. Test F: P(H|f) = {1, 0}, P(H|¬f) = {0.501, 0.499}, P(f) = 0.399. Observed choices of E: 70% (14/20).

Case 7. P(H) = {0.44, 0.56}. Test E: P(H|e) = {0.595, 0.405}, P(H|¬e) = {0.331, 0.669}, P(e) = 0.414. Test F: P(H|f) = {0, 1}, P(H|¬f) = {0.502, 0.498}, P(f) = 0.123. Observed choices of E: 60% (12/20).

Case 8. P(H) = {0.36, 0.64}. Test E: P(H|e) = {0.090, 0.910}, P(H|¬e) = {0.707, 0.293}, P(e) = 0.562. Test F: P(H|f) = {0, 1}, P(H|¬f) = {0.501, 0.499}, P(f) = 0.282. Observed choices of E: 79% (15/19).

Overall, we found eight relevant experimental scenarios from the experimental paradigm described above (they are listed in Table 6) in which there was at least some interesting disagreement among the Sharma-Mittal measures about which feature is more useful. For each, we derived values of expected uncertainty reduction from Sharma-Mittal measures of order and degree from 0 to 20, in increments of 0.25, and we computed the simple proportion of cases in which each measure's ranking of R(H, E) and R(H, F) matched the most prevalent observed choice.

Nelson et al. (2010) devised their scenarios to dissociate predictions from a sample of competing and historically influential models of rational information search. Their conclusion was that the expected reduction of Error entropy (expected probability gain, in their terminology) accounted for participants' behavior and outperformed the expected reduction of Shannon entropy (expected information gain, in their terminology). A more comprehensive analysis within our current approach implies a richer picture. The data set employed can be accurately represented in the Sharma-Mittal framework for a significant range of degree values provided that the order parameter is high enough (the results are displayed in Figure 7, left side). Observed choices are especially consistent with expected reduction of a quite unbalanced (e.g., r ≥ 4), concave or quasi-concave (t close to 2) Sharma-Mittal entropy measure. Importantly, there is overlap between results from modeling the Wason selection task and these experience-based learning data, giving hope to the idea that a unified theoretical explanation of human behavior may extend across several tasks.

6.3. Information search in words-and-numbers studies

The experience-based learning tasks discussed above were inspired by analogous tasks in which the prior probabilities of categories and feature likelihoods were presented to participants using words and numbers (e.g., Skov and Sherman, 1986). We refer to such tasks as Planet Vuma experiments, reflecting the typically whimsical content, such as classifying species of aliens on Planet Vuma, designed to not conflict with people's experience with real object categories.

Whereas expected reduction of Error entropy, and other models as discussed above, gives a plausible explanation of the experience-based learning task data, individual data in words-and-numbers studies are very noisy, and no attempt has been made to see whether a unified theory could account for the modal responses across these tasks. We therefore re-analyzed empirical data from several Planet Vuma experiments, in a manner analogous to our analyses of the experience-based learning data above (Figure 7). What do the results show? To our surprise, the results suggest that there may be a systematic explanation of people's behavior on words-and-numbers-based tasks.

[Figure 7 appears here. Left panel: Experience-based learning (Plankton); right panel: Words-and-numbers presentation (Planet Vuma).]

Figure 7. On the left, a graphical illustration of the empirical accuracy of Sharma-Mittal measures relative to binary information search choices in 8 experience-based experimental scenarios (described in Table 6). The shade at each point illustrates the proportion of choices (out of 8) correctly predicted by the expected reduction of the corresponding underlying entropy, with white and black indicating maximum (8/8) and minimum (0/8) accuracy, respectively. Results suggest that an Arimoto metric of moderate or high order is highly consistent with human choices. On the right, illustration of the empirical accuracy of Sharma-Mittal measures in theoretically similar tasks, but where probabilistic information is presented in a standard explicit format (with numeric prior probabilities and test likelihoods). In these tasks, individual participants' test choices are highly noisy. Can a systematic theory still account for the modal results across tasks? We analyzed 13 cases (described in Table 7) of binary information search preferences. The shade at each point illustrates the proportion of comparisons (out of 13) correctly predicted by the expected reduction of the corresponding underlying entropy, with white and black again indicating maximum (13/13) and minimum (0/13) accuracy, respectively. Results show that a wide range of measures is consistent with available experimental findings, including Shannon entropy as well as a variety of high-degree measures (degree much higher than the Arimoto curve).


Table 7. Choices between two binary tests/experiments (E vs. F) for a binary classification problem (H) in words-and-numbers (Planet Vuma) experiments. Cases 1-6 are from Nelson (2005); case 7 is from Skov & Sherman (1986); cases 8-10 are from Nelson et al. (2010, Exp. 1); cases 11-13 from Wu et al. (in press, Exp. 1-3). In each case, test E was deemed more useful than test F by the participants. We only report scenarios for which at least two Sharma-Mittal measures strictly disagree about which of the tests has higher expected usefulness. (Thus, not all feature queries involved in the original articles are listed here.) Nelson (2005) asked participants to give a rank ordering among four possible features' information values. Here we list the six corresponding pairwise comparisons, in each case labeling the feature that was ranked higher as the favorite one (E). Wu et al. (in press) studied 14 different probability, natural frequency, and graphical information formats for the presentation of relevant probabilities. For comparison with other studies, we take results only from the standard probability format here. For each case we report the likelihoods P(e|h), P(e|¬h), P(f|h), and P(f|¬h).

Case 1. P(H) = {0.5, 0.5}; P(e|h) = 0.70, P(e|¬h) = 0.30; P(f|h) = 0.99, P(f|¬h) = 1.00.
Case 2. P(H) = {0.5, 0.5}; P(e|h) = 0.30, P(e|¬h) = 0.0001; P(f|h) = 0.99, P(f|¬h) = 1.00.
Case 3. P(H) = {0.5, 0.5}; P(e|h) = 0.01, P(e|¬h) = 0.99; P(f|h) = 0.99, P(f|¬h) = 1.00.
Case 4. P(H) = {0.5, 0.5}; P(e|h) = 0.30, P(e|¬h) = 0.0001; P(f|h) = 0.70, P(f|¬h) = 0.30.
Case 5. P(H) = {0.5, 0.5}; P(e|h) = 0.01, P(e|¬h) = 0.99; P(f|h) = 0.30, P(f|¬h) = 0.0001.
Case 6. P(H) = {0.5, 0.5}; P(e|h) = 0.01, P(e|¬h) = 0.99; P(f|h) = 0.70, P(f|¬h) = 0.30.
Case 7. P(H) = {0.5, 0.5}; P(e|h) = 0.90, P(e|¬h) = 0.55; P(f|h) = 0.65, P(f|¬h) = 0.30.
Case 8. P(H) = {0.7, 0.3}; P(e|h) = 0.57, P(e|¬h) = 0; P(f|h) = 0, P(f|¬h) = 0.24.
Case 9. P(H) = {0.7, 0.3}; P(e|h) = 0.57, P(e|¬h) = 0; P(f|h) = 0, P(f|¬h) = 0.29.
Case 10. P(H) = {0.7, 0.3}; P(e|h) = 0.05, P(e|¬h) = 0.95; P(f|h) = 0.57, P(f|¬h) = 0.
Case 11. P(H) = {0.7, 0.3}; P(e|h) = 0.41, P(e|¬h) = 0.93; P(f|h) = 0.03, P(f|¬h) = 0.30.
Case 12. P(H) = {0.7, 0.3}; P(e|h) = 0.43, P(e|¬h) = 1.00; P(f|h) = 0.04, P(f|¬h) = 0.37.
Case 13. P(H) = {0.72, 0.28}; P(e|h) = 0.03, P(e|¬h) = 0.83; P(f|h) = 0.39, P(f|¬h) = 1.00.


The degree of the most plausible measures is considerably above the Arimoto curve, although not as high as, for instance, Non-Certainty entropy (order 30). From a descriptive psychological standpoint, a plausible interpretation is that when confronted with words-and-numbers-type tasks, people have a strong focus on the chances of obtaining a certain or near-to-certain result, and are less concerned with (or, perhaps, attuned to) the details of the individual items in the probability distribution. The Sharma-Mittal framework provides a potential explanation for heretofore perplexing experimental results, while also highlighting key questions (e.g., how much preference for near-certainty, exactly, do subjects have) for future empirical research on words-and-numbers tasks.

6.4. Unifying theory and intuition in the Person Game (having your cake and eating it too)

In this section, we introduce another theoretical conundrum from the literature, and show how the Sharma-Mittal framework may help solve it. As pointed out above, the expected reduction of Error entropy had appeared initially to provide the best explanation of people's intuitions and behavior on experience-based learning information search tasks (Nelson et al., 2010). But this model leads to potentially counterintuitive behavior on another interesting kind of information search task, namely the Person Game (a variant of the Twenty Questions game). In this game, n cards (say, 20) with different faces are presented. One of those faces has been chosen at random (with equal probability) to be the correct face in a particular round of the game. The player's task is to find the true face in the smallest number of yes/no questions about physical features of the faces. For instance, asking whether the person has a beard would be a possible question, E = {e, ¬e}, with e = beard and ¬e = no beard. If k < n is the number of characters with a beard, then P(e) = k/n and P(¬e) = (n − k)/n. Moreover, a "yes" answer will leave k equiprobable guesses still in play, and a "no" answer n − k such guesses.
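The value of such a question under any Sharma-Mittal measure is easy to compute; the short Python sketch below (same assumed SM form as before) anticipates the pattern discussed in the rest of this section: the preference among splits depends on the degree t, not on the order r.

```python
import numpy as np

def sm_entropy(p, r, t):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return ((p ** r).sum() ** ((1 - t) / (1 - r)) - 1) / (1 - t)

def question_value(n, k, r, t):
    # Expected entropy reduction of a yes/no question splitting n
    # equiprobable guesses into k ("yes") and n - k ("no").
    prior = np.full(n, 1 / n)
    return (sm_entropy(prior, r, t)
            - (k / n) * sm_entropy(np.full(k, 1 / k), r, t)
            - ((n - k) / n) * sm_entropy(np.full(n - k, 1 / (n - k)), r, t))

n = 40
for t in (1.0001, 2.0, 3.0):   # the degree drives the preference pattern
    one_39 = question_value(n, 1, 5, t)   # order fixed at 5: it drops out
    split = question_value(n, 20, 5, t)
    print(t, round(one_39, 5), round(split, 5))
# t ~ 1: the 20:20 split wins; t = 2: tie (1/n each); t = 3: 1:39 wins.
```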

Several papers have reported (see Nelson et al., 2014, for references) that people preferentially ask about features that are possessed by close to 50% of the remaining possible items, thus with P(e) close to 0.5. This strategy can be labelled the split-half heuristic. It is optimal to minimize the expected number of questions needed under some task variants (Navarro & Perfors, 2011), although not in the general case (Nelson, Meder, & Jones, 2016), and can be accounted for using expected Shannon entropy reduction. But expected Shannon entropy reduction cannot account for people's behavior on experience-based learning information search tasks, as our above analyses show. Can expected Error entropy reduction account for these results and intuitions? Put more broadly, can the same entropy model provide a satisfying account for both the Person Game and the experience-based learning tasks? As it happens, Error entropy cannot account for the preference to split the remaining items close to 50%. In fact, every possible question (unless its answer is known already, because none or all of the remaining faces have the feature) has exactly the same expected Error entropy reduction, namely 1/k, where there are k items remaining (Nelson, Meder, & Jones, 2016). This might lead us to wonder whether we must have different entropy/information models to account for people's intuitions and behavior across these different tasks. Indeed, it would call into question the potential for a unified and general purpose theory of the psychological value of information.

It turns out that the findings on why expected Shannon entropy reduction favors questions close to a 50:50 split, and why Error entropy has no such preference, apply much more generally than to Shannon and Error entropy. In fact, for all Sharma-Mittal measures, the ordinal evaluation of questions in the Person Game is solely a function of the degree of the entropy measure, and has nothing to do with the order of the measure [Suppl. Mat., 5]. Among other things, this implies that all entropy-based measures with degree t = 1 have the exact same preferences as expected Shannon entropy reduction, and all of them quantify the usefulness of querying a feature as a function of the proportion of remaining items that possess that feature. Similarly, all degree-2 measures, and not only Error entropy, deem all questions to be equally useful in the Person Game. The core of this insight stems from the fact that, if a probability distribution is uniform, then the entropy of that distribution depends only on the degree of a Sharma-Mittal entropy measure. More formally, for any set of hypotheses H = {h1, h2, ..., hn} with a uniform probability distribution U(H):

ENT_SM(r,t)(U) = ln_t(n)
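This identity is easy to check numerically (ln_t is the Tsallis logarithm defined in the Supplementary Materials); a quick sketch:

```python
import numpy as np

def sm_entropy(p, r, t):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return ((p ** r).sum() ** ((1 - t) / (1 - r)) - 1) / (1 - t)

def ln_t(x, t):
    # Tsallis logarithm (see Supplementary Materials, Section 1).
    return (x ** (1 - t) - 1) / (1 - t)

n, t = 40, 1.7
for r in (0.5, 3.0, 12.0):  # the order r drops out for uniform distributions
    print(round(sm_entropy(np.full(n, 1 / n), r, t), 6), round(ln_t(n, t), 6))
```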


Figure 8. The expected entropy reduction of a binary question E = {e, ¬e} in the Person Game with a hypothesis set H of size 40 (the possible guesses, that is, characters initially in play) as a function of the proportion of possible guesses remaining after getting datum e (e.g., a "yes" answer to "has the chosen person a beard?"). Questions are deemed most valuable with the zero-degree entropy measures (bottom right plot). Although the shape of the curve is similar for the degree t = 0 and degree t = 1 measures, the actual information value (see the y axis) decreases as the degree increases. For degree t = 2 (for example for Error entropy), every question is equally useful (provided that there is some uncertainty about the answer; bottom left plot). If the degree is greater than 2, then the least-equally-split questions (e.g., 1:39 questions, in the case of 40 items) are deemed most useful (left column, top and middle row). The order parameter is irrelevant for purposes of evaluating questions' expected usefulness in the Person Game, because all prior and possible posterior probability distributions are uniform (see text).


Figure 8 shows how possible questions are valued, in the Person Game, as a function of the proportion of remaining items that possess a particular feature. We see that if t = 1, as for Shannon and all Rényi entropies, questions with close to a 50:50 split are preferred. If the degree t is greater than 1 but less than 2, questions with close to a 50:50 split are still preferred, but less so. If t = 2, then 1:99 and 50:50 questions are deemed equally useful. Remarkably, if the degree is greater than 2, then a 1:99 question is preferred to a 50:50 question.

While the choice of particular Sharma-Mittal measures is only partly constrained by observed preferences in the Person Game alone (and specifically the value of the order parameter r is not), nothing in principle would guarantee that a joint and coherent account of such behavior and other findings exists. It is then important to point out that one can, in fact, pick an entropy measure whereby the experience-based data above follow along with a greater informative value for 50:50 questions than for 1:99 questions in the Person Game. For instance, medium-order Arimoto entropies (such as SM(10,1.9)) will work.

7. General discussion

In this paper, we have presented a general framework for the formal analysis of uncertainty, the Sharma-Mittal entropy formalism. This framework generates a comprehensive approach to the informational value of queries (questions, tests, experiments) as the expected reduction of uncertainty. The amount of theoretical insight and unification achieved is remarkable, in our view. Moreover, such a framework can help us understand existing empirical results, and point out important research questions for future investigation of human intuition and reasoning processes as concerns uncertainty and information search.

Mathematically, the parsimony of the Sharma-Mittal formalism is appealing and yields decisive advantages in analytic manipulations, derivations, and calculations, too. Within the domain of cognitive science, no earlier attempt has been made to unify so many existing models concerning information search/acquisition behavior. Notably, this involves both popular candidate rational measures of informational utility (such as the expected reduction of Shannon or Error entropy) and avowed heuristic models, such as Baron et al.'s (1988, p. 106) quasi-Popperian heuristic (maximization of the expected number of hypotheses ruled out, i.e., the expected reduction of Origin entropy) and Nelson et al.'s (2010, p. 962) "probability-of-certainty" heuristic (closely approximated by the expected reduction of a high-degree Tsallis entropy, or a similar measure). In addition, once applied to uncertainty and information search, the Sharma-Mittal parameters are not dumb mathematical construals, but rather capture cognitively and behaviorally meaningful ideas. Roughly, the order parameter, r, captures how much one disregards minor hypotheses (via the kind of mean applied to the probability values in P(H)). The degree parameter t, on the other hand, captures how much one cares about getting (very close) to certainty (via the behavior of the surprise/atomic information function; see Figure 3). Thus, a high order indicates a strong focus on the prevalent (most likely) element in the hypothesis set and a lack of consideration for minor possibilities. A very low order, on the other hand, implies a Popperian or quasi-Popperian attitude in the assessment of tests, with a marked appreciation of potentially falsifying or almost falsifying evidence. The degree parameter, in turn, has important implications for how much potentially conclusive experiments are valued, as compared to experiments that are informative but not conclusive. Moreover, for each particular order, if the degree is higher than that of the corresponding Arimoto entropy (and in any case if the order is less than 0.5 or the degree is at least 2), then the concavity of the entropy measure guarantees that no experiment will be rated as having negative expected usefulness.
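The concavity condition can be probed numerically. The sketch below samples random binary tests over three hypotheses; it assumes (as in the earlier sketches) the generic Sharma-Mittal form, and takes the Arimoto degree for order r to be t = 2 − 1/r, consistent with the Arimoto curve referred to above. For a degree just above that value the minimal expected reduction stays (numerically) non-negative, while a non-concave measure yields a negative value for case 6 of Table 5.

```python
import numpy as np

rng = np.random.default_rng(0)

def sm_entropy(p, r, t):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return ((p ** r).sum() ** ((1 - t) / (1 - r)) - 1) / (1 - t)

def min_expected_reduction(r, t, trials=20000):
    # Draw random joint distributions P(H, E) for a binary test over
    # three hypotheses and track the smallest expected entropy reduction.
    worst = np.inf
    for _ in range(trials):
        joint = rng.dirichlet(np.ones(6)).reshape(2, 3)  # outcome x hypothesis
        p_out, prior = joint.sum(axis=1), joint.sum(axis=0)
        reduction = sm_entropy(prior, r, t) - sum(
            p_out[i] * sm_entropy(joint[i] / p_out[i], r, t) for i in range(2))
        worst = min(worst, reduction)
    return worst

print(min_expected_reduction(10, 1.95))  # just above 2 - 1/10: non-negative

# A non-concave case: SM(20, 0) on case 6 of Table 5 gives a negative value.
prior = [0.66, 0.17, 0.17]
print(sm_entropy(prior, 20, 0) - 0.49 * sm_entropy([1, 0, 0], 20, 0)
      - 0.51 * sm_entropy([1/3, 1/3, 1/3], 20, 0))
```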

Even according to fairly cautious views such as Aczél's (1984), the above remarks seem to provide a fairly strong motivation to consider pursuing a generalized approach. Here is another possible concern, however. Uncertainty and the informational value of tests may be involved in many arguments concerning human cognition. Now we see that those notions can be formalized in many different ways, such that different properties (say, additivity, or non-negativity) are or are not implied. Thus, the arguments at issue might be valid for some choices of the corresponding measures and not for others. This point has been labelled the issue of measure-sensitivity in related areas (Fitelson, 1999). Is it something to be worried about? Does it raise problems for our proposal?


It is not uncommon for measure-sensitivity to foster skeptical or dismissive reactions on the prospects of the formal analysis of the concept at issue (e.g., Hurlbert, 1971; Kyburg & Teng, 2001, pp. 98 ff.). However, measure-sensitivity is a widespread and mundane phenomenon. In areas related to the formal analysis of reasoning, the issue arises, for instance, for Bayesian theories of inductive confirmation (e.g., Brössel, 2013; Crupi & Tentori, 2016; Festa & Cevolani, 2016; Glass, 2013; Hájek & Joyce, 2008; Roche & Shogenji, 2014), scoring rules and measures of accuracy (e.g., D'Agostino & Sinigaglia, 2010; Leitgeb & Pettigrew, 2010a, b; Levinstein, 2012; Predd et al., 2009), and measures of causal strength (e.g., Griffiths & Tenenbaum, 2005, 2009; Fitelson & Hitchcock, 2011; Meder, Mayrhofer, & Waldmann, 2014; Sprenger, 2016). Our treatment contributes to making the same point explicit for measures of uncertainty and the informational value of experiments. This we see as a constructive contribution. The prominence of one specific measure in one research domain may well have been partly affected by historical contingencies. As a consequence, when a theoretical or experimental inference relies on the choice of one measure, it makes sense to check how robust it is across different choices or, alternatively, to acknowledge which measure-specific properties support the conclusion and how compelling they are. Having a plurality of related measures available is indeed an important opportunity. It prompts thorough investigation of the features of alternative options and their relationships (e.g., Crupi, Chater, & Tentori, 2013; Huber & Schmidt-Petri, 2009; Nelson, 2005, 2008), it can provide a rich source of tools for both theorizing and the design of new experimental investigations (e.g., Rusconi et al., 2014; Schupbach, 2011; Tentori et al., 2007), and it makes it possible to tailor specific models to varying tasks and contexts within an otherwise coherent approach (e.g., Crupi & Tentori, 2014; Dawid & Musio, 2014; Oaksford & Hahn, 2007).

Which Sharma-Mittal measures are more consistent with observed behavior overall? According to our analyses, a subset of Sharma-Mittal information search models receives a significant amount of convergent support. We found that measures of high but finite order accounting for the experience-based (plankton task) data (Figure 7, left side) are also empirically adequate for abstract selection task data (Figure 6, top row) and results from a Twenty Questions kind of task such as the Person Game (Figure 8). On the other hand, the best fit with words-and-numbers (Planet Vuma) information search tasks indicates a different kind of model within the Sharma-Mittal framework (Figure 7, right side). For these cases, our analysis thus suggests that people's behavior may comply with different measures in different situations, so a key question arises about the features of a task which affect such variation in a consistent way, such as a comparably stronger appreciation of certainty or quasi-certainty as prompted by an experimental procedure conveying environmental statistics by explicit verbal and numerical stimuli.

Beyond this broad outlook, our discussion also allows for the resolution of a number of puzzles. Let us mention a last one. Nelson et al. (2010) had concluded from their experimental investigations that human information search in an experience-based setting was appropriately accounted for by maximization of the expected reduction of Error entropy. This specific model, however, exhibits some questionable properties related to its lack of mathematical continuity: in particular, if the most likely hypothesis in H is not changed by any possible evidence in E, then the latter has no informational utility whatsoever according to R_Error, no matter if it can rule out other non-negligible hypotheses in the set (see, e.g., cases 1 and 6 in Table 6). Findings from Baron et al. (1988) suggest that this might not describe human judgment adequately. In that study, participants were given a fictitious medical diagnosis scenario with P(H) = {0.64, 0.24, 0.12}, and a series of possible binary tests including E such that P(H|e) = {0.47, 0.35, 0.18}, P(H|¬e) = {1, 0, 0}, and P(e) = 0.68, and another completely irrelevant test F (with an even chance of a positive/negative result on each one of the elements in H, so that P(H|f) = P(H|¬f) = P(H)). According to R_Error, tests E and F are both equally worthless, R_Error(H, E) = R_Error(H, F) = 0, because hypothesis h1 ∈ H remains the most likely no matter what. Participants' mean ratings of the usefulness of E and F were markedly different, however: 0.48 vs. 0.09 (on a 0-1 scale). Indeed, rating E higher than F seems at least reasonable, contrary to what R_Error implies. In the Sharma-Mittal framework, reconciliation is possible: expected reduction of a relatively high order (say, 10) entropy measure from the Arimoto family would account for Nelson et al.'s (2010) and similar findings (see Figure 7), and still would not put test E on a par with the entirely pointless test F. Indeed, given our theoretical background and the limited empirical indications available, such a measure would count as a plausible choice in our view, had one to pick a specific entropy underlying a widely applicable model of the informational utility of experiments. Moreover, this kind of operation has wider scope. Origin entropy, for instance, may imply largely appropriate ratings in some contexts (say, biological) and yet not be well-behaved because of its discontinuities: a Sharma-Mittal measure such as SM(0.1,0.1) would then closely approximate the former while avoiding the latter.
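For concreteness, here is a short sketch of the Baron et al. (1988) scenario (an illustration; Error entropy is approximated by a degree-2 Sharma-Mittal measure of very high order, and the prior is recomputed from the test's posteriors because the published values are rounded):

```python
import numpy as np

def sm_entropy(p, r, t):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return ((p ** r).sum() ** ((1 - t) / (1 - r)) - 1) / (1 - t)

def r_measure(prior, post_pos, post_neg, p_pos, r, t):
    return sm_entropy(prior, r, t) - (p_pos * sm_entropy(post_pos, r, t)
                                      + (1 - p_pos) * sm_entropy(post_neg, r, t))

# Baron et al. (1988): P(H) ~ {0.64, 0.24, 0.12}, up to rounding.
post_e, post_not_e, p_e = [0.47, 0.35, 0.18], [1.0, 0.0, 0.0], 0.68
prior = [p_e * x + (1 - p_e) * y for x, y in zip(post_e, post_not_e)]

for label, (r, t) in [("Error (approx.)", (200, 2)),
                      ("Arimoto order 10", (10, 1.9))]:
    rE = r_measure(prior, post_e, post_not_e, p_e, r, t)
    rF = r_measure(prior, prior, prior, 0.5, r, t)  # the irrelevant test F
    print(label, round(rE, 4), round(rF, 4))
# Error rates E and F alike (both ~0); Arimoto order 10 gives E a strictly
# positive (if small) value and F exactly zero.
```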

Many further empirical issues can be addressed. For one instance, our analysis of human data in Tables 6-7 and Figure 7 provides relatively weak and indirect evidence against non-concave entropy measures as a basis for the assessment of the informational utility of queries by human agents. However, strongly diverging predictions can be generated from concave vs. non-concave measures (as illustrated in cases 6 and 7, Table 5), and hence put to empirical test. Moreover, our explanatory reanalysis of prior work was based on the aggregate data reported in earlier articles, but how does this extend to individual behavior? We are aware of no studies that address questions of whether there are meaningful individual differences in the psychology of information. Thus, while inferences about individuals should be the goal (Lee, 2011), this requires future research, perhaps with adaptive Bayesian experimental design techniques (Kim et al., 2014). Better models of individual-level psychology could also serve the goal of identifying the information that would be most informative for individual human learners (Gureckis & Markant, 2012), potentially enhancing automated tutor systems. Another idea concerns the direct assessment of uncertainty, e.g., whether more uncertainty is perceived in, say, P(H) = {0.49, 0.49, 0.02} vs. P*(H) = {0.70, 0.15, 0.15}. Judgments of this kind are likely to play a role in human reasoning and decision-making and may be plausibly modulated by a number of interesting factors. Moreover, an array of relevant predictions can be generated from the Sharma-Mittal framework to dissociate subsets of entropy measures. Yet as far as we know, and rather surprisingly, no established experimental procedure exists for a direct behavioral measurement of the judged overall uncertainty concerning a hypothesis set; this is another important area for future investigation.


REFERENCES

AczélJ.(1984).Measuringinformationbeyondcommunicationtheory:Whysomegeneralized

informationmeasuresmaybeuseful,othersnot.AequationesMathematicae,27,1-19.

AczélJ.(1987).Characterizinginformationmeasures:Approachingtheendofanera.LectureNotes

inComputerScience,286,357-384.

AczélJ.,ForteB.,&NgC.T.(1974).WhytheShannonandHartleyentropiesare“natural”.Advances

inAppliedProbability,6,131-146.

ArimotoS.(1971).Information-theoreticalconsiderationsonestimationproblems.Informationand

Control,19,181-194.

AusterweilJ.L.&GriffithsT.L.(2011).Seekingconfirmationisrationalfordeterministichypotheses.

CognitiveScience,35,499-526.

Bar-HillelY.&CarnapR.(1953).Semanticinformation.BritishJournalforthePhilosophyofScience,

4,147-157.

BaronJ.(1985).Rationalityandintelligence.NewYork:CambridgeUniversityPress.

BaronJ.,BeattieJ.,&HersheyJ.C.(1988).HeuristicsandbiasesindiagnosticreasoningII:

Congruence,information,andcertainty.OrganizationalBehaviorandHumanDecisionProcesses,

42,88-110.

BarwiseJ.(1997).Informationandpossibilities.NotreDameJournalofFormalLogic,38,488-515.

BeckC.(2009).Generalisedinformationandentropymeasuresinphysics.ContemporaryPhysics,

50,495-510.

Ben-BassatM.&RavivJ.(1978).Rényi’sentropyandtheprobabilityoferror.IEEETransactionson

InformationTheory,24,324-331.

BenishW.A.(1999).Relativeentropyasameasureofdiagnosticinformation.MedicalDecision

Making,19,202-206.

BoztasS.(2014).OnRényientropiesandtheirapplicationstoguessingattacksincryptography.

IEICETransactionsonFundamentalsofElectronics,Communication,andComputerSciences,97,

2542-2548.

BradleyS.&SteeleK.(2014).Uncertainty,learningandthe‘problem’ofdilation.Erkenntnis,79,

1287–1303.

GENERALIZEDINFORMATIONTHEORYANDHUMANCOGNITION

53

BramleyN.R.,LagnadoD.,&SpeekenbrinkM.(2015).Conservativeforgetfulscholars:Howpeople

learncausalstructurethroughsequencesofinterventions.JournalofExperimentalPsychology:

Learning,Memory,andCognition,41,708-731.

BrierG.W.(1950).Verificationofforecastsexpressedintermsofprobability.MonthlyWeather

Report,78,1-3.

BrösselP.(2013).Theproblemofmeasuresensitivityredux.PhilosophyofScience,80,378–397.

CarnapR.(1952).TheContinuumofInductiveMethods.Chicago:UniversityofChicagoPress.

ChoA.(2002).Afreshtakeondisorder,ordisorderlyscience.Science,297,1268–1269.

CrupiV.&GirottoV.(2014).Fromistoought,andback:Hownormativeconcernsforsterprogress

inreasoningresearch.FrontiersinPsychology,5,219.

CrupiV.&TentoriK.(2014).Measuringinformationandconfirmation.StudiesintheHistoryand

PhilosophyofScience,47,81-90.

CrupiV.&TentoriK.(2016).Confirmationtheory.InA.Hájek&C.Hitchcock(eds.),Oxford

HandbookofPhilosophyandProbability(pp.650-665).Oxford:OxfordUniversityPress.

CrupiV.,ChaterN.,&TentoriK.(2013).Newaxiomsforprobabilityandlikelihoodratiomeasures.

BritishJournalforthePhilosophyofScience,64,189–204.

CrupiV.,TentoriK.,&LombardiL.(2009).Pseudodiagnosticityrevisited.PsychologicalReview,116,

971-985.

CsizárI.(2008).Axiomaticcharacterizationsofinformationmeasures.Entropy,10,261-273.

D’AgostinoM.&SinigagliaC.(2010).Epistemicaccuracyandsubjectiveprobability.InM.Suárez,M.

Dorato,&M.Rèdei(eds.),EpistemologyandMethodologyofScience(pp.95-105).Berlin:

Springer.

DaróczyZ.(1970).Generalizedinformationfunctions.InformationandControl,16,36-51.

DawidA.P.(1998).Coherentmeasuresofdiscrepancy,uncertaintyanddependence,with

applicationstoBayesianpredictiveexperimentaldesign.TechnicalReport139,Departmentof

StatisticalScience,UniversityCollegeLondon

(http://www.ucl.ac.uk/Stats/research/pdfs/139b.zip).

DawidA.P.&MusioM.(2014).Theoryandapplicationsofproperscoringrules.Metron,72,169-

183.

DenzlerJ.&BrownC.M(2002).Informationtheoreticsensordataselectionforactiveobject

recognitionandstateestimation.IEEETransactionsonPatternAnalysisandMachineIntelligence,

24,145-157.

GENERALIZEDINFORMATIONTHEORYANDHUMANCOGNITION

54

EvansJ.St.B.T.&OverD.E.(1996).Rationalityintheselectiontask:Epistemicutilityversus

uncertaintyreduction.PsychologicalReview,103,356-363.

FanoR.(1961).TransmissionofInformation:AStatisticalTheoryofCommunications.Cambridge:

MITPress.

FestaR.(1993).OptimumInductiveMethods.RijksuniversiteitGroningen.

FestaR.&CevolaniG.(2016).UnfoldingthegrammarofBayesianconfirmation:Likelihoodand

anti-likelihoodprinciples.PhilosophyofScience,forthcoming.

FitelsonB.(1999).ThepluralityofBayesianmeasuresofconfirmationandtheproblemofmeasure

sensitivity.PhilosophyofScience,66,S362–S378.

FitelsonB.&HawthorneJ.(2010).TheWasontask(s)andtheparadoxofconfirmation.

PhilosophicalPerspectives,24(Epistemology),207-241.

FitelsonB.&HitchcockC.(2011).Probabilisticmeasuresofcausalstrength.InP.McKayIllari,F.

Russo,&J.Williamson(eds.),CausalityintheSciences(pp.600–27).Oxford:OxfordUniversity

Press.

FloridiL.(2009).Philosophicalconceptionsofinformation.InG.Sommaruga(ed.),FormalTheories

ofInformation(pp.13-53).Berlin:Springer.

FloridiL.(2013).Semanticconceptionsofinformation.InE.Zalta(ed.),TheStanfordEncyclopedia

ofPhilosophy(Spring2013Edition).url=

http://plato.stanford.edu/archives/spr2013/entries/information-semantic.

FrankT.(2004).CompletedescriptionofageneralizedOrnstein-Uhlenbeckprocessrelatedtothe

non-extensiveGaussianentropy.PhysicaA,340,251-256.

GauvritN.&MorsanyiK.(2014).Theequiprobabilityfromamathematicalandpsychological

perspective.AdvancesinCognitivePsychology,10,119-130.

GibbsJ.P.&MartinW.T.(1962).Urbanization,technology,andthedivisionoflabor.American

SociologicalReview,27,667-677.

GigerenzerG.,HertwigR.,&PachurT.(eds.)(2011).Heuristics:TheFoundationsofAdaptive

Behavior.NewYork:OxfordUniversityPress.

GiniC.(1912).Variabilitàemutabilità.InMemoriedimetodologiastatistica,I:Variabilitàe

concentrazione(pp.189-358).Milano:Giuffrè,1939.

GlassD.H.(2013).Confirmationmeasuresofassociationruleinterestingness.Knowledge-Based

Systems,44,65-77.

GENERALIZEDINFORMATIONTHEORYANDHUMANCOGNITION

55

GneitingT.&RafteryA.E.(2007).Strictlyproperscoringrules,prediction,andestimation.Journal

oftheAmericanStatisticalAssociation,102,359-378.

GoodI.J.(1950).ProbabilityandtheWeightofEvidence.NewYork:Griffin.

GoodI.J.(1952).RationalDecisions.JournaloftheRoyalStatisticalSocietyB,14,107-114.

GoodI.J.(1967).Ontheprincipleoftotalevidence.BritishJournalforthePhilosophyofScience,17,

319-321.

GoosensW.K.(1976).Acritiqueofepistemicutilities.InR.Bogdan(ed.),Localinduction(pp.93-

114).Dordrecht:Reidel.

GriffithsT.L.&TenenbaumJ.B.(2005).Structureandstrengthincausalinduction.Cognitive

Psychology,51,334-384.

GriffithsT.L.&TenenbaumJ.B.(2009).Theory-basedcausalinduction.PsychologicalReview,116,

661-716.

GureckisT.M.&MarkantD.B.(2012).Self-directedlearning:Acognitiveandcomputational

perspective.PerspectivesonPsychologicalScience,7,464-481.

HájekA.&JoyceJ.(2008).Confirmation.InS.Psillos&M.Curd(eds.),RoutledgeCompaniontothe

PhilosophyofScience(pp.115-129).NewYork:Routledge.

HartleyR.(1928).Transmissionofinformation.BellsSystemsTechnicalJournal,7,535-563.

HassonU.(2016).Theneurobiologyofuncertainty:Implicationsforstatisticallearning.

PhilosophicalTransactionsB,371,20160048.

HattoriM.(1999).TheeffectsofprobabilisticinformationinWason’sselectiontask:Ananalysisof

strategybasedontheODSmodel.InProceedingsofthe16thAnnualMeetingoftheJapanese

CognitiveScienceSociety(pp.623-626).

HattoriM.(2002).AquantitativemodelofoptimaldataselectioninWason’sselectiontask.

QuarterlyJournalofExperimentalPsychology,55,1241-1272.

HavrdaJ.&CharvátF.(1967).Quantificationmethodofclassificationprocesses.Conceptof

structurala-entropy.Kybernetica,3,30-35.

HillM.(1973).Diversityandevenness:Aunifyingnotationanditsconsequences.Ecology,54,427-

431.

HoffmannS.(2008).Generalizeddistribution-baseddiversitymeasurement:Surveyandunification.

FacultyofEconomicsandManagementMagdeburg,WorkingPaper23(http://www.ww.uni-

magdeburg.de/fwwdeka/femm/a2008_Dateien/2008_23.pdf).

HorwichP.(1982).ProbabilityandEvidence.Cambridge,UK:CambridgeUniversityPress.

GENERALIZEDINFORMATIONTHEORYANDHUMANCOGNITION

56

HuberF.&Schmidt-PetriC.(eds.)(2009).DegreesofBelief.Dordrecht:Springer.

HurlbertS.H.(1971).Thenon-conceptofspeciesdiversity:Acritiqueandalternativeparameters.

Ecology,52,577-586.

JostL.(2006).Entropyanddiversity.Oikos,113,363-375.

KaniadakisG.,LissiaM.,&ScarfoneA.M.(2004).Deformedlogarithmsandentropies.PhysicaA,340,

41-49.

KatsikopoulosK.V.,SchoolerL.J.,&HertwigR.(2010).Therobustbeautyofordinary

information,PsychologicalReview,117,1259-1266.

KeylockJ.C.(2005).SimpsondiversityandtheShannon-Wienerindexasspecialcasesofa

generalizedentropy.Oikos,109:203-207.

KimW.,PittM.A.,LuZ.L.,SteyversM.,&Myung,J.I.(2014).Ahierarchicaladaptiveapproachto

optimalexperimentaldesign.NeuralComputation,26,2465-2492.

KlaymanJ.&HaY.-W.(1987).Confirmation,disconfirmation,andinformationinhypothesistesting.

PsychologicalReview,94,211-228.

KyburgH.E.&TengC.M.(2001).UncertainInference.NewYork:CambridgeUniversityPress.

LaaksoM.&TaageperaR.(1979).“Effective”numberofparties–Ameasurewithapplicationto

WestEurope.ComparativePoliticalStudies,12,3-27.

LandeR.(1996).Statisticsandpartitioningofspeciesdiversity,andsimilarityamongmultiple

communities.Oikos,76,5-13.

LeeM.D.(2011).HowcognitivemodelingcanbenefitfromhierarchicalBayesianmodels.Journalof

MathematicalPsychology,55,1-7.

LeggeG.E.,KlitzT.S.,&TjanB.S.(1997).Mr.Chips:Anidealobservermodelofreading.

PsychologicalReview,104,524-553.

LeitgebH.&PettigrewR.(2010a).AnobjectivejustificationofBayesianismI:Measuring

inaccuracy.PhilosophyofScience,77,201-235.

LeitgebH.&PettigrewR.(2010b).AnobjectivejustificationofBayesianismII:Theconsequencesof

minimizinginaccuracy.PhilosophyofScience,77,236-272.

LevinsteinB.(2012).LeitgebandPettigrewonaccuracyandupdating.PhilosophyofScience,79,

413-424.

LewontinR.C.(1972).Theapportionmentofhumandiversity.EvolutionaryBiology,6,381-398.

LindleyD.V.(1956).Onameasureoftheinformationprovidedbyanexperiment.Annalsof

MathematicalStatistics,27,986-1005.

GENERALIZEDINFORMATIONTHEORYANDHUMANCOGNITION

57

MarkantD.&GureckisT.M.(2012).Doestheutilityofinformationinfluencesamplingbehavior?In

N.Miyake,D.Peebles,&R.P.Cooper(eds.),Proceedingsofthe34thAnnualConferenceofthe

CognitiveScienceSociety(pp.719-724).Austin,TX:CognitiveScienceSociety.

MasiM.(2005).AstepbeyondTsallisandRényientropies.PhysicsLettersA,338,217-224.

MederB.&NelsonJ.D.(2012).Informationsearchwithsituation-specificrewardfunctions.

JudgmentandDecisionMaking,7,119-148.

MederB.,MayrhoferR.,&WaldmannM.R.(2014).Structureinductionindiagnosticcausal

reasoning.PsychologicalReview,121,277-301.

MuliereP.&ParmigianiG.(1993).Utilityandmeansinthe1930s.StatisticalScience,8,421-432.

NajemnikJ.&GeislerW.S.(2005).Optimaleyemovementstrategiesinvisualsearch.Nature,434,

387-391.

NajemnikJ.&GeislerW.S.(2009).Simplesummationruleforoptimalfixationselectioninvisual

search.VisionResearch,49,1286-1294.

NaudtsJ.(2002).Deformedexponentialsandlogarithmsingeneralizedthermostatistics.PhysicaA,

316,323-334.

NavarroD.J.&PerforsA.F.(2011).Hypothesisgeneration,sparsecategories,andthepositivetest

strategy.PsychologicalReview,118,120-134.

NelsonJ.D.(2005).Findingusefulquestions:OnBayesiandiagnosticity,probability,impact,and

informationgain.PsychologicalReview,112,979-999.

NelsonJ.D.(2008).Towardsarationaltheoryofhumaninformationacquisition.InM.Oaksford&

N.Chater(eds.),Theprobabilisticmind:Prospectsforrationalmodelsofcognition(pp.143-163).

Oxford:OxfordUniversityPress.

NelsonJ.D.&CottrellG.W.(2007).Aprobabilisticmodelofeyemovementsinconceptformation.

Neurocomputing,70,2256-2272.

NelsonJ.D.,DivjakB.,GudmundsdottirG.,MartignonL.,&MederB.(2014).Children’ssequential

informationsearchissensitivetoenvironmentalprobabilities.Cognition,130,74-80.

NelsonJ.D.,McKenzieC.R.M.,CottrellG.W.,&SejnowskiT.J.(2010).Experiencematters:

Informationacquisitionoptimizesprobabilitygain.PsychologicalScience,21,960-969.

NelsonJ.D.,MederB.,&JonesM.(2016).Onthefinelinebetween“heuristic”and“optimal”

sequentialquestionstrategies.Submitted.

GENERALIZEDINFORMATIONTHEORYANDHUMANCOGNITION

58

NelsonJ.D.,TenenbaumJ.B.,&MovellanJ.R.(2001).Activeinferenceinconceptlearning.InJ.D.

Moore&K.Stenning(eds.),Proceedingsofthe23rdConferenceoftheCognitiveScienceSociety

(pp.692-697).Mahwah(NJ):Erlbaum.

NiiniluotoI.&TuomelaR.(1973).TheoreticalConceptsandHypothetico-InductiveInference.

Dordrecht:Reidel.

OaksfordM.&ChaterN.(1994).Arationalanalysisoftheselectiontaskasoptimaldataselection.

PsychologicalReview,101,608-631.

OaksfordM.&ChaterN.(2003).Optimaldataselection:Revision,review,andre-evaluation.

PsychonomicBulletin&Review,10,289-318.

OaksfordM.&HahnU.(2007).Induction,deduction,andargumentstrengthinhumanreasoning

andargumentation.InA.Feeney&E.Heit(eds.),InductiveReasoning:Experimental,

Developmental,andComputationalApproaches(pp.269-301).Cambridge(UK):Cambridge

UniversityPress.

OaksfordM.&ChaterN.(2007).Bayesianrationality:Theprobabilisticapproachtohuman

reasoning.Oxford(UK):OxfordUniversityPress.

PatilG.&TailleC.(1982).Diversityasaconceptanditsmeasurement.JournaloftheAmerican

StatisticalAssociation,77,548-561.

PedersenP.&WheelerG.(2014).Demystifyingdilation.Erkenntnis,79:1305-1342.

PettigrewR.(2013).Epistemicutilityandnormsforcredences.PhilosophyCompass,8,897-908.

PopperK.R.(1959).TheLogicofScientificDiscovery.London:Routledge.

PreddJ.B.,SeiringerR.,LiebE.J.,OshersonD.,PoorH.V.,&KulkarniS.R.(2009).Probabilistic

coherenceandproperscoringrules.IEEETransactionsonInformationTheory,55,4786-4792.

RaiffaH.&SchlaiferR.(1961).AppliedStatisticalDecisionTheory.Boston:ClintonPress.

RaileanuL.E.&StoffelK.(2004).TheoreticalcomparisonbetweentheGiniIndexandInformation

Gaincriteria.AnnalsofMathematicsandArtificialIntelligence,41,77-93.

Ramírez-ReyesA.,Hernández-MontoyaA.R.,Herrera-CorralG.,&Domínguez-JiménezI.(2016).

DeterminingtheentropicindexqofTsallisentropyinimagesthroughredundancy.Entropy,18,

299.

RaoC.R.(2010).Quadraticentropyandanalysisofdiversity.Sankhya:TheIndianJournalof

Statistics,72A,70-80.

RenningerL.W.,CoughlanJ.,VergheseP.,&MalikJ.(2005).Aninformationmaximizationmodelof

eyemovements.AdvancesinNeuralInformationProcessingSystems,17,1121-1128.

GENERALIZEDINFORMATIONTHEORYANDHUMANCOGNITION

59

Rényi A. (1961). On measures of entropy and information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability I (pp. 547-556). Berkeley (CA): University of California Press.
Ricotta C. (2003). On parametric evenness measures. Journal of Theoretical Biology, 222, 189-197.
Roche W. & Shogenji T. (2014). Dwindling confirmation. Philosophy of Science, 81, 114-137.
Roche W. & Shogenji T. (2016). Information and inaccuracy. British Journal for the Philosophy of Science, forthcoming.
Ruggeri A. & Lombrozo T. (2015). Children adapt their questions to achieve efficient search. Cognition, 143, 203-216.
Rusconi P., Marelli M., D’Addario M., Russo S., & Cherubini P. (2014). Evidence evaluation: Measure Z corresponds to human utility judgments better than measure L and optimal-experimental-design models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 40, 703-723.
Sahoo P. K. & Arora G. (2004). A thresholding method based on two-dimensional Rényi's entropy. Pattern Recognition, 37, 1149-1161.
Savage L. J. (1972). The foundations of statistics. New York: Wiley.
Selten R. (1998). Axiomatic characterization of the quadratic scoring rule. Experimental Economics, 1, 43-61.
Shannon C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379-423 and 623-656.
Sharma B. & Mittal D. (1975). New non-additive measures of entropy for discrete probability distributions. Journal of Mathematical Sciences (Delhi), 10, 28-40.
Simpson E. H. (1949). Measurement of diversity. Nature, 163, 688.
Skov R. B. & Sherman S. J. (1986). Information-gathering processes: Diagnosticity, hypothesis-confirmation strategies, and perceived hypothesis confirmation. Journal of Experimental Social Psychology, 22, 93-121.
Slowiaczek L. M., Klayman J., Sherman S. L., & Skov R. B. (1992). Information selection and use in hypothesis testing: What is a good question, and what is a good answer? Memory & Cognition, 20, 392-405.
Sprenger J. (2016). Foundations for a probabilistic theory of causal strength. See: http://philsci-archive.pitt.edu/11927/1/GradedCausation-v2.pdf.


Stringer S., Borsboom D., & Wagenmakers E.-J. (2011). Bayesian inference for the information gain model. Behavior Research Methods, 43, 297-309.
Taneja I. J., Pardo L., Morales D., & Menéndez M. L. (1989). On generalized information and divergence measures and their applications: A brief review. Qüestiió, 13, 47-73.
Tentori K., Crupi V., Bonini N., & Osherson D. (2007). Comparison of confirmation measures. Cognition, 103, 107-119.
Tribus M. & McIrvine E. C. (1971). Energy and information. Scientific American, 225, 179-188.
Trope Y. & Bassok M. (1982). Confirmatory and diagnosing strategies in social information gathering. Journal of Personality and Social Psychology, 43, 22-34.
Trope Y. & Bassok M. (1983). Information-gathering strategies in hypothesis testing. Journal of Experimental Social Psychology, 19, 560-576.
Tsallis C. (1988). Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical Physics, 52, 479-487.
Tsallis C. (2002). Entropic non-extensivity: A possible measure of complexity. Chaos, Solitons, and Fractals, 13, 371-391.
Tsallis C. (2004). What should a statistical mechanics satisfy to reflect nature? Physica D, 193, 3-34.
Tsallis C. (2011). The nonadditive entropy Sq and its applications in physics and elsewhere: Some remarks. Entropy, 13, 1765-1804.
Tweeney R. D., Doherty M., & Kleiter G. D. (2010). The pseudodiagnosticity trap: Should participants consider alternative hypotheses? Thinking & Reasoning, 16, 332-345.
Vajda I. & Zvárová J. (2007). On generalized entropies, Bayesian decisions, and statistical diversity. Kybernetika, 43, 675-696.
van der Pyl T. (1978). Propriétés de l'information d'ordre a et de type b. In Théorie de l'information: Développements récents et applications (pp. 161-171), Colloques Internationales du CNRS, 276. Paris: Centre National de la Recherche Scientifique.
Wang G. & Jiang M. (2005). Axiomatic characterization of non-linear homomorphic means. Mathematical Analysis and Applications, 303, 350-363.
Wason P. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12, 129-140.
Wason P. (1966). Reasoning. In B. Foss (ed.), New Horizons in Psychology (pp. 135-151). Harmondsworth (UK): Penguin.


Wason P. (1968). Reasoning about a rule. Quarterly Journal of Experimental Psychology, 20, 273-281.
Wu C., Meder B., Filimon F., & Nelson J. D. (2017). Asking better questions: How presentation formats guide information search. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43, 1274-1297.


This is the Supplementary Materials file of:

Crupi V., Nelson J. D., Meder B., Cevolani G., and Tentori K., Generalized information theory meets human cognition: Introducing a unified framework to model uncertainty and information search. Cognitive Science, 2018.

1. Generalized logarithm and exponential

Consider the Tsallis logarithm, $\ln_t(x) = \frac{1}{1-t}\left(x^{1-t} - 1\right)$, and note that $1 + (1-t)\ln_t(x) = x^{1-t}$, therefore $x = \left[1 + (1-t)\ln_t(x)\right]^{\frac{1}{1-t}}$. This shows that the generalized exponential $e_t(x) = \left[1 + (1-t)x\right]^{\frac{1}{1-t}}$ just is the inverse function of $\ln_t(x)$.
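As an informal aside, here is a minimal numeric sketch of these two functions and their inverse relationship (our illustration, not part of the original derivation; all function names are ours):

```python
# Tsallis logarithm and generalized exponential: a quick check that
# e_t inverts ln_t for several values of t, and that both approach
# their ordinary counterparts as t -> 1.
import math

def ln_t(x, t):
    # Tsallis logarithm; the t = 1 branch is the ordinary natural log
    return math.log(x) if t == 1 else (x**(1 - t) - 1) / (1 - t)

def e_t(x, t):
    # generalized exponential, inverse of ln_t
    return math.exp(x) if t == 1 else (1 + (1 - t) * x)**(1 / (1 - t))

x = 2.7
for t in (0.0, 0.5, 1, 2.0, 3.0):
    assert abs(e_t(ln_t(x, t), t) - x) < 1e-9   # inverse relationship
print(ln_t(x, 1.001), math.log(x))              # nearly equal near t = 1
```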

In order to show that the ordinary natural logarithm is recovered from $\ln_t(x)$ (x > 0) in the limit for $t \to 1$, we posit $x = 1 - y$ and first consider $x \le 1$, so that $|-y| < 1$. Then we have:

$$\lim_{t\to1}\{\ln_t(x)\} = \lim_{t\to1}\{\ln_t(1-y)\} = \lim_{t\to1}\left\{\tfrac{1}{1-t}\left[(1-y)^{1-t} - 1\right]\right\}$$

By the binomial expansion of $(1-y)^{1-t}$:

$$\lim_{t\to1}\left\{\tfrac{1}{1-t}\left[-1 + \left(1 + (1-t)(-y) + \tfrac{(1-t)(1-t-1)(-y)^2}{2!} + \tfrac{(1-t)(1-t-1)(1-t-2)(-y)^3}{3!} + \cdots\right)\right]\right\}$$
$$= \lim_{t\to1}\left\{(-y) + \tfrac{(-t)(-y)^2}{2!} + \tfrac{(-t)(-t-1)(-y)^3}{3!} + \cdots\right\}$$
$$= \lim_{t\to1}\left\{(-y) - \tfrac{t(-y)^2}{2!} + \tfrac{t(t+1)(-y)^3}{3!} - \cdots\right\}$$
$$= (-y) - \tfrac{(-y)^2}{2!} + \tfrac{2!\,(-y)^3}{3!} - \cdots$$
$$= (-y) - \tfrac{(-y)^2}{2} + \tfrac{(-y)^3}{3} - \cdots$$

which is the series expansion of $\ln(1-y) = \ln(x)$ (recall that $|-y| < 1$). For the case x > 1, one can posit $x = 1/(1-y)$, so that again $|-y| < 1$, and compute

$$\lim_{t\to1}\left\{\tfrac{\left(\frac{1}{1-y}\right)^{1-t} - 1}{1-t}\right\} = \lim_{t\to1}\left\{-\tfrac{1}{t-1}\left[(1-y)^{t-1} - 1\right]\right\}$$

thus getting the same result from a similar derivation.

Just like the natural logarithm, $\ln_t(x)$ is non-negative if $x \ge 1$, because if $t < 1$, then $x^{1-t} \ge x^0 = 1$, therefore $\frac{1}{1-t}\left(x^{1-t} - 1\right) \ge 0$, while if $t > 1$, then $x^{1-t} \le x^0 = 1$, therefore again $\frac{1}{1-t}\left(x^{1-t} - 1\right) \ge 0$. If $0 < x < 1$, $\ln_t(x)$ is negative instead, again like the natural logarithm.

To show that the ordinary exponential is recovered from $e_t(x)$ (x > 0) in the limit for $t \to 1$, we again rely on the binomial expansion, as follows.

$$\lim_{t\to1}\{e_t(x)\} = \lim_{t\to1}\left\{\left[1 + (1-t)x\right]^{\frac{1}{1-t}}\right\}$$
$$= \lim_{t\to1}\left\{1 + \tfrac{1}{1-t}(1-t)x + \tfrac{1}{1-t}\left(\tfrac{1}{1-t}-1\right)\tfrac{(1-t)^2x^2}{2!} + \tfrac{1}{1-t}\left(\tfrac{1}{1-t}-1\right)\left(\tfrac{1}{1-t}-2\right)\tfrac{(1-t)^3x^3}{3!} + \cdots\right\}$$
$$= \lim_{t\to1}\left\{1 + \tfrac{1}{1-t}(1-t)x + \tfrac{1}{1-t}\cdot\tfrac{t}{1-t}\cdot\tfrac{(1-t)^2x^2}{2!} + \tfrac{1}{1-t}\cdot\tfrac{t}{1-t}\cdot\tfrac{2t-1}{1-t}\cdot\tfrac{(1-t)^3x^3}{3!} + \cdots\right\}$$
$$= 1 + x + \tfrac{x^2}{2!} + \tfrac{x^3}{3!} + \cdots = 1 + \sum_{k=1}^{\infty}\tfrac{x^k}{k!} = e(x)$$

Just like the ordinary exponential, $e_t(x) \ge 1$ if $x \ge 0$, because if $t < 1$, then one has $[1 + (1-t)x] \ge 1$ raised to a positive power $1/(1-t)$, while if $t > 1$, then one has $[1 + (1-t)x] \le 1$ raised to a negative power $1/(1-t)$. If $x < 0$, $e_t(x) < 1$ instead, again like the ordinary exponential.

2. Sharma-Mittal entropies

First we will derive the Sharma-Mittal formula from its generalized mean form. We have $g(x) = \ln_r e_t(x)$ and $\mathrm{inf}(x) = \ln_t(x)$. Let us find $g^{-1}(x)$, by solving $y = g(x)$ for $x$, as follows.

$$y = \ln_r e_t(x) = \ln_r\left[(1 + (1-t)x)^{\frac{1}{1-t}}\right] = \tfrac{1}{1-r}\left[(1 + (1-t)x)^{\frac{1-r}{1-t}} - 1\right]$$

Therefore:

$$1 + (1-r)y = \left[1 + (1-t)x\right]^{\frac{1-r}{1-t}}$$
$$\left[1 + (1-r)y\right]^{\frac{1}{1-r}} = \left[1 + (1-t)x\right]^{\frac{1}{1-t}}$$
$$e_r(y) = \left[1 + (1-t)x\right]^{\frac{1}{1-t}}$$
$$\left[e_r(y)\right]^{1-t} = 1 + (1-t)x$$
$$\tfrac{1}{1-t}\left\{\left[e_r(y)\right]^{1-t} - 1\right\} = x$$
$$x = \ln_t e_r(y)$$

So $g^{-1}(x) = \ln_t e_r(x)$. Now we have all the elements to derive the Sharma-Mittal formula.

5"/b

cd(`,4)(f) = g)( >∑ hijk∈m

g @n"o A(

pkIJ?

= !"#5_ >∑ hijk∈m!"_5# @!"# A

(

pkIJ?

= !"#5_ >∑ hijk∈m!"_ A

(

pkI?

= !"#5_ U(

()_∑ hijk∈m

qA(

pkI()_

− 1rY

= !"# >1 + (1 − a) @(

()_∑ hijk∈m

Vhi(_)() − 1WJ?

2

23`

64

= !"#s1 + ∑ hijk∈mhi(_)() − ∑ hijk∈m

t

2

23`

= !"#s∑ hi_

jk∈mt

2

23`= (

()#qV∑ hi

_jk∈m

W

234

23` − 1r =(

#)(q1 − V∑ hi

_jk∈m

W

432

`32r
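The equivalence just derived is easy to verify numerically. The following sketch (ours; the function names are merely illustrative, and the closed form assumes r, t ≠ 1) computes the generalized-mean form and the closed form side by side:

```python
# Sharma-Mittal entropy: the generalized-mean form
# ln_t(e_r(sum_i p_i * ln_r(1/p_i))) should match the closed form.
import math

def ln_g(x, t):
    return math.log(x) if t == 1 else (x**(1 - t) - 1) / (1 - t)

def e_g(x, t):
    return math.exp(x) if t == 1 else (1 + (1 - t) * x)**(1 / (1 - t))

def sm_mean_form(p, r, t):
    return ln_g(e_g(sum(pi * ln_g(1 / pi, r) for pi in p), r), t)

def sm_closed_form(p, r, t):
    s = sum(pi**r for pi in p)
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

p = [0.5, 0.3, 0.2]
for r, t in [(2.0, 0.5), (0.5, 3.0), (3.0, 2.0)]:
    assert abs(sm_mean_form(p, r, t) - sm_closed_form(p, r, t)) < 1e-9
```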

Let us note that $ent^{SM(r,t)}$ satisfies the basic properties of entropy measures. As pointed out above, the Tsallis logarithm $\ln_r(x)$ is always non-negative if $x \ge 1$, therefore so is $\sum_{h_i\in H} p_i \ln_r\left(\frac{1}{p_i}\right)$. Moreover, $e_r(x) \ge 1$ if $x \ge 0$ (see above), so $e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\frac{1}{p_i}\right)\right] \ge 1$ and finally $\ln_t e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\frac{1}{p_i}\right)\right] \ge 0$. This proves that non-negativity holds for $ent^{SM(r,t)}$.

Let us then consider evenness sensitivity. We already know that $ent^{SM(r,t)}$ is non-negative; also, $\sum_{h_i\in H} p_i^r = 1$ in case $p_i = 1$ for some $i$, so that $ent_P^{SM(r,t)}(H) = 0$ for such a degenerate distribution. As a consequence, for any $H$ and $P(H)$, $ent_P^{SM(r,t)}(H) \ge 0$, with equality in the degenerate case. In order to complete the proof of evenness sensitivity, we will now study the maximization of $ent^{SM(r,t)}$ by means of so-called Lagrange multipliers. We have to maximize $\sum_{h_i\in H} p_i \ln_r\left(\frac{1}{p_i}\right) = \frac{1}{1-r}\left[\sum_{h_i\in H} p_i^r - 1\right]$, so we study $f(x_1, \ldots, x_n) = \frac{1}{1-r}\left[\sum_{1\le i\le n} x_i^r - 1\right]$ under the constraint $\sum_{1\le i\le n} x_i = 1$. By the Lagrange multipliers method, we get a system of n + 1 equations as follows:

$$\begin{cases} \frac{r}{1-r}\,x_1^{r-1} = \lambda \\ \quad\vdots \\ \frac{r}{1-r}\,x_n^{r-1} = \lambda \\ x_1 + \cdots + x_n = 1 \end{cases}$$

where $x_1 = \cdots = x_n = 1/n$ is the only solution. This means that $\sum_{h_i\in H} p_i \ln_r\left(\frac{1}{p_i}\right)$ is either maximized or minimized by the uniform distribution U(H). But actually $ent_U^{SM(r,t)}(H)$ must be a maximum, so that, for any $H$ and $P(H)$, $ent_U^{SM(r,t)}(H) \ge ent_P^{SM(r,t)}(H)$. In fact, $ent_U^{SM(r,t)}(H)$ is strictly positive, because under U(H) one has $\sum_{h_i\in H} p_i \ln_r\left(\frac{1}{p_i}\right) = \ln_r(n)$, which is strictly positive (recall that n > 1). Hence, for any $H$, $ent_U^{SM(r,t)}(H)$ strictly exceeds the value 0 obtained under any degenerate distribution, and evenness sensitivity is shown to hold.
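A brute-force numeric illustration of evenness sensitivity (our sketch, using the closed form above and thus assuming r, t ≠ 1): no randomly drawn distribution exceeds the uniform one.

```python
# Evenness sensitivity, checked numerically: among random distributions
# over n outcomes, none has higher Sharma-Mittal entropy than U(H).
import random

def sm(p, r, t):
    s = sum(pi**r for pi in p if pi > 0)
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

n, r, t = 4, 2.0, 0.5
uniform = sm([1 / n] * n, r, t)
random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in range(n)]
    p = [wi / sum(w) for wi in w]
    assert sm(p, r, t) <= uniform + 1e-12
```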

3. Some special cases of the Sharma-Mittal family

Given the above analysis of generalized logarithms and exponentials, we have Rényi (1961) entropy as a special case of the Sharma-Mittal family as follows:

$$ent_P^{SM(r,1)}(H) = \ln\left\{e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\tfrac{1}{p_i}\right)\right]\right\}$$
$$= \ln\left\{\left[1 + (1-r)\tfrac{1}{1-r}\left(\sum_{h_i\in H} p_i^r - 1\right)\right]^{\frac{1}{1-r}}\right\}$$
$$= \ln\left[\left(\sum_{h_i\in H} p_i^r\right)^{\frac{1}{1-r}}\right] = \tfrac{1}{1-r}\ln\left(\sum_{h_i\in H} p_i^r\right) = ent_P^{\mathrm{Rényi}(r)}(H)$$


For Shannon entropy, in particular, one only needs to note that

$$ent_P^{SM(1,1)}(H) = \ln\left\{e\left[\sum_{h_i\in H} p_i \ln\left(\tfrac{1}{p_i}\right)\right]\right\} = \sum_{h_i\in H} p_i \ln\left(\tfrac{1}{p_i}\right)$$

For Tsallis (1988) entropy, we have:

$$ent_P^{SM(t,t)}(H) = \ln_t e_t\left[\sum_{h_i\in H} p_i \ln_t\left(\tfrac{1}{p_i}\right)\right] = \sum_{h_i\in H} p_i \ln_t\left(\tfrac{1}{p_i}\right) = \tfrac{1}{t-1}\left(1 - \sum_{h_i\in H} p_i^t\right) = ent_P^{\mathrm{Tsallis}(t)}(H)$$

For another generalization of Shannon entropy, i.e., Gaussian entropy (Frank, 2004), we have:

$$ent_P^{SM(1,t)}(H) = \ln_t e\left[\sum_{h_i\in H} p_i \ln\left(\tfrac{1}{p_i}\right)\right] = \tfrac{1}{1-t}\left[e^{(1-t)\sum_{h_i\in H} p_i \ln\frac{1}{p_i}} - 1\right] = ent_P^{\mathrm{Gauss}(t)}(H)$$

The way in which $ent^{\mathrm{Gauss}(t)}$ recovers Shannon entropy for t = 1 again follows from the behavior of the generalized logarithm, because $ent_P^{\mathrm{Gauss}(1)}(H) = \ln\left\{e\left[\sum_{h_i\in H} p_i \ln\left(\frac{1}{p_i}\right)\right]\right\} = \sum_{h_i\in H} p_i \ln\left(\frac{1}{p_i}\right)$.

For Power entropies, $ent_P^{SM(r,2)}(H) = ent_P^{\mathrm{Power}(r)}(H)$ follows immediately from $ent_P^{SM(r,t)}(H) = \frac{1}{t-1}\left[1 - \left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}}\right]$, and the same holds for Quadratic entropy, i.e., $ent_P^{SM(2,2)}(H) = ent_P^{\mathrm{Quad}}(H)$.

If we posit t = 2 − 1/r, we have

$$ent_P^{SM(r,\,2-\frac{1}{r})}(H) = \tfrac{r}{r-1}\left[1 - \left(\sum_{h_i\in H} p_i^r\right)^{\frac{1}{r}}\right]$$

which happens to be precisely Arimoto's (1971) entropy, under an inconsequential change of parametrization (Arimoto, 1971, used a parameter b to be set to 1/r in our notation).

For Effective Number measures (Hill, 1973), we have:

$$ent_P^{SM(r,0)}(H) = \tfrac{1}{-1}\left[1 - \left(\sum_{h_i\in H} p_i^r\right)^{\frac{-1}{r-1}}\right] = \left(\sum_{h_i\in H} p_i^r\right)^{\frac{1}{1-r}} - 1 = ent_P^{EN(r)}(H)$$
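These special cases can be spot-checked numerically; the sketch below (ours, with illustrative names) compares the closed form against independent implementations of Rényi, Tsallis, Quadratic, and Effective Number entropies:

```python
import math

def sm(p, r, t):
    s = sum(pi**r for pi in p)
    if t == 1:
        return math.log(s) / (1 - r)   # Renyi limit of the closed form
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

p = [0.5, 0.25, 0.25]
renyi = lambda p, r: math.log(sum(pi**r for pi in p)) / (1 - r)
tsallis = lambda p, t: (1 - sum(pi**t for pi in p)) / (t - 1)
quad = lambda p: 1 - sum(pi**2 for pi in p)
en = lambda p, r: sum(pi**r for pi in p)**(1 / (1 - r)) - 1

assert abs(sm(p, 3.0, 1) - renyi(p, 3.0)) < 1e-12      # SM(r, 1)
assert abs(sm(p, 3.0, 3.0) - tsallis(p, 3.0)) < 1e-12  # SM(t, t)
assert abs(sm(p, 2.0, 2.0) - quad(p)) < 1e-12          # SM(2, 2)
assert abs(sm(p, 3.0, 0.0) - en(p, 3.0)) < 1e-12       # SM(r, 0)
```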

As a further point concerning Effective Numbers, consider a Sharma-Mittal measure $ent_P^{SM(r,t)}(H) = \ln_t e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\frac{1}{p_i}\right)\right]$, for any choice of r and t (both non-negative). We ask what is the number N of equiprobable elements in a partition K such that $ent_P^{SM(r,t)}(H) = ent_U^{SM(r,t)}(K)$. We note that $ent_U^{SM(r,t)}(K) = \ln_t e_r\{\ln_r(N)\} = \ln_t(N)$, thus we posit:

$$ent_P^{SM(r,t)}(H) = \ln_t e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\tfrac{1}{p_i}\right)\right] = \ln_t(N)$$
$$N = e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\tfrac{1}{p_i}\right)\right] = \left[1 + (1-r)\tfrac{1}{1-r}\left(\sum_{h_i\in H} p_i^r - 1\right)\right]^{\frac{1}{1-r}} = \left(\sum_{h_i\in H} p_i^r\right)^{\frac{1}{1-r}} = ent_P^{EN(r)}(H) + 1$$

This shows that, regardless of the degree parameter t, for any Sharma-Mittal measure of a specified order r, $ent_P^{EN(r)}(H) + 1$ computes the theoretical number N of equally probable elements that would be just as entropic as H under that measure and given P(H).
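Numerically (our sketch, assuming r, t ≠ 1): the real number N = EN(r) + 1 need not be an integer, but ln_t(N) reproduces the Sharma-Mittal entropy of H for every degree t, as the derivation requires.

```python
import math

def sm(p, r, t):
    s = sum(pi**r for pi in p)
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

def ln_t(x, t):
    return math.log(x) if t == 1 else (x**(1 - t) - 1) / (1 - t)

p, r = [0.6, 0.3, 0.1], 2.0
N = sum(pi**r for pi in p)**(1 / (1 - r))   # EN(r) + 1; here about 2.17
for t in (0.5, 2.0, 3.0):
    assert abs(sm(p, r, t) - ln_t(N, t)) < 1e-12
```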

The derivation of the form of $ent^{SM(0,t)}$ is as follows:

$$ent_P^{SM(0,t)}(H) = \tfrac{1}{t-1}\left[1 - \left(\sum_{h_i\in H} p_i^0\right)^{\frac{t-1}{0-1}}\right] = \tfrac{1}{1-t}\left[(n^+)^{1-t} - 1\right] = \ln_t(n^+)$$

where $n^+$ is the number of elements in H with a non-null probability according to P(H) (recall that we apply the convention $0^0 = 0$, common in the entropy literature). Hartley entropy, $ent_P^{\mathrm{Hartley}}(H) = \ln(n^+)$, immediately follows as a special case for t = 1, just as Origin entropy, $ent_P^{\mathrm{Origin}}(H) = n^+ - 1$, for t = 0. t = 2 yields $ent_P^{SM(0,2)}(H) = -\left[(n^+)^{-1} - 1\right] = \frac{n^+ - 1}{n^+}$.

For the case of infinite order, we posit $p^* = \max_{h_i\in H}(p_i)$ and note the following (n is again the overall size of H):

$$(p^*)^r \le \sum_{h_i\in H} p_i^r \le n\,(p^*)^r$$

Assuming $\frac{t-1}{r-1} \ge 0$ involves no loss of generality in what follows:

$$\left[(p^*)^r\right]^{\frac{t-1}{r-1}} \le \left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}} \le \left[n\,(p^*)^r\right]^{\frac{t-1}{r-1}}$$
$$\ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right] \le \ln\left[\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}}\right] \le \ln\left[n^{\frac{t-1}{r-1}}\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right]$$
$$\ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right] \le \ln\left[\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}}\right] \le \ln\left(n^{\frac{t-1}{r-1}}\right) + \ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right]$$
$$\lim_{r\to\infty}\left\{\ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right]\right\} \le \lim_{r\to\infty}\left\{\ln\left[\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}}\right]\right\} \le \lim_{r\to\infty}\left\{\tfrac{t-1}{r-1}\ln(n)\right\} + \lim_{r\to\infty}\left\{\ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right]\right\}$$
$$\lim_{r\to\infty}\left\{\ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right]\right\} \le \lim_{r\to\infty}\left\{\ln\left[\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}}\right]\right\} \le 0 + \lim_{r\to\infty}\left\{\ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right]\right\}$$

Therefore:

$$\lim_{r\to\infty}\left\{\ln\left[\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}}\right]\right\} = \lim_{r\to\infty}\left\{\ln\left[\left((p^*)^r\right)^{\frac{t-1}{r-1}}\right]\right\} = \lim_{r\to\infty}\left\{\ln\left[\left((p^*)^{\frac{r}{r-1}}\right)^{t-1}\right]\right\}$$

The limit for $r \to \infty$ of the argument of the ln function exists and is finite: $\lim_{r\to\infty}\left[\left((p^*)^{\frac{r}{r-1}}\right)^{t-1}\right] = (p^*)^{t-1}$. For this reason, we can conclude:

$$\ln\left[\lim_{r\to\infty}\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}}\right] = \ln\left[\lim_{r\to\infty}\left((p^*)^{\frac{r}{r-1}}\right)^{t-1}\right]$$
$$\lim_{r\to\infty}\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}} = (p^*)^{t-1}$$
$$\tfrac{1}{1-t}\left[\lim_{r\to\infty}\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}} - 1\right] = \tfrac{1}{1-t}\left[(p^*)^{t-1} - 1\right]$$
$$\lim_{r\to\infty}\left\{\tfrac{1}{1-t}\left[\left(\sum_{h_i\in H} p_i^r\right)^{\frac{t-1}{r-1}} - 1\right]\right\} = \tfrac{1}{1-t}\left[\left(\tfrac{1}{p^*}\right)^{1-t} - 1\right]$$
$$\lim_{r\to\infty}\left[ent_P^{SM(r,t)}(H)\right] = \ln_t\left(\tfrac{1}{p^*}\right)$$

Error entropy is a special case for t = 2, because $\ln_2\left(\frac{1}{p^*}\right) = -\left[\left(\frac{1}{p^*}\right)^{-1} - 1\right] = 1 - p^* = 1 - \max_{h_i\in H}(p_i)$. t = 1 yields $\ln\left[\frac{1}{\max_{h_i\in H}(p_i)}\right]$, while, for t = 0, one immediately has $\frac{1}{\max_{h_i\in H}(p_i)} - 1$.
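A quick numeric look at the infinite-order limit (our sketch, assuming t ≠ 1): a large but finite order already brings SM(r,t) close to ln_t(1/p*), and t = 2 recovers Error entropy.

```python
def sm(p, r, t):
    s = sum(pi**r for pi in p)
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

def ln_t(x, t):
    # Tsallis logarithm, t != 1 assumed here
    return (x**(1 - t) - 1) / (1 - t)

p, t = [0.5, 0.3, 0.2], 2.0
p_star = max(p)
print(sm(p, 200.0, t))        # ~0.5017, drifting toward the limit
print(ln_t(1 / p_star, t))    # 0.5, i.e. 1 - p* (Error entropy)
```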

4. Ordinal equivalence, additivity, and concavity

The ordinal equivalence of any pair of Sharma-Mittal measures $ent^{SM(r,t)}$ and $ent^{SM(r,t^*)}$ with the same order r and different degrees, t and t*, is easily proven on the basis of the inverse relationship of $\ln_t(x)$ and $e_t(x)$. In fact, for any r, t, t*, and any H and P(H), $ent^{SM(r,t)}$ is a strictly increasing function of $ent^{SM(r,t^*)}$:

$$ent_P^{SM(r,t)}(H) = \ln_t e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\tfrac{1}{p_i}\right)\right] = \ln_t e_{t^*}\left\{\ln_{t^*} e_r\left[\sum_{h_i\in H} p_i \ln_r\left(\tfrac{1}{p_i}\right)\right]\right\} = \ln_t e_{t^*}\left[ent_P^{SM(r,t^*)}(H)\right]$$

For degrees t, t* ≠ 1, this implies that:

$$ent_P^{SM(r,t)}(H) = \tfrac{1}{1-t}\left\{\left[1 + (1-t^*)\,ent_P^{SM(r,t^*)}(H)\right]^{\frac{1-t}{1-t^*}} - 1\right\}$$

whereas when t = 1 and/or t* = 1, the limiting cases of the ordinary exponential and/or natural logarithm apply. This general result is novel in the literature to the best of our knowledge. However, a well-known special case is the relationship between Rényi entropies and the Effective Number measures (see Hill, 1973, p. 428, and Ricotta, 2003, p. 191):

$$ent_P^{\mathrm{Rényi}(r)}(H) = ent_P^{SM(r,1)}(H) = \ln\left\{e_0\left[ent_P^{SM(r,0)}(H)\right]\right\} = \ln\left[1 + ent_P^{SM(r,0)}(H)\right] = \ln\left[ent_P^{EN(r)}(H) + 1\right]$$

Another neat illustration involves Power entropy measures and Rényi entropies:

$$ent_P^{\mathrm{Power}(r)}(H) = ent_P^{SM(r,2)}(H) = \ln_2\left\{e\left[ent_P^{SM(r,1)}(H)\right]\right\} = 1 - e^{-ent_P^{\mathrm{Rényi}(r)}(H)}$$
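The degree-transformation formula above can be spot-checked numerically (our sketch; r, t, t* ≠ 1 to keep the closed form applicable):

```python
# Ordinal equivalence across degrees: SM(r, t) as a fixed increasing
# transform of SM(r, t*) at the same order r.
def sm(p, r, t):
    s = sum(pi**r for pi in p)
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

r, t, t_star = 2.0, 3.0, 0.5
for p in ([0.7, 0.2, 0.1], [0.4, 0.4, 0.2]):
    lhs = sm(p, r, t)
    base = 1 + (1 - t_star) * sm(p, r, t_star)
    rhs = (base**((1 - t) / (1 - t_star)) - 1) / (1 - t)
    assert abs(lhs - rhs) < 1e-12
```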

We will now derive the general additivity rule for Sharma-Mittal entropies concerning independent variables, i.e., when $X \perp_P Y$ holds. To simplify notation, below we will use $\Sigma(X)$ as a shorthand for $\left(\sum_{x_i\in X} P(x_i)^r\right)^{\frac{t-1}{r-1}}$ (the same for Y, and so on) and we will use $ent(X)$ as a shorthand for $ent_P^{SM(r,t)}(X)$ (the same for the expected reduction of entropy, R).

$$ent(X) + ent(Y) - (t-1)\,ent(X)\,ent(Y)$$
$$= \tfrac{1}{t-1}\left[1 - \Sigma(X)\right] + \tfrac{1}{t-1}\left[1 - \Sigma(Y)\right] - (t-1)\,\tfrac{1}{t-1}\left[1 - \Sigma(X)\right]\tfrac{1}{t-1}\left[1 - \Sigma(Y)\right]$$
$$= \tfrac{1}{t-1}\left[1 - \Sigma(X)\right] + \tfrac{1}{t-1}\left[1 - \Sigma(Y)\right] - \tfrac{1}{t-1}\left[1 - \Sigma(X)\right]\left[1 - \Sigma(Y)\right]$$
$$= \tfrac{1}{t-1} - \tfrac{1}{t-1}\Sigma(X) + \tfrac{1}{t-1} - \tfrac{1}{t-1}\Sigma(Y) - \tfrac{1}{t-1} + \tfrac{1}{t-1}\Sigma(X) + \tfrac{1}{t-1}\Sigma(Y) - \tfrac{1}{t-1}\Sigma(X)\Sigma(Y)$$
$$= \tfrac{1}{t-1} - \tfrac{1}{t-1}\Sigma(X)\Sigma(Y)$$
$$= \tfrac{1}{t-1}\left[1 - \left(\sum_{x_i\in X} P(x_i)^r\right)^{\frac{t-1}{r-1}}\left(\sum_{y_j\in Y} P(y_j)^r\right)^{\frac{t-1}{r-1}}\right]$$
$$= \tfrac{1}{t-1}\left[1 - \left(\sum_{x_i\in X}\sum_{y_j\in Y}\left(P(x_i)P(y_j)\right)^r\right)^{\frac{t-1}{r-1}}\right]$$
$$= \tfrac{1}{t-1}\left[1 - \left(\sum_{x_i\in X}\sum_{y_j\in Y} P(x_i \cap y_j)^r\right)^{\frac{t-1}{r-1}}\right] = ent(X \times Y)$$
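The additivity rule is easy to confirm on a concrete example (our sketch, assuming r, t ≠ 1): build a joint distribution by independence and compare both sides.

```python
# ent(X x Y) = ent(X) + ent(Y) - (t - 1) * ent(X) * ent(Y)
# when X and Y are independent under P.
def sm(p, r, t):
    s = sum(pi**r for pi in p)
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

px, py, r, t = [0.6, 0.4], [0.5, 0.3, 0.2], 2.0, 3.0
joint = [a * b for a in px for b in py]   # independence: P(x, y) = P(x)P(y)
lhs = sm(joint, r, t)
rhs = sm(px, r, t) + sm(py, r, t) - (t - 1) * sm(px, r, t) * sm(py, r, t)
assert abs(lhs - rhs) < 1e-12
```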

This additivity rule in turn governs the relationship between the expected entropy reduction of a test in case it is a perfect (conclusive) experiment and in case it is not. More precisely, it implies that for independent variables E and H:

$$R(E, E) - R(H \times E, E) = (t-1)\,ent(H)\,ent(E)$$

In fact:

$$R(E, E) - R(H \times E, E) = ent(E) - \sum_{e_j\in E}\left[ent(E|e_j)\right]P(e_j) - ent(H \times E) + \sum_{e_j\in E}\left[ent(H \times E|e_j)\right]P(e_j)$$
$$= ent(E) - 0 - ent(H) - ent(E) + (t-1)\,ent(H)\,ent(E) + \sum_{e_j\in E}\left[ent(H \times E|e_j)\right]P(e_j)$$
$$= -ent(H) + (t-1)\,ent(H)\,ent(E) + \sum_{e_j\in E}\left[ent(H|e_j) + ent(E|e_j) - (t-1)\,ent(H|e_j)\,ent(E|e_j)\right]P(e_j)$$
$$= -ent(H) + (t-1)\,ent(H)\,ent(E) + \sum_{e_j\in E}\left[ent(H|e_j) + 0 - (t-1)\left(ent(H|e_j) \times 0\right)\right]P(e_j)$$
$$= -ent(H) + (t-1)\,ent(H)\,ent(E) + ent(H) = (t-1)\,ent(H)\,ent(E)$$

Sharma-Mittal measures of expected entropy reduction are also generally additive for a combination of experiments, that is, for any H, E, F and $P(H, E, F)$, it holds that $R(H, E \times F) = R(H, E) + R(H, F|E)$. To see this, let us first consider the entropy reduction of a specific datum e, $\Delta ent(H, e) = ent(H) - ent(H|e)$. $\Delta ent$ is clearly additive in the following way:

$$\Delta ent(H, e \cap f) = ent(H) - ent(H|e \cap f) = ent(H) - ent(H|e) + ent(H|e) - ent(H|e \cap f) = \Delta ent(H, e) + \Delta ent(H, f|e)$$

But this pattern carries over to the expected value $R(H, E \times F)$:

$$R(H, E \times F) = \sum_{e_j\in E}\sum_{f_k\in F}\left[\Delta ent(H, e_j \cap f_k)\right]P(e_j \cap f_k)$$
$$= \sum_{e_j\in E}\sum_{f_k\in F}\left[\Delta ent(H, e_j) + \Delta ent(H, f_k|e_j)\right]P(f_k|e_j)P(e_j)$$
$$= \sum_{e_j\in E}\sum_{f_k\in F}\left[\Delta ent(H, e_j)\right]P(f_k|e_j)P(e_j) + \sum_{e_j\in E}\sum_{f_k\in F}\left[\Delta ent(H, f_k|e_j)\right]P(f_k|e_j)P(e_j)$$
$$= \sum_{e_j\in E}\left[\Delta ent(H, e_j)\right]P(e_j) + \sum_{e_j\in E}\left\{\sum_{f_k\in F}\left[\Delta ent(H, f_k|e_j)\right]P(f_k|e_j)\right\}P(e_j)$$
$$= R(H, E) + \sum_{e_j\in E}\left[R(H, F|e_j)\right]P(e_j) = R(H, E) + R(H, F|E)$$

This result is novel in the literature to the best of our knowledge.
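The chain rule $R(H, E \times F) = R(H, E) + R(H, F|E)$ can likewise be verified on a randomly drawn joint distribution; the sketch below (ours; all names are illustrative) does so for one (r, t) pair.

```python
import itertools, random

def sm(p, r, t):
    s = sum(pi**r for pi in p if pi > 0)
    return (1 - s**((t - 1) / (r - 1))) / (t - 1)

random.seed(1)
H, E, F = range(3), range(2), range(2)
w = {hef: random.random() for hef in itertools.product(H, E, F)}
z = sum(w.values())
P = {k: v / z for k, v in w.items()}

def posterior(pairs):
    # returns (P(H | pairs), P(pairs)) for a set of (e, f) outcomes
    joint = [sum(P[h, e, f] for e, f in pairs) for h in H]
    mass = sum(joint)
    return [x / mass for x in joint], mass

r, t = 2.0, 3.0
prior, _ = posterior([(e, f) for e in E for f in F])
ent = lambda p: sm(p, r, t)

R_EF = sum(m * (ent(prior) - ent(ph))
           for ph, m in (posterior([(e, f)]) for e in E for f in F))
R_E = sum(m * (ent(prior) - ent(ph))
          for ph, m in (posterior([(e, f) for f in F]) for e in E))
R_F_given_E = 0.0
for e in E:
    ph_e, _ = posterior([(e, f) for f in F])
    for f in F:
        ph_ef, m = posterior([(e, f)])
        R_F_given_E += m * (ent(ph_e) - ent(ph_ef))

assert abs(R_EF - (R_E + R_F_given_E)) < 1e-9
```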

Finally, we will show that, for any H, E, and P(H, E), $R(H, E) \ge 0$ if and only if ent is concave. Let $E_P(v)$ be the expected value of a variable v for some probability distribution P = {p_1, ..., p_m}, i.e., $E_P(v) = \sum_{i=1}^{m} v_i\,p_i$. According to a multivariate version of Jensen's inequality, $g(x_1, \ldots, x_n)$ is a concave function if and only if g of the expected values of its arguments is greater than (or equal to) the expected value of g, that is:

$$g\left[E_P(x_1), \ldots, E_P(x_n)\right] \ge E_P\left[g(x_1, \ldots, x_n)\right]$$

Now we set $g(x_1, \ldots, x_n) = ent(H|e)$ and we posit that $E_P$ be computed on the basis of P(E), i.e., $E_P(v) = \sum_{e_j\in E} v_j\,P(e_j)$. Assuming that ent is concave, we have:

$$\tfrac{1}{t-1}\left[1 - \left(\sum_{h_i\in H}\left(\sum_{e_j\in E} P(h_i|e_j)P(e_j)\right)^r\right)^{\frac{t-1}{r-1}}\right] \ge \sum_{e_j\in E}\left\{\tfrac{1}{t-1}\left[1 - \left(\sum_{h_i\in H} P(h_i|e_j)^r\right)^{\frac{t-1}{r-1}}\right]\right\}P(e_j)$$
$$\tfrac{1}{t-1}\left[1 - \left(\sum_{h_i\in H} P(h_i)^r\right)^{\frac{t-1}{r-1}}\right] - \sum_{e_j\in E}\left\{\tfrac{1}{t-1}\left[1 - \left(\sum_{h_i\in H} P(h_i|e_j)^r\right)^{\frac{t-1}{r-1}}\right]\right\}P(e_j) \ge 0$$
$$\sum_{e_j\in E}\left\{\tfrac{1}{t-1}\left[1 - \left(\sum_{h_i\in H} P(h_i)^r\right)^{\frac{t-1}{r-1}}\right] - \tfrac{1}{t-1}\left[1 - \left(\sum_{h_i\in H} P(h_i|e_j)^r\right)^{\frac{t-1}{r-1}}\right]\right\}P(e_j) \ge 0$$
$$\sum_{e_j\in E}\Delta ent(H, e_j)\,P(e_j) \ge 0$$
$$R(H, E) \ge 0$$
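As a numeric companion to the concavity argument (our sketch): Quadratic entropy SM(2,2) is a concave function of the probability vector, so its expected reduction should never be negative, whatever the joint distribution.

```python
import random

def quad(p):
    # Quadratic entropy, i.e. SM(2, 2); a concave function of p
    return 1 - sum(pi**2 for pi in p)

random.seed(0)
for _ in range(500):
    w = [[random.random() for _ in range(3)] for _ in range(2)]
    z = sum(map(sum, w))
    joint = [[x / z for x in row] for row in w]   # rows index e, cols h
    prior = [sum(row[i] for row in joint) for i in range(3)]
    R = quad(prior)
    for row in joint:
        pe = sum(row)
        R -= pe * quad([x / pe for x in row])
    assert R >= -1e-12   # expected reduction is non-negative
```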

5. Expected entropy reduction in the Person Game

To analyze the expected entropy reduction of one binary query in the person game, we will posit H = {h_1, ..., h_n} (the set of possible guesses as to who the randomly selected character is) and E = {e, $\bar e$} (the yes/no answers to a question such as "does the selected character have blue eyes?"; recall that "$\bar e$" denotes the complement or the negation of e). The joint probability distribution P(H, E) is defined as follows: $P(h_i \cap e) = 1/n$ in case $i \le k$ (with $1 \le k < n$) and $P(h_i \cap e) = 0$ otherwise; $P(h_i \cap \bar e) = 0$ in case $i \le k$ and $P(h_i \cap \bar e) = 1/n$ otherwise. This implies that $P(h_i) = 1/n$ for each i (all guesses are initially equiprobable), $P(h_i|e) = 1/k$ for each $i \le k$ (the posterior given e is a uniform distribution over k elements of H), and $P(h_i|\bar e) = 1/(n-k)$ for each $i > k$ (the posterior given $\bar e$ is a uniform distribution over n − k elements of H). Moreover, P(e) = k/n. Given the general fact that $ent_U^{SM(r,t)}(H) = \ln_t(n)$, we have:

$$R_P^{SM(r,t)}(H, E) = \left[\ln_t(n) - \ln_t(k)\right]P(e) + \left[\ln_t(n) - \ln_t(n-k)\right]P(\bar e)$$

Algebraic manipulations yield:

$$R_P^{SM(r,t)}(H, E) = \tfrac{n^{1-t}}{1-t}\left[1 - \left(P(e)^{2-t} + P(\bar e)^{2-t}\right)\right]$$

In the special case t = 2, one then has $R_P^{SM(r,2)}(H, E) = \frac{1}{n}$, so that the expected usefulness of query E is constant, regardless of the value of P(e). More generally, however, the first derivative of $R_P^{SM(r,t)}(H, E)$ with respect to P(e) is

$$\tfrac{n^{1-t}}{1-t}\left[(2-t)\,P(\bar e)^{1-t} - (2-t)\,P(e)^{1-t}\right]$$

which equals zero for $P(e) = P(\bar e)$, so that $R_P^{SM(r,t)}(H, E)$ has a maximum or a minimum for P(e) = ½. The second derivative, in turn, is:

$$n^{1-t}(t-2)\left[P(e)^{-t} + P(\bar e)^{-t}\right]$$

which is strictly positive [negative] in case t is strictly higher [lower] than 2. So, in the person game, $R_P^{SM(r,t)}(H, E)$ is a strictly concave function of P(e) when t < 2, and P(e) = ½ is then a maximum. When t > 2, on the contrary, $R_P^{SM(r,t)}(H, E)$ is a strictly convex function of P(e), and P(e) = ½ is then a minimum. This general result is novel in the literature to the best of our knowledge.
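These claims are easy to visualize numerically (our sketch; note that in the person game R depends only on t and n, all the distributions involved being uniform):

```python
def R_person_game(n, k, t):
    # expected usefulness of a binary query ruling out k of n equiprobable
    # characters, per the formula above (t = 1 would need the Shannon limit)
    pe = k / n
    return (n**(1 - t) / (1 - t)) * (1 - (pe**(2 - t) + (1 - pe)**(2 - t)))

n = 10
for t in (1.5, 2.0, 2.5):
    print(t, [round(R_person_game(n, k, t), 4) for k in range(1, n)])
# t = 2.0: constant 1/n; t = 1.5: peak at k = n/2; t = 2.5: dip at k = n/2
```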
