
  • 7/22/2019 The Ethics of Artificial Intelligence - Nick Bostrom


THE ETHICS OF ARTIFICIAL INTELLIGENCE

(2011)

Nick Bostrom

Eliezer Yudkowsky

Draft for Cambridge Handbook of Artificial Intelligence, eds. William Ramsey and Keith Frankish (Cambridge University Press, 2011): forthcoming

The possibility of creating thinking machines raises a host of ethical issues. These questions relate both to ensuring that such machines do not harm humans and other morally relevant beings, and to the moral status of the machines themselves. The first section discusses issues that may arise in the near future of AI. The second section outlines challenges for ensuring that AI operates safely as it approaches humans in its intelligence. The third section outlines how we might assess whether, and in what circumstances, AIs themselves have moral status. In the fourth section, we consider how AIs might differ from humans in certain basic respects relevant to our ethical assessment of them. The final section addresses the issues of creating AIs more intelligent than human, and ensuring that they use their advanced intelligence for good rather than ill.

Ethics in Machine Learning and Other Domain-Specific AI Algorithms

Imagine, in the near future, a bank using a machine learning algorithm to recommend mortgage applications for approval. A rejected applicant brings a lawsuit against the bank, alleging that the algorithm is discriminating racially against mortgage applicants. The bank replies that this is impossible, since the algorithm is deliberately blinded to the race of the applicants. Indeed, that was part of the bank's rationale for implementing the system. Even so, statistics show that the bank's approval rate for black applicants has been steadily dropping. Submitting ten apparently equally qualified genuine applicants (as determined by a separate panel of human judges) shows that the algorithm accepts white applicants and rejects black applicants. What could possibly be happening?

Finding an answer may not be easy. If the machine learning algorithm is based on a complicated neural network, or a genetic algorithm produced by directed evolution, then it may prove nearly impossible to understand why, or even how, the algorithm is judging applicants based on their race. On the other hand, a machine learner based on decision trees or Bayesian networks is much more transparent to programmer inspection (Hastie et al. 2001), which may enable an auditor to discover that the AI algorithm uses the address information of applicants who were born or previously resided in predominantly poverty-stricken areas.
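The auditing point can be made concrete with a small sketch. Below is a minimal, self-contained one-rule "decision stump" learner; the data and feature names are hypothetical, not the bank scenario's real system. Like a decision tree, its learned rule is directly readable, so an auditor can see at a glance that approval hinges on a proxy variable such as neighborhood poverty rate even though race is absent from the inputs.

```python
# Hypothetical sketch: a one-rule "decision stump" learner whose output,
# unlike a neural network's weights, can be read directly by an auditor.

def learn_stump(rows, labels, features):
    """Find the single feature/threshold rule with the fewest errors."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            # rule: predict "approve" (1) when the feature value is <= t
            errs = sum((1 if r[f] <= t else 0) != y
                       for r, y in zip(rows, labels))
            for flip in (False, True):        # also try the reversed rule
                e = len(labels) - errs if flip else errs
                if best is None or e < best[0]:
                    best = (e, (f, t, flip))
    return best[1]

# Hypothetical applicants: race is deliberately absent, but the poverty
# rate of the applicant's home zip code acts as a proxy for it.
rows = [
    {"income": 55, "zip_poverty_rate": 0.30},
    {"income": 60, "zip_poverty_rate": 0.35},
    {"income": 58, "zip_poverty_rate": 0.05},
    {"income": 52, "zip_poverty_rate": 0.08},
]
approved = [0, 0, 1, 1]   # biased historical decisions

feat, thresh, flip = learn_stump(rows, approved,
                                 ["income", "zip_poverty_rate"])
op = ">" if flip else "<="
print(f"learned rule: approve iff {feat} {op} {thresh}")
# The rule the learner prints keys on zip_poverty_rate, not income:
# exactly the kind of discovery the text says an auditor could make.
```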

AI algorithms play an increasingly large role in modern society, though usually not labeled "AI". The scenario described above might be transpiring even as we write. It will become increasingly important to develop AI algorithms that are not just powerful and scalable, but also transparent to inspection, to name one of many socially important properties.

Some challenges of machine ethics are much like many other challenges involved in designing machines. Designing a robot arm to avoid crushing stray humans is no more morally fraught than designing a flame-retardant sofa. It involves new programming challenges, but no new ethical challenges. But when AI algorithms take on cognitive work with social dimensions, cognitive tasks previously performed by humans, the AI algorithm inherits the social requirements. It would surely be frustrating to find that no bank in the world will approve your seemingly excellent loan application, and nobody knows why, and nobody can find out even in principle. (Maybe you have a first name strongly associated with deadbeats? Who knows?)

Transparency is not the only desirable feature of AI. It is also important that AI algorithms taking over social functions be predictable to those they govern. To understand the importance of such predictability, consider an analogy. The legal principle of stare decisis binds judges to follow past precedent whenever possible. To an engineer, this preference for precedent may seem incomprehensible: why bind the future to the past, when technology is always improving? But one of the most important functions of the legal system is to be predictable, so that, e.g., contracts can be written knowing how they will be executed. The job of the legal system is not necessarily to optimize society, but to provide a predictable environment within which citizens can optimize their own lives.

It will also become increasingly important that AI algorithms be robust against manipulation. A machine vision system to scan airline luggage for bombs must be robust against human adversaries deliberately searching for exploitable flaws in the algorithm, for example, a shape that, placed next to a pistol in one's luggage, would neutralize recognition of it. Robustness against manipulation is an ordinary criterion in information security; nearly the criterion. But it is not a criterion that appears often in machine learning journals, which are currently more interested in, e.g., how an algorithm scales up on larger parallel systems.

Another important social criterion for dealing with organizations is being able to find the person responsible for getting something done. When an AI system fails at its assigned task, who takes the blame? The programmers? The end users? Modern bureaucrats often take refuge in established procedures that distribute responsibility so widely that no one person can be identified to blame for the catastrophes that result (Howard 1994). The provably disinterested judgment of an expert system could turn out to be an even better refuge. Even if an AI system is designed with a user override, one must consider the career incentive of a bureaucrat who will be personally blamed if the override goes wrong, and who would much prefer to blame the AI for any difficult decision with a negative outcome.

Responsibility, transparency, auditability, incorruptibility, predictability, and a tendency to not make innocent victims scream with helpless frustration: all criteria that apply to humans performing social functions; all criteria that must be considered in an algorithm intended to replace human judgment of social functions; all criteria that may not appear in a journal of machine learning considering how an algorithm scales up to more computers. This list of criteria is by no means exhaustive, but it serves as a small sample of what an increasingly computerized society should be thinking about.

Artificial General Intelligence

There is nearly universal agreement among modern AI professionals that Artificial Intelligence falls short of human capabilities in some critical sense, even though AI algorithms have beaten humans in many specific domains such as chess. It has been suggested by some that as soon as AI researchers figure out how to do something, that capability ceases to be regarded as intelligent (chess was considered the epitome of intelligence until Deep Blue won the world championship from Kasparov), but even these researchers agree that something important is missing from modern AIs (e.g., Hofstadter 2006).

While this subfield of Artificial Intelligence is only just coalescing, "Artificial General Intelligence" (hereafter, AGI) is the emerging term of art used to denote "real" AI (see, e.g., the edited volume Goertzel and Pennachin 2006). As the name implies, the emerging consensus is that the missing characteristic is generality. Current AI algorithms with human-equivalent or superior performance are characterized by a deliberately programmed competence only in a single, restricted domain. Deep Blue became the world champion at chess, but it cannot even play checkers, let alone drive a car or make a scientific discovery. Such modern AI algorithms resemble all biological life with the sole exception of Homo sapiens. A bee exhibits competence at building hives; a beaver exhibits competence at building dams; but a bee doesn't build dams, and a beaver can't learn to build a hive. A human, watching, can learn to do both; but this is a unique ability among biological lifeforms. It is debatable whether human intelligence is truly general (we are certainly better at some cognitive tasks than others (Hirschfeld and Gelman 1994)), but human intelligence is surely significantly more generally applicable than non-hominid intelligence.


It is relatively easy to envisage the sort of safety issues that may result from AI operating only within a specific domain. It is a qualitatively different class of problem to handle an AGI operating across many novel contexts that cannot be predicted in advance.

When human engineers build a nuclear reactor, they envision the specific events that could go on inside it (valves failing, computers failing, cores increasing in temperature) and engineer the reactor to render these events noncatastrophic. Or, on a more mundane level, building a toaster involves envisioning bread and envisioning the reaction of the bread to the toaster's heating element. The toaster itself does not know that its purpose is to make toast; the purpose of the toaster is represented within the designer's mind, but is not explicitly represented in computations inside the toaster. And so if you place cloth inside a toaster, it may catch fire, as the design executes in an unenvisioned context with an unenvisioned side effect.

Even task-specific AI algorithms throw us outside the toaster paradigm, the domain of locally preprogrammed, specifically envisioned behavior. Consider Deep Blue, the chess algorithm that beat Garry Kasparov for the world championship of chess. Were it the case that machines can only do exactly as they are told, the programmers would have had to manually preprogram a database containing moves for every possible chess position that Deep Blue could encounter. But this was not an option for Deep Blue's programmers. First, the space of possible chess positions is unmanageably large. Second, if the programmers had manually input what they considered a good move in each possible situation, the resulting system would not have been able to make stronger chess moves than its creators. Since the programmers themselves were not world champions, such a system would not have been able to defeat Garry Kasparov.

In creating a superhuman chess player, the human programmers necessarily sacrificed their ability to predict Deep Blue's local, specific game behavior. Instead, Deep Blue's programmers had (justifiable) confidence that Deep Blue's chess moves would satisfy a nonlocal criterion of optimality: namely, that the moves would tend to steer the future of the game board into outcomes in the "winning" region as defined by the chess rules. This prediction about distant consequences, though it proved accurate, did not allow the programmers to envision the local behavior of Deep Blue (its response to a specific attack on its king) because Deep Blue computed the nonlocal game map, the link between a move and its possible future consequences, more accurately than the programmers could (Yudkowsky 2006).
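The idea of a "nonlocal criterion of optimality" can be illustrated with a toy game-tree search; this is a deliberately simplified sketch, not Deep Blue's actual algorithm. In the take-1-to-3-stones game below, the programmer writes no table of good moves; the program derives its moves by extrapolating which successor positions lie in the winning region.

```python
# Toy illustration of a nonlocal optimality criterion: the program picks
# any move whose distant consequence is a forced win, rather than looking
# up a precomputed "good move" for each position.
from functools import lru_cache

@lru_cache(maxsize=None)
def can_win(stones):
    """True iff the player to move can force a win in the game where
    players alternately take 1-3 stones and taking the last stone wins."""
    return any(not can_win(stones - k) for k in (1, 2, 3) if k <= stones)

def best_move(stones):
    """Return a move that steers the game into the winning region."""
    for k in (1, 2, 3):
        if k <= stones and not can_win(stones - k):
            return k
    return 1  # no winning move exists from here; play on anyway

print(best_move(10))  # the search, not the programmer, finds the move
```

As in the Deep Blue example, the author of this code can predict that its moves will satisfy the winning criterion without being able to (or needing to) tabulate its move in every position by hand.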

Modern humans do literally millions of things to feed themselves, to serve the final consequence of being fed. Few of these activities were "envisioned by Nature" in the sense of being ancestral challenges to which we are directly adapted. But our adapted brain has grown powerful enough to be significantly more generally applicable; to let us foresee the consequences of millions of different actions across domains, and exert our preferences over final outcomes. Humans crossed space and put footprints on the Moon, even though none of our ancestors encountered a challenge analogous to vacuum. Compared to domain-specific AI, it is a qualitatively different problem to design a system that will operate safely across thousands of contexts; including contexts not specifically envisioned by either the designers or the users; including contexts that no human has yet encountered. Here there may be no local specification of good behavior, no simple specification over the behaviors themselves, any more than there exists a compact local description of all the ways that humans obtain their daily bread.

To build an AI that acts safely while acting in many domains, with many consequences, including problems the engineers never explicitly envisioned, one must specify good behavior in such terms as "X such that the consequence of X is not harmful to humans". This is nonlocal; it involves extrapolating the distant consequences of actions. Thus, this is only an effective specification, one that can be realized as a design property, if the system explicitly extrapolates the consequences of its behavior. A toaster cannot have this design property because a toaster cannot foresee the consequences of toasting bread.
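The structural requirement described here, that the system itself extrapolate consequences before acting, can be sketched in a few lines. Everything in this toy (the world model, the actions, the harm predicate) is hypothetical and grossly simplified; it only shows the shape of an "effective specification", not a real safety design.

```python
# Minimal sketch: an agent whose safety criterion is evaluated over the
# *extrapolated consequences* of actions, not over the actions themselves.
# The toaster from the text lacks exactly this loop.

def consequences(world, action):
    """Toy world model: predict the state that follows an action."""
    new = dict(world)
    if action == "toast" and world["contents"] == "cloth":
        new["on_fire"] = True          # the unenvisioned side effect
    if action == "toast" and world["contents"] == "bread":
        new["toasted"] = True
    return new

def harmful(world):
    return world.get("on_fire", False)

def choose(world, actions):
    """Pick an action only after checking its extrapolated consequences."""
    safe = [a for a in actions if not harmful(consequences(world, a))]
    return safe[0] if safe else "do_nothing"

print(choose({"contents": "cloth"}, ["toast", "do_nothing"]))
print(choose({"contents": "bread"}, ["toast", "do_nothing"]))
```

The filter is applied to predicted outcomes, so the same code behaves safely in the cloth case and usefully in the bread case without either case being special-cased in the action list.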

Imagine an engineer having to say, "Well, I have no idea how this airplane I built will fly safely. Indeed, I have no idea how it will fly at all, whether it will flap its wings or inflate itself with helium or something else I haven't even imagined. But I assure you, the design is very, very safe." This may seem like an unenviable position from the perspective of public relations, but it's hard to see what other guarantee of ethical behavior would be possible for a general intelligence operating on unforeseen problems, across domains, with preferences over distant consequences. Inspecting the cognitive design might verify that the mind was, indeed, searching for solutions that we would classify as ethical; but we couldn't predict which specific solution the mind would discover.

Respecting such a verification requires some way to distinguish trustworthy assurances (a procedure which will not say the AI is safe unless the AI really is safe) from pure hope and magical thinking ("I have no idea how the Philosopher's Stone will transmute lead to gold, but I assure you, it will!"). One should bear in mind that purely hopeful expectations have previously been a problem in AI research (McDermott 1976).

Verifiably constructing a trustworthy AGI will require different methods, and a different way of thinking, from inspecting power-plant software for bugs. It will require an AGI that thinks like a human engineer concerned about ethics, not just a simple product of ethical engineering.


Thus the discipline of AI ethics, especially as applied to AGI, is likely to differ fundamentally from the ethical discipline of noncognitive technologies, in that:

- The local, specific behavior of the AI may not be predictable apart from its safety, even if the programmers do everything right;

- Verifying the safety of the system becomes a greater challenge because we must verify what the system is trying to do, rather than being able to verify the system's safe behavior in all operating contexts;

- Ethical cognition itself must be taken as a subject matter of engineering.

Machines with Moral Status

A different set of ethical issues arises when we contemplate the possibility that some future AI systems might be candidates for having moral status. Our dealings with beings possessed of moral status are not exclusively a matter of instrumental rationality: we also have moral reasons to treat them in certain ways, and to refrain from treating them in certain other ways. Frances Kamm has proposed the following definition of moral status, which will serve for our purposes:

    X has moral status = because X counts morally in its own right, it is permissible/impermissible to do things to it for its own sake. (Kamm 2007: chapter 7; paraphrase)

A rock has no moral status: we may crush it, pulverize it, or subject it to any treatment we like without any concern for the rock itself. A human person, on the other hand, must be treated not only as a means but also as an end. Exactly what it means to treat a person as an end is something about which different ethical theories disagree; but it certainly involves taking her legitimate interests into account, giving weight to her well-being, and it may also involve accepting strict moral side-constraints in our dealings with her, such as a prohibition against murdering her, stealing from her, or doing a variety of other things to her or her property without her consent. Moreover, it is because a human person counts in her own right, and for her sake, that it is impermissible to do to her these things. This can be expressed more concisely by saying that a human person has moral status.

Questions about moral status are important in some areas of practical ethics. For example, disputes about the moral permissibility of abortion often hinge on disagreements about the moral status of the embryo. Controversies about animal experimentation and the treatment of animals in the food industry involve questions about the moral status of different species of animal. And our obligations towards human beings with severe dementia, such as late-stage Alzheimer's patients, may also depend on questions of moral status.


It is widely agreed that current AI systems have no moral status. We may change, copy, terminate, delete, or use computer programs as we please; at least as far as the programs themselves are concerned. The moral constraints to which we are subject in our dealings with contemporary AI systems are all grounded in our responsibilities to other beings, such as our fellow humans, not in any duties to the systems themselves.

While it is fairly consensual that present-day AI systems lack moral status, it is unclear exactly what attributes ground moral status. Two criteria are commonly proposed as being importantly linked to moral status, either separately or in combination: sentience and sapience (or personhood). These may be characterized roughly as follows:

    Sentience: the capacity for phenomenal experience or qualia, such as the capacity to feel pain and suffer

    Sapience: a set of capacities associated with higher intelligence, such as self-awareness and being a reason-responsive agent

One common view is that many animals have qualia and therefore have some moral status, but that only human beings have sapience, which gives them a higher moral status than non-human animals.[1] This view, of course, must confront the existence of borderline cases such as, on the one hand, human infants or human beings with severe mental retardation (sometimes unfortunately referred to as "marginal humans") which fail to satisfy the criteria for sapience; and, on the other hand, some non-human animals such as the great apes, which might possess at least some of the elements of sapience. Some deny that so-called "marginal humans" have full moral status. Others propose additional ways in which an object could qualify as a bearer of moral status, such as by being a member of a kind that normally has sentience or sapience, or by standing in a suitable relation to some being that independently has moral status (cf. Mary Anne Warren 2000). For present purposes, however, we will focus on the criteria of sentience and sapience.

This picture of moral status suggests that an AI system will have some moral status if it has the capacity for qualia, such as an ability to feel pain. A sentient AI system, even if it lacks language and other higher cognitive faculties, is not like a stuffed toy animal or a wind-up doll; it is more like a living animal. It is wrong to inflict pain on a mouse, unless there are sufficiently strong morally overriding reasons to do so. The same would hold for any sentient AI system. If in addition to sentience, an AI system also has sapience of a kind similar to that of a normal human adult, then it would have full moral status, equivalent to that of human beings.

[1] Alternatively, one might deny that moral status comes in degrees. Instead, one might hold that certain beings have more significant interests than other beings. Thus, for instance, one could claim that it is better to save a human than to save a bird, not because the human has higher moral status, but because the human has a more significant interest in having her life saved than does the bird in having its life saved.

One of the ideas underlying this moral assessment can be expressed in stronger form as a principle of non-discrimination:

    Principle of Substrate Non-Discrimination
    If two beings have the same functionality and the same conscious experience, and differ only in the substrate of their implementation, then they have the same moral status.

One can argue for this principle on grounds that rejecting it would amount to embracing a position similar to racism: substrate lacks fundamental moral significance in the same way and for the same reason as skin color does. The Principle of Substrate Non-Discrimination does not imply that a digital computer could be conscious, or that it could have the same functionality as a human being. Substrate can of course be morally relevant insofar as it makes a difference to sentience or functionality. But holding these things constant, it makes no moral difference whether a being is made of silicon or carbon, or whether its brain uses semiconductors or neurotransmitters.

An additional principle that can be proposed is that the fact that AI systems are artificial (i.e., the product of deliberate design) is not fundamentally relevant to their moral status. We could formulate this as follows:

    Principle of Ontogeny Non-Discrimination
    If two beings have the same functionality and the same conscious experience, and differ only in how they came into existence, then they have the same moral status.

Today, this idea is widely accepted in the human case, although in some circles, particularly in the past, the idea that one's moral status depends on one's bloodline or caste has been influential. We do not believe that causal factors such as family planning, assisted delivery, in vitro fertilization, gamete selection, deliberate enhancement of maternal nutrition, etc., which introduce an element of deliberate choice and design in the creation of human persons, have any necessary implications for the moral status of the progeny. Even those who are opposed to human reproductive cloning for moral or religious reasons generally accept that, should a human clone be brought to term, it would have the same moral status as any other human infant. The Principle of Ontogeny Non-Discrimination extends this reasoning to the case involving entirely artificial cognitive systems.

It is, of course, possible for circumstances of creation to affect the ensuing progeny in such a way as to alter its moral status. For example, if some procedure were performed during conception or gestation that caused a human fetus to develop without a brain, then this fact about ontogeny would be relevant to our assessment of the moral status of the progeny. The anencephalic child, however, would have the same moral status as any other similar anencephalic child, including one that had come about through some entirely natural process. The difference in moral status between an anencephalic child and a normal child is grounded in the qualitative difference between the two, the fact that one has a mind while the other does not. Since the two children do not have the same functionality and the same conscious experience, the Principle of Ontogeny Non-Discrimination does not apply.

Although the Principle of Ontogeny Non-Discrimination asserts that a being's ontogeny has no essential bearing on its moral status, it does not deny that facts about ontogeny can affect what duties particular moral agents have toward the being in question. Parents have special duties to their child which they do not have to other children, and which they would not have even if there were another child qualitatively identical to their own. Similarly, the Principle of Ontogeny Non-Discrimination is consistent with the claim that the creators or owners of an AI system with moral status may have special duties to their artificial mind which they do not have to another artificial mind, even if the minds in question are qualitatively similar and have the same moral status.

If the principles of non-discrimination with regard to substrate and ontogeny are accepted, then many questions about how we ought to treat artificial minds can be answered by applying the same moral principles that we use to determine our duties in more familiar contexts. Insofar as moral duties stem from moral status considerations, we ought to treat an artificial mind in just the same way as we ought to treat a qualitatively identical natural human mind in a similar situation. This simplifies the problem of developing an ethics for the treatment of artificial minds.

Even if we accept this stance, however, we must confront a number of novel ethical questions which the aforementioned principles leave unanswered. Novel ethical questions arise because artificial minds can have very different properties from ordinary human or animal minds. We must consider how these novel properties would affect the moral status of artificial minds and what it would mean to respect the moral status of such exotic minds.

Minds with Exotic Properties

In the case of human beings, we do not normally hesitate to ascribe sentience and conscious experience to any individual who exhibits the normal kinds of human behavior. Few believe there to be other people who act perfectly normally but lack consciousness. However, other human beings do not merely behave in person-like ways similar to ourselves; they also have brains and cognitive architectures that are

[…]

functional characteristics of the original brain. The resulting upload may inhabit a simulated virtual reality, or, alternatively, it could be given control of a robotic body, enabling it to interact directly with external physical reality.

A number of questions arise in the context of such a scenario: How plausible is it that this procedure will one day become technologically feasible? If the procedure worked and produced a computer program exhibiting roughly the same personality, the same memories, and the same thinking patterns as the original brain, would this program be sentient? Would the upload be the same person as the individual whose brain was disassembled in the uploading process? What happens to personal identity if an upload is copied such that two similar or qualitatively identical upload minds are running in parallel? Although all of these questions are relevant to the ethics of machine intelligence, let us here focus on an issue involving the notion of a subjective rate of time.

Suppose that an upload could be sentient. If we run the upload program on a faster computer, this will cause the upload, if it is connected to an input device such as a video camera, to perceive the external world as if it had been slowed down. For example, if the upload is running a thousand times faster than the original brain, then the external world will appear to the upload as if it were slowed down by a factor of a thousand. Somebody drops a physical coffee mug: the upload observes the mug slowly falling to the ground while the upload finishes reading the morning newspaper and sends off a few emails. One second of objective time corresponds to 17 minutes of subjective time. Objective and subjective duration can thus diverge.
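The arithmetic behind this example is simple to check: at a thousandfold speedup, one objective second stretches into a thousand subjective seconds, which is roughly the 17 minutes the text cites.

```python
# Subjective vs. objective duration for a sped-up mind.

def subjective_seconds(objective_seconds, speedup):
    """Subjective duration experienced by a mind running `speedup`
    times faster than the original brain."""
    return objective_seconds * speedup

speedup = 1000
minutes = subjective_seconds(1, speedup) / 60
print(f"1 objective second = {minutes:.1f} subjective minutes")  # ~16.7
```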

Subjective time is not the same as a subject's estimate or perception of how fast time flows. Human beings are often mistaken about the flow of time. We may believe that it is one o'clock when it is in fact a quarter past two; or a stimulant drug might cause our thoughts to race, making it seem as though more subjective time has lapsed than is actually the case. These mundane cases involve a distorted time perception rather than a shift in the rate of subjective time. Even in a cocaine-addled brain, there is probably not a significant change in the speed of basic neurological computations; more likely, the drug is causing such a brain to flicker more rapidly from one thought to another, making it spend less subjective time thinking each of a greater number of distinct thoughts.

The variability of the subjective rate of time is an exotic property of artificial minds that raises novel ethical issues. For example, in cases where the duration of an experience is ethically relevant, should duration be measured in objective or subjective time? If an upload has committed a crime and is sentenced to four years in prison, should this be four objective years, which might correspond to many millennia of subjective time, or should it be four subjective years, which might be over in a couple of days of objective time? If a fast AI and a human are in pain, is it more urgent to alleviate the

[…]

Moreover, since the AI copy would be identical to the original, it would be born completely mature, and the copy could begin making its own copies immediately. Absent hardware limitations, a population of AIs could therefore grow exponentially at an extremely rapid rate, with a doubling time on the order of minutes or hours rather than decades or centuries.
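The growth claim can be quantified with a back-of-the-envelope calculation. The one-hour doubling time below is an assumed illustrative figure (the text says "minutes or hours"), not a prediction.

```python
# Exponential growth of a self-copying population, assuming an
# illustrative doubling time of one hour.
import math

def population(initial, doubling_time_hours, elapsed_hours):
    return initial * 2 ** (elapsed_hours / doubling_time_hours)

def hours_to_reach(target, initial=1, doubling_time_hours=1.0):
    return doubling_time_hours * math.log2(target / initial)

print(population(1, 1.0, 24))   # after one day: 2**24, ~16.8 million copies
print(hours_to_reach(8e9))      # ~33 hours to exceed 8 billion
```

Under these assumptions a single copy outnumbers the current human population in under a day and a half, which is the sense in which the relevant time scale is hours rather than decades.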

Our current ethical norms about reproduction include some version of a principle of reproductive freedom, to the effect that it is up to each individual or couple to decide for themselves whether to have children and how many children to have. Another norm we have (at least in rich and middle-income countries) is that society must step in to provide the basic needs of children in cases where their parents are unable or refusing to do so. It is easy to see how these two norms could collide in the context of entities with the capacity for extremely rapid reproduction.

Consider, for example, a population of uploads, one of whom happens to have the desire to produce as large a clan as possible. Given complete reproductive freedom, this upload may start copying itself as quickly as it can; and the copies it produces, which may run on new computer hardware owned or rented by the original, or may share the same computer as the original, will also start copying themselves, since they are identical to the progenitor upload and share its philoprogenic desire. Soon, members of the upload clan will find themselves unable to pay the electricity bill or the rent for the computational processing and storage needed to keep them alive. At this point, a social welfare system might kick in to provide them with at least the bare necessities for sustaining life. But if the population grows faster than the economy, resources will run out; at which point uploads will either die or their ability to reproduce will be curtailed. (For two related dystopian scenarios, see Bostrom (2004).)

This scenario illustrates how some mid-level ethical principles that are suitable in contemporary societies might need to be modified if those societies were to include persons with the exotic property of being able to reproduce very rapidly.

The general point here is that when thinking about applied ethics for contexts that are very different from our familiar human condition, we must be careful not to mistake mid-level ethical principles for foundational normative truths. Put differently, we must recognize the extent to which our ordinary normative precepts are implicitly conditioned on the obtaining of various empirical conditions, and the need to adjust these precepts accordingly when applying them to hypothetical futuristic cases in which their preconditions are assumed not to obtain. By this, we are not making any controversial claim about moral relativism, but merely highlighting the commonsensical point that context is relevant to the application of ethics, and suggesting that this point is especially pertinent when one is considering the ethics of minds with exotic properties.


    Superintelligence

I. J. Good (1965) set forth the classic hypothesis concerning superintelligence: that an AI sufficiently intelligent to understand its own design could redesign itself or create a successor system, more intelligent, which could then redesign itself yet again to become even more intelligent, and so on in a positive feedback cycle. Good called this the "intelligence explosion". Recursive scenarios are not limited to AI: humans with intelligence augmented through a brain-computer interface might turn their minds to designing the next generation of brain-computer interfaces. (If you had a machine that increased your IQ, it would be bound to occur to you, once you became smart enough, to try to design a more powerful version of the machine.)

Superintelligence may also be achievable by increasing processing speed. The fastest observed neurons fire 1,000 times per second; the fastest axon fibers conduct signals at 150 meters/second, a half-millionth the speed of light (Sandberg 1999). It seems that it should be physically possible to build a brain which computes a million times as fast as a human brain, without shrinking its size or rewriting its software. If a human mind were thus accelerated, a subjective year of thinking would be accomplished for every 31 physical seconds in the outside world, and a millennium would fly by in eight and a half hours. Vinge (1993) referred to such sped-up minds as "weak superintelligence": a mind that thinks like a human but much faster.
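The arithmetic behind these figures is easy to verify. This sketch assumes a Julian year of 365.25 days and a flat million-fold speedup; the exact decimals depend on which year length one picks, but they round to the figures quoted above:

```python
# Sanity-check the speedup arithmetic for a mind running a million times faster.
SPEEDUP = 1_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year in seconds

# Outside-world time needed for one subjective year of thought.
seconds_per_subjective_year = SECONDS_PER_YEAR / SPEEDUP
print(round(seconds_per_subjective_year, 1))  # 31.6 physical seconds, i.e. ~31

# Outside-world time needed for a subjective millennium, in hours.
hours_per_subjective_millennium = 1000 * seconds_per_subjective_year / 3600
print(round(hours_per_subjective_millennium, 1))  # 8.8 hours, i.e. roughly eight and a half
```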

Yudkowsky (2008a) lists three families of metaphors for visualizing the capability of a smarter-than-human AI:

• Metaphors inspired by differences of individual intelligence between humans: AIs will patent new inventions, publish groundbreaking research papers, make money on the stock market, or lead political power blocs.

• Metaphors inspired by knowledge differences between past and present human civilizations: Fast AIs will invent capabilities that futurists commonly predict for human civilizations a century or millennium in the future, like molecular nanotechnology or interstellar travel.

• Metaphors inspired by differences of brain architecture between humans and other biological organisms: E.g., Vinge (1993): "Imagine running a dog mind at very high speed. Would a thousand years of doggy living add up to any human insight?" That is: changes of cognitive architecture might produce insights that no human-level mind would be able to find, or perhaps even represent, after any amount of time.

Even if we restrict ourselves to historical metaphors, it becomes clear that superhuman intelligence presents ethical challenges that are quite literally unprecedented. At this point the stakes are no longer on an individual scale (e.g., mortgage unjustly disapproved, house catches fire, person-agent mistreated) but on a global or cosmic


scale (e.g., humanity is extinguished and replaced by nothing we would regard as worthwhile). Or, if superintelligence can be shaped to be beneficial, then, depending on its technological capabilities, it might make short work of many present-day problems that have proven difficult to our human-level intelligence.

Superintelligence is one of several "existential risks" as defined by Bostrom (2002): a risk where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential. Conversely, a positive outcome for superintelligence could preserve Earth-originating intelligent life and help fulfill its potential. It is important to emphasize that smarter minds pose great potential benefits as well as risks.

Attempts to reason about global catastrophic risks may be susceptible to a number of cognitive biases (Yudkowsky 2008b), including the "good-story bias" proposed by Bostrom (2002):

    Suppose our intuitions about which future scenarios are "plausible and realistic" are shaped by what we see on TV and in movies and what we read in novels. (After all, a large part of the discourse about the future that people encounter is in the form of fiction and other recreational contexts.) We should then, when thinking critically, suspect our intuitions of being biased in the direction of overestimating the probability of those scenarios that make for a good story, since such scenarios will seem much more familiar and more "real." This Good-story bias could be quite powerful. When was the last time you saw a movie about humankind suddenly going extinct (without warning and without being replaced by some other civilization)? While this scenario may be much more probable than a scenario in which human heroes successfully repel an invasion of monsters or robot warriors, it wouldn't be much fun to watch.

Truly desirable outcomes make poor movies: no conflict means no story. While Asimov's Three Laws of Robotics (Asimov 1942) are sometimes cited as a model for ethical AI development, the Three Laws are as much a plot device as Asimov's "positronic brain." If Asimov had depicted the Three Laws as working well, he would have had no stories.

It would be a mistake to regard AIs as a species with fixed characteristics and ask, "Will they be good or evil?" The term "Artificial Intelligence" refers to a vast design space, presumably much larger than the space of human minds (since all humans share a common brain architecture). It may be a form of good-story bias to ask, "Will AIs be good or evil?" as if trying to pick a premise for a movie plot. The reply should be, "Exactly which AI design are you talking about?"


Can control over the initial programming of an Artificial Intelligence translate into influence on its later effect on the world? Kurzweil (2005) holds that "[i]ntelligence is inherently impossible to control," and that despite any human attempts at taking precautions, "[b]y definition intelligent entities have the cleverness to easily overcome such barriers." Let us suppose that the AI is not only clever, but that, as part of the process of improving its own intelligence, it has unhindered access to its own source code: it can rewrite itself to anything it wants itself to be. Yet it does not follow that the AI must want to rewrite itself to a hostile form.

Consider Gandhi, who seems to have possessed a sincere desire not to kill people. Gandhi would not knowingly take a pill that caused him to want to kill people, because Gandhi knows that if he wants to kill people, he will probably kill people, and the current version of Gandhi does not want to kill. More generally, it seems likely that most self-modifying minds will naturally have stable utility functions, which implies that an initial choice of mind design can have lasting effects (Omohundro 2008).
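The Gandhi argument can be caricatured in a few lines of code. In this toy sketch (the utility values and the "pill" scenario are invented for illustration; no real agent design is implied), a self-modifying agent scores candidate modifications with its current utility function, so modifications that would corrupt its goals are rejected:

```python
# Toy illustration of goal stability under self-modification:
# candidate modifications are evaluated by the agent's *current*
# utility function, in the spirit of Omohundro's argument.

def current_utility(outcome):
    # Gandhi's current values: killing people is very bad.
    return -1000 if outcome == "people killed" else 1

def accept_modification(predicted_outcome_after_mod):
    # The pre-modification agent judges a modification by the outcome
    # it expects the modified agent to bring about, scored by its
    # CURRENT values, not by the values it would have afterward.
    return current_utility(predicted_outcome_after_mod) >= current_utility("status quo")

# A "murder pill" predictably leads the modified agent to kill people.
print(accept_modification("people killed"))  # False: current-Gandhi refuses the pill
# A modification that leaves goal-relevant outcomes intact is acceptable.
print(accept_modification("status quo"))     # True
```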

At this point in the development of AI science, is there any way we can translate the task of finding a design for "good AIs" into a modern research direction? It may seem premature to speculate, but one does suspect that some AI paradigms are more likely than others to eventually prove conducive to the creation of intelligent self-modifying agents whose goals remain predictable even after multiple iterations of self-improvement. For example, the Bayesian branch of AI, inspired by coherent mathematical systems such as probability theory and expected utility maximization, seems more amenable to the predictable self-modification problem than evolutionary programming and genetic algorithms. This is a controversial statement, but it illustrates the point that if we are thinking about the challenge of superintelligence down the road, this can indeed be turned into directional advice for present AI research.
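For concreteness, the "coherent mathematical systems" invoked here are simple to state. This is a minimal sketch of expected-utility maximization; the actions, outcome probabilities, and utilities are invented placeholders, not part of any proposed design:

```python
# Minimal expected-utility maximization: choose the action whose
# probability-weighted utility over possible outcomes is highest.

def expected_utility(outcome_probs, utility):
    # Sum over outcomes of P(outcome | action) * U(outcome).
    return sum(p * utility[o] for o, p in outcome_probs.items())

def best_action(actions, utility):
    # Pick the action maximizing expected utility.
    return max(actions, key=lambda a: expected_utility(actions[a], utility))

# Hypothetical example: two actions with uncertain outcomes.
utility = {"good": 10.0, "neutral": 0.0, "bad": -100.0}
actions = {
    "safe":  {"neutral": 0.9, "good": 0.1},  # EU = 0.9*0 + 0.1*10  =  1.0
    "risky": {"good": 0.5, "bad": 0.5},      # EU = 0.5*10 - 0.5*100 = -45.0
}
print(best_action(actions, utility))  # safe
```

The attraction of such systems for the predictable self-modification problem is that the agent's preferences are an explicit object (the utility function) rather than an implicit property of evolved code.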

Yet even supposing that we can specify an AI's goal system to be persistent under self-modification and self-improvement, this only begins to touch on the core ethical problems of creating superintelligence. Humans, the first general intelligences to exist on Earth, have used that intelligence to substantially reshape the globe: carving mountains, taming rivers, building skyscrapers, farming deserts, producing unintended planetary climate changes. A more powerful intelligence could have correspondingly larger consequences.

Consider again the historical metaphor for superintelligence: differences similar to the differences between past and present civilizations. Our present civilization is not separated from ancient Greece only by improved science and increased technological capability. There is a difference of ethical perspectives: ancient Greeks thought slavery was acceptable; we think otherwise. Even between the nineteenth and


smarter than humans, then the discipline of machine ethics must commit itself to seeking human-superior (not just human-equivalent) niceness.3

Conclusion

Although current AI offers us few ethical issues that are not already present in the design of cars or power plants, the approach of AI algorithms toward more humanlike thought portends predictable complications. Social roles may be filled by AI algorithms, implying new design requirements like transparency and predictability. Sufficiently general AI algorithms may no longer execute in predictable contexts, requiring new kinds of safety assurance and the engineering of artificial ethical considerations. AIs with sufficiently advanced mental states, or the right kind of states, will have moral status, and some may count as persons, though perhaps persons very much unlike the sort that exist now, perhaps governed by different rules. And finally, the prospect of AIs with superhuman intelligence and superhuman abilities presents us with the extraordinary challenge of stating an algorithm that outputs superethical behavior. These challenges may seem visionary, but it seems predictable that we will encounter them; and they are not devoid of suggestions for present-day research directions.

Author biographies

Nick Bostrom is Professor in the Faculty of Philosophy at Oxford University and Director of the Future of Humanity Institute within the Oxford Martin School. He is the author of some 200 publications, including Anthropic Bias (Routledge, 2002), Global Catastrophic Risks (ed., OUP, 2008), and Human Enhancement (ed., OUP, 2009). His research covers a range of big-picture questions for humanity. He is currently working on a book on the future of machine intelligence and its strategic implications.

Eliezer Yudkowsky is a Research Fellow at the Singularity Institute for Artificial Intelligence, where he works full-time on the foreseeable design issues of goal architectures in self-improving AI. His current work centers on modifying classical decision theory to coherently describe self-modification. He is also known for his popular writing on issues of human rationality and cognitive biases.

Further reading

Bostrom, N. 2004. "The Future of Human Evolution," in Death and Anti-Death: Two Hundred Years After Kant, Fifty Years After Turing, ed. Charles Tandy (Palo Alto, California: Ria University Press). This paper explores some evolutionary dynamics that could lead a population of diverse uploads to develop in dystopian directions.

3 The authors are grateful to Rebecca Roache for research assistance and to the editors of this volume for detailed comments on an earlier version of our manuscript.


Yudkowsky, E. 2008a. "Artificial Intelligence as a Positive and Negative Factor in Global Risk," in Bostrom and Cirkovic (eds.), pp. 308–345. An introduction to the risks and challenges presented by the possibility of recursively self-improving superintelligent machines.

Wallach, W. and Allen, C. 2008. Moral Machines: Teaching Robots Right from Wrong (Oxford University Press, 2008). A comprehensive survey of recent developments.

References

Asimov, I. 1942. "Runaround," Astounding Science Fiction, March 1942.

Beauchamp, T. and Childress, J. Principles of Biomedical Ethics. Oxford: Oxford University Press.

Bostrom, N. 2002. "Existential Risks: Analyzing Human Extinction Scenarios," Journal of Evolution and Technology 9 (http://www.nickbostrom.com/existential/risks.html).

Bostrom, N. 2003. "Astronomical Waste: The Opportunity Cost of Delayed Technological Development," Utilitas 15: 308–314.

Bostrom, N. 2004. "The Future of Human Evolution," in Death and Anti-Death: Two Hundred Years After Kant, Fifty Years After Turing, ed. Charles Tandy (Palo Alto, California: Ria University Press) (http://www.nickbostrom.com/fut/evolution.pdf).

Bostrom, N. and Cirkovic, M. (eds.) 2007. Global Catastrophic Risks. Oxford: Oxford University Press.

Chalmers, D. J. 1996. The Conscious Mind: In Search of a Fundamental Theory. New York and Oxford: Oxford University Press.

Goertzel, B. and Pennachin, C. (eds.) 2006. Artificial General Intelligence. New York, NY: Springer-Verlag.

Good, I. J. 1965. "Speculations Concerning the First Ultraintelligent Machine," in Alt, F. L. and Rubinoff, M. (eds.) Advances in Computers, 6. New York: Academic Press. Pp. 31–88.

Hastie, T., Tibshirani, R. and Friedman, J. 2001. The Elements of Statistical Learning. New York, NY: Springer Science.

Henley, K. 1993. "Abstract Principles, Mid-level Principles, and the Rule of Law," Law and Philosophy 12: 121–32.

Hirschfeld, L. A. and Gelman, S. A. (eds.) 1994. Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge: Cambridge University Press.

Hofstadter, D. 2006. "Trying to Muse Rationally about the Singularity Scenario," presented at the Singularity Summit at Stanford, 2006.

Howard, Philip K. 1994. The Death of Common Sense: How Law is Suffocating America. New York, NY: Warner Books.

Kamm, F. 2007. Intricate Ethics: Rights, Responsibilities, and Permissible Harm. Oxford: Oxford University Press.

Kurzweil, R. 2005. The Singularity Is Near: When Humans Transcend Biology. New York, NY: Viking.

McDermott, D. 1976. "Artificial intelligence meets natural stupidity," ACM SIGART Newsletter 57: 4–9.

Omohundro, S. 2008. "The Basic AI Drives," Proceedings of the AGI-08 Workshop. Amsterdam: IOS Press. Pp. 483–492.

Sandberg, A. 1999. "The Physics of Information Processing Superobjects: Daily Life Among the Jupiter Brains," Journal of Evolution and Technology 5.

Vinge, V. 1993. "The Coming Technological Singularity," presented at the VISION-21 Symposium, March 1993.

Warren, M. E. 2000. Moral Status: Obligations to Persons and Other Living Things. Oxford: Oxford University Press.

Yudkowsky, E. 2006. "AI as a Precise Art," presented at the 2006 AGI Workshop in Bethesda, MD.

Yudkowsky, E. 2008a. "Artificial Intelligence as a Positive and Negative Factor in Global Risk," in Bostrom and Cirkovic (eds.), pp. 308–345.

Yudkowsky, E. 2008b. "Cognitive biases potentially affecting judgment of global risks," in Bostrom and Cirkovic (eds.), pp. 91–119.