automate the boring stuff with python: albert sweigart · 2015-07-13 · automate the boring stuff...


Automate the Boring Stuff with Python: Practical Programming for Total Beginners

Albert Sweigart

Published by No Starch Press

For my nephew Jack

About the Author

Al Sweigart is a software developer and tech book author living in San Francisco. Python is his favorite programming language, and he is the developer of several open source modules for it. His other books are freely available under a Creative Commons license on his website http://www.inventwithpython.com/. His cat weighs 14 pounds.

About the Tech Reviewer

Ari Lacenski is a developer of Android applications and Python software. She lives in San Francisco, where she writes about Android programming at http://gradlewhy.ghost.io/ and mentors with Women Who Code. She’s also a folk guitarist.

Acknowledgments

I couldn’t have written a book like this without the help of a lot of people. I’d like to thank Bill Pollock; my editors, Laurel Chun, Leslie Shen, Greg Poulos, and Jennifer Griffith-Delgado; and the rest of the staff at No Starch Press for their invaluable help. Thanks to my tech reviewer, Ari Lacenski, for great suggestions, edits, and support.

Many thanks to our Benevolent Dictator For Life, Guido van Rossum, and everyone at the Python Software Foundation for their great work. The Python community is the best one I’ve found in the tech industry.

Finally, I would like to thank my family, friends, and the gang at Shotwell’s for not minding the busy life I’ve had while writing this book. Cheers!

Introduction

“You’ve just done in two hours what it takes the three of us two days to do.” My college roommate was working at a retail electronics store in the early 2000s. Occasionally, the store would receive a spreadsheet of thousands of product prices from its competitor. A team of three employees would print the spreadsheet onto a thick stack of paper and split it among themselves. For each product price, they would look up their store’s price and note all the products that their competitors sold for less. It usually took a couple of days.

“You know, I could write a program to do that if you have the original file for the printouts,” my roommate told them, when he saw them sitting on the floor with papers scattered and stacked around them.

After a couple of hours, he had a short program that read a competitor’s price from a file, found the product in the store’s database, and noted whether the competitor was cheaper. He was still new to programming, and he spent most of his time looking up documentation in a programming book. The actual program took only a few seconds to run. My roommate and his co-workers took an extra-long lunch that day.

This is the power of computer programming. A computer is like a Swiss Army knife that you can configure for countless tasks. Many people spend hours clicking and typing to perform repetitive tasks, unaware that the machine they’re using could do their job in seconds if they gave it the right instructions.

Whom Is This Book For?

Software is at the core of so many of the tools we use today: Nearly everyone uses social networks to communicate, many people have Internet-connected computers in their phones, and most office jobs involve interacting with a computer to get work done. As a result, the demand for people who can code has skyrocketed. Countless books, interactive web tutorials, and developer boot camps promise to turn ambitious beginners into software engineers with six-figure salaries.

This book is not for those people. It’s for everyone else.

On its own, this book won’t turn you into a professional software developer any more than a few guitar lessons will turn you into a rock star. But if you’re an office worker, administrator, academic, or anyone else who uses a computer for work or fun, you will learn the basics of programming so that you can automate simple tasks such as the following:

Moving and renaming thousands of files and sorting them into folders
Filling out online forms, no typing required
Downloading files or copying text from a website whenever it updates
Having your computer text you custom notifications
Updating or formatting Excel spreadsheets
Checking your email and sending out prewritten responses

These tasks are simple but time-consuming for humans, and they’re often so trivial or specific that there’s no ready-made software to perform them. Armed with a little bit of programming knowledge, you can have your computer do these tasks for you.

Conventions

This book is not designed as a reference manual; it’s a guide for beginners. The coding style sometimes goes against best practices (for example, some programs use global variables), but that’s a trade-off to make the code simpler to learn. This book is made for people to write throwaway code, so there’s not much time spent on style and elegance. Sophisticated programming concepts (like object-oriented programming, list comprehensions, and generators) aren’t covered because of the complexity they add. Veteran programmers may point out ways the code in this book could be changed to improve efficiency, but this book is mostly concerned with getting programs to work with the least amount of effort.

What Is Programming?

Television shows and films often show programmers furiously typing cryptic streams of 1s and 0s on glowing screens, but modern programming isn’t that mysterious. Programming is simply the act of entering instructions for the computer to perform. These instructions might crunch some numbers, modify text, look up information in files, or communicate with other computers over the Internet.

All programs use basic instructions as building blocks. Here are a few of the most common ones, in English:

“Do this; then do that.”
“If this condition is true, perform this action; otherwise, do that action.”
“Do this action that number of times.”
“Keep doing that until this condition is true.”
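For readers curious how those four English instructions might look in Python, here is a minimal sketch. The variable names and values (steps, temperature, countdown) are invented for illustration and do not come from the book; the if, for, and while statements it uses are only introduced in later chapters.

```python
# Illustrative values only; none of these names come from the book.
steps = []

# "Do this; then do that."
steps.append('first')
steps.append('second')

# "If this condition is true, perform this action; otherwise, do that action."
temperature = 30
if temperature > 25:
    steps.append('warm')
else:
    steps.append('cool')

# "Do this action that number of times."
for i in range(3):
    steps.append('repeat')

# "Keep doing that until this condition is true."
countdown = 3
while countdown != 0:  # the loop stops once countdown reaches 0
    countdown = countdown - 1

print(steps)
print(countdown)
```

Each comment quotes the English instruction that the lines beneath it correspond to.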

You can combine these building blocks to implement more intricate decisions, too. For example, here are the programming instructions, called the source code, for a simple program written in the Python programming language. Starting at the top, the Python software runs each line of code (some lines are run only if a certain condition is true or else Python runs some other line) until it reaches the bottom.

➊ passwordFile = open('SecretPasswordFile.txt')
➋ secretPassword = passwordFile.read()
➌ print('Enter your password.')
  typedPassword = input()
➍ if typedPassword == secretPassword:
➎     print('Access granted')
➏     if typedPassword == '12345':
➐         print('That password is one that an idiot puts on their luggage.')
  else:
➑     print('Access denied')

You might not know anything about programming, but you could probably make a reasonable guess at what the previous code does just by reading it. First, the file SecretPasswordFile.txt is opened ➊, and the secret password in it is read ➋. Then, the user is prompted to input a password (from the keyboard) ➌. These two passwords are compared ➍, and if they’re the same, the program prints Access granted to the screen ➎. Next, the program checks to see whether the password is 12345 ➏ and hints that this choice might not be the best for a password ➐. If the passwords are not the same, the program prints Access denied to the screen ➑.
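The program above needs a SecretPasswordFile.txt file and keyboard input before it will run. As a rough sketch of the same decision logic that runs on its own, the hypothetical function below (checkPassword is my name for it, not the book’s) takes both passwords as arguments and returns the lines the program would print.

```python
def checkPassword(typedPassword, secretPassword):
    """Return the lines the password program would print (illustrative helper)."""
    messages = []
    if typedPassword == secretPassword:
        messages.append('Access granted')
        # This nested check only runs when the passwords already match.
        if typedPassword == '12345':
            messages.append('That password is one that an idiot puts on their luggage.')
    else:
        messages.append('Access denied')
    return messages

print(checkPassword('12345', '12345'))
print(checkPassword('swordfish', '12345'))
```

Functions are covered later in the book; the point here is only that the indentation decides which lines belong to which condition.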

What Is Python?

Python refers to the Python programming language (with syntax rules for writing what is considered valid Python code) and the Python interpreter software that reads source code (written in the Python language) and performs its instructions. The Python interpreter is free to download from http://python.org/, and there are versions for Linux, OS X, and Windows.

The name Python comes from the surreal British comedy group Monty Python, not from the snake. Python programmers are affectionately called Pythonistas, and both Monty Python and serpentine references usually pepper Python tutorials and documentation.

Programmers Don’t Need to Know Much Math

The most common anxiety I hear about learning to program is that people think it requires a lot of math. Actually, most programming doesn’t require math beyond basic arithmetic. In fact, being good at programming isn’t that different from being good at solving Sudoku puzzles.

To solve a Sudoku puzzle, the numbers 1 through 9 must be filled in for each row, each column, and each 3×3 interior square of the full 9×9 board. You find a solution by applying deduction and logic from the starting numbers. For example, since 5 appears in the top left of the Sudoku puzzle shown in Figure I-1, it cannot appear elsewhere in the top row, in the leftmost column, or in the top-left 3×3 square. Solving one row, column, or square at a time will provide more number clues for the rest of the puzzle.

Figure I-1. A new Sudoku puzzle (left) and its solution (right). Despite using numbers, Sudoku doesn’t involve much math. (Images © Wikimedia Commons)

Just because Sudoku involves numbers doesn’t mean you have to be good at math to figure out the solution. The same is true of programming. Like solving a Sudoku puzzle, writing programs involves breaking down a problem into individual, detailed steps. Similarly, when debugging programs (that is, finding and fixing errors), you’ll patiently observe what the program is doing and find the cause of the bugs. And like all skills, the more you program, the better you’ll become.

Programming Is a Creative Activity

Programming is a creative task, somewhat like constructing a castle out of LEGO bricks. You start with a basic idea of what you want your castle to look like and inventory your available blocks. Then you start building. Once you’ve finished building your program, you can pretty up your code just like you would your castle.

The difference between programming and other creative activities is that when programming, you have all the raw materials you need in your computer; you don’t need to buy any additional canvas, paint, film, yarn, LEGO bricks, or electronic components. When your program is written, it can easily be shared online with the entire world. And though you’ll make mistakes when programming, the activity is still a lot of fun.

About This Book

The first part of this book covers basic Python programming concepts, and the second part covers various tasks you can have your computer automate. Each chapter in the second part has project programs for you to study. Here’s a brief rundown of what you’ll find in each chapter:

Part I

Chapter 1. Covers expressions, the most basic type of Python instruction, and how to use the Python interactive shell software to experiment with code.
Chapter 2. Explains how to make programs decide which instructions to execute so your code can intelligently respond to different conditions.
Chapter 3. Instructs you on how to define your own functions so that you can organize your code into more manageable chunks.
Chapter 4. Introduces the list data type and explains how to organize data.
Chapter 5. Introduces the dictionary data type and shows you more powerful ways to organize data.
Chapter 6. Covers working with text data (called strings in Python).

Part II

Chapter 7. Covers how Python can manipulate strings and search for text patterns with regular expressions.
Chapter 8. Explains how your programs can read the contents of text files and save information to files on your hard drive.
Chapter 9. Shows how Python can copy, move, rename, and delete large numbers of files much faster than a human user can. It also explains compressing and decompressing files.
Chapter 10. Shows how to use Python’s various bug-finding and bug-fixing tools.
Chapter 11. Shows how to write programs that can automatically download web pages and parse them for information. This is called web scraping.
Chapter 12. Covers programmatically manipulating Excel spreadsheets so that you don’t have to read them. This is helpful when the number of documents you have to analyze is in the hundreds or thousands.
Chapter 13. Covers programmatically reading Word and PDF documents.
Chapter 14. Continues to explain how to programmatically manipulate documents with CSV and JSON files.
Chapter 15. Explains how time and dates are handled by Python programs and how to schedule your computer to perform tasks at certain times. This chapter also shows how your Python programs can launch non-Python programs.
Chapter 16. Explains how to write programs that can send emails and text messages on your behalf.
Chapter 17. Explains how to programmatically manipulate images such as JPEG or PNG files.
Chapter 18. Explains how to programmatically control the mouse and keyboard to automate clicks and keypresses.

Downloading and Installing Python

You can download Python for Windows, OS X, and Ubuntu for free from http://python.org/downloads/. If you download the latest version from the website’s download page, all of the programs in this book should work.

WARNING

Be sure to download a version of Python 3 (such as 3.4.0). The programs in this book are written to run on Python 3 and may not run correctly, if at all, on Python 2.

You’ll find Python installers for 64-bit and 32-bit computers for each operating system on the download page, so first figure out which installer you need. If you bought your computer in 2007 or later, it is most likely a 64-bit system. Otherwise, you have a 32-bit version, but here’s how to find out for sure:

On Windows, select Start ▸ Control Panel ▸ System and check whether System Type says 64-bit or 32-bit.
On OS X, go to the Apple menu, select About This Mac ▸ More Info ▸ System Report ▸ Hardware, and then look at the Processor Name field. If it says Intel Core Solo or Intel Core Duo, you have a 32-bit machine. If it says anything else (including Intel Core 2 Duo), you have a 64-bit machine.
On Ubuntu Linux, open a Terminal and run the command uname -m. A response of i686 means 32-bit, and x86_64 means 64-bit.
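If some version of Python 3 is already on the machine, a similar check can be made from Python itself. This is only a sketch using the standard library’s platform and struct modules, which the book does not use here. It reports the bit width of the running Python build, which can be 32-bit even on a 64-bit operating system.

```python
import platform
import struct

# The size of a C pointer in bytes, times 8, gives the bit width
# of the Python interpreter that is running this code.
pythonBits = struct.calcsize('P') * 8
print('Python build:', pythonBits, 'bit')

# The machine type as reported by the operating system,
# for example 'x86_64', 'AMD64', or 'i686'.
machineType = platform.machine()
print('Machine type:', machineType)
```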

On Windows, download the Python installer (the filename will end with .msi) and double-click it. Follow the instructions the installer displays on the screen to install Python, as listed here:

1. Select Install for All Users and then click Next.
2. Install to the C:\Python34 folder by clicking Next.
3. Click Next again to skip the Customize Python section.

On Mac OS X, download the .dmg file that’s right for your version of OS X and double-click it. Follow the instructions the installer displays on the screen to install Python, as listed here:

1. When the DMG package opens in a new window, double-click the Python.mpkg file. You may have to enter the administrator password.
2. Click Continue through the Welcome section and click Agree to accept the license.
3. Select HD Macintosh (or whatever name your hard drive has) and click Install.

If you’re running Ubuntu, you can install Python from the Terminal by following these steps:

1. Open the Terminal window.
2. Enter sudo apt-get install python3.
3. Enter sudo apt-get install idle3.
4. Enter sudo apt-get install python3-pip.

Starting IDLE

While the Python interpreter is the software that runs your Python programs, the interactive development environment (IDLE) software is where you’ll enter your programs, much like a word processor. Let’s start IDLE now.

On Windows 7 or newer, click the Start icon in the lower-left corner of your screen, enter IDLE in the search box, and select IDLE (Python GUI).
On Windows XP, click the Start button and then select Programs ▸ Python 3.4 ▸ IDLE (Python GUI).
On Mac OS X, open the Finder window, click Applications, click Python 3.4, and then click the IDLE icon.
On Ubuntu, select Applications ▸ Accessories ▸ Terminal and then enter idle3. (You may also be able to click Applications at the top of the screen, select Programming, and then click IDLE 3.)

The Interactive Shell

No matter which operating system you’re running, the IDLE window that first appears should be mostly blank except for text that looks something like this:

Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:25:23) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>>

This window is called the interactive shell. A shell is a program that lets you type instructions into the computer, much like the Terminal or Command Prompt on OS X and Windows, respectively. Python’s interactive shell lets you enter instructions for the Python interpreter software to run. The computer reads the instructions you enter and runs them immediately.

For example, enter the following into the interactive shell next to the >>> prompt:

>>> print('Hello world!')

After you type that line and press ENTER, the interactive shell should display this in response:

>>> print('Hello world!')
Hello world!

How to Find Help

Solving programming problems on your own is easier than you might think. If you’re not convinced, then let’s cause an error on purpose: Enter '42' + 3 into the interactive shell. You don’t need to know what this instruction means right now, but the result should look like this:

  >>> '42' + 3
➊ Traceback (most recent call last):
    File "<pyshell#0>", line 1, in <module>
      '42' + 3
➋ TypeError: Can't convert 'int' object to str implicitly
  >>>

The error message ➋ appeared here because Python couldn’t understand your instruction. The traceback part ➊ of the error message shows the specific instruction and line number that Python had trouble with. If you’re not sure what to make of a particular error message, search online for the exact error message. Enter “TypeError: Can’t convert ‘int’ object to str implicitly” (including the quotes) into your favorite search engine, and you should see tons of links explaining what the error message means and what causes it, as shown in Figure I-2.

Figure I-2. The Google results for an error message can be very helpful.

You’ll often find that someone else had the same question as you and that some other helpful person has already answered it. No one person can know everything about programming, so an everyday part of any software developer’s job is looking up answers to technical questions.
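The exact wording of an error message can differ between Python versions; newer Python 3 releases phrase this particular TypeError differently from the 3.4 text shown above. As a small sketch (the try and except statements it relies on are not introduced until later in the book), the following captures whatever message your own Python produces so you can search for that exact text.

```python
try:
    result = '42' + 3        # the same deliberate mistake as above
except TypeError as error:
    # Keep the error's type name and message text for searching online.
    errorName = type(error).__name__
    errorText = str(error)

print(errorName + ': ' + errorText)
```

Searching for the printed line, in quotes, should find explanations that match your installed version.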

Asking Smart Programming Questions

If you can’t find the answer by searching online, try asking people in a web forum such as Stack Overflow (http://stackoverflow.com/) or the “learn programming” subreddit at http://reddit.com/r/learnprogramming/. But keep in mind there are smart ways to ask programming questions that help others help you. Be sure to read the Frequently Asked Questions sections these websites have about the proper way to post questions.

When asking programming questions, remember to do the following:

Explain what you are trying to do, not just what you did. This lets your helper know if you are on the wrong track.

Specify the point at which the error happens. Does it occur at the very start of the program or only after you do a certain action?

Copy and paste the entire error message and your code to http://pastebin.com/ or http://gist.github.com/. These websites make it easy to share large amounts of code with people over the Web, without the risk of losing any text formatting. You can then put the URL of the posted code in your email or forum post. For example, here are some pieces of code I’ve posted: http://pastebin.com/SzP2DbFx/ and https://gist.github.com/asweigart/6912168/.

Explain what you’ve already tried to do to solve your problem. This tells people you’ve already put in some work to figure things out on your own.

List the version of Python you’re using. (There are some key differences between version 2 Python interpreters and version 3 Python interpreters.) Also, say which operating system and version you’re running.

If the error came up after you made a change to your code, explain exactly what you changed.

Say whether you’re able to reproduce the error every time you run the program or whether it happens only after you perform certain actions. Explain what those actions are, if so.

Always follow good online etiquette as well. For example, don’t post your questions in all caps or make unreasonable demands of the people trying to help you.
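For the advice above about listing your Python version and operating system, one possible way to collect both is sketched below. It uses the standard library’s platform module, which is my choice and not something the book prescribes.

```python
import platform

# The version of the Python interpreter running this code, as a string
# in 'major.minor.patchlevel' form.
pythonVersion = platform.python_version()

# The operating system's name and release, as reported by the system.
osName = platform.system()
osRelease = platform.release()

print('Python ' + pythonVersion)
print(osName + ' ' + osRelease)
```

The two printed lines can be pasted straight into a forum post.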

Summary

For most people, their computer is just an appliance instead of a tool. But by learning how to program, you’ll gain access to one of the most powerful tools of the modern world, and you’ll have fun along the way. Programming isn’t brain surgery; it’s fine for amateurs to experiment and make mistakes.

I love helping people discover Python. I write programming tutorials on my blog at http://inventwithpython.com/blog/, [email protected].

This book will start you off from zero programming knowledge, but you may have questions beyond its scope. Remember that asking effective questions and knowing how to find answers are invaluable tools on your programming journey.

Let’s begin!

Part I. Python Programming Basics

Chapter 1. Python Basics

The Python programming language has a wide range of syntactical constructions, standard library functions, and interactive development environment features. Fortunately, you can ignore most of that; you just need to learn enough to write some handy little programs.

You will, however, have to learn some basic programming concepts before you can do anything. Like a wizard-in-training, you might think these concepts seem arcane and tedious, but with some knowledge and practice, you’ll be able to command your computer like a magic wand to perform incredible feats.

This chapter has a few examples that encourage you to type into the interactive shell, which lets you execute Python instructions one at a time and shows you the results instantly. Using the interactive shell is great for learning what basic Python instructions do, so give it a try as you follow along. You’ll remember the things you do much better than the things you only read.

Entering Expressions into the Interactive Shell

You run the interactive shell by launching IDLE, which you installed with Python in the introduction. On Windows, open the Start menu, select All Programs ▸ Python 3.3, and then select IDLE (Python GUI). On OS X, select Applications ▸ MacPython 3.3 ▸ IDLE. On Ubuntu, open a new Terminal window and enter idle3.

A window with the >>> prompt should appear; that’s the interactive shell. Enter 2 + 2 at the prompt to have Python do some simple math.

>>> 2 + 2
4

The IDLE window should now show some text like this:

Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> 2 + 2
4
>>>

In Python, 2 + 2 is called an expression, which is the most basic kind of programming instruction in the language. Expressions consist of values (such as 2) and operators (such as +), and they can always evaluate (that is, reduce) down to a single value. That means you can use expressions anywhere in Python code that you could also use a value.

In the previous example, 2 + 2 is evaluated down to a single value, 4. A single value with no operators is also considered an expression, though it evaluates only to itself, as shown here:

>>> 2
2

ERRORS ARE OKAY!

Programs will crash if they contain code the computer can’t understand, which will cause Python to show an error message. An error message won’t break your computer, though, so don’t be afraid to make mistakes. A crash just means the program stopped running unexpectedly.

If you want to know more about an error message, you can search for the exact message text online to find out more about that specific error. You can also check out the resources at http://nostarch.com/automatestuff/ to see a list of common Python error messages and their meanings.

There are plenty of other operators you can use in Python expressions, too. For example, Table 1-1 lists all the math operators in Python.

Table 1-1. Math Operators from Highest to Lowest Precedence

Operator  Operation                          Example  Evaluates to…
**        Exponent                           2 ** 3   8
%         Modulus/remainder                  22 % 8   6
//        Integer division/floored quotient  22 // 8  2
/         Division                           22 / 8   2.75
*         Multiplication                     3 * 5    15
-         Subtraction                        5 - 2    3
+         Addition                           2 + 2    4
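The three division-related rows of Table 1-1 are connected: the floored quotient and the remainder together rebuild the original number. A short sketch using the table’s own example values (the variable names are mine):

```python
dividend = 22
divisor = 8

quotient = dividend // divisor    # integer division: how many whole 8s fit in 22
remainder = dividend % divisor    # modulus: what is left over after that
exact = dividend / divisor        # true division always gives a float

print(quotient, remainder, exact)

# The quotient and remainder reconstruct the dividend.
print(divisor * quotient + remainder == dividend)
```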

The order of operations (also called precedence) of Python math operators is similar to that of mathematics. The ** operator is evaluated first; the *, /, //, and % operators are evaluated next, from left to right; and the + and - operators are evaluated last (also from left to right). You can use parentheses to override the usual precedence if you need to. Enter the following expressions into the interactive shell:

>>> 2 + 3 * 6
20
>>> (2 + 3) * 6
30
>>> 48565878 * 578453
28093077826734
>>> 2 ** 8
256
>>> 23 / 7
3.2857142857142856
>>> 23 // 7
3
>>> 23 % 7
2
>>> 2     +     2
4
>>> (5 - 1) * ((7 + 1) / (3 - 1))
16.0

In each case, you as the programmer must enter the expression, but Python does the hard part of evaluating it down to a single value. Python will keep evaluating parts of the expression until it becomes a single value, as shown in Figure 1-1.

Figure 1-1. Evaluating an expression reduces it to a single value.
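The kind of reduction that Figure 1-1 describes can be written out by hand. The sketch below takes the last expression from the shell session, (5 - 1) * ((7 + 1) / (3 - 1)), and replaces one parenthesized part at a time; the intermediate steps are my own reconstruction and may not match the figure line for line.

```python
original = (5 - 1) * ((7 + 1) / (3 - 1))

# Replace one sub-expression with its value at each step.
step1 = 4 * ((7 + 1) / (3 - 1))   # 5 - 1 became 4
step2 = 4 * (8 / (3 - 1))         # 7 + 1 became 8
step3 = 4 * (8 / 2)               # 3 - 1 became 2
step4 = 4 * 4.0                   # 8 / 2 became 4.0 (division gives a float)

# Every line evaluates to the same single value.
print(original, step1, step2, step3, step4)
```

Because the / operator always produces a floating-point number, the final value is 16.0 and not 16.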

These rules for putting operators and values together to form expressions are a fundamental part of Python as a programming language, just like the grammar rules that help us communicate. Here’s an example:

This is a grammatically correct English sentence.
This grammatically is sentence not English correct a.

The second line is difficult to parse because it doesn’t follow the rules of English. Similarly, if you type in a bad Python instruction, Python won’t be able to understand it and will display a SyntaxError error message, as shown here:

>>> 5 +
  File "<stdin>", line 1
    5 +
       ^
SyntaxError: invalid syntax
>>> 42 + 5 + * 2
  File "<stdin>", line 1
    42 + 5 + * 2
             ^
SyntaxError: invalid syntax

You can always test to see whether an instruction works by typing it into the interactive shell. Don’t worry about breaking the computer: The worst thing that could happen is that Python responds with an error message. Professional software developers get error messages while writing code all the time.

The Integer, Floating-Point, and String Data Types

Remember that expressions are just values combined with operators, and they always evaluate down to a single value. A data type is a category for values, and every value belongs to exactly one data type. The most common data types in Python are listed in Table 1-2. The values -2 and 30, for example, are said to be integer values. The integer (or int) data type indicates values that are whole numbers. Numbers with a decimal point, such as 3.14, are called floating-point numbers (or floats). Note that even though the value 42 is an integer, the value 42.0 would be a floating-point number.

Table 1-2. Common Data Types

Data type               Examples
Integers                -2, -1, 0, 1, 2, 3, 4, 5
Floating-point numbers  -1.25, -1.0, -0.5, 0.0, 0.5, 1.0, 1.25
Strings                 'a', 'aa', 'aaa', 'Hello!', '11 cats'
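To see which data type a value belongs to, Python’s built-in type() function can be used. It is not introduced in this chapter, so treat this as an optional sketch; the variable names are mine.

```python
wholeNumber = 42
decimalNumber = 42.0
text = '42'

# type() reports the data type of a value.
print(type(wholeNumber))
print(type(decimalNumber))
print(type(text))

# 42 and 42.0 are different data types but compare as equal numbers,
# while the string '42' is not equal to either of them.
print(wholeNumber == decimalNumber)
print(wholeNumber == text)
```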

Python programs can also have text values called strings, or strs (pronounced “stirs”). Always surround your string in single quote (') characters (as in 'Hello' or 'Goodbye cruel world!') so Python knows where the string begins and ends. You can even have a string with no characters in it, '', called a blank string. Strings are explained in greater detail in Chapter 6.

If you ever see the error message SyntaxError: EOL while scanning string literal, you probably forgot the final single quote character at the end of the string, such as in this example:

>>> 'Hello world!
SyntaxError: EOL while scanning string literal

String Concatenation and Replication

The meaning of an operator may change based on the data types of the values next to it. For example, + is the addition operator when it operates on two integers or floating-point values. However, when + is used on two string values, it joins the strings as the string concatenation operator. Enter the following into the interactive shell:

>>> 'Alice' + 'Bob'
'AliceBob'

The expression evaluates down to a single, new string value that combines the text of the two strings. However, if you try to use the + operator on a string and an integer value, Python will not know how to handle this, and it will display an error message.

>>> 'Alice' + 42
Traceback (most recent call last):
  File "<pyshell#26>", line 1, in <module>
    'Alice' + 42
TypeError: Can't convert 'int' object to str implicitly

The error message Can't convert 'int' object to str implicitly means that Python thought you were trying to concatenate an integer to the string 'Alice'. Your code will have to explicitly convert the integer to a string, because Python cannot do this automatically. (Converting data types will be explained in Dissecting Your Program when talking about the str(), int(), and float() functions.)
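As a brief preview of that explicit conversion, here is a minimal sketch using the str(), int(), and float() functions named above. The example values are mine.

```python
# str() turns the integer into text, so + can concatenate.
joined = 'Alice' + str(42)
print(joined)

# int() turns text made of digits into an integer, so + can add.
total = int('42') + 3
print(total)

# float() turns text into a floating-point number.
ratio = float('3.14') * 2
print(ratio)
```

Which conversion you want depends on whether the result should be text or a number.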

The * operator is used for multiplication when it operates on two integer or floating-point values. But when the * operator is used on one string value and one integer value, it becomes the string replication operator. Enter a string multiplied by a number into the interactive shell to see this in action.

>>> 'Alice' * 5
'AliceAliceAliceAliceAlice'

The expression evaluates down to a single string value that repeats the original a number of times equal to the integer value. String replication is a useful trick, but it’s not used as often as string concatenation.

The * operator can be used with only two numeric values (for multiplication) or one string value and one integer value (for string replication). Otherwise, Python will just display an error message.

>>> 'Alice' * 'Bob'
Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    'Alice' * 'Bob'
TypeError: can't multiply sequence by non-int of type 'str'
>>> 'Alice' * 5.0
Traceback (most recent call last):
  File "<pyshell#33>", line 1, in <module>
    'Alice' * 5.0
TypeError: can't multiply sequence by non-int of type 'float'

It makes sense that Python wouldn’t understand these expressions: You can’t multiply two words, and it’s hard to replicate an arbitrary string a fractional number of times.
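One place string replication comes in handy is drawing simple dividers in printed output. A small sketch, with values chosen only for illustration:

```python
# Repeat a single character to draw a divider line.
divider = '=' * 20
print(divider)
print('Weekly report')
print(divider)

# The integer can appear on either side of the * operator.
echo = 3 * 'ab'
print(echo)

# Replicating zero times gives the blank string.
nothing = 'ab' * 0
print(nothing == '')
```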

Storing Values in Variables

A variable is like a box in the computer’s memory where you can store a single value. If you want to use the result of an evaluated expression later in your program, you can save it inside a variable.

Assignment Statements

You’ll store values in variables with an assignment statement. An assignment statement consists of a variable name, an equal sign (called the assignment operator), and the value to be stored. If you enter the assignment statement spam = 42, then a variable named spam will have the integer value 42 stored in it.

Think of a variable as a labeled box that a value is placed in, as in Figure 1-2.

Figure 1-2. spam = 42 is like telling the program, “The variable spam now has the integer value 42 in it.”

For example, enter the following into the interactive shell:

➊ >>> spam = 40
  >>> spam
  40
  >>> eggs = 2
➋ >>> spam + eggs
  42
  >>> spam + eggs + spam
  82
➌ >>> spam = spam + 2
  >>> spam
  42

A variable is initialized (or created) the first time a value is stored in it ➊. After that, you can use it in expressions with other variables and values ➋. When a variable is assigned a new value ➌, the old value is forgotten, which is why spam evaluated to 42 instead of 40 at the end of the example. This is called overwriting the variable. Enter the following code into the interactive shell to try overwriting a string:

>>> spam = 'Hello'
>>> spam
'Hello'
>>> spam = 'Goodbye'
>>> spam
'Goodbye'

Just like the box in Figure 1-3, the spam variable in this example stores 'Hello' until you replace it with 'Goodbye'.

Figure 1-3. When a new value is assigned to a variable, the old one is forgotten.

Variable Names

Table 1-3 has examples of legal variable names. You can name a variable anything as long as it obeys the following three rules:

1. It can be only one word.
2. It can use only letters, numbers, and the underscore (_) character.
3. It can’t begin with a number.

Table 1-3. Valid and Invalid Variable Names

Valid variable names  Invalid variable names
balance               current-balance (hyphens are not allowed)
currentBalance        current balance (spaces are not allowed)
current_balance       4account (can’t begin with a number)
_spam                 42 (can’t begin with a number)
SPAM                  total_$um (special characters like $ are not allowed)
account4              'hello' (special characters like ' are not allowed)
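The names in Table 1-3 can be checked against Python’s own rules. The sketch below uses the string method isidentifier() and the standard keyword module, neither of which appears in the book at this point, and isLegalName is a hypothetical helper name. Note that isidentifier() also accepts many non-English letters, and that reserved words such as if are rejected separately.

```python
import keyword

candidates = ['balance', 'currentBalance', 'current_balance', '_spam', 'SPAM',
              'account4', 'current-balance', 'current balance', '4account',
              '42', 'total_$um', "'hello'"]

def isLegalName(name):
    # isidentifier() checks the character rules; iskeyword() additionally
    # rejects words Python reserves for itself, such as 'if' or 'else'.
    return name.isidentifier() and not keyword.iskeyword(name)

for name in candidates:
    print(name, isLegalName(name))
```

The first six names, the valid column of the table, should report True and the rest False.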

Variable names are case-sensitive, meaning that spam, SPAM, Spam, and sPaM are four different variables. It is a Python convention to start your variables with a lowercase letter.

This book uses camelcase for variable names instead of underscores; that is, variables lookLikeThis instead of looking_like_this. Some experienced programmers may point out that the official Python code style, PEP 8, says that underscores should be used. I unapologetically prefer camelcase and point to “A Foolish Consistency Is the Hobgoblin of Little Minds” in PEP 8 itself:

“Consistency with the style guide is important. But most importantly: know when to be inconsistent; sometimes the style guide just doesn’t apply. When in doubt, use your best judgment.”

A good variable name describes the data it contains. Imagine that you moved to a new house and labeled all of your moving boxes as Stuff. You’d never find anything! The variable names spam, eggs, and bacon are used as generic names for the examples in this book and in much of Python’s documentation (inspired by the Monty Python “Spam” sketch), but in your programs, a descriptive name will help make your code more readable.

Your First Program

While the interactive shell is good for running Python instructions one at a time, to write entire Python programs, you’ll type the instructions into the file editor. The file editor is similar to text editors such as Notepad or TextMate, but it has some specific features for typing in source code. To open the file editor in IDLE, select File ▸ New Window.

The window that appears should contain a cursor awaiting your input, but it’s different from the interactive shell, which runs Python instructions as soon as you press ENTER. The file editor lets you type in many instructions, save the file, and run the program. Here’s how you can tell the difference between the two:

The interactive shell window will always be the one with the >>> prompt.
The file editor window will not have the >>> prompt.

Now it’s time to create your first program! When the file editor window opens, type the following into it:

➊ # This program says hello and asks for my name.

➋ print('Hello world!')
  print('What is your name?')    # ask for their name
➌ myName = input()
➍ print('It is good to meet you, ' + myName)
➎ print('The length of your name is:')
  print(len(myName))
➏ print('What is your age?')    # ask for their age
  myAge = input()
  print('You will be ' + str(int(myAge) + 1) + ' in a year.')

Once you’ve entered your source code, save it so that you won’t have to retype it each time you start IDLE. From the menu at the top of the file editor window, select File ▸ Save As. In the Save As window, enter hello.py in the File Name field and then click Save.

You should save your programs every once in a while as you type them. That way, if the computer crashes or you accidentally exit from IDLE, you won’t lose the code. As a shortcut, you can press CTRL-S on Windows and Linux or ⌘-S on OS X to save your file.

Once you’ve saved, let’s run our program. Select Run ▸ Run Module or just press the F5 key. Your program should run in the interactive shell window that appeared when you first started IDLE. Remember, you have to press F5 from the file editor window, not the interactive shell window. Enter your name when your program asks for it. The program’s output in the interactive shell should look something like this:

Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>>
Hello world!
What is your name?
Al
It is good to meet you, Al
The length of your name is:
2
What is your age?
4
You will be 5 in a year.
>>>

When there are no more lines of code to execute, the Python program terminates; that is, it stops running. (You can also say that the Python program exits.)

You can close the file editor by clicking the X at the top of the window. To reload a saved program, select File ▸ Open from the menu. Do that now, and in the window that appears, choose hello.py, and click the Open button. Your previously saved hello.py program should open in the file editor window.

Dissecting Your Program

With your new program open in the file editor, let’s take a quick tour of the Python instructions it uses by looking at what each line of code does.

Comments

The following line is called a comment.

➊ # This program says hello and asks for my name.

Python ignores comments, and you can use them to write notes or remind yourself what the code is trying to do. Any text for the rest of the line following a hash mark (#) is part of a comment.

Sometimes, programmers will put a # in front of a line of code to temporarily remove it while testing a program. This is called commenting out code, and it can be useful when you’re trying to figure out why a program doesn’t work. You can remove the # later when you are ready to put the line back in.

Python also ignores the blank line after the comment. You can add as many blank lines to your program as you want. This can make your code easier to read, like paragraphs in a book.

The print() Function

The print() function displays the string value inside the parentheses on the screen.

➋ print('Hello world!')
  print('What is your name?')    # ask for their name

The line print('Hello world!') means “Print out the text in the string 'Hello world!'.” When Python executes this line, you say that Python is calling the print() function and the string value is being passed to the function. A value that is passed to a function call is an argument. Notice that the quotes are not printed to the screen. They just mark where the string begins and ends; they are not part of the string value.

NOTE

You can also use this function to put a blank line on the screen; just call print() with nothing in between the parentheses.

Whenwritingafunctionname,theopeningandclosingparenthesesattheendidentifyitasthenameofafunction.Thisiswhyinthisbookyou’llseeprint()ratherthanprint.Chapter2describesfunctionsinmoredetail.

The input() Function

The input() function waits for the user to type some text on the keyboard and press ENTER.

➌ myName = input()

This function call evaluates to a string equal to the user’s text, and the previous line of code assigns the myName variable to this string value.

You can think of the input() function call as an expression that evaluates to whatever string the user typed in. If the user entered 'Al', then input() would evaluate to 'Al', and the line would behave like myName = 'Al'.

Printing the User’s Name

The following call to print() actually contains the expression 'It is good to meet you, ' + myName between the parentheses.

➍ print('It is good to meet you, ' + myName)

Remember that expressions can always evaluate to a single value. If 'Al' is the value stored in myName on the previous line, then this expression evaluates to 'It is good to meet you, Al'. This single string value is then passed to print(), which prints it on the screen.

The len() Function

You can pass the len() function a string value (or a variable containing a string), and the function evaluates to the integer value of the number of characters in that string.

➎ print('The length of your name is:')
  print(len(myName))

Enter the following into the interactive shell to try this:

>>> len('hello')
5
>>> len('My very energetic monster just scarfed nachos.')
46
>>> len('')
0

Just like those examples, len(myName) evaluates to an integer. It is then passed to print() to be displayed on the screen. Notice that print() allows you to pass it either integer values or string values. But notice the error that shows up when you type the following into the interactive shell:

>>> print('I am ' + 29 + ' years old.')
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    print('I am ' + 29 + ' years old.')
TypeError: Can't convert 'int' object to str implicitly

The print() function isn’t causing that error, but rather it’s the expression you tried to pass to print(). You get the same error message if you type the expression into the interactive shell on its own.

>>> 'I am ' + 29 + ' years old.'
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    'I am ' + 29 + ' years old.'
TypeError: Can't convert 'int' object to str implicitly

Python gives an error because you can use the + operator only to add two integers together or concatenate two strings. You can’t add an integer to a string because this is ungrammatical in Python. You can fix this by using a string version of the integer instead, as explained in the next section.

The str(), int(), and float() Functions

If you want to concatenate an integer such as 29 with a string to pass to print(), you’ll need to get the value '29', which is the string form of 29. The str() function can be passed an integer value and will evaluate to a string value version of it, as follows:

>>> str(29)
'29'
>>> print('I am ' + str(29) + ' years old.')
I am 29 years old.

Because str(29) evaluates to '29', the expression 'I am ' + str(29) + ' years old.' evaluates to 'I am ' + '29' + ' years old.', which in turn evaluates to 'I am 29 years old.'. This is the value that is passed to the print() function.

The str(), int(), and float() functions will evaluate to the string, integer, and floating-point forms of the value you pass, respectively. Try converting some values in the interactive shell with these functions, and watch what happens.

>>> str(0)
'0'
>>> str(-3.14)
'-3.14'
>>> int('42')
42
>>> int('-99')
-99
>>> int(1.25)
1
>>> int(1.99)
1
>>> float('3.14')
3.14
>>> float(10)
10.0

The previous examples call the str(), int(), and float() functions and pass them values of the other data types to obtain a string, integer, or floating-point form of those values.

The str() function is handy when you have an integer or float that you want to concatenate to a string. The int() function is also helpful if you have a number as a string value that you want to use in some mathematics. For example, the input() function always returns a string, even if the user enters a number. Enter spam = input() into the interactive shell and enter 101 when it waits for your text.

>>> spam = input()
101
>>> spam
'101'

The value stored inside spam isn’t the integer 101 but the string '101'. If you want to do math using the value in spam, use the int() function to get the integer form of spam and then store this as the new value in spam.

>>> spam = int(spam)
>>> spam
101

Now you should be able to treat the spam variable as an integer instead of a string.

>>> spam * 10 / 5
202.0

Note that if you pass a value to int() that it cannot evaluate as an integer, Python will display an error message.

>>> int('99.99')
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    int('99.99')
ValueError: invalid literal for int() with base 10: '99.99'
>>> int('twelve')
Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    int('twelve')
ValueError: invalid literal for int() with base 10: 'twelve'
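One way around the first of these errors, offered here as a sketch rather than something the chapter prescribes, is to convert the string to a float first and then convert that float to an integer. The variable names below are made up for the illustration.

```python
# Hypothetical variable names, chosen only for this illustration.
priceText = '99.99'

# int('99.99') raises ValueError, so go through float() first.
priceFloat = float(priceText)   # 99.99
priceInt = int(priceFloat)      # 99 (the fractional part is dropped)

print(priceFloat)
print(priceInt)
```

This does nothing for a string like 'twelve'; float('twelve') raises a ValueError as well.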

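The int() function is also useful if you need to round a floating-point number down: it drops the fractional part, which rounds positive numbers down. If you want to round up a positive floating-point number that isn’t already a whole number, just add 1 to it afterward.

>>> int(7.7)
7
>>> int(7.7) + 1
8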
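The add-1 trick only works for positive numbers that are not already whole, because int() simply drops the fractional part. As a hedged aside, Python’s standard math module (not covered in this chapter) has floor() and ceil() functions that round down and up in every case. A short sketch:

```python
import math

# int() drops the fractional part (it truncates toward zero).
print(int(7.7))          # 7
print(int(-7.7))         # -7

# math.floor() always rounds down; math.ceil() always rounds up.
print(math.floor(7.7))   # 7
print(math.floor(-7.7))  # -8
print(math.ceil(7.7))    # 8
print(math.ceil(7.0))    # 7, whereas int(7.0) + 1 would give 8
```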

In your program, you used the int() and str() functions in the last three lines to get a value of the appropriate data type for the code.

➏ print('What is your age?') # ask for their age
  myAge = input()
  print('You will be ' + str(int(myAge) + 1) + ' in a year.')

The myAge variable contains the value returned from input(). Because the input() function always returns a string (even if the user typed in a number), you can use the int(myAge) code to return an integer value of the string in myAge. This integer value is then added to 1 in the expression int(myAge) + 1.

The result of this addition is passed to the str() function: str(int(myAge) + 1). The string value returned is then concatenated with the strings 'You will be ' and ' in a year.' to evaluate to one large string value. This large string is finally passed to print() to be displayed on the screen.

Let’s say the user enters the string '4' for myAge. The string '4' is converted to an integer, so you can add one to it. The result is 5. The str() function converts the result back to a string, so you can concatenate it with the second string, ' in a year.', to create the final message. These evaluation steps would look something like Figure 1-4.

TEXT AND NUMBER EQUIVALENCE

Although the string value of a number is considered a completely different value from the integer or floating-point version, an integer can be equal to a floating point.

>>> 42 == '42'
False
>>> 42 == 42.0
True
>>> 42.0 == 0042.000
True

Python makes this distinction because strings are text, while integers and floats are both numbers.

Figure 1-4. The evaluation steps, if 4 was stored in myAge
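The same evaluation steps can also be traced in code by storing each intermediate result in its own variable. This is only a sketch: the names step1, step2, step3, and message are invented here and do not appear in hello.py, and myAge is set directly instead of coming from input().

```python
myAge = '4'            # what input() would return if the user typed 4

step1 = int(myAge)     # 4
step2 = step1 + 1      # 5
step3 = str(step2)     # '5'
message = 'You will be ' + step3 + ' in a year.'

print(message)         # You will be 5 in a year.
```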

Summary

You can compute expressions with a calculator or type string concatenations with a word processor. You can even do string replication easily by copying and pasting text. But expressions, and their component values (operators, variables, and function calls), are the basic building blocks that make programs. Once you know how to handle these elements, you will be able to instruct Python to operate on large amounts of data for you.

It is good to remember the different types of operators (+, -, *, /, //, %, and ** for math operations, and + and * for string operations) and the three data types (integers, floating-point numbers, and strings) introduced in this chapter.

A few different functions were introduced as well. The print() and input() functions handle simple text output (to the screen) and input (from the keyboard). The len() function takes a string and evaluates to an int of the number of characters in the string. The str(), int(), and float() functions will evaluate to the string, integer, or floating-point number form of the value they are passed.

In the next chapter, you will learn how to tell Python to make intelligent decisions about what code to run, what code to skip, and what code to repeat based on the values it has. This is known as flow control, and it allows you to write programs that make intelligent decisions.

Practice Questions

Q: 1. Which of the following are operators, and which are values?

*
'hello'
-88.8
-
/
+
5

Q: 2. Which of the following is a variable, and which is a string?

spam
'spam'

Q: 3. Name three data types.

Q: 4. What is an expression made up of? What do all expressions do?

Q: 5. This chapter introduced assignment statements, like spam = 10. What is the difference between an expression and a statement?

Q: 6. What does the variable bacon contain after the following code runs?

bacon = 20
bacon + 1

Q: 7. What should the following two expressions evaluate to?

'spam' + 'spamspam'
'spam' * 3

Q: 8. Why is eggs a valid variable name while 100 is invalid?

Q: 9. What three functions can be used to get the integer, floating-point number, or string version of a value?

Q: 10. Why does this expression cause an error? How can you fix it?

'I have eaten ' + 99 + ' burritos.'

Extra credit: Search online for the Python documentation for the len() function. It will be on a web page titled “Built-in Functions.” Skim the list of other functions Python has, look up what the round() function does, and experiment with it in the interactive shell.

Chapter 2. Flow Control

So you know the basics of individual instructions and that a program is just a series of instructions. But the real strength of programming isn’t just running (or executing) one instruction after another like a weekend errand list. Based on how the expressions evaluate, the program can decide to skip instructions, repeat them, or choose one of several instructions to run. In fact, you almost never want your programs to start from the first line of code and simply execute every line, straight to the end. Flow control statements can decide which Python instructions to execute under which conditions.

These flow control statements directly correspond to the symbols in a flowchart, so I’ll provide flowchart versions of the code discussed in this chapter. Figure 2-1 shows a flowchart for what to do if it’s raining. Follow the path made by the arrows from Start to End.

Figure 2-1. A flowchart to tell you what to do if it is raining

In a flowchart, there is usually more than one way to go from the start to the end. The same is true for lines of code in a computer program. Flowcharts represent these branching points with diamonds, while the other steps are represented with rectangles. The starting and ending steps are represented with rounded rectangles.

But before you learn about flow control statements, you first need to learn how to represent those yes and no options, and you need to understand how to write those branching points as Python code. To that end, let’s explore Boolean values, comparison operators, and Boolean operators.

Boolean Values

While the integer, floating-point, and string data types have an unlimited number of possible values, the Boolean data type has only two values: True and False. (Boolean is capitalized because the data type is named after mathematician George Boole.) When typed as Python code, the Boolean values True and False lack the quotes you place around strings, and they always start with a capital T or F, with the rest of the word in lowercase. Enter the following into the interactive shell. (Some of these instructions are intentionally incorrect, and they’ll cause error messages to appear.)

➊ >>> spam = True
  >>> spam
  True
➋ >>> true
  Traceback (most recent call last):
    File "<pyshell#2>", line 1, in <module>
      true
  NameError: name 'true' is not defined
➌ >>> True = 2 + 2
  SyntaxError: assignment to keyword

Like any other value, Boolean values are used in expressions and can be stored in variables ➊. If you don’t use the proper case ➋ or you try to use True and False for variable names ➌, Python will give you an error message.

Comparison Operators

Comparison operators compare two values and evaluate down to a single Boolean value. Table 2-1 lists the comparison operators.

Table 2-1. Comparison Operators

Operator    Meaning
==          Equal to
!=          Not equal to
<           Less than
>           Greater than
<=          Less than or equal to
>=          Greater than or equal to

These operators evaluate to True or False depending on the values you give them. Let’s try some operators now, starting with == and !=.

>>> 42 == 42
True
>>> 42 == 99
False
>>> 2 != 3
True
>>> 2 != 2
False

As you might expect, == (equal to) evaluates to True when the values on both sides are the same, and != (not equal to) evaluates to True when the two values are different. The == and != operators can actually work with values of any data type.

  >>> 'hello' == 'hello'
  True
  >>> 'hello' == 'Hello'
  False
  >>> 'dog' != 'cat'
  True
  >>> True == True
  True
  >>> True != False
  True
  >>> 42 == 42.0
  True
➊ >>> 42 == '42'
  False

Note that an integer or floating-point value will always be unequal to a string value. The expression 42 == '42' ➊ evaluates to False because Python considers the integer 42 to be different from the string '42'.
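If you do want to compare a number with text that holds a number, one option is to convert one side first so both values have the same data type. A minimal sketch, with spam standing in for a string such as input() would return:

```python
spam = '42'                # a string, as input() would return

print(42 == spam)          # False: an integer never equals a string
print(42 == int(spam))     # True: integer compared with integer
print('42' == str(42))     # True: string compared with string
```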

The <, >, <=, and >= operators, on the other hand, work properly only with integer and floating-point values.

  >>> 42 < 100
  True
  >>> 42 > 100
  False
  >>> 42 < 42
  False
  >>> eggCount = 42
➊ >>> eggCount <= 42
  True
  >>> myAge = 29
➋ >>> myAge >= 10
  True

THE DIFFERENCE BETWEEN THE == AND = OPERATORS

You might have noticed that the == operator (equal to) has two equal signs, while the = operator (assignment) has just one equal sign. It’s easy to confuse these two operators with each other. Just remember these points:

• The == operator (equal to) asks whether two values are the same as each other.
• The = operator (assignment) puts the value on the right into the variable on the left.

To help remember which is which, notice that the == operator (equal to) consists of two characters, just like the != operator (not equal to) consists of two characters.

You’ll often use comparison operators to compare a variable’s value to some other value, like in the eggCount <= 42 ➊ and myAge >= 10 ➋ examples. (After all, instead of typing 'dog' != 'cat' in your code, you could have just typed True.) You’ll see more examples of this later when you learn about flow control statements.

Boolean Operators

The three Boolean operators (and, or, and not) are used to compare Boolean values. Like comparison operators, they evaluate these expressions down to a Boolean value. Let’s explore these operators in detail, starting with the and operator.

Binary Boolean Operators

The and and or operators always take two Boolean values (or expressions), so they’re considered binary operators. The and operator evaluates an expression to True if both Boolean values are True; otherwise, it evaluates to False. Enter some expressions using and into the interactive shell to see it in action.

>>> True and True
True
>>> True and False
False

A truth table shows every possible result of a Boolean operator. Table 2-2 is the truth table for the and operator.

Table 2-2. The and Operator’s Truth Table

Expression         Evaluates to…
True and True      True
True and False     False
False and True     False
False and False    False

On the other hand, the or operator evaluates an expression to True if either of the two Boolean values is True. If both are False, it evaluates to False.

>>> False or True
True
>>> False or False
False

You can see every possible outcome of the or operator in its truth table, shown in Table 2-3.

Table 2-3. The or Operator’s Truth Table

Expression         Evaluates to…
True or True       True
True or False      True
False or True      True
False or False     False

The not Operator

Unlike and and or, the not operator operates on only one Boolean value (or expression). The not operator simply evaluates to the opposite Boolean value.

  >>> not True
  False
➊ >>> not not not not True
  True

Much like using double negatives in speech and writing, you can nest not operators ➊, though there’s never not no reason to do this in real programs. Table 2-4 shows the truth table for not.

Table 2-4. The not Operator’s Truth Table

Expression    Evaluates to…
not True      False
not False     True
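If you would rather have Python produce the three truth tables than type each row by hand, the sketch below prints them. It leans on for loops over a pair of values, which this chapter only introduces later, so treat it as a preview rather than something you are expected to follow yet. The variable names a and b are arbitrary.

```python
# Print every row of the and, or, and not truth tables.
for a in (True, False):
    for b in (True, False):
        print(a, 'and', b, 'evaluates to', a and b)

for a in (True, False):
    for b in (True, False):
        print(a, 'or', b, 'evaluates to', a or b)

for a in (True, False):
    print('not', a, 'evaluates to', not a)
```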

Mixing Boolean and Comparison Operators

Since the comparison operators evaluate to Boolean values, you can use them in expressions with the Boolean operators.

Recall that the and, or, and not operators are called Boolean operators because they always operate on the Boolean values True and False. While expressions like 4 < 5 aren’t Boolean values, they are expressions that evaluate down to Boolean values. Try entering some Boolean expressions that use comparison operators into the interactive shell.

>>> (4 < 5) and (5 < 6)
True
>>> (4 < 5) and (9 < 6)
False
>>> (1 == 2) or (2 == 2)
True

The computer will evaluate the left expression first, and then it will evaluate the right expression. When it knows the Boolean value for each, it will then evaluate the whole expression down to one Boolean value. You can think of the computer’s evaluation process for (4 < 5) and (5 < 6) as shown in Figure 2-2.

You can also use multiple Boolean operators in an expression, along with the comparison operators.

>>> 2 + 2 == 4 and not 2 + 2 == 5 and 2 * 2 == 2 + 2
True

The Boolean operators have an order of operations just like the math operators do. After any math and comparison operators evaluate, Python evaluates the not operators first, then the and operators, and then the or operators.
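To see that order of operations at work, here is a small sketch. The variable names result and grouped are invented for the example.

```python
# not is evaluated first, then and, then or.
result = True or False and not True
# Step by step:
#   not True          -> False
#   False and False   -> False
#   True or False     -> True
print(result)     # True

# Parentheses change the grouping, just as they do in math.
grouped = (True or False) and not True
print(grouped)    # False
```

When an expression mixes several Boolean operators, adding parentheses is a cheap way to make the intended grouping obvious to a reader.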

Figure 2-2. The process of evaluating (4 < 5) and (5 < 6) to True.

Elements of Flow Control

Flow control statements often start with a part called the condition, and all are followed by a block of code called the clause. Before you learn about Python’s specific flow control statements, I’ll cover what a condition and a block are.

Conditions

The Boolean expressions you’ve seen so far could all be considered conditions, which are the same thing as expressions; condition is just a more specific name in the context of flow control statements. Conditions always evaluate down to a Boolean value, True or False. A flow control statement decides what to do based on whether its condition is True or False, and almost every flow control statement uses a condition.

Blocks of Code

Lines of Python code can be grouped together in blocks. You can tell when a block begins and ends from the indentation of the lines of code. There are three rules for blocks.

1. Blocks begin when the indentation increases.
2. Blocks can contain other blocks.
3. Blocks end when the indentation decreases to zero or to a containing block’s indentation.

Blocks are easier to understand by looking at some indented code, so let’s find the blocks in part of a small game program, shown here:

  if name == 'Mary':
➊     print('Hello Mary')
      if password == 'swordfish':
➋         print('Access granted.')
      else:
➌         print('Wrong password.')

The first block of code ➊ starts at the line print('Hello Mary') and contains all the lines after it. Inside this block is another block ➋, which has only a single line in it: print('Access granted.'). The third block ➌ is also one line long: print('Wrong password.').
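The fragment above cannot run by itself because name and password are never given values. The sketch below assigns assumed values to both and adds a hypothetical result variable, plus one extra print() call at the end of the first block, to show that the outer block carries on after the two nested blocks finish.

```python
# Assumed values so the fragment can run on its own.
name = 'Mary'
password = 'swordfish'
result = ''   # hypothetical variable, not part of the original fragment

if name == 'Mary':
    print('Hello Mary')               # the first block starts here
    if password == 'swordfish':
        result = 'Access granted.'    # second block, nested in the first
    else:
        result = 'Wrong password.'    # third block, also nested
    print(result)                     # back in the first block

print('Done')                         # indentation is zero: no block
```

Try changing password to another string and rerunning it to watch the third block run instead of the second.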

Program Execution

In the previous chapter’s hello.py program, Python started executing instructions at the top of the program going down, one after another. The program execution (or simply, execution) is a term for the current instruction being executed. If you print the source code on paper and put your finger on each line as it is executed, you can think of your finger as the program execution.

Not all programs execute by simply going straight down, however. If you use your finger to trace through a program with flow control statements, you’ll likely find yourself jumping around the source code based on conditions, and you’ll probably skip entire clauses.

Flow Control Statements

Now, let’s explore the most important piece of flow control: the statements themselves. The statements represent the diamonds you saw in the flowchart in Figure 2-1, and they are the actual decisions your programs will make.

if Statements

The most common type of flow control statement is the if statement. An if statement’s clause (that is, the block following the if statement) will execute if the statement’s condition is True. The clause is skipped if the condition is False.

In plain English, an if statement could be read as, “If this condition is true, execute the code in the clause.” In Python, an if statement consists of the following:

• The if keyword
• A condition (that is, an expression that evaluates to True or False)
• A colon
• Starting on the next line, an indented block of code (called the if clause)

For example, let’s say you have some code that checks to see whether someone’s name is Alice. (Pretend name was assigned some value earlier.)

if name == 'Alice':
    print('Hi, Alice.')

All flow control statements end with a colon and are followed by a new block of code (the clause). This if statement’s clause is the block with print('Hi, Alice.'). Figure 2-3 shows what a flowchart of this code would look like.

Figure 2-3. The flowchart for an if statement

else Statements

An if clause can optionally be followed by an else statement. The else clause is executed only when the if statement’s condition is False. In plain English, an else statement could be read as, “If this condition is true, execute this code. Or else, execute that code.” An else statement doesn’t have a condition, and in code, an else statement always consists of the following:

• The else keyword
• A colon
• Starting on the next line, an indented block of code (called the else clause)

Returning to the Alice example, let’s look at some code that uses an else statement to offer a different greeting if the person’s name isn’t Alice.

if name == 'Alice':
    print('Hi, Alice.')
else:
    print('Hello, stranger.')

Figure 2-4 shows what a flowchart of this code would look like.

Figure 2-4. The flowchart for an else statement

elif Statements

While only one of the if or else clauses will execute, you may have a case where you want one of many possible clauses to execute. The elif statement is an “else if” statement that always follows an if or another elif statement. It provides another condition that is checked only if all of the previous conditions were False. In code, an elif statement always consists of the following:

• The elif keyword
• A condition (that is, an expression that evaluates to True or False)
• A colon
• Starting on the next line, an indented block of code (called the elif clause)

Let’s add an elif to the name checker to see this statement in action.

if name == 'Alice':
    print('Hi, Alice.')
elif age < 12:
    print('You are not Alice, kiddo.')

This time, you check the person’s age, and the program will tell them something different if they’re younger than 12. You can see the flowchart for this in Figure 2-5.

Figure 2-5. The flowchart for an elif statement

The elif clause executes if age < 12 is True and name == 'Alice' is False. However, if both of the conditions are False, then both of the clauses are skipped. It is not guaranteed that at least one of the clauses will be executed. When there is a chain of elif statements, only one or none of the clauses will be executed. Once one of the statements’ conditions is found to be True, the rest of the elif clauses are automatically skipped. For example, open a new file editor window and enter the following code, saving it as vampire.py:

if name == 'Alice':
    print('Hi, Alice.')
elif age < 12:
    print('You are not Alice, kiddo.')
elif age > 2000:
    print('Unlike you, Alice is not an undead, immortal vampire.')
elif age > 100:
    print('You are not Alice, grannie.')

Here I’ve added two more elif statements to make the name checker greet a person with different answers based on age. Figure 2-6 shows the flowchart for this.

Figure 2-6. The flowchart for multiple elif statements in the vampire.py program

The order of the elif statements does matter, however. Let’s rearrange them to introduce a bug. Remember that the rest of the elif clauses are automatically skipped once a True condition has been found, so if you swap around some of the clauses in vampire.py, you run into a problem. Change the code to look like the following, and save it as vampire2.py:

  if name == 'Alice':
      print('Hi, Alice.')
  elif age < 12:
      print('You are not Alice, kiddo.')
➊ elif age > 100:
      print('You are not Alice, grannie.')
  elif age > 2000:
      print('Unlike you, Alice is not an undead, immortal vampire.')

Say the age variable contains the value 3000 before this code is executed. You might expect the code to print the string 'Unlike you, Alice is not an undead, immortal vampire.'. However, because the age > 100 condition is True (after all, 3000 is greater than 100) ➊, the string 'You are not Alice, grannie.' is printed, and the rest of the elif statements are automatically skipped. Remember, at most only one of the clauses will be executed, and for elif statements, the order matters!

Figure 2-7 shows the flowchart for the previous code. Notice how the diamonds for age > 100 and age > 2000 are swapped.
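To compare the two orderings side by side, the sketch below wraps each version in a function that returns the message instead of printing it. Functions and return statements have not been introduced yet at this point in the book, the function names are invented, and 'Bob' is just an assumed name for someone who is not Alice.

```python
def vampire(name, age):
    """Conditions in the order used by vampire.py."""
    if name == 'Alice':
        return 'Hi, Alice.'
    elif age < 12:
        return 'You are not Alice, kiddo.'
    elif age > 2000:
        return 'Unlike you, Alice is not an undead, immortal vampire.'
    elif age > 100:
        return 'You are not Alice, grannie.'
    return None  # no condition was True, so no clause ran


def vampire2(name, age):
    """Conditions in the swapped order used by vampire2.py."""
    if name == 'Alice':
        return 'Hi, Alice.'
    elif age < 12:
        return 'You are not Alice, kiddo.'
    elif age > 100:
        return 'You are not Alice, grannie.'
    elif age > 2000:
        return 'Unlike you, Alice is not an undead, immortal vampire.'
    return None


print(vampire('Bob', 3000))    # the vampire message
print(vampire2('Bob', 3000))   # the grannie message: age > 100 matched first
print(vampire('Bob', 50))      # None: every condition was False
```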

Optionally, you can have an else statement after the last elif statement. In that case, it is guaranteed that at least one (and only one) of the clauses will be executed. If the conditions in every if and elif statement are False, then the else clause is executed. For example, let’s re-create the Alice program to use if, elif, and else clauses.

if name == 'Alice':
    print('Hi, Alice.')
elif age < 12:
    print('You are not Alice, kiddo.')
else:
    print('You are neither Alice nor a little kid.')

Figure 2-8 shows the flowchart for this new code, which we’ll save as littleKid.py.

In plain English, this type of flow control structure would be, “If the first condition is true, do this. Else, if the second condition is true, do that. Otherwise, do something else.” When you use all three of these statements together, remember these rules about how to order them to avoid bugs like the one in Figure 2-7. First, there is always exactly one if statement. Any elif statements you need should follow the if statement. Second, if you want to be sure that at least one clause is executed, close the structure with an else statement.

Figure 2-7. The flowchart for the vampire2.py program. The crossed-out path will logically never happen, because if age were greater than 2000, it would have already been greater than 100.

Figure 2-8. Flowchart for the previous littleKid.py program

while Loop Statements

You can make a block of code execute over and over again with a while statement. The code in a while clause will be executed as long as the while statement’s condition is True. In code, a while statement always consists of the following:

• The while keyword
• A condition (that is, an expression that evaluates to True or False)
• A colon
• Starting on the next line, an indented block of code (called the while clause)

You can see that a while statement looks similar to an if statement. The difference is in how they behave. At the end of an if clause, the program execution continues after the if statement. But at the end of a while clause, the program execution jumps back to the start of the while statement. The while clause is often called the while loop or just the loop.

Let’s look at an if statement and a while loop that use the same condition and take the same actions based on that condition. Here is the code with an if statement:

spam = 0
if spam < 5:
    print('Hello, world.')
    spam = spam + 1

Here is the code with a while statement:

spam = 0
while spam < 5:
    print('Hello, world.')
    spam = spam + 1

These statements are similar: both if and while check the value of spam, and if it’s less than five, they print a message. But when you run these two code snippets, something very different happens for each one. For the if statement, the output is simply "Hello, world.". But for the while statement, it’s "Hello, world." repeated five times! Take a look at the flowcharts for these two pieces of code, Figure 2-9 and Figure 2-10, to see why this happens.

Figure 2-9. The flowchart for the if statement code

Figure 2-10. The flowchart for the while statement code

The code with the if statement checks the condition, and it prints Hello, world. only once if that condition is true. The code with the while loop, on the other hand, will print it five times. It stops after five prints because the integer in spam is incremented by one at the end of each loop iteration, which means that the loop will execute five times before spam < 5 is False.

In the while loop, the condition is always checked at the start of each iteration (that is, each time the loop is executed). If the condition is True, then the clause is executed, and afterward, the condition is checked again. The first time the condition is found to be False, the while clause is skipped.
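To watch the condition being rechecked, you could print the value of spam on every pass. This is a small variation on the loop above, with the messages changed for illustration:

```python
spam = 0
while spam < 5:
    print('spam is ' + str(spam) + ', so spam < 5 is True')
    spam = spam + 1

# The loop ends the first time the condition is False.
print('spam is ' + str(spam) + ', so spam < 5 is False')
```

The clause runs for the values 0 through 4, and spam holds 5 once the loop is finished.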

An Annoying while Loop

Here’s a small example program that will keep asking you to type, literally, your name. Select File ▸ New Window to open a new file editor window, enter the following code, and save the file as yourName.py:

➊ name = ''
➋ while name != 'your name':
      print('Please type your name.')
➌     name = input()
➍ print('Thank you!')

First, the program sets the name variable ➊ to an empty string. This is so that the name != 'your name' condition will evaluate to True and the program execution will enter the while loop’s clause ➋.

The code inside this clause asks the user to type their name, which is assigned to the name variable ➌. Since this is the last line of the block, the execution moves back to the start of the while loop and reevaluates the condition. If the value in name is not equal to the string 'your name', then the condition is True, and the execution enters the while clause again.

But once the user types your name, the condition of the while loop will be 'your name' != 'your name', which evaluates to False. The condition is now False, and instead of the program execution reentering the while loop’s clause, it skips past it and continues running the rest of the program ➍. Figure 2-11 shows a flowchart for the yourName.py program.

Figure 2-11. A flowchart of the yourName.py program

Now, let’s see yourName.py in action. Press F5 to run it, and enter something other than your name a few times before you give the program what it wants.

Please type your name.
Al
Please type your name.
Albert
Please type your name.
%#@#%*(^&!!!
Please type your name.
your name
Thank you!

If you never enter your name, then the while loop’s condition will never be False, and the program will just keep asking forever. Here, the input() call lets the user enter the right string to make the program move on. In other programs, the condition might never actually change, and that can be a problem. Let’s look at how you can break out of a while loop.

break Statements

There is a shortcut to getting the program execution to break out of a while loop’s clause early. If the execution reaches a break statement, it immediately exits the while loop’s clause. In code, a break statement simply contains the break keyword.

Pretty simple, right? Here’s a program that does the same thing as the previous program, but it uses a break statement to escape the loop. Enter the following code, and save the file as yourName2.py:

➊ while True:
      print('Please type your name.')
➋     name = input()
➌     if name == 'your name':
➍         break
➎ print('Thank you!')

The first line ➊ creates an infinite loop; it is a while loop whose condition is always True. (The expression True, after all, always evaluates down to the value True.) The program execution will always enter the loop and will exit it only when a break statement is executed. (An infinite loop that never exits is a common programming bug.)

Just like before, this program asks the user to type your name ➋. Now, however, while the execution is still inside the while loop, an if statement gets executed ➌ to check whether name is equal to your name. If this condition is True, the break statement is run ➍, and the execution moves out of the loop to print('Thank you!') ➎. Otherwise, the if statement’s clause with the break statement is skipped, which puts the execution at the end of the while loop. At this point, the program execution jumps back to the start of the while statement ➊ to recheck the condition. Since this condition is merely the True Boolean value, the execution enters the loop to ask the user to type your name again. See Figure 2-12 for the flowchart of this program.

Run yourName2.py, and enter the same text you entered for yourName.py. The rewritten program should respond in the same way as the original.

Figure 2-12. The flowchart for the yourName2.py program with an infinite loop. Note that the X path will logically never happen because the loop condition is always True.

continue Statements

Like break statements, continue statements are used inside loops. When the program execution reaches a continue statement, the program execution immediately jumps back to the start of the loop and reevaluates the loop’s condition. (This is also what happens when the execution reaches the end of the loop.)

TRAPPED IN AN INFINITE LOOP?

If you ever run a program that has a bug causing it to get stuck in an infinite loop, press CTRL-C. This will send a KeyboardInterrupt error to your program and cause it to stop immediately. To try it, create a simple infinite loop in the file editor, and save it as infiniteloop.py.

while True:
    print('Hello world!')

When you run this program, it will print Hello world! to the screen forever, because the while statement’s condition is always True. In IDLE’s interactive shell window, there are only two ways to stop this program: press CTRL-C or select Shell ▸ Restart Shell from the menu. CTRL-C is handy if you ever want to terminate your program immediately, even if it’s not stuck in an infinite loop.

Let’s use continue to write a program that asks for a name and password. Enter the following code into a new file editor window and save the program as swordfish.py.

  while True:
      print('Who are you?')
      name = input()
➊     if name != 'Joe':
➋         continue
      print('Hello, Joe. What is the password? (It is a fish.)')
➌     password = input()
      if password == 'swordfish':
➍         break
➎ print('Access granted.')

If the user enters any name besides Joe ➊, the continue statement ➋ causes the program execution to jump back to the start of the loop. When it reevaluates the condition, the execution will always enter the loop, since the condition is simply the value True. Once they make it past that if statement, the user is asked for a password ➌. If the password entered is swordfish, then the break statement ➍ is run, and the execution jumps out of the while loop to print Access granted. ➎. Otherwise, the execution continues to the end of the while loop, where it then jumps back to the start of the loop. See Figure 2-13 for this program’s flowchart.

Figure 2-13. A flowchart for swordfish.py. The X path will logically never happen because the loop condition is always True.

“TRUTHY” AND “FALSEY” VALUES

There are some values in other data types that conditions will consider equivalent to True and False. When used in conditions, 0, 0.0, and '' (the empty string) are considered False, while all other values are considered True. For example, look at the following program:

  name = ''
➊ while not name:
      print('Enter your name:')
      name = input()
  print('How many guests will you have?')
  numOfGuests = int(input())
➋ if numOfGuests:
➌     print('Be sure to have enough room for all your guests.')
  print('Done')

If the user enters a blank string for name, then the while statement’s condition will be True ➊, and the program continues to ask for a name. If the value for numOfGuests is not 0 ➋, then the condition is considered to be True, and the program will print a reminder for the user ➌.

You could have typed name == '' instead of not name, and numOfGuests != 0 instead of numOfGuests, but using the truthy and falsey values can make your code easier to read.
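If you are unsure how a particular value will be treated in a condition, Python’s built-in bool() function (not used elsewhere in this chapter) converts it to True or False for you. A quick sketch:

```python
# bool() shows how a value is treated when it is used as a condition.
print(bool(0))         # False
print(bool(0.0))       # False
print(bool(''))        # False
print(bool(42))        # True
print(bool(-1.5))      # True
print(bool('hello'))   # True
print(bool(' '))       # True: a string holding one space is not empty
```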

Run the swordfish.py program and give it some input. Until you claim to be Joe, it shouldn’t ask for a password, and once you enter the correct password, it should exit.

Who are you?
I'm fine, thanks. Who are you?
Who are you?
Joe
Hello, Joe. What is the password? (It is a fish.)
Mary
Who are you?
Joe
Hello, Joe. What is the password? (It is a fish.)
swordfish
Access granted.
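If you want to replay that exact session without typing, the sketch below swaps each input() call for the next item of a list of canned answers. Lists and the [ ] indexing syntax have not been covered at this point, and the typedLines and position names are invented for the example, so take it as an illustration only.

```python
# Canned answers standing in for what the user typed in the run above.
typedLines = ["I'm fine, thanks. Who are you?", 'Joe', 'Mary', 'Joe', 'swordfish']
position = 0   # index of the next canned answer to hand out

while True:
    print('Who are you?')
    name = typedLines[position]       # swordfish.py uses input() here
    position = position + 1
    if name != 'Joe':
        continue                      # back to the start of the loop
    print('Hello, Joe. What is the password? (It is a fish.)')
    password = typedLines[position]   # swordfish.py uses input() here
    position = position + 1
    if password == 'swordfish':
        break                         # leave the loop
print('Access granted.')
```

The continue statement fires once (for the first answer), and the break statement fires on the fifth answer.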

for Loops and the range() Function

The while loop keeps looping while its condition is True (which is the reason for its name), but what if you want to execute a block of code only a certain number of times? You can do this with a for loop statement and the range() function.

In code, a for statement looks something like for i in range(5): and always includes the following:

The for keyword
A variable name
The in keyword
A call to the range() function with up to three integers passed to it
A colon
Starting on the next line, an indented block of code (called the for clause)

Let’s create a new program called fiveTimes.py to help you see a for loop in action.

print('My name is')
for i in range(5):
    print('Jimmy Five Times (' + str(i) + ')')

The code in the for loop’s clause is run five times. The first time it is run, the variable i is set to 0. The print() call in the clause will print Jimmy Five Times (0). After Python finishes an iteration through all the code inside the for loop’s clause, the execution goes back to the top of the loop, and the for statement increments i by one. This is why range(5) results in five iterations through the clause, with i being set to 0, then 1, then 2, then 3, and then 4. The variable i will go up to, but will not include, the integer passed to range(). Figure 2-14 shows a flowchart for the fiveTimes.py program.

Figure 2-14. The flowchart for fiveTimes.py

When you run this program, it should print Jimmy Five Times followed by the value of i five times before leaving the for loop.

My name is
Jimmy Five Times (0)
Jimmy Five Times (1)
Jimmy Five Times (2)
Jimmy Five Times (3)
Jimmy Five Times (4)

NOTE

You can use break and continue statements inside for loops as well. The continue statement will continue to the next value of the for loop’s counter, as if the program execution had reached the end of the loop and returned to the start. In fact, you can use continue and break statements only inside while and for loops. If you try to use these statements elsewhere, Python will give you an error.

As another for loop example, consider this story about the mathematician Karl Friedrich Gauss. When Gauss was a boy, a teacher wanted to give the class some busywork. The teacher told them to add up all the numbers from 0 to 100. Young Gauss came up with a clever trick to figure out the answer in a few seconds, but you can write a Python program with a for loop to do this calculation for you.

➊ total = 0
➋ for num in range(101):
➌     total = total + num
➍ print(total)

The result should be 5,050. When the program first starts, the total variable is set to 0 ➊. The for loop ➋ then executes total = total + num ➌ 101 times, once for each value that range(101) produces. By the time the loop has finished all of its 101 iterations, every integer from 0 to 100 will have been added to total. At this point, total is printed to the screen ➍. Even on the slowest computers, this program takes less than a second to complete.

(Young Gauss figured out that there were 50 pairs of numbers that added up to 100: 0 + 100, 1 + 99, 2 + 98, and so on, until 49 + 51. Since 50 × 100 is 5,000, when you add that middle 50, the sum of all the numbers from 0 to 100 is 5,050. Clever kid!)
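The loop and the pairing trick can be checked against each other. This is only a small sketch; the shortcut variable is a name introduced here for illustration.

```python
# Add up 0 through 100 with a for loop, as in the program above.
total = 0
for num in range(101):
    total = total + num

# Gauss-style shortcut: 50 pairs that each add up to 100, plus the middle 50.
shortcut = 50 * 100 + 50

print(total)      # 5050
print(shortcut)   # 5050
```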

An Equivalent while Loop

You can actually use a while loop to do the same thing as a for loop; for loops are just more concise. Let’s rewrite fiveTimes.py to use a while loop equivalent of a for loop.

print('My name is')
i = 0
while i < 5:
    print('Jimmy Five Times (' + str(i) + ')')
    i = i + 1

If you run this program, the output should look the same as the fiveTimes.py program, which uses a for loop.

The Starting, Stopping, and Stepping Arguments to range()

Some functions can be called with multiple arguments separated by a comma, and range() is one of them. This lets you change the integers passed to range() so that the loop follows other sequences of integers, including ones that start at a number other than zero.

for i in range(12, 16):
    print(i)

The first argument will be where the for loop’s variable starts, and the second argument will be up to, but not including, the number to stop at.

12

13

14

15

The range() function can also be called with three arguments. The first two arguments will be the start and stop values, and the third will be the step argument. The step is the amount that the variable is increased by after each iteration.

for i in range(0, 10, 2):
    print(i)

So calling range(0, 10, 2) will count from zero to eight by intervals of two.

0

2

4

6

8

The range() function is flexible in the sequence of numbers it produces for for loops. For example (I never apologize for my puns), you can even use a negative number for the step argument to make the for loop count down instead of up.

for i in range(5, -1, -1):
    print(i)

Running a for loop to print i with range(5, -1, -1) should print from five down to zero.

5

4

3

2

1

0
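To compare the one-, two-, and three-argument forms side by side, this sketch collects each sequence with the built-in list() function instead of printing inside a loop. The list() call and the variable names are assumptions made for this illustration; the chapter itself has not introduced them.

```python
# list() collects the numbers a range() call would hand to a for loop.
oneArg = list(range(5))              # start defaults to 0, step defaults to 1
twoArgs = list(range(12, 16))        # start at 12, stop before 16
threeArgs = list(range(0, 10, 2))    # count up by twos, stop before 10
countdown = list(range(5, -1, -1))   # count down from 5, stop before -1

print(oneArg)      # [0, 1, 2, 3, 4]
print(twoArgs)     # [12, 13, 14, 15]
print(threeArgs)   # [0, 2, 4, 6, 8]
print(countdown)   # [5, 4, 3, 2, 1, 0]
```

In every case the stop value itself is left out, which is why the countdown passes -1 in order to include 0.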

Importing Modules

All Python programs can call a basic set of functions called built-in functions, including the print(), input(), and len() functions you’ve seen before. Python also comes with a set of modules called the standard library. Each module is a Python program that contains a related group of functions that can be embedded in your programs. For example, the math module has mathematics-related functions, the random module has random number–related functions, and so on.

Before you can use the functions in a module, you must import the module with an import statement. In code, an import statement consists of the following:

The import keyword
The name of the module
Optionally, more module names, as long as they are separated by commas

Once you import a module, you can use all the cool functions of that module. Let’s give it a try with the random module, which will give us access to the random.randint() function.

Enter this code into the file editor, and save it as printRandom.py:

import random
for i in range(5):
    print(random.randint(1, 10))

When you run this program, the output will look something like this:

4

1

8

4

1

The random.randint() function call evaluates to a random integer value between the two integers that you pass it, including those two integers themselves. Since randint() is in the random module, you must first type random. in front of the function name to tell Python to look for this function inside the random module.

Here’s an example of an import statement that imports four different modules:

import random, sys, os, math

Now we can use any of the functions in these four modules. We’ll learn more about them later in the book.

from import Statements

An alternative form of the import statement is composed of the from keyword, followed by the module name, the import keyword, and a star; for example, from random import *.

With this form of import statement, calls to functions in random will not need the random. prefix. However, using the full name makes for more readable code, so it is better to use the normal form of the import statement.

Ending a Program Early with sys.exit()

The last flow control concept to cover is how to terminate the program. This always happens if the program execution reaches the bottom of the instructions. However, you can cause the program to terminate, or exit, by calling the sys.exit() function. Since this function is in the sys module, you have to import sys before your program can use it.

Open a new file editor window and enter the following code, saving it as exitExample.py:

import sys

while True:
    print('Type exit to exit.')
    response = input()
    if response == 'exit':
        sys.exit()
    print('You typed ' + response + '.')

Run this program in IDLE. This program has an infinite loop with no break statement inside. The only way this program will end is if the user enters exit, causing sys.exit() to be called. When response is equal to exit, the program ends. Since the response variable is set by the input() function, the user must enter exit in order to stop the program.
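To try the same idea without typing anything, here is a minimal sketch that replaces input() with a fixed sequence of responses. The responses list, the echoed list, and the try and except statements around the loop are assumptions added for this illustration (try and except are covered in Chapter 3). sys.exit() works by raising a SystemExit exception, which is why the except clause can notice that it was called.

```python
import sys

# Stand-in for what a user might type; a real program would call input().
responses = ['hello', 'spam', 'exit', 'never reached']
echoed = []
exited = False

try:
    for response in responses:
        if response == 'exit':
            sys.exit()              # raises SystemExit, which normally ends the program
        echoed.append('You typed ' + response + '.')
except SystemExit:
    # Caught here only so the sketch can show that sys.exit() was reached.
    exited = True

print(echoed)   # ['You typed hello.', 'You typed spam.']
print(exited)   # True
```

Nothing after 'exit' is processed, which mirrors how exitExample.py stops as soon as the user types exit.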

Summary

By using expressions that evaluate to True or False (also called conditions), you can write programs that make decisions on what code to execute and what code to skip. You can also execute code over and over again in a loop while a certain condition evaluates to True. The break and continue statements are useful if you need to exit a loop or jump back to the start.

These flow control statements will let you write much more intelligent programs. There’s another type of flow control that you can achieve by writing your own functions, which is the topic of the next chapter.

Practice Questions

Q: 1. What are the two values of the Boolean data type? How do you write them?

Q: 2. What are the three Boolean operators?

Q: 3. Write out the truth tables of each Boolean operator (that is, every possible combination of Boolean values for the operator and what they evaluate to).

Q: 4. What do the following expressions evaluate to?

(5 > 4) and (3 == 5)
not (5 > 4)
(5 > 4) or (3 == 5)
not ((5 > 4) or (3 == 5))
(True and True) and (True == False)
(not False) or (not True)

Q: 5. What are the six comparison operators?

Q: 6. What is the difference between the equal to operator and the assignment operator?

Q: 7. Explain what a condition is and where you would use one.

Q: 8. Identify the three blocks in this code:

spam = 0
if spam == 10:
    print('eggs')
    if spam > 5:
        print('bacon')
    else:
        print('ham')
    print('spam')
print('spam')

Q: 9. Write code that prints Hello if 1 is stored in spam, prints Howdy if 2 is stored in spam, and prints Greetings! if anything else is stored in spam.

Q: 10. What can you press if your program is stuck in an infinite loop?

Q: 11. What is the difference between break and continue?

Q: 12. What is the difference between range(10), range(0, 10), and range(0, 10, 1) in a for loop?

Q: 13. Write a short program that prints the numbers 1 to 10 using a for loop. Then write an equivalent program that prints the numbers 1 to 10 using a while loop.

Q: 14. If you had a function named bacon() inside a module named spam, how would you call it after importing spam?

Extra credit: Look up the round() and abs() functions on the Internet, and find out what they do. Experiment with them in the interactive shell.

Chapter 3. Functions

You’re already familiar with the print(), input(), and len() functions from the previous chapters. Python provides several built-in functions like these, but you can also write your own functions. A function is like a mini-program within a program.

To better understand how functions work, let’s create one. Type this program into the file editor and save it as helloFunc.py:

➊ def hello():
➋     print('Howdy!')
      print('Howdy!!!')
      print('Hello there.')

➌ hello()
  hello()
  hello()

The first line is a def statement ➊, which defines a function named hello(). The code in the block that follows the def statement ➋ is the body of the function. This code is executed when the function is called, not when the function is first defined.

The hello() lines after the function ➌ are function calls. In code, a function call is just the function’s name followed by parentheses, possibly with some number of arguments in between the parentheses. When the program execution reaches these calls, it will jump to the top line in the function and begin executing the code there. When it reaches the end of the function, the execution returns to the line that called the function and continues moving through the code as before.

Since this program calls hello() three times, the code in the hello() function is executed three times. When you run this program, the output looks like this:

Howdy!
Howdy!!!
Hello there.
Howdy!
Howdy!!!
Hello there.
Howdy!
Howdy!!!
Hello there.

A major purpose of functions is to group code that gets executed multiple times. Without a function defined, you would have to copy and paste this code each time, and the program would look like this:

print('Howdy!')
print('Howdy!!!')
print('Hello there.')
print('Howdy!')
print('Howdy!!!')
print('Hello there.')
print('Howdy!')
print('Howdy!!!')
print('Hello there.')

In general, you always want to avoid duplicating code, because if you ever decide to update the code (if, for example, you find a bug you need to fix), you’ll have to remember to change the code everywhere you copied it.

As you get more programming experience, you’ll often find yourself deduplicating code, which means getting rid of duplicated or copy-and-pasted code. Deduplication makes your programs shorter, easier to read, and easier to update.

def Statements with Parameters

When you call the print() or len() function, you pass in values, called arguments in this context, by typing them between the parentheses. You can also define your own functions that accept arguments. Type this example into the file editor and save it as helloFunc2.py:

➊ def hello(name):
➋     print('Hello ' + name)

➌ hello('Alice')
  hello('Bob')

When you run this program, the output looks like this:

Hello Alice
Hello Bob

The definition of the hello() function in this program has a parameter called name ➊. A parameter is a variable that an argument is stored in when a function is called. The first time the hello() function is called, it’s with the argument 'Alice' ➌. The program execution enters the function, and the variable name is automatically set to 'Alice', which is what gets printed by the print() call ➋.

One special thing to note about parameters is that the value stored in a parameter is forgotten when the function returns. For example, if you added print(name) after hello('Bob') in the previous program, the program would give you a NameError because there is no variable named name. This variable was destroyed after the function call hello('Bob') had returned, so print(name) would refer to a name variable that does not exist.

This is similar to how a program’s variables are forgotten when the program terminates. I’ll talk more about why that happens later in the chapter, when I discuss what a function’s local scope is.
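The forgotten parameter can be observed directly. The following sketch wraps the offending print(name) line in try and except statements (covered later in this chapter) so the program keeps running; the errorSeen variable is a name made up for this illustration.

```python
def hello(name):
    print('Hello ' + name)

hello('Bob')

# name existed only while hello() was running, so using it out here fails.
try:
    print(name)
    errorSeen = False
except NameError:
    errorSeen = True

print(errorSeen)   # True
```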

Return Values and return Statements

When you call the len() function and pass it an argument such as 'Hello', the function call evaluates to the integer value 5, which is the length of the string you passed it. In general, the value that a function call evaluates to is called the return value of the function.

When creating a function using the def statement, you can specify what the return value should be with a return statement. A return statement consists of the following:

The return keyword
The value or expression that the function should return

When an expression is used with a return statement, the return value is what this expression evaluates to. For example, the following program defines a function that returns a different string depending on what number it is passed as an argument. Type this code into the file editor and save it as magic8Ball.py:

➊ import random

➋ def getAnswer(answerNumber):
➌     if answerNumber == 1:
          return 'It is certain'
      elif answerNumber == 2:
          return 'It is decidedly so'
      elif answerNumber == 3:
          return 'Yes'
      elif answerNumber == 4:
          return 'Reply hazy try again'
      elif answerNumber == 5:
          return 'Ask again later'
      elif answerNumber == 6:
          return 'Concentrate and ask again'
      elif answerNumber == 7:
          return 'My reply is no'
      elif answerNumber == 8:
          return 'Outlook not so good'
      elif answerNumber == 9:
          return 'Very doubtful'

➍ r = random.randint(1, 9)
➎ fortune = getAnswer(r)
➏ print(fortune)

When this program starts, Python first imports the random module ➊. Then the getAnswer() function is defined ➋. Because the function is being defined (and not called), the execution skips over the code in it. Next, the random.randint() function is called with two arguments, 1 and 9 ➍. It evaluates to a random integer between 1 and 9 (including 1 and 9 themselves), and this value is stored in a variable named r.

The getAnswer() function is called with r as the argument ➎. The program execution moves to the top of the getAnswer() function ➌, and the value r is stored in a parameter named answerNumber. Then, depending on this value in answerNumber, the function returns one of many possible string values. The program execution returns to the line at the bottom of the program that originally called getAnswer() ➎. The returned string is assigned to a variable named fortune, which then gets passed to a print() call ➏ and is printed to the screen.

Note that since you can pass return values as an argument to another function call, you could shorten these three lines:

r = random.randint(1, 9)
fortune = getAnswer(r)
print(fortune)

to this single equivalent line:

print(getAnswer(random.randint(1, 9)))

Remember, expressions are composed of values and operators. A function call can be used in an expression because it evaluates to its return value.

The None Value

In Python there is a value called None, which represents the absence of a value. None is the only value of the NoneType data type. (Other programming languages might call this value null, nil, or undefined.) Just like the Boolean True and False values, None must be typed with a capital N.

This value-without-a-value can be helpful when you need to store something that won’t be confused for a real value in a variable. One place where None is used is as the return value of print(). The print() function displays text on the screen, but it doesn’t need to return anything in the same way len() or input() does. But since all function calls need to evaluate to a return value, print() returns None. To see this in action, enter the following into the interactive shell:

>>> spam = print('Hello!')
Hello!
>>> None == spam
True

Behind the scenes, Python adds return None to the end of any function definition with no return statement. This is similar to how a while or for loop implicitly ends with a continue statement. Also, if you use a return statement without a value (that is, just the return keyword by itself), then None is returned.
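Both ways of getting None back from your own functions can be seen in a short sketch. The function names greet() and bareReturn() are hypothetical, chosen only for this example, and the comparison uses is None, which is a common alternative to the == None comparison shown in the shell session.

```python
def greet():
    # No return statement, so Python returns None automatically.
    print('Hello!')

def bareReturn():
    return    # a bare return keyword also returns None

result1 = greet()
result2 = bareReturn()

print(result1 is None)   # True
print(result2 is None)   # True
```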

Keyword Arguments and print()

Most arguments are identified by their position in the function call. For example, random.randint(1, 10) is different from random.randint(10, 1). The function call random.randint(1, 10) will return a random integer between 1 and 10, because the first argument is the low end of the range and the second argument is the high end (while random.randint(10, 1) causes an error).

However, keyword arguments are identified by the keyword put before them in the function call. Keyword arguments are often used for optional parameters. For example, the print() function has the optional parameters end and sep to specify what should be printed at the end of its arguments and between its arguments (separating them), respectively.

If you ran the following program:

print('Hello')
print('World')

the output would look like this:

Hello
World

The two strings appear on separate lines because the print() function automatically adds a newline character to the end of the string it is passed. However, you can set the end keyword argument to change this to a different string. For example, if the program were this:

print('Hello', end='')
print('World')

the output would look like this:

HelloWorld

The output is printed on a single line because there is no longer a newline printed after 'Hello'. Instead, the blank string is printed. This is useful if you need to disable the newline that gets added to the end of every print() function call.

Similarly, when you pass multiple string values to print(), the function will automatically separate them with a single space. Enter the following into the interactive shell:

>>> print('cats', 'dogs', 'mice')
cats dogs mice

But you could replace the default separating string by passing the sep keyword argument. Enter the following into the interactive shell:

>>> print('cats', 'dogs', 'mice', sep=',')
cats,dogs,mice

You can add keyword arguments to the functions you write as well, but first you’ll have to learn about the list and dictionary data types in the next two chapters. For now, just know that some functions have optional keyword arguments that can be specified when the function is called.
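To see exactly which characters end and sep produce, this sketch sends the output of several print() calls into an in-memory text buffer and then inspects the result. The file keyword argument and the io.StringIO buffer are not discussed in this chapter; they are used here only as an assumption-free way to capture what print() writes, and the variable names are made up for the illustration.

```python
import io

# Collect everything print() writes in an in-memory buffer instead of the screen.
buffer = io.StringIO()

print('cats', 'dogs', 'mice', file=buffer)             # default sep=' ' and end='\n'
print('cats', 'dogs', 'mice', sep=',', file=buffer)    # commas instead of spaces
print('Hello', end='', file=buffer)                    # no newline after Hello
print('World', file=buffer)

captured = buffer.getvalue()
print(repr(captured))
# 'cats dogs mice\ncats,dogs,mice\nHelloWorld\n'
```

The repr() call displays the newline characters as \n, which makes the effect of end='' easy to spot.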

Local and Global Scope

Parameters and variables that are assigned in a called function are said to exist in that function’s local scope. Variables that are assigned outside all functions are said to exist in the global scope. A variable that exists in a local scope is called a local variable, while a variable that exists in the global scope is called a global variable. A variable must be one or the other; it cannot be both local and global.

Think of a scope as a container for variables. When a scope is destroyed, all the values stored in the scope’s variables are forgotten. There is only one global scope, and it is created when your program begins. When your program terminates, the global scope is destroyed, and all its variables are forgotten. Otherwise, the next time you ran your program, the variables would remember their values from the last time you ran it.

A local scope is created whenever a function is called. Any variables assigned in this function exist within the local scope. When the function returns, the local scope is destroyed, and these variables are forgotten. The next time you call this function, the local variables will not remember the values stored in them from the last time the function was called.

Scopes matter for several reasons:

Code in the global scope cannot use any local variables.
However, a local scope can access global variables.
Code in a function’s local scope cannot use variables in any other local scope.
You can use the same name for different variables if they are in different scopes. That is, there can be a local variable named spam and a global variable also named spam.

The reason Python has different scopes instead of just making everything a global variable is so that when variables are modified by the code in a particular call to a function, the function interacts with the rest of the program only through its parameters and the return value. This narrows down the list of code lines that may be causing a bug. If your program contained nothing but global variables and had a bug because of a variable being set to a bad value, then it would be hard to track down where this bad value was set. It could have been set from anywhere in the program, and your program could be hundreds or thousands of lines long! But if the bug is because of a local variable with a bad value, you know that only the code in that one function could have set it incorrectly.

While using global variables in small programs is fine, it is a bad habit to rely on global variables as your programs get larger and larger.

Local Variables Cannot Be Used in the Global Scope

Consider this program, which will cause an error when you run it:

def spam():
    eggs = 31337
spam()
print(eggs)

If you run this program, the output will look like this:

Traceback (most recent call last):
  File "C:/test3784.py", line 4, in <module>
    print(eggs)
NameError: name 'eggs' is not defined

The error happens because the eggs variable exists only in the local scope created when spam() is called. Once the program execution returns from spam(), that local scope is destroyed, and there is no longer a variable named eggs. So when your program tries to run print(eggs), Python gives you an error saying that eggs is not defined. This makes sense if you think about it; when the program execution is in the global scope, no local scopes exist, so there can’t be any local variables. This is why only global variables can be used in the global scope.

Local Scopes Cannot Use Variables in Other Local Scopes

A new local scope is created whenever a function is called, including when a function is called from another function. Consider this program:

  def spam():
➊     eggs = 99
➋     bacon()
➌     print(eggs)

  def bacon():
      ham = 101
➍     eggs = 0

➎ spam()

When the program starts, the spam() function is called ➎, and a local scope is created. The local variable eggs ➊ is set to 99. Then the bacon() function is called ➋, and a second local scope is created. Multiple local scopes can exist at the same time. In this new local scope, the local variable ham is set to 101, and a local variable eggs (which is different from the one in spam()’s local scope) is also created ➍ and set to 0.

When bacon() returns, the local scope for that call is destroyed. The program execution continues in the spam() function to print the value of eggs ➌, and since the local scope for the call to spam() still exists here, the eggs variable is set to 99. This is what the program prints.

The upshot is that local variables in one function are completely separate from the local variables in another function.

Global Variables Can Be Read from a Local Scope

Consider the following program:

def spam():
    print(eggs)
eggs = 42
spam()
print(eggs)

Since there is no parameter named eggs or any code that assigns eggs a value in the spam() function, when eggs is used in spam(), Python considers it a reference to the global variable eggs. This is why 42 is printed by the call to spam() when the previous program is run, and then printed again by the final print(eggs) line.

Local and Global Variables with the Same Name

To simplify your life, avoid using local variables that have the same name as a global variable or another local variable. But technically, it’s perfectly legal to do so in Python. To see what happens, type the following code into the file editor and save it as sameName.py:

  def spam():
➊     eggs = 'spam local'
      print(eggs) # prints 'spam local'

  def bacon():
➋     eggs = 'bacon local'
      print(eggs) # prints 'bacon local'
      spam()
      print(eggs) # prints 'bacon local'

➌ eggs = 'global'
  bacon()
  print(eggs) # prints 'global'

When you run this program, it outputs the following:

bacon local
spam local
bacon local
global

There are actually three different variables in this program, but confusingly they are all named eggs. The variables are as follows:

➊ A variable named eggs that exists in a local scope when spam() is called.

➋ A variable named eggs that exists in a local scope when bacon() is called.

➌ A variable named eggs that exists in the global scope.

Since these three separate variables all have the same name, it can be confusing to keep track of which one is being used at any given time. This is why you should avoid using the same variable name in different scopes.

The global Statement

If you need to modify a global variable from within a function, use the global statement. If you have a line such as global eggs at the top of a function, it tells Python, “In this function, eggs refers to the global variable, so don’t create a local variable with this name.” For example, type the following code into the file editor and save it as sameName2.py:

  def spam():
➊     global eggs
➋     eggs = 'spam'

  eggs = 'global'
  spam()
  print(eggs)

When you run this program, the final print() call will output this:

spam

Because eggs is declared global at the top of spam() ➊, when eggs is set to 'spam' ➋, this assignment is done to the globally scoped eggs. No local eggs variable is created.

There are four rules to tell whether a variable is in a local scope or global scope:

1. If a variable is being used in the global scope (that is, outside of all functions), then it is always a global variable.

2. If there is a global statement for that variable in a function, it is a global variable.

3. Otherwise, if the variable is used in an assignment statement in the function, it is a local variable.

4. But if the variable is not used in an assignment statement, it is a global variable.

To get a better feel for these rules, here’s an example program. Type the following code into the file editor and save it as sameName3.py:

  def spam():
➊     global eggs
      eggs = 'spam' # this is the global

  def bacon():
➋     eggs = 'bacon' # this is a local

  def ham():
➌     print(eggs) # this is the global

  eggs = 42 # this is the global
  spam()
  print(eggs)

In the spam() function, eggs is the global eggs variable, because there’s a global statement for eggs at the beginning of the function ➊. In bacon(), eggs is a local variable, because there’s an assignment statement for it in that function ➋. In ham() ➌, eggs is the global variable, because there is no assignment statement or global statement for it in that function. If you run sameName3.py, the output will look like this:

spam

In a function, a variable will either always be global or always be local. There’s no way that the code in a function can use a local variable named eggs and then later in that same function use the global eggs variable.
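The four rules can also be exercised in one run. The sketch below is a variation on sameName3.py in which bacon() and ham() return their values instead of printing them, so the results can be compared; the variables baconValue, beforeSpam, and afterSpam are names introduced only for this illustration.

```python
def spam():
    global eggs
    eggs = 'spam'      # rule 2: global statement, so this changes the global

def bacon():
    eggs = 'bacon'     # rule 3: assignment without global, so this is a local
    return eggs

def ham():
    return eggs        # rule 4: no assignment here, so this reads the global

eggs = 42              # rule 1: assigned outside all functions, so it is global

baconValue = bacon()   # the local eggs in bacon() does not touch the global
beforeSpam = ham()
spam()                 # now the global eggs is changed
afterSpam = ham()

print(baconValue)   # bacon
print(beforeSpam)   # 42
print(afterSpam)    # spam
```

Calling bacon() leaves the global untouched, while calling spam() changes what ham() sees afterward.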

NOTE

If you ever want to modify the value stored in a global variable from in a function, you must use a global statement on that variable.

If you try to use a local variable in a function before you assign a value to it, as in the following program, Python will give you an error. To see this, type the following into the file editor and save it as sameName4.py:

  def spam():
      print(eggs) # ERROR!
➊     eggs = 'spam local'

➋ eggs = 'global'
  spam()

If you run the previous program, it produces an error message.

Traceback (most recent call last):
  File "C:/test3784.py", line 6, in <module>
    spam()
  File "C:/test3784.py", line 2, in spam
    print(eggs) # ERROR!
UnboundLocalError: local variable 'eggs' referenced before assignment

This error happens because Python sees that there is an assignment statement for eggs in the spam() function ➊ and therefore considers eggs to be local. But because print(eggs) is executed before eggs is assigned anything, the local variable eggs doesn’t exist. Python will not fall back to using the global eggs variable ➋.
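The same mistake can be reproduced without crashing the program by catching the error with try and except statements, which are covered later in this chapter. This is only a sketch; the errorName variable and the as err part of the except line are additions made for the illustration.

```python
def spam():
    print(eggs)            # eggs is local here because of the assignment below
    eggs = 'spam local'

eggs = 'global'

try:
    spam()
    errorName = None
except UnboundLocalError as err:
    # Record which kind of error Python raised.
    errorName = type(err).__name__

print(errorName)   # UnboundLocalError
print(eggs)        # global
```

The global eggs still holds 'global' afterward, because the assignment inside spam() never ran and would only have created a local variable anyway.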

FUNCTIONS AS “BLACK BOXES”

Often, all you need to know about a function are its inputs (the parameters) and output value; you don’t always have to burden yourself with how the function’s code actually works. When you think about functions in this high-level way, it’s common to say that you’re treating the function as a “black box.”

This idea is fundamental to modern programming. Later chapters in this book will show you several modules with functions that were written by other people. While you can take a peek at the source code if you’re curious, you don’t need to know how these functions work in order to use them. And because writing functions without global variables is encouraged, you usually don’t have to worry about the function’s code interacting with the rest of your program.

Exception Handling

Right now, getting an error, or exception, in your Python program means the entire program will crash. You don’t want this to happen in real-world programs. Instead, you want the program to detect errors, handle them, and then continue to run.

For example, consider the following program, which has a “divide-by-zero” error. Open a new file editor window and enter the following code, saving it as zeroDivide.py:

def spam(divideBy):
    return 42 / divideBy

print(spam(2))
print(spam(12))
print(spam(0))
print(spam(1))

We’ve defined a function called spam, given it a parameter, and then printed the value of that function with various arguments to see what happens. This is the output you get when you run the previous code:

21.0
3.5
Traceback (most recent call last):
  File "C:/zeroDivide.py", line 6, in <module>
    print(spam(0))
  File "C:/zeroDivide.py", line 2, in spam
    return 42 / divideBy
ZeroDivisionError: division by zero

A ZeroDivisionError happens whenever you try to divide a number by zero. From the line number given in the error message, you know that the return statement in spam() is causing an error.

Errors can be handled with try and except statements. The code that could potentially have an error is put in a try clause. The program execution moves to the start of a following except clause if an error happens.

You can put the previous divide-by-zero code in a try clause and have an except clause contain code to handle what happens when this error occurs.

def spam(divideBy):
    try:
        return 42 / divideBy
    except ZeroDivisionError:
        print('Error: Invalid argument.')

print(spam(2))
print(spam(12))
print(spam(0))
print(spam(1))

When code in a try clause causes an error, the program execution immediately moves to the code in the except clause. After running that code, the execution continues as normal. The output of the previous program is as follows:

21.0
3.5
Error: Invalid argument.
None
42.0

The None in this output is the return value of spam(0): the except clause prints a message but has no return statement, so the function returns None, which the print() call then displays.

Note that any errors that occur in function calls in a try block will also be caught. Consider the following program, which instead has the spam() calls in the try block:

def spam(divideBy):
    return 42 / divideBy

try:
    print(spam(2))
    print(spam(12))
    print(spam(0))
    print(spam(1))
except ZeroDivisionError:
    print('Error: Invalid argument.')

When this program is run, the output looks like this:

21.0
3.5
Error: Invalid argument.

The print(spam(1)) line is never executed because once the execution jumps to the code in the except clause, it does not return to the try clause. Instead, it just continues moving down as normal.
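The first arrangement, with the try clause inside the function, can be checked by collecting the return values rather than printing them one at a time. This is a sketch only; the results list and the for loop over a list of test values are additions for the illustration (lists are covered in Chapter 4).

```python
def spam(divideBy):
    try:
        return 42 / divideBy
    except ZeroDivisionError:
        print('Error: Invalid argument.')
        # No return statement runs on this path, so the call returns None.

results = []
for value in [2, 12, 0, 1]:
    results.append(spam(value))

print(results)   # [21.0, 3.5, None, 42.0]
```

Because the error is handled inside spam(), the loop keeps going after the zero and still reaches the final call.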

A Short Program: Guess the Number

The toy examples I’ve shown you so far are useful for introducing basic concepts, but now let’s see how everything you’ve learned comes together in a more complete program. In this section, I’ll show you a simple “guess the number” game. When you run this program, the output will look something like this:

I am thinking of a number between 1 and 20.
Take a guess.
10
Your guess is too low.
Take a guess.
15
Your guess is too low.
Take a guess.
17
Your guess is too high.
Take a guess.
16
Good job! You guessed my number in 4 guesses!

Type the following source code into the file editor, and save the file as guessTheNumber.py:

# This is a guess the number game.
import random
secretNumber = random.randint(1, 20)
print('I am thinking of a number between 1 and 20.')

# Ask the player to guess 6 times.
for guessesTaken in range(1, 7):
    print('Take a guess.')
    guess = int(input())

    if guess < secretNumber:
        print('Your guess is too low.')
    elif guess > secretNumber:
        print('Your guess is too high.')
    else:
        break    # This condition is the correct guess!

if guess == secretNumber:
    print('Good job! You guessed my number in ' + str(guessesTaken) + ' guesses!')
else:
    print('Nope. The number I was thinking of was ' + str(secretNumber))

Let’s look at this code line by line, starting at the top.

# This is a guess the number game.
import random
secretNumber = random.randint(1, 20)

First, a comment at the top of the code explains what the program does. Then, the program imports the random module so that it can use the random.randint() function to generate a number for the user to guess. The return value, a random integer between 1 and 20, is stored in the variable secretNumber.

print('I am thinking of a number between 1 and 20.')

# Ask the player to guess 6 times.
for guessesTaken in range(1, 7):
    print('Take a guess.')
    guess = int(input())

The program tells the player that it has come up with a secret number and will give the player six chances to guess it. The code that lets the player enter a guess and checks that guess is in a for loop that will loop at most six times. The first thing that happens in the loop is that the player types in a guess. Since input() returns a string, its return value is passed straight into int(), which translates the string into an integer value. This gets stored in a variable named guess.

    if guess < secretNumber:
        print('Your guess is too low.')
    elif guess > secretNumber:
        print('Your guess is too high.')

These few lines of code check to see whether the guess is less than or greater than the secret number. In either case, a hint is printed to the screen.

    else:
        break    # This condition is the correct guess!

If the guess is neither higher nor lower than the secret number, then it must be equal to the secret number, in which case you want the program execution to break out of the for loop.

if guess == secretNumber:
    print('Good job! You guessed my number in ' + str(guessesTaken) + ' guesses!')
else:
    print('Nope. The number I was thinking of was ' + str(secretNumber))

After the for loop, the previous if…else statement checks whether the player has correctly guessed the number and prints an appropriate message to the screen. In both cases, the program displays a variable that contains an integer value (guessesTaken and secretNumber). Since it must concatenate these integer values to strings, it passes these variables to the str() function, which returns the string value form of these integers. Now these strings can be concatenated with the + operators before finally being passed to the print() function call.

Summary

Functions are the primary way to compartmentalize your code into logical groups. Since the variables in functions exist in their own local scopes, the code in one function cannot directly affect the values of variables in other functions. This limits what code could be changing the values of your variables, which can be helpful when it comes to debugging your code.

Functions are a great tool to help you organize your code. You can think of them as black boxes: They have inputs in the form of parameters and outputs in the form of return values, and the code in them doesn’t affect variables in other functions.

In previous chapters, a single error could cause your programs to crash. In this chapter, you learned about try and except statements, which can run code when an error has been detected. This can make your programs more resilient to common error cases.

Practice Questions

Q: 1. Why are functions advantageous to have in your programs?

Q: 2. When does the code in a function execute: when the function is defined or when the function is called?

Q: 3. What statement creates a function?

Q: 4. What is the difference between a function and a function call?

Q: 5. How many global scopes are there in a Python program? How many local scopes?

Q: 6. What happens to variables in a local scope when the function call returns?

Q: 7. What is a return value? Can a return value be part of an expression?

Q: 8. If a function does not have a return statement, what is the return value of a call to that function?

Q: 9. How can you force a variable in a function to refer to the global variable?

Q: 10. What is the data type of None?

Q: 11. What does the import areallyourpetsnamederic statement do?

Q: 12. If you had a function named bacon() in a module named spam, how would you call it after importing spam?

Q: 13. How can you prevent a program from crashing when it gets an error?

Q: 14. What goes in the try clause? What goes in the except clause?

Practice Projects

For practice, write programs to do the following tasks.

The Collatz Sequence

Write a function named collatz() that has one parameter named number. If number is even, then collatz() should print number // 2 and return this value. If number is odd, then collatz() should print and return 3 * number + 1.

Then write a program that lets the user type in an integer and that keeps calling collatz() on that number until the function returns the value 1. (Amazingly enough, this sequence appears to work for any positive integer: sooner or later, using this sequence, you’ll arrive at 1! Even mathematicians aren’t sure why. Your program is exploring what’s called the Collatz sequence, sometimes called “the simplest impossible math problem.”)

Remember to convert the return value from input() to an integer with the int() function; otherwise, it will be a string value.

Hint: An integer number is even if number % 2 == 0, and it’s odd if number % 2 == 1.

The output of this program could look something like this:

Enter number:

3

10

5

16

8

4

2

1

Input Validation

Add try and except statements to the previous project to detect whether the user types in a noninteger string. Normally, the int() function will raise a ValueError error if it is passed a noninteger string, as in int('puppy'). In the except clause, print a message to the user saying they must enter an integer.

Chapter 4. Lists

One more topic you’ll need to understand before you can begin writing programs in earnest is the list data type and its cousin, the tuple. Lists and tuples can contain multiple values, which makes it easier to write programs that handle large amounts of data. And since lists themselves can contain other lists, you can use them to arrange data into hierarchical structures.

In this chapter, I’ll discuss the basics of lists. I’ll also teach you about methods, which are functions that are tied to values of a certain data type. Then I’ll briefly cover the list-like tuple and string data types and how they compare to list values. In the next chapter, I’ll introduce you to the dictionary data type.

The List Data Type

A list is a value that contains multiple values in an ordered sequence. The term list value refers to the list itself (which is a value that can be stored in a variable or passed to a function like any other value), not the values inside the list value. A list value looks like this: ['cat', 'bat', 'rat', 'elephant']. Just as string values are typed with quote characters to mark where the string begins and ends, a list begins with an opening square bracket and ends with a closing square bracket, []. Values inside the list are also called items. Items are separated with commas (that is, they are comma-delimited). For example, enter the following into the interactive shell:

>>>[1,2,3]

[1,2,3]

>>>['cat','bat','rat','elephant']

['cat','bat','rat','elephant']

>>>['hello',3.1415,True,None,42]

['hello',3.1415,True,None,42]

➊>>>spam=['cat','bat','rat','elephant']

>>>spam

['cat','bat','rat','elephant']

The spam variable ➊ is still assigned only one value: the list value. But the list value itself contains other values. The value [] is an empty list that contains no values, similar to '', the empty string.

Getting Individual Values in a List with Indexes

Say you have the list ['cat', 'bat', 'rat', 'elephant'] stored in a variable named spam. The Python code spam[0] would evaluate to 'cat', and spam[1] would evaluate to 'bat', and so on. The integer inside the square brackets that follows the list is called an index. The first value in the list is at index 0, the second value is at index 1, the third value is at index 2, and so on. Figure 4-1 shows a list value assigned to spam, along with what the index expressions would evaluate to.

Figure 4-1. A list value stored in the variable spam, showing which value each index refers to

For example, type the following expressions into the interactive shell. Start by assigning a list to the variable spam.

>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[0]
'cat'
>>> spam[1]
'bat'
>>> spam[2]
'rat'
>>> spam[3]
'elephant'
>>> ['cat', 'bat', 'rat', 'elephant'][3]
'elephant'
➊ >>> 'Hello ' + spam[0]
➋ 'Hello cat'
>>> 'The ' + spam[1] + ' ate the ' + spam[0] + '.'
'The bat ate the cat.'

Notice that the expression 'Hello ' + spam[0] ➊ evaluates to 'Hello ' + 'cat' because spam[0] evaluates to the string 'cat'. This expression in turn evaluates to the string value 'Hello cat' ➋.

Python will give you an IndexError error message if you use an index that exceeds the number of values in your list value.

>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[10000]
Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    spam[10000]
IndexError: list index out of range

Indexes can be only integer values, not floats. The following example will cause a TypeError error:

>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[1]
'bat'
>>> spam[1.0]
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    spam[1.0]
TypeError: list indices must be integers, not float
>>> spam[int(1.0)]
'bat'

Lists can also contain other list values. The values in these lists of lists can be accessed using multiple indexes, like so:

>>>spam=[['cat','bat'],[10,20,30,40,50]]

>>>spam[0]

['cat','bat']

>>>spam[0][1]

'bat'

>>>spam[1][4]

50

The first index dictates which list value to use, and the second indicates the value within the list value. For example, spam[0][1] evaluates to 'bat', the second value in the first list. If you only use one index, the expression evaluates to the full list value at that index.
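As a small illustration of the two-index idea, here is a sketch that reuses the same list of lists. The variable names wholeInnerList, singleValue, lastNumber, and innerList are made up for this example.

```python
# A list of lists: the outer list holds two inner lists.
spam = [['cat', 'bat'], [10, 20, 30, 40, 50]]

wholeInnerList = spam[0]   # one index: the entire first inner list
singleValue = spam[0][1]   # two indexes: inner list 0, then item 1 of it
lastNumber = spam[1][4]    # inner list 1, then item 4 of it

print(wholeInnerList)
print(singleValue)
print(lastNumber)

# The two-index form is shorthand for indexing in two separate steps.
innerList = spam[1]
print(innerList[4] == spam[1][4])
```

Reading spam[1][4] from left to right as "first get spam[1], then get item 4 of that" is usually the easiest way to keep nested indexes straight.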

Negative Indexes

While indexes start at 0 and go up, you can also use negative integers for the index. The integer value -1 refers to the last index in a list, the value -2 refers to the second-to-last index in a list, and so on. Enter the following into the interactive shell:

>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam[-1]
'elephant'
>>> spam[-3]
'bat'
>>> 'The ' + spam[-1] + ' is afraid of the ' + spam[-3] + '.'
'The elephant is afraid of the bat.'

Getting Sublists with Slices

Just as an index can get a single value from a list, a slice can get several values from a list, in the form of a new list. A slice is typed between square brackets, like an index, but it has two integers separated by a colon. Notice the difference between indexes and slices.

spam[2] is a list with an index (one integer).

spam[1:4] is a list with a slice (two integers).

In a slice, the first integer is the index where the slice starts. The second integer is the index where the slice ends. A slice goes up to, but will not include, the value at the second index. A slice evaluates to a new list value. Enter the following into the interactive shell:

>>>spam=['cat','bat','rat','elephant']

>>>spam[0:4]

['cat','bat','rat','elephant']

>>>spam[1:3]

['bat','rat']

>>>spam[0:-1]

['cat','bat','rat']

As a shortcut, you can leave out one or both of the indexes on either side of the colon in the slice. Leaving out the first index is the same as using 0, or the beginning of the list. Leaving out the second index is the same as using the length of the list, which will slice to the end of the list. Enter the following into the interactive shell:

>>>spam=['cat','bat','rat','elephant']

>>>spam[:2]

['cat','bat']

>>>spam[1:]

['bat','rat','elephant']

>>>spam[:]

['cat','bat','rat','elephant']
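The following sketch gathers the slice shortcuts in one place. The variable names are invented for illustration. One detail worth noting: unlike a single index, a slice whose end goes past the end of the list does not raise an IndexError; it simply stops at the last item.

```python
spam = ['cat', 'bat', 'rat', 'elephant']

firstTwo = spam[:2]    # same as spam[0:2]
lastTwo = spam[2:]     # same as spam[2:4], since the list has 4 items
wholeCopy = spam[:]    # a new list containing every item
tooFar = spam[2:100]   # the end index is past the list, but no error occurs

print(firstTwo)
print(lastTwo)
print(wholeCopy)
print(tooFar)

# Slicing never changes the list it is taken from.
print(spam)
```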

Getting a List’s Length with len()

The len() function will return the number of values that are in a list value passed to it, just like it can count the number of characters in a string value. Enter the following into the interactive shell:

>>>spam=['cat','dog','moose']

>>>len(spam)

3

Changing Values in a List with Indexes

Normally a variable name goes on the left side of an assignment statement, like spam = 42. However, you can also use an index of a list to change the value at that index. For example, spam[1] = 'aardvark' means “Assign the value at index 1 in the list spam to the string 'aardvark'.” Enter the following into the interactive shell:

>>>spam=['cat','bat','rat','elephant']

>>>spam[1]='aardvark'

>>>spam

['cat','aardvark','rat','elephant']

>>>spam[2]=spam[1]

>>>spam

['cat','aardvark','aardvark','elephant']

>>>spam[-1]=12345

>>>spam

['cat','aardvark','aardvark',12345]

List Concatenation and List Replication

The + operator can combine two lists to create a new list value in the same way it combines two strings into a new string value. The * operator can also be used with a list and an integer value to replicate the list. Enter the following into the interactive shell:

>>>[1,2,3]+['A','B','C']

[1,2,3,'A','B','C']

>>>['X','Y','Z']*3

['X','Y','Z','X','Y','Z','X','Y','Z']

>>>spam=[1,2,3]

>>>spam=spam+['A','B','C']

>>>spam

[1,2,3,'A','B','C']

Removing Values from Lists with del Statements

The del statement will delete values at an index in a list. All of the values in the list after the deleted value will be moved up one index. For example, enter the following into the interactive shell:

>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> del spam[2]
>>> spam
['cat', 'bat', 'elephant']
>>> del spam[2]
>>> spam
['cat', 'bat']

The del statement can also be used on a simple variable to delete it, as if it were an “unassignment” statement. If you try to use the variable after deleting it, you will get a NameError error because the variable no longer exists.

In practice, you almost never need to delete simple variables. The del statement is mostly used to delete values from lists.
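To make the two uses of del concrete, here is a short sketch. The names afterItemDelete and nameErrorRaised are hypothetical helpers used only to record what happened.

```python
spam = ['cat', 'bat', 'rat', 'elephant']

# Deleting an item: the values after it move up one index.
del spam[1]
afterItemDelete = spam[:]   # keep a copy of the list's contents at this point
print(afterItemDelete)

# Deleting the variable itself: the name spam no longer exists afterward.
del spam
try:
    print(spam)
    nameErrorRaised = False
except NameError:
    nameErrorRaised = True
    print('spam is no longer defined.')
```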

Working with Lists

When you first begin writing programs, it’s tempting to create many individual variables to store a group of similar values. For example, if I wanted to store the names of my cats, I might be tempted to write code like this:

catName1 = 'Zophie'
catName2 = 'Pooka'
catName3 = 'Simon'
catName4 = 'Lady Macbeth'
catName5 = 'Fat-tail'
catName6 = 'Miss Cleo'

(I don’t actually own this many cats, I swear.) It turns out that this is a bad way to write code. For one thing, if the number of cats changes, your program will never be able to store more cats than you have variables. These types of programs also have a lot of duplicate or nearly identical code in them. Consider how much duplicate code is in the following program, which you should enter into the file editor and save as allMyCats1.py:

print('Enter the name of cat 1:')
catName1 = input()
print('Enter the name of cat 2:')
catName2 = input()
print('Enter the name of cat 3:')
catName3 = input()
print('Enter the name of cat 4:')
catName4 = input()
print('Enter the name of cat 5:')
catName5 = input()
print('Enter the name of cat 6:')
catName6 = input()
print('The cat names are:')
print(catName1 + ' ' + catName2 + ' ' + catName3 + ' ' + catName4 + ' ' +
      catName5 + ' ' + catName6)

Instead of using multiple, repetitive variables, you can use a single variable that contains a list value. For example, here’s a new and improved version of the allMyCats1.py program. This new version uses a single list and can store any number of cats that the user types in. In a new file editor window, type the following source code and save it as allMyCats2.py:

catNames = []
while True:
    print('Enter the name of cat ' + str(len(catNames) + 1) +
          ' (Or enter nothing to stop.):')
    name = input()
    if name == '':
        break
    catNames = catNames + [name]  # list concatenation
print('The cat names are:')
for name in catNames:
    print('  ' + name)

When you run this program, the output will look something like this:

Enter the name of cat 1 (Or enter nothing to stop.):
Zophie
Enter the name of cat 2 (Or enter nothing to stop.):
Pooka
Enter the name of cat 3 (Or enter nothing to stop.):
Simon
Enter the name of cat 4 (Or enter nothing to stop.):
Lady Macbeth
Enter the name of cat 5 (Or enter nothing to stop.):
Fat-tail
Enter the name of cat 6 (Or enter nothing to stop.):
Miss Cleo
Enter the name of cat 7 (Or enter nothing to stop.):

The cat names are:
  Zophie
  Pooka
  Simon
  Lady Macbeth
  Fat-tail
  Miss Cleo

The benefit of using a list is that your data is now in a structure, so your program is much more flexible in processing the data than it would be with several repetitive variables.

Using for Loops with Lists

In Chapter 2, you learned about using for loops to execute a block of code a certain number of times. Technically, a for loop repeats the code block once for each value in a list or list-like value. For example, if you ran this code:

for i in range(4):
    print(i)

the output of this program would be as follows:

0
1
2
3

This is because the return value from range(4) is a list-like value that Python considers similar to [0, 1, 2, 3]. The following program has the same output as the previous one:

for i in [0, 1, 2, 3]:
    print(i)

What the previous for loop actually does is loop through its clause with the variable i set to a successive value in the [0, 1, 2, 3] list in each iteration.

NOTE

In this book, I use the term list-like to refer to data types that are technically named sequences. You don’t need to know the technical definitions of this term, though.

A common Python technique is to use range(len(someList)) with a for loop to iterate over the indexes of a list. For example, enter the following into the interactive shell:

>>> supplies = ['pens', 'staplers', 'flame-throwers', 'binders']
>>> for i in range(len(supplies)):
        print('Index ' + str(i) + ' in supplies is: ' + supplies[i])

Index 0 in supplies is: pens
Index 1 in supplies is: staplers
Index 2 in supplies is: flame-throwers
Index 3 in supplies is: binders

Using range(len(supplies)) in the previously shown for loop is handy because the code in the loop can access the index (as the variable i) and the value at that index (as supplies[i]). Best of all, range(len(supplies)) will iterate through all the indexes of supplies, no matter how many items it contains.

The in and not in Operators

You can determine whether a value is or isn’t in a list with the in and not in operators. Like other operators, in and not in are used in expressions and connect two values: a value to look for in a list and the list where it may be found. These expressions will evaluate to a Boolean value. Enter the following into the interactive shell:

>>> 'howdy' in ['hello', 'hi', 'howdy', 'heyas']
True
>>> spam = ['hello', 'hi', 'howdy', 'heyas']
>>> 'cat' in spam
False
>>> 'howdy' not in spam
False
>>> 'cat' not in spam
True

For example, the following program lets the user type in a pet name and then checks to see whether the name is in a list of pets. Open a new file editor window, enter the following code, and save it as myPets.py:

myPets = ['Zophie', 'Pooka', 'Fat-tail']
print('Enter a pet name:')
name = input()
if name not in myPets:
    print('I do not have a pet named ' + name)
else:
    print(name + ' is my pet.')

The output may look something like this:

Enter a pet name:
Footfoot
I do not have a pet named Footfoot

The Multiple Assignment Trick

The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:

>>> cat = ['fat', 'black', 'loud']
>>> size = cat[0]
>>> color = cat[1]
>>> disposition = cat[2]

you could type this line of code:

>>> cat = ['fat', 'black', 'loud']
>>> size, color, disposition = cat

The number of variables and the length of the list must be exactly equal, or Python will give you a ValueError:

>>> cat = ['fat', 'black', 'loud']
>>> size, color, disposition, name = cat
Traceback (most recent call last):
  File "<pyshell#84>", line 1, in <module>
    size, color, disposition, name = cat
ValueError: need more than 3 values to unpack
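Here is a minimal sketch of the same trick in a program rather than the interactive shell, using try and except to catch the mismatch. The flag unpackFailed is a made-up name for this example. The exact wording of the ValueError message differs between Python versions, so the sketch prints whatever message Python supplies rather than assuming one.

```python
cat = ['fat', 'black', 'loud']

# Three variables, three values: this works.
size, color, disposition = cat
print(size, color, disposition)

# Four variables, three values: Python raises a ValueError.
try:
    size, color, disposition, name = cat
    unpackFailed = False
except ValueError as error:
    unpackFailed = True
    print('Could not unpack the list:', error)
```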

Augmented Assignment Operators

When assigning a value to a variable, you will frequently use the variable itself. For example, after assigning 42 to the variable spam, you would increase the value in spam by 1 with the following code:

>>> spam = 42
>>> spam = spam + 1
>>> spam
43

As a shortcut, you can use the augmented assignment operator += to do the same thing:

>>> spam = 42
>>> spam += 1
>>> spam
43

There are augmented assignment operators for the +, -, *, /, and % operators, described in Table 4-1.

Table 4-1. The Augmented Assignment Operators

Augmented assignment statement    Equivalent assignment statement
spam += 1                         spam = spam + 1
spam -= 1                         spam = spam - 1
spam *= 1                         spam = spam * 1
spam /= 1                         spam = spam / 1
spam %= 1                         spam = spam % 1

The += operator can also do string and list concatenation, and the *= operator can do string and list replication. Enter the following into the interactive shell:

>>> spam = 'Hello'
>>> spam += ' world!'
>>> spam
'Hello world!'
>>> bacon = ['Zophie']
>>> bacon *= 3
>>> bacon
['Zophie', 'Zophie', 'Zophie']

Methods

A method is the same thing as a function, except it is “called on” a value. For example, if a list value were stored in spam, you would call the index() list method (which I’ll explain next) on that list like so: spam.index('hello'). The method part comes after the value, separated by a period.

Each data type has its own set of methods. The list data type, for example, has several useful methods for finding, adding, removing, and otherwise manipulating values in a list.

Finding a Value in a List with the index() Method

List values have an index() method that can be passed a value, and if that value exists in the list, the index of the value is returned. If the value isn’t in the list, then Python produces a ValueError error. Enter the following into the interactive shell:

>>> spam = ['hello', 'hi', 'howdy', 'heyas']
>>> spam.index('hello')
0
>>> spam.index('heyas')
3
>>> spam.index('howdy howdy howdy')
Traceback (most recent call last):
  File "<pyshell#31>", line 1, in <module>
    spam.index('howdy howdy howdy')
ValueError: 'howdy howdy howdy' is not in list

When there are duplicates of the value in the list, the index of its first appearance is returned. Enter the following into the interactive shell, and notice that index() returns 1, not 3:

>>>spam=['Zophie','Pooka','Fat-tail','Pooka']

>>>spam.index('Pooka')

1

Adding Values to Lists with the append() and insert() Methods

To add new values to a list, use the append() and insert() methods. Enter the following into the interactive shell to call the append() method on a list value stored in the variable spam:

>>>spam=['cat','dog','bat']

>>>spam.append('moose')

>>>spam

['cat','dog','bat','moose']

The previous append() method call adds the argument to the end of the list. The insert() method can insert a value at any index in the list. The first argument to insert() is the index for the new value, and the second argument is the new value to be inserted. Enter the following into the interactive shell:

>>>spam=['cat','dog','bat']

>>>spam.insert(1,'chicken')

>>>spam

['cat','chicken','dog','bat']

Notice that the code is spam.append('moose') and spam.insert(1, 'chicken'), not spam = spam.append('moose') and spam = spam.insert(1, 'chicken'). Neither append() nor insert() gives the new value of spam as its return value. (In fact, the return value of append() and insert() is None, so you definitely wouldn’t want to store this as the new variable value.) Rather, the list is modified in place. Modifying a list in place is covered in more detail later in Mutable and Immutable Data Types.
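The mistake described above is easy to reproduce. The sketch below shows what ends up in the variable when the return value of append() or insert() is assigned back; the names result and eggs are used only for this demonstration.

```python
# Correct usage: call the method and ignore its return value.
spam = ['cat', 'dog', 'bat']
result = spam.append('moose')
print(result)   # append() hands back None ...
print(spam)     # ... while the list itself has gained an item

# The mistake: assigning the return value back to the variable.
eggs = ['cat', 'dog', 'bat']
eggs = eggs.insert(1, 'chicken')
print(eggs)     # the variable now holds None, and the list is lost
```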

Methods belong to a single data type. The append() and insert() methods are list methods and can be called only on list values, not on other values such as strings or integers. Enter the following into the interactive shell, and note the AttributeError error messages that show up:

>>> eggs = 'hello'
>>> eggs.append('world')
Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    eggs.append('world')
AttributeError: 'str' object has no attribute 'append'
>>> bacon = 42
>>> bacon.insert(1, 'world')
Traceback (most recent call last):
  File "<pyshell#22>", line 1, in <module>
    bacon.insert(1, 'world')
AttributeError: 'int' object has no attribute 'insert'

Removing Values from Lists with remove()

The remove() method is passed the value to be removed from the list it is called on. Enter the following into the interactive shell:

>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam.remove('bat')
>>> spam
['cat', 'rat', 'elephant']

Attempting to delete a value that does not exist in the list will result in a ValueError error. For example, enter the following into the interactive shell and notice the error that is displayed:

>>> spam = ['cat', 'bat', 'rat', 'elephant']
>>> spam.remove('chicken')
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    spam.remove('chicken')
ValueError: list.remove(x): x not in list

If the value appears multiple times in the list, only the first instance of the value will be removed. Enter the following into the interactive shell:

>>> spam = ['cat', 'bat', 'rat', 'cat', 'hat', 'cat']
>>> spam.remove('cat')
>>> spam
['bat', 'rat', 'cat', 'hat', 'cat']

The del statement is good to use when you know the index of the value you want to remove from the list. The remove() method is good when you know the value you want to remove from the list.
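A side-by-side sketch of the two approaches may help. It also shows one possible way to avoid the ValueError from remove(), by checking with the in operator first; that guard is a suggestion for illustration, not something the chapter requires.

```python
spam = ['cat', 'bat', 'rat', 'cat', 'hat']

# Remove by index: delete whatever sits at index 0.
del spam[0]
print(spam)

# Remove by value: delete the first remaining 'cat', wherever it is.
spam.remove('cat')
print(spam)

# Guard against the ValueError by checking membership first.
if 'chicken' in spam:
    spam.remove('chicken')
else:
    print('There is no chicken to remove.')
```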

Sorting the Values in a List with the sort() Method

Lists of number values or lists of strings can be sorted with the sort() method. For example, enter the following into the interactive shell:

>>> spam = [2, 5, 3.14, 1, -7]
>>> spam.sort()
>>> spam
[-7, 1, 2, 3.14, 5]
>>> spam = ['ants', 'cats', 'dogs', 'badgers', 'elephants']
>>> spam.sort()
>>> spam
['ants', 'badgers', 'cats', 'dogs', 'elephants']

You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order. Enter the following into the interactive shell:

>>> spam.sort(reverse=True)
>>> spam
['elephants', 'dogs', 'cats', 'badgers', 'ants']

There are three things you should note about the sort() method. First, the sort() method sorts the list in place; don’t try to capture the return value by writing code like spam = spam.sort().

Second, you cannot sort lists that have both number values and string values in them, since Python doesn’t know how to compare these values. Type the following into the interactive shell and notice the TypeError error:

>>> spam = [1, 3, 2, 4, 'Alice', 'Bob']
>>> spam.sort()
Traceback (most recent call last):
  File "<pyshell#70>", line 1, in <module>
    spam.sort()
TypeError: unorderable types: str() < int()

Third, sort() uses “ASCIIbetical order” rather than actual alphabetical order for sorting strings. This means uppercase letters come before lowercase letters. Therefore, the lowercase a is sorted so that it comes after the uppercase Z. For an example, enter the following into the interactive shell:

>>> spam = ['Alice', 'ants', 'Bob', 'badgers', 'Carol', 'cats']
>>> spam.sort()
>>> spam
['Alice', 'Bob', 'Carol', 'ants', 'badgers', 'cats']

If you need to sort the values in regular alphabetical order, pass str.lower for the key keyword argument in the sort() method call.

>>> spam = ['a', 'z', 'A', 'Z']
>>> spam.sort(key=str.lower)
>>> spam
['a', 'A', 'z', 'Z']

This causes the sort() method to treat all the items in the list as if they were lowercase without actually changing the values in the list.
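The three points about sort() can be checked together in one short script. This is only a sketch: asciiOrder, alphaOrder, and mixedSortFailed are names invented here to record the results.

```python
spam = ['Alice', 'ants', 'Bob', 'badgers', 'Carol', 'cats']

# Default sort: uppercase letters come before lowercase ones.
spam.sort()
asciiOrder = spam[:]
print(asciiOrder)

# key=str.lower compares the items as if they were lowercase,
# but the items stored in the list keep their original case.
spam.sort(key=str.lower)
alphaOrder = spam[:]
print(alphaOrder)

# A list that mixes numbers and strings cannot be sorted.
try:
    [1, 3, 2, 4, 'Alice', 'Bob'].sort()
    mixedSortFailed = False
except TypeError:
    mixedSortFailed = True
    print('Cannot sort a list that mixes numbers and strings.')
```

The key and reverse keyword arguments can also be combined, as in spam.sort(key=str.lower, reverse=True).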

Example Program: Magic 8 Ball with a List

Using lists, you can write a much more elegant version of the previous chapter’s Magic 8 Ball program. Instead of several lines of nearly identical elif statements, you can create a single list that the code works with. Open a new file editor window and enter the following code. Save it as magic8Ball2.py.

import random

messages = ['It is certain',
    'It is decidedly so',
    'Yes definitely',
    'Reply hazy try again',
    'Ask again later',
    'Concentrate and ask again',
    'My reply is no',
    'Outlook not so good',
    'Very doubtful']

print(messages[random.randint(0, len(messages) - 1)])

EXCEPTIONS TO INDENTATION RULES IN PYTHON

In most cases, the amount of indentation for a line of code tells Python what block it is in. There are some exceptions to this rule, however. For example, lists can actually span several lines in the source code file. The indentation of these lines does not matter; Python knows that until it sees the ending square bracket, the list is not finished. For example, you can have code that looks like this:

spam = ['apples',
    'oranges',
                    'bananas',
'cats']
print(spam)

Of course, practically speaking, most people use Python’s behavior to make their lists look pretty and readable, like the messages list in the Magic 8 Ball program.

You can also split up a single instruction across multiple lines using the \ line continuation character at the end. Think of \ as saying, “This instruction continues on the next line.” The indentation on the line after a \ line continuation is not significant. For example, the following is valid Python code:

print('Four score and seven ' + \
      'years ago…')

These tricks are useful when you want to rearrange long lines of Python code to be a bit more readable.

When you run this program, you’ll see that it works the same as the previous magic8Ball.py program.

Notice the expression you use as the index into messages: random.randint(0, len(messages) - 1). This produces a random number to use for the index, regardless of the size of messages. That is, you’ll get a random number between 0 and the value of len(messages) - 1. The benefit of this approach is that you can easily add strings to or remove strings from the messages list without changing other lines of code. If you later update your code, there will be fewer lines you have to change and fewer chances for you to introduce bugs.

List-like Types: Strings and Tuples

Lists aren’t the only data types that represent ordered sequences of values. For example, strings and lists are actually similar, if you consider a string to be a “list” of single text characters. Many of the things you can do with lists can also be done with strings: indexing; slicing; and using them with for loops, with len(), and with the in and not in operators. To see this, enter the following into the interactive shell:

>>> name = 'Zophie'
>>> name[0]
'Z'
>>> name[-2]
'i'
>>> name[0:4]
'Zoph'
>>> 'Zo' in name
True
>>> 'z' in name
False
>>> 'p' not in name
False
>>> for i in name:
        print('* * * ' + i + ' * * *')

* * * Z * * *
* * * o * * *
* * * p * * *
* * * h * * *
* * * i * * *
* * * e * * *

Mutable and Immutable Data Types

But lists and strings are different in an important way. A list value is a mutable data type: It can have values added, removed, or changed. However, a string is immutable: It cannot be changed. Trying to reassign a single character in a string results in a TypeError error, as you can see by entering the following into the interactive shell:

>>> name = 'Zophie a cat'
>>> name[7] = 'the'
Traceback (most recent call last):
  File "<pyshell#50>", line 1, in <module>
    name[7] = 'the'
TypeError: 'str' object does not support item assignment

The proper way to “mutate” a string is to use slicing and concatenation to build a new string by copying from parts of the old string. Enter the following into the interactive shell:

>>> name = 'Zophie a cat'
>>> newName = name[0:7] + 'the' + name[8:12]
>>> name
'Zophie a cat'
>>> newName
'Zophie the cat'

We used [0:7] and [8:12] to refer to the characters that we don’t wish to replace. Notice that the original 'Zophie a cat' string is not modified because strings are immutable.
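The same steps can be written as a small script. The sketch below catches the TypeError instead of letting it stop the program; assignmentWorked is a hypothetical flag used only to record the outcome.

```python
name = 'Zophie a cat'

# Strings are immutable, so item assignment raises a TypeError.
try:
    name[7] = 'the'
    assignmentWorked = True
except TypeError:
    assignmentWorked = False
    print('Strings do not support item assignment.')

# Build a new string from slices of the old one instead.
# name[0:7] is 'Zophie ' and name[8:12] is ' cat'.
newName = name[0:7] + 'the' + name[8:12]
print(name)
print(newName)
```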

Although a list value is mutable, the second line in the following code does not modify the list eggs:

>>> eggs = [1, 2, 3]
>>> eggs = [4, 5, 6]
>>> eggs
[4, 5, 6]

The list value in eggs isn’t being changed here; rather, an entirely new and different list value ([4, 5, 6]) is overwriting the old list value ([1, 2, 3]). This is depicted in Figure 4-2.

If you wanted to actually modify the original list in eggs to contain [4, 5, 6], you would have to do something like this:

>>> eggs = [1, 2, 3]
>>> del eggs[2]
>>> del eggs[1]
>>> del eggs[0]
>>> eggs.append(4)
>>> eggs.append(5)
>>> eggs.append(6)
>>> eggs
[4, 5, 6]

Figure 4-2. When eggs = [4, 5, 6] is executed, the contents of eggs are replaced with a new list value.

In this second example (the one that uses del and append()), the list value that eggs ends up with is the same list value it started with. It’s just that this list has been changed, rather than overwritten. Figure 4-3 depicts the seven changes made by the first seven lines in the previous interactive shell example.

Figure 4-3. The del statement and the append() method modify the same list value in place.

Changing a value of a mutable data type (like what the del statement and append() method do in the previous example) changes the value in place, since the variable’s value is not replaced with a new list value.
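One way to observe the difference between overwriting and modifying in place is Python's built-in id() function, which returns a number identifying a particular value while it exists. The chapter itself does not use id(); it appears here only as an illustration, and the variable names are made up for this sketch.

```python
# Case 1: overwriting. eggs ends up attached to a brand-new list.
eggs = [1, 2, 3]
idBefore = id(eggs)
eggs = [4, 5, 6]
overwriteKeptSameList = (id(eggs) == idBefore)
print(overwriteKeptSameList)   # False: a different list value

# Case 2: modifying in place. eggs stays attached to the same list.
eggs = [1, 2, 3]
idBefore = id(eggs)
del eggs[2]
del eggs[1]
del eggs[0]
eggs.append(4)
eggs.append(5)
eggs.append(6)
inPlaceKeptSameList = (id(eggs) == idBefore)
print(inPlaceKeptSameList)     # True: the same list, with new contents
print(eggs)
```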

Mutable versus immutable types may seem like a meaningless distinction, but Passing References will explain the different behavior when calling functions with mutable arguments versus immutable arguments. But first, let’s find out about the tuple data type, which is an immutable form of the list data type.

The Tuple Data Type

The tuple data type is almost identical to the list data type, except in two ways. First, tuples are typed with parentheses, ( and ), instead of square brackets, [ and ]. For example, enter the following into the interactive shell:

>>>eggs=('hello',42,0.5)

>>>eggs[0]

'hello'

>>>eggs[1:3]

(42,0.5)

>>>len(eggs)

3

But the main way that tuples are different from lists is that tuples, like strings, are immutable. Tuples cannot have their values modified, appended, or removed. Enter the following into the interactive shell, and look at the TypeError error message:

>>> eggs = ('hello', 42, 0.5)
>>> eggs[1] = 99
Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    eggs[1] = 99
TypeError: 'tuple' object does not support item assignment

If you have only one value in your tuple, you can indicate this by placing a trailing comma after the value inside the parentheses. Otherwise, Python will think you’ve just typed a value inside regular parentheses. The comma is what lets Python know this is a tuple value. (Unlike some other programming languages, in Python it’s fine to have a trailing comma after the last item in a list or tuple.) Enter the following type() function calls into the interactive shell to see the distinction:

>>> type(('hello',))
<class 'tuple'>
>>> type(('hello'))
<class 'str'>

You can use tuples to convey to anyone reading your code that you don’t intend for that sequence of values to change. If you need an ordered sequence of values that never changes, use a tuple. A second benefit of using tuples instead of lists is that, because they are immutable and their contents don’t change, Python can implement some optimizations that make code using tuples slightly faster than code using lists.
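The following sketch pulls the tuple rules together: the trailing comma that makes a one-item tuple, and the TypeError raised when code tries to change a tuple. The variable names are invented for the example.

```python
# The trailing comma is what makes a one-item tuple.
oneItemTuple = ('hello',)
justAString = ('hello')
print(type(oneItemTuple))
print(type(justAString))
print(len(oneItemTuple))

# Tuples are immutable: assigning to an index raises a TypeError.
eggs = ('hello', 42, 0.5)
try:
    eggs[1] = 99
    tupleChanged = True
except TypeError:
    tupleChanged = False
    print('Tuples do not support item assignment.')
print(eggs)
```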

Converting Types with the list() and tuple() Functions

Just like how str(42) will return '42', the string representation of the integer 42, the functions list() and tuple() will return list and tuple versions of the values passed to them. Enter the following into the interactive shell, and notice that the return value is of a different data type than the value passed:

>>>tuple(['cat','dog',5])

('cat','dog',5)

>>>list(('cat','dog',5))

['cat','dog',5]

>>>list('hello')

['h','e','l','l','o']

Converting a tuple to a list is handy if you need a mutable version of a tuple value.

References

As you’ve seen, variables store strings and integer values. Enter the following into the interactive shell:

>>>spam=42

>>>cheese=spam

>>>spam=100

>>>spam

100

>>>cheese

42

You assign 42 to the spam variable, and then you copy the value in spam and assign it to the variable cheese. When you later change the value in spam to 100, this doesn’t affect the value in cheese. This is because spam and cheese are different variables that store different values.

But lists don’t work this way. When you assign a list to a variable, you are actually assigning a list reference to the variable. A reference is a value that points to some bit of data, and a list reference is a value that points to a list. Here is some code that will make this distinction easier to understand. Enter this into the interactive shell:

➊>>>spam=[0,1,2,3,4,5]

➋>>>cheese=spam

➌>>>cheese[1]='Hello!'

>>>spam

[0,'Hello!',2,3,4,5]

>>>cheese

[0,'Hello!',2,3,4,5]

This might look odd to you. The code changed only the cheese list, but it seems that both the cheese and spam lists have changed.

When you create the list ➊, you assign a reference to it in the spam variable. But the next line ➋ copies only the list reference in spam to cheese, not the list value itself. This means the values stored in spam and cheese now both refer to the same list. There is only one underlying list because the list itself was never actually copied. So when you modify the second element of cheese (the one at index 1) ➌, you are modifying the same list that spam refers to.

Remember that variables are like boxes that contain values. The previous figures in this chapter, which show lists inside boxes, aren’t exactly accurate because list variables don’t actually contain lists; they contain references to lists. (These references will have ID numbers that Python uses internally, but you can ignore them.) Using boxes as a metaphor for variables, Figure 4-4 shows what happens when a list is assigned to the spam variable.

Figure 4-4. spam = [0, 1, 2, 3, 4, 5] stores a reference to a list, not the actual list.

Then, in Figure 4-5, the reference in spam is copied to cheese. Only a new reference was created and stored in cheese, not a new list. Note how both references refer to the same list.

Figure 4-5. cheese = spam copies the reference, not the list.

When you alter the list that cheese refers to, the list that spam refers to is also changed, because both cheese and spam refer to the same list. You can see this in Figure 4-6.

Figure 4-6. cheese[1] = 'Hello!' modifies the list that both variables refer to.

Variables will contain references to list values rather than list values themselves. But for strings and integer values, variables simply contain the string or integer value. Python uses references whenever variables must store values of mutable data types, such as lists or dictionaries. For values of immutable data types such as strings, integers, or tuples, Python variables will store the value itself.

Although Python variables technically contain references to list or dictionary values, people often casually say that the variable contains the list or dictionary.
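If you are curious about the ID numbers mentioned above, the built-in id() function returns one for any value. The sketch below is only an illustration and is not part of the chapter's own examples; it shows that two variables holding the same reference report the same ID, and that assigning a new list to one of them breaks the link. The flag names are hypothetical.

```python
spam = [0, 1, 2, 3, 4, 5]
cheese = spam              # copies the reference, not the list
cheese[1] = 'Hello!'
sharedAfterAssignment = (id(spam) == id(cheese))
print(sharedAfterAssignment)   # True: one list, two names
print(spam)

# Assigning a new list to cheese makes it refer to a different list.
cheese = [0, 1, 2, 3, 4, 5]
cheese[2] = 'Goodbye!'
sharedAfterNewList = (id(spam) == id(cheese))
print(sharedAfterNewList)      # False: two separate lists now
print(spam)
print(cheese)
```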

Passing References

References are particularly important for understanding how arguments get passed to functions. When a function is called, the values of the arguments are copied to the parameter variables. For lists (and dictionaries, which I’ll describe in the next chapter), this means a copy of the reference is used for the parameter. To see the consequences of this, open a new file editor window, enter the following code, and save it as passingReference.py:

def eggs(someParameter):
    someParameter.append('Hello')

spam = [1, 2, 3]
eggs(spam)
print(spam)

Notice that when eggs() is called, a return value is not used to assign a new value to spam. Instead, it modifies the list in place, directly. When run, this program produces the following output:

[1, 2, 3, 'Hello']

Even though spam and someParameter contain separate references, they both refer to the same list. This is why the append('Hello') method call inside the function affects the list even after the function call has returned.

Keep this behavior in mind: Forgetting that Python handles list and dictionary variables this way can lead to confusing bugs.
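To illustrate the kind of bug this can cause, and one way around it, here is a sketch with three hypothetical functions (addGreeting, replaceList, and addGreetingToCopy are invented for this example). The first modifies the caller's list, the second only reassigns its own parameter, and the third works on a slice copy so the caller's list is left alone.

```python
def addGreeting(someParameter):
    # Modifies the list in place, so the caller sees the change.
    someParameter.append('Hello')

def replaceList(someParameter):
    # Reassigns only the local parameter; the caller's list is untouched.
    someParameter = ['something', 'else']

def addGreetingToCopy(someParameter):
    # Works on a copy made with a full slice, then returns the copy.
    duplicate = someParameter[:]
    duplicate.append('Hello')
    return duplicate

spam = [1, 2, 3]
addGreeting(spam)
replaceList(spam)
print(spam)

cheese = [1, 2, 3]
bacon = addGreetingToCopy(cheese)
print(cheese)
print(bacon)
```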

The copy Module’s copy() and deepcopy() Functions

Although passing around references is often the handiest way to deal with lists and dictionaries, if the function modifies the list or dictionary that is passed, you may not want these changes in the original list or dictionary value. For this, Python provides a module named copy that provides both the copy() and deepcopy() functions. The first of these, copy.copy(), can be used to make a duplicate copy of a mutable value like a list or dictionary, not just a copy of a reference. Enter the following into the interactive shell:

>>> import copy

>>>spam=['A','B','C','D']

>>>cheese=copy.copy(spam)

>>>cheese[1]=42

>>>spam

['A','B','C','D']

>>>cheese

['A',42,'C','D']

Now the spam and cheese variables refer to separate lists, which is why only the list in cheese is modified when you assign 42 at index 1. As you can see in Figure 4-7, the reference ID numbers are no longer the same for both variables because the variables refer to independent lists.

Figure 4-7. cheese = copy.copy(spam) creates a second list that can be modified independently of the first.

If the list you need to copy contains lists, then use the copy.deepcopy() function instead of copy.copy(). The deepcopy() function will copy these inner lists as well.
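The difference only shows up when the list contains other lists, so here is a brief sketch with a nested list. The names shallow and deep are made up for this example.

```python
import copy

spam = [['A', 'B'], ['C', 'D']]
shallow = copy.copy(spam)      # new outer list, but the same inner lists
deep = copy.deepcopy(spam)     # new outer list and new inner lists

# Change a value inside one of the original inner lists.
spam[0][0] = 'changed'

print(spam)
print(shallow)   # shares the inner lists, so it shows the change
print(deep)      # has its own inner lists, so it does not
```

With copy.copy(), the outer lists are independent (appending to one does not affect the other), but the inner lists are still shared references.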

Summary

Lists are useful data types since they allow you to write code that works on a modifiable number of values in a single variable. Later in this book, you will see programs using lists to do things that would be difficult or impossible to do without them.

Lists are mutable, meaning that their contents can change. Tuples and strings, although list-like in some respects, are immutable and cannot be changed. A variable that contains a tuple or string value can be overwritten with a new tuple or string value, but this is not the same thing as modifying the existing value in place, as the append() or remove() methods do on lists.

Variables do not store list values directly; they store references to lists. This is an important distinction when copying variables or passing lists as arguments in function calls. Because the value that is being copied is the list reference, be aware that any changes you make to the list might impact another variable in your program. You can use copy() or deepcopy() if you want to make changes to a list in one variable without modifying the original list.

Practice Questions

Q: 1. What is []?

Q: 2. How would you assign the value 'hello' as the third value in a list stored in a variable named spam? (Assume spam contains [2, 4, 6, 8, 10].)

For the following three questions, let’s say spam contains the list ['a', 'b', 'c', 'd'].

Q: 3. What does spam[int('3' * 2) // 11] evaluate to?

Q: 4. What does spam[-1] evaluate to?

Q: 5. What does spam[:2] evaluate to?

For the following three questions, let’s say bacon contains the list [3.14, 'cat', 11, 'cat', True].

Q: 6. What does bacon.index('cat') evaluate to?

Q: 7. What does bacon.append(99) make the list value in bacon look like?

Q: 8. What does bacon.remove('cat') make the list value in bacon look like?

Q: 9. What are the operators for list concatenation and list replication?

Q: 10. What is the difference between the append() and insert() list methods?

Q: 11. What are two ways to remove values from a list?

Q: 12. Name a few ways that list values are similar to string values.

Q: 13. What is the difference between lists and tuples?

Q: 14. How do you type the tuple value that has just the integer value 42 in it?

Q: 15. How can you get the tuple form of a list value? How can you get the list form of a tuple value?

Q: 16. Variables that “contain” list values don’t actually contain lists directly. What do they contain instead?

Q: 17. What is the difference between copy.copy() and copy.deepcopy()?

Practice Projects

For practice, write programs to do the following tasks.

Comma Code

Say you have a list value like this:

spam = ['apples', 'bananas', 'tofu', 'cats']

Write a function that takes a list value as an argument and returns a string with all the items separated by a comma and a space, with and inserted before the last item. For example, passing the previous spam list to the function would return 'apples, bananas, tofu, and cats'. But your function should be able to work with any list value passed to it.

Character Picture Grid

Say you have a list of lists where each value in the inner lists is a one-character string, like this:

grid = [['.', '.', '.', '.', '.', '.'],
        ['.', 'O', 'O', '.', '.', '.'],
        ['O', 'O', 'O', 'O', '.', '.'],
        ['O', 'O', 'O', 'O', 'O', '.'],
        ['.', 'O', 'O', 'O', 'O', 'O'],
        ['O', 'O', 'O', 'O', 'O', '.'],
        ['O', 'O', 'O', 'O', '.', '.'],
        ['.', 'O', 'O', '.', '.', '.'],
        ['.', '.', '.', '.', '.', '.']]

You can think of grid[x][y] as being the character at the x- and y-coordinates of a “picture” drawn with text characters. The (0, 0) origin will be in the upper-left corner, the x-coordinates increase going right, and the y-coordinates increase going down.

Copy the previous grid value, and write code that uses it to print the image.

..OO.OO..
.OOOOOOO.
.OOOOOOO.
..OOOOO..
...OOO...
....O....

Hint: You will need to use a loop in a loop in order to print grid[0][0], then grid[1][0], then grid[2][0], and so on, up to grid[8][0]. This will finish the first row, so then print a newline. Then your program should print grid[0][1], then grid[1][1], then grid[2][1], and so on. The last thing your program will print is grid[8][5].

Also, remember to pass the end keyword argument to print() if you don’t want a newline printed automatically after each print() call.

Chapter 5. Dictionaries and Structuring Data

In this chapter, I will cover the dictionary data type, which provides a flexible way to access and organize data. Then, combining dictionaries with your knowledge of lists from the previous chapter, you’ll learn how to create a data structure to model a tic-tac-toe board.

The Dictionary Data Type

Like a list, a dictionary is a collection of many values. But unlike indexes for lists, indexes for dictionaries can use many different data types, not just integers. Indexes for dictionaries are called keys, and a key with its associated value is called a key-value pair.

In code, a dictionary is typed with braces, {}. Enter the following into the interactive shell:

>>> myCat = {'size': 'fat', 'color': 'gray', 'disposition': 'loud'}

This assigns a dictionary to the myCat variable. This dictionary’s keys are 'size', 'color', and 'disposition'. The values for these keys are 'fat', 'gray', and 'loud', respectively. You can access these values through their keys:

>>> myCat['size']
'fat'
>>> 'My cat has ' + myCat['color'] + ' fur.'
'My cat has gray fur.'

Dictionaries can still use integer values as keys, just like lists use integers for indexes, but they do not have to start at 0 and can be any number.

>>> spam = {12345: 'Luggage Combination', 42: 'The Answer'}

Dictionaries vs. Lists

Unlike lists, items in dictionaries are unordered. The first item in a list named spam would be spam[0]. But there is no “first” item in a dictionary. While the order of items matters for determining whether two lists are the same, it does not matter in what order the key-value pairs are typed in a dictionary. Enter the following into the interactive shell:

>>> spam = ['cats', 'dogs', 'moose']
>>> bacon = ['dogs', 'moose', 'cats']
>>> spam == bacon
False
>>> eggs = {'name': 'Zophie', 'species': 'cat', 'age': '8'}
>>> ham = {'species': 'cat', 'age': '8', 'name': 'Zophie'}
>>> eggs == ham
True

Because dictionaries are not ordered, they can’t be sliced like lists.

Trying to access a key that does not exist in a dictionary will result in a KeyError error message, much like a list’s “out-of-range” IndexError error message. Enter the following into the interactive shell, and notice the error message that shows up because there is no 'color' key:

>>> spam = {'name': 'Zophie', 'age': 7}
>>> spam['color']
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    spam['color']
KeyError: 'color'

Though dictionaries are not ordered, the fact that you can have arbitrary values for the keys allows you to organize your data in powerful ways. Say you wanted your program to store data about your friends’ birthdays. You can use a dictionary with the names as keys and the birthdays as values. Open a new file editor window and enter the following code. Save it as birthdays.py.

birthdays = {'Alice': 'Apr 1', 'Bob': 'Dec 12', 'Carol': 'Mar 4'}  # ➊

while True:
    print('Enter a name: (blank to quit)')
    name = input()
    if name == '':
        break

    if name in birthdays:  # ➋
        print(birthdays[name] + ' is the birthday of ' + name)  # ➌
    else:
        print('I do not have birthday information for ' + name)
        print('What is their birthday?')
        bday = input()
        birthdays[name] = bday  # ➍
        print('Birthday database updated.')

You create an initial dictionary and store it in birthdays ➊. You can see if the entered name exists as a key in the dictionary with the in keyword ➋, just as you did for lists. If the name is in the dictionary, you access the associated value using square brackets ➌; if not, you can add it using the same square bracket syntax combined with the assignment operator ➍.

When you run this program, it will look like this:

Enter a name: (blank to quit)
Alice
Apr 1 is the birthday of Alice
Enter a name: (blank to quit)
Eve
I do not have birthday information for Eve
What is their birthday?
Dec 5
Birthday database updated.
Enter a name: (blank to quit)
Eve
Dec 5 is the birthday of Eve
Enter a name: (blank to quit)

Of course, all the data you enter in this program is forgotten when the program terminates. You’ll learn how to save data to files on the hard drive in Chapter 8.

The keys(), values(), and items() Methods

There are three dictionary methods that will return list-like values of the dictionary’s keys, values, or both keys and values: keys(), values(), and items(). The values returned by these methods are not true lists: They cannot be modified and do not have an append() method. But these data types (dict_keys, dict_values, and dict_items, respectively) can be used in for loops. To see how these methods work, enter the following into the interactive shell:

>>> spam = {'color': 'red', 'age': 42}
>>> for v in spam.values():
        print(v)

red
42

Here, a for loop iterates over each of the values in the spam dictionary. A for loop can also iterate over the keys or both keys and values:

>>> for k in spam.keys():
        print(k)

color
age
>>> for i in spam.items():
        print(i)

('color', 'red')
('age', 42)

Using the keys(), values(), and items() methods, a for loop can iterate over the keys, values, or key-value pairs in a dictionary, respectively. Notice that the values in the dict_items value returned by the items() method are tuples of the key and value.

If you want a true list from one of these methods, pass its list-like return value to the list() function. Enter the following into the interactive shell:

>>> spam = {'color': 'red', 'age': 42}
>>> spam.keys()
dict_keys(['color', 'age'])
>>> list(spam.keys())
['color', 'age']

The list(spam.keys()) line takes the dict_keys value returned from keys() and passes it to list(), which then returns a list value of ['color', 'age'].
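The same conversion works for values() and items(). As a small sketch (the variable names keys, values, and pairs are chosen here for illustration), the resulting lists can be modified freely without touching the dictionary:

```python
spam = {'color': 'red', 'age': 42}

keys = list(spam.keys())      # a real list of the keys
values = list(spam.values())  # a real list of the values
pairs = list(spam.items())    # a real list of (key, value) tuples

# Unlike dict_keys, a true list supports append().
keys.append('size')

print(keys)
print('size' in spam)  # the dictionary itself is unchanged
```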

You can also use the multiple assignment trick in a for loop to assign the key and value to separate variables. Enter the following into the interactive shell:

>>> spam = {'color': 'red', 'age': 42}
>>> for k, v in spam.items():
        print('Key: ' + k + ' Value: ' + str(v))

Key: age Value: 42
Key: color Value: red

Checking Whether a Key or Value Exists in a Dictionary

Recall from the previous chapter that the in and not in operators can check whether a value exists in a list. You can also use these operators to see whether a certain key or value exists in a dictionary. Enter the following into the interactive shell:

>>> spam = {'name': 'Zophie', 'age': 7}
>>> 'name' in spam.keys()
True
>>> 'Zophie' in spam.values()
True
>>> 'color' in spam.keys()
False
>>> 'color' not in spam.keys()
True
>>> 'color' in spam
False

In the previous example, notice that 'color' in spam is essentially a shorter version of writing 'color' in spam.keys(). This is always the case: If you ever want to check whether a value is (or isn’t) a key in the dictionary, you can simply use the in (or not in) keyword with the dictionary value itself.
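One consequence worth seeing in code: because in on a dictionary looks only at keys, a string that is stored as a value is not found unless you ask values() explicitly. A short sketch (the result variable names are illustrative):

```python
spam = {'name': 'Zophie', 'age': 7}

foundAsKey = 'Zophie' in spam             # checks keys only
foundAsValue = 'Zophie' in spam.values()  # checks values

print(foundAsKey)
print(foundAsValue)
```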

The get() Method

It’s tedious to check whether a key exists in a dictionary before accessing that key’s value. Fortunately, dictionaries have a get() method that takes two arguments: the key of the value to retrieve and a fallback value to return if that key does not exist.

Enter the following into the interactive shell:

>>> picnicItems = {'apples': 5, 'cups': 2}
>>> 'I am bringing ' + str(picnicItems.get('cups', 0)) + ' cups.'
'I am bringing 2 cups.'
>>> 'I am bringing ' + str(picnicItems.get('eggs', 0)) + ' eggs.'
'I am bringing 0 eggs.'

Because there is no 'eggs' key in the picnicItems dictionary, the default value 0 is returned by the get() method. Without using get(), the code would have caused an error message, such as in the following example:

>>> picnicItems = {'apples': 5, 'cups': 2}
>>> 'I am bringing ' + str(picnicItems['eggs']) + ' eggs.'
Traceback (most recent call last):
  File "<pyshell#34>", line 1, in <module>
    'I am bringing ' + str(picnicItems['eggs']) + ' eggs.'
KeyError: 'eggs'
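The fallback argument to get() is optional. If it is left out and the key is missing, get() returns None rather than raising KeyError. A brief sketch (the variable names cups and eggs are illustrative):

```python
picnicItems = {'apples': 5, 'cups': 2}

cups = picnicItems.get('cups')  # key exists, so its value is returned
eggs = picnicItems.get('eggs')  # key is missing and no fallback given

print(cups)
print(eggs)
```

Passing an explicit fallback such as 0 is usually the safer choice when the result will be used in arithmetic or string concatenation.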

The setdefault() Method

You’ll often have to set a value in a dictionary for a certain key only if that key does not already have a value. The code looks something like this:

spam = {'name': 'Pooka', 'age': 5}
if 'color' not in spam:
    spam['color'] = 'black'

The setdefault() method offers a way to do this in one line of code. The first argument passed to the method is the key to check for, and the second argument is the value to set at that key if the key does not exist. If the key does exist, the setdefault() method returns the key’s value. Enter the following into the interactive shell:

>>> spam = {'name': 'Pooka', 'age': 5}
>>> spam.setdefault('color', 'black')
'black'
>>> spam
{'color': 'black', 'age': 5, 'name': 'Pooka'}
>>> spam.setdefault('color', 'white')
'black'
>>> spam
{'color': 'black', 'age': 5, 'name': 'Pooka'}

The first time setdefault() is called, the dictionary in spam changes to {'color': 'black', 'age': 5, 'name': 'Pooka'}. The method returns the value 'black' because this is now the value set for the key 'color'. When spam.setdefault('color', 'white') is called next, the value for that key is not changed to 'white' because spam already has a key named 'color'.

The setdefault() method is a nice shortcut to ensure that a key exists. Here is a short program that counts the number of occurrences of each letter in a string. Open the file editor window and enter the following code, saving it as characterCount.py:

message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}

for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1

print(count)

The program loops over each character in the message variable’s string, counting how often each character appears. The setdefault() method call ensures that the key is in the count dictionary (with a default value of 0) so the program doesn’t throw a KeyError error when count[character] = count[character] + 1 is executed. When you run this program, the output will look like this:

{' ': 13, ',': 1, '.': 1, 'A': 1, 'I': 1, 'a': 4, 'c': 3, 'b': 1, 'e': 5, 'd': 3, 'g': 2, 'i': 6, 'h': 3, 'k': 2, 'l': 3, 'o': 2, 'n': 4, 'p': 1, 's': 3, 'r': 5, 't': 6, 'w': 2, 'y': 1}

From the output, you can see that the lowercase letter c appears 3 times, the space character appears 13 times, and the uppercase letter A appears 1 time. This program will work no matter what string is inside the message variable, even if the string is millions of characters long!
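The same counting idea can be wrapped in a function so it can be reused on any string. The sketch below uses get() with a fallback of 0 instead of setdefault(); the function name countCharacters() is hypothetical and not part of the book’s program:

```python
def countCharacters(message):
    """Return a dictionary mapping each character in message to its count."""
    count = {}
    for character in message:
        # get() supplies 0 the first time a character is seen.
        count[character] = count.get(character, 0) + 1
    return count

counts = countCharacters('hello world')
print(counts['l'])
print(counts['o'])
```

Both approaches avoid a KeyError on the first occurrence of a character; which one reads better is largely a matter of taste.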

Pretty Printing

If you import the pprint module into your programs, you’ll have access to the pprint() and pformat() functions that will “pretty print” a dictionary’s values. This is helpful when you want a cleaner display of the items in a dictionary than what print() provides. Modify the previous characterCount.py program and save it as prettyCharacterCount.py.

import pprint
message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}

for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1

pprint.pprint(count)

This time, when the program is run, the output looks much cleaner, with the keys sorted.

{' ': 13,
 ',': 1,
 '.': 1,
 'A': 1,
 'I': 1,
 'a': 4,
 'b': 1,
 'c': 3,
 'd': 3,
 'e': 5,
 'g': 2,
 'h': 3,
 'i': 6,
 'k': 2,
 'l': 3,
 'n': 4,
 'o': 2,
 'p': 1,
 'r': 5,
 's': 3,
 't': 6,
 'w': 2,
 'y': 1}

The pprint.pprint() function is especially helpful when the dictionary itself contains nested lists or dictionaries.

If you want to obtain the prettified text as a string value instead of displaying it on the screen, call pprint.pformat() instead. These two lines are equivalent to each other:

pprint.pprint(someDictionaryValue)
print(pprint.pformat(someDictionaryValue))
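Because pformat() returns a string, the result can be stored, concatenated, or inspected like any other string. A minimal sketch with a nested dictionary (the allPets data and variable names are made up for illustration):

```python
import pprint

# Hypothetical nested data: a dictionary whose values are dictionaries.
allPets = {'Zophie': {'species': 'cat', 'age': 7},
           'Pooka': {'species': 'cat', 'age': 5}}

prettyText = pprint.pformat(allPets)  # a string; nothing is printed yet

report = 'Pets on file:\n' + prettyText
print(report)
```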

Using Data Structures to Model Real-World Things

Even before the Internet, it was possible to play a game of chess with someone on the other side of the world. Each player would set up a chessboard at their home and then take turns mailing a postcard to each other describing each move. To do this, the players needed a way to unambiguously describe the state of the board and their moves.

In algebraic chess notation, the spaces on the chessboard are identified by a number and letter coordinate, as in Figure 5-1.

Figure 5-1. The coordinates of a chessboard in algebraic chess notation

The chess pieces are identified by letters: K for king, Q for queen, R for rook, B for bishop, and N for knight. Describing a move uses the letter of the piece and the coordinates of its destination. A pair of these moves describes what happens in a single turn (with white going first); for instance, the notation 2. Nf3 Nc6 indicates that white moved a knight to f3 and black moved a knight to c6 on the second turn of the game.

There’s a bit more to algebraic notation than this, but the point is that you can use it to unambiguously describe a game of chess without needing to be in front of a chessboard. Your opponent can even be on the other side of the world! In fact, you don’t even need a physical chess set if you have a good memory: You can just read the mailed chess moves and update boards you have in your imagination.

Computers have good memories. A program on a modern computer can easily store billions of strings like '2. Nf3 Nc6'. This is how computers can play chess without having a physical chessboard. They model data to represent a chessboard, and you can write code to work with this model.

This is where lists and dictionaries can come in. You can use them to model real-world things, like chessboards. For the first example, you’ll use a game that’s a little simpler than chess: tic-tac-toe.

A Tic-Tac-Toe Board

A tic-tac-toe board looks like a large hash symbol (#) with nine slots that can each contain an X, an O, or a blank. To represent the board with a dictionary, you can assign each slot a string-value key, as shown in Figure 5-2.

You can use string values to represent what’s in each slot on the board: 'X', 'O', or ' ' (a space character). Thus, you’ll need to store nine strings. You can use a dictionary of values for this. The string value with the key 'top-R' can represent the top-right corner, the string value with the key 'low-L' can represent the bottom-left corner, the string value with the key 'mid-M' can represent the middle, and so on.

Figure 5-2. The slots of a tic-tac-toe board with their corresponding keys

This dictionary is a data structure that represents a tic-tac-toe board. Store this board-as-a-dictionary in a variable named theBoard. Open a new file editor window, and enter the following source code, saving it as ticTacToe.py:

theBoard = {'top-L': ' ', 'top-M': ' ', 'top-R': ' ',
            'mid-L': ' ', 'mid-M': ' ', 'mid-R': ' ',
            'low-L': ' ', 'low-M': ' ', 'low-R': ' '}

The data structure stored in the theBoard variable represents the tic-tac-toe board in Figure 5-3.

Figure 5-3. An empty tic-tac-toe board

Since the value for every key in theBoard is a single-space string, this dictionary represents a completely clear board. If player X went first and chose the middle space, you could represent that board with this dictionary:

theBoard = {'top-L': ' ', 'top-M': ' ', 'top-R': ' ',
            'mid-L': ' ', 'mid-M': 'X', 'mid-R': ' ',
            'low-L': ' ', 'low-M': ' ', 'low-R': ' '}

The data structure in theBoard now represents the tic-tac-toe board in Figure 5-4.

Figure 5-4. The first move

A board where player O has won by placing Os across the top might look like this:

theBoard = {'top-L': 'O', 'top-M': 'O', 'top-R': 'O',
            'mid-L': 'X', 'mid-M': 'X', 'mid-R': ' ',
            'low-L': ' ', 'low-M': ' ', 'low-R': 'X'}

The data structure in theBoard now represents the tic-tac-toe board in Figure 5-5.

Figure 5-5. Player O wins.

Of course, the player sees only what is printed to the screen, not the contents of variables. Let’s create a function to print the board dictionary onto the screen. Make the following addition to ticTacToe.py (the new code is the printBoard() function and the call to it):

theBoard = {'top-L': ' ', 'top-M': ' ', 'top-R': ' ',
            'mid-L': ' ', 'mid-M': ' ', 'mid-R': ' ',
            'low-L': ' ', 'low-M': ' ', 'low-R': ' '}

def printBoard(board):
    print(board['top-L'] + '|' + board['top-M'] + '|' + board['top-R'])
    print('-+-+-')
    print(board['mid-L'] + '|' + board['mid-M'] + '|' + board['mid-R'])
    print('-+-+-')
    print(board['low-L'] + '|' + board['low-M'] + '|' + board['low-R'])

printBoard(theBoard)

When you run this program, printBoard() will print out a blank tic-tac-toe board.

 | |
-+-+-
 | |
-+-+-
 | |

The printBoard() function can handle any tic-tac-toe data structure you pass it. Try changing the code to the following:

theBoard = {'top-L': 'O', 'top-M': 'O', 'top-R': 'O',
            'mid-L': 'X', 'mid-M': 'X', 'mid-R': ' ',
            'low-L': ' ', 'low-M': ' ', 'low-R': 'X'}

def printBoard(board):
    print(board['top-L'] + '|' + board['top-M'] + '|' + board['top-R'])
    print('-+-+-')
    print(board['mid-L'] + '|' + board['mid-M'] + '|' + board['mid-R'])
    print('-+-+-')
    print(board['low-L'] + '|' + board['low-M'] + '|' + board['low-R'])

printBoard(theBoard)

Now when you run this program, the new board will be printed to the screen.

O|O|O
-+-+-
X|X|
-+-+-
 | |X

Because you created a data structure to represent a tic-tac-toe board and wrote code in printBoard() to interpret that data structure, you now have a program that “models” the tic-tac-toe board. You could have organized your data structure differently (for example, using keys like 'TOP-LEFT' instead of 'top-L'), but as long as the code works with your data structures, you will have a correctly working program.

For example, the printBoard() function expects the tic-tac-toe data structure to be a dictionary with keys for all nine slots. If the dictionary you passed was missing, say, the 'mid-L' key, your program would no longer work.

O|O|O
-+-+-
Traceback (most recent call last):
  File "ticTacToe.py", line 10, in <module>
    printBoard(theBoard)
  File "ticTacToe.py", line 6, in printBoard
    print(board['mid-L'] + '|' + board['mid-M'] + '|' + board['mid-R'])
KeyError: 'mid-L'
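One way to guard against this kind of failure is to check the dictionary before printing it. The following is only a sketch: REQUIRED_KEYS, findMissingKeys(), and brokenBoard are hypothetical names introduced here, not part of ticTacToe.py.

```python
# The nine slot keys that printBoard() expects (assumed from the text above).
REQUIRED_KEYS = ['top-L', 'top-M', 'top-R',
                 'mid-L', 'mid-M', 'mid-R',
                 'low-L', 'low-M', 'low-R']

def findMissingKeys(board):
    """Return a list of the slot keys that are absent from board."""
    missing = []
    for key in REQUIRED_KEYS:
        if key not in board:
            missing.append(key)
    return missing

# A board that lacks the 'mid-L' key, as in the traceback above.
brokenBoard = {'top-L': 'O', 'top-M': 'O', 'top-R': 'O',
               'mid-M': 'X', 'mid-R': ' ',
               'low-L': ' ', 'low-M': ' ', 'low-R': 'X'}

print(findMissingKeys(brokenBoard))
```

A program could call a check like this first and print a helpful message instead of crashing partway through drawing the board.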

Now let’s add code that allows the players to enter their moves. Modify the ticTacToe.py program to look like this:

theBoard = {'top-L': ' ', 'top-M': ' ', 'top-R': ' ',
            'mid-L': ' ', 'mid-M': ' ', 'mid-R': ' ',
            'low-L': ' ', 'low-M': ' ', 'low-R': ' '}

def printBoard(board):
    print(board['top-L'] + '|' + board['top-M'] + '|' + board['top-R'])
    print('-+-+-')
    print(board['mid-L'] + '|' + board['mid-M'] + '|' + board['mid-R'])
    print('-+-+-')
    print(board['low-L'] + '|' + board['low-M'] + '|' + board['low-R'])

turn = 'X'
for i in range(9):
    printBoard(theBoard)  # ➊
    print('Turn for ' + turn + '. Move on which space?')
    move = input()  # ➋
    theBoard[move] = turn  # ➌
    if turn == 'X':  # ➍
        turn = 'O'
    else:
        turn = 'X'
printBoard(theBoard)

The new code prints out the board at the start of each new turn ➊, gets the active player’s move ➋, updates the game board accordingly ➌, and then swaps the active player ➍ before moving on to the next turn.

When you run this program, it will look something like this:

 | |
-+-+-
 | |
-+-+-
 | |
Turn for X. Move on which space?
mid-M
 | |
-+-+-
 |X|
-+-+-
 | |
Turn for O. Move on which space?
low-L
 | |
-+-+-
 |X|
-+-+-
O| |

--snip--

O|O|X
-+-+-
X|X|O
-+-+-
O| |X
Turn for X. Move on which space?
low-M
O|O|X
-+-+-
X|X|O
-+-+-
O|X|X

This isn’t a complete tic-tac-toe game (for instance, it doesn’t ever check whether a player has won), but it’s enough to see how data structures can be used in programs.
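To give a feel for how a win check could be built on the same dictionary, here is one possible sketch. It is not the book’s implementation: WINNING_LINES and hasWon() are hypothetical names, and the approach simply lists every row, column, and diagonal.

```python
# Every group of three slots that forms a winning line (an assumed layout
# matching the 'top-L' ... 'low-R' keys used in this chapter).
WINNING_LINES = [
    ['top-L', 'top-M', 'top-R'],  # rows
    ['mid-L', 'mid-M', 'mid-R'],
    ['low-L', 'low-M', 'low-R'],
    ['top-L', 'mid-L', 'low-L'],  # columns
    ['top-M', 'mid-M', 'low-M'],
    ['top-R', 'mid-R', 'low-R'],
    ['top-L', 'mid-M', 'low-R'],  # diagonals
    ['top-R', 'mid-M', 'low-L'],
]

def hasWon(board, player):
    """Return True if player ('X' or 'O') fills any winning line on board."""
    for line in WINNING_LINES:
        if (board[line[0]] == player and
                board[line[1]] == player and
                board[line[2]] == player):
            return True
    return False

theBoard = {'top-L': 'O', 'top-M': 'O', 'top-R': 'O',
            'mid-L': 'X', 'mid-M': 'X', 'mid-R': ' ',
            'low-L': ' ', 'low-M': ' ', 'low-R': 'X'}
print(hasWon(theBoard, 'O'))
print(hasWon(theBoard, 'X'))
```

A call such as hasWon(theBoard, turn) could be placed in the game loop after each move, before the active player is swapped.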

NOTE

If you are curious, the source code for a complete tic-tac-toe program is described in the resources available from http://nostarch.com/automatestuff/.

Nested Dictionaries and Lists

Modeling a tic-tac-toe board was fairly simple: The board needed only a single dictionary value with nine key-value pairs. As you model more complicated things, you may find you need dictionaries and lists that contain other dictionaries and lists. Lists are useful to contain an ordered series of values, and dictionaries are useful for associating keys with values. For example, here’s a program that uses a dictionary that contains other dictionaries in order to see who is bringing what to a picnic. The totalBrought() function can read this data structure and calculate the total number of an item being brought by all the guests.

allGuests = {'Alice': {'apples': 5, 'pretzels': 12},
             'Bob': {'ham sandwiches': 3, 'apples': 2},
             'Carol': {'cups': 3, 'apple pies': 1}}

def totalBrought(guests, item):
    numBrought = 0
    for k, v in guests.items():  # ➊
        numBrought = numBrought + v.get(item, 0)  # ➋
    return numBrought

print('Number of things being brought:')
print(' - Apples ' + str(totalBrought(allGuests, 'apples')))
print(' - Cups ' + str(totalBrought(allGuests, 'cups')))
print(' - Cakes ' + str(totalBrought(allGuests, 'cakes')))
print(' - Ham Sandwiches ' + str(totalBrought(allGuests, 'ham sandwiches')))
print(' - Apple Pies ' + str(totalBrought(allGuests, 'apple pies')))

Inside the totalBrought() function, the for loop iterates over the key-value pairs in guests ➊. Inside the loop, the string of the guest’s name is assigned to k, and the dictionary of picnic items they’re bringing is assigned to v. If the item parameter exists as a key in this dictionary, its value (the quantity) is added to numBrought ➋. If it does not exist as a key, the get() method returns 0 to be added to numBrought.

The output of this program looks like this:

Number of things being brought:
 - Apples 7
 - Cups 3
 - Cakes 0
 - Ham Sandwiches 3
 - Apple Pies 1

This may seem like such a simple thing to model that you wouldn’t need to bother with writing a program to do it. But realize that this same totalBrought() function could easily handle a dictionary that contains thousands of guests, each bringing thousands of different picnic items. Then having this information in a data structure along with the totalBrought() function would save you a lot of time!
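If the item names are not known in advance, the same nested structure can be walked with two loops to total every item at once. This is a sketch only; the function name allTotals() is hypothetical, and it combines the items() and setdefault() methods covered earlier in the chapter.

```python
allGuests = {'Alice': {'apples': 5, 'pretzels': 12},
             'Bob': {'ham sandwiches': 3, 'apples': 2},
             'Carol': {'cups': 3, 'apple pies': 1}}

def allTotals(guests):
    """Return a dictionary mapping each item name to the total brought."""
    totals = {}
    for items in guests.values():              # each guest's inner dictionary
        for item, quantity in items.items():   # each item that guest brings
            totals.setdefault(item, 0)
            totals[item] = totals[item] + quantity
    return totals

totals = allTotals(allGuests)
print(totals['apples'])
print(totals.get('cakes', 0))
```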

You can model things with data structures in whatever way you like, as long as the rest of the code in your program can work with the data model correctly. When you first begin programming, don’t worry so much about the “right” way to model data. As you gain more experience, you may come up with more efficient models, but the important thing is that the data model works for your program’s needs.

Summary

You learned all about dictionaries in this chapter. Lists and dictionaries are values that can contain multiple values, including other lists and dictionaries. Dictionaries are useful because you can map one item (the key) to another (the value), as opposed to lists, which simply contain a series of values in order. Values inside a dictionary are accessed using square brackets just as with lists. Instead of an integer index, dictionaries can have keys of a variety of data types: integers, floats, strings, or tuples. By organizing a program’s values into data structures, you can create representations of real-world objects. You saw an example of this with a tic-tac-toe board.

That just about covers all the basic concepts of Python programming! You’ll continue to learn new concepts throughout the rest of this book, but you now know enough to start writing some useful programs that can automate tasks. You might not think you have enough Python knowledge to do things such as download web pages, update spreadsheets, or send text messages, but that’s where Python modules come in! These modules, written by other programmers, provide functions that make it easy for you to do all these things. So let’s learn how to write real programs to do useful automated tasks.

Practice Questions

Q: 1. What does the code for an empty dictionary look like?

Q: 2. What does a dictionary value with a key 'foo' and a value 42 look like?

Q: 3. What is the main difference between a dictionary and a list?

Q: 4. What happens if you try to access spam['foo'] if spam is {'bar': 100}?

Q: 5. If a dictionary is stored in spam, what is the difference between the expressions 'cat' in spam and 'cat' in spam.keys()?

Q: 6. If a dictionary is stored in spam, what is the difference between the expressions 'cat' in spam and 'cat' in spam.values()?

Q: 7. What is a shortcut for the following code?

if 'color' not in spam:
    spam['color'] = 'black'

Q: 8. What module and function can be used to “pretty print” dictionary values?

Practice Projects

For practice, write programs to do the following tasks.

Fantasy Game Inventory

You are creating a fantasy video game. The data structure to model the player’s inventory will be a dictionary where the keys are string values describing the item in the inventory and the value is an integer value detailing how many of that item the player has. For example, the dictionary value {'rope': 1, 'torch': 6, 'gold coin': 42, 'dagger': 1, 'arrow': 12} means the player has 1 rope, 6 torches, 42 gold coins, and so on.

Write a function named displayInventory() that would take any possible “inventory” and display it like the following:

Inventory:
12 arrow
42 gold coin
1 rope
6 torch
1 dagger
Total number of items: 62

Hint: You can use a for loop to loop through all the keys in a dictionary.

# inventory.py
stuff = {'rope': 1, 'torch': 6, 'gold coin': 42, 'dagger': 1, 'arrow': 12}

def displayInventory(inventory):
    print("Inventory:")
    item_total = 0
    for k, v in inventory.items():
        print(str(v) + ' ' + k)
        item_total += v
    print("Total number of items: " + str(item_total))

displayInventory(stuff)

List to Dictionary Function for Fantasy Game Inventory

Imagine that a vanquished dragon’s loot is represented as a list of strings like this:

dragonLoot = ['gold coin', 'dagger', 'gold coin', 'gold coin', 'ruby']

Write a function named addToInventory(inventory, addedItems), where the inventory parameter is a dictionary representing the player’s inventory (like in the previous project) and the addedItems parameter is a list like dragonLoot. The addToInventory() function should return a dictionary that represents the updated inventory. Note that the addedItems list can contain multiples of the same item. Your code could look something like this:

def addToInventory(inventory, addedItems):
    # your code goes here

inv = {'gold coin': 42, 'rope': 1}
dragonLoot = ['gold coin', 'dagger', 'gold coin', 'gold coin', 'ruby']
inv = addToInventory(inv, dragonLoot)
displayInventory(inv)

The previous program (with your displayInventory() function from the previous project) would output the following:

Inventory:
45 gold coin
1 rope
1 ruby
1 dagger
Total number of items: 48

Chapter 6. Manipulating Strings

Text is one of the most common forms of data your programs will handle. You already know how to concatenate two string values together with the + operator, but you can do much more than that. You can extract partial strings from string values, add or remove spacing, convert letters to lowercase or uppercase, and check that strings are formatted correctly. You can even write Python code to access the clipboard for copying and pasting text.

In this chapter, you’ll learn all this and more. Then you’ll work through two different programming projects: a simple password manager and a program to automate the boring chore of formatting pieces of text.

Working with Strings

Let’s look at some of the ways Python lets you write, print, and access strings in your code.

String Literals

Typing string values in Python code is fairly straightforward: They begin and end with a single quote. But then how can you use a quote inside a string? Typing 'That is Alice's cat.' won’t work, because Python thinks the string ends after Alice, and the rest (s cat.') is invalid Python code. Fortunately, there are multiple ways to type strings.

Double Quotes

Strings can begin and end with double quotes, just as they do with single quotes. One benefit of using double quotes is that the string can have a single quote character in it. Enter the following into the interactive shell:

>>> spam = "That is Alice's cat."

Since the string begins with a double quote, Python knows that the single quote is part of the string and not marking the end of the string. However, if you need to use both single quotes and double quotes in the string, you’ll need to use escape characters.

Escape Characters

An escape character lets you use characters that are otherwise impossible to put into a string. An escape character consists of a backslash (\) followed by the character you want to add to the string. (Despite consisting of two characters, it is commonly referred to as a singular escape character.) For example, the escape character for a single quote is \'. You can use this inside a string that begins and ends with single quotes. To see how escape characters work, enter the following into the interactive shell:

>>> spam = 'Say hi to Bob\'s mother.'

Python knows that since the single quote in Bob\'s has a backslash, it is not a single quote meant to end the string value. The escape characters \' and \" let you put single quotes and double quotes inside your strings, respectively.

Table 6-1 lists the escape characters you can use.

Table 6-1. Escape Characters

Escape character    Prints as
\'                  Single quote
\"                  Double quote
\t                  Tab
\n                  Newline (line break)
\\                  Backslash

Enter the following into the interactive shell:

>>> print("Hello there!\nHow are you?\nI\'m doing fine.")
Hello there!
How are you?
I'm doing fine.
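As the note above says, each escape character is typed as two characters but stored as one. A short sketch makes this visible with len() (the variable names quote and twoLines are illustrative):

```python
# Both kinds of quotes inside a single-quoted string.
quote = 'Bob\'s mother said, "Hi!"'

# A newline escape between two words.
twoLines = 'first\nsecond'

print(quote)
print(twoLines)
print(len(twoLines))  # counts \n as a single character
```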

Raw Strings

You can place an r before the beginning quotation mark of a string to make it a raw string. A raw string completely ignores all escape characters and prints any backslash that appears in the string. For example, type the following into the interactive shell:

>>> print(r'That is Carol\'s cat.')
That is Carol\'s cat.

Because this is a raw string, Python considers the backslash as part of the string and not as the start of an escape character. Raw strings are helpful if you are typing string values that contain many backslashes, such as the strings used for regular expressions described in the next chapter.
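Windows-style file paths are another common case where backslashes pile up. The sketch below uses a made-up path to compare a regular string with a raw string (the variable names plain and raw are illustrative):

```python
# In a regular string, \n and \t are read as newline and tab escapes.
plain = 'C:\new\table'

# In a raw string, every backslash is kept as an ordinary character.
raw = r'C:\new\table'

print(len(plain))
print(len(raw))
print(raw)
```

The regular string ends up shorter because two of its backslash pairs collapse into single characters.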

Multiline Strings with Triple Quotes

While you can use the \n escape character to put a newline into a string, it is often easier to use multiline strings. A multiline string in Python begins and ends with either three single quotes or three double quotes. Any quotes, tabs, or newlines in between the “triple quotes” are considered part of the string. Python’s indentation rules for blocks do not apply to lines inside a multiline string.

Open the file editor and write the following:

print('''Dear Alice,

Eve's cat has been arrested for catnapping, cat burglary, and extortion.

Sincerely,
Bob''')

Save this program as catnapping.py and run it. The output will look like this:

Dear Alice,

Eve's cat has been arrested for catnapping, cat burglary, and extortion.

Sincerely,
Bob

Notice that the single quote character in Eve's does not need to be escaped. Escaping single and double quotes is optional in multiline strings. The following print() call would print identical text but doesn’t use a multiline string:

print('Dear Alice,\n\nEve\'s cat has been arrested for catnapping, cat burglary, and extortion.\n\nSincerely,\nBob')

Multiline Comments

While the hash character (#) marks the beginning of a comment for the rest of the line, a multiline string is often used for comments that span multiple lines. The following is perfectly valid Python code:

"""This is a test Python program.
[email protected]

This program was designed for Python 3, not Python 2.
"""

def spam():
    """This is a multiline comment to help
    explain what the spam() function does."""
    print('Hello!')

Indexing and Slicing Strings

Strings use indexes and slices the same way lists do. You can think of the string 'Hello world!' as a list and each character in the string as an item with a corresponding index.

' H  e  l  l  o     w  o  r  l  d  !  '
  0  1  2  3  4  5  6  7  8  9  10 11

The space and exclamation point are included in the character count, so 'Hello world!' is 12 characters long, from H at index 0 to ! at index 11.

Enter the following into the interactive shell:

>>> spam = 'Hello world!'
>>> spam[0]
'H'
>>> spam[4]
'o'
>>> spam[-1]
'!'
>>> spam[0:5]
'Hello'
>>> spam[:5]
'Hello'
>>> spam[6:]
'world!'

If you specify an index, you’ll get the character at that position in the string. If you specify a range from one index to another, the starting index is included and the ending index is not. That’s why, if spam is 'Hello world!', spam[0:5] is 'Hello'. The substring you get from spam[0:5] will include everything from spam[0] to spam[4], leaving out the space at index 5.

Note that slicing a string does not modify the original string. You can capture a slice from one variable in a separate variable. Try typing the following into the interactive shell:

>>> spam = 'Hello world!'
>>> fizz = spam[0:5]
>>> fizz
'Hello'

By slicing and storing the resulting substring in another variable, you can have both the whole string and the substring handy for quick, easy access.
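Negative indexes work in slices too, counting back from the end of the string just as they do for lists. A small sketch (the variable names lastWord, withoutEnd, and middle are illustrative):

```python
spam = 'Hello world!'

lastWord = spam[-6:]    # from six characters before the end, to the end
withoutEnd = spam[:-1]  # everything except the final character
middle = spam[3:8]      # indexes 3 through 7

print(lastWord)
print(withoutEnd)
print(middle)
print(spam)             # the original string is unchanged
```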

The in and not in Operators with Strings

The in and not in operators can be used with strings just like with list values. An expression with two strings joined using in or not in will evaluate to a Boolean True or False. Enter the following into the interactive shell:

>>> 'Hello' in 'Hello World'
True
>>> 'Hello' in 'Hello'
True
>>> 'HELLO' in 'Hello World'
False
>>> '' in 'spam'
True
>>> 'cats' not in 'cats and dogs'
False

These expressions test whether the first string (the exact string, case sensitive) can be found within the second string.

Useful String Methods

Several string methods analyze strings or create transformed string values. This section describes the methods you’ll be using most often.

The upper(), lower(), isupper(), and islower() String Methods

The upper() and lower() string methods return a new string where all the letters in the original string have been converted to uppercase or lowercase, respectively. Nonletter characters in the string remain unchanged. Enter the following into the interactive shell:

>>> spam = 'Hello world!'
>>> spam = spam.upper()
>>> spam
'HELLO WORLD!'
>>> spam = spam.lower()
>>> spam
'hello world!'

Note that these methods do not change the string itself but return new string values. If you want to change the original string, you have to call upper() or lower() on the string and then assign the new string to the variable where the original was stored. This is why you must use spam = spam.upper() to change the string in spam instead of simply spam.upper(). (This is just like if a variable eggs contains the value 10. Writing eggs + 3 does not change the value of eggs, but eggs = eggs + 3 does.)

The upper() and lower() methods are helpful if you need to make a case-insensitive comparison. The strings 'great' and 'GREat' are not equal to each other. But in the following small program, it does not matter whether the user types Great, GREAT, or grEAT, because the string is first converted to lowercase.

print('How are you?')
feeling = input()
if feeling.lower() == 'great':
    print('I feel great too.')
else:
    print('I hope the rest of your day is good.')

When you run this program, the question is displayed, and entering a variation on great, such as GREat, will still give the output I feel great too. Adding code to your program to handle variations or mistakes in user input, such as inconsistent capitalization, will make your programs easier to use and less likely to fail.

How are you?
GREat
I feel great too.

The isupper() and islower() methods will return a Boolean True value if the string has at least one letter and all the letters are uppercase or lowercase, respectively. Otherwise, the method returns False. Enter the following into the interactive shell, and notice what each method call returns:

>>> spam = 'Hello world!'
>>> spam.islower()
False
>>> spam.isupper()
False
>>> 'HELLO'.isupper()
True
>>> 'abc12345'.islower()
True
>>> '12345'.islower()
False
>>> '12345'.isupper()
False

Since the upper() and lower() string methods themselves return strings, you can call string methods on those returned string values as well. Expressions that do this will look like a chain of method calls. Enter the following into the interactive shell:

>>> 'Hello'.upper()
'HELLO'
>>> 'Hello'.upper().lower()
'hello'
>>> 'Hello'.upper().lower().upper()
'HELLO'
>>> 'HELLO'.lower()
'hello'
>>> 'HELLO'.lower().islower()
True

The isX String Methods

Along with islower() and isupper(), there are several string methods that have names beginning with the word is. These methods return a Boolean value that describes the nature of the string. Here are some common isX string methods:

isalpha() returns True if the string consists only of letters and is not blank.

isalnum() returns True if the string consists only of letters and numbers and is not blank.

isdecimal() returns True if the string consists only of numeric characters and is not blank.

isspace() returns True if the string consists only of spaces, tabs, and newlines and is not blank.

istitle() returns True if the string consists only of words that begin with an uppercase letter followed by only lowercase letters.

Enter the following into the interactive shell:

>>> 'hello'.isalpha()
True
>>> 'hello123'.isalpha()
False
>>> 'hello123'.isalnum()
True
>>> 'hello'.isalnum()
True
>>> '123'.isdecimal()
True
>>> '    '.isspace()
True
>>> 'This Is Title Case'.istitle()
True
>>> 'This Is Title Case 123'.istitle()
True
>>> 'This Is not Title Case'.istitle()
False
>>> 'This Is NOT Title Case Either'.istitle()
False

The isX string methods are helpful when you need to validate user input. For example, the following program repeatedly asks users for their age and a password until they provide valid input. Open a new file editor window and enter this program, saving it as validateInput.py:

while True:
    print('Enter your age:')
    age = input()
    if age.isdecimal():
        break
    print('Please enter a number for your age.')

while True:
    print('Select a new password (letters and numbers only):')
    password = input()
    if password.isalnum():
        break
    print('Passwords can only have letters and numbers.')

In the first while loop, we ask the user for their age and store their input in age. If age is a valid (decimal) value, we break out of this first while loop and move on to the second, which asks for a password. Otherwise, we inform the user that they need to enter a number and again ask them to enter their age. In the second while loop, we ask for a password, store the user’s input in password, and break out of the loop if the input was alphanumeric. If it wasn’t, we’re not satisfied, so we tell the user the password needs to be alphanumeric and again ask them to enter a password.

When run, the program’s output looks like this:

Enter your age:
forty two
Please enter a number for your age.
Enter your age:
42
Select a new password (letters and numbers only):
secr3t!
Passwords can only have letters and numbers.
Select a new password (letters and numbers only):
secr3t

Calling isdecimal() and isalnum() on variables, we’re able to test whether the values stored in those variables are decimal or not, alphanumeric or not. Here, these tests help us reject the input forty two and accept 42, and reject secr3t! and accept secr3t.
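It can help to see which inputs isdecimal() turns away, because a few strings that look numeric to a person are still rejected. The following sketch runs the check over a made-up list named samples (the values are illustrative, not taken from the program above):

```python
# Each of these is a hypothetical piece of user input.
samples = ['42', '-5', '4.2', ' 42', '']

results = []
for text in samples:
    # isdecimal() is True only when every character is a digit
    # and the string is not empty.
    results.append(text.isdecimal())
    print(repr(text), text.isdecimal())
```

Only '42' passes. The minus sign, the decimal point, the leading space, and the empty string all cause isdecimal() to return False, so a program that needs to accept negative or fractional numbers would need a different check.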

The startswith() and endswith() String Methods

The startswith() and endswith() methods return True if the string value they are called on begins or ends (respectively) with the string passed to the method; otherwise, they return False. Enter the following into the interactive shell:

>>> 'Hello world!'.startswith('Hello')
True
>>> 'Hello world!'.endswith('world!')
True
>>> 'abc123'.startswith('abcdef')
False
>>> 'abc123'.endswith('12')
False
>>> 'Hello world!'.startswith('Hello world!')
True
>>> 'Hello world!'.endswith('Hello world!')
True

These methods are useful alternatives to the == equals operator if you need to check only whether the first or last part of the string, rather than the whole thing, is equal to another string.
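A typical use for endswith() is picking out filenames by their extension. The sketch below is only an illustration: the filenames list and the textFiles and isImage variables are made-up names, not part of any program in this chapter.

```python
# Hypothetical filenames, made up for this example.
filenames = ['report.txt', 'notes.TXT', 'photo.png', 'draft_report.txt']

textFiles = []
for name in filenames:
    # Lowercase a copy of the name so '.TXT' and '.txt' are treated alike.
    if name.lower().endswith('.txt'):
        textFiles.append(name)
print(textFiles)

# Passing a tuple of strings checks several possible endings in one call.
isImage = 'photo.png'.endswith(('.png', '.jpg'))
print(isImage)
```

Combining lower() with endswith() keeps the check case insensitive without changing the names stored in the list.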

The join() and split() String Methods

The join() method is useful when you have a list of strings that need to be joined together into a single string value. The join() method is called on a string, gets passed a list of strings, and returns a string. The returned string is the concatenation of each string in the passed-in list. For example, enter the following into the interactive shell:

>>> ', '.join(['cats', 'rats', 'bats'])
'cats, rats, bats'
>>> ' '.join(['My', 'name', 'is', 'Simon'])
'My name is Simon'
>>> 'ABC'.join(['My', 'name', 'is', 'Simon'])
'MyABCnameABCisABCSimon'

Notice that the string join() calls on is inserted between each string of the list argument. For example, when join(['cats', 'rats', 'bats']) is called on the ', ' string, the returned string is 'cats, rats, bats'.

Remember that join() is called on a string value and is passed a list value. (It’s easy to accidentally call it the other way around.) The split() method does the opposite: It’s called on a string value and returns a list of strings. Enter the following into the interactive shell:

>>> 'My name is Simon'.split()
['My', 'name', 'is', 'Simon']

By default, the string 'My name is Simon' is split wherever whitespace characters such as the space, tab, or newline characters are found. These whitespace characters are not included in the strings in the returned list. You can pass a delimiter string to the split() method to specify a different string to split upon. For example, enter the following into the interactive shell:

>>> 'MyABCnameABCisABCSimon'.split('ABC')
['My', 'name', 'is', 'Simon']
>>> 'My name is Simon'.split('m')
['My na', 'e is Si', 'on']

A common use of split() is to split a multiline string along the newline characters. Enter the following into the interactive shell:

>>> spam = '''Dear Alice,
How have you been? I am fine.
There is a container in the fridge
that is labeled "Milk Experiment".

Please do not drink it.
Sincerely,
Bob'''
>>> spam.split('\n')
['Dear Alice,', 'How have you been? I am fine.', 'There is a container in the
fridge', 'that is labeled "Milk Experiment".', '', 'Please do not drink it.',
'Sincerely,', 'Bob']

Passing split() the argument '\n' lets us split the multiline string stored in spam along the newlines and return a list in which each item corresponds to one line of the string.
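Because split() and join() are opposites, splitting a string and joining the pieces with the same delimiter gives back the original text. One thing to watch for is that join() only accepts strings, so other values need converting with str() first. A small sketch, using made-up variable names and data:

```python
# Hypothetical data for illustration.
text = 'cats,rats,bats'

# Splitting and rejoining with the same delimiter is a round trip.
animals = text.split(',')
rejoined = ','.join(animals)
print(animals)
print(rejoined == text)

# join() raises a TypeError if the list holds non-string values,
# so convert each number to a string before joining.
scores = [4, 8, 15]
scoreStrings = []
for score in scores:
    scoreStrings.append(str(score))
scoreLine = ', '.join(scoreStrings)
print(scoreLine)
```

The loop builds a second list of strings rather than changing scores, so the original numbers remain available for arithmetic.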

Justifying Text with rjust(), ljust(), and center()

The rjust() and ljust() string methods return a padded version of the string they are called on, with spaces inserted to justify the text. The first argument to both methods is an integer length for the justified string. Enter the following into the interactive shell:

>>> 'Hello'.rjust(10)
'     Hello'
>>> 'Hello'.rjust(20)
'               Hello'
>>> 'Hello World'.rjust(20)
'         Hello World'
>>> 'Hello'.ljust(10)
'Hello     '

'Hello'.rjust(10) says that we want to right-justify 'Hello' in a string of total length 10. 'Hello' is five characters, so five spaces will be added to its left, giving us a string of 10 characters with 'Hello' justified right.

An optional second argument to rjust() and ljust() will specify a fill character other than a space character. Enter the following into the interactive shell:

>>> 'Hello'.rjust(20, '*')
'***************Hello'
>>> 'Hello'.ljust(20, '-')
'Hello---------------'

The center() string method works like ljust() and rjust() but centers the text rather than justifying it to the left or right. Enter the following into the interactive shell:

>>> 'Hello'.center(20)
'       Hello        '
>>> 'Hello'.center(20, '=')
'=======Hello========'

These methods are especially useful when you need to print tabular data that has the correct spacing. Open a new file editor window and enter the following code, saving it as picnicTable.py:

def printPicnic(itemsDict, leftWidth, rightWidth):
    print('PICNIC ITEMS'.center(leftWidth + rightWidth, '-'))
    for k, v in itemsDict.items():
        print(k.ljust(leftWidth, '.') + str(v).rjust(rightWidth))

picnicItems = {'sandwiches': 4, 'apples': 12, 'cups': 4, 'cookies': 8000}
printPicnic(picnicItems, 12, 5)
printPicnic(picnicItems, 20, 6)

In this program, we define a printPicnic() function that will take in a dictionary of information and use center(), ljust(), and rjust() to display that information in a neatly aligned table-like format.

The dictionary that we’ll pass to printPicnic() is picnicItems. In picnicItems, we have 4 sandwiches, 12 apples, 4 cups, and 8000 cookies. We want to organize this information into two columns, with the name of the item on the left and the quantity on the right.

To do this, we decide how wide we want the left and right columns to be. Along with our dictionary, we’ll pass these values to printPicnic().

printPicnic() takes in a dictionary, a leftWidth for the left column of a table, and a rightWidth for the right column. It prints a title, PICNIC ITEMS, centered above the table. Then, it loops through the dictionary, printing each key-value pair on a line with the key justified left and padded by periods, and the value justified right and padded by spaces.

After defining printPicnic(), we define the dictionary picnicItems and call printPicnic() twice, passing it different widths for the left and right table columns.

When you run this program, the picnic items are displayed twice. The first time the left column is 12 characters wide, and the right column is 5 characters wide. The second time they are 20 and 6 characters wide, respectively.

---PICNIC ITEMS--
sandwiches..    4
apples......   12
cups........    4
cookies..... 8000
-------PICNIC ITEMS-------
sandwiches..........     4
apples..............    12
cups................     4
cookies.............  8000

Using rjust(), ljust(), and center() lets you ensure that strings are neatly aligned, even if you aren’t sure how many characters long your strings are.
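If you would rather not pick the left column width by hand, one option is to measure the longest key first. The sketch below is an assumption about how you might do that, not part of picnicTable.py; the names longest and rows and the choice of two extra periods of padding are all illustrative.

```python
picnicItems = {'sandwiches': 4, 'apples': 12, 'cups': 4, 'cookies': 8000}

# Find the length of the longest item name.
longest = 0
for name in picnicItems.keys():
    if len(name) > longest:
        longest = len(name)

# Leave room for at least two periods after the longest name.
leftWidth = longest + 2

rows = []
for name, count in picnicItems.items():
    rows.append(name.ljust(leftWidth, '.') + str(count).rjust(5))

for row in rows:
    print(row)
```

With this dictionary the longest name is sandwiches, so leftWidth works out to 12 and the rows match the first table printed by picnicTable.py.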

Removing Whitespace with strip(), rstrip(), and lstrip()

Sometimes you may want to strip off whitespace characters (space, tab, and newline) from the left side, right side, or both sides of a string. The strip() string method will return a new string without any whitespace characters at the beginning or end. The lstrip() and rstrip() methods will remove whitespace characters from the left and right ends, respectively. Enter the following into the interactive shell:

>>> spam = '    Hello World    '
>>> spam.strip()
'Hello World'
>>> spam.lstrip()
'Hello World    '
>>> spam.rstrip()
'    Hello World'

Optionally, a string argument will specify which characters on the ends should be stripped. Enter the following into the interactive shell:

>>> spam = 'SpamSpamBaconSpamEggsSpamSpam'
>>> spam.strip('ampS')
'BaconSpamEggs'

Passing strip() the argument 'ampS' will tell it to strip occurrences of a, m, p, and capital S from the ends of the string stored in spam. The order of the characters in the string passed to strip() does not matter: strip('ampS') will do the same thing as strip('mapS') or strip('Spam').
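The argument is treated as a set of individual characters to remove, and stripping stops at the first character that is not in that set. A brief sketch with a made-up price string shows how the three methods differ:

```python
# Hypothetical value for illustration.
price = '$$$19.99 USD'

leftStripped = price.lstrip('$')      # removes only the leading dollar signs
rightStripped = price.rstrip(' USD')  # removes trailing spaces, U, S, and D
bothStripped = price.strip('$ USD')   # removes those characters from both ends
print(leftStripped)
print(rightStripped)
print(bothStripped)

# Characters in the middle are never touched, even if they match.
middleKept = 'xxhixhixx'.strip('x')
print(middleKept)
```

Note that rstrip(' USD') does not look for the exact text ' USD'. It removes any trailing run of spaces and the letters U, S, and D, in whatever order they appear.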

Copying and Pasting Strings with the pyperclip Module

The pyperclip module has copy() and paste() functions that can send text to and receive text from your computer’s clipboard. Sending the output of your program to the clipboard will make it easy to paste it to an email, word processor, or some other software.

Pyperclip does not come with Python. To install it, follow the directions for installing third-party modules in Appendix A. After installing the pyperclip module, enter the following into the interactive shell:

>>> import pyperclip
>>> pyperclip.copy('Hello world!')
>>> pyperclip.paste()
'Hello world!'

Of course, if something outside of your program changes the clipboard contents, the paste() function will return it. For example, if I copied this sentence to the clipboard and then called paste(), it would look like this:

>>> pyperclip.paste()
'For example, if I copied this sentence to the clipboard and then called
paste(), it would look like this:'

RUNNING PYTHON SCRIPTS OUTSIDE OF IDLE

So far, you’ve been running your Python scripts using the interactive shell and file editor in IDLE. However, you won’t want to go through the inconvenience of opening IDLE and the Python script each time you want to run a script. Fortunately, there are shortcuts you can set up to make running Python scripts easier. The steps are slightly different for Windows, OS X, and Linux, but each is described in Appendix B. Turn to Appendix B to learn how to run your Python scripts conveniently and be able to pass command line arguments to them. (You will not be able to pass command line arguments to your programs using IDLE.)

Project: Password Locker

You probably have accounts on many different websites. It’s a bad habit to use the same password for each of them because if any of those sites has a security breach, the hackers will learn the password to all of your other accounts. It’s best to use password manager software on your computer that uses one master password to unlock the password manager. Then you can copy any account password to the clipboard and paste it into the website’s Password field.

The password manager program you’ll create in this example isn’t secure, but it offers a basic demonstration of how such programs work.

THE CHAPTER PROJECTS

This is the first “chapter project” of the book. From here on, each chapter will have projects that demonstrate the concepts covered in the chapter. The projects are written in a style that takes you from a blank file editor window to a full, working program. Just like with the interactive shell examples, don’t only read the project sections; follow along on your computer!

Step 1: Program Design and Data Structures

You want to be able to run this program with a command line argument that is the account’s name (for instance, email or blog). That account’s password will be copied to the clipboard so that the user can paste it into a Password field. This way, the user can have long, complicated passwords without having to memorize them.

Open a new file editor window and save the program as pw.py. You need to start the program with a #! (shebang) line (see Appendix B) and should also write a comment that briefly describes the program. Since you want to associate each account’s name with its password, you can store these as strings in a dictionary. The dictionary will be the data structure that organizes your account and password data. Make your program look like the following:

#! python3
# pw.py - An insecure password locker program.

PASSWORDS = {'email': 'F7minlBDDuvMJuxESSKHFhTxFtjVB6',
             'blog': 'VmALvQyKAxiVH5G8v01if1MLZF3sdt',
             'luggage': '12345'}

Step 2: Handle Command Line Arguments

The command line arguments will be stored in the variable sys.argv. (See Appendix B for more information on how to use command line arguments in your programs.) The first item in the sys.argv list should always be a string containing the program’s filename ('pw.py'), and the second item should be the first command line argument. For this program, this argument is the name of the account whose password you want. Since the command line argument is mandatory, you display a usage message to the user if they forget to add it (that is, if the sys.argv list has fewer than two values in it). Make your program look like the following:

#! python3
# pw.py - An insecure password locker program.

PASSWORDS = {'email': 'F7minlBDDuvMJuxESSKHFhTxFtjVB6',
             'blog': 'VmALvQyKAxiVH5G8v01if1MLZF3sdt',
             'luggage': '12345'}

import sys
if len(sys.argv) < 2:
    print('Usage: python pw.py [account] - copy account password')
    sys.exit()

account = sys.argv[1]    # first command line arg is the account name

Step 3: Copy the Right Password

Now that the account name is stored as a string in the variable account, you need to see whether it exists in the PASSWORDS dictionary as a key. If so, you want to copy the key’s value to the clipboard using pyperclip.copy(). (Since you’re using the pyperclip module, you need to import it.) Note that you don’t actually need the account variable; you could just use sys.argv[1] everywhere account is used in this program. But a variable named account is much more readable than something cryptic like sys.argv[1].

Make your program look like the following:

#! python3
# pw.py - An insecure password locker program.

PASSWORDS = {'email': 'F7minlBDDuvMJuxESSKHFhTxFtjVB6',
             'blog': 'VmALvQyKAxiVH5G8v01if1MLZF3sdt',
             'luggage': '12345'}

import sys, pyperclip
if len(sys.argv) < 2:
    print('Usage: python pw.py [account] - copy account password')
    sys.exit()

account = sys.argv[1]    # first command line arg is the account name

if account in PASSWORDS:
    pyperclip.copy(PASSWORDS[account])
    print('Password for ' + account + ' copied to clipboard.')
else:
    print('There is no account named ' + account)

This new code looks in the PASSWORDS dictionary for the account name. If the account name is a key in the dictionary, we get the value corresponding to that key, copy it to the clipboard, and print a message saying that we copied the value. Otherwise, we print a message saying there’s no account with that name.

That’s the complete script. Using the instructions in Appendix B for launching command line programs easily, you now have a fast way to copy your account passwords to the clipboard. You will have to modify the PASSWORDS dictionary value in the source whenever you want to update the program with a new password.

Of course, you probably don’t want to keep all your passwords in one place where anyone could easily copy them. But you can modify this program and use it to quickly copy regular text to the clipboard. Say you are sending out several emails that have many of the same stock paragraphs in common. You could put each paragraph as a value in the PASSWORDS dictionary (you’d probably want to rename the dictionary at this point), and then you would have a way to quickly select and copy one of many standard pieces of text to the clipboard.
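As a rough sketch of that stock-paragraph idea, the lookup can be kept in a small function that takes a list shaped like sys.argv, which makes it easy to try out without the command line or the clipboard. Everything here is hypothetical: the SNIPPETS dictionary, its keys and text, and the getSnippet() function are made up for illustration.

```python
# Hypothetical stock paragraphs; replace these with your own text.
SNIPPETS = {'agree': 'Yes, I agree. That sounds fine to me.',
            'busy': 'Sorry, can we do this later this week or next week?'}

def getSnippet(argv):
    # argv stands in for sys.argv: the program name, then the snippet name.
    if len(argv) < 2:
        return None
    # get() returns None when the key is missing instead of raising an error.
    return SNIPPETS.get(argv[1])

print(getSnippet(['snippets.py', 'busy']))
print(getSnippet(['snippets.py', 'unknown']))
print(getSnippet(['snippets.py']))
```

In a full program you would call getSnippet(sys.argv) and pass any result that is not None to pyperclip.copy(), just as pw.py does with passwords.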

On Windows, you can create a batch file to run this program with the WIN-R Run window. (For more about batch files, see Appendix B.) Type the following into the file editor and save the file as pw.bat in the C:\Windows folder:

@py.exe C:\Python34\pw.py %*
@pause

With this batch file created, running the password-safe program on Windows is just a matter of pressing WIN-R and typing pw <account name>.

Project: Adding Bullets to Wiki Markup

When editing a Wikipedia article, you can create a bulleted list by putting each list item on its own line and placing a star in front. But say you have a really large list that you want to add bullet points to. You could just type those stars at the beginning of each line, one by one. Or you could automate this task with a short Python script.

The bulletPointAdder.py script will get the text from the clipboard, add a star and space to the beginning of each line, and then paste this new text to the clipboard. For example, if I copied the following text (for the Wikipedia article “List of Lists of Lists”) to the clipboard:

Lists of animals
Lists of aquarium life
Lists of biologists by author abbreviation
Lists of cultivars

and then ran the bulletPointAdder.py program, the clipboard would then contain the following:

* Lists of animals
* Lists of aquarium life
* Lists of biologists by author abbreviation
* Lists of cultivars

This star-prefixed text is ready to be pasted into a Wikipedia article as a bulleted list.

Step 1: Copy and Paste from the Clipboard

You want the bulletPointAdder.py program to do the following:

1. Paste text from the clipboard
2. Do something to it
3. Copy the new text to the clipboard

That second step is a little tricky, but steps 1 and 3 are pretty straightforward: They just involve the pyperclip.copy() and pyperclip.paste() functions. For now, let’s just write the part of the program that covers steps 1 and 3. Enter the following, saving the program as bulletPointAdder.py:

#! python3
# bulletPointAdder.py - Adds Wikipedia bullet points to the start
# of each line of text on the clipboard.

import pyperclip
text = pyperclip.paste()

# TODO: Separate lines and add stars.

pyperclip.copy(text)

The TODO comment is a reminder that you should complete this part of the program eventually. The next step is to actually implement that piece of the program.

Step 2: Separate the Lines of Text and Add the Star

The call to pyperclip.paste() returns all the text on the clipboard as one big string. If we used the “List of Lists of Lists” example, the string stored in text would look like this:

'Lists of animals\nLists of aquarium life\nLists of biologists by author
abbreviation\nLists of cultivars'

The \n newline characters in this string cause it to be displayed with multiple lines when it is printed or pasted from the clipboard. There are many “lines” in this one string value. You want to add a star to the start of each of these lines.

You could write code that searches for each \n newline character in the string and then adds the star just after that. But it would be easier to use the split() method to return a list of strings, one for each line in the original string, and then add the star to the front of each string in the list.

Make your program look like the following:

#! python3
# bulletPointAdder.py - Adds Wikipedia bullet points to the start
# of each line of text on the clipboard.

import pyperclip
text = pyperclip.paste()

# Separate lines and add stars.
lines = text.split('\n')
for i in range(len(lines)):    # loop through all indexes in the "lines" list
    lines[i] = '* ' + lines[i] # add star to each string in "lines" list

pyperclip.copy(text)

We split the text along its newlines to get a list in which each item is one line of the text. We store the list in lines and then loop through the items in lines. For each line, we add a star and a space to the start of the line. Now each string in lines begins with a star.

Step 3: Join the Modified Lines

The lines list now contains modified lines that start with stars. But pyperclip.copy() is expecting a single string value, not a list of string values. To make this single string value, pass lines into the join() method to get a single string joined from the list’s strings. Make your program look like the following:

#! python3
# bulletPointAdder.py - Adds Wikipedia bullet points to the start
# of each line of text on the clipboard.

import pyperclip
text = pyperclip.paste()

# Separate lines and add stars.
lines = text.split('\n')
for i in range(len(lines)):    # loop through all indexes for "lines" list
    lines[i] = '* ' + lines[i] # add star to each string in "lines" list
text = '\n'.join(lines)
pyperclip.copy(text)

When this program is run, it replaces the text on the clipboard with text that has stars at the start of each line. Now the program is complete, and you can try running it with text copied to the clipboard.

Even if you don’t need to automate this specific task, you might want to automate some other kind of text manipulation, such as removing trailing spaces from the end of lines or converting text to uppercase or lowercase. Whatever your needs, you can use the clipboard for input and output.
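As one illustration of reusing the same split, modify, join pattern, here is a sketch that removes trailing spaces from every line. The function name stripTrailingSpaces() and the sample text are made up for this example; the clipboard calls are left out so the text handling can be tried on its own.

```python
def stripTrailingSpaces(text):
    # Same three steps as bulletPointAdder.py: split, change each line, join.
    lines = text.split('\n')
    for i in range(len(lines)):
        lines[i] = lines[i].rstrip(' ')  # drop spaces at the end of the line
    return '\n'.join(lines)

# Hypothetical sample with stray spaces after the first two lines.
sample = 'one  \ntwo \nthree'
cleaned = stripTrailingSpaces(sample)
print(repr(cleaned))
```

To turn this into a clipboard tool, you could pass pyperclip.paste() to the function and hand the result to pyperclip.copy(), assuming pyperclip is installed as described earlier.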

Summary

Text is a common form of data, and Python comes with many helpful string methods to process the text stored in string values. You will make use of indexing, slicing, and string methods in almost every Python program you write.

The programs you are writing now don’t seem too sophisticated: they don’t have graphical user interfaces with images and colorful text. So far, you’re displaying text with print() and letting the user enter text with input(). However, the user can quickly enter large amounts of text through the clipboard. This ability provides a useful avenue for writing programs that manipulate massive amounts of text. These text-based programs might not have flashy windows or graphics, but they can get a lot of useful work done quickly.

Another way to manipulate large amounts of text is reading and writing files directly off the hard drive. You’ll learn how to do this with Python in the next chapter.

Practice Questions

Q: 1. What are escape characters?

Q: 2. What do the \n and \t escape characters represent?

Q: 3. How can you put a \ backslash character in a string?

Q: 4. The string value "Howl's Moving Castle" is a valid string. Why isn’t it a problem that the single quote character in the word Howl's isn’t escaped?

Q: 5. If you don’t want to put \n in your string, how can you write a string with newlines in it?

Q: 6. What do the following expressions evaluate to?

'Hello world!'[1]
'Hello world!'[0:5]
'Hello world!'[:5]
'Hello world!'[3:]

Q: 7. What do the following expressions evaluate to?

'Hello'.upper()
'Hello'.upper().isupper()
'Hello'.upper().lower()

Q: 8. What do the following expressions evaluate to?

'Remember, remember, the fifth of November.'.split()
'-'.join('There can be only one.'.split())

Q: 9. What string methods can you use to right-justify, left-justify, and center a string?

Q: 10. How can you trim whitespace characters from the beginning or end of a string?

Practice Project

For practice, write a program that does the following.

Table Printer

Write a function named printTable() that takes a list of lists of strings and displays it in a well-organized table with each column right-justified. Assume that all the inner lists will contain the same number of strings. For example, the value could look like this:

tableData = [['apples', 'oranges', 'cherries', 'banana'],
             ['Alice', 'Bob', 'Carol', 'David'],
             ['dogs', 'cats', 'moose', 'goose']]

Your printTable() function would print the following:

  apples Alice  dogs
 oranges   Bob  cats
cherries Carol moose
  banana David goose

Hint: Your code will first have to find the longest string in each of the inner lists so that the whole column can be wide enough to fit all the strings. You can store the maximum width of each column as a list of integers. The printTable() function can begin with colWidths = [0] * len(tableData), which will create a list containing the same number of 0 values as the number of inner lists in tableData. That way, colWidths[0] can store the width of the longest string in tableData[0], colWidths[1] can store the width of the longest string in tableData[1], and so on. You can then find the largest value in the colWidths list to find out what integer width to pass to the rjust() string method.

Part II. Automating Tasks

Chapter 7. Pattern Matching with Regular Expressions

You may be familiar with searching for text by pressing CTRL-F and typing in the words you’re looking for. Regular expressions go one step further: They allow you to specify a pattern of text to search for. You may not know a business’s exact phone number, but if you live in the United States or Canada, you know it will be three digits, followed by a hyphen, and then four more digits (and optionally, a three-digit area code at the start). This is how you, as a human, know a phone number when you see it: 415-555-1234 is a phone number, but 4,155,551,234 is not.

Regular expressions are helpful, but not many non-programmers know about them even though most modern text editors and word processors, such as Microsoft Word or OpenOffice, have find and find-and-replace features that can search based on regular expressions. Regular expressions are huge time-savers, not just for software users but also for programmers. In fact, tech writer Cory Doctorow argues that even before teaching programming, we should be teaching regular expressions:

“Knowing [regular expressions] can mean the difference between solving a problem in 3 steps and solving it in 3,000 steps. When you’re a nerd, you forget that the problems you solve with a couple keystrokes can take other people days of tedious, error-prone work to slog through.”[1]

In this chapter, you’ll start by writing a program to find text patterns without using regular expressions and then see how to use regular expressions to make the code much less bloated. I’ll show you basic matching with regular expressions and then move on to some more powerful features, such as string substitution and creating your own character classes. Finally, at the end of the chapter, you’ll write a program that can automatically extract phone numbers and email addresses from a block of text.

Finding Patterns of Text Without Regular Expressions

Say you want to find a phone number in a string. You know the pattern: three numbers, a hyphen, three numbers, a hyphen, and four numbers. Here’s an example: 415-555-4242.

Let’s use a function named isPhoneNumber() to check whether a string matches this pattern, returning either True or False. Open a new file editor window and enter the following code; then save the file as isPhoneNumber.py:

def isPhoneNumber(text):
    if len(text) != 12:  # ➊
        return False
    for i in range(0, 3):
        if not text[i].isdecimal():  # ➋
            return False
    if text[3] != '-':  # ➌
        return False
    for i in range(4, 7):
        if not text[i].isdecimal():  # ➍
            return False
    if text[7] != '-':  # ➎
        return False
    for i in range(8, 12):
        if not text[i].isdecimal():  # ➏
            return False
    return True  # ➐

print('415-555-4242 is a phone number:')
print(isPhoneNumber('415-555-4242'))
print('Moshi moshi is a phone number:')
print(isPhoneNumber('Moshi moshi'))

When this program is run, the output looks like this:

415-555-4242 is a phone number:
True
Moshi moshi is a phone number:
False

The isPhoneNumber() function has code that does several checks to see whether the string in text is a valid phone number. If any of these checks fail, the function returns False. First the code checks that the string is exactly 12 characters ➊. Then it checks that the area code (that is, the first three characters in text) consists of only numeric characters ➋. The rest of the function checks that the string follows the pattern of a phone number: The number must have the first hyphen after the area code ➌, three more numeric characters ➍, then another hyphen ➎, and finally four more numbers ➏. If the program execution manages to get past all the checks, it returns True ➐.

Calling isPhoneNumber() with the argument '415-555-4242' will return True. Calling isPhoneNumber() with 'Moshi moshi' will return False; the first test fails because 'Moshi moshi' is not 12 characters long.

You would have to add even more code to find this pattern of text in a larger string. Replace the last four print() function calls in isPhoneNumber.py with the following:

message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'
for i in range(len(message)):
    chunk = message[i:i+12]  # ➊
    if isPhoneNumber(chunk):  # ➋
        print('Phone number found: ' + chunk)
print('Done')

When this program is run, the output will look like this:

Phone number found: 415-555-1011
Phone number found: 415-555-9999
Done

On each iteration of the for loop, a new chunk of 12 characters from message is assigned to the variable chunk ➊. For example, on the first iteration, i is 0, and chunk is assigned message[0:12] (that is, the string 'Call me at 4'). On the next iteration, i is 1, and chunk is assigned message[1:13] (the string 'all me at 41').

You pass chunk to isPhoneNumber() to see whether it matches the phone number pattern ➋, and if so, you print the chunk.

Continue to loop through message, and eventually the 12 characters in chunk will be a phone number. The loop goes through the entire string, testing each 12-character piece and printing any chunk it finds that satisfies isPhoneNumber(). Once we’re done going through message, we print Done.

While the string in message is short in this example, it could be far longer and the program would typically still finish quickly. A similar program that finds phone numbers using regular expressions would also run quickly, but regular expressions make it quicker to write these programs.
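If you want the numbers back as a list instead of printed, the same sliding 12-character window can be wrapped in a function. The sketch below is one possible arrangement rather than the book's program: findPhoneNumbers() is a made-up name, and isPhoneNumber() is rewritten here in a more compact form that performs the same checks.

```python
def isPhoneNumber(text):
    # Compact variant of the checks in isPhoneNumber.py:
    # 12 characters, hyphens at indexes 3 and 7, digits everywhere else.
    if len(text) != 12:
        return False
    for i in range(12):
        if i == 3 or i == 7:
            if text[i] != '-':
                return False
        elif not text[i].isdecimal():
            return False
    return True

def findPhoneNumbers(message):
    found = []
    # Stop 11 characters early: shorter chunks can never be phone numbers.
    for i in range(len(message) - 11):
        chunk = message[i:i+12]
        if isPhoneNumber(chunk):
            found.append(chunk)
    return found

message = 'Call me at 415-555-1011 tomorrow. 415-555-9999 is my office.'
numbers = findPhoneNumbers(message)
print(numbers)
```

Returning a list keeps the searching separate from the printing, so the caller can decide what to do with the results.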

Finding Patterns of Text with Regular Expressions

The previous phone number-finding program works, but it uses a lot of code to do something limited: The isPhoneNumber() function is 17 lines but can find only one pattern of phone numbers. What about a phone number formatted like 415.555.4242 or (415) 555-4242? What if the phone number had an extension, like 415-555-4242 x99? The isPhoneNumber() function would fail to validate them. You could add yet more code for these additional patterns, but there is an easier way.

Regular expressions, called regexes for short, are descriptions for a pattern of text. For example, a \d in a regex stands for a digit character, that is, any single numeral 0 to 9. The regex \d\d\d-\d\d\d-\d\d\d\d is used by Python to match the same text the previous isPhoneNumber() function did: a string of three numbers, a hyphen, three more numbers, another hyphen, and four numbers. Any other string would not match the \d\d\d-\d\d\d-\d\d\d\d regex.

But regular expressions can be much more sophisticated. For example, adding a 3 in curly brackets ({3}) after a pattern is like saying, “Match this pattern three times.” So the slightly shorter regex \d{3}-\d{3}-\d{4} also matches the correct phone number format.

Creating Regex Objects

All the regex functions in Python are in the re module. Enter the following into the interactive shell to import this module:

>>> import re

NOTE

Most of the examples that follow in this chapter will require the re module, so remember to import it at the beginning of any script you write or any time you restart IDLE. Otherwise, you’ll get a NameError: name 're' is not defined error message.

Passing a string value representing your regular expression to re.compile() returns a Regex pattern object (or simply, a Regex object).

To create a Regex object that matches the phone number pattern, enter the following into the interactive shell. (Remember that \d means “a digit character” and \d\d\d-\d\d\d-\d\d\d\d is the regular expression for the correct phone number pattern.)

>>> phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

Now the phoneNumRegex variable contains a Regex object.

PASSING RAW STRINGS TO RE.COMPILE()

Remember that escape characters in Python use the backslash (\). The string value '\n' represents a single newline character, not a backslash followed by a lowercase n. You need to enter the escape character \\ to print a single backslash. So '\\n' is the string that represents a backslash followed by a lowercase n. However, by putting an r before the first quote of the string value, you can mark the string as a raw string, which does not escape characters.

Since regular expressions frequently use backslashes in them, it is convenient to pass raw strings to the re.compile() function instead of typing extra backslashes. Typing r'\d\d\d-\d\d\d-\d\d\d\d' is much easier than typing '\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d'.
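You can confirm that the raw and escaped spellings produce the same string value with a few comparisons. This is only a small check; the variable names rawPattern and escapedPattern are made up for the example.

```python
# A raw string keeps each backslash as an ordinary character.
sameDigitPattern = r'\d' == '\\d'
print(sameDigitPattern)

# '\n' is one newline character; r'\n' is a backslash followed by an n.
print(len('\n'))
print(len(r'\n'))

rawPattern = r'\d\d\d-\d\d\d-\d\d\d\d'
escapedPattern = '\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d'
print(rawPattern == escapedPattern)
```

Both comparisons are True, which is why either spelling could be passed to re.compile(); the raw string is simply easier to read and type.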

Matching Regex Objects

A Regex object’s search() method searches the string it is passed for any matches to the regex. The search() method will return None if the regex pattern is not found in the string. If the pattern is found, the search() method returns a Match object. Match objects have a group() method that will return the actual matched text from the searched string. (I’ll explain groups shortly.) For example, enter the following into the interactive shell:

>>> phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
>>> mo = phoneNumRegex.search('My number is 415-555-4242.')
>>> print('Phone number found: ' + mo.group())
Phone number found: 415-555-4242

The mo variable name is just a generic name to use for Match objects. This example might seem complicated at first, but it is much shorter than the earlier isPhoneNumber.py program and does the same thing.

Here, we pass our desired pattern to re.compile() and store the resulting Regex object in phoneNumRegex. Then we call search() on phoneNumRegex and pass search() the string we want to search for a match. The result of the search gets stored in the variable mo. In this example, we know that our pattern will be found in the string, so we know that a Match object will be returned. Knowing that mo contains a Match object and not the null value None, we can call group() on mo to return the match. Writing mo.group() inside our print() call displays the whole match, 415-555-4242.
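When you cannot be sure the pattern is present, calling group() directly is risky, because None has no group() method and the call would raise an AttributeError. A minimal sketch of guarding against that, using a hypothetical helper named describeSearch():

```python
import re

phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

def describeSearch(text):
    mo = phoneNumRegex.search(text)
    # search() returns None when there is no match, so check before
    # calling group().
    if mo is None:
        return 'No phone number found.'
    return 'Phone number found: ' + mo.group()

print(describeSearch('My number is 415-555-4242.'))
print(describeSearch('I do not have a phone.'))
```

Checking for None first lets the program report a missing match instead of stopping with an error.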

Review of Regular Expression Matching

While there are several steps to using regular expressions in Python, each step is fairly simple.

1. Import the regex module with import re.
2. Create a Regex object with the re.compile() function. (Remember to use a raw string.)
3. Pass the string you want to search into the Regex object’s search() method. This returns a Match object (or None if there is no match).
4. Call the Match object’s group() method to return a string of the actual matched text.
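The four steps can be sketched as one small function. The name find_first_phone and the sample sentences are hypothetical; the None check is there because search() returns None when the pattern is not found, and calling group() on None would raise an error.

```python
import re  # Step 1: import the regex module.

def find_first_phone(text):
    """Return the first phone number in text, or None if there is none."""
    phone_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')  # Step 2: raw string pattern
    mo = phone_regex.search(text)                        # Step 3: search the string
    if mo is None:      # nothing matched, so there is no Match object to use
        return None
    return mo.group()   # Step 4: the actual matched text

print(find_first_phone('My number is 415-555-4242.'))
print(find_first_phone('No number here.'))
```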

NOTE

While I encourage you to enter the example code into the interactive shell, you should also make use of web-based regular expression testers, which can show you exactly how a regex matches a piece of text that you enter. I recommend the tester at http://regexpal.com/.

More Pattern Matching with Regular Expressions

Now that you know the basic steps for creating Regex objects and finding matches with them in Python, you’re ready to try some of their more powerful pattern-matching capabilities.

Grouping with Parentheses

Say you want to separate the area code from the rest of the phone number. Adding parentheses will create groups in the regex: (\d\d\d)-(\d\d\d-\d\d\d\d). Then you can use the group() match object method to grab the matching text from just one group.

The first set of parentheses in a regex string will be group 1. The second set will be group 2. By passing the integer 1 or 2 to the group() match object method, you can grab different parts of the matched text. Passing 0 or nothing to the group() method will return the entire matched text. Enter the following into the interactive shell:

>>> phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
>>> mo = phoneNumRegex.search('My number is 415-555-4242.')
>>> mo.group(1)
'415'
>>> mo.group(2)
'555-4242'
>>> mo.group(0)
'415-555-4242'
>>> mo.group()
'415-555-4242'

If you would like to retrieve all the groups at once, use the groups() method (note the plural form of the name).

>>> mo.groups()
('415', '555-4242')
>>> areaCode, mainNumber = mo.groups()
>>> print(areaCode)
415
>>> print(mainNumber)
555-4242

Since mo.groups() returns a tuple of multiple values, you can use the multiple-assignment trick to assign each value to a separate variable, as in the previous areaCode, mainNumber = mo.groups() line.

Parentheses have a special meaning in regular expressions, but what do you do if you need to match a parenthesis in your text? For instance, maybe the phone numbers you are trying to match have the area code set in parentheses. In this case, you need to escape the ( and ) characters with a backslash. Enter the following into the interactive shell:

>>> phoneNumRegex = re.compile(r'(\(\d\d\d\)) (\d\d\d-\d\d\d\d)')
>>> mo = phoneNumRegex.search('My phone number is (415) 555-4242.')
>>> mo.group(1)
'(415)'
>>> mo.group(2)
'555-4242'

The \( and \) escape characters in the raw string passed to re.compile() will match actual parenthesis characters.

Matching Multiple Groups with the Pipe

The | character is called a pipe. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'.

When both Batman and Tina Fey occur in the searched string, the first occurrence of matching text will be returned as the Match object. Enter the following into the interactive shell:

>>> heroRegex = re.compile(r'Batman|Tina Fey')
>>> mo1 = heroRegex.search('Batman and Tina Fey.')
>>> mo1.group()
'Batman'
>>> mo2 = heroRegex.search('Tina Fey and Batman.')
>>> mo2.group()
'Tina Fey'

NOTE

You can find all matching occurrences with the findall() method that’s discussed in The findall() Method.

You can also use the pipe to match one of several patterns as part of your regex. For example, say you wanted to match any of the strings 'Batman', 'Batmobile', 'Batcopter', and 'Batbat'. Since all these strings start with Bat, it would be nice if you could specify that prefix only once. This can be done with parentheses. Enter the following into the interactive shell:

>>> batRegex = re.compile(r'Bat(man|mobile|copter|bat)')
>>> mo = batRegex.search('Batmobile lost a wheel')
>>> mo.group()
'Batmobile'
>>> mo.group(1)
'mobile'

The method call mo.group() returns the full matched text 'Batmobile', while mo.group(1) returns just the part of the matched text inside the first parentheses group, 'mobile'. By using the pipe character and grouping parentheses, you can specify several alternative patterns you would like your regex to match.

If you need to match an actual pipe character, escape it with a backslash, like \|.

Optional Matching with the Question Mark

Sometimes there is a pattern that you want to match only optionally. That is, the regex should find a match whether or not that bit of text is there. The ? character flags the group that precedes it as an optional part of the pattern. For example, enter the following into the interactive shell:

>>> batRegex = re.compile(r'Bat(wo)?man')
>>> mo1 = batRegex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo2 = batRegex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'

The (wo)? part of the regular expression means that the pattern wo is an optional group. The regex will match text that has zero instances or one instance of wo in it. This is why the regex matches both 'Batwoman' and 'Batman'.

Using the earlier phone number example, you can make the regex look for phone numbers that do or do not have an area code. Enter the following into the interactive shell:

>>> phoneRegex = re.compile(r'(\d\d\d-)?\d\d\d-\d\d\d\d')
>>> mo1 = phoneRegex.search('My number is 415-555-4242')
>>> mo1.group()
'415-555-4242'
>>> mo2 = phoneRegex.search('My number is 555-4242')
>>> mo2.group()
'555-4242'

You can think of the ? as saying, “Match zero or one of the group preceding this question mark.”

If you need to match an actual question mark character, escape it with \?.
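One detail worth checking for yourself is what group(1) holds when the optional group is skipped. This short sketch reuses the phone regex from above with snake_case variable names of my own choosing; when the optional group takes no part in the match, group(1) is None rather than an empty string.

```python
import re

phone_regex = re.compile(r'(\d\d\d-)?\d\d\d-\d\d\d\d')

with_area = phone_regex.search('My number is 415-555-4242')
without_area = phone_regex.search('My number is 555-4242')

print(with_area.group(1))     # the optional group matched the area code
print(without_area.group(1))  # the optional group was skipped entirely
```

So if your program uses the text of an optional group, test it against None before doing string operations on it.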

Matching Zero or More with the Star

The * (called the star or asterisk) means “match zero or more”: the group that precedes the star can occur any number of times in the text. It can be completely absent or repeated over and over again. Let’s look at the Batman example again.

>>> batRegex = re.compile(r'Bat(wo)*man')
>>> mo1 = batRegex.search('The Adventures of Batman')
>>> mo1.group()
'Batman'
>>> mo2 = batRegex.search('The Adventures of Batwoman')
>>> mo2.group()
'Batwoman'
>>> mo3 = batRegex.search('The Adventures of Batwowowowoman')
>>> mo3.group()
'Batwowowowoman'

For 'Batman', the (wo)* part of the regex matches zero instances of wo in the string; for 'Batwoman', the (wo)* matches one instance of wo; and for 'Batwowowowoman', (wo)* matches four instances of wo.

If you need to match an actual star character, prefix the star in the regular expression with a backslash, \*.

Matching One or More with the Plus

While * means “match zero or more,” the + (or plus) means “match one or more.” Unlike the star, which does not require its group to appear in the matched string, the group preceding a plus must appear at least once. It is not optional. Enter the following into the interactive shell, and compare it with the star regexes in the previous section:

>>> batRegex = re.compile(r'Bat(wo)+man')
>>> mo1 = batRegex.search('The Adventures of Batwoman')
>>> mo1.group()
'Batwoman'
>>> mo2 = batRegex.search('The Adventures of Batwowowowoman')
>>> mo2.group()
'Batwowowowoman'
>>> mo3 = batRegex.search('The Adventures of Batman')
>>> mo3 == None
True

The regex Bat(wo)+man will not match the string 'The Adventures of Batman' because at least one wo is required by the plus sign.

If you need to match an actual plus sign character, prefix the plus sign with a backslash to escape it: \+.

Matching Specific Repetitions with Curly Brackets

If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in curly brackets. For example, the regex (Ha){3} will match the string 'HaHaHa', but it will not match 'HaHa', since the latter has only two repeats of the (Ha) group.

Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the curly brackets. For example, the regex (Ha){3,5} will match 'HaHaHa', 'HaHaHaHa', and 'HaHaHaHaHa'.

You can also leave out the first or second number in the curly brackets to leave the minimum or maximum unbounded. For example, (Ha){3,} will match three or more instances of the (Ha) group, while (Ha){,5} will match zero to five instances. Curly brackets can help make your regular expressions shorter. These two regular expressions match identical patterns:

(Ha){3}
(Ha)(Ha)(Ha)

And these two regular expressions also match identical patterns:

(Ha){3,5}
((Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha))|((Ha)(Ha)(Ha)(Ha)(Ha))

Enter the following into the interactive shell:

>>> haRegex = re.compile(r'(Ha){3}')
>>> mo1 = haRegex.search('HaHaHa')
>>> mo1.group()
'HaHaHa'
>>> mo2 = haRegex.search('Ha')
>>> mo2 == None
True

Here, (Ha){3} matches 'HaHaHa' but not 'Ha'. Since it doesn’t match 'Ha', search() returns None.
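To see which repeat counts each curly-bracket form accepts, you can try every count in a loop. This is only a sketch: the helper repeats_accepted() is hypothetical, and it uses the Regex object’s fullmatch() method (not covered in this chapter, available since Python 3.4), which succeeds only when the whole string matches.

```python
import re

exactly_three = re.compile(r'(Ha){3}')
three_to_five = re.compile(r'(Ha){3,5}')
up_to_five = re.compile(r'(Ha){,5}')

def repeats_accepted(regex, max_repeats=7):
    """Return the repeat counts of 'Ha' that the regex matches in full."""
    return [n for n in range(max_repeats + 1)
            if regex.fullmatch('Ha' * n) is not None]

print(repeats_accepted(exactly_three))
print(repeats_accepted(three_to_five))
print(repeats_accepted(up_to_five))
```

The sketch uses fullmatch() instead of search() because search() would happily find 'HaHaHa' inside a longer run of Ha repeats, which would hide the upper bound.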

Greedy and Nongreedy Matching

Since (Ha){3,5} can match three, four, or five instances of Ha in the string 'HaHaHaHaHa', you may wonder why the Match object’s call to group() in the previous curly bracket example returns 'HaHaHaHaHa' instead of the shorter possibilities. After all, 'HaHaHa' and 'HaHaHaHa' are also valid matches of the regular expression (Ha){3,5}.

Python’s regular expressions are greedy by default, which means that in ambiguous situations they will match the longest string possible. The nongreedy version of the curly brackets, which matches the shortest string possible, has the closing curly bracket followed by a question mark.

Enter the following into the interactive shell, and notice the difference between the greedy and nongreedy forms of the curly brackets searching the same string:

>>> greedyHaRegex = re.compile(r'(Ha){3,5}')
>>> mo1 = greedyHaRegex.search('HaHaHaHaHa')
>>> mo1.group()
'HaHaHaHaHa'
>>> nongreedyHaRegex = re.compile(r'(Ha){3,5}?')
>>> mo2 = nongreedyHaRegex.search('HaHaHaHaHa')
>>> mo2.group()
'HaHaHa'

Note that the question mark can have two meanings in regular expressions: declaring a nongreedy match or flagging an optional group. These meanings are entirely unrelated.

The findall() Method

In addition to the search() method, Regex objects also have a findall() method. While search() will return a Match object of the first matched text in the searched string, the findall() method will return the strings of every match in the searched string. To see how search() returns a Match object only on the first instance of matching text, enter the following into the interactive shell:

>>> phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
>>> mo = phoneNumRegex.search('Cell: 415-555-9999 Work: 212-555-0000')
>>> mo.group()
'415-555-9999'

On the other hand, findall() will not return a Match object but a list of strings, as long as there are no groups in the regular expression. Each string in the list is a piece of the searched text that matched the regular expression. Enter the following into the interactive shell:

>>> phoneNumRegex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d') # has no groups
>>> phoneNumRegex.findall('Cell: 415-555-9999 Work: 212-555-0000')
['415-555-9999', '212-555-0000']

If there are groups in the regular expression, then findall() will return a list of tuples. Each tuple represents a found match, and its items are the matched strings for each group in the regex. To see findall() in action, enter the following into the interactive shell (notice that the regular expression being compiled now has groups in parentheses):

>>> phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)') # has groups
>>> phoneNumRegex.findall('Cell: 415-555-9999 Work: 212-555-0000')
[('415', '555', '9999'), ('212', '555', '0000')]

To summarize what the findall() method returns, remember the following:

1. When called on a regex with no groups, such as \d\d\d-\d\d\d-\d\d\d\d, the method findall() returns a list of string matches, such as ['415-555-9999', '212-555-0000'].

2. When called on a regex that has groups, such as (\d\d\d)-(\d\d\d)-(\d\d\d\d), the method findall() returns a list of tuples of strings (one string for each group), such as [('415', '555', '9999'), ('212', '555', '0000')].
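The two cases in the summary can be checked side by side with a sketch like this one. The variable names are illustrative. It also shows one case the summary does not spell out: with exactly one group, findall() returns a list of strings holding just that group’s text.

```python
import re

text = 'Cell: 415-555-9999 Work: 212-555-0000'

no_groups = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
with_groups = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')
one_group = re.compile(r'(\d\d\d)-\d\d\d-\d\d\d\d')

strings = no_groups.findall(text)      # list of whole-match strings
tuples = with_groups.findall(text)     # list of tuples, one string per group
area_codes = one_group.findall(text)   # list of strings for the single group

# Joining each tuple's parts with hyphens rebuilds the whole phone number.
rejoined = ['-'.join(parts) for parts in tuples]
print(strings, tuples, area_codes, rejoined)
```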

Character Classes

In the earlier phone number regex example, you learned that \d could stand for any numeric digit. That is, \d is shorthand for the regular expression (0|1|2|3|4|5|6|7|8|9). There are many such shorthand character classes, as shown in Table 7-1.

Table 7-1. Shorthand Codes for Common Character Classes

Shorthand character class    Represents
\d    Any numeric digit from 0 to 9.
\D    Any character that is not a numeric digit from 0 to 9.
\w    Any letter, numeric digit, or the underscore character. (Think of this as matching “word” characters.)
\W    Any character that is not a letter, numeric digit, or the underscore character.
\s    Any space, tab, or newline character. (Think of this as matching “space” characters.)
\S    Any character that is not a space, tab, or newline.

Character classes are nice for shortening regular expressions. The character class [0-5] will match only the numbers 0 to 5; this is much shorter than typing (0|1|2|3|4|5).

For example, enter the following into the interactive shell:

>>> xmasRegex = re.compile(r'\d+\s\w+')
>>> xmasRegex.findall('12 drummers, 11 pipers, 10 lords, 9 ladies, 8 maids, 7 swans, 6 geese, 5 rings, 4 birds, 3 hens, 2 doves, 1 partridge')
['12 drummers', '11 pipers', '10 lords', '9 ladies', '8 maids', '7 swans', '6 geese', '5 rings', '4 birds', '3 hens', '2 doves', '1 partridge']

The regular expression \d+\s\w+ will match text that has one or more numeric digits (\d+), followed by a whitespace character (\s), followed by one or more letter/digit/underscore characters (\w+). The findall() method returns all matching strings of the regex pattern in a list.

Making Your Own Character Classes

There are times when you want to match a set of characters but the shorthand character classes (\d, \w, \s, and so on) are too broad. You can define your own character class using square brackets. For example, the character class [aeiouAEIOU] will match any vowel, both lowercase and uppercase. Enter the following into the interactive shell:

>>> vowelRegex = re.compile(r'[aeiouAEIOU]')
>>> vowelRegex.findall('Robocop eats baby food. BABY FOOD.')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']

You can also include ranges of letters or numbers by using a hyphen. For example, the character class [a-zA-Z0-9] will match all lowercase letters, uppercase letters, and numbers.

Note that inside the square brackets, the normal regular expression symbols are not interpreted as such. This means you do not need to escape the ., *, ?, or () characters with a preceding backslash. For example, the character class [0-5.] will match digits 0 to 5 and a period. You do not need to write it as [0-5\.].

By placing a caret character (^) just after the character class’s opening bracket, you can make a negative character class. A negative character class will match all the characters that are not in the character class. For example, enter the following into the interactive shell:

>>> consonantRegex = re.compile(r'[^aeiouAEIOU]')
>>> consonantRegex.findall('Robocop eats baby food. BABY FOOD.')
['R', 'b', 'c', 'p', ' ', 't', 's', ' ', 'b', 'b', 'y', ' ', 'f', 'd', '.', ' ', 'B', 'B', 'Y', ' ', 'F', 'D', '.']

Now, instead of matching every vowel, we’re matching every character that isn’t a vowel, including spaces and periods.

The Caret and Dollar Sign Characters

You can also use the caret symbol (^) at the start of a regex to indicate that a match must occur at the beginning of the searched text. Likewise, you can put a dollar sign ($) at the end of the regex to indicate the string must end with this regex pattern. And you can use the ^ and $ together to indicate that the entire string must match the regex; that is, it’s not enough for a match to be made on some subset of the string.

For example, the r'^Hello' regular expression string matches strings that begin with 'Hello'. Enter the following into the interactive shell:

>>> beginsWithHello = re.compile(r'^Hello')
>>> beginsWithHello.search('Hello world!')
<_sre.SRE_Match object; span=(0, 5), match='Hello'>
>>> beginsWithHello.search('He said hello.') == None
True

The r'\d$' regular expression string matches strings that end with a numeric character from 0 to 9. Enter the following into the interactive shell:

>>> endsWithNumber = re.compile(r'\d$')
>>> endsWithNumber.search('Your number is 42')
<_sre.SRE_Match object; span=(16, 17), match='2'>
>>> endsWithNumber.search('Your number is forty two.') == None
True

The r'^\d+$' regular expression string matches strings that both begin and end with one or more numeric characters. Enter the following into the interactive shell:

>>> wholeStringIsNum = re.compile(r'^\d+$')
>>> wholeStringIsNum.search('1234567890')
<_sre.SRE_Match object; span=(0, 10), match='1234567890'>
>>> wholeStringIsNum.search('12345xyz67890') == None
True
>>> wholeStringIsNum.search('12 34567890') == None
True

The last two search() calls in the previous interactive shell example demonstrate how the entire string must match the regex if ^ and $ are used.

I always confuse the meanings of these two symbols, so I use the mnemonic “Carrots cost dollars” to remind myself that the caret comes first and the dollar sign comes last.
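A common use for ^ and $ together is validating input. Here is a minimal sketch, assuming you want to accept only strings made up entirely of digits; the function name is_all_digits is hypothetical.

```python
import re

whole_string_is_num = re.compile(r'^\d+$')

def is_all_digits(text):
    """Return True only if every character in text is a numeric digit."""
    return whole_string_is_num.search(text) is not None

for candidate in ['1234567890', '12345xyz67890', '12 34567890', '']:
    print(repr(candidate), is_all_digits(candidate))
```

The empty string is rejected because \d+ needs at least one digit between the caret and the dollar sign.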

The Wildcard Character

The . (or dot) character in a regular expression is called a wildcard and will match any character except for a newline. For example, enter the following into the interactive shell:

>>> atRegex = re.compile(r'.at')
>>> atRegex.findall('The cat in the hat sat on the flat mat.')
['cat', 'hat', 'sat', 'lat', 'mat']

Remember that the dot character will match just one character, which is why the match for the text flat in the previous example matched only lat. To match an actual dot, escape the dot with a backslash: \..

Matching Everything with Dot-Star

Sometimes you will want to match everything and anything. For example, say you want to match the string 'First Name:', followed by any and all text, followed by 'Last Name:', and then followed by anything again. You can use the dot-star (.*) to stand in for that “anything.” Remember that the dot character means “any single character except the newline,” and the star character means “zero or more of the preceding character.”

Enter the following into the interactive shell:

>>> nameRegex = re.compile(r'First Name: (.*) Last Name: (.*)')
>>> mo = nameRegex.search('First Name: Al Last Name: Sweigart')
>>> mo.group(1)
'Al'
>>> mo.group(2)
'Sweigart'

The dot-star uses greedy mode: It will always try to match as much text as possible. To match any and all text in a nongreedy fashion, use the dot, star, and question mark (.*?). As with curly brackets, the question mark tells Python to match in a nongreedy way.

Enter the following into the interactive shell to see the difference between the greedy and nongreedy versions:

>>> nongreedyRegex = re.compile(r'<.*?>')
>>> mo = nongreedyRegex.search('<To serve man> for dinner.>')
>>> mo.group()
'<To serve man>'
>>> greedyRegex = re.compile(r'<.*>')
>>> mo = greedyRegex.search('<To serve man> for dinner.>')
>>> mo.group()
'<To serve man> for dinner.>'

Both regexes roughly translate to “Match an opening angle bracket, followed by anything, followed by a closing angle bracket.” But the string '<To serve man> for dinner.>' has two possible matches for the closing angle bracket. In the nongreedy version of the regex, Python matches the shortest possible string: '<To serve man>'. In the greedy version, Python matches the longest possible string: '<To serve man> for dinner.>'.

Matching Newlines with the Dot Character

The dot-star will match everything except a newline. By passing re.DOTALL as the second argument to re.compile(), you can make the dot character match all characters, including the newline character.

Enter the following into the interactive shell:

>>> noNewlineRegex = re.compile('.*')
>>> noNewlineRegex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()
'Serve the public trust.'
>>> newlineRegex = re.compile('.*', re.DOTALL)
>>> newlineRegex.search('Serve the public trust.\nProtect the innocent.\nUphold the law.').group()
'Serve the public trust.\nProtect the innocent.\nUphold the law.'

The regex noNewlineRegex, which did not have re.DOTALL passed to the re.compile() call that created it, will match everything only up to the first newline character, whereas newlineRegex, which did have re.DOTALL passed to re.compile(), matches everything. This is why the newlineRegex.search() call matches the full string, including its newline characters.

Review of Regex Symbols

This chapter covered a lot of notation, so here’s a quick review of what you learned:

The ? matches zero or one of the preceding group.
The * matches zero or more of the preceding group.
The + matches one or more of the preceding group.
The {n} matches exactly n of the preceding group.
The {n,} matches n or more of the preceding group.
The {,m} matches 0 to m of the preceding group.
The {n,m} matches at least n and at most m of the preceding group.
{n,m}? or *? or +? performs a nongreedy match of the preceding group.
^spam means the string must begin with spam.
spam$ means the string must end with spam.
The . matches any character, except newline characters.
\d, \w, and \s match a digit, word, or space character, respectively.
\D, \W, and \S match anything except a digit, word, or space character, respectively.
[abc] matches any character between the brackets (such as a, b, or c).
[^abc] matches any character that isn’t between the brackets.

Case-Insensitive Matching

Normally, regular expressions match text with the exact casing you specify. For example, the following regexes match completely different strings:

>>> regex1 = re.compile('Robocop')
>>> regex2 = re.compile('ROBOCOP')
>>> regex3 = re.compile('robOcop')
>>> regex4 = re.compile('RobocOp')

But sometimes you care only about matching the letters without worrying whether they’re uppercase or lowercase. To make your regex case-insensitive, you can pass re.IGNORECASE or re.I as a second argument to re.compile(). Enter the following into the interactive shell:

>>> robocop = re.compile(r'robocop', re.I)
>>> robocop.search('Robocop is part man, part machine, all cop.').group()
'Robocop'
>>> robocop.search('ROBOCOP protects the innocent.').group()
'ROBOCOP'
>>> robocop.search('Al, why does your programming book talk about robocop so much?').group()
'robocop'

Substituting Strings with the sub() Method

Regular expressions can not only find text patterns but can also substitute new text in place of those patterns. The sub() method for Regex objects is passed two arguments. The first argument is a string to replace any matches. The second is the string to search, in which the substitutions are made. The sub() method returns a string with the substitutions applied.

For example, enter the following into the interactive shell:

>>> namesRegex = re.compile(r'Agent \w+')
>>> namesRegex.sub('CENSORED', 'Agent Alice gave the secret documents to Agent Bob.')
'CENSORED gave the secret documents to CENSORED.'

Sometimes you may need to use the matched text itself as part of the substitution. In the first argument to sub(), you can type \1, \2, \3, and so on, to mean “Enter the text of group 1, 2, 3, and so on, in the substitution.”

For example, say you want to censor the names of the secret agents by showing just the first letters of their names. To do this, you could use the regex Agent (\w)\w* and pass r'\1****' as the first argument to sub(). The \1 in that string will be replaced by whatever text was matched by group 1, that is, the (\w) group of the regular expression.

>>> agentNamesRegex = re.compile(r'Agent (\w)\w*')
>>> agentNamesRegex.sub(r'\1****', 'Agent Alice told Agent Carol that Agent Eve knew Agent Bob was a double agent.')
'A**** told C**** that E**** knew B**** was a double agent.'
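It is easy to forget the r prefix on the replacement string, so here is a small sketch of why it matters. The variable names are my own. In a regular (non-raw) Python string, '\1' is an escape sequence for the character with code 1, not a backslash followed by the digit 1, so sub() never sees a group reference.

```python
import re

agent_regex = re.compile(r'Agent (\w)\w*')
message = ('Agent Alice told Agent Carol that Agent Eve knew '
           'Agent Bob was a double agent.')

# With a raw string, \1 reaches sub() intact and means "text of group 1".
censored = agent_regex.sub(r'\1****', message)
print(censored)

# Without the r prefix, Python turns '\1' into a single control character.
plain = '\1****'
print(plain.startswith('\x01'))
```

Note that the lowercase 'agent' at the end of the sentence is left alone, since the regex is case-sensitive and requires a space and a word character after 'Agent'.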

Managing Complex Regexes

Regular expressions are fine if the text pattern you need to match is simple. But matching complicated text patterns might require long, convoluted regular expressions. You can mitigate this by telling the re.compile() function to ignore whitespace and comments inside the regular expression string. This “verbose mode” can be enabled by passing the variable re.VERBOSE as the second argument to re.compile().

Now instead of a hard-to-read regular expression like this:

phoneRegex = re.compile(r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext.)\s*\d{2,5})?)')

you can spread the regular expression over multiple lines with comments like this:

phoneRegex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?            # area code
    (\s|-|\.)?                    # separator
    \d{3}                         # first 3 digits
    (\s|-|\.)                     # separator
    \d{4}                         # last 4 digits
    (\s*(ext|x|ext.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)

Note how the previous example uses the triple-quote syntax (''') to create a multiline string so that you can spread the regular expression definition over many lines, making it much more legible.

The comment rules inside the regular expression string are the same as regular Python code: The # symbol and everything after it to the end of the line are ignored. Also, the extra spaces inside the multiline string for the regular expression are not considered part of the text pattern to be matched. This lets you organize the regular expression so it’s easier to read.

Combining re.IGNORECASE, re.DOTALL, and re.VERBOSE

What if you want to use re.VERBOSE to write comments in your regular expression but also want to use re.IGNORECASE to ignore capitalization? Unfortunately, the re.compile() function takes only a single value as its second argument. You can get around this limitation by combining the re.IGNORECASE, re.DOTALL, and re.VERBOSE variables using the pipe character (|), which in this context is known as the bitwise or operator.

So if you want a regular expression that’s case-insensitive and includes newlines to match the dot character, you would form your re.compile() call like this:

>>> someRegexValue = re.compile('foo', re.IGNORECASE | re.DOTALL)

All three options for the second argument will look like this:

>>> someRegexValue = re.compile('foo', re.IGNORECASE | re.DOTALL | re.VERBOSE)

This syntax is a little old-fashioned and originates from early versions of Python. The details of the bitwise operators are beyond the scope of this book, but check out the resources at http://nostarch.com/automatestuff/ for more information. You can also pass other options for the second argument; they’re uncommon, but you can read more about them in the resources, too.
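Since 'foo' does not really exercise any of the three options, here is a sketch with a made-up pattern and made-up text where each flag does visible work. Dropping re.DOTALL from the combination makes the same search fail, because the dot can no longer cross the newlines.

```python
import re

pattern = r'''
    first    # a literal word, matched in any casing because of re.IGNORECASE
    .*       # anything at all, newlines included because of re.DOTALL
    last     # another literal word
'''
text = 'FIRST line\nsecond line\nLAST line'

all_three = re.compile(pattern, re.IGNORECASE | re.DOTALL | re.VERBOSE)
no_dotall = re.compile(pattern, re.IGNORECASE | re.VERBOSE)

print(repr(all_three.search(text).group()))
print(no_dotall.search(text))
```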

Project: Phone Number and Email Address Extractor

Say you have the boring task of finding every phone number and email address in a long web page or document. If you manually scroll through the page, you might end up searching for a long time. But if you had a program that could search the text in your clipboard for phone numbers and email addresses, you could simply press CTRL-A to select all the text, press CTRL-C to copy it to the clipboard, and then run your program. It could replace the text on the clipboard with just the phone numbers and email addresses it finds.

Whenever you’re tackling a new project, it can be tempting to dive right into writing code. But more often than not, it’s best to take a step back and consider the bigger picture. I recommend first drawing up a high-level plan for what your program needs to do. Don’t think about the actual code yet; you can worry about that later. Right now, stick to broad strokes.

For example, your phone and email address extractor will need to do the following:

Get the text off the clipboard.
Find all phone numbers and email addresses in the text.
Paste them onto the clipboard.

Now you can start thinking about how this might work in code. The code will need to do the following:

Use the pyperclip module to copy and paste strings.
Create two regexes, one for matching phone numbers and the other for matching email addresses.
Find all matches, not just the first match, of both regexes.
Neatly format the matched strings into a single string to paste.
Display some kind of message if no matches were found in the text.

This list is like a road map for the project. As you write the code, you can focus on each of these steps separately. Each step is fairly manageable and expressed in terms of things you already know how to do in Python.

Step 1: Create a Regex for Phone Numbers

First, you have to create a regular expression to search for phone numbers. Create a new file, enter the following, and save it as phoneAndEmail.py:

#! python3
# phoneAndEmail.py - Finds phone numbers and email addresses on the clipboard.

import pyperclip, re

phoneRegex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?              # area code
    (\s|-|\.)?                      # separator
    (\d{3})                         # first 3 digits
    (\s|-|\.)                       # separator
    (\d{4})                         # last 4 digits
    (\s*(ext|x|ext.)\s*(\d{2,5}))?  # extension
    )''', re.VERBOSE)

# TODO: Create email regex.

# TODO: Find matches in clipboard text.

# TODO: Copy results to the clipboard.

The TODO comments are just a skeleton for the program. They’ll be replaced as you write the actual code.

The phone number begins with an optional area code, so the area code group is followed with a question mark. Since the area code can be just three digits (that is, \d{3}) or three digits within parentheses (that is, \(\d{3}\)), you should have a pipe joining those parts. You can add the regex comment # Area code to this part of the multiline string to help you remember what (\d{3}|\(\d{3}\))? is supposed to match.

The phone number separator character can be a space (\s), hyphen (-), or period (.), so these parts should also be joined by pipes. The next few parts of the regular expression are straightforward: three digits, followed by another separator, followed by four digits. The last part is an optional extension made up of any number of spaces followed by ext, x, or ext., followed by two to five digits.
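Before moving on, it can help to try the phone regex on a few formats without involving the clipboard. This is a sketch only: the sample numbers are invented, and the snake_case names are mine rather than the program’s.

```python
import re

phone_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?              # area code
    (\s|-|\.)?                      # separator
    (\d{3})                         # first 3 digits
    (\s|-|\.)                       # separator
    (\d{4})                         # last 4 digits
    (\s*(ext|x|ext.)\s*(\d{2,5}))?  # extension
    )''', re.VERBOSE)

samples = ['415-555-4242', '(415) 555-4242', '415.555.4242 ext 123',
           '555-4242', '5554242']

# Map each sample to True if the regex finds a phone number in it.
results = {sample: phone_regex.search(sample) is not None for sample in samples}
for sample, matched in results.items():
    print(sample, matched)
```

The last sample fails because the separator between the first three and last four digits is not optional in this regex.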

Step 2: Create a Regex for Email Addresses

You will also need a regular expression that can match email addresses. Make your program look like the following:

#! python3
# phoneAndEmail.py - Finds phone numbers and email addresses on the clipboard.

import pyperclip, re

phoneRegex = re.compile(r'''(
--snip--

# Create email regex.
emailRegex = re.compile(r'''(
  ➊ [a-zA-Z0-9._%+-]+      # username
  ➋ @                      # @ symbol
  ➌ [a-zA-Z0-9.-]+         # domain name
    (\.[a-zA-Z]{2,4})      # dot-something
    )''', re.VERBOSE)

# TODO: Find matches in clipboard text.

# TODO: Copy results to the clipboard.

The username part of the email address ➊ is one or more characters that can be any of the following: lowercase and uppercase letters, numbers, a dot, an underscore, a percent sign, a plus sign, or a hyphen. You can put all of these into a character class: [a-zA-Z0-9._%+-].

The domain and username are separated by an @ symbol ➋. The domain name ➌ has a slightly less permissive character class with only letters, numbers, periods, and hyphens: [a-zA-Z0-9.-]. And last will be the “dot-com” part (technically known as the top-level domain), which can really be dot-anything. This is between two and four characters.

The format for email addresses has a lot of weird rules. This regular expression won’t match every possible valid email address, but it’ll match almost any typical email address you’ll encounter.
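You can try the email regex on its own in the same way. The addresses below are invented examples (example.com and example.org are reserved for documentation), and the names are illustrative. The last lines show one of the limits mentioned above: a top-level domain longer than four letters is cut short.

```python
import re

email_regex = re.compile(r'''(
    [a-zA-Z0-9._%+-]+      # username
    @                      # @ symbol
    [a-zA-Z0-9.-]+         # domain name
    (\.[a-zA-Z]{2,4})      # dot-something
    )''', re.VERBOSE)

text = ('Write to alice@example.com or bob.smith+news@mail.example.org, '
        'not to carol@localhost.')

# findall() returns tuples because the regex has groups; index 0 is the
# outer group, which holds the whole address.
found = [groups[0] for groups in email_regex.findall(text)]
print(found)

# A six-letter top-level domain only matches up to its fourth letter.
truncated = email_regex.search('user@example.museum').group()
print(truncated)
```

If you need longer top-level domains, one option is to raise the upper bound in {2,4}, at the cost of matching more loosely.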

Step 3: Find All Matches in the Clipboard Text

Now that you have specified the regular expressions for phone numbers and email addresses, you can let Python’s re module do the hard work of finding all the matches on the clipboard. The pyperclip.paste() function will get a string value of the text on the clipboard, and the findall() regex method will return a list of tuples.

Make your program look like the following:

#! python3
# phoneAndEmail.py - Finds phone numbers and email addresses on the clipboard.

import pyperclip, re

phoneRegex = re.compile(r'''(
--snip--

# Find matches in clipboard text.
text = str(pyperclip.paste())
➊ matches = []
➋ for groups in phoneRegex.findall(text):
      phoneNum = '-'.join([groups[1], groups[3], groups[5]])
      if groups[8] != '':
          phoneNum += ' x' + groups[8]
      matches.append(phoneNum)
➌ for groups in emailRegex.findall(text):
      matches.append(groups[0])

# TODO: Copy results to the clipboard.

There is one tuple for each match, and each tuple contains strings for each group in the regular expression. Because both regexes wrap the entire pattern in an outer set of parentheses, the string at index 0 of each tuple is the whole match.

As you can see at ➊, you’ll store the matches in a list variable named matches. It starts off as an empty list, and a couple of for loops append to it. For the email addresses, you append the item at index 0 of each match ➌. For the matched phone numbers, you don’t want to just append the item at index 0. While the program detects phone numbers in several formats, you want the phone number appended to be in a single, standard format. The phoneNum variable contains a string built from the items at indexes 1, 3, 5, and 8 of the matched text ➋. (These groups are the area code, first three digits, last four digits, and extension.)
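If the index numbers feel arbitrary, the following sketch labels each group of the phone regex with its position in the tuple that findall() returns, then applies the same joining logic to a made-up sentence. The function name standardize is hypothetical, and the clipboard is left out so the sketch runs anywhere.

```python
import re

phone_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?              # index 1: area code
    (\s|-|\.)?                      # index 2: separator
    (\d{3})                         # index 3: first 3 digits
    (\s|-|\.)                       # index 4: separator
    (\d{4})                         # index 5: last 4 digits
    (\s*(ext|x|ext.)\s*(\d{2,5}))?  # indexes 6-8: extension, marker, digits
    )''', re.VERBOSE)                 # index 0: the outer group, the whole match

def standardize(text):
    """Return every phone number in text in area-first3-last4 form."""
    numbers = []
    for groups in phone_regex.findall(text):
        phone_num = '-'.join([groups[1], groups[3], groups[5]])
        if groups[8] != '':   # groups that did not match come back as ''
            phone_num += ' x' + groups[8]
        numbers.append(phone_num)
    return numbers

print(standardize('Call 415.555.4242 ext 123 or 212 555 0000.'))
```

One quirk to be aware of: an area code written in parentheses keeps its parentheses at index 1, and a number with no area code leaves an empty string there, so the joined result may need further cleanup depending on your data.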

Step 4: Join the Matches into a String for the Clipboard

Now that you have the email addresses and phone numbers as a list of strings in matches, you want to put them on the clipboard. The pyperclip.copy() function takes only a single string value, not a list of strings, so you call the join() method on matches.

To make it easier to see that the program is working, let’s print any matches you find to the terminal. And if no phone numbers or email addresses were found, the program should tell the user this.

Make your program look like the following:

#! python3
# phoneAndEmail.py - Finds phone numbers and email addresses on the clipboard.

--snip--
for groups in emailRegex.findall(text):
    matches.append(groups[0])

# Copy results to the clipboard.
if len(matches) > 0:
    pyperclip.copy('\n'.join(matches))
    print('Copied to clipboard:')
    print('\n'.join(matches))
else:
    print('No phone numbers or email addresses found.')

Running the Program

For an example, open your web browser to the No Starch Press contact page at http://www.nostarch.com/contactus.htm, press CTRL-A to select all the text on the page, and press CTRL-C to copy it to the clipboard. When you run this program, the output will look something like this:

Copiedtoclipboard:

800-420-7240

415-863-9900

415-863-9950

[email protected]

[email protected]

[email protected]

[email protected]

Ideas for Similar Programs

Identifying patterns of text (and possibly substituting them with the sub() method) has many different potential applications.

Find website URLs that begin with http:// or https://.
Clean up dates in different date formats (such as 3/14/2015, 03-14-2015, and 2015/3/14) by replacing them with dates in a single, standard format.
Remove sensitive information such as Social Security or credit card numbers.
Find common typos such as multiple spaces between words, accidentally accidentally repeated words, or multiple exclamation marks at the end of sentences. Those are annoying!!
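As a starting point for the last idea, here is a minimal sketch that uses only syntax from this chapter. The function name tidy is hypothetical, and it handles just two of the typos listed: runs of spaces and runs of exclamation marks.

```python
import re

multiple_spaces = re.compile(r' {2,}')   # two or more spaces in a row
multiple_bangs = re.compile(r'!{2,}')    # two or more exclamation marks in a row

def tidy(text):
    """Collapse repeated spaces and repeated exclamation marks."""
    text = multiple_spaces.sub(' ', text)
    text = multiple_bangs.sub('!', text)
    return text

print(tidy('Those  are   annoying!!!'))
```

Catching accidentally repeated words takes a little more regex syntax than this chapter covers, so it is left as an exercise.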

Summary

While a computer can search for text quickly, it must be told precisely what to look for. Regular expressions allow you to specify the precise patterns of characters you are looking for. In fact, some word processing and spreadsheet applications provide find-and-replace features that allow you to search using regular expressions.

The re module that comes with Python lets you compile Regex objects. These values have several methods: search() to find a single match, findall() to find all matching instances, and sub() to do a find-and-replace substitution of text.

There’s a bit more to regular expression syntax than is described in this chapter. You can find out more in the official Python documentation at http://docs.python.org/3/library/re.html. The tutorial website http://www.regular-expressions.info/ is also a useful resource.

Now that you have expertise manipulating and matching strings, it’s time to dive into how to read from and write to files on your computer’s hard drive.

Practice Questions

Q: 1. What is the function that creates Regex objects?

Q: 2. Why are raw strings often used when creating Regex objects?

Q: 3. What does the search() method return?

Q: 4. How do you get the actual strings that match the pattern from a Match object?

Q: 5. In the regex created from r'(\d\d\d)-(\d\d\d-\d\d\d\d)', what does group 0 cover? Group 1? Group 2?

Q: 6. Parentheses and periods have specific meanings in regular expression syntax. How would you specify that you want a regex to match actual parentheses and period characters?

Q: 7. The findall() method returns a list of strings or a list of tuples of strings. What makes it return one or the other?

Q: 8. What does the | character signify in regular expressions?

Q: 9. What two things does the ? character signify in regular expressions?

Q: 10. What is the difference between the + and * characters in regular expressions?

Q: 11. What is the difference between {3} and {3,5} in regular expressions?

Q: 12. What do the \d, \w, and \s shorthand character classes signify in regular expressions?

Q: 13. What do the \D, \W, and \S shorthand character classes signify in regular expressions?

Q: 14. How do you make a regular expression case-insensitive?

Q: 15. What does the . character normally match? What does it match if re.DOTALL is passed as the second argument to re.compile()?

Q: 16. What is the difference between .* and .*??

Q: 17. What is the character class syntax to match all numbers and lowercase letters?

Q: 18. If numRegex = re.compile(r'\d+'), what will numRegex.sub('X', '12 drummers, 11 pipers, five rings, 3 hens') return?

Q: 19. What does passing re.VERBOSE as the second argument to re.compile() allow you to do?

Q: 20.Howwouldyouwritearegexthatmatchesanumberwithcommasforeverythreedigits?Itmustmatchthefollowing:

'42'

'1,234'

'6,368,745'

butnotthefollowing:

'12,34,567'(whichhasonlytwodigitsbetweenthecommas)'1234'(whichlackscommas)

Q: 21. How would you write a regex that matches the full name of someone whose last name is Nakamoto? You can assume that the first name that comes before it will always be one word that begins with a capital letter. The regex must match the following:

'Satoshi Nakamoto'

'Alice Nakamoto'

'Robocop Nakamoto'

but not the following:

'satoshi Nakamoto' (where the first name is not capitalized)

'Mr. Nakamoto' (where the preceding word has a nonletter character)

'Nakamoto' (which has no first name)

'Satoshi nakamoto' (where Nakamoto is not capitalized)

Q: 22. How would you write a regex that matches a sentence where the first word is either Alice, Bob, or Carol; the second word is either eats, pets, or throws; the third word is apples, cats, or baseballs; and the sentence ends with a period? This regex should be case-insensitive. It must match the following:

'Alice eats apples.'

'Bob pets cats.'

'Carol throws baseballs.'

'Alice throws Apples.'

'BOB EATS CATS.'

but not the following:

'Robocop eats apples.'

'ALICE THROWS FOOTBALLS.'

'Carol eats 7 cats.'

Practice Projects

For practice, write programs to do the following tasks.

Strong Password Detection

Write a function that uses regular expressions to make sure the password string it is passed is strong. A strong password is defined as one that is at least eight characters long, contains both uppercase and lowercase characters, and has at least one digit. You may need to test the string against multiple regex patterns to validate its strength.

Regex Version of strip()

Write a function that takes a string and does the same thing as the strip() string method. If no other arguments are passed other than the string to strip, then whitespace characters will be removed from the beginning and end of the string. Otherwise, the characters specified in the second argument to the function will be removed from the string.

[1] Cory Doctorow, “Here’s what ICT should really teach kids: how to do regular expressions,” Guardian, December 4, 2012, http://www.theguardian.com/technology/2012/dec/04/ict-teach-kids-regular-expressions/.

Chapter 8. Reading and Writing Files

Variables are a fine way to store data while your program is running, but if you want your data to persist even after your program has finished, you need to save it to a file. You can think of a file’s contents as a single string value, potentially gigabytes in size. In this chapter, you will learn how to use Python to create, read, and save files on the hard drive.

Files and File Paths

A file has two key properties: a filename (usually written as one word) and a path. The path specifies the location of a file on the computer. For example, there is a file on my Windows 7 laptop with the filename project.docx in the path C:\Users\asweigart\Documents. The part of the filename after the last period is called the file’s extension and tells you a file’s type. project.docx is a Word document, and Users, asweigart, and Documents all refer to folders (also called directories). Folders can contain files and other folders. For example, project.docx is in the Documents folder, which is inside the asweigart folder, which is inside the Users folder. Figure 8-1 shows this folder organization.

Figure 8-1. A file in a hierarchy of folders

The C:\ part of the path is the root folder, which contains all other folders. On Windows, the root folder is named C:\ and is also called the C: drive. On OS X and Linux, the root folder is /. In this book, I’ll be using the Windows-style root folder, C:\. If you are entering the interactive shell examples on OS X or Linux, enter / instead.

Additional volumes, such as a DVD drive or USB thumb drive, will appear differently on different operating systems. On Windows, they appear as new, lettered root drives, such as D:\ or E:\. On OS X, they appear as new folders under the /Volumes folder. On Linux, they appear as new folders under the /mnt (“mount”) folder. Also note that while folder names and filenames are not case sensitive on Windows and OS X, they are case sensitive on Linux.

Backslash on Windows and Forward Slash on OS X and Linux

On Windows, paths are written using backslashes (\) as the separator between folder names. OS X and Linux, however, use the forward slash (/) as their path separator. If you want your programs to work on all operating systems, you will have to write your Python scripts to handle both cases.

Fortunately, this is simple to do with the os.path.join() function. If you pass it the string values of individual file and folder names in your path, os.path.join() will return a string with a file path using the correct path separators. Enter the following into the interactive shell:

>>> import os
>>> os.path.join('usr', 'bin', 'spam')
'usr\\bin\\spam'

I’m running these interactive shell examples on Windows, so os.path.join('usr', 'bin', 'spam') returned 'usr\\bin\\spam'. (Notice that the backslashes are doubled because each backslash needs to be escaped by another backslash character.) If I had called this function on OS X or Linux, the string would have been 'usr/bin/spam'.

The os.path.join() function is helpful if you need to create strings for filenames. These strings will be passed to several of the file-related functions introduced in this chapter. For example, the following code joins names from a list of filenames to the end of a folder’s name:

>>> myFiles = ['accounts.txt', 'details.csv', 'invite.docx']
>>> for filename in myFiles:
        print(os.path.join('C:\\Users\\asweigart', filename))
C:\Users\asweigart\accounts.txt
C:\Users\asweigart\details.csv
C:\Users\asweigart\invite.docx

The Current Working Directory

Every program that runs on your computer has a current working directory, or cwd. Any filenames or paths that do not begin with the root folder are assumed to be under the current working directory. You can get the current working directory as a string value with the os.getcwd() function and change it with os.chdir(). Enter the following into the interactive shell:

>>> import os
>>> os.getcwd()
'C:\\Python34'
>>> os.chdir('C:\\Windows\\System32')
>>> os.getcwd()
'C:\\Windows\\System32'

Here, the current working directory is set to C:\Python34, so the filename project.docx refers to C:\Python34\project.docx. When we change the current working directory to C:\Windows\System32, project.docx is interpreted as C:\Windows\System32\project.docx.

Python will display an error if you try to change to a directory that does not exist.

>>> os.chdir('C:\\ThisFolderDoesNotExist')
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    os.chdir('C:\\ThisFolderDoesNotExist')
FileNotFoundError: [WinError 2] The system cannot find the file specified:
'C:\\ThisFolderDoesNotExist'

NOTE

While folder is the more modern name for directory, note that current working directory (or just working directory) is the standard term, not current working folder.

Absolute vs. Relative Paths

There are two ways to specify a file path.

An absolute path, which always begins with the root folder

A relative path, which is relative to the program’s current working directory

There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”

Figure 8-2 is an example of some folders and files. When the current working directory is set to C:\bacon, the relative paths for the other folders and files are set as they are in the figure.

Figure 8-2. The relative paths for folders and files in the working directory C:\bacon

The .\ at the start of a relative path is optional. For example, .\spam.txt and spam.txt refer to the same file.
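To see how the dot and dot-dot names collapse, here is a minimal sketch using os.path.normpath(), a function this chapter does not otherwise cover. It simplifies a path purely as text, without checking the disk. The folder names fizz and eggs are hypothetical.

```python
import os

# '.' means "this directory", so it can simply be dropped from the path.
with_dot = os.path.normpath(os.path.join('.', 'spam.txt'))

# '..' means "the parent folder", so 'fizz' followed by '..' cancels out.
with_dot_dot = os.path.normpath(os.path.join('fizz', '..', 'eggs', 'spam.txt'))

print(with_dot)
print(with_dot_dot)
```

The first result is just spam.txt, and the second is the eggs folder joined to spam.txt with whichever separator your operating system uses.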

Creating New Folders with os.makedirs()

Your programs can create new folders (directories) with the os.makedirs() function. Enter the following into the interactive shell:

>>> import os
>>> os.makedirs('C:\\delicious\\walnut\\waffles')

This will create not just the C:\delicious folder but also a walnut folder inside C:\delicious and a waffles folder inside C:\delicious\walnut. That is, os.makedirs() will create any necessary intermediate folders in order to ensure that the full path exists. Figure 8-3 shows this hierarchy of folders.

Figure 8-3. The result of os.makedirs('C:\\delicious\\walnut\\waffles')
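If you would rather not create folders on your C: drive while experimenting, the same call can be tried inside a throwaway folder. This is only a sketch: the tempfile module and the exist_ok argument are additions not used elsewhere in this chapter. Without exist_ok=True, calling os.makedirs() on a folder that already exists raises FileExistsError.

```python
import os
import tempfile

# Work inside a temporary folder so the sketch does not touch real paths.
base = tempfile.mkdtemp()
target = os.path.join(base, 'delicious', 'walnut', 'waffles')

os.makedirs(target)                  # creates delicious, walnut, and waffles in one call
os.makedirs(target, exist_ok=True)   # calling it again is harmless with exist_ok=True

print(os.path.isdir(target))
```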

The os.path Module

The os.path module contains many helpful functions related to filenames and file paths. For instance, you’ve already used os.path.join() to build paths in a way that will work on any operating system. Since os.path is a module inside the os module, you can import it by simply running import os. Whenever your programs need to work with files, folders, or file paths, you can refer to the short examples in this section. The full documentation for the os.path module is on the Python website at http://docs.python.org/3/library/os.path.html.

NOTE

Most of the examples that follow in this section will require the os module, so remember to import it at the beginning of any script you write and any time you restart IDLE. Otherwise, you’ll get a NameError: name 'os' is not defined error message.

Handling Absolute and Relative Paths

The os.path module provides functions for returning the absolute path of a relative path and for checking whether a given path is an absolute path.

Calling os.path.abspath(path) will return a string of the absolute path of the argument. This is an easy way to convert a relative path into an absolute one.

Calling os.path.isabs(path) will return True if the argument is an absolute path and False if it is a relative path.

Calling os.path.relpath(path, start) will return a string of a relative path from the start path to path. If start is not provided, the current working directory is used as the start path.

Try these functions in the interactive shell:

>>> os.path.abspath('.')
'C:\\Python34'
>>> os.path.abspath('.\\Scripts')
'C:\\Python34\\Scripts'
>>> os.path.isabs('.')
False
>>> os.path.isabs(os.path.abspath('.'))
True

Since C:\Python34 was the working directory when os.path.abspath() was called, the “single-dot” folder represents the absolute path 'C:\\Python34'.

NOTE

Since your system probably has different files and folders on it than mine, you won’t be able to follow every example in this chapter exactly. Still, try to follow along using folders that exist on your computer.

Enter the following calls to os.path.relpath() into the interactive shell:

>>> os.path.relpath('C:\\Windows', 'C:\\')
'Windows'
>>> os.path.relpath('C:\\Windows', 'C:\\spam\\eggs')
'..\\..\\Windows'
>>> os.getcwd()
'C:\\Python34'

Calling os.path.dirname(path) will return a string of everything that comes before the last slash in the path argument. Calling os.path.basename(path) will return a string of everything that comes after the last slash in the path argument. The dir name and base name of a path are outlined in Figure 8-4.

Figure 8-4. The base name follows the last slash in a path and is the same as the filename. The dir name is everything before the last slash.

For example, enter the following into the interactive shell:

>>> path = 'C:\\Windows\\System32\\calc.exe'
>>> os.path.basename(path)
'calc.exe'
>>> os.path.dirname(path)
'C:\\Windows\\System32'

If you need a path’s dir name and base name together, you can just call os.path.split() to get a tuple value with these two strings, like so:

>>> calcFilePath = 'C:\\Windows\\System32\\calc.exe'
>>> os.path.split(calcFilePath)
('C:\\Windows\\System32', 'calc.exe')

Notice that you could create the same tuple by calling os.path.dirname() and os.path.basename() and placing their return values in a tuple.

>>> (os.path.dirname(calcFilePath), os.path.basename(calcFilePath))
('C:\\Windows\\System32', 'calc.exe')

But os.path.split() is a nice shortcut if you need both values.

Also, note that os.path.split() does not take a file path and return a list of strings of each folder. For that, use the split() string method and split on the string in os.sep. Recall from earlier that the os.sep variable is set to the correct folder-separating slash for the computer running the program. (The same value is also available as os.path.sep, which the examples here use.)

For example, enter the following into the interactive shell:

>>> calcFilePath.split(os.path.sep)
['C:', 'Windows', 'System32', 'calc.exe']

On OS X and Linux systems, there will be a blank string at the start of the returned list:

>>> '/usr/bin'.split(os.path.sep)
['', 'usr', 'bin']

The split() string method will work to return a list of each part of the path. It will work on any operating system if you pass it os.path.sep.

Finding File Sizes and Folder Contents

Once you have ways of handling file paths, you can then start gathering information about specific files and folders. The os.path module provides functions for finding the size of a file in bytes and the files and folders inside a given folder.

Calling os.path.getsize(path) will return the size in bytes of the file in the path argument.

Calling os.listdir(path) will return a list of filename strings for each file in the path argument. (Note that this function is in the os module, not os.path.)

Here’s what I get when I try these functions in the interactive shell:

>>> os.path.getsize('C:\\Windows\\System32\\calc.exe')
776192
>>> os.listdir('C:\\Windows\\System32')
['0409', '12520437.cpx', '12520850.cpx', '5U877.ax', 'aaclient.dll',
--snip--
'xwtpdui.dll', 'xwtpw32.dll', 'zh-CN', 'zh-HK', 'zh-TW', 'zipfldr.dll']

As you can see, the calc.exe program on my computer is 776,192 bytes in size, and I have a lot of files in C:\Windows\System32. If I want to find the total size of all the files in this directory, I can use os.path.getsize() and os.listdir() together.

>>> totalSize = 0
>>> for filename in os.listdir('C:\\Windows\\System32'):
        totalSize = totalSize + os.path.getsize(os.path.join('C:\\Windows\\System32', filename))

>>> print(totalSize)
1117846456

As I loop over each filename in the C:\Windows\System32 folder, the totalSize variable is incremented by the size of each file. Notice how when I call os.path.getsize(), I use os.path.join() to join the folder name with the current filename. The integer that os.path.getsize() returns is added to the value of totalSize. After looping through all the files, I print totalSize to see the total size of the C:\Windows\System32 folder.
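One detail worth keeping in mind is that os.listdir() returns subfolder names as well as filenames, and the number os.path.getsize() reports for a folder is not the size of that folder's contents. The sketch below wraps the same loop in a function and counts only regular files. The function name total_file_size and the temporary demo folder are illustrative assumptions, not part of the original example.

```python
import os
import tempfile

def total_file_size(folder):
    """Return the combined size in bytes of the files directly inside folder."""
    total = 0
    for filename in os.listdir(folder):
        full_path = os.path.join(folder, filename)
        if os.path.isfile(full_path):      # skip subfolders
            total = total + os.path.getsize(full_path)
    return total

# Build a small demo folder: two files (5 and 10 bytes) and one subfolder.
demo = tempfile.mkdtemp()
for name, text in [('a.txt', '12345'), ('b.txt', '1234567890')]:
    demo_file = open(os.path.join(demo, name), 'w')
    demo_file.write(text)
    demo_file.close()
os.makedirs(os.path.join(demo, 'subfolder'))

print(total_file_size(demo))
```

Files inside subfolder are deliberately left out of the total; counting them would require looking inside each subfolder as well.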

Checking Path Validity

Many Python functions will crash with an error if you supply them with a path that does not exist. The os.path module provides functions to check whether a given path exists and whether it is a file or folder.

Calling os.path.exists(path) will return True if the file or folder referred to in the argument exists and will return False if it does not exist.

Calling os.path.isfile(path) will return True if the path argument exists and is a file and will return False otherwise.

Calling os.path.isdir(path) will return True if the path argument exists and is a folder and will return False otherwise.

Here’s what I get when I try these functions in the interactive shell:

>>> os.path.exists('C:\\Windows')
True
>>> os.path.exists('C:\\some_made_up_folder')
False
>>> os.path.isdir('C:\\Windows\\System32')
True
>>> os.path.isfile('C:\\Windows\\System32')
False
>>> os.path.isdir('C:\\Windows\\System32\\calc.exe')
False
>>> os.path.isfile('C:\\Windows\\System32\\calc.exe')
True

You can determine whether there is a DVD or flash drive currently attached to the computer by checking for it with the os.path.exists() function. For instance, if I wanted to check for a flash drive with the volume named D:\ on my Windows computer, I could do that with the following:

>>> os.path.exists('D:\\')
False

Oops! It looks like I forgot to plug in my flash drive.
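The three checks can be combined into one small helper so that a program can decide what to do before it touches a path. This is a sketch only; the function name describe_path and its return values are hypothetical.

```python
import os
import tempfile

def describe_path(path):
    """Return 'file', 'folder', or 'missing' for the given path."""
    if os.path.isfile(path):
        return 'file'
    if os.path.isdir(path):
        return 'folder'
    return 'missing'

# Try it on a temporary folder and on a filename that does not exist.
folder = tempfile.mkdtemp()
print(describe_path(folder))
print(describe_path(os.path.join(folder, 'no_such_file.txt')))
```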

The File Reading/Writing Process

Once you are comfortable working with folders and relative paths, you’ll be able to specify the location of files to read and write. The functions covered in the next few sections will apply to plaintext files. Plaintext files contain only basic text characters and do not include font, size, or color information. Text files with the .txt extension or Python script files with the .py extension are examples of plaintext files. These can be opened with Windows’s Notepad or OS X’s TextEdit application. Your programs can easily read the contents of plaintext files and treat them as an ordinary string value.

Binary files are all other file types, such as word processing documents, PDFs, images, spreadsheets, and executable programs. If you open a binary file in Notepad or TextEdit, it will look like scrambled nonsense, like in Figure 8-5.

Figure 8-5. The Windows calc.exe program opened in Notepad

Since every different type of binary file must be handled in its own way, this book will not go into reading and writing raw binary files directly. Fortunately, many modules make working with binary files easier, and you will explore one of them, the shelve module, later in this chapter.

There are three steps to reading or writing files in Python.

1. Call the open() function to return a File object.

2. Call the read() or write() method on the File object.

3. Close the file by calling the close() method on the File object.
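The three steps above can be sketched as follows. The first half spells them out one call at a time. The second half uses Python's with statement, an alternative this chapter does not cover, which calls close() for you when the indented block ends. The temporary folder is only there so the sketch does not write into your own folders.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'hello.txt')

# Step 1: open. Step 2: write. Step 3: close.
hello_file = open(path, 'w')
hello_file.write('Hello world!')
hello_file.close()

# The same open/read/close pattern written with a with statement.
# The file is closed automatically at the end of the indented block.
with open(path) as hello_file:
    content = hello_file.read()

print(content)
print(hello_file.closed)
```

Either style works; the explicit close() calls match the examples in the rest of this chapter.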

Opening Files with the open() Function

To open a file with the open() function, you pass it a string path indicating the file you want to open; it can be either an absolute or relative path. The open() function returns a File object.

Try it by creating a text file named hello.txt using Notepad or TextEdit. Type Hello world! as the content of this text file and save it in your user home folder. Then, if you’re using Windows, enter the following into the interactive shell:

>>> helloFile = open('C:\\Users\\your_home_folder\\hello.txt')

If you’re using OS X, enter the following into the interactive shell instead:

>>> helloFile = open('/Users/your_home_folder/hello.txt')

Make sure to replace your_home_folder with your computer username. For example, my username is asweigart, so I’d enter 'C:\\Users\\asweigart\\hello.txt' on Windows.

Both these commands will open the file in “reading plaintext” mode, or read mode for short. When a file is opened in read mode, Python lets you only read data from the file; you can’t write or modify it in any way. Read mode is the default mode for files you open in Python. But if you don’t want to rely on Python’s defaults, you can explicitly specify the mode by passing the string value 'r' as a second argument to open(). So open('/Users/asweigart/hello.txt', 'r') and open('/Users/asweigart/hello.txt') do the same thing.

The call to open() returns a File object. A File object represents a file on your computer; it is simply another type of value in Python, much like the lists and dictionaries you’re already familiar with. In the previous example, you stored the File object in the variable helloFile. Now, whenever you want to read from or write to the file, you can do so by calling methods on the File object in helloFile.

Reading the Contents of Files

Now that you have a File object, you can start reading from it. If you want to read the entire contents of a file as a string value, use the File object’s read() method. Let’s continue with the hello.txt File object you stored in helloFile. Enter the following into the interactive shell:

>>> helloContent = helloFile.read()
>>> helloContent
'Hello world!'

If you think of the contents of a file as a single large string value, the read() method returns the string that is stored in the file.

Alternatively, you can use the readlines() method to get a list of string values from the file, one string for each line of text. For example, create a file named sonnet29.txt in the same directory as hello.txt and write the following text in it:

When, in disgrace with fortune and men's eyes,
I all alone beweep my outcast state,
And trouble deaf heaven with my bootless cries,
And look upon myself and curse my fate,

Make sure to separate the four lines with line breaks. Then enter the following into the interactive shell:

>>> sonnetFile = open('sonnet29.txt')
>>> sonnetFile.readlines()
["When, in disgrace with fortune and men's eyes,\n", 'I all alone beweep my
outcast state,\n', 'And trouble deaf heaven with my bootless cries,\n', 'And
look upon myself and curse my fate,']

Note that each of the string values ends with a newline character, \n, except for the last line of the file. A list of strings is often easier to work with than a single large string value.
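Because every line except possibly the last keeps its trailing \n, a common follow-up step is to strip it off. Here is a minimal sketch that first writes a two-line version of the sonnet to a temporary folder (an assumption made so the example is self-contained) and then reads it back.

```python
import os
import tempfile

# Write a two-line sample file into a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'sonnet29.txt')
sonnet_file = open(path, 'w')
sonnet_file.write("When, in disgrace with fortune and men's eyes,\n")
sonnet_file.write('I all alone beweep my outcast state,')
sonnet_file.close()

sonnet_file = open(path)
lines = sonnet_file.readlines()
sonnet_file.close()

# Remove the trailing newline from each line, if there is one.
clean_lines = []
for line in lines:
    clean_lines.append(line.rstrip('\n'))

print(clean_lines)
```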

Writing to Files

Python allows you to write content to a file in a way similar to how the print() function “writes” strings to the screen. You can’t write to a file you’ve opened in read mode, though. Instead, you need to open it in “write plaintext” mode or “append plaintext” mode, or write mode and append mode for short.

Write mode will overwrite the existing file and start from scratch, just like when you overwrite a variable’s value with a new value. Pass 'w' as the second argument to open() to open the file in write mode. Append mode, on the other hand, will append text to the end of the existing file. You can think of this as appending to a list in a variable, rather than overwriting the variable altogether. Pass 'a' as the second argument to open() to open the file in append mode.

If the filename passed to open() does not exist, both write and append mode will create a new, blank file. After reading or writing a file, call the close() method before opening the file again.

Let’s put these concepts together. Enter the following into the interactive shell:

>>> baconFile = open('bacon.txt', 'w')
>>> baconFile.write('Hello world!\n')
13
>>> baconFile.close()
>>> baconFile = open('bacon.txt', 'a')
>>> baconFile.write('Bacon is not a vegetable.')
25
>>> baconFile.close()
>>> baconFile = open('bacon.txt')
>>> content = baconFile.read()
>>> baconFile.close()
>>> print(content)
Hello world!
Bacon is not a vegetable.

First, we open bacon.txt in write mode. Since there isn’t a bacon.txt yet, Python creates one. Calling write() on the opened file and passing write() the string argument 'Hello world!\n' writes the string to the file and returns the number of characters written, including the newline. Then we close the file.

To add text to the existing contents of the file instead of replacing the string we just wrote, we open the file in append mode. We write 'Bacon is not a vegetable.' to the file and close it. Finally, to print the file contents to the screen, we open the file in its default read mode, call read(), store the resulting string in content, close the file, and print content.

Note that the write() method does not automatically add a newline character to the end of the string like the print() function does. You will have to add this character yourself.
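The difference between the two modes is easiest to see when the same file is opened several times in a row. In this sketch the filename and the strings written are made up, and the file lives in a temporary folder.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

def read_all(file_path):
    """Return the whole contents of the file at file_path as a string."""
    text_file = open(file_path)
    text = text_file.read()
    text_file.close()
    return text

# Write mode starts from scratch each time, so 'first' is lost.
for text in ['first\n', 'second\n']:
    out_file = open(path, 'w')
    out_file.write(text)
    out_file.close()
after_write = read_all(path)

# Append mode keeps what is already there and adds to the end.
out_file = open(path, 'a')
out_file.write('third\n')
out_file.close()
after_append = read_all(path)

print(repr(after_write))
print(repr(after_append))
```

Each string ends with an explicit \n because write() does not add one for you.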

Saving Variables with the shelve Module

You can save variables in your Python programs to binary shelf files using the shelve module. This way, your program can restore data to variables from the hard drive. The shelve module will let you add Save and Open features to your program. For example, if you ran a program and entered some configuration settings, you could save those settings to a shelf file and then have the program load them the next time it is run.

Enter the following into the interactive shell:

>>> import shelve
>>> shelfFile = shelve.open('mydata')
>>> cats = ['Zophie', 'Pooka', 'Simon']
>>> shelfFile['cats'] = cats
>>> shelfFile.close()

To read and write data using the shelve module, you first import shelve. Call shelve.open() and pass it a filename, and then store the returned shelf value in a variable. You can make changes to the shelf value as if it were a dictionary. When you’re done, call close() on the shelf value. Here, our shelf value is stored in shelfFile. We create a list cats and write shelfFile['cats'] = cats to store the list in shelfFile as a value associated with the key 'cats' (like in a dictionary). Then we call close() on shelfFile.

After running the previous code on Windows, you will see three new files in the current working directory: mydata.bak, mydata.dat, and mydata.dir. On OS X, only a single mydata.db file will be created.

These binary files contain the data you stored in your shelf. The format of these binary files is not important; you only need to know what the shelve module does, not how it does it. The module frees you from worrying about how to store your program’s data to a file.

Your programs can use the shelve module to later reopen and retrieve the data from these shelf files. Shelf values don’t have to be opened in read or write mode; they can do both once opened. Enter the following into the interactive shell:

>>> shelfFile = shelve.open('mydata')
>>> type(shelfFile)
<class 'shelve.DbfilenameShelf'>
>>> shelfFile['cats']
['Zophie', 'Pooka', 'Simon']
>>> shelfFile.close()

Here, we open the shelf files to check that our data was stored correctly. Entering shelfFile['cats'] returns the same list that we stored earlier, so we know that the list is correctly stored, and we call close().

Just like dictionaries, shelf values have keys() and values() methods that will return list-like values of the keys and values in the shelf. Since these methods return list-like values instead of true lists, you should pass them to the list() function to get them in list form. Enter the following into the interactive shell:

>>> shelfFile = shelve.open('mydata')
>>> list(shelfFile.keys())
['cats']
>>> list(shelfFile.values())
[['Zophie', 'Pooka', 'Simon']]
>>> shelfFile.close()

Plaintext is useful for creating files that you’ll read in a text editor such as Notepad or TextEdit, but if you want to save data from your Python programs, use the shelve module.
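As a compact recap, the save and load steps can be sketched in one script. This version stores the shelf in a temporary folder and uses a with statement so the shelf is closed automatically (shelf values support this in Python 3.4 and later); both choices are assumptions made for the sketch rather than something the examples above rely on.

```python
import os
import shelve
import tempfile

shelf_path = os.path.join(tempfile.mkdtemp(), 'mydata')

# Save: assign to a key, just as you would with a dictionary.
with shelve.open(shelf_path) as shelf_file:
    shelf_file['cats'] = ['Zophie', 'Pooka', 'Simon']

# Load: reopen the same shelf and read the value back.
with shelve.open(shelf_path) as shelf_file:
    cats = shelf_file['cats']
    keys = list(shelf_file.keys())

print(cats)
print(keys)
```

To change a stored list, read it into a variable, modify it, and assign it back to the same key so the shelf records the new value.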

Saving Variables with the pprint.pformat() Function

Recall from Pretty Printing that the pprint.pprint() function will “pretty print” the contents of a list or dictionary to the screen, while the pprint.pformat() function will return this same text as a string instead of printing it. Not only is this string formatted to be easy to read, but it is also syntactically correct Python code. Say you have a dictionary stored in a variable and you want to save this variable and its contents for future use. Using pprint.pformat() will give you a string that you can write to a .py file. This file will be your very own module that you can import whenever you want to use the variable stored in it.

For example, enter the following into the interactive shell:

>>> import pprint
>>> cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]
>>> pprint.pformat(cats)
"[{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]"
>>> fileObj = open('myCats.py', 'w')
>>> fileObj.write('cats = ' + pprint.pformat(cats) + '\n')
83
>>> fileObj.close()

Here, we import pprint to let us use pprint.pformat(). We have a list of dictionaries, stored in a variable cats. To keep the list in cats available even after we close the shell, we use pprint.pformat() to return it as a string. Once we have the data in cats as a string, it’s easy to write the string to a file, which we’ll call myCats.py.

The modules that an import statement imports are themselves just Python scripts. When the string from pprint.pformat() is saved to a .py file, the file is a module that can be imported just like any other.

And since Python scripts are themselves just text files with the .py file extension, your Python programs can even generate other Python programs. You can then import these files into scripts.

>>> import myCats
>>> myCats.cats
[{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]
>>> myCats.cats[0]
{'name': 'Zophie', 'desc': 'chubby'}
>>> myCats.cats[0]['name']
'Zophie'

The benefit of creating a .py file (as opposed to saving variables with the shelve module) is that because it is a text file, the contents of the file can be read and modified by anyone with a simple text editor. For most applications, however, saving data using the shelve module is the preferred way to save variables to a file. Only basic data types such as integers, floats, strings, lists, and dictionaries can be written to a file as simple text. File objects, for example, cannot be encoded as text.
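The round trip can also be checked in a single script. A plain import myCats works when myCats.py sits in the current working directory. This sketch instead writes the file to a temporary folder and loads it with the importlib.util functions from the standard library, which is a different technique from the one shown above and is used here only so the example does not leave files behind.

```python
import importlib.util
import os
import pprint
import tempfile

cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]

# Write the variable out as a line of Python source code.
module_path = os.path.join(tempfile.mkdtemp(), 'myCats.py')
file_obj = open(module_path, 'w')
file_obj.write('cats = ' + pprint.pformat(cats) + '\n')
file_obj.close()

# Load the generated file as a module from its explicit path.
spec = importlib.util.spec_from_file_location('myCats', module_path)
my_cats = importlib.util.module_from_spec(spec)
spec.loader.exec_module(my_cats)

print(my_cats.cats == cats)
print(my_cats.cats[0]['name'])
```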

Project: Generating Random Quiz Files

Say you’re a geography teacher with 35 students in your class and you want to give a pop quiz on US state capitals. Alas, your class has a few bad eggs in it, and you can’t trust the students not to cheat. You’d like to randomize the order of questions so that each quiz is unique, making it impossible for anyone to crib answers from anyone else. Of course, doing this by hand would be a lengthy and boring affair. Fortunately, you know some Python.

Here is what the program does:

Creates 35 different quizzes.

Creates 50 multiple-choice questions for each quiz, in random order.

Provides the correct answer and three random wrong answers for each question, in random order.

Writes the quizzes to 35 text files.

Writes the answer keys to 35 text files.

This means the code will need to do the following:

Store the states and their capitals in a dictionary.

Call open(), write(), and close() for the quiz and answer key text files.

Use random.shuffle() to randomize the order of the questions and multiple-choice options.

Step 1: Store the Quiz Data in a Dictionary

The first step is to create a skeleton script and fill it with your quiz data. Create a file named randomQuizGenerator.py, and make it look like the following:

#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in
# random order, along with the answer key.

➊ import random

# The quiz data. Keys are states and values are their capitals.
➋ capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois':
'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas':
'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine':
'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan':
'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri':
'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada':
'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton',
'New Mexico': 'Santa Fe', 'New York': 'Albany', 'North Carolina': 'Raleigh',
'North Dakota': 'Bismarck', 'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City',
'Oregon': 'Salem', 'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee':
'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont':
'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia',
'West Virginia': 'Charleston', 'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

# Generate 35 quiz files.
➌ for quizNum in range(35):
    # TODO: Create the quiz and answer key files.

    # TODO: Write out the header for the quiz.

    # TODO: Shuffle the order of the states.

    # TODO: Loop through all 50 states, making a question for each.

Since this program will be randomly ordering the questions and answers, you’ll need to import the random module ➊ to make use of its functions. The capitals variable ➋ contains a dictionary with US states as keys and their capitals as values. And since you want to create 35 quizzes, the code that actually generates the quiz and answer key files (marked with TODO comments for now) will go inside a for loop that loops 35 times ➌. (This number can be changed to generate any number of quiz files.)

Step 2: Create the Quiz File and Shuffle the Question Order

Now it’s time to start filling in those TODOs.

The code in the loop will be repeated 35 times, once for each quiz, so you have to worry about only one quiz at a time within the loop. First you’ll create the actual quiz file. It needs to have a unique filename and should also have some kind of standard header in it, with places for the student to fill in a name, date, and class period. Then you’ll need to get a list of states in randomized order, which can be used later to create the questions and answers for the quiz.

Add the following lines of code to randomQuizGenerator.py:

#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in
# random order, along with the answer key.

--snip--

# Generate 35 quiz files.
for quizNum in range(35):
    # Create the quiz and answer key files.
  ➊ quizFile = open('capitalsquiz%s.txt' % (quizNum + 1), 'w')
  ➋ answerKeyFile = open('capitalsquiz_answers%s.txt' % (quizNum + 1), 'w')

    # Write out the header for the quiz.
  ➌ quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
    quizFile.write((' ' * 20) + 'State Capitals Quiz (Form %s)' % (quizNum + 1))
    quizFile.write('\n\n')

    # Shuffle the order of the states.
    states = list(capitals.keys())
  ➍ random.shuffle(states)

    # TODO: Loop through all 50 states, making a question for each.

The filenames for the quizzes will be capitalsquiz<N>.txt, where <N> is a unique number for the quiz that comes from quizNum, the for loop’s counter. The answer key for capitalsquiz<N>.txt will be stored in a text file named capitalsquiz_answers<N>.txt. Each time through the loop, the %s placeholder in 'capitalsquiz%s.txt' and 'capitalsquiz_answers%s.txt' will be replaced by (quizNum + 1), so the first quiz and answer key created will be capitalsquiz1.txt and capitalsquiz_answers1.txt. These files will be created with calls to the open() function at ➊ and ➋, with 'w' as the second argument to open them in write mode.

The write() statements at ➌ create a quiz header for the student to fill out; the expression (' ' * 20) indents the quiz title by 20 spaces. Finally, a randomized list of US states is created with the help of the random.shuffle() function ➍, which randomly reorders the values in any list that is passed to it.

Step 3: Create the Answer Options

Now you need to generate the answer options for each question, which will be multiple choice from A to D. You’ll need to create another for loop, this one to generate the content for each of the 50 questions on the quiz. Then there will be a third for loop nested inside to generate the multiple-choice options for each question. Make your code look like the following:

#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in
# random order, along with the answer key.

--snip--

    # Loop through all 50 states, making a question for each.
    for questionNum in range(50):

        # Get right and wrong answers.
      ➊ correctAnswer = capitals[states[questionNum]]
      ➋ wrongAnswers = list(capitals.values())
      ➌ del wrongAnswers[wrongAnswers.index(correctAnswer)]
      ➍ wrongAnswers = random.sample(wrongAnswers, 3)
      ➎ answerOptions = wrongAnswers + [correctAnswer]
      ➏ random.shuffle(answerOptions)

        # TODO: Write the question and answer options to the quiz file.

        # TODO: Write the answer key to a file.

The correct answer is easy to get: it’s stored as a value in the capitals dictionary ➊. This loop will loop through the states in the shuffled states list, from states[0] to states[49], find each state in capitals, and store that state’s corresponding capital in correctAnswer.

The list of possible wrong answers is trickier. You can get it by duplicating all the values in the capitals dictionary ➋, deleting the correct answer ➌, and selecting three random values from this list ➍. The random.sample() function makes it easy to do this selection. Its first argument is the list you want to select from; the second argument is the number of values you want to select. The full list of answer options is the combination of these three wrong answers with the correct answer ➎. Finally, the answers need to be randomized ➏ so that the correct response isn’t always choice D.
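To try this selection logic on its own, outside the nested loops, it can be pulled into a small function. This is a sketch: the function name make_answer_options and the five-state dictionary are hypothetical, and list.remove() is used in place of the del statement at ➌, which has the same effect here.

```python
import random

def make_answer_options(capitals, state, wrong_count=3):
    """Return the correct capital and a shuffled list of answer options."""
    correct_answer = capitals[state]
    wrong_answers = list(capitals.values())
    wrong_answers.remove(correct_answer)                 # drop the right answer
    wrong_answers = random.sample(wrong_answers, wrong_count)
    answer_options = wrong_answers + [correct_answer]
    random.shuffle(answer_options)                       # so the answer isn't always last
    return correct_answer, answer_options

# A small sample of the quiz data, just for demonstration.
sample_capitals = {'Alaska': 'Juneau', 'Ohio': 'Columbus', 'Texas': 'Austin',
                   'Utah': 'Salt Lake City', 'Maine': 'Augusta'}

correct, options = make_answer_options(sample_capitals, 'Texas')
print(correct)
print(options)
```

The order of the options will differ from run to run, but the list always holds four different capitals, one of which is the correct answer.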

Step 4: Write Content to the Quiz and Answer Key Files

All that is left is to write the question to the quiz file and the answer to the answer key file. Make your code look like the following:

#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in
# random order, along with the answer key.

--snip--

    # Loop through all 50 states, making a question for each.
    for questionNum in range(50):
        --snip--

        # Write the question and the answer options to the quiz file.
        quizFile.write('%s. What is the capital of %s?\n' % (questionNum + 1,
            states[questionNum]))
➊         for i in range(4):
➋             quizFile.write('%s. %s\n' % ('ABCD'[i], answerOptions[i]))
        quizFile.write('\n')

        # Write the answer key to a file.
➌         answerKeyFile.write('%s. %s\n' % (questionNum + 1, 'ABCD'[
            answerOptions.index(correctAnswer)]))
    quizFile.close()
    answerKeyFile.close()

A for loop that goes through integers 0 to 3 will write the answer options in the answerOptions list ➊. The expression 'ABCD'[i] at ➋ treats the string 'ABCD' as an array and will evaluate to 'A', 'B', 'C', and then 'D' on each respective iteration through the loop.

In the final line ➌, the expression answerOptions.index(correctAnswer) will find the integer index of the correct answer in the randomly ordered answer options, and 'ABCD'[answerOptions.index(correctAnswer)] will evaluate to the correct answer’s letter to be written to the answer key file.
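To see the string-indexing trick in isolation, here is a short sketch. The answer_options list and correct_answer value are sample data chosen for illustration; they are not produced by the quiz program.

```python
answer_options = ['Raleigh', 'Harrisburg', 'Denver', 'Lincoln']
correct_answer = 'Denver'

# Indexing a string works like indexing a list: 'ABCD'[2] is 'C'.
for i in range(4):
    print('%s. %s' % ('ABCD'[i], answer_options[i]))

# The position of the correct answer picks out its letter.
answer_letter = 'ABCD'[answer_options.index(correct_answer)]
print('Answer key entry:', answer_letter)
```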

After you run the program, this is how your capitalsquiz1.txt file will look, though of course your questions and answer options may be different from those shown here, depending on the outcome of your random.shuffle() calls:

Name:

Date:

Period:

State Capitals Quiz (Form 1)

1. What is the capital of West Virginia?
A. Hartford
B. Santa Fe
C. Harrisburg
D. Charleston

2. What is the capital of Colorado?
A. Raleigh
B. Harrisburg
C. Denver
D. Lincoln

--snip--

The corresponding capitalsquiz_answers1.txt text file will look like this:

1. D
2. C
3. A
4. C
--snip--

Project: Multiclipboard

Say you have the boring task of filling out many forms in a web page or software with several text fields. The clipboard saves you from typing the same text over and over again. But only one thing can be on the clipboard at a time. If you have several different pieces of text that you need to copy and paste, you have to keep highlighting and copying the same few things over and over again.

You can write a Python program to keep track of multiple pieces of text. This “multiclipboard” will be named mcb.pyw (since “mcb” is shorter to type than “multiclipboard”). The .pyw extension means that Python won’t show a Terminal window when it runs this program. (See Appendix B for more details.)

The program will save each piece of clipboard text under a keyword. For example, when you run py mcb.pyw save spam, the current contents of the clipboard will be saved with the keyword spam. This text can later be loaded to the clipboard again by running py mcb.pyw spam. And if the user forgets what keywords they have, they can run py mcb.pyw list to copy a list of all keywords to the clipboard.

Here’s what the program does:

The command line argument for the keyword is checked.
If the argument is save, then the clipboard contents are saved to the keyword.
If the argument is list, then all the keywords are copied to the clipboard.
Otherwise, the text for the keyword is copied to the clipboard.

This means the code will need to do the following:

Read the command line arguments from sys.argv.
Read and write to the clipboard.
Save and load to a shelf file.

If you use Windows, you can easily run this script from the Run… window by creating a batch file named mcb.bat with the following content:

@pyw.exe C:\Python34\mcb.pyw %*

Step 1: Comments and Shelf Setup

Let’s start by making a skeleton script with some comments and basic setup. Make your code look like the following:

#! python3
# mcb.pyw - Saves and loads pieces of text to the clipboard.
➊ # Usage: py.exe mcb.pyw save <keyword> - Saves clipboard to keyword.
#        py.exe mcb.pyw <keyword> - Loads keyword to clipboard.
#        py.exe mcb.pyw list - Loads all keywords to clipboard.

➋ import shelve, pyperclip, sys

➌ mcbShelf = shelve.open('mcb')

# TODO: Save clipboard content.

# TODO: List keywords and load content.

mcbShelf.close()

It’s common practice to put general usage information in comments at the top of the file ➊. If you ever forget how to run your script, you can always look at these comments for a reminder. Then you import your modules ➋. Copying and pasting will require the pyperclip module, and reading the command line arguments will require the sys module. The shelve module will also come in handy: Whenever the user wants to save a new piece of clipboard text, you’ll save it to a shelf file. Then, when the user wants to paste the text back to their clipboard, you’ll open the shelf file and load it back into your program. The shelf file will be named with the prefix mcb ➌.

Step 2: Save Clipboard Content with a Keyword

The program does different things depending on whether the user wants to save text to a keyword, load text into the clipboard, or list all the existing keywords. Let’s deal with that first case. Make your code look like the following:

#! python3
# mcb.pyw - Saves and loads pieces of text to the clipboard.
--snip--

# Save clipboard content.
➊ if len(sys.argv) == 3 and sys.argv[1].lower() == 'save':
➋     mcbShelf[sys.argv[2]] = pyperclip.paste()
elif len(sys.argv) == 2:
➌     # TODO: List keywords and load content.
    pass  # placeholder so the empty branch is valid Python

mcbShelf.close()

If the first command line argument (which will always be at index 1 of the sys.argv list) is 'save' ➊, the second command line argument is the keyword for the current content of the clipboard. The keyword will be used as the key for mcbShelf, and the value will be the text currently on the clipboard ➋.

If there is only one command line argument, you will assume it is either 'list' or a keyword to load content onto the clipboard. You will implement that code later. For now, just put a TODO comment there ➌, followed by a pass statement so that the branch is not empty.

Step 3: List Keywords and Load a Keyword’s Content

Finally, let’s implement the two remaining cases: The user wants to load clipboard text in from a keyword, or they want a list of all available keywords. Make your code look like the following:

#! python3
# mcb.pyw - Saves and loads pieces of text to the clipboard.
--snip--

# Save clipboard content.
if len(sys.argv) == 3 and sys.argv[1].lower() == 'save':
    mcbShelf[sys.argv[2]] = pyperclip.paste()
elif len(sys.argv) == 2:
    # List keywords and load content.
➊     if sys.argv[1].lower() == 'list':
➋         pyperclip.copy(str(list(mcbShelf.keys())))
    elif sys.argv[1] in mcbShelf:
➌         pyperclip.copy(mcbShelf[sys.argv[1]])

mcbShelf.close()

If there is only one command line argument, first let’s check whether it’s 'list' ➊. If so, a string representation of the list of shelf keys will be copied to the clipboard ➋. The user can paste this list into an open text editor to read it.

Otherwise, you can assume the command line argument is a keyword. If this keyword exists in the mcbShelf shelf as a key, you can load the value onto the clipboard ➌.
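If you’d like to experiment with the three branches without touching your real clipboard, the following is a rough sketch under a few assumptions: the function name run_mcb is hypothetical, a one-item list stands in for the clipboard (so pyperclip is not needed), and the shelf is created in a temporary folder rather than next to the script.

```python
import os
import shelve
import tempfile

def run_mcb(args, shelf, clipboard):
    """Mimic mcb.pyw's branching. args plays the role of sys.argv[1:];
    clipboard is a one-item list standing in for the real clipboard."""
    if len(args) == 2 and args[0].lower() == 'save':
        shelf[args[1]] = clipboard[0]               # save under keyword
    elif len(args) == 1:
        if args[0].lower() == 'list':
            clipboard[0] = str(list(shelf.keys()))  # copy all keywords
        elif args[0] in shelf:
            clipboard[0] = shelf[args[0]]           # load keyword's text

with tempfile.TemporaryDirectory() as folder:
    with shelve.open(os.path.join(folder, 'mcb_demo')) as shelf:
        clipboard = ['Hello there']
        run_mcb(['save', 'greeting'], shelf, clipboard)
        clipboard[0] = 'something else'
        run_mcb(['greeting'], shelf, clipboard)
        loaded = clipboard[0]
        run_mcb(['list'], shelf, clipboard)
        listing = clipboard[0]

print(loaded)
print(listing)
```

Because the argument list has one fewer item than sys.argv (there is no script name at index 0), the length checks here are 2 and 1 instead of 3 and 2.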

And that’s it! Launching this program has different steps depending on what operating system your computer uses. See Appendix B for details for your operating system.

Recall the password locker program you created in Chapter 6 that stored the passwords in a dictionary. Updating the passwords required changing the source code of the program. This isn’t ideal because average users don’t feel comfortable changing source code to update their software. Also, every time you modify the source code to a program, you run the risk of accidentally introducing new bugs. By storing the data for a program in a different place than the code, you can make your programs easier for others to use and more resistant to bugs.

Summary

Files are organized into folders (also called directories), and a path describes the location of a file. Every program running on your computer has a current working directory, which allows you to specify file paths relative to the current location instead of always typing the full (or absolute) path. The os.path module has many functions for manipulating file paths.

Your programs can also directly interact with the contents of text files. The open() function can open these files to read in their contents as one large string (with the read() method) or as a list of strings (with the readlines() method). The open() function can open files in write or append mode to create new text files or add to existing text files, respectively.

In previous chapters, you used the clipboard as a way of getting large amounts of text into a program, rather than typing it all in. Now you can have your programs read files directly from the hard drive, which is a big improvement, since files are much less volatile than the clipboard.

In the next chapter, you will learn how to handle the files themselves, by copying them, deleting them, renaming them, moving them, and more.

Practice Questions

Q: 1. What is a relative path relative to?

Q: 2. What does an absolute path start with?

Q: 3. What do the os.getcwd() and os.chdir() functions do?

Q: 4. What are the . and .. folders?

Q: 5. In C:\bacon\eggs\spam.txt, which part is the dir name, and which part is the base name?

Q: 6. What are the three “mode” arguments that can be passed to the open() function?

Q: 7. What happens if an existing file is opened in write mode?

Q: 8. What is the difference between the read() and readlines() methods?

Q: 9. What data structure does a shelf value resemble?

Practice Projects

For practice, design and write the following programs.

Extending the Multiclipboard

Extend the multiclipboard program in this chapter so that it has a delete <keyword> command line argument that will delete a keyword from the shelf. Then add a delete command line argument that will delete all keywords.

Mad Libs

Create a Mad Libs program that reads in text files and lets the user add their own text anywhere the word ADJECTIVE, NOUN, ADVERB, or VERB appears in the text file. For example, a text file may look like this:

The ADJECTIVE panda walked to the NOUN and then VERB. A nearby NOUN was
unaffected by these events.

The program would find these occurrences and prompt the user to replace them.

Enter an adjective:
silly
Enter a noun:
chandelier
Enter a verb:
screamed
Enter a noun:
pickup truck

The following text file would then be created:

The silly panda walked to the chandelier and then screamed. A nearby pickup
truck was unaffected by these events.

The results should be printed to the screen and saved to a new text file.

Regex Search

Write a program that opens all .txt files in a folder and searches for any line that matches a user-supplied regular expression. The results should be printed to the screen.

Chapter 9. Organizing Files

In the previous chapter, you learned how to create and write to new files in Python. Your programs can also organize preexisting files on the hard drive. Maybe you’ve had the experience of going through a folder full of dozens, hundreds, or even thousands of files and copying, renaming, moving, or compressing them all by hand. Or consider tasks such as these:

Making copies of all PDF files (and only the PDF files) in every sub-folder of a folder
Removing the leading zeros in the filenames for every file in a folder of hundreds of files named spam001.txt, spam002.txt, spam003.txt, and so on
Compressing the contents of several folders into one ZIP file (which could be a simple backup system)

All this boring stuff is just begging to be automated in Python. By programming your computer to do these tasks, you can transform it into a quick-working file clerk who never makes mistakes.

As you begin working with files, you may find it helpful to be able to quickly see what the extension (.txt, .pdf, .jpg, and so on) of a file is. With OS X and Linux, your file browser most likely shows extensions automatically. With Windows, file extensions may be hidden by default. To show extensions, go to Start ▸ Control Panel ▸ Appearance and Personalization ▸ Folder Options. On the View tab, under Advanced Settings, uncheck the Hide extensions for known file types checkbox.

The shutil Module

The shutil (or shell utilities) module has functions to let you copy, move, rename, and delete files in your Python programs. To use the shutil functions, you will first need to use import shutil.

Copying Files and Folders

The shutil module provides functions for copying files, as well as entire folders.

Calling shutil.copy(source, destination) will copy the file at the path source to the folder at the path destination. (Both source and destination are strings.) If destination is a filename, it will be used as the new name of the copied file. This function returns a string of the path of the copied file.

Enter the following into the interactive shell to see how shutil.copy() works:

>>> import shutil, os
>>> os.chdir('C:\\')
➊ >>> shutil.copy('C:\\spam.txt', 'C:\\delicious')
'C:\\delicious\\spam.txt'
➋ >>> shutil.copy('eggs.txt', 'C:\\delicious\\eggs2.txt')
'C:\\delicious\\eggs2.txt'

The first shutil.copy() call copies the file at C:\spam.txt to the folder C:\delicious. The return value is the path of the newly copied file. Note that since a folder was specified as the destination ➊, the original spam.txt filename is used for the new, copied file’s filename. The second shutil.copy() call ➋ also copies the file at C:\eggs.txt to the folder C:\delicious but gives the copied file the name eggs2.txt.
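If you don’t have a C:\delicious folder handy (or aren’t on Windows), the same two behaviors can be tried safely in a temporary folder. This is only a sketch: the file and folder names are made up for the demonstration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    source = os.path.join(base, 'spam.txt')
    with open(source, 'w') as f:
        f.write('Hello')
    folder = os.path.join(base, 'delicious')
    os.mkdir(folder)

    # Destination is a folder: the copy keeps the name spam.txt.
    first = shutil.copy(source, folder)
    # Destination is a filename: the copy is saved under the new name.
    second = shutil.copy(source, os.path.join(folder, 'spam2.txt'))

    copied_names = sorted(os.listdir(folder))

print(os.path.basename(first), os.path.basename(second), copied_names)
```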

While shutil.copy() will copy a single file, shutil.copytree() will copy an entire folder and every folder and file contained in it. Calling shutil.copytree(source, destination) will copy the folder at the path source, along with all of its files and subfolders, to the folder at the path destination. The source and destination parameters are both strings. The function returns a string of the path of the copied folder.

Enter the following into the interactive shell:

>>> import shutil, os
>>> os.chdir('C:\\')
>>> shutil.copytree('C:\\bacon', 'C:\\bacon_backup')
'C:\\bacon_backup'

The shutil.copytree() call creates a new folder named bacon_backup with the same content as the original bacon folder. You have now safely backed up your precious, precious bacon.

Moving and Renaming Files and Folders

Calling shutil.move(source, destination) will move the file or folder at the path source to the path destination and will return a string of the absolute path of the new location.

If destination points to a folder, the source file gets moved into destination and keeps its current filename. For example, enter the following into the interactive shell:

>>> import shutil
>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
'C:\\eggs\\bacon.txt'

Assuming a folder named eggs already exists in the C:\ directory, this shutil.move() call says, “Move C:\bacon.txt into the folder C:\eggs.”

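If there had been a bacon.txt file already in C:\eggs, the outcome depends on how the destination is written. When destination names the folder, as here, current versions of shutil.move() raise a shutil.Error instead of replacing the existing file. When destination spells out the full file path (C:\eggs\bacon.txt), the existing file can be overwritten. Since it’s easy to accidentally overwrite files in this way, you should take some care when using move().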

The destination path can also specify a filename. In the following example, the source file is moved and renamed.

>>> shutil.move('C:\\bacon.txt', 'C:\\eggs\\new_bacon.txt')
'C:\\eggs\\new_bacon.txt'

This line says, “Move C:\bacon.txt into the folder C:\eggs, and while you’re at it, rename that bacon.txt file to new_bacon.txt.”

Both of the previous examples worked under the assumption that there was a folder eggs in the C:\ directory. But if there is no eggs folder, then move() will rename bacon.txt to a file named eggs.

>>> shutil.move('C:\\bacon.txt', 'C:\\eggs')
'C:\\eggs'

Here, move() can’t find a folder named eggs in the C:\ directory and so assumes that destination must be specifying a filename, not a folder. So the bacon.txt text file is renamed to eggs (a text file without the .txt file extension), which is probably not what you wanted! This can be a tough-to-spot bug in your programs since the move() call can happily do something that might be quite different from what you were expecting. This is yet another reason to be careful when using move().
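One way to guard against the renamed-to-eggs surprise is to check the destination yourself before calling move(). The following is a sketch of that idea, not something shutil provides: the helper name move_into_folder is hypothetical, and the demonstration runs in a temporary folder.

```python
import os
import shutil
import tempfile

def move_into_folder(source, folder):
    """Move source into folder, refusing to guess if folder is missing
    or if a same-named file is already there."""
    if not os.path.isdir(folder):
        raise NotADirectoryError('Not an existing folder: ' + folder)
    target = os.path.join(folder, os.path.basename(source))
    if os.path.exists(target):
        raise FileExistsError('Would overwrite: ' + target)
    return shutil.move(source, target)

with tempfile.TemporaryDirectory() as base:
    bacon = os.path.join(base, 'bacon.txt')
    open(bacon, 'w').close()
    eggs = os.path.join(base, 'eggs')
    try:
        move_into_folder(bacon, eggs)
        refused = False
    except NotADirectoryError:
        refused = True            # eggs does not exist yet, so nothing moved
    os.mkdir(eggs)
    moved_to = move_into_folder(bacon, eggs)
    moved_name = os.path.basename(moved_to)

print(refused, moved_name)
```

The extra checks cost two lines but turn a silent rename into an error you will notice right away.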

Finally, the folders that make up the destination must already exist, or else Python will throw an exception. Enter the following into the interactive shell:

>>> shutil.move('spam.txt', 'c:\\does_not_exist\\eggs\\ham')
Traceback (most recent call last):
  File "C:\Python34\lib\shutil.py", line 521, in move
    os.rename(src, real_dst)
FileNotFoundError: [WinError 3] The system cannot find the path specified:
'spam.txt' -> 'c:\\does_not_exist\\eggs\\ham'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<pyshell#29>", line 1, in <module>
    shutil.move('spam.txt', 'c:\\does_not_exist\\eggs\\ham')
  File "C:\Python34\lib\shutil.py", line 533, in move
    copy2(src, real_dst)
  File "C:\Python34\lib\shutil.py", line 244, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "C:\Python34\lib\shutil.py", line 108, in copyfile
    with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: 'c:\\does_not_exist\\
eggs\\ham'

Python looks for eggs and ham inside the directory does_not_exist. It doesn’t find the nonexistent directory, so it can’t move spam.txt to the path you specified.

Permanently Deleting Files and Folders

You can delete a single file or a single empty folder with functions in the os module, whereas to delete a folder and all of its contents, you use the shutil module.

Calling os.unlink(path) will delete the file at path.
Calling os.rmdir(path) will delete the folder at path. This folder must be empty of any files or folders.
Calling shutil.rmtree(path) will remove the folder at path, and all files and folders it contains will also be deleted.

Be careful when using these functions in your programs! It’s often a good idea to first run your program with these calls commented out and with print() calls added to show the files that would be deleted. Here is a Python program that was intended to delete files that have the .txt file extension but has a typo (the '.rxt' string in the endswith() call) that causes it to delete .rxt files instead:

import os
for filename in os.listdir():
    if filename.endswith('.rxt'):
        os.unlink(filename)

If you had any important files ending with .rxt, they would have been accidentally, permanently deleted. Instead, you should have first run the program like this:

import os
for filename in os.listdir():
    if filename.endswith('.rxt'):
        #os.unlink(filename)
        print(filename)

Now the os.unlink() call is commented, so Python ignores it. Instead, you will print the filename of the file that would have been deleted. Running this version of the program first will show you that you’ve accidentally told the program to delete .rxt files instead of .txt files.

Once you are certain the program works as intended, delete the print(filename) line and uncomment the os.unlink(filename) line. Then run the program again to actually delete the files.
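Instead of commenting and uncommenting lines, you could build the preview into the code. The sketch below assumes a hypothetical function named delete_by_extension with a dry_run parameter that defaults to True, so nothing is deleted unless you ask for it explicitly; it works on throwaway files in a temporary folder.

```python
import os
import tempfile

def delete_by_extension(folder, extension, dry_run=True):
    """Delete files in folder ending with extension. With dry_run=True,
    only report which files would be deleted."""
    matched = []
    for filename in os.listdir(folder):
        if filename.endswith(extension):
            matched.append(filename)
            if not dry_run:
                os.unlink(os.path.join(folder, filename))
    return sorted(matched)

with tempfile.TemporaryDirectory() as folder:
    for name in ('notes.txt', 'todo.txt', 'photo.jpg'):
        open(os.path.join(folder, name), 'w').close()

    preview = delete_by_extension(folder, '.txt')       # nothing deleted yet
    after_preview = sorted(os.listdir(folder))
    delete_by_extension(folder, '.txt', dry_run=False)  # the real deletion
    after_delete = sorted(os.listdir(folder))

print(preview, after_preview, after_delete)
```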

Safe Deletes with the send2trash Module

Since Python’s built-in shutil.rmtree() function irreversibly deletes files and folders, it can be dangerous to use. A much better way to delete files and folders is with the third-party send2trash module. You can install this module by running pip install send2trash from a Terminal window. (See Appendix A for a more in-depth explanation of how to install third-party modules.)

Using send2trash is much safer than Python’s regular delete functions, because it will send folders and files to your computer’s trash or recycle bin instead of permanently deleting them. If a bug in your program deletes something with send2trash you didn’t intend to delete, you can later restore it from the recycle bin.

After you have installed send2trash, enter the following into the interactive shell:

>>> import send2trash
>>> baconFile = open('bacon.txt', 'a') # creates the file
>>> baconFile.write('Bacon is not a vegetable.')
25
>>> baconFile.close()
>>> send2trash.send2trash('bacon.txt')

In general, you should always use the send2trash.send2trash() function to delete files and folders. But while sending files to the recycle bin lets you recover them later, it will not free up disk space like permanently deleting them does. If you want your program to free up disk space, use the os and shutil functions for deleting files and folders. Note that the send2trash() function can only send files to the recycle bin; it cannot pull files out of it.

Walking a Directory Tree

Say you want to rename every file in some folder and also every file in every subfolder of that folder. That is, you want to walk through the directory tree, touching each file as you go. Writing a program to do this could get tricky; fortunately, Python provides a function to handle this process for you.

Let’s look at the C:\delicious folder with its contents, shown in Figure 9-1.

Figure 9-1. An example folder that contains three folders and four files

Here is an example program that uses the os.walk() function on the directory tree from Figure 9-1:

import os

for folderName, subfolders, filenames in os.walk('C:\\delicious'):
    print('The current folder is ' + folderName)

    for subfolder in subfolders:
        print('SUBFOLDER OF ' + folderName + ': ' + subfolder)
    for filename in filenames:
        print('FILE INSIDE ' + folderName + ': ' + filename)

    print('')

The os.walk() function is passed a single string value: the path of a folder. You can use os.walk() in a for loop statement to walk a directory tree, much like how you can use the range() function to walk over a range of numbers. Unlike range(), the os.walk() function will return three values on each iteration through the loop:

1. A string of the current folder’s name
2. A list of strings of the folders in the current folder
3. A list of strings of the files in the current folder

(By current folder, I mean the folder for the current iteration of the for loop. The current working directory of the program is not changed by os.walk().)

Just like you can choose the variable name i in the code for i in range(10):, you can also choose the variable names for the three values listed earlier. I usually use the names folderName, subfolders, and filenames.

When you run this program, it will output the following:

The current folder is C:\delicious
SUBFOLDER OF C:\delicious: cats
SUBFOLDER OF C:\delicious: walnut
FILE INSIDE C:\delicious: spam.txt

The current folder is C:\delicious\cats
FILE INSIDE C:\delicious\cats: catnames.txt
FILE INSIDE C:\delicious\cats: zophie.jpg

The current folder is C:\delicious\walnut
SUBFOLDER OF C:\delicious\walnut: waffles

The current folder is C:\delicious\walnut\waffles
FILE INSIDE C:\delicious\walnut\waffles: butter.txt

Since os.walk() returns lists of strings for the subfolders and filenames variables, you can use these lists in their own for loops. Replace the print() function calls with your own custom code. (Or if you don’t need one or both of them, remove the for loops.)
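As an example of replacing the print() calls with your own code, here is a sketch that collects every file with a given extension. The function name find_files is made up for this illustration, and the folder tree is rebuilt inside a temporary folder so the code can run anywhere.

```python
import os
import tempfile

def find_files(root, extension):
    """Return paths (relative to root) of files ending with extension."""
    found = []
    for folder_name, subfolders, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(extension):
                full_path = os.path.join(folder_name, filename)
                found.append(os.path.relpath(full_path, root))
    return sorted(found)

with tempfile.TemporaryDirectory() as root:
    # Recreate a small tree like the one in the walk example.
    os.makedirs(os.path.join(root, 'cats'))
    os.makedirs(os.path.join(root, 'walnut', 'waffles'))
    for parts in (('spam.txt',), ('cats', 'catnames.txt'),
                  ('cats', 'zophie.jpg'), ('walnut', 'waffles', 'butter.txt')):
        open(os.path.join(root, *parts), 'w').close()

    text_files = find_files(root, '.txt')

print(text_files)
```

The subfolders list is not used here, so there is no loop over it; os.walk() still visits every subfolder on its own.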

Compressing Files with the zipfile Module

You may be familiar with ZIP files (with the .zip file extension), which can hold the compressed contents of many other files. Compressing a file reduces its size, which is useful when transferring it over the Internet. And since a ZIP file can also contain multiple files and subfolders, it’s a handy way to package several files into one. This single file, called an archive file, can then be, say, attached to an email.

Your Python programs can both create and open (or extract) ZIP files using functions in the zipfile module. Say you have a ZIP file named example.zip that has the contents shown in Figure 9-2.

You can download this ZIP file from http://nostarch.com/automatestuff/ or just follow along using a ZIP file already on your computer.

Figure 9-2. The contents of example.zip

Reading ZIP Files

To read the contents of a ZIP file, first you must create a ZipFile object (note the capital letters Z and F). ZipFile objects are conceptually similar to the File objects you saw returned by the open() function in the previous chapter: They are values through which the program interacts with the file. To create a ZipFile object, call the zipfile.ZipFile() function, passing it a string of the .zip file’s filename. Note that zipfile is the name of the Python module, and ZipFile() is the name of the function.

For example, enter the following into the interactive shell:

>>> import zipfile, os
>>> os.chdir('C:\\') # move to the folder with example.zip
>>> exampleZip = zipfile.ZipFile('example.zip')
>>> exampleZip.namelist()
['spam.txt', 'cats/', 'cats/catnames.txt', 'cats/zophie.jpg']
>>> spamInfo = exampleZip.getinfo('spam.txt')
>>> spamInfo.file_size
13908
>>> spamInfo.compress_size
3828
➊ >>> 'Compressed file is %sx smaller!' % (round(spamInfo.file_size / spamInfo.compress_size, 2))
'Compressed file is 3.63x smaller!'
>>> exampleZip.close()

A ZipFile object has a namelist() method that returns a list of strings for all the files and folders contained in the ZIP file. These strings can be passed to the getinfo() ZipFile method to return a ZipInfo object about that particular file. ZipInfo objects have their own attributes, such as file_size and compress_size in bytes, which hold integers of the original file size and compressed file size, respectively. While a ZipFile object represents an entire archive file, a ZipInfo object holds useful information about a single file in the archive.

The command at ➊ calculates how efficiently example.zip is compressed by dividing the original file size by the compressed file size and prints this information using a string formatted with %s.
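If you don’t have example.zip, you can make a throwaway archive and inspect it the same way. This sketch assumes nothing beyond the standard library; the file names and the repeated sample text are invented for the demonstration, and the exact compressed size will depend on your Python build.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as folder:
    text_path = os.path.join(folder, 'spam.txt')
    with open(text_path, 'w') as f:
        f.write('spam and eggs ' * 1000)       # 14,000 repetitive characters

    zip_path = os.path.join(folder, 'demo.zip')
    with zipfile.ZipFile(zip_path, 'w') as demo_zip:
        demo_zip.write(text_path, arcname='spam.txt',
                       compress_type=zipfile.ZIP_DEFLATED)

    with zipfile.ZipFile(zip_path) as demo_zip:
        names = demo_zip.namelist()
        info = demo_zip.getinfo('spam.txt')    # a ZipInfo object
        original_size = info.file_size
        compressed_size = info.compress_size

print(names, original_size, compressed_size)
print('Compressed file is %sx smaller!'
      % round(original_size / compressed_size, 2))
```

Using the with statement closes each ZipFile automatically, so no explicit close() call is needed.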

Extracting from ZIP Files

The extractall() method for ZipFile objects extracts all the files and folders from a ZIP file into the current working directory.

>>> import zipfile, os
>>> os.chdir('C:\\') # move to the folder with example.zip
>>> exampleZip = zipfile.ZipFile('example.zip')
➊ >>> exampleZip.extractall()

After running this code, the contents of example.zip will be extracted to C:\. Optionally, you can pass a folder name to extractall() to have it extract the files into a folder other than the current working directory. If the folder passed to the extractall() method does not exist, it will be created. For instance, if you replaced the call at ➊ with exampleZip.extractall('C:\\delicious'), the code would extract the files from example.zip into a newly created C:\delicious folder.

The extract() method for ZipFile objects will extract a single file from the ZIP file. Continue the interactive shell example (the archive must still be open, because calling extract() on a closed ZipFile raises an error):

>>> exampleZip.extract('spam.txt')
'C:\\spam.txt'
>>> exampleZip.extract('spam.txt', 'C:\\some\\new\\folders')
'C:\\some\\new\\folders\\spam.txt'
>>> exampleZip.close()

The string you pass to extract() must match one of the strings in the list returned by namelist(). Optionally, you can pass a second argument to extract() to extract the file into a folder other than the current working directory. If this second argument is a folder that doesn’t yet exist, Python will create the folder. The value that extract() returns is the absolute path to which the file was extracted.

Creating and Adding to ZIP Files

To create your own compressed ZIP files, you must open the ZipFile object in write mode by passing 'w' as the second argument. (This is similar to opening a text file in write mode by passing 'w' to the open() function.)

When you pass a path to the write() method of a ZipFile object, Python will compress the file at that path and add it into the ZIP file. The write() method’s first argument is a string of the filename to add. The compress_type keyword argument is the compression type parameter, which tells the computer what algorithm it should use to compress the files; you can always just set this value to zipfile.ZIP_DEFLATED. (This specifies the deflate compression algorithm, which works well on all types of data.) Enter the following into the interactive shell:

>>> import zipfile
>>> newZip = zipfile.ZipFile('new.zip', 'w')
>>> newZip.write('spam.txt', compress_type=zipfile.ZIP_DEFLATED)
>>> newZip.close()

This code will create a new ZIP file named new.zip that has the compressed contents of spam.txt.

Keep in mind that, just as with writing to files, write mode will erase all existing contents of a ZIP file. If you want to simply add files to an existing ZIP file, pass 'a' as the second argument to zipfile.ZipFile() to open the ZIP file in append mode.
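The difference between write mode and append mode is easy to check for yourself. The sketch below uses the writestr() method, which adds a member directly from a string, so no input files are needed; the member names are placeholders.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as folder:
    zip_path = os.path.join(folder, 'new.zip')

    with zipfile.ZipFile(zip_path, 'w') as new_zip:    # create the archive
        new_zip.writestr('first.txt', 'one')
    with zipfile.ZipFile(zip_path, 'a') as new_zip:    # append keeps first.txt
        new_zip.writestr('second.txt', 'two')
    with zipfile.ZipFile(zip_path) as new_zip:
        after_append = new_zip.namelist()

    with zipfile.ZipFile(zip_path, 'w') as new_zip:    # write mode starts over
        new_zip.writestr('third.txt', 'three')
    with zipfile.ZipFile(zip_path) as new_zip:
        after_rewrite = new_zip.namelist()

print(after_append)
print(after_rewrite)
```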

Project: Renaming Files with American-Style Dates to European-Style Dates

Say your boss emails you thousands of files with American-style dates (MM-DD-YYYY) in their names and needs them renamed to European-style dates (DD-MM-YYYY). This boring task could take all day to do by hand! Let’s write a program to do it instead.

Here’s what the program does:

It searches all the filenames in the current working directory for American-style dates.
When one is found, it renames the file with the month and day swapped to make it European-style.

This means the code will need to do the following:

Create a regex that can identify the text pattern of American-style dates.
Call os.listdir() to find all the files in the working directory.
Loop over each filename, using the regex to check whether it has a date.
If it has a date, rename the file with shutil.move().

For this project, open a new file editor window and save your code as renameDates.py.

Step 1: Create a Regex for American-Style Dates

The first part of the program will need to import the necessary modules and create a regex that can identify MM-DD-YYYY dates. The to-do comments will remind you what’s left to write in this program. Typing them as TODO makes them easy to find using IDLE’s CTRL-F find feature. Make your code look like the following:

#! python3
# renameDates.py - Renames filenames with American MM-DD-YYYY date format
# to European DD-MM-YYYY.

➊ import shutil, os, re

# Create a regex that matches files with the American date format.
➋ datePattern = re.compile(r"""^(.*?) # all text before the date
    ((0|1)?\d)-                     # one or two digits for the month
    ((0|1|2|3)?\d)-                 # one or two digits for the day
    ((19|20)\d\d)                   # four digits for the year
    (.*?)$                          # all text after the date
➌     """, re.VERBOSE)

# TODO: Loop over the files in the working directory.

# TODO: Skip files without a date.

# TODO: Get the different parts of the filename.

# TODO: Form the European-style filename.

# TODO: Get the full, absolute file paths.

# TODO: Rename the files.

From this chapter, you know the shutil.move() function can be used to rename files: Its arguments are the name of the file to rename and the new filename. Because this function exists in the shutil module, you must import that module ➊.

But before renaming the files, you need to identify which files you want to rename. Filenames with dates such as spam4-4-1984.txt and 01-03-2014eggs.zip should be renamed, while filenames without dates such as littlebrother.epub can be ignored.

You can use a regular expression to identify this pattern. After importing the re module at the top, call re.compile() to create a Regex object ➋. Passing re.VERBOSE for the second argument ➌ will allow whitespace and comments in the regex string to make it more readable.

The regular expression string begins with ^(.*?) to match any text at the beginning of the filename that might come before the date. The ((0|1)?\d) group matches the month. The first digit can be either 0 or 1, so the regex matches 12 for December but also 02 for February. This digit is also optional so that the month can be 04 or 4 for April. The group for the day is ((0|1|2|3)?\d) and follows similar logic; 3, 03, and 31 are all valid numbers for days. (Yes, this regex will accept some invalid dates such as 4-31-2014, 2-29-2013, and 0-15-2014. Dates have a lot of thorny special cases that can be easy to miss. But for simplicity, the regex in this program works well enough.)

While 1885 is a valid year, you can just look for years in the 20th or 21st century. This will keep your program from accidentally matching nondate filenames with a date-like format, such as 10-10-1000.txt.

The (.*?)$ part of the regex will match any text that comes after the date.
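Before wiring the regex into the rest of the program, you can check it against the filenames mentioned above. This is a small stand-alone sketch; the variable names samples and matches are made up for the test and are not part of renameDates.py.

```python
import re

date_pattern = re.compile(r"""^(.*?)   # all text before the date
    ((0|1)?\d)-                        # one or two digits for the month
    ((0|1|2|3)?\d)-                    # one or two digits for the day
    ((19|20)\d\d)                      # four digits for the year
    (.*?)$                             # all text after the date
    """, re.VERBOSE)

samples = ['spam4-4-1984.txt', '01-03-2014eggs.zip',
           'littlebrother.epub', '10-10-1000.txt']

# Map each sample filename to True if the regex finds a date in it.
matches = {name: date_pattern.search(name) is not None for name in samples}

for name in samples:
    print(name, '->', 'date found' if matches[name] else 'no date')
```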

Step 2: Identify the Date Parts from the Filenames

Next, the program will have to loop over the list of filename strings returned from os.listdir() and match them against the regex. Any files that do not have a date in them should be skipped. For filenames that have a date, the matched text will be stored in several variables. Fill in the first three TODOs in your program with the following code:

#! python3
# renameDates.py - Renames filenames with American MM-DD-YYYY date format
# to European DD-MM-YYYY.

--snip--

# Loop over the files in the working directory.
for amerFilename in os.listdir('.'):
    mo = datePattern.search(amerFilename)

    # Skip files without a date.
➊     if mo is None:
➋         continue

➌     # Get the different parts of the filename.
    beforePart = mo.group(1)
    monthPart = mo.group(2)
    dayPart = mo.group(4)
    yearPart = mo.group(6)
    afterPart = mo.group(8)

--snip--

If the Match object returned from the search() method is None ➊, then the filename in amerFilename does not match the regular expression. The continue statement ➋ will skip the rest of the loop and move on to the next filename.

Otherwise, the various strings matched in the regular expression groups are stored in variables named beforePart, monthPart, dayPart, yearPart, and afterPart ➌. The strings in these variables will be used to form the European-style filename in the next step.

To keep the group numbers straight, try reading the regex from the beginning and counting up each time you encounter an opening parenthesis. Without thinking about the code, just write an outline of the regular expression. This can help you visualize the groups. For example:

datePattern = re.compile(r"""^(1) # all text before the date
    (2 (3) )-                     # one or two digits for the month
    (4 (5) )-                     # one or two digits for the day
    (6 (7) )                      # four digits for the year
    (8)$                          # all text after the date
    """, re.VERBOSE)

Here, the numbers 1 through 8 represent the groups in the regular expression you wrote. Making an outline of the regular expression, with just the parentheses and group numbers, can give you a clearer understanding of your regex before you move on with the rest of the program.

Step 3: Form the New Filename and Rename the Files

As the final step, concatenate the strings in the variables made in the previous step with the European-style date: The day comes before the month. Fill in the three remaining TODOs in your program with the following code:

#! python3
# renameDates.py - Renames filenames with American MM-DD-YYYY date format
# to European DD-MM-YYYY.

--snip--

    # Form the European-style filename.
➊     euroFilename = (beforePart + dayPart + '-' + monthPart + '-' + yearPart +
                    afterPart)

    # Get the full, absolute file paths.
    absWorkingDir = os.path.abspath('.')
    amerFilename = os.path.join(absWorkingDir, amerFilename)
    euroFilename = os.path.join(absWorkingDir, euroFilename)

    # Rename the files.
➋     print('Renaming "%s" to "%s"...' % (amerFilename, euroFilename))
➌     #shutil.move(amerFilename, euroFilename) # uncomment after testing

Store the concatenated string in a variable named euroFilename ➊. Then, pass the original filename in amerFilename and the new euroFilename variable to the shutil.move() function to rename the file ➌.

This program has the shutil.move() call commented out and instead prints the filenames that will be renamed ➋. Running the program like this first can let you double-check that the files are renamed correctly. Then you can uncomment the shutil.move() call and run the program again to actually rename the files.
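Another way to double-check the renaming logic is to separate it from the file system entirely. The sketch below puts the match-and-swap steps into a function that only works with strings; the function name american_to_european is hypothetical, and the regex is the same pattern written on one line without re.VERBOSE.

```python
import re

DATE_PATTERN = re.compile(
    r'^(.*?)((0|1)?\d)-((0|1|2|3)?\d)-((19|20)\d\d)(.*?)$')

def american_to_european(filename):
    """Return filename with MM-DD-YYYY swapped to DD-MM-YYYY,
    or None if no American-style date is found."""
    mo = DATE_PATTERN.search(filename)
    if mo is None:
        return None
    # Groups 1, 2, 4, 6, and 8 hold the before, month, day, year, and after parts.
    before, month, day, year, after = mo.group(1, 2, 4, 6, 8)
    return before + day + '-' + month + '-' + year + after

for name in ('01-03-2014eggs.zip', 'report12-25-2013.txt',
             'littlebrother.epub'):
    print(name, '->', american_to_european(name))
```

Because the function never touches a file, you can try as many sample names as you like before letting the real program call shutil.move().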

Ideas for Similar Programs

There are many other reasons why you might want to rename a large number of files.

To add a prefix to the start of the filename, such as adding spam_ to rename eggs.txt to spam_eggs.txt
To change filenames with European-style dates to American-style dates
To remove the zeros from files such as spam0042.txt

Project: Backing Up a Folder into a ZIP File

Say you’re working on a project whose files you keep in a folder named C:\AlsPythonBook. You’re worried about losing your work, so you’d like to create ZIP file “snapshots” of the entire folder. You’d like to keep different versions, so you want the ZIP file’s filename to increment each time it is made; for example, AlsPythonBook_1.zip, AlsPythonBook_2.zip, AlsPythonBook_3.zip, and so on. You could do this by hand, but it is rather annoying, and you might accidentally misnumber the ZIP files’ names. It would be much simpler to run a program that does this boring task for you.

For this project, open a new file editor window and save it as backupToZip.py.

Step 1: Figure Out the ZIP File’s Name

The code for this program will be placed into a function named backupToZip(). This will make it easy to copy and paste the function into other Python programs that need this functionality. At the end of the program, the function will be called to perform the backup. Make your program look like this:

#! python3
# backupToZip.py - Copies an entire folder and its contents into
# a ZIP file whose filename increments.

➊ import zipfile, os

def backupToZip(folder):
    # Backup the entire contents of "folder" into a ZIP file.

    folder = os.path.abspath(folder) # make sure folder is absolute

    # Figure out the filename this code should use based on
    # what files already exist.
➋     number = 1
➌     while True:
        zipFilename = os.path.basename(folder) + '_' + str(number) + '.zip'
        if not os.path.exists(zipFilename):
            break
        number = number + 1

➍     # TODO: Create the ZIP file.

    # TODO: Walk the entire folder tree and compress the files in each folder.
    print('Done.')

backupToZip('C:\\delicious')

Do the basics first: Add the shebang (#!) line, describe what the program does, and import the zipfile and os modules ➊.

Define a backupToZip() function that takes just one parameter, folder. This parameter is a string path to the folder whose contents should be backed up. The function will determine what filename to use for the ZIP file it will create; then the function will create the file, walk the folder folder, and add each of the subfolders and files to the ZIP file. Write TODO comments for these steps in the source code to remind yourself to do them later ➍.

The first part, naming the ZIP file, uses the base name of the absolute path of folder. If the folder being backed up is C:\delicious, the ZIP file’s name should be delicious_N.zip, where N = 1 is the first time you run the program, N = 2 is the second time, and so on.

You can determine what N should be by checking whether delicious_1.zip already exists, then checking whether delicious_2.zip already exists, and so on. Use a variable named number for N ➋, and keep incrementing it inside the loop that calls os.path.exists() to check whether the file exists ➌. The first nonexistent filename found will cause the loop to break, since it will have found the filename of the new zip.

Step 2: Create the New ZIP File

Next let’s create the ZIP file. Make your program look like the following:

#! python3
# backupToZip.py - Copies an entire folder and its contents into
# a ZIP file whose filename increments.

--snip--

    while True:
        zipFilename = os.path.basename(folder) + '_' + str(number) + '.zip'
        if not os.path.exists(zipFilename):
            break
        number = number + 1

    # Create the ZIP file.
    print('Creating %s…' % (zipFilename))
    backupZip = zipfile.ZipFile(zipFilename, 'w')  # ➊

    # TODO: Walk the entire folder tree and compress the files in each folder.
    print('Done.')

backupToZip('C:\\delicious')

Now that the new ZIP file’s name is stored in the zipFilename variable, you can call zipfile.ZipFile() to actually create the ZIP file ➊. Be sure to pass 'w' as the second argument so that the ZIP file is opened in write mode.
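As a side note, a ZipFile object can also be used in a with statement, which closes the archive automatically when the block ends. The sketch below is not part of the project code: the make_zip() helper, the note.txt entry, and the temporary folder are assumptions made so the example can run anywhere.

```python
import os
import tempfile
import zipfile

def make_zip(zip_path, text):
    # 'w' creates the archive, replacing any existing file with that name.
    # Leaving the with block closes the archive, even if an error occurs.
    with zipfile.ZipFile(zip_path, 'w') as backup_zip:
        backup_zip.writestr('note.txt', text)  # add one small text entry
    return zip_path

# Demonstrate in a throwaway folder so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = make_zip(os.path.join(tmp, 'demo_1.zip'), 'hello')
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()

print(names)
```

With this form there is no separate close() call to forget.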

Step 3: Walk the Directory Tree and Add to the ZIP File

Now you need to use the os.walk() function to do the work of listing every file in the folder and its subfolders. Make your program look like the following:

#! python3
# backupToZip.py - Copies an entire folder and its contents into
# a ZIP file whose filename increments.

--snip--

    # Walk the entire folder tree and compress the files in each folder.
    for foldername, subfolders, filenames in os.walk(folder):  # ➊
        print('Adding files in %s…' % (foldername))
        # Add the current folder to the ZIP file.
        backupZip.write(foldername)  # ➋
        # Add all the files in this folder to the ZIP file.
        for filename in filenames:  # ➌
            newBase = os.path.basename(folder) + '_'
            if filename.startswith(newBase) and filename.endswith('.zip'):
                continue  # don't backup the backup ZIP files
            backupZip.write(os.path.join(foldername, filename))
    backupZip.close()
    print('Done.')

backupToZip('C:\\delicious')

You can use os.walk() in a for loop ➊, and on each iteration it will return the iteration’s current folder name, the subfolders in that folder, and the filenames in that folder.

In the for loop, the folder is added to the ZIP file ➋. The nested for loop can go through each filename in the filenames list ➌. Each of these is added to the ZIP file, except for previously made backup ZIPs.

When you run this program, it will produce output that will look something like this:

Creating delicious_1.zip…
Adding files in C:\delicious…
Adding files in C:\delicious\cats…
Adding files in C:\delicious\waffles…
Adding files in C:\delicious\walnut…
Adding files in C:\delicious\walnut\waffles…
Done.

The second time you run it, it will put all the files in C:\delicious into a ZIP file named delicious_2.zip, and so on.

Ideas for Similar Programs

You can walk a directory tree and add files to compressed ZIP archives in several other programs. For example, you can write programs that do the following:

Walk a directory tree and archive just files with certain extensions, such as .txt or .py, and nothing else
Walk a directory tree and archive every file except the .txt and .py ones
Find the folder in a directory tree that has the greatest number of files or the folder that uses the most disk space
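The first idea might be sketched as follows. The zip_by_extension() helper and the demo file names are hypothetical. The sketch also passes compression=zipfile.ZIP_DEFLATED, because zipfile stores files uncompressed (ZIP_STORED) unless a compression type is given, and it stores paths relative to the archived folder, which is a design choice rather than something the project above does.

```python
import os
import tempfile
import zipfile

def zip_by_extension(folder, zip_path, extensions=('.txt', '.py')):
    # Hypothetical helper: archive only files ending in one of `extensions`.
    folder = os.path.abspath(folder)
    with zipfile.ZipFile(zip_path, 'w', compression=zipfile.ZIP_DEFLATED) as archive:
        for foldername, subfolders, filenames in os.walk(folder):
            for filename in filenames:
                # str.endswith() accepts a tuple of suffixes.
                if filename.lower().endswith(extensions):
                    full_path = os.path.join(foldername, filename)
                    # Second argument: the name to use inside the archive.
                    archive.write(full_path, os.path.relpath(full_path, folder))

# Small demonstration in throwaway folders.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
    os.makedirs(os.path.join(src, 'cats'))
    for name in ('a.txt', 'b.jpg', os.path.join('cats', 'c.py')):
        with open(os.path.join(src, name), 'w') as f:
            f.write('demo')
    zip_path = os.path.join(out, 'subset.zip')
    zip_by_extension(src, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archived = sorted(archive.namelist())

print(archived)
```

Inverting the endswith() test with not would give the second idea, archiving everything except those extensions.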

Summary

Even if you are an experienced computer user, you probably handle files manually with the mouse and keyboard. Modern file explorers make it easy to work with a few files. But sometimes you’ll need to perform a task that would take hours using your computer’s file explorer.

The os and shutil modules offer functions for copying, moving, renaming, and deleting files. When deleting files, you might want to use the send2trash module to move files to the recycle bin or trash rather than permanently deleting them. And when writing programs that handle files, it’s a good idea to comment out the code that does the actual copy/move/rename/delete and add a print() call instead so you can run the program and verify exactly what it will do.

Often you will need to perform these operations not only on files in one folder but also on every folder in that folder, every folder in those folders, and so on. The os.walk() function handles this trek across the folders for you so that you can concentrate on what your program needs to do with the files in them.

The zipfile module gives you a way of compressing and extracting files in .zip archives through Python. Combined with the file-handling functions of os and shutil, zipfile makes it easy to package up several files from anywhere on your hard drive. These .zip files are much easier to upload to websites or send as email attachments than many separate files.

Previous chapters of this book have provided source code for you to copy. But when you write your own programs, they probably won’t come out perfectly the first time. The next chapter focuses on some Python modules that will help you analyze and debug your programs so that you can quickly get them working correctly.

Practice Questions

Q: 1. What is the difference between shutil.copy() and shutil.copytree()?

Q: 2. What function is used to rename files?

Q: 3. What is the difference between the delete functions in the send2trash and shutil modules?

Q: 4. ZipFile objects have a close() method just like File objects’ close() method. What ZipFile method is equivalent to File objects’ open() method?

Practice Projects

For practice, write programs to do the following tasks.

Selective Copy

Write a program that walks through a folder tree and searches for files with a certain file extension (such as .pdf or .jpg). Copy these files from whatever location they are in to a new folder.

Deleting Unneeded Files

It’s not uncommon for a few unneeded but humongous files or folders to take up the bulk of the space on your hard drive. If you’re trying to free up room on your computer, you’ll get the most bang for your buck by deleting the most massive of the unwanted files. But first you have to find them.

Write a program that walks through a folder tree and searches for exceptionally large files or folders, say, ones that have a file size of more than 100MB. (Remember, to get a file’s size, you can use os.path.getsize() from the os module.) Print these files with their absolute path to the screen.

Filling in the Gaps

Write a program that finds all files with a given prefix, such as spam001.txt, spam002.txt, and so on, in a single folder and locates any gaps in the numbering (such as if there is a spam001.txt and spam003.txt but no spam002.txt). Have the program rename all the later files to close this gap.

As an added challenge, write another program that can insert gaps into numbered files so that a new file can be added.

Chapter 10. Debugging

Now that you know enough to write more complicated programs, you may start finding not-so-simple bugs in them. This chapter covers some tools and techniques for finding the root cause of bugs in your program to help you fix bugs faster and with less effort.

To paraphrase an old joke among programmers, “Writing code accounts for 90 percent of programming. Debugging code accounts for the other 90 percent.”

Your computer will do only what you tell it to do; it won’t read your mind and do what you intended it to do. Even professional programmers create bugs all the time, so don’t feel discouraged if your program has a problem.

Fortunately, there are a few tools and techniques to identify what exactly your code is doing and where it’s going wrong. First, you will look at logging and assertions, two features that can help you detect bugs early. In general, the earlier you catch bugs, the easier they will be to fix.

Second, you will look at how to use the debugger. The debugger is a feature of IDLE that executes a program one instruction at a time, giving you a chance to inspect the values in variables while your code runs, and track how the values change over the course of your program. This is much slower than running the program at full speed, but it is helpful to see the actual values in a program while it runs, rather than deducing what the values might be from the source code.

Raising Exceptions

Python raises an exception whenever it tries to execute invalid code. In Chapter 3, you read about how to handle Python’s exceptions with try and except statements so that your program can recover from exceptions that you anticipated. But you can also raise your own exceptions in your code. Raising an exception is a way of saying, “Stop running the code in this function and move the program execution to the except statement.”

Exceptions are raised with a raise statement. In code, a raise statement consists of the following:

The raise keyword
A call to the Exception() function
A string with a helpful error message passed to the Exception() function

For example, enter the following into the interactive shell:

>>> raise Exception('This is the error message.')
Traceback (most recent call last):
  File "<pyshell#191>", line 1, in <module>
    raise Exception('This is the error message.')
Exception: This is the error message.

If there are no try and except statements covering the raise statement that raised the exception, the program simply crashes and displays the exception’s error message.

Often it’s the code that calls the function, not the function itself, that knows how to handle an exception. So you will commonly see a raise statement inside a function and the try and except statements in the code calling the function. For example, open a new file editor window, enter the following code, and save the program as boxPrint.py:

def boxPrint(symbol, width, height):
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')  # ➊
    if width <= 2:
        raise Exception('Width must be greater than 2.')  # ➋
    if height <= 2:
        raise Exception('Height must be greater than 2.')  # ➌
    print(symbol * width)
    for i in range(height - 2):
        print(symbol + (' ' * (width - 2)) + symbol)
    print(symbol * width)

for sym, w, h in (('*', 4, 4), ('O', 20, 5), ('x', 1, 3), ('ZZ', 3, 3)):
    try:
        boxPrint(sym, w, h)
    except Exception as err:  # ➍
        print('An exception happened: ' + str(err))  # ➎

Here we’ve defined a boxPrint() function that takes a character, a width, and a height, and uses the character to make a little picture of a box with that width and height. This box shape is printed to the console.

Say we want the character to be a single character, and the width and height to be greater than 2. We add if statements to raise exceptions if these requirements aren’t satisfied. Later, when we call boxPrint() with various arguments, our try/except will handle invalid arguments.

This program uses the except Exception as err form of the except statement ➍. If an Exception object is raised by boxPrint() ➊➋➌, this except statement will store it in a variable named err. The Exception object can then be converted to a string by passing it to str() to produce a user-friendly error message ➎. When you run this boxPrint.py, the output will look like this:

****
*  *
*  *
****
OOOOOOOOOOOOOOOOOOOO
O                  O
O                  O
O                  O
OOOOOOOOOOOOOOOOOOOO
An exception happened: Width must be greater than 2.
An exception happened: Symbol must be a single character string.

Using the try and except statements, you can handle errors more gracefully instead of letting the entire program crash.
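The examples here raise the general Exception class. Python also has more specific built-in exception classes, such as ValueError for an argument with an unsuitable value. The sketch below is a hypothetical variant of boxPrint(), named box_lines(), that raises ValueError and returns the rows of the box as a list instead of printing them; both changes are assumptions made for illustration, not part of boxPrint.py.

```python
def box_lines(symbol, width, height):
    # Hypothetical variant of boxPrint(): same checks, but with ValueError,
    # and the rows are returned so the caller decides what to do with them.
    if len(symbol) != 1:
        raise ValueError('Symbol must be a single character string.')
    if width <= 2:
        raise ValueError('Width must be greater than 2.')
    if height <= 2:
        raise ValueError('Height must be greater than 2.')
    middle = symbol + ' ' * (width - 2) + symbol
    return [symbol * width] + [middle] * (height - 2) + [symbol * width]

try:
    box_lines('ZZ', 3, 3)
except ValueError as err:  # catches only ValueError, not every exception
    message = str(err)

print(message)
print('\n'.join(box_lines('*', 4, 3)))
```

Catching a specific class means an unrelated bug, such as a misspelled variable name, still crashes the program instead of being reported as a bad argument.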

Getting the Traceback as a String

When Python encounters an error, it produces a treasure trove of error information called the traceback. The traceback includes the error message, the line number of the line that caused the error, and the sequence of the function calls that led to the error. This sequence of calls is called the call stack.

Open a new file editor window in IDLE, enter the following program, and save it as errorExample.py:

def spam():
    bacon()

def bacon():
    raise Exception('This is the error message.')

spam()

When you run errorExample.py, the output will look like this:

Traceback (most recent call last):
  File "errorExample.py", line 7, in <module>
    spam()
  File "errorExample.py", line 2, in spam
    bacon()
  File "errorExample.py", line 5, in bacon
    raise Exception('This is the error message.')
Exception: This is the error message.

From the traceback, you can see that the error happened on line 5, in the bacon() function. This particular call to bacon() came from line 2, in the spam() function, which in turn was called on line 7. In programs where functions can be called from multiple places, the call stack can help you determine which call led to the error.

The traceback is displayed by Python whenever a raised exception goes unhandled. But you can also obtain it as a string by calling traceback.format_exc(). This function is useful if you want the information from an exception’s traceback but also want an except statement to gracefully handle the exception. You will need to import Python’s traceback module before calling this function.

For example, instead of crashing your program right when an exception occurs, you can write the traceback information to a log file and keep your program running. You can look at the log file later, when you’re ready to debug your program. Enter the following into the interactive shell:

>>> import traceback
>>> try:
        raise Exception('This is the error message.')
except:
        errorFile = open('errorInfo.txt', 'w')
        errorFile.write(traceback.format_exc())
        errorFile.close()
        print('The traceback info was written to errorInfo.txt.')

116
The traceback info was written to errorInfo.txt.

The 116 is the return value from the write() method, since 116 characters were written to the file. The traceback text was written to errorInfo.txt.

Traceback (most recent call last):
  File "<pyshell#28>", line 2, in <module>
Exception: This is the error message.
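The same idea can be wrapped in a function so that the traceback text is available as an ordinary string. This is a minimal sketch: run_and_capture() is a hypothetical helper, and it uses except Exception rather than a bare except so that events such as a keyboard interrupt are not swallowed.

```python
import traceback

def run_and_capture(func):
    # Hypothetical helper: call func() and return the traceback text if it
    # raises an exception, or None if it finishes normally.
    try:
        func()
    except Exception:
        return traceback.format_exc()
    return None

def bacon():
    raise Exception('This is the error message.')

trace_text = run_and_capture(bacon)
print(trace_text)
```

The returned string could then be written to a file or passed to a logging call.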

Assertions

An assertion is a sanity check to make sure your code isn’t doing something obviously wrong. These sanity checks are performed by assert statements. If the sanity check fails, then an AssertionError exception is raised. In code, an assert statement consists of the following:

The assert keyword
A condition (that is, an expression that evaluates to True or False)
A comma
A string to display when the condition is False

For example, enter the following into the interactive shell:

>>> podBayDoorStatus = 'open'
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
>>> podBayDoorStatus = 'I\'m sorry, Dave. I\'m afraid I can\'t do that.'
>>> assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    assert podBayDoorStatus == 'open', 'The pod bay doors need to be "open".'
AssertionError: The pod bay doors need to be "open".

Here we’ve set podBayDoorStatus to 'open', so from now on, we fully expect the value of this variable to be 'open'. In a program that uses this variable, we might have written a lot of code under the assumption that the value is 'open', code that depends on its being 'open' in order to work as we expect. So we add an assertion to make sure we’re right to assume podBayDoorStatus is 'open'. Here, we include the message 'The pod bay doors need to be "open".' so it’ll be easy to see what’s wrong if the assertion fails.

Later, say we make the obvious mistake of assigning podBayDoorStatus another value, but don’t notice it among many lines of code. The assertion catches this mistake and clearly tells us what’s wrong.

In plain English, an assert statement says, “I assert that this condition holds true, and if not, there is a bug somewhere in the program.” Unlike exceptions, your code should not handle assert statements with try and except; if an assert fails, your program should crash. By failing fast like this, you shorten the time between the original cause of the bug and when you first notice the bug. This will reduce the amount of code you will have to check before finding the code that’s causing the bug.

Assertions are for programmer errors, not user errors. For errors that can be recovered from (such as a file not being found or the user entering invalid data), raise an exception instead of detecting it with an assert statement.
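The difference between the two kinds of error can be seen side by side in one small function. The parse_age() function below is a hypothetical example, not something from this chapter: bad user input raises an exception the caller can handle, while the assert guards a condition that can only fail if the function itself is wrong.

```python
def parse_age(text):
    # User error: the caller can recover (for example, by asking again),
    # so raise an exception.
    if not text.isdigit():
        raise ValueError('Please enter a whole number.')
    age = int(text)
    # Programmer error: a string of digits cannot produce a negative number,
    # so if this ever fails, the code above has a bug.
    assert age >= 0, 'age should never be negative: ' + str(age)
    return age

try:
    parse_age('abc')
except ValueError as err:
    user_message = str(err)  # something sensible to show the user

print(parse_age('42'), user_message)
```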

Using an Assertion in a Traffic Light Simulation

Say you’re building a traffic light simulation program. The data structure representing the stoplights at an intersection is a dictionary with keys 'ns' and 'ew', for the stoplights facing north-south and east-west, respectively. The values at these keys will be one of the strings 'green', 'yellow', or 'red'. The code would look something like this:

market_2nd = {'ns': 'green', 'ew': 'red'}
mission_16th = {'ns': 'red', 'ew': 'green'}

These two variables will be for the intersections of Market Street and 2nd Street, and Mission Street and 16th Street. To start the project, you want to write a switchLights() function, which will take an intersection dictionary as an argument and switch the lights.

At first, you might think that switchLights() should simply switch each light to the next color in the sequence: Any 'green' values should change to 'yellow', 'yellow' values should change to 'red', and 'red' values should change to 'green'. The code to implement this idea might look like this:

def switchLights(stoplight):
    for key in stoplight.keys():
        if stoplight[key] == 'green':
            stoplight[key] = 'yellow'
        elif stoplight[key] == 'yellow':
            stoplight[key] = 'red'
        elif stoplight[key] == 'red':
            stoplight[key] = 'green'

switchLights(market_2nd)

You may already see the problem with this code, but let’s pretend you wrote the rest of the simulation code, thousands of lines long, without noticing it. When you finally do run the simulation, the program doesn’t crash, but your virtual cars do!

Since you’ve already written the rest of the program, you have no idea where the bug could be. Maybe it’s in the code simulating the cars or in the code simulating the virtual drivers. It could take hours to trace the bug back to the switchLights() function.

But if while writing switchLights() you had added an assertion to check that at least one of the lights is always red, you might have included the following at the bottom of the function:

    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)

With this assertion in place, your program would crash with this error message:

Traceback (most recent call last):
  File "carSim.py", line 14, in <module>
    switchLights(market_2nd)
  File "carSim.py", line 13, in switchLights
    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)
➊ AssertionError: Neither light is red! {'ns': 'yellow', 'ew': 'green'}

The important line here is the AssertionError ➊. While your program crashing is not ideal, it immediately points out that a sanity check failed: Neither direction of traffic has a red light, meaning that traffic could be going both ways. By failing fast early in the program’s execution, you can save yourself a lot of future debugging effort.
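One possible way to satisfy the assertion, offered only as a sketch, is to switch the whole intersection from one known-safe state to the next rather than changing each light independently. The SAFE_CYCLE list and the switch_lights_safely() function below are assumptions for illustration; the simulation described here does not specify how the lights should really be sequenced.

```python
# Assumed cycle of whole-intersection states; one direction is always red.
SAFE_CYCLE = [
    {'ns': 'green', 'ew': 'red'},
    {'ns': 'yellow', 'ew': 'red'},
    {'ns': 'red', 'ew': 'green'},
    {'ns': 'red', 'ew': 'yellow'},
]

def switch_lights_safely(stoplight):
    # Hypothetical replacement for switchLights(): find the current state in
    # the cycle and move the intersection to the state that follows it.
    current = SAFE_CYCLE.index(stoplight)
    next_state = SAFE_CYCLE[(current + 1) % len(SAFE_CYCLE)]
    stoplight.update(next_state)
    # The same sanity check as before; it should now never fail.
    assert 'red' in stoplight.values(), 'Neither light is red! ' + str(stoplight)

market_2nd = {'ns': 'green', 'ew': 'red'}
seen = []
for _ in range(4):
    switch_lights_safely(market_2nd)
    seen.append(dict(market_2nd))  # record a copy of each state

print(seen)
```

The assertion stays in place even though it should never fire: if a later change to the cycle breaks the rule, the check will report it immediately.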

Disabling Assertions

Assertions can be disabled by passing the -O option when running Python. This is good for when you have finished writing and testing your program and don’t want it to be slowed down by performing sanity checks (although most of the time assert statements do not cause a noticeable speed difference). Assertions are for development, not the final product. By the time you hand off your program to someone else to run, it should be free of bugs and not require the sanity checks. See Appendix B for details about how to launch your probably-not-insane programs with the -O option.
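The effect of -O can be observed from within Python by launching the interpreter twice on the same one-line program. This sketch assumes Python 3.7 or later (for the capture_output argument) and that launching a child process is permitted on your system; the one-line program in code is made up for the demonstration.

```python
import subprocess
import sys

# A tiny program whose only statement is a failing assertion.
code = "assert False, 'sanity check failed'"

# sys.executable is the path of the Python interpreter running this script.
normal = subprocess.run([sys.executable, '-c', code],
                        capture_output=True, text=True)
optimized = subprocess.run([sys.executable, '-O', '-c', code],
                           capture_output=True, text=True)

print('without -O, exit code:', normal.returncode)
print('with -O, exit code:', optimized.returncode)
```

Without -O the child process ends with an AssertionError traceback on its error stream; with -O the assert statement is skipped and the process exits normally.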

Logging

If you’ve ever put a print() statement in your code to output some variable’s value while your program is running, you’ve used a form of logging to debug your code. Logging is a great way to understand what’s happening in your program and in what order it’s happening. Python’s logging module makes it easy to create a record of custom messages that you write. These log messages will describe when the program execution has reached the logging function call and list any variables you have specified at that point in time. On the other hand, a missing log message indicates a part of the code was skipped and never executed.

Using the logging Module

To enable the logging module to display log messages on your screen as your program runs, copy the following to the top of your program (but under the #! python shebang line):

import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

You don’t need to worry too much about how this works, but basically, when Python logs an event, it creates a LogRecord object that holds information about that event. The logging module’s basicConfig() function lets you specify what details about the LogRecord object you want to see and how you want those details displayed.

Say you wrote a function to calculate the factorial of a number. In mathematics, factorial 4 is 1 × 2 × 3 × 4, or 24. Factorial 7 is 1 × 2 × 3 × 4 × 5 × 6 × 7, or 5,040. Open a new file editor window and enter the following code. It has a bug in it, but you will also enter several log messages to help yourself figure out what is going wrong. Save the program as factorialLog.py.

import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logging.debug('Start of program')

def factorial(n):
    logging.debug('Start of factorial(%s)' % (n))
    total = 1
    for i in range(n + 1):
        total *= i
        logging.debug('i is ' + str(i) + ', total is ' + str(total))
    logging.debug('End of factorial(%s)' % (n))
    return total

print(factorial(5))
logging.debug('End of program')

Here, we use the logging.debug() function when we want to print log information. Each debug() call prints one line of information in the format we specified in basicConfig(), including the message we passed to debug(). The print(factorial(5)) call is part of the original program, so the result is displayed even if logging messages are disabled.

The output of this program looks like this:

2015-05-23 16:20:12,664 - DEBUG - Start of program
2015-05-23 16:20:12,664 - DEBUG - Start of factorial(5)
2015-05-23 16:20:12,665 - DEBUG - i is 0, total is 0
2015-05-23 16:20:12,668 - DEBUG - i is 1, total is 0
2015-05-23 16:20:12,670 - DEBUG - i is 2, total is 0
2015-05-23 16:20:12,673 - DEBUG - i is 3, total is 0
2015-05-23 16:20:12,675 - DEBUG - i is 4, total is 0
2015-05-23 16:20:12,678 - DEBUG - i is 5, total is 0
2015-05-23 16:20:12,680 - DEBUG - End of factorial(5)
0
2015-05-23 16:20:12,684 - DEBUG - End of program

The factorial() function is returning 0 as the factorial of 5, which isn’t right. The for loop should be multiplying the value in total by the numbers from 1 to 5. But the log messages displayed by logging.debug() show that the i variable is starting at 0 instead of 1. Since zero times anything is zero, the rest of the iterations also have the wrong value for total. Logging messages provide a trail of breadcrumbs that can help you figure out when things started to go wrong.

Change the for i in range(n + 1): line to for i in range(1, n + 1):, and run the program again. The output will look like this:

2015-05-23 17:13:40,650 - DEBUG - Start of program
2015-05-23 17:13:40,651 - DEBUG - Start of factorial(5)
2015-05-23 17:13:40,651 - DEBUG - i is 1, total is 1
2015-05-23 17:13:40,654 - DEBUG - i is 2, total is 2
2015-05-23 17:13:40,656 - DEBUG - i is 3, total is 6
2015-05-23 17:13:40,659 - DEBUG - i is 4, total is 24
2015-05-23 17:13:40,661 - DEBUG - i is 5, total is 120
2015-05-23 17:13:40,661 - DEBUG - End of factorial(5)
120
2015-05-23 17:13:40,666 - DEBUG - End of program

The factorial(5) call correctly returns 120. The log messages showed what was going on inside the loop, which led straight to the bug.

You can see that the logging.debug() calls printed out not just the strings passed to them but also a timestamp and the word DEBUG.

Don’t Debug with print()

Typing import logging and logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s') is somewhat unwieldy. You may want to use print() calls instead, but don’t give in to this temptation! Once you’re done debugging, you’ll end up spending a lot of time removing print() calls from your code for each log message. You might even accidentally remove some print() calls that were being used for nonlog messages. The nice thing about log messages is that you’re free to fill your program with as many as you like, and you can always disable them later by adding a single logging.disable(logging.CRITICAL) call. Unlike print(), the logging module makes it easy to switch between showing and hiding log messages.

Log messages are intended for the programmer, not the user. The user won’t care about the contents of some dictionary value you need to see to help with debugging; use a log message for something like that. For messages that the user will want to see, like File not found or Invalid input, please enter a number, you should use a print() call. You don’t want to deprive the user of useful information after you’ve disabled log messages.

Logging Levels

Logging levels provide a way to categorize your log messages by importance. There are five logging levels, described in Table 10-1 from least to most important. Messages can be logged at each level using a different logging function.

Table 10-1. Logging Levels in Python

Level | Logging Function | Description
DEBUG | logging.debug() | The lowest level. Used for small details. Usually you care about these messages only when diagnosing problems.
INFO | logging.info() | Used to record information on general events in your program or confirm that things are working at their point in the program.
WARNING | logging.warning() | Used to indicate a potential problem that doesn’t prevent the program from working but might do so in the future.
ERROR | logging.error() | Used to record an error that caused the program to fail to do something.
CRITICAL | logging.critical() | The highest level. Used to indicate a fatal error that has caused or is about to cause the program to stop running entirely.

Your logging message is passed as a string to these functions. The logging levels are suggestions. Ultimately, it is up to you to decide which category your log message falls into. Enter the following into the interactive shell:

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
>>> logging.debug('Some debugging details.')
2015-05-18 19:04:26,901 - DEBUG - Some debugging details.
>>> logging.info('The logging module is working.')
2015-05-18 19:04:35,569 - INFO - The logging module is working.
>>> logging.warning('An error message is about to be logged.')
2015-05-18 19:04:56,843 - WARNING - An error message is about to be logged.
>>> logging.error('An error has occurred.')
2015-05-18 19:05:07,737 - ERROR - An error has occurred.
>>> logging.critical('The program is unable to recover!')
2015-05-18 19:05:45,794 - CRITICAL - The program is unable to recover!

The benefit of logging levels is that you can change what priority of logging message you want to see. Passing logging.DEBUG to the basicConfig() function’s level keyword argument will show messages from all the logging levels (DEBUG being the lowest level). But after developing your program some more, you may be interested only in errors. In that case, you can set basicConfig()’s level argument to logging.ERROR. This will show only ERROR and CRITICAL messages and skip the DEBUG, INFO, and WARNING messages.
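The filtering effect of a level can be checked in a short script. To make the output easy to capture, this sketch swaps in a named logger that writes to an in-memory io.StringIO stream instead of using basicConfig() and the screen; the logger name 'levels_demo' and the shortened format string are assumptions for the demonstration.

```python
import io
import logging

stream = io.StringIO()  # in-memory stand-in for the screen
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))

logger = logging.getLogger('levels_demo')  # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False        # keep messages out of the root logger
logger.setLevel(logging.ERROR)  # show only ERROR and CRITICAL

logger.debug('Some debugging details.')
logger.info('The logging module is working.')
logger.warning('An error message is about to be logged.')
logger.error('An error has occurred.')
logger.critical('The program is unable to recover!')

shown = stream.getvalue().splitlines()
print(shown)
```

Only the last two messages reach the stream; the three lower-level calls are skipped without any change to the calls themselves.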

Disabling Logging

After you’ve debugged your program, you probably don’t want all these log messages cluttering the screen. The logging.disable() function disables these so that you don’t have to go into your program and remove all the logging calls by hand. You simply pass logging.disable() a logging level, and it will suppress all log messages at that level or lower. So if you want to disable logging entirely, just add logging.disable(logging.CRITICAL) to your program. For example, enter the following into the interactive shell:

>>> import logging
>>> logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
>>> logging.critical('Critical error! Critical error!')
2015-05-22 11:10:48,054 - CRITICAL - Critical error! Critical error!
>>> logging.disable(logging.CRITICAL)
>>> logging.critical('Critical error! Critical error!')
>>> logging.error('Error! Error!')

Since logging.disable() will disable all messages after it, you will probably want to add it near the import logging line of code in your program. This way, you can easily find it to comment out or uncomment that call to enable or disable logging messages as needed.
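Besides commenting the call out, logging can also be switched back on while the program runs by calling logging.disable(logging.NOTSET). The sketch below demonstrates this with a named logger and an in-memory stream; both are assumptions chosen so the result can be inspected, and the logger name 'disable_demo' is made up.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger('disable_demo')  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False
logger.setLevel(logging.DEBUG)

logging.disable(logging.CRITICAL)  # suppress CRITICAL and everything below
logger.critical('hidden')

logging.disable(logging.NOTSET)    # lift the suppression again
logger.critical('visible')

lines = stream.getvalue().splitlines()
print(lines)
```

Note that logging.disable() applies to every logger in the program, not just the one used here.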

Logging to a File

Instead of displaying the log messages to the screen, you can write them to a text file. The logging.basicConfig() function takes a filename keyword argument, like so:

import logging
logging.basicConfig(filename='myProgramLog.txt', level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')

The log messages will be saved to myProgramLog.txt. While logging messages are helpful, they can clutter your screen and make it hard to read the program’s output. Writing the logging messages to a file will keep your screen clear and store the messages so you can read them after running the program. You can open this text file in any text editor, such as Notepad or TextEdit.
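One detail worth knowing is that basicConfig() has no effect if the root logger has already been configured, so calling it a second time in the same session will not redirect output to a file. The sketch below sidesteps that by attaching a logging.FileHandler to a named logger, which is a different technique from the basicConfig() call above. The logger name 'file_demo' and the temporary folder are assumptions made so the example cleans up after itself.

```python
import logging
import os
import tempfile

with tempfile.TemporaryDirectory() as log_dir:
    log_path = os.path.join(log_dir, 'myProgramLog.txt')

    # A FileHandler writes each log message to the given file.
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))

    logger = logging.getLogger('file_demo')  # hypothetical logger name
    logger.addHandler(handler)
    logger.propagate = False
    logger.setLevel(logging.DEBUG)

    logger.debug('Start of program')

    # Detach and close the handler so the file is flushed and released.
    logger.removeHandler(handler)
    handler.close()

    with open(log_path) as log_file:
        contents = log_file.read()

print(contents)
```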

IDLE’s Debugger

The debugger is a feature of IDLE that allows you to execute your program one line at a time. The debugger will run a single line of code and then wait for you to tell it to continue. By running your program “under the debugger” like this, you can take as much time as you want to examine the values in the variables at any given point during the program’s lifetime. This is a valuable tool for tracking down bugs.

To enable IDLE’s debugger, click Debug ▸ Debugger in the interactive shell window. This will bring up the Debug Control window, which looks like Figure 10-1.

When the Debug Control window appears, select all four of the Stack, Locals, Source, and Globals checkboxes so that the window shows the full set of debug information. While the Debug Control window is displayed, any time you run a program from the file editor, the debugger will pause execution before the first instruction and display the following:

The line of code that is about to be executed
A list of all local variables and their values
A list of all global variables and their values

Figure 10-1. The Debug Control window

You’ll notice that in the list of global variables there are several variables you haven’t defined, such as __builtins__, __doc__, __file__, and so on. These are variables that Python automatically sets whenever it runs a program. The meaning of these variables is beyond the scope of this book, and you can comfortably ignore them.

The program will stay paused until you press one of the five buttons in the Debug Control window: Go, Step, Over, Out, or Quit.

Go

Clicking the Go button will cause the program to execute normally until it terminates or reaches a breakpoint. (Breakpoints are described later in this chapter.) If you are done debugging and want the program to continue normally, click the Go button.

Step

Clicking the Step button will cause the debugger to execute the next line of code and then pause again. The Debug Control window’s list of global and local variables will be updated if their values change. If the next line of code is a function call, the debugger will “step into” that function and jump to the first line of code of that function.

Over

Clicking the Over button will execute the next line of code, similar to the Step button. However, if the next line of code is a function call, the Over button will “step over” the code in the function. The function’s code will be executed at full speed, and the debugger will pause as soon as the function call returns. For example, if the next line of code is a print() call, you don’t really care about code inside the built-in print() function; you just want the string you pass it printed to the screen. For this reason, using the Over button is more common than the Step button.

Out

Clicking the Out button will cause the debugger to execute lines of code at full speed until it returns from the current function. If you have stepped into a function call with the Step button and now simply want to keep executing instructions until you get back out, click the Out button to “step out” of the current function call.

Quit

If you want to stop debugging entirely and not bother to continue executing the rest of the program, click the Quit button. The Quit button will immediately terminate the program. If you want to run your program normally again, select Debug ▸ Debugger again to disable the debugger.

Debugging a Number Adding Program

Open a new file editor window and enter the following code:

print('Enter the first number to add:')
first = input()
print('Enter the second number to add:')
second = input()
print('Enter the third number to add:')
third = input()
print('The sum is ' + first + second + third)

Save it as buggyAddingProgram.py and run it first without the debugger enabled. The program will output something like this:

Enter the first number to add:
5
Enter the second number to add:
3
Enter the third number to add:
42
The sum is 5342

The program hasn’t crashed, but the sum is obviously wrong. Let’s enable the Debug Control window and run it again, this time under the debugger.

When you press F5 or select Run ▸ Run Module (with Debug ▸ Debugger enabled and all four checkboxes on the Debug Control window checked), the program starts in a paused state on line 1. The debugger will always pause on the line of code it is about to execute. The Debug Control window will look like Figure 10-2.

Figure 10-2. The Debug Control window when the program first starts under the debugger

Click the Over button once to execute the first print() call. You should use Over instead of Step here, since you don’t want to step into the code for the print() function. The Debug Control window will update to line 2, and line 2 in the file editor window will be highlighted, as shown in Figure 10-3. This shows you where the program execution currently is.

Figure 10-3. The Debug Control window after clicking Over

Click Over again to execute the input() function call, and the buttons in the Debug Control window will disable themselves while IDLE waits for you to type something for the input() call into the interactive shell window. Enter 5 and press Return. The Debug Control window buttons will be reenabled.

Keep clicking Over, entering 3 and 42 as the next two numbers, until the debugger is on line 7, the final print() call in the program. The Debug Control window should look like Figure 10-4. You can see in the Globals section that the first, second, and third variables are set to string values '5', '3', and '42' instead of integer values 5, 3, and 42. When the last line is executed, these strings are concatenated instead of added together, causing the bug.

Figure 10-4. The Debug Control window on the last line. The variables are set to strings, causing the bug.
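Once the debugger has shown that the variables hold strings, one way to fix the bug is to convert each value with int() before adding. The add_three() function below is a hypothetical helper written for this sketch; it takes the strings as arguments instead of calling input() so that it can be run without typing anything.

```python
def add_three(first, second, third):
    # input() always returns strings, so convert each one before adding.
    return int(first) + int(second) + int(third)

# What the buggy program computes: string concatenation.
concatenated = '5' + '3' + '42'

# What was intended: integer addition.
total = add_three('5', '3', '42')

print('Concatenated: ' + concatenated)
print('The sum is ' + str(total))
```

Note the str() call in the last line: the integer sum has to be converted back to a string before it can be joined to the message.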

Stepping through the program with the debugger is helpful but can also be slow. Often you’ll want the program to run normally until it reaches a certain line of code. You can configure the debugger to do this with breakpoints.

Breakpoints

A breakpoint can be set on a specific line of code and forces the debugger to pause whenever the program execution reaches that line. Open a new file editor window and enter the following program, which simulates flipping a coin 1,000 times. Save it as coinFlip.py.

import random
heads = 0
for i in range(1, 1001):
    if random.randint(0, 1) == 1:  # ➊
        heads = heads + 1
    if i == 500:
        print('Halfway done!')  # ➋
print('Heads came up ' + str(heads) + ' times.')

The random.randint(0, 1) call ➊ will return 0 half of the time and 1 the other half of the time. This can be used to simulate a 50/50 coin flip where 1 represents heads. When you run this program without the debugger, it quickly outputs something like the following:

Halfway done!
Heads came up 490 times.

If you ran this program under the debugger, you would have to click the Over button thousands of times before the program terminated. If you were interested in the value of heads at the halfway point of the program’s execution, when 500 of 1000 coin flips have been completed, you could instead just set a breakpoint on the line print('Halfway done!') ➋. To set a breakpoint, right-click the line in the file editor and select Set Breakpoint, as shown in Figure 10-5.

Figure 10-5. Setting a breakpoint

You don’t want to set a breakpoint on the if statement line, since the if statement is executed on every single iteration through the loop. By setting the breakpoint on the code in the if statement, the debugger breaks only when the execution enters the if clause.

The line with the breakpoint will be highlighted in yellow in the file editor. When you run the program under the debugger, it will start in a paused state at the first line, as usual. But if you click Go, the program will run at full speed until it reaches the line with the breakpoint set on it. You can then click Go, Over, Step, or Out to continue as normal.

If you want to remove a breakpoint, right-click the line in the file editor and select Clear Breakpoint from the menu. The yellow highlighting will go away, and the debugger will not break on that line in the future.

Summary

Assertions, exceptions, logging, and the debugger are all valuable tools to find and prevent bugs in your program. Assertions with the Python assert statement are a good way to implement “sanity checks” that give you an early warning when a necessary condition doesn’t hold true. Assertions are only for errors that the program shouldn’t try to recover from and should fail fast. Otherwise, you should raise an exception.

An exception can be caught and handled by the try and except statements. The logging module is a good way to look into your code while it’s running and is much more convenient to use than the print() function because of its different logging levels and ability to log to a text file.

The debugger lets you step through your program one line at a time. Alternatively, you can run your program at normal speed and have the debugger pause execution whenever it reaches a line with a breakpoint set. Using the debugger, you can see the state of any variable’s value at any point during the program’s lifetime.

These debugging tools and techniques will help you write programs that work. Accidentally introducing bugs into your code is a fact of life, no matter how many years of coding experience you have.

Practice Questions

Q: 1. Write an assert statement that triggers an AssertionError if the variable spam is an integer less than 10.

Q: 2. Write an assert statement that triggers an AssertionError if the variables eggs and bacon contain strings that are the same as each other, even if their cases are different (that is, 'hello' and 'Hello' are considered the same, and 'goodbye' and 'GOODbye' are also considered the same).

Q: 3. Write an assert statement that always triggers an AssertionError.

Q: 4. What are the two lines that your program must have in order to be able to call logging.debug()?

Q: 5. What are the two lines that your program must have in order to have logging.debug() send a logging message to a file named programLog.txt?

Q: 6. What are the five logging levels?

Q: 7. What line of code can you add to disable all logging messages in your program?

Q: 8. Why is using logging messages better than using print() to display the same message?

Q: 9. What are the differences between the Step, Over, and Out buttons in the Debug Control window?

Q: 10. After you click Go in the Debug Control window, when will the debugger stop?

Q: 11. What is a breakpoint?

Q: 12. How do you set a breakpoint on a line of code in IDLE?

Practice Project

For practice, write a program that does the following.

Debugging Coin Toss

The following program is meant to be a simple coin toss guessing game. The player gets two guesses (it’s an easy game). However, the program has several bugs in it. Run through the program a few times to find the bugs that keep the program from working correctly.

import random
guess = ''
while guess not in ('heads', 'tails'):
    print('Guess the coin toss! Enter heads or tails:')
    guess = input()
toss = random.randint(0, 1) # 0 is tails, 1 is heads
if toss == guess:
    print('You got it!')
else:
    print('Nope! Guess again!')
    guesss = input()
    if toss == guess:
        print('You got it!')
    else:
        print('Nope. You are really bad at this game.')

Chapter 11. Web Scraping

In those rare, terrifying moments when I’m without Wi-Fi, I realize just how much of what I do on the computer is really what I do on the Internet. Out of sheer habit I’ll find myself trying to check email, read friends’ Twitter feeds, or answer the question, “Did Kurtwood Smith have any major roles before he was in the original 1987 Robocop?”[2]

Since so much work on a computer involves going on the Internet, it’d be great if your programs could get online. Web scraping is the term for using a program to download and process content from the Web. For example, Google runs many web scraping programs to index web pages for its search engine. In this chapter, you will learn about several modules that make it easy to scrape web pages in Python.

webbrowser. Comes with Python and opens a browser to a specific page.
Requests. Downloads files and web pages from the Internet.
Beautiful Soup. Parses HTML, the format that web pages are written in.
Selenium. Launches and controls a web browser. Selenium is able to fill in forms and simulate mouse clicks in this browser.

Project: mapit.py with the webbrowser Module

The webbrowser module’s open() function can launch a new browser to a specified URL. Enter the following into the interactive shell:

>>> import webbrowser
>>> webbrowser.open('http://inventwithpython.com/')

A web browser tab will open to the URL http://inventwithpython.com/. This is about the only thing the webbrowser module can do. Even so, the open() function does make some interesting things possible. For example, it’s tedious to copy a street address to the clipboard and bring up a map of it on Google Maps. You could take a few steps out of this task by writing a simple script to automatically launch the map in your browser using the contents of your clipboard. This way, you only have to copy the address to the clipboard and run the script, and the map will be loaded for you.

This is what your program does:

Gets a street address from the command line arguments or clipboard.
Opens the web browser to the Google Maps page for the address.

This means your code will need to do the following:

Read the command line arguments from sys.argv.
Read the clipboard contents.
Call the webbrowser.open() function to open the web browser.

Open a new file editor window and save it as mapIt.py.

Step 1: Figure Out the URL

Based on the instructions in Appendix B, set up mapIt.py so that when you run it from the command line, like so…

C:\> mapit 870 Valencia St, San Francisco, CA 94110

…the script will use the command line arguments instead of the clipboard. If there are no command line arguments, then the program will know to use the contents of the clipboard.

First you need to figure out what URL to use for a given street address. When you load http://maps.google.com/ in the browser and search for an address, the URL in the address bar looks something like this:

https://www.google.com/maps/place/870+Valencia+St/@37.7590311,-122.4215096,17z/data=!3m1!4b1!4m2!3m1!1s0x808f7e3dadc07a37:0xc86b0b2bb93b73d8

The address is in the URL, but there’s a lot of additional text there as well. Websites often add extra data to URLs to help track visitors or customize sites. But if you try just going to https://www.google.com/maps/place/870+Valencia+St+San+Francisco+CA/, you’ll find that it still brings up the correct page. So your program can be set to open a web browser to 'https://www.google.com/maps/place/your_address_string' (where your_address_string is the address you want to map).
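The URL pattern above can be sketched with the standard library’s urllib.parse.quote_plus(), which turns spaces into + signs the way the example URLs do. The helper name build_maps_url is hypothetical, and I am assuming Google Maps accepts the percent-encoded comma (%2C); the mapIt.py program in this chapter simply concatenates the raw address instead.

```python
from urllib.parse import quote_plus

# Base URL taken from the text; everything after it is the address.
MAPS_BASE = 'https://www.google.com/maps/place/'

def build_maps_url(address):
    """Return a Google Maps URL for address (hypothetical helper)."""
    # quote_plus() turns spaces into '+' and escapes characters such as ','.
    return MAPS_BASE + quote_plus(address)

url = build_maps_url('870 Valencia St, San Francisco, CA 94110')
print(url)
```

Escaping the address is optional for simple street addresses, but it keeps characters such as # or & from being misread as part of the URL’s structure.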

Step 2: Handle the Command Line Arguments

Make your code look like this:

#! python3
# mapIt.py - Launches a map in the browser using an address from the
# command line or clipboard.

import webbrowser, sys
if len(sys.argv) > 1:
    # Get address from command line.
    address = ' '.join(sys.argv[1:])

# TODO: Get address from clipboard.

After the program’s #! shebang line, you need to import the webbrowser module for launching the browser and import the sys module for reading the potential command line arguments. The sys.argv variable stores a list of the program’s filename and command line arguments. If this list has more than just the filename in it, then len(sys.argv) evaluates to an integer greater than 1, meaning that command line arguments have indeed been provided.

Command line arguments are usually separated by spaces, but in this case, you want to interpret all of the arguments as a single string. Since sys.argv is a list of strings, you can pass it to the join() method, which returns a single string value. You don’t want the program name in this string, so instead of sys.argv, you should pass sys.argv[1:] to chop off the first element of the list. The final string that this expression evaluates to is stored in the address variable.

If you run the program by entering this into the command line…

mapit 870 Valencia St, San Francisco, CA 94110

…the sys.argv variable will contain this list value:

['mapIt.py', '870', 'Valencia', 'St,', 'San', 'Francisco,', 'CA', '94110']

The address variable will contain the string '870 Valencia St, San Francisco, CA 94110'.
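You can check the slicing and joining without opening a command prompt. In this sketch the list named argv is a simulated copy of what sys.argv would hold for the command above (an assumption; a real run would read sys.argv itself).

```python
# Simulated value of sys.argv for the command shown above (an assumption:
# a real run would read sys.argv itself).
argv = ['mapIt.py', '870', 'Valencia', 'St,', 'San', 'Francisco,', 'CA', '94110']

# argv[1:] skips the program name; ' '.join() puts single spaces back in.
address = ' '.join(argv[1:])
print(address)

# With no extra arguments the slice is empty, so the joined string is empty too.
no_args = ' '.join(['mapIt.py'][1:])
```

Note that the separator string is a single space, ' ', not an empty string; joining with '' would run the words of the address together.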

Step 3: Handle the Clipboard Content and Launch the Browser

Make your code look like the following:

#! python3
# mapIt.py - Launches a map in the browser using an address from the
# command line or clipboard.

import webbrowser, sys, pyperclip
if len(sys.argv) > 1:
    # Get address from command line.
    address = ' '.join(sys.argv[1:])
else:
    # Get address from clipboard.
    address = pyperclip.paste()

webbrowser.open('https://www.google.com/maps/place/' + address)

If there are no command line arguments, the program will assume the address is stored on the clipboard. You can get the clipboard content with pyperclip.paste() and store it in a variable named address. Finally, to launch a web browser with the Google Maps URL, call webbrowser.open().

While some of the programs you write will perform huge tasks that save you hours, it can be just as satisfying to use a program that conveniently saves you a few seconds each time you perform a common task, such as getting a map of an address. Table 11-1 compares the steps needed to display a map with and without mapIt.py.

Table 11-1. Getting a Map with and Without mapIt.py

Manually getting a map             Using mapIt.py
Highlight the address.             Highlight the address.
Copy the address.                  Copy the address.
Open the web browser.              Run mapIt.py.
Go to http://maps.google.com/.
Click the address text field.
Paste the address.
Press ENTER.

See how mapIt.py makes this task less tedious?

Ideas for Similar Programs

As long as you have a URL, the webbrowser module lets users cut out the step of opening the browser and directing themselves to a website. Other programs could use this functionality to do the following:

Open all links on a page in separate browser tabs.
Open the browser to the URL for your local weather.
Open several social network sites that you regularly check.

Downloading Files from the Web with the requests Module

The requests module lets you easily download files from the Web without having to worry about complicated issues such as network errors, connection problems, and data compression. The requests module doesn’t come with Python, so you’ll have to install it first. From the command line, run pip install requests. (Appendix A has additional details on how to install third-party modules.)

The requests module was written because Python’s urllib2 module is too complicated to use. In fact, take a permanent marker and black out this entire paragraph. Forget I ever mentioned urllib2. If you need to download things from the Web, just use the requests module.

Next, do a simple test to make sure the requests module installed itself correctly. Enter the following into the interactive shell:

>>> import requests

If no error messages show up, then the requests module has been successfully installed.

Downloading a Web Page with the requests.get() Function

The requests.get() function takes a string of a URL to download. By calling type() on requests.get()’s return value, you can see that it returns a Response object, which contains the response that the web server gave for your request. I’ll explain the Response object in more detail later, but for now, enter the following into the interactive shell while your computer is connected to the Internet:

>>> import requests
>>> res = requests.get('http://www.gutenberg.org/cache/epub/1112/pg1112.txt')  # ➊
>>> type(res)
<class 'requests.models.Response'>
>>> res.status_code == requests.codes.ok  # ➋
True
>>> len(res.text)
178981
>>> print(res.text[:250])
The Project Gutenberg EBook of Romeo and Juliet, by William Shakespeare

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever. You may copy it, give it away or
re-use it under the terms of the Proje

The URL goes to a text web page for the entire play of Romeo and Juliet, provided by Project Gutenberg ➊. You can tell that the request for this web page succeeded by checking the status_code attribute of the Response object. If it is equal to the value of requests.codes.ok, then everything went fine ➋. (Incidentally, the status code for “OK” in the HTTP protocol is 200. You may already be familiar with the 404 status code for “Not Found.”)
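If you want to look up what a numeric status code means without going online, Python’s standard library has its own table of them in the http module. This sketch uses http.HTTPStatus rather than requests.codes, and the is_ok() helper is a hypothetical name of my own.

```python
from http import HTTPStatus

# HTTPStatus members compare equal to their integer codes.
print(HTTPStatus.OK.value, HTTPStatus.OK.phrase)
print(HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.phrase)

def is_ok(status_code):
    """Return True only for the 200 'OK' status (hypothetical helper)."""
    return status_code == HTTPStatus.OK

ok_result = is_ok(200)
missing_result = is_ok(404)
```

Comparing against a named constant such as HTTPStatus.OK or requests.codes.ok reads better than a bare 200 scattered through your code.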

If the request succeeded, the downloaded web page is stored as a string in the Response object’s text variable. This variable holds a large string of the entire play; the call to len(res.text) shows you that it is more than 178,000 characters long. Finally, calling print(res.text[:250]) displays only the first 250 characters.

Checking for Errors

As you’ve seen, the Response object has a status_code attribute that can be checked against requests.codes.ok to see whether the download succeeded. A simpler way to check for success is to call the raise_for_status() method on the Response object. This will raise an exception if there was an error downloading the file and will do nothing if the download succeeded. Enter the following into the interactive shell:

>>> res = requests.get('http://inventwithpython.com/page_that_does_not_exist')
>>> res.raise_for_status()
Traceback (most recent call last):
  File "<pyshell#138>", line 1, in <module>
    res.raise_for_status()
  File "C:\Python34\lib\site-packages\requests\models.py", line 773, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found

The raise_for_status() method is a good way to ensure that a program halts if a bad download occurs. This is a good thing: You want your program to stop as soon as some unexpected error happens. If a failed download isn’t a deal breaker for your program, you can wrap the raise_for_status() line with try and except statements to handle this error case without crashing.

import requests
res = requests.get('http://inventwithpython.com/page_that_does_not_exist')
try:
    res.raise_for_status()
except Exception as exc:
    print('There was a problem: %s' % (exc))

This raise_for_status() method call causes the program to output the following:

There was a problem: 404 Client Error: Not Found

Always call raise_for_status() after calling requests.get(). You want to be sure that the download has actually worked before your program continues.
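Here is a sketch of the same check wrapped in a small function. The name check_download is hypothetical, and the two Response objects are built by hand purely so the example runs without an Internet connection; in a real program they would come from requests.get(). I also catch requests.exceptions.HTTPError, the specific exception raise_for_status() raises, instead of the broader Exception.

```python
import requests

def check_download(res):
    """Return None on success, or the error text if the response failed.

    Hypothetical helper; res is a requests Response object.
    """
    try:
        res.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        return str(exc)
    return None

# Responses built by hand purely so the sketch runs offline; real code
# would get them from requests.get().
good = requests.Response()
good.status_code = 200
bad = requests.Response()
bad.status_code = 404
bad.reason = 'Not Found'

print(check_download(good))
print(check_download(bad))
```

Keep in mind that requests.get() itself can raise other exceptions (for example, when there is no network connection at all), and this sketch does not handle those.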

Saving Downloaded Files to the Hard Drive

From here, you can save the web page to a file on your hard drive with the standard open() function and write() method. There are some slight differences, though. First, you must open the file in write binary mode by passing the string 'wb' as the second argument to open(). Even if the page is in plaintext (such as the Romeo and Juliet text you downloaded earlier), you need to write binary data instead of text data in order to maintain the Unicode encoding of the text.

UNICODE ENCODINGS

Unicode encodings are beyond the scope of this book, but you can learn more about them from these web pages:

Joel on Software: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!): http://www.joelonsoftware.com/articles/Unicode.html
Pragmatic Unicode: http://nedbatchelder.com/text/unipain.html
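Without going into the details of encodings, a tiny sketch shows why text and binary data are not the same thing. The string below is just an illustration, and UTF-8 is the encoding I am assuming; a downloaded page may use a different one.

```python
# '57°F' has four characters, but the degree sign needs two bytes in UTF-8.
text = '57°F'
data = text.encode('utf-8')   # str -> bytes

print(len(text))   # number of characters
print(len(data))   # number of bytes

# Decoding with the same encoding gives back the original string.
round_trip = data.decode('utf-8')
```

Writing the downloaded bytes exactly as they arrived, in 'wb' mode, sidesteps the question of which encoding to use when saving.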

To write the web page to a file, you can use a for loop with the Response object’s iter_content() method.

>>> import requests
>>> res = requests.get('http://www.gutenberg.org/cache/epub/1112/pg1112.txt')
>>> res.raise_for_status()
>>> playFile = open('RomeoAndJuliet.txt', 'wb')
>>> for chunk in res.iter_content(100000):
        playFile.write(chunk)

100000
78981
>>> playFile.close()

The iter_content() method returns “chunks” of the content on each iteration through the loop. Each chunk is of the bytes data type, and you get to specify how many bytes each chunk will contain. One hundred thousand bytes is generally a good size, so pass 100000 as the argument to iter_content().

The file RomeoAndJuliet.txt will now exist in the current working directory. Note that while the filename on the website was pg1112.txt, the file on your hard drive has a different filename. The requests module simply handles downloading the contents of web pages. Once the page is downloaded, it is simply data in your program. Even if you were to lose your Internet connection after downloading the web page, all the page data would still be on your computer.

The write() method returns the number of bytes written to the file. In the previous example, there were 100,000 bytes in the first chunk, and the remaining part of the file needed only 78,981 bytes.

To review, here’s the complete process for downloading and saving a file:

1. Call requests.get() to download the file.
2. Call open() with 'wb' to create a new file in write binary mode.
3. Loop over the Response object’s iter_content() method.
4. Call write() on each iteration to write the content to the file.
5. Call close() to close the file.
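The steps above can be rehearsed offline. In this sketch an in-memory io.BytesIO object stands in for the downloaded response, and the hypothetical iter_chunks() generator plays the role of iter_content(); neither is part of the requests module. The 178,981-byte size is taken from the earlier example, assuming one byte per character.

```python
import io
import os
import tempfile

# Stand-in for a downloaded response body: 178,981 bytes, the length the
# text reports for the play (assumed here to be one byte per character).
fake_download = io.BytesIO(b'x' * 178981)

def iter_chunks(stream, chunk_size):
    """Yield successive chunks of at most chunk_size bytes (hypothetical helper)."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

path = os.path.join(tempfile.mkdtemp(), 'RomeoAndJuliet.txt')
written = []
with open(path, 'wb') as play_file:          # 'wb' = write binary mode
    for chunk in iter_chunks(fake_download, 100000):
        written.append(play_file.write(chunk))  # write() returns the byte count

print(written)
```

Using a with statement closes the file automatically, which takes care of step 5 even if an error interrupts the loop.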

That’s all there is to the requests module! The for loop and iter_content() stuff may seem complicated compared to the open()/write()/close() workflow you’ve been using to write text files, but it’s to ensure that the requests module doesn’t eat up too much memory even if you download massive files. You can learn about the requests module’s other features from http://requests.readthedocs.org/.

HTML

Before you pick apart web pages, you’ll learn some HTML basics. You’ll also see how to access your web browser’s powerful developer tools, which will make scraping information from the Web much easier.

Resources for Learning HTML

Hypertext Markup Language (HTML) is the format that web pages are written in. This chapter assumes you have some basic experience with HTML, but if you need a beginner tutorial, I suggest one of the following sites:

http://htmldog.com/guides/html/beginner/
http://www.codecademy.com/tracks/web/
https://developer.mozilla.org/en-US/learn/html/

A Quick Refresher

In case it’s been a while since you’ve looked at any HTML, here’s a quick overview of the basics. An HTML file is a plaintext file with the .html file extension. The text in these files is surrounded by tags, which are words enclosed in angle brackets. The tags tell the browser how to format the web page. A starting tag and closing tag can enclose some text to form an element. The text (or inner HTML) is the content between the starting and closing tags. For example, the following HTML will display Hello world! in the browser, with Hello in bold:

<strong>Hello</strong> world!

This HTML will look like Figure 11-1 in a browser.

Figure 11-1. Hello world! rendered in the browser

The opening <strong> tag says that the enclosed text will appear in bold. The closing </strong> tag tells the browser where the end of the bold text is.

There are many different tags in HTML. Some of these tags have extra properties in the form of attributes within the angle brackets. For example, the <a> tag encloses text that should be a link. The URL that the text links to is determined by the href attribute. Here’s an example:

Al's free <a href="http://inventwithpython.com">Python books</a>.

This HTML will look like Figure 11-2 in a browser.

Figure 11-2. The link rendered in the browser

Some elements have an id attribute that is used to uniquely identify the element in the page. You will often instruct your programs to seek out an element by its id attribute, so figuring out an element’s id attribute using the browser’s developer tools is a common task in writing web scraping programs.

Viewing the Source HTML of a Web Page

You’ll need to look at the HTML source of the web pages that your programs will work with. To do this, right-click (or CTRL-click on OS X) any web page in your web browser, and select View Source or View page source to see the HTML text of the page (see Figure 11-3). This is the text your browser actually receives. The browser knows how to display, or render, the web page from this HTML.

Figure 11-3. Viewing the source of a web page

I highly recommend viewing the source HTML of some of your favorite sites. It’s fine if you don’t fully understand what you are seeing when you look at the source. You won’t need HTML mastery to write simple web scraping programs. After all, you won’t be writing your own websites. You just need enough knowledge to pick out data from an existing site.

Opening Your Browser’s Developer Tools

In addition to viewing a web page’s source, you can look through a page’s HTML using your browser’s developer tools. In Chrome and Internet Explorer for Windows, the developer tools are already installed, and you can press F12 to make them appear (see Figure 11-4). Pressing F12 again will make the developer tools disappear. In Chrome, you can also bring up the developer tools by selecting View ▸ Developer ▸ Developer Tools. In OS X, pressing ⌘-OPTION-I will open Chrome’s Developer Tools.

Figure 11-4. The Developer Tools window in the Chrome browser

In Firefox, you can bring up the Web Developer Tools Inspector by pressing CTRL-SHIFT-C on Windows and Linux or by pressing ⌘-OPTION-C on OS X. The layout is almost identical to Chrome’s developer tools.

In Safari, open the Preferences window, and on the Advanced pane check the Show Develop menu in the menu bar option. After it has been enabled, you can bring up the developer tools by pressing ⌘-OPTION-I.

After enabling or installing the developer tools in your browser, you can right-click any part of the web page and select Inspect Element from the context menu to bring up the HTML responsible for that part of the page. This will be helpful when you begin to parse HTML for your web scraping programs.

DON’T USE REGULAR EXPRESSIONS TO PARSE HTML

Locating a specific piece of HTML in a string seems like a perfect case for regular expressions. However, I advise you against it. There are many different ways that HTML can be formatted and still be considered valid HTML, but trying to capture all these possible variations in a regular expression can be tedious and error prone. A module developed specifically for parsing HTML, such as Beautiful Soup, will be less likely to result in bugs.

You can find an extended argument for why you shouldn’t parse HTML with regular expressions at http://stackoverflow.com/a/1732454/1893164/.

Using the Developer Tools to Find HTML Elements

Once your program has downloaded a web page using the requests module, you will have the page’s HTML content as a single string value. Now you need to figure out which part of the HTML corresponds to the information on the web page you’re interested in.

This is where the browser’s developer tools can help. Say you want to write a program to pull weather forecast data from http://weather.gov/. Before writing any code, do a little research. If you visit the site and search for the 94105 ZIP code, the site will take you to a page showing the forecast for that area.

What if you’re interested in scraping the temperature information for that ZIP code? Right-click where it is on the page (or CONTROL-click on OS X) and select Inspect Element from the context menu that appears. This will bring up the Developer Tools window, which shows you the HTML that produces this particular part of the web page. Figure 11-5 shows the developer tools open to the HTML of the temperature.

Figure 11-5. Inspecting the element that holds the temperature text with the developer tools

From the developer tools, you can see that the HTML responsible for the temperature part of the web page is <p class="myforecast-current-lrg">57°F</p>. This is exactly what you were looking for! It seems that the temperature information is contained inside a <p> element with the myforecast-current-lrg class. Now that you know what you’re looking for, the Beautiful Soup module will help you find it in the string.

Parsing HTML with the Beautiful Soup Module

Beautiful Soup is a module for extracting information from an HTML page (and is much better for this purpose than regular expressions). The Beautiful Soup module’s name is bs4 (for Beautiful Soup, version 4). To install it, you will need to run pip install beautifulsoup4 from the command line. (Check out Appendix A for instructions on installing third-party modules.) While beautifulsoup4 is the name used for installation, to import Beautiful Soup you run import bs4.

For this chapter, the Beautiful Soup examples will parse (that is, analyze and identify the parts of) an HTML file on the hard drive. Open a new file editor window in IDLE, enter the following, and save it as example.html. Alternatively, download it from http://nostarch.com/automatestuff/.

<!-- This is the example.html example file. -->

<html><head><title>The Website Title</title></head>
<body>
<p>Download my <strong>Python</strong> book from <a href="http://inventwithpython.com">my website</a>.</p>
<p class="slogan">Learn Python the easy way!</p>
<p>By <span id="author">Al Sweigart</span></p>
</body></html>

As you can see, even a simple HTML file involves many different tags and attributes, and matters quickly get confusing with complex websites. Thankfully, Beautiful Soup makes working with HTML much easier.

Creating a BeautifulSoup Object from HTML

The bs4.BeautifulSoup() function needs to be called with a string containing the HTML it will parse. The bs4.BeautifulSoup() function returns a BeautifulSoup object. Enter the following into the interactive shell while your computer is connected to the Internet:

>>> import requests, bs4
>>> res = requests.get('http://nostarch.com')
>>> res.raise_for_status()
>>> noStarchSoup = bs4.BeautifulSoup(res.text)
>>> type(noStarchSoup)
<class 'bs4.BeautifulSoup'>

This code uses requests.get() to download the main page from the No Starch Press website and then passes the text attribute of the response to bs4.BeautifulSoup(). The BeautifulSoup object that it returns is stored in a variable named noStarchSoup.

You can also load an HTML file from your hard drive by passing a File object to bs4.BeautifulSoup(). Enter the following into the interactive shell (make sure the example.html file is in the working directory):

>>> exampleFile = open('example.html')
>>> exampleSoup = bs4.BeautifulSoup(exampleFile)
>>> type(exampleSoup)
<class 'bs4.BeautifulSoup'>

Once you have a BeautifulSoup object, you can use its methods to locate specific parts of an HTML document.

Finding an Element with the select() Method

You can retrieve a web page element from a BeautifulSoup object by calling the select() method and passing a string of a CSS selector for the element you are looking for. Selectors are like regular expressions: They specify a pattern to look for, in this case, in HTML pages instead of general text strings.

A full discussion of CSS selector syntax is beyond the scope of this book (there’s a good selector tutorial in the resources at http://nostarch.com/automatestuff/), but here’s a short introduction to selectors. Table 11-2 shows examples of the most common CSS selector patterns.

Table 11-2. Examples of CSS Selectors

Selector passed to the select() method    Will match…
soup.select('div')                        All elements named <div>
soup.select('#author')                    The element with an id attribute of author
soup.select('.notice')                    All elements that use a CSS class attribute named notice
soup.select('div span')                   All elements named <span> that are within an element named <div>
soup.select('div > span')                 All elements named <span> that are directly within an element named <div>, with no other element in between
soup.select('input[name]')                All elements named <input> that have a name attribute with any value
soup.select('input[type="button"]')       All elements named <input> that have an attribute named type with value button

The various selector patterns can be combined to make sophisticated matches. For example, soup.select('p #author') will match any element that has an id attribute of author, as long as it is also inside a <p> element.
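The space inside a combined selector matters. This sketch parses one line of HTML borrowed from example.html and compares the selector with and without the space; passing 'html.parser' as the second argument is my own choice, to name the parser that ships with Python explicitly.

```python
import bs4

html = '<p>By <span id="author">Al Sweigart</span></p>'
# Naming the parser explicitly is my choice here; 'html.parser' ships with Python.
soup = bs4.BeautifulSoup(html, 'html.parser')

# 'p #author' (with a space): the id="author" element somewhere inside a <p>.
inside_p = soup.select('p #author')

# 'p#author' (no space): a <p> element that itself has id="author".
p_itself = soup.select('p#author')

print(len(inside_p), len(p_itself))
```

In this snippet the <span> is the element with the id, so only the selector with the space finds a match.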

The select() method will return a list of Tag objects, which is how Beautiful Soup represents an HTML element. The list will contain one Tag object for every match in the BeautifulSoup object’s HTML. Tag values can be passed to the str() function to show the HTML tags they represent. Tag values also have an attrs attribute that shows all the HTML attributes of the tag as a dictionary. Using the example.html file from earlier, enter the following into the interactive shell:

>>> import bs4
>>> exampleFile = open('example.html')
>>> exampleSoup = bs4.BeautifulSoup(exampleFile.read())
>>> elems = exampleSoup.select('#author')
>>> type(elems)
<class 'list'>
>>> len(elems)
1
>>> type(elems[0])
<class 'bs4.element.Tag'>
>>> elems[0].getText()
'Al Sweigart'
>>> str(elems[0])
'<span id="author">Al Sweigart</span>'
>>> elems[0].attrs
{'id': 'author'}

This code will pull the element with id="author" out of our example HTML. We use select('#author') to return a list of all the elements with id="author". We store this list of Tag objects in the variable elems, and len(elems) tells us there is one Tag object in the list; there was one match. Calling getText() on the element returns the element’s text, or inner HTML. The text of an element is the content between the opening and closing tags: in this case, 'Al Sweigart'.

Passing the element to str() returns a string with the starting and closing tags and the element’s text. Finally, attrs gives us a dictionary with the element’s attribute, 'id', and the value of the id attribute, 'author'.

You can also pull all the <p> elements from the BeautifulSoup object. Enter this into the interactive shell:

>>> pElems = exampleSoup.select('p')
>>> str(pElems[0])
'<p>Download my <strong>Python</strong> book from <a href="http://inventwithpython.com">my website</a>.</p>'
>>> pElems[0].getText()
'Download my Python book from my website.'
>>> str(pElems[1])
'<p class="slogan">Learn Python the easy way!</p>'
>>> pElems[1].getText()
'Learn Python the easy way!'
>>> str(pElems[2])
'<p>By <span id="author">Al Sweigart</span></p>'
>>> pElems[2].getText()
'By Al Sweigart'

This time, select() gives us a list of three matches, which we store in pElems. Using str() on pElems[0], pElems[1], and pElems[2] shows you each element as a string, and using getText() on each element shows you its text.

Getting Data from an Element’s Attributes

The get() method for Tag objects makes it simple to access attribute values from an element. The method is passed a string of an attribute name and returns that attribute’s value. Using example.html, enter the following into the interactive shell:

>>> import bs4
>>> soup = bs4.BeautifulSoup(open('example.html'))
>>> spanElem = soup.select('span')[0]
>>> str(spanElem)
'<span id="author">Al Sweigart</span>'
>>> spanElem.get('id')
'author'
>>> spanElem.get('some_nonexistent_addr') == None
True
>>> spanElem.attrs
{'id': 'author'}

Here we use select() to find any <span> elements and then store the first matched element in spanElem. Passing the attribute name 'id' to get() returns the attribute’s value, 'author'.
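The same get() call is how you pull the URL out of a link. This sketch parses one line of example.html from a string (so no file is needed) and names 'html.parser' explicitly, which is my own choice; the 'target' attribute is just an example of an attribute the element doesn’t have.

```python
import bs4

html = ('<p>Download my <strong>Python</strong> book from '
        '<a href="http://inventwithpython.com">my website</a>.</p>')
soup = bs4.BeautifulSoup(html, 'html.parser')

link = soup.select('a')[0]
href = link.get('href')                       # attribute that exists
target = link.get('target')                   # missing attribute -> None
fallback = link.get('target', '_self')        # optional default, like dict.get()

print(href, target, fallback)
```

Because get() returns None for a missing attribute instead of raising an error, it is worth checking the result before concatenating it into a URL.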

Project: “I’m Feeling Lucky” Google Search

Whenever I search a topic on Google, I don’t look at just one search result at a time. By middle-clicking a search result link (or clicking while holding CTRL), I open the first several links in a bunch of new tabs to read later. I search Google often enough that this workflow (opening my browser, searching for a topic, and middle-clicking several links one by one) is tedious. It would be nice if I could simply type a search term on the command line and have my computer automatically open a browser with all the top search results in new tabs. Let’s write a script to do this.

This is what your program does:

Gets search keywords from the command line arguments.
Retrieves the search results page.
Opens a browser tab for each result.

This means your code will need to do the following:

Read the command line arguments from sys.argv.
Fetch the search result page with the requests module.
Find the links to each search result.
Call the webbrowser.open() function to open the web browser.

Open a new file editor window and save it as lucky.py.

Step 1: Get the Command Line Arguments and Request the Search Page

Before coding anything, you first need to know the URL of the search result page. By looking at the browser’s address bar after doing a Google search, you can see that the result page has a URL like https://www.google.com/search?q=SEARCH_TERM_HERE. The requests module can download this page and then you can use Beautiful Soup to find the search result links in the HTML. Finally, you’ll use the webbrowser module to open those links in browser tabs.

Make your code look like the following:

#! python3
# lucky.py - Opens several Google search results.

import requests, sys, webbrowser, bs4

print('Googling...') # display text while downloading the Google page
res = requests.get('http://google.com/search?q=' + ' '.join(sys.argv[1:]))
res.raise_for_status()

# TODO: Retrieve top search result links.

# TODO: Open a browser tab for each result.

The user will specify the search terms using command line arguments when they launch the program. These arguments will be stored as strings in a list in sys.argv.
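If you want to see the search URL without sending a request, this sketch builds it from a simulated sys.argv list (an assumption standing in for a real command line). It uses the standard library’s urllib.parse.urlencode() to escape the query, which is a variation on the plain string concatenation in lucky.py.

```python
from urllib.parse import urlencode

# Simulated sys.argv for: lucky python programming tutorials
argv = ['lucky.py', 'python', 'programming', 'tutorials']

search_terms = ' '.join(argv[1:])
# urlencode() escapes the query so spaces and symbols are safe in a URL.
search_url = 'http://google.com/search?' + urlencode({'q': search_terms})
print(search_url)
```

Escaping matters most when a search term contains characters such as & or #, which would otherwise be read as part of the URL’s structure.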

Step 2: Find All the Results

Now you need to use Beautiful Soup to extract the top search result links from your downloaded HTML. But how do you figure out the right selector for the job? For example, you can’t just search for all <a> tags, because there are lots of links you don’t care about in the HTML. Instead, you must inspect the search result page with the browser’s developer tools to try to find a selector that will pick out only the links you want.

After doing a Google search for Beautiful Soup, you can open the browser’s developer tools and inspect some of the link elements on the page. They look incredibly complicated, something like this:

<a href="/url?sa=t&amp;rct=j&amp;q=&amp;esrc=s&amp;source=web&amp;cd=1&amp;cad=rja&amp;uact=8&amp;ved=0CCgQFjAA&amp;url=http%3A%2F%2Fwww.crummy.com%2Fsoftware%2FBeautifulSoup%2F&amp;ei=LHBVU_XDD9KVyAShmYDwCw&amp;usg=AFQjCNHAxwplurFOBqg5cehWQEVKi-TuLQ&amp;sig2=sdZu6WVlBlVSDrwhtworMA" onmousedown="return rwt(this,'','','','1','AFQjCNHAxwplurFOBqg5cehWQEVKi-TuLQ','sdZu6WVlBlVSDrwhtworMA','0CCgQFjAA','','',event)" data-href="http://www.crummy.com/software/BeautifulSoup/"><em>Beautiful Soup</em>: We called him Tortoise because he taught us.</a>

It doesn’t matter that the element looks incredibly complicated. You just need to find the pattern that all the search result links have. But this <a> element doesn’t have anything that easily distinguishes it from the nonsearch result <a> elements on the page.

Make your code look like the following:

#! python3
# lucky.py - Opens several google search results.

import requests, sys, webbrowser, bs4

--snip--

# Retrieve top search result links.
soup = bs4.BeautifulSoup(res.text)

# Open a browser tab for each result.
linkElems = soup.select('.r a')

If you look up a little from the <a> element, though, there is an element like this: <h3 class="r">. Looking through the rest of the HTML source, it looks like the r class is used only for search result links. You don’t have to know what the CSS class r is or what it does. You’re just going to use it as a marker for the <a> element you are looking for. You can create a BeautifulSoup object from the downloaded page’s HTML text and then use the selector '.r a' to find all <a> elements that are within an element that has the r CSS class.
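To see the selector at work without downloading anything, here is a sketch that runs '.r a' against a few lines of made-up HTML. The markup is hypothetical and heavily simplified; it only imitates the <h3 class="r"> structure described above, and a live results page may not use the same class names.

```python
import bs4

# Hypothetical, heavily simplified markup in the shape the text describes;
# a real results page is far messier and its class names may change.
html = '''
<h3 class="r"><a href="/url?q=first">First result</a></h3>
<h3 class="r"><a href="/url?q=second">Second result</a></h3>
<div><a href="/preferences">Settings</a></div>
'''
soup = bs4.BeautifulSoup(html, 'html.parser')

# '.r a' = every <a> element inside an element whose CSS class is r.
link_elems = soup.select('.r a')
hrefs = [elem.get('href') for elem in link_elems]
print(hrefs)
```

The Settings link is skipped because its parent <div> does not have the r class, which is exactly the filtering you want for the search results.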

Step 3: Open Web Browsers for Each Result

Finally, we’ll tell the program to open web browser tabs for our results. Add the following to the end of your program:

#! python3
# lucky.py - Opens several google search results.

import requests, sys, webbrowser, bs4

--snip--

# Open a browser tab for each result.
linkElems = soup.select('.r a')
numOpen = min(5, len(linkElems))
for i in range(numOpen):
    webbrowser.open('http://google.com' + linkElems[i].get('href'))

Bydefault,youopenthefirstfivesearchresultsinnewtabsusingthewebbrowser

module.However,theusermayhavesearchedforsomethingthatturnedupfewerthanfiveresults.Thesoup.select()callreturnsalistofalltheelementsthatmatchedyour'.ra'selector,sothenumberoftabsyouwanttoopeniseither5orthelengthofthislist(whicheverissmaller).

The built-in Python function min() returns the smallest of the integer or float arguments it is passed. (There is also a built-in max() function that returns the largest argument it is passed.) You can use min() to find out whether there are fewer than five links in the list and store the number of links to open in a variable named numOpen. Then you can run through a for loop by calling range(numOpen).
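
The capping logic can be tried on its own, without a browser. The following is a minimal sketch; the helper name count_tabs_to_open and the sample lists are made up for illustration and are not part of lucky.py.

```python
def count_tabs_to_open(links, limit=5):
    """Return the limit or the number of links, whichever is smaller."""
    return min(limit, len(links))

# Hypothetical href lists standing in for the result of soup.select().
many_links = ['/a', '/b', '/c', '/d', '/e', '/f', '/g']
few_links = ['/a', '/b']

print(count_tabs_to_open(many_links))  # 5
print(count_tabs_to_open(few_links))   # 2
print(count_tabs_to_open([]))          # 0, so range(0) runs the loop zero times
```

Because range(0) is empty, a search with no results simply opens no tabs instead of raising an IndexError.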

On each iteration of the loop, you use webbrowser.open() to open a new tab in the web browser. Note that the href attribute’s value in the returned <a> elements does not include the initial http://google.com part, so you have to concatenate that to the href attribute’s string value.
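
String concatenation works here because every href is relative. As an alternative sketch, not what lucky.py does, the standard library’s urllib.parse.urljoin() handles both relative and absolute href values. The two href strings below are hypothetical examples.

```python
from urllib.parse import urljoin

base = 'http://google.com'
relative_href = '/url?q=http://example.com/'  # hypothetical relative href
absolute_href = 'http://example.com/page'     # hypothetical absolute href

concatenated = base + relative_href           # the approach used in lucky.py
joined_relative = urljoin(base, relative_href)
joined_absolute = urljoin(base, absolute_href)

print(concatenated)
print(joined_relative)   # same result as plain concatenation
print(joined_absolute)   # an absolute href is returned unchanged
```

Plain concatenation would produce a broken URL for the absolute href, which is the one case where urljoin() makes a difference.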

Now you can instantly open the first five Google results for, say, Python programming tutorials by running lucky python programming tutorials on the command line! (See Appendix B for how to easily run programs on your operating system.)

Ideas for Similar Programs

The benefit of tabbed browsing is that you can easily open links in new tabs to peruse later. A program that automatically opens several links at once can be a nice shortcut to do the following:

Open all the product pages after searching a shopping site such as Amazon
Open all the links to reviews for a single product
Open the result links to photos after performing a search on a photo site such as Flickr or Imgur

Project: Downloading All XKCD Comics

Blogs and other regularly updating websites usually have a front page with the most recent post as well as a Previous button on the page that takes you to the previous post. Then that post will also have a Previous button, and so on, creating a trail from the most recent page to the first post on the site. If you wanted a copy of the site’s content to read when you’re not online, you could manually navigate over every page and save each one. But this is pretty boring work, so let’s write a program to do it instead.

XKCD is a popular geek webcomic with a website that fits this structure (see Figure 11-6). The front page at http://xkcd.com/ has a Prev button that guides the user back through prior comics. Downloading each comic by hand would take forever, but you can write a script to do this in a couple of minutes.

Here’s what your program does:

Loads the XKCD home page.
Saves the comic image on that page.
Follows the Previous Comic link.
Repeats until it reaches the first comic.

Figure 11-6. XKCD, “a webcomic of romance, sarcasm, math, and language”

This means your code will need to do the following:

Download pages with the requests module.
Find the URL of the comic image for a page using Beautiful Soup.
Download and save the comic image to the hard drive with iter_content().
Find the URL of the Previous Comic link, and repeat.

Open a new file editor window and save it as downloadXkcd.py.

Step 1: Design the Program

If you open the browser’s developer tools and inspect the elements on the page, you’ll find the following:

The URL of the comic’s image file is given by the src attribute of an <img> element.
The <img> element is inside a <div id="comic"> element.
The Prev button has a rel HTML attribute with the value prev.
The first comic’s Prev button links to the http://xkcd.com/# URL, indicating that there are no more previous pages.

Make your code look like the following:

#! python3
# downloadXkcd.py - Downloads every single XKCD comic.

import requests, os, bs4

url = 'http://xkcd.com'              # starting url
os.makedirs('xkcd', exist_ok=True)   # store comics in ./xkcd
while not url.endswith('#'):
    # TODO: Download the page.

    # TODO: Find the URL of the comic image.

    # TODO: Download the image.

    # TODO: Save the image to ./xkcd.

    # TODO: Get the Prev button's url.

print('Done.')

You’ll have a url variable that starts with the value 'http://xkcd.com' and repeatedly update it (in a while loop) with the URL of the current page’s Prev link. At every step in the loop, you’ll download the comic at url. You’ll know to end the loop when url ends with '#'.
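
The loop’s stopping rule can be checked without touching the network. In this minimal sketch the dictionary FAKE_PREV_LINKS is a made-up stand-in for the site: it maps each page URL to the href of that page’s Prev link, with '#' marking the first comic.

```python
# Hypothetical stand-in for the site: page URL -> href of its Prev link.
FAKE_PREV_LINKS = {
    'http://xkcd.com': '/2/',
    'http://xkcd.com/2/': '/1/',
    'http://xkcd.com/1/': '#',   # the first comic has no earlier page
}

url = 'http://xkcd.com'          # starting url
visited = []
while not url.endswith('#'):
    visited.append(url)          # the real program downloads the page here
    url = 'http://xkcd.com' + FAKE_PREV_LINKS[url]

print(visited)
print(url)
```

The loop visits the three pages from newest to oldest and stops as soon as the rebuilt URL ends with '#'.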

You will download the image files to a folder in the current working directory named xkcd. The call os.makedirs() ensures that this folder exists, and the exist_ok=True keyword argument prevents the function from throwing an exception if this folder already exists. The rest of the code is just comments that outline the rest of your program.

Step 2: Download the Web Page

Let’s implement the code for downloading the page. Make your code look like the following:

#! python3
# downloadXkcd.py - Downloads every single XKCD comic.

import requests, os, bs4

url = 'http://xkcd.com'              # starting url
os.makedirs('xkcd', exist_ok=True)   # store comics in ./xkcd
while not url.endswith('#'):
    # Download the page.
    print('Downloading page %s...' % url)
    res = requests.get(url)
    res.raise_for_status()

    soup = bs4.BeautifulSoup(res.text)

    # TODO: Find the URL of the comic image.

    # TODO: Download the image.

    # TODO: Save the image to ./xkcd.

    # TODO: Get the Prev button's url.

print('Done.')

First, print url so that the user knows which URL the program is about to download; then use the requests module’s requests.get() function to download it. As always, you immediately call the Response object’s raise_for_status() method to throw an exception and end the program if something went wrong with the download. Otherwise, you create a BeautifulSoup object from the text of the downloaded page.

Step 3: Find and Download the Comic Image

Make your code look like the following:

#! python3
# downloadXkcd.py - Downloads every single XKCD comic.

import requests, os, bs4

--snip--

    # Find the URL of the comic image.
    comicElem = soup.select('#comic img')
    if comicElem == []:
        print('Could not find comic image.')
    else:
        comicUrl = comicElem[0].get('src')
        # Download the image.
        print('Downloading image %s...' % (comicUrl))
        res = requests.get(comicUrl)
        res.raise_for_status()

        # TODO: Save the image to ./xkcd.

    # TODO: Get the Prev button's url.

print('Done.')

From inspecting the XKCD home page with your developer tools, you know that the <img> element for the comic image is inside a <div> element with the id attribute set to comic, so the selector '#comic img' will get you the correct <img> element from the BeautifulSoup object.

A few XKCD pages have special content that isn’t a simple image file. That’s fine; you’ll just skip those. If your selector doesn’t find any elements, then soup.select('#comic img') will return a blank list. When that happens, the program can just print an error message and move on without downloading the image.

Otherwise, the selector will return a list containing one <img> element. You can get the src attribute from this <img> element and pass it to requests.get() to download the comic’s image file.
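
To see why the descendant selector '#comic img' picks out only the comic, here is a minimal sketch that mimics it with the standard library’s html.parser instead of Beautiful Soup. The class name ComicImageFinder, the helper find_comic_sources, and the sample HTML are invented for illustration; real XKCD markup is more involved.

```python
from html.parser import HTMLParser

class ComicImageFinder(HTMLParser):
    """Collect the src of every <img> nested inside <div id="comic">."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0   # greater than 0 while inside the target <div>
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div':
            # Start counting at the target div; count nested divs inside it.
            if self.div_depth > 0 or attrs.get('id') == 'comic':
                self.div_depth += 1
        elif tag == 'img' and self.div_depth > 0:
            self.sources.append(attrs.get('src'))

    def handle_endtag(self, tag):
        if tag == 'div' and self.div_depth > 0:
            self.div_depth -= 1

def find_comic_sources(html):
    finder = ComicImageFinder()
    finder.feed(html)
    return finder.sources

# Hypothetical page: one image outside the comic div, one inside it.
SAMPLE_PAGE = '''
<div id="header"><img src="/logo.png"></div>
<div id="comic"><img src="http://imgs.xkcd.com/comics/heartbleed_explanation.png"></div>
'''
# Hypothetical page with special content and no comic image.
SPECIAL_PAGE = '<div id="comic"><p>Interactive content</p></div>'

print(find_comic_sources(SAMPLE_PAGE))
print(find_comic_sources(SPECIAL_PAGE))   # empty list, like a failed select()
```

The logo image is ignored because it sits outside the <div id="comic"> element, and the special page yields an empty list, which is the case the program skips.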

Step 4: Save the Image and Find the Previous Comic

Make your code look like the following:

#! python3
# downloadXkcd.py - Downloads every single XKCD comic.

import requests, os, bs4

--snip--

        # Save the image to ./xkcd.
        imageFile = open(os.path.join('xkcd', os.path.basename(comicUrl)), 'wb')
        for chunk in res.iter_content(100000):
            imageFile.write(chunk)
        imageFile.close()

    # Get the Prev button's url.
    prevLink = soup.select('a[rel="prev"]')[0]
    url = 'http://xkcd.com' + prevLink.get('href')

print('Done.')

At this point, the image file of the comic is stored in the res variable. You need to write this image data to a file on the hard drive.

You’ll need a filename for the local image file to pass to open(). The comicUrl will have a value like 'http://imgs.xkcd.com/comics/heartbleed_explanation.png', which you might have noticed looks a lot like a file path. And in fact, you can call os.path.basename() with comicUrl, and it will return just the last part of the URL, 'heartbleed_explanation.png'. You can use this as the filename when saving the image to your hard drive. You join this name with the name of your xkcd folder using os.path.join() so that your program uses backslashes (\) on Windows and forward slashes (/) on OS X and Linux. Now that you finally have the filename, you can call open() to open a new file in 'wb' “write binary” mode.
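
The filename handling can be tried in isolation. This short sketch uses the example URL from the text; the variable names are chosen here for illustration.

```python
import os

comic_url = 'http://imgs.xkcd.com/comics/heartbleed_explanation.png'

filename = os.path.basename(comic_url)       # last part of the URL
local_path = os.path.join('xkcd', filename)  # separator depends on the OS

print(filename)
print(local_path)
```

On Windows the joined path uses a backslash, and on OS X and Linux it uses a forward slash; the filename itself is the same everywhere.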

Remember from earlier in this chapter that to save files you’ve downloaded using Requests, you need to loop over the return value of the iter_content() method. The code in the for loop writes out chunks of the image data (at most 100,000 bytes each) to the file and then you close the file. The image is now saved to your hard drive.
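
The chunked write can be illustrated without downloading anything. In this sketch the generator iter_chunks is a hypothetical stand-in for iter_content(), and fake_image is made-up data rather than a real picture. It also uses a with statement, an alternative to calling close() by hand.

```python
import os
import tempfile

def iter_chunks(data, chunk_size):
    """Yield successive slices of data, each at most chunk_size bytes."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]

fake_image = b'\x89PNG' + bytes(250000)   # made-up bytes, not a real image
target_path = os.path.join(tempfile.mkdtemp(), 'example.png')

chunk_sizes = []
with open(target_path, 'wb') as image_file:   # closed automatically
    for chunk in iter_chunks(fake_image, 100000):
        image_file.write(chunk)
        chunk_sizes.append(len(chunk))

print(chunk_sizes)
print(os.path.getsize(target_path) == len(fake_image))
```

Every chunk except possibly the last is exactly 100,000 bytes, so memory use stays bounded no matter how large the download is.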

Afterward, the selector 'a[rel="prev"]' identifies the <a> element with the rel attribute set to prev, and you can use this <a> element’s href attribute to get the previous comic’s URL, which gets stored in url. Then the while loop begins the entire download process again for this comic.

The output of this program will look like this:

Downloading page http://xkcd.com...
Downloading image http://imgs.xkcd.com/comics/phone_alarm.png...
Downloading page http://xkcd.com/1358/...
Downloading image http://imgs.xkcd.com/comics/nro.png...
Downloading page http://xkcd.com/1357/...
Downloading image http://imgs.xkcd.com/comics/free_speech.png...
Downloading page http://xkcd.com/1356/...
Downloading image http://imgs.xkcd.com/comics/orbital_mechanics.png...
Downloading page http://xkcd.com/1355/...
Downloading image http://imgs.xkcd.com/comics/airplane_message.png...
Downloading page http://xkcd.com/1354/...
Downloading image http://imgs.xkcd.com/comics/heartbleed_explanation.png...
--snip--

This project is a good example of a program that can automatically follow links in order to scrape large amounts of data from the Web. You can learn about Beautiful Soup’s other features from its documentation at http://www.crummy.com/software/BeautifulSoup/bs4/doc/.

Ideas for Similar Programs

Downloading pages and following links are the basis of many web crawling programs. Similar programs could also do the following:

Back up an entire site by following all of its links.
Copy all the messages off a web forum.
Duplicate the catalog of items for sale on an online store.

The requests and BeautifulSoup modules are great as long as you can figure out the URL you need to pass to requests.get(). However, sometimes this isn’t so easy to find. Or perhaps the website you want your program to navigate requires you to log in first. The selenium module will give your programs the power to perform such sophisticated tasks.

Controlling the Browser with the selenium Module

The selenium module lets Python directly control the browser by programmatically clicking links and filling in login information, almost as though there is a human user interacting with the page. Selenium allows you to interact with web pages in a much more advanced way than Requests and Beautiful Soup; but because it launches a web browser, it is a bit slower and hard to run in the background if, say, you just need to download some files from the Web.

Appendix A has more detailed steps on installing third-party modules.

Starting a Selenium-Controlled Browser

For these examples, you’ll need the Firefox web browser. This will be the browser that you control. If you don’t already have Firefox, you can download it for free from http://getfirefox.com/.

Importing the modules for Selenium is slightly tricky. Instead of import selenium, you need to run from selenium import webdriver. (The exact reason why the selenium module is set up this way is beyond the scope of this book.) After that, you can launch the Firefox browser with Selenium. Enter the following into the interactive shell:

>>> from selenium import webdriver
>>> browser = webdriver.Firefox()
>>> type(browser)
<class 'selenium.webdriver.firefox.webdriver.WebDriver'>
>>> browser.get('http://inventwithpython.com')

You’ll notice when webdriver.Firefox() is called, the Firefox web browser starts up. Calling type() on the value webdriver.Firefox() reveals it’s of the WebDriver data type. And calling browser.get('http://inventwithpython.com') directs the browser to http://inventwithpython.com/. Your browser should look something like Figure 11-7.

Figure 11-7. After calling webdriver.Firefox() and get() in IDLE, the Firefox browser appears.

Finding Elements on the Page

WebDriver objects have quite a few methods for finding elements on a page. They are divided into the find_element_* and find_elements_* methods. The find_element_* methods return a single WebElement object, representing the first element on the page that matches your query. The find_elements_* methods return a list of WebElement objects for every matching element on the page.

Table 11-3 shows several examples of find_element_* and find_elements_* methods being called on a WebDriver object that’s stored in the variable browser.

Table 11-3. Selenium’s WebDriver Methods for Finding Elements

Method name    WebElement object/list returned

browser.find_element_by_class_name(name)
browser.find_elements_by_class_name(name)    Elements that use the CSS class name

browser.find_element_by_css_selector(selector)
browser.find_elements_by_css_selector(selector)    Elements that match the CSS selector

browser.find_element_by_id(id)
browser.find_elements_by_id(id)    Elements with a matching id attribute value

browser.find_element_by_link_text(text)
browser.find_elements_by_link_text(text)    <a> elements that completely match the text provided

browser.find_element_by_partial_link_text(text)
browser.find_elements_by_partial_link_text(text)    <a> elements that contain the text provided

browser.find_element_by_name(name)
browser.find_elements_by_name(name)    Elements with a matching name attribute value

browser.find_element_by_tag_name(name)
browser.find_elements_by_tag_name(name)    Elements with a matching tag name (case insensitive; an <a> element is matched by 'a' and 'A')

Except for the *_by_tag_name() methods, the arguments to all the methods are case sensitive. If no element on the page matches what a find_element_* method is looking for, the selenium module raises a NoSuchElementException (the find_elements_* methods return an empty list instead). If you do not want this exception to crash your program, add try and except statements to your code.

Once you have the WebElement object, you can find out more about it by reading the attributes or calling the methods in Table 11-4.

Table 11-4. WebElement Attributes and Methods

Attribute or method    Description

tag_name    The tag name, such as 'a' for an <a> element

get_attribute(name)    The value for the element’s name attribute

text    The text within the element, such as 'hello' in <span>hello</span>

clear()    For text field or text area elements, clears the text typed into it

is_displayed()    Returns True if the element is visible; otherwise returns False

is_enabled()    For input elements, returns True if the element is enabled; otherwise returns False

is_selected()    For checkbox or radio button elements, returns True if the element is selected; otherwise returns False

location    A dictionary with keys 'x' and 'y' for the position of the element in the page

For example, open a new file editor and enter the following program:

from selenium import webdriver
browser = webdriver.Firefox()
browser.get('http://inventwithpython.com')
try:
    elem = browser.find_element_by_class_name('bookcover')
    print('Found <%s> element with that class name!' % (elem.tag_name))
except:
    print('Was not able to find an element with that name.')

Here we open Firefox and direct it to a URL. On this page, we try to find an element with the class name 'bookcover', and if such an element is found, we print its tag name using the tag_name attribute. If no such element was found, we print a different message.

This program will output the following:

Found <img> element with that class name!

We found an element with the class name 'bookcover' and the tag name 'img'.

Clicking the Page

WebElement objects returned from the find_element_* and find_elements_* methods have a click() method that simulates a mouse click on that element. This method can be used to follow a link, make a selection on a radio button, click a Submit button, or trigger whatever else might happen when the element is clicked by the mouse. For example, enter the following into the interactive shell:

>>> from selenium import webdriver
>>> browser = webdriver.Firefox()
>>> browser.get('http://inventwithpython.com')
>>> linkElem = browser.find_element_by_link_text('Read It Online')
>>> type(linkElem)
<class 'selenium.webdriver.remote.webelement.WebElement'>
>>> linkElem.click() # follows the "Read It Online" link

This opens Firefox to http://inventwithpython.com/, gets the WebElement object for the <a> element with the text Read It Online, and then simulates clicking that <a> element. It’s just as if you had clicked the link yourself; the browser then follows that link.

Filling Out and Submitting Forms

Sending keystrokes to text fields on a web page is a matter of finding the <input> or <textarea> element for that text field and then calling the send_keys() method. For example, enter the following into the interactive shell:

>>> from selenium import webdriver
>>> browser = webdriver.Firefox()
>>> browser.get('http://gmail.com')
>>> emailElem = browser.find_element_by_id('Email')
>>> emailElem.send_keys('[email protected]')
>>> passwordElem = browser.find_element_by_id('Passwd')
>>> passwordElem.send_keys('12345')
>>> passwordElem.submit()

As long as Gmail hasn’t changed the id of the Username and Password text fields since this book was published, the previous code will fill in those text fields with the provided text. (You can always use the browser’s inspector to verify the id.) Calling the submit() method on any element will have the same result as clicking the Submit button for the form that element is in. (You could have just as easily called emailElem.submit(), and the code would have done the same thing.)

Sending Special Keys

Selenium has a module for keyboard keys that are impossible to type into a string value, which function much like escape characters. These values are stored in attributes in the selenium.webdriver.common.keys module. Since that is such a long module name, it’s much easier to run from selenium.webdriver.common.keys import Keys at the top of your program; if you do, then you can simply write Keys anywhere you’d normally have to write selenium.webdriver.common.keys. Table 11-5 lists the commonly used Keys variables.

Table 11-5. Commonly Used Variables in the selenium.webdriver.common.keys Module

Attributes    Meanings

Keys.DOWN, Keys.UP, Keys.LEFT, Keys.RIGHT    The keyboard arrow keys

Keys.ENTER, Keys.RETURN    The ENTER and RETURN keys

Keys.HOME, Keys.END, Keys.PAGE_DOWN, Keys.PAGE_UP    The home, end, pagedown, and pageup keys

Keys.ESCAPE, Keys.BACK_SPACE, Keys.DELETE    The ESC, BACKSPACE, and DELETE keys

Keys.F1, Keys.F2, ..., Keys.F12    The F1 to F12 keys at the top of the keyboard

Keys.TAB    The TAB key

For example, if the cursor is not currently in a text field, pressing the HOME and END keys will scroll the browser to the top and bottom of the page, respectively. Enter the following into the interactive shell, and notice how the send_keys() calls scroll the page:

>>> from selenium import webdriver
>>> from selenium.webdriver.common.keys import Keys
>>> browser = webdriver.Firefox()
>>> browser.get('http://nostarch.com')
>>> htmlElem = browser.find_element_by_tag_name('html')
>>> htmlElem.send_keys(Keys.END)     # scrolls to bottom
>>> htmlElem.send_keys(Keys.HOME)    # scrolls to top

The <html> tag is the base tag in HTML files: The full content of the HTML file is enclosed within the <html> and </html> tags. The element returned by browser.find_element_by_tag_name('html') is a good place to send keys to the general web page. This would be useful if, for example, new content is loaded once you’ve scrolled to the bottom of the page.

Clicking Browser Buttons

Selenium can simulate clicks on various browser buttons as well through the following methods:

browser.back(). Clicks the Back button.
browser.forward(). Clicks the Forward button.
browser.refresh(). Clicks the Refresh/Reload button.
browser.quit(). Clicks the Close Window button.

More Information on Selenium

Selenium can do much more beyond the functions described here. It can modify your browser’s cookies, take screenshots of web pages, and run custom JavaScript. To learn more about these features, you can visit the Selenium documentation at http://selenium-python.readthedocs.org/.

Summary

Most boring tasks aren’t limited to the files on your computer. Being able to programmatically download web pages will extend your programs to the Internet. The requests module makes downloading straightforward, and with some basic knowledge of HTML concepts and selectors, you can utilize the BeautifulSoup module to parse the pages you download.

But to fully automate any web-based tasks, you need direct control of your web browser through the selenium module. The selenium module will allow you to log in to websites and fill out forms automatically. Since a web browser is the most common way to send and receive information over the Internet, this is a great ability to have in your programmer toolkit.

Practice Questions

Q: 1. Briefly describe the differences between the webbrowser, requests, BeautifulSoup, and selenium modules.

Q: 2. What type of object is returned by requests.get()? How can you access the downloaded content as a string value?

Q: 3. What Requests method checks that the download worked?

Q: 4. How can you get the HTTP status code of a Requests response?

Q: 5. How do you save a Requests response to a file?

Q: 6. What is the keyboard shortcut for opening a browser’s developer tools?

Q: 7. How can you view (in the developer tools) the HTML of a specific element on a web page?

Q: 8. What is the CSS selector string that would find the element with an id attribute of main?

Q: 9. What is the CSS selector string that would find the elements with a CSS class of highlight?

Q: 10. What is the CSS selector string that would find all the <div> elements inside another <div> element?

Q: 11. What is the CSS selector string that would find the <button> element with a value attribute set to favorite?

Q: 12. Say you have a Beautiful Soup Tag object stored in the variable spam for the element <div>Hello world!</div>. How could you get a string 'Hello world!' from the Tag object?

Q: 13. How would you store all the attributes of a Beautiful Soup Tag object in a variable named linkElem?

Q: 14. Running import selenium doesn’t work. How do you properly import the selenium module?

Q: 15. What’s the difference between the find_element_* and find_elements_* methods?

Q: 16. What methods do Selenium’s WebElement objects have for simulating mouse clicks and keyboard keys?

Q: 17. You could call send_keys(Keys.ENTER) on the Submit button’s WebElement object, but what is an easier way to submit a form with Selenium?

Q: 18. How can you simulate clicking a browser’s Forward, Back, and Refresh buttons with Selenium?

Practice Projects

For practice, write programs to do the following tasks.

Command Line Emailer

Write a program that takes an email address and string of text on the command line and then, using Selenium, logs into your email account and sends an email of the string to the provided address. (You might want to set up a separate email account for this program.)

This would be a nice way to add a notification feature to your programs. You could also write a similar program to send messages from a Facebook or Twitter account.

Image Site Downloader

Write a program that goes to a photo-sharing site like Flickr or Imgur, searches for a category of photos, and then downloads all the resulting images. You could write a program that works with any photo site that has a search feature.

2048

2048 is a simple game where you combine tiles by sliding them up, down, left, or right with the arrow keys. You can actually get a fairly high score by repeatedly sliding in an up, right, down, and left pattern over and over again. Write a program that will open the game at https://gabrielecirulli.github.io/2048/ and keep sending up, right, down, and left keystrokes to automatically play the game.

Link Verification

Write a program that, given the URL of a web page, will attempt to download every linked page on the page. The program should flag any pages that have a 404 “Not Found” status code and print them out as broken links.

[2] The answer is no.

Chapter 12. Working with Excel Spreadsheets

Excel is a popular and powerful spreadsheet application for Windows. The openpyxl module allows your Python programs to read and modify Excel spreadsheet files. For example, you might have the boring task of copying certain data from one spreadsheet and pasting it into another one. Or you might have to go through thousands of rows and pick out just a handful of them to make small edits based on some criteria. Or you might have to look through hundreds of spreadsheets of department budgets, searching for any that are in the red. These are exactly the sort of boring, mindless spreadsheet tasks that Python can do for you.

Although Excel is proprietary software from Microsoft, there are free alternatives that run on Windows, OS X, and Linux. Both LibreOffice Calc and OpenOffice Calc work with Excel’s .xlsx file format for spreadsheets, which means the openpyxl module can work on spreadsheets from these applications as well. You can download the software from https://www.libreoffice.org/ and http://www.openoffice.org/, respectively. Even if you already have Excel installed on your computer, you may find these programs easier to use. The screenshots in this chapter, however, are all from Excel 2010 on Windows 7.

Excel Documents

First, let’s go over some basic definitions: An Excel spreadsheet document is called a workbook. A single workbook is saved in a file with the .xlsx extension. Each workbook can contain multiple sheets (also called worksheets). The sheet the user is currently viewing (or last viewed before closing Excel) is called the active sheet.

Each sheet has columns (addressed by letters starting at A) and rows (addressed by numbers starting at 1). A box at a particular column and row is called a cell. Each cell can contain a number or text value. The grid of cells with data makes up a sheet.

Installing the openpyxl Module

Python does not come with OpenPyXL, so you’ll have to install it. Follow the instructions for installing third-party modules in Appendix A; the name of the module is openpyxl. To test whether it is installed correctly, enter the following into the interactive shell:

>>> import openpyxl

If the module was correctly installed, this should produce no error messages. Remember to import the openpyxl module before running the interactive shell examples in this chapter, or you’ll get a NameError: name 'openpyxl' is not defined error.

This book covers version 2.1.4 of OpenPyXL, but new versions are regularly released by the OpenPyXL team. Don’t worry, though: New versions should stay backward compatible with the instructions in this book for quite some time. If you have a newer version and want to see what additional features may be available to you, you can check out the full documentation for OpenPyXL at http://openpyxl.readthedocs.org/.

Reading Excel Documents

The examples in this chapter will use a spreadsheet named example.xlsx stored in the root folder. You can either create the spreadsheet yourself or download it from http://nostarch.com/automatestuff/. Figure 12-1 shows the tabs for the three default sheets named Sheet1, Sheet2, and Sheet3 that Excel automatically provides for new workbooks. (The number of default sheets created may vary between operating systems and spreadsheet programs.)

Figure 12-1. The tabs for a workbook’s sheets are in the lower-left corner of Excel.

Sheet 1 in the example file should look like Table 12-1. (If you didn’t download example.xlsx from the website, you should enter this data into the sheet yourself.)

Table 12-1. The example.xlsx Spreadsheet

     A                       B               C
1    4/5/2015 1:34:02 PM     Apples          73
2    4/5/2015 3:41:23 AM     Cherries        85
3    4/6/2015 12:46:51 PM    Pears           14
4    4/8/2015 8:59:43 AM     Oranges         52
5    4/10/2015 2:07:00 AM    Apples          152
6    4/10/2015 6:10:37 PM    Bananas         23
7    4/10/2015 2:40:46 AM    Strawberries    98

Now that we have our example spreadsheet, let’s see how we can manipulate it with the openpyxl module.

Opening Excel Documents with OpenPyXL

Once you’ve imported the openpyxl module, you’ll be able to use the openpyxl.load_workbook() function. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> type(wb)
<class 'openpyxl.workbook.workbook.Workbook'>

The openpyxl.load_workbook() function takes in the filename and returns a value of the Workbook data type. This Workbook object represents the Excel file, a bit like how a File object represents an opened text file.

Remember that example.xlsx needs to be in the current working directory in order for you to work with it. You can find out what the current working directory is by importing os and using os.getcwd(), and you can change the current working directory using os.chdir().

Getting Sheets from the Workbook

You can get a list of all the sheet names in the workbook by calling the get_sheet_names() method. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> wb.get_sheet_names()
['Sheet1', 'Sheet2', 'Sheet3']
>>> sheet = wb.get_sheet_by_name('Sheet3')
>>> sheet
<Worksheet "Sheet3">
>>> type(sheet)
<class 'openpyxl.worksheet.worksheet.Worksheet'>
>>> sheet.title
'Sheet3'
>>> anotherSheet = wb.get_active_sheet()
>>> anotherSheet
<Worksheet "Sheet1">

Each sheet is represented by a Worksheet object, which you can obtain by passing the sheet name string to the get_sheet_by_name() workbook method. Finally, you can call the get_active_sheet() method of a Workbook object to get the workbook’s active sheet. The active sheet is the sheet that’s on top when the workbook is opened in Excel. Once you have the Worksheet object, you can get its name from the title attribute.

Getting Cells from the Sheets

Once you have a Worksheet object, you can access a Cell object by its name. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb.get_sheet_by_name('Sheet1')
>>> sheet['A1']
<Cell Sheet1.A1>
>>> sheet['A1'].value
datetime.datetime(2015, 4, 5, 13, 34, 2)
>>> c = sheet['B1']
>>> c.value
'Apples'
>>> 'Row ' + str(c.row) + ', Column ' + c.column + ' is ' + c.value
'Row 1, Column B is Apples'
>>> 'Cell ' + c.coordinate + ' is ' + c.value
'Cell B1 is Apples'
>>> sheet['C1'].value
73

The Cell object has a value attribute that contains, unsurprisingly, the value stored in that cell. Cell objects also have row, column, and coordinate attributes that provide location information for the cell.

Here, accessing the value attribute of our Cell object for cell B1 gives us the string 'Apples'. The row attribute gives us the integer 1, the column attribute gives us 'B', and the coordinate attribute gives us 'B1'.

OpenPyXL will automatically interpret the dates in column A and return them as datetime values rather than strings. The datetime data type is explained further in Chapter 16.
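
Because the value is a regular datetime object, its parts can be read or reformatted with the standard library. This brief sketch builds the same value as cell A1 by hand, so it runs without OpenPyXL or the spreadsheet; the variable names are chosen here for illustration.

```python
import datetime

# The same value that sheet['A1'].value returns in the example above.
cell_value = datetime.datetime(2015, 4, 5, 13, 34, 2)

date_parts = (cell_value.year, cell_value.month, cell_value.day)
as_text = str(cell_value)                    # default string form
date_only = cell_value.strftime('%Y/%m/%d')  # custom format

print(date_parts)
print(as_text)     # 2015-04-05 13:34:02
print(date_only)   # 2015/04/05
```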

Specifying a column by letter can be tricky to program, especially because after column Z, the columns start by using two letters: AA, AB, AC, and so on. As an alternative, you can also get a cell using the sheet’s cell() method and passing integers for its row and column keyword arguments. The first row or column integer is 1, not 0. Continue the interactive shell example by entering the following:

>>> sheet.cell(row=1, column=2)
<Cell Sheet1.B1>
>>> sheet.cell(row=1, column=2).value
'Apples'
>>> for i in range(1, 8, 2):
        print(i, sheet.cell(row=i, column=2).value)

1 Apples
3 Pears
5 Apples
7 Strawberries

As you can see, using the sheet’s cell() method and passing it row=1 and column=2 gets you a Cell object for cell B1, just like specifying sheet['B1'] did. Then, using the cell() method and its keyword arguments, you can write a for loop to print the values of a series of cells.

Say you want to go down column B and print the value in every cell with an odd row number. By passing 2 for the range() function’s “step” parameter, you can get cells from every second row (in this case, all the odd-numbered rows). The for loop’s i variable is passed for the row keyword argument to the cell() method, while 2 is always passed for the column keyword argument. Note that the integer 2, not the string 'B', is passed.
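
The effect of the step argument can be seen without a spreadsheet. In this sketch the list column_b is a hand-typed copy of column B from Table 12-1, and row numbers are converted to list indexes by subtracting 1.

```python
# Column B of the example sheet, typed in by hand for illustration.
column_b = ['Apples', 'Cherries', 'Pears', 'Oranges',
            'Apples', 'Bananas', 'Strawberries']

odd_rows = list(range(1, 8, 2))   # start at 1, stop before 8, step by 2
odd_values = [column_b[row - 1] for row in odd_rows]  # rows are 1-based

for row, value in zip(odd_rows, odd_values):
    print(row, value)
```

The loop prints the same four lines as the interactive shell example: rows 1, 3, 5, and 7.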

You can determine the size of the sheet with the Worksheet object’s get_highest_row() and get_highest_column() methods. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb.get_sheet_by_name('Sheet1')
>>> sheet.get_highest_row()
7
>>> sheet.get_highest_column()
3

Note that the get_highest_column() method returns an integer rather than the letter that appears in Excel.

Converting Between Column Letters and Numbers

To convert from letters to numbers, call the openpyxl.cell.column_index_from_string() function. To convert from numbers to letters, call the openpyxl.cell.get_column_letter() function. Enter the following into the interactive shell:

>>> import openpyxl
>>> from openpyxl.cell import get_column_letter, column_index_from_string
>>> get_column_letter(1)
'A'
>>> get_column_letter(2)
'B'
>>> get_column_letter(27)
'AA'
>>> get_column_letter(900)
'AHP'
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb.get_sheet_by_name('Sheet1')
>>> get_column_letter(sheet.get_highest_column())
'C'
>>> column_index_from_string('A')
1
>>> column_index_from_string('AA')
27

After you import these two functions from the openpyxl.cell module, you can call get_column_letter() and pass it an integer like 27 to figure out what the letter name of the 27th column is. The function column_index_from_string() does the reverse: You pass it the letter name of a column, and it tells you what number that column is. You don’t need to have a workbook loaded to use these functions. If you want, you can load a workbook, get a Worksheet object, and call a Worksheet object method like get_highest_column() to get an integer. Then, you can pass that integer to get_column_letter().
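
To make the lettering scheme concrete, the two conversions can be written in a few lines of plain Python. This is an illustrative sketch only, not OpenPyXL’s implementation; the function names column_letter and column_index are made up here.

```python
def column_letter(index):
    """Convert a 1-based column number to its letter name (1 -> 'A')."""
    if index < 1:
        raise ValueError('column numbers start at 1')
    letters = ''
    while index > 0:
        # Subtract 1 because the letters run A-Z with no zero digit.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord('A') + remainder) + letters
    return letters

def column_index(letters):
    """Convert a column letter name to its 1-based number ('AA' -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord('A') + 1)
    return index

print(column_letter(1), column_letter(27), column_letter(900))
print(column_index('A'), column_index('AA'))
```

The results agree with the interactive shell session above: 27 maps to 'AA', 900 maps to 'AHP', and 'AA' maps back to 27.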

Getting Rows and Columns from the Sheets

You can slice Worksheet objects to get all the Cell objects in a row, column, or rectangular area of the spreadsheet. Then you can loop over all the cells in the slice. Enter the following into the interactive shell:

   >>> import openpyxl
   >>> wb = openpyxl.load_workbook('example.xlsx')
   >>> sheet = wb.get_sheet_by_name('Sheet1')
   >>> tuple(sheet['A1':'C3'])
   ((<Cell Sheet1.A1>, <Cell Sheet1.B1>, <Cell Sheet1.C1>), (<Cell Sheet1.A2>,
   <Cell Sheet1.B2>, <Cell Sheet1.C2>), (<Cell Sheet1.A3>, <Cell Sheet1.B3>,
   <Cell Sheet1.C3>))
➊ >>> for rowOfCellObjects in sheet['A1':'C3']:
➋         for cellObj in rowOfCellObjects:
               print(cellObj.coordinate, cellObj.value)
           print('--- END OF ROW ---')

   A1 2015-04-05 13:34:02
   B1 Apples
   C1 73
   --- END OF ROW ---
   A2 2015-04-05 03:41:23
   B2 Cherries
   C2 85
   --- END OF ROW ---
   A3 2015-04-06 12:46:51
   B3 Pears
   C3 14
   --- END OF ROW ---

Here, we specify that we want the Cell objects in the rectangular area from A1 to C3, and we get a Generator object containing the Cell objects in that area. To help us visualize this Generator object, we can use tuple() on it to display its Cell objects in a tuple.

This tuple contains three tuples: one for each row, from the top of the desired area to the bottom. Each of these three inner tuples contains the Cell objects in one row of our desired area, from the leftmost cell to the right. So overall, our slice of the sheet contains all the Cell objects in the area from A1 to C3, starting from the top-left cell and ending with the bottom-right cell.

To print the values of each cell in the area, we use two for loops. The outer for loop goes over each row in the slice ➊. Then, for each row, the nested for loop goes through each cell in that row ➋.

Toaccessthevaluesofcellsinaparticularroworcolumn,youcanalsouseaWorksheetobject’srowsandcolumnsattribute.Enterthefollowingintotheinteractiveshell:

>>>importopenpyxl

>>>wb=openpyxl.load_workbook('example.xlsx')

>>>sheet=wb.get_active_sheet()

>>>sheet.columns[1]

(<CellSheet1.B1>,<CellSheet1.B2>,<CellSheet1.B3>,<CellSheet1.B4>,

<CellSheet1.B5>,<CellSheet1.B6>,<CellSheet1.B7>)

>>>forcellObjinsheet.columns[1]:

print(cellObj.value)

Apples

Cherries

Pears

Oranges

Apples

Bananas

Strawberries

Using the rows attribute on a Worksheet object will give you a tuple of tuples. Each of these inner tuples represents a row, and contains the Cell objects in that row. The columns attribute also gives you a tuple of tuples, with each of the inner tuples containing the Cell objects in a particular column. For example.xlsx, since there are 7 rows and 3 columns, rows gives us a tuple of 7 tuples (each containing 3 Cell objects), and columns gives us a tuple of 3 tuples (each containing 7 Cell objects).

To access one particular tuple, you can refer to it by its index in the larger tuple. For example, to get the tuple that represents column B, you use sheet.columns[1]. To get the tuple containing the Cell objects in column A, you’d use sheet.columns[0]. Once you have a tuple representing one row or column, you can loop through its Cell objects and print their values.
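The relationship between the row-wise and column-wise views can be sketched without a spreadsheet at all. The snippet below is only an illustration: it uses plain values in place of Cell objects, and the names rows, columns, and column_b are made up for this sketch rather than taken from openpyxl.

```python
# Stand-in data (an assumption for illustration): the values from the first
# three rows of example.xlsx, stored as a tuple of row tuples.
rows = (
    ('2015-04-05 13:34:02', 'Apples', 73),
    ('2015-04-05 03:41:23', 'Cherries', 85),
    ('2015-04-06 12:46:51', 'Pears', 14),
)

# zip(*rows) regroups the same values column by column, giving the
# "tuple of column tuples" view of the same data.
columns = tuple(zip(*rows))

# Index 0 is column A, index 1 is column B, and so on.
column_b = columns[1]
for value in column_b:
    print(value)
```

With three rows of three values, both views hold three inner tuples; with the full 7-row example.xlsx, the row view would hold 7 tuples of 3 and the column view 3 tuples of 7.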

Workbooks, Sheets, Cells

As a quick review, here’s a rundown of all the functions, methods, and data types involved in reading a cell out of a spreadsheet file:

1. Import the openpyxl module.
2. Call the openpyxl.load_workbook() function.
3. Get a Workbook object.
4. Call the get_active_sheet() or get_sheet_by_name() workbook method.
5. Get a Worksheet object.
6. Use indexing or the cell() sheet method with row and column keyword arguments.
7. Get a Cell object.
8. Read the Cell object’s value attribute.

Project: Reading Data from a Spreadsheet

Say you have a spreadsheet of data from the 2010 US Census and you have the boring task of going through its thousands of rows to count both the total population and the number of census tracts for each county. (A census tract is simply a geographic area defined for the purposes of the census.) Each row represents a single census tract. We’ll name the spreadsheet file censuspopdata.xlsx, and you can download it from http://nostarch.com/automatestuff/. Its contents look like Figure 12-2.

Figure 12-2. The censuspopdata.xlsx spreadsheet

Even though Excel can calculate the sum of multiple selected cells, you’d still have to select the cells for each of the 3,000-plus counties. Even if it takes just a few seconds to calculate a county’s population by hand, this would take hours to do for the whole spreadsheet.

In this project, you’ll write a script that can read from the census spreadsheet file and calculate statistics for each county in a matter of seconds.

This is what your program does:

• Reads the data from the Excel spreadsheet.
• Counts the number of census tracts in each county.
• Counts the total population of each county.
• Prints the results.

This means your code will need to do the following:

• Open and read the cells of an Excel document with the openpyxl module.
• Calculate all the tract and population data and store it in a data structure.
• Write the data structure to a text file with the .py extension using the pprint module.

Step 1: Read the Spreadsheet Data

There is just one sheet in the censuspopdata.xlsx spreadsheet, named 'Population by Census Tract', and each row holds the data for a single census tract. The columns are the tract number (A), the state abbreviation (B), the county name (C), and the population of the tract (D).

Open a new file editor window and enter the following code. Save the file as readCensusExcel.py.

#! python3
# readCensusExcel.py - Tabulates population and number of census tracts for
# each county.

import openpyxl, pprint  # ➊
print('Opening workbook...')
wb = openpyxl.load_workbook('censuspopdata.xlsx')  # ➋
sheet = wb.get_sheet_by_name('Population by Census Tract')  # ➌
countyData = {}

# TODO: Fill in countyData with each county's population and tracts.
print('Reading rows...')
for row in range(2, sheet.get_highest_row() + 1):  # ➍
    # Each row in the spreadsheet has data for one census tract.
    state = sheet['B' + str(row)].value
    county = sheet['C' + str(row)].value
    pop = sheet['D' + str(row)].value

# TODO: Open a new text file and write the contents of countyData to it.

This code imports the openpyxl module, as well as the pprint module that you’ll use to print the final county data ➊. Then it opens the censuspopdata.xlsx file ➋, gets the sheet with the census data ➌, and begins iterating over its rows ➍.
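The loop bounds deserve a second look: range() stops one short of its second argument, which is why the loop adds 1 to the highest row number, and it starts at 2 to skip the header row. The sketch below shows the coordinate strings such a loop would build; highest_row is a hypothetical stand-in for the value the sheet would report, not something read from a real file.

```python
# Hypothetical stand-in: a sheet with one header row and four data rows.
highest_row = 5

# range(2, highest_row + 1) yields 2, 3, 4, 5: every data row, header skipped.
coordinates = ['B' + str(row) for row in range(2, highest_row + 1)]
print(coordinates)

# Without the "+ 1", the last data row would be silently left out.
too_short = ['B' + str(row) for row in range(2, highest_row)]
print(too_short)
```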

Note that you’ve also created a variable named countyData, which will contain the populations and number of tracts you calculate for each county. Before you can store anything in it, though, you should determine exactly how you’ll structure the data inside it.

Step 2: Populate the Data Structure

The data structure stored in countyData will be a dictionary with state abbreviations as its keys. Each state abbreviation will map to another dictionary, whose keys are strings of the county names in that state. Each county name will in turn map to a dictionary with just two keys, 'tracts' and 'pop'. These keys map to the number of census tracts and population for the county. For example, the dictionary will look similar to this:

{'AK': {'Aleutians East': {'pop': 3141, 'tracts': 1},
        'Aleutians West': {'pop': 5561, 'tracts': 2},
        'Anchorage': {'pop': 291826, 'tracts': 55},
        'Bethel': {'pop': 17013, 'tracts': 3},
        'Bristol Bay': {'pop': 997, 'tracts': 1},
        --snip--

If the previous dictionary were stored in countyData, the following expressions would evaluate like this:

>>> countyData['AK']['Anchorage']['pop']
291826
>>> countyData['AK']['Anchorage']['tracts']
55

More generally, the countyData dictionary’s keys will look like this:

countyData[state abbrev][county]['tracts']
countyData[state abbrev][county]['pop']

Now that you know how countyData will be structured, you can write the code that will fill it with the county data. Add the following code to the bottom of your program:

#! python3
# readCensusExcel.py - Tabulates population and number of census tracts for
# each county.

--snip--

for row in range(2, sheet.get_highest_row() + 1):
    # Each row in the spreadsheet has data for one census tract.
    state = sheet['B' + str(row)].value
    county = sheet['C' + str(row)].value
    pop = sheet['D' + str(row)].value

    # Make sure the key for this state exists.
    countyData.setdefault(state, {})  # ➊
    # Make sure the key for this county in this state exists.
    countyData[state].setdefault(county, {'tracts': 0, 'pop': 0})  # ➋

    # Each row represents one census tract, so increment by one.
    countyData[state][county]['tracts'] += 1  # ➌
    # Increase the county pop by the pop in this census tract.
    countyData[state][county]['pop'] += int(pop)  # ➍

# TODO: Open a new text file and write the contents of countyData to it.

The last two lines of code perform the actual calculation work, incrementing the value for tracts ➌ and increasing the value for pop ➍ for the current county on each iteration of the for loop.

The other code is there because you cannot add a county dictionary as the value for a state abbreviation key until the key itself exists in countyData. (That is, countyData['AK']['Anchorage']['tracts'] += 1 will cause an error if the 'AK' key doesn’t exist yet.) To make sure the state abbreviation key exists in your data structure, you need to call the setdefault() method to set a value if one does not already exist for state ➊.

Just as the countyData dictionary needs a dictionary as the value for each state abbreviation key, each of those dictionaries will need its own dictionary as the value for each county key ➋. And each of those dictionaries in turn will need keys 'tracts' and 'pop' that start with the integer value 0. (If you ever lose track of the dictionary structure, look back at the example dictionary at the start of this section.)

Since setdefault() will do nothing if the key already exists, you can call it on every iteration of the for loop without a problem.
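To watch the setdefault() pattern work without opening the census file, here is a minimal sketch that runs the same four statements over a handful of rows. The rows in tract_rows are invented for illustration and are not real census figures.

```python
# Invented sample rows (state, county, tract population); not real census data.
tract_rows = [
    ('AK', 'Anchorage', 1200),
    ('AK', 'Anchorage', 800),
    ('AK', 'Bethel', 500),
    ('AL', 'Autauga', 300),
]

county_data = {}
for state, county, pop in tract_rows:
    # Creates the state's dictionary only the first time the state is seen.
    county_data.setdefault(state, {})
    # Creates the county's counters only the first time the county is seen.
    county_data[state].setdefault(county, {'tracts': 0, 'pop': 0})
    # Safe now: both levels of keys are guaranteed to exist.
    county_data[state][county]['tracts'] += 1
    county_data[state][county]['pop'] += int(pop)

print(county_data['AK']['Anchorage'])
```

The second Anchorage row reaches setdefault() when the keys already exist, so the calls leave the running totals alone and only the += lines change anything.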

Step 3: Write the Results to a File

After the for loop has finished, the countyData dictionary will contain all of the population and tract information keyed by county and state. At this point, you could program more code to write this to a text file or another Excel spreadsheet. For now, let’s just use the pprint.pformat() function to write the countyData dictionary value as a massive string to a file named census2010.py. Add the following code to the bottom of your program (making sure to keep it unindented so that it stays outside the for loop):

#! python3
# readCensusExcel.py - Tabulates population and number of census tracts for
# each county.

--snip--

for row in range(2, sheet.get_highest_row() + 1):
--snip--

# Open a new text file and write the contents of countyData to it.
print('Writing results...')
resultFile = open('census2010.py', 'w')
resultFile.write('allData = ' + pprint.pformat(countyData))
resultFile.close()
print('Done.')

The pprint.pformat() function produces a string that itself is formatted as valid Python code. By outputting it to a text file named census2010.py, you’ve generated a Python program from your Python program! This may seem complicated, but the advantage is that you can now import census2010.py just like any other Python module. In the interactive shell, change the current working directory to the folder with your newly created census2010.py file (on my laptop, this is C:\Python34), and then import it:

>>> import os
>>> os.chdir('C:\\Python34')
>>> import census2010
>>> census2010.allData['AK']['Anchorage']
{'pop': 291826, 'tracts': 55}
>>> anchoragePop = census2010.allData['AK']['Anchorage']['pop']
>>> print('The 2010 population of Anchorage was ' + str(anchoragePop))
The 2010 population of Anchorage was 291826
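The claim that pprint.pformat() produces valid Python can be checked in memory, without writing a file. The sketch below is an illustration only: it uses ast.literal_eval() as a stand-in for the import step (the project itself imports the generated file), and county_data holds just the single Anchorage entry shown above.

```python
import ast
import pprint

# A one-county slice of the data structure, using the Anchorage figures above.
county_data = {'AK': {'Anchorage': {'pop': 291826, 'tracts': 55}}}

# pformat() returns the dictionary as a string of Python source code.
formatted = pprint.pformat(county_data)

# This is the text the project would write to census2010.py.
module_source = 'allData = ' + formatted
print(module_source)

# Parsing the string back yields an equal dictionary, which is why importing
# the generated file works.
restored = ast.literal_eval(formatted)
print(restored == county_data)
```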

The readCensusExcel.py program was throwaway code: Once you have its results saved to census2010.py, you won’t need to run the program again. Whenever you need the county data, you can just run import census2010.

Calculating this data by hand would have taken hours; this program did it in a few seconds. Using OpenPyXL, you will have no trouble extracting information that is saved to an Excel spreadsheet and performing calculations on it. You can download the complete program from http://nostarch.com/automatestuff/.

Ideas for Similar Programs

Many businesses and offices use Excel to store various types of data, and it’s not uncommon for spreadsheets to become large and unwieldy. Any program that parses an Excel spreadsheet has a similar structure: It loads the spreadsheet file, preps some variables or data structures, and then loops through each of the rows in the spreadsheet. Such a program could do the following:

• Compare data across multiple rows in a spreadsheet.
• Open multiple Excel files and compare data between spreadsheets.
• Check whether a spreadsheet has blank rows or invalid data in any cells and alert the user if it does.
• Read data from a spreadsheet and use it as the input for your Python programs.

Writing Excel Documents

OpenPyXL also provides ways of writing data, meaning that your programs can create and edit spreadsheet files. With Python, it’s simple to create spreadsheets with thousands of rows of data.

Creating and Saving Excel Documents

Call the openpyxl.Workbook() function to create a new, blank Workbook object. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> wb.get_sheet_names()
['Sheet']
>>> sheet = wb.get_active_sheet()
>>> sheet.title
'Sheet'
>>> sheet.title = 'Spam Bacon Eggs Sheet'
>>> wb.get_sheet_names()
['Spam Bacon Eggs Sheet']

The workbook will start off with a single sheet named Sheet. You can change the name of the sheet by storing a new string in its title attribute.

Any time you modify the Workbook object or its sheets and cells, the spreadsheet file will not be saved until you call the save() workbook method. Enter the following into the interactive shell (with example.xlsx in the current working directory):

>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> sheet = wb.get_active_sheet()
>>> sheet.title = 'Spam Spam Spam'
>>> wb.save('example_copy.xlsx')

Here, we change the name of our sheet. To save our changes, we pass a filename as a string to the save() method. Passing a different filename than the original, such as 'example_copy.xlsx', saves the changes to a copy of the spreadsheet.

Whenever you edit a spreadsheet you’ve loaded from a file, you should always save the new, edited spreadsheet to a different filename than the original. That way, you’ll still have the original spreadsheet file to work with in case a bug in your code caused the new, saved file to have incorrect or corrupt data.

Creating and Removing Sheets

Sheets can be added to and removed from a workbook with the create_sheet() and remove_sheet() methods. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> wb.get_sheet_names()
['Sheet']
>>> wb.create_sheet()
<Worksheet "Sheet1">
>>> wb.get_sheet_names()
['Sheet', 'Sheet1']
>>> wb.create_sheet(index=0, title='First Sheet')
<Worksheet "First Sheet">
>>> wb.get_sheet_names()
['First Sheet', 'Sheet', 'Sheet1']
>>> wb.create_sheet(index=2, title='Middle Sheet')
<Worksheet "Middle Sheet">
>>> wb.get_sheet_names()
['First Sheet', 'Sheet', 'Middle Sheet', 'Sheet1']

The create_sheet() method returns a new Worksheet object named SheetX, which by default is set to be the last sheet in the workbook. Optionally, the index and name of the new sheet can be specified with the index and title keyword arguments.

Continue the previous example by entering the following:

>>> wb.get_sheet_names()
['First Sheet', 'Sheet', 'Middle Sheet', 'Sheet1']
>>> wb.remove_sheet(wb.get_sheet_by_name('Middle Sheet'))
>>> wb.remove_sheet(wb.get_sheet_by_name('Sheet1'))
>>> wb.get_sheet_names()
['First Sheet', 'Sheet']

The remove_sheet() method takes a Worksheet object, not a string of the sheet name, as its argument. If you know only the name of a sheet you want to remove, call get_sheet_by_name() and pass its return value into remove_sheet().

Remember to call the save() method to save the changes after adding sheets to or removing sheets from the workbook.

Writing Values to Cells

Writing values to cells is much like writing values to keys in a dictionary. Enter this into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_sheet_by_name('Sheet')
>>> sheet['A1'] = 'Hello world!'
>>> sheet['A1'].value
'Hello world!'

If you have the cell’s coordinate as a string, you can use it just like a dictionary key on the Worksheet object to specify which cell to write to.

Project: Updating a Spreadsheet

In this project, you’ll write a program to update cells in a spreadsheet of produce sales. Your program will look through the spreadsheet, find specific kinds of produce, and update their prices. Download this spreadsheet from http://nostarch.com/automatestuff/. Figure 12-3 shows what the spreadsheet looks like.

Figure 12-3. A spreadsheet of produce sales

Each row represents an individual sale. The columns are the type of produce sold (A), the cost per pound of that produce (B), the number of pounds sold (C), and the total revenue from the sale (D). The TOTAL column is set to the Excel formula =ROUND(B3*C3, 2), which multiplies the cost per pound by the number of pounds sold and rounds the result to the nearest cent. With this formula, the cells in the TOTAL column will automatically update themselves if there is a change in column B or C.

Now imagine that the prices of garlic, celery, and lemons were entered incorrectly, leaving you with the boring task of going through thousands of rows in this spreadsheet to update the cost per pound for any garlic, celery, and lemon rows. You can’t do a simple find-and-replace for the price because there might be other items with the same price that you don’t want to mistakenly “correct.” For thousands of rows, this would take hours to do by hand. But you can write a program that can accomplish this in seconds.

Your program does the following:

• Loops over all the rows.
• If the row is for garlic, celery, or lemons, changes the price.

This means your code will need to do the following:

• Open the spreadsheet file.
• For each row, check whether the value in column A is Celery, Garlic, or Lemon.
• If it is, update the price in column B.
• Save the spreadsheet to a new file (so that you don’t lose the old spreadsheet, just in case).

Step 1: Set Up a Data Structure with the Update Information

The prices that you need to update are as follows:

Celery    1.19
Garlic    3.07
Lemon     1.27

You could write code like this:

if produceName == 'Celery':
    cellObj = 1.19
if produceName == 'Garlic':
    cellObj = 3.07
if produceName == 'Lemon':
    cellObj = 1.27

Having the produce and updated price data hardcoded like this is a bit inelegant. If you needed to update the spreadsheet again with different prices or different produce, you would have to change a lot of the code. Every time you change code, you risk introducing bugs.

A more flexible solution is to store the corrected price information in a dictionary and write your code to use this data structure. In a new file editor window, enter the following code:

#! python3
# updateProduce.py - Corrects costs in produce sales spreadsheet.

import openpyxl

wb = openpyxl.load_workbook('produceSales.xlsx')
sheet = wb.get_sheet_by_name('Sheet')

# The produce types and their updated prices
PRICE_UPDATES = {'Garlic': 3.07,
                 'Celery': 1.19,
                 'Lemon': 1.27}

# TODO: Loop through the rows and update the prices.

Save this as updateProduce.py. If you need to update the spreadsheet again, you’ll need to update only the PRICE_UPDATES dictionary, not any other code.

Step 2: Check All Rows and Update Incorrect Prices

The next part of the program will loop through all the rows in the spreadsheet. Add the following code to the bottom of updateProduce.py:

#! python3
# updateProduce.py - Corrects costs in produce sales spreadsheet.

--snip--

# Loop through the rows and update the prices.
# The + 1 makes range() include the last row of the sheet.
for rowNum in range(2, sheet.get_highest_row() + 1):  # skip the first row ➊
    produceName = sheet.cell(row=rowNum, column=1).value  # ➋
    if produceName in PRICE_UPDATES:  # ➌
        sheet.cell(row=rowNum, column=2).value = PRICE_UPDATES[produceName]

wb.save('updatedProduceSales.xlsx')  # ➍

We loop through the rows starting at row 2, since row 1 is just the header ➊. The cell in column 1 (that is, column A) will be stored in the variable produceName ➋. If produceName exists as a key in the PRICE_UPDATES dictionary ➌, then you know this is a row that must have its price corrected. The correct price will be in PRICE_UPDATES[produceName].

Notice how clean using PRICE_UPDATES makes the code. Only one if statement, rather than code like if produceName == 'Garlic':, is necessary for every type of produce to update. And since the code uses the PRICE_UPDATES dictionary instead of hardcoding the produce names and updated costs into the for loop, you modify only the PRICE_UPDATES dictionary and not the code if the produce sales spreadsheet needs additional changes.
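The dictionary lookup can be tried on its own, separate from any spreadsheet. In the sketch below, sales_rows is a made-up list standing in for columns A and B of the sheet; the produce names and starting prices in it are assumptions for illustration, while PRICE_UPDATES matches the dictionary used in the project.

```python
# The corrected prices from the project.
PRICE_UPDATES = {'Garlic': 3.07,
                 'Celery': 1.19,
                 'Lemon': 1.27}

# Made-up stand-in for the sheet: [produce name, cost per pound] per row.
sales_rows = [
    ['Potatoes', 0.86],
    ['Garlic', 3.70],
    ['Lemon', 1.72],
    ['Okra', 2.26],
]

for row in sales_rows:
    produce_name = row[0]
    # One membership test covers every kind of produce in the dictionary.
    if produce_name in PRICE_UPDATES:
        row[1] = PRICE_UPDATES[produce_name]

print(sales_rows)
```

Adding a fourth corrected price means adding one entry to PRICE_UPDATES; the loop itself stays the same.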

After going through the entire spreadsheet and making changes, the code saves the Workbook object to updatedProduceSales.xlsx ➍. It doesn’t overwrite the old spreadsheet just in case there’s a bug in your program and the updated spreadsheet is wrong. After checking that the updated spreadsheet looks right, you can delete the old spreadsheet.

You can download the complete source code for this program from http://nostarch.com/automatestuff/.

Ideas for Similar Programs

Since many office workers use Excel spreadsheets all the time, a program that can automatically edit and write Excel files could be really useful. Such a program could do the following:

• Read data from one spreadsheet and write it to parts of other spreadsheets.
• Read data from websites, text files, or the clipboard and write it to a spreadsheet.
• Automatically “clean up” data in spreadsheets. For example, it could use regular expressions to read multiple formats of phone numbers and edit them to a single, standard format.

Setting the Font Style of Cells

Styling certain cells, rows, or columns can help you emphasize important areas in your spreadsheet. In the produce spreadsheet, for example, your program could apply bold text to the potato, garlic, and parsnip rows. Or perhaps you want to italicize every row with a cost per pound greater than $5. Styling parts of a large spreadsheet by hand would be tedious, but your programs can do it instantly.

To customize font styles in cells, import the Font() and Style() functions from the openpyxl.styles module.

from openpyxl.styles import Font, Style

This allows you to type Font() instead of openpyxl.styles.Font(). (See Importing Modules to review this style of import statement.)

Here’s an example that creates a new workbook and sets cell A1 to have a 24-point, italicized font. Enter the following into the interactive shell:

>>> import openpyxl
>>> from openpyxl.styles import Font, Style
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_sheet_by_name('Sheet')
>>> italic24Font = Font(size=24, italic=True)  # ➊
>>> styleObj = Style(font=italic24Font)  # ➋
>>> sheet['A1'].style = styleObj  # ➌
>>> sheet['A1'] = 'Hello world!'
>>> wb.save('styled.xlsx')

OpenPyXL represents the collection of style settings for a cell with a Style object, which is stored in the Cell object’s style attribute. A cell’s style can be set by assigning the Style object to the style attribute.

In this example, Font(size=24, italic=True) returns a Font object, which is stored in italic24Font ➊. The keyword arguments to Font(), size and italic, configure the Font object’s style attributes. This Font object is then passed into the Style(font=italic24Font) call, which returns the value you stored in styleObj ➋. And when styleObj is assigned to the cell’s style attribute ➌, all that font styling information gets applied to cell A1.

Font Objects

The style attributes in Font objects affect how the text in cells is displayed. To set font style attributes, you pass keyword arguments to Font(). Table 12-2 shows the possible keyword arguments for the Font() function.

Table 12-2. Keyword Arguments for Font style Attributes

Keyword argument    Data type    Description
name                String       The font name, such as 'Calibri' or 'Times New Roman'
size                Integer      The point size
bold                Boolean      True, for bold font
italic              Boolean      True, for italic font

You can call Font() to create a Font object and store that Font object in a variable. You then pass that to Style(), store the resulting Style object in a variable, and assign that variable to a Cell object’s style attribute. For example, this code creates various font styles:

>>> import openpyxl
>>> from openpyxl.styles import Font, Style
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_sheet_by_name('Sheet')

>>> fontObj1 = Font(name='Times New Roman', bold=True)
>>> styleObj1 = Style(font=fontObj1)
>>> sheet['A1'].style = styleObj1
>>> sheet['A1'] = 'Bold Times New Roman'

>>> fontObj2 = Font(size=24, italic=True)
>>> styleObj2 = Style(font=fontObj2)
>>> sheet['B3'].style = styleObj2
>>> sheet['B3'] = '24 pt Italic'

>>> wb.save('styles.xlsx')

Here, we store a Font object in fontObj1 and use it to create a Style object, which we store in styleObj1, and then set the A1 Cell object’s style attribute to styleObj1. We repeat the process with another Font object and Style object to set the style of a second cell. After you run this code, the styles of the A1 and B3 cells in the spreadsheet will be set to custom font styles, as shown in Figure 12-4.

Figure 12-4. A spreadsheet with custom font styles

For cell A1, we set the font name to 'Times New Roman' and set bold to True, so our text appears in bold Times New Roman. We didn’t specify a size, so the openpyxl default, 11, is used. In cell B3, our text is italic, with a size of 24; we didn’t specify a font name, so the openpyxl default, Calibri, is used.

Formulas

Formulas, which begin with an equal sign, can configure cells to contain values calculated from other cells. In this section, you’ll use the openpyxl module to programmatically add formulas to cells, just like any normal value. For example:

>>> sheet['B9'] = '=SUM(B1:B8)'

This will store =SUM(B1:B8) as the value in cell B9. This sets the B9 cell to a formula that calculates the sum of values in cells B1 to B8. You can see this in action in Figure 12-5.

Figure 12-5. Cell B9 contains the formula =SUM(B1:B8), which adds the cells B1 to B8.

A formula is set just like any other text value in a cell. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_active_sheet()
>>> sheet['A1'] = 200
>>> sheet['A2'] = 300
>>> sheet['A3'] = '=SUM(A1:A2)'
>>> wb.save('writeFormula.xlsx')

The cells in A1 and A2 are set to 200 and 300, respectively. The value in cell A3 is set to a formula that sums the values in A1 and A2. When the spreadsheet is opened in Excel, A3 will display its value as 500.

You can also read the formula in a cell just as you would any value. However, if you want to see the result of the calculation for the formula instead of the literal formula, you must pass True for the data_only keyword argument to load_workbook(). This means a Workbook object can show either the formulas or the result of the formulas but not both. (But you can have multiple Workbook objects loaded for the same spreadsheet file.) Enter the following into the interactive shell to see the difference between loading a workbook with and without the data_only keyword argument:

>>> import openpyxl
>>> wbFormulas = openpyxl.load_workbook('writeFormula.xlsx')
>>> sheet = wbFormulas.get_active_sheet()
>>> sheet['A3'].value
'=SUM(A1:A2)'

>>> wbDataOnly = openpyxl.load_workbook('writeFormula.xlsx', data_only=True)
>>> sheet = wbDataOnly.get_active_sheet()
>>> sheet['A3'].value
500

Here, when load_workbook() is called with data_only=True, the A3 cell shows 500, the result of the =SUM(A1:A2) formula, rather than the text of the formula.

Excel formulas offer a level of programmability for spreadsheets but can quickly become unmanageable for complicated tasks. For example, even if you’re deeply familiar with Excel formulas, it’s a headache to try to decipher what =IFERROR(TRIM(IF(LEN(VLOOKUP(F7, Sheet2!$A$1:$B$10000, 2, FALSE))>0,SUBSTITUTE(VLOOKUP(F7, Sheet2!$A$1:$B$10000, 2, FALSE), " ", ""),"")), "") actually does. Python code is much more readable.

Adjusting Rows and Columns

In Excel, adjusting the sizes of rows and columns is as easy as clicking and dragging the edges of a row or column header. But if you need to set a row or column’s size based on its cells’ contents or if you want to set sizes in a large number of spreadsheet files, it will be much quicker to write a Python program to do it.

Rows and columns can also be hidden entirely from view. Or they can be “frozen” in place so that they are always visible on the screen and appear on every page when the spreadsheet is printed (which is handy for headers).

Setting Row Height and Column Width

Worksheet objects have row_dimensions and column_dimensions attributes that control row heights and column widths. Enter this into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_active_sheet()
>>> sheet['A1'] = 'Tall row'
>>> sheet['B2'] = 'Wide column'
>>> sheet.row_dimensions[1].height = 70
>>> sheet.column_dimensions['B'].width = 20
>>> wb.save('dimensions.xlsx')

A sheet’s row_dimensions and column_dimensions are dictionary-like values; row_dimensions contains RowDimension objects and column_dimensions contains ColumnDimension objects. In row_dimensions, you can access one of the objects using the number of the row (in this case, 1 or 2). In column_dimensions, you can access one of the objects using the letter of the column (in this case, A or B).

The dimensions.xlsx spreadsheet looks like Figure 12-6.

Figure 12-6. Row 1 and column B set to larger heights and widths

Once you have the RowDimension object, you can set its height. Once you have the ColumnDimension object, you can set its width. The row height can be set to an integer or float value between 0 and 409. This value represents the height measured in points, where one point equals 1/72 of an inch. The default row height is 12.75. The column width can be set to an integer or float value between 0 and 255. This value represents the number of characters at the default font size (11 point) that can be displayed in the cell. The default column width is 8.43 characters. Columns with widths of 0 or rows with heights of 0 are hidden from the user.
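Since row heights are measured in points, a little arithmetic shows how tall a row will actually be. The helper below is a hypothetical sketch, not part of openpyxl: it applies the 1/72-inch conversion and the 0 to 409 limit described above.

```python
POINTS_PER_INCH = 72     # one point equals 1/72 of an inch
MAX_ROW_HEIGHT = 409     # largest row height, in points, per the text above


def row_height_in_inches(points):
    """Convert a row height in points to inches (hypothetical helper)."""
    if not 0 <= points <= MAX_ROW_HEIGHT:
        raise ValueError('row height must be between 0 and 409 points')
    return points / POINTS_PER_INCH


# The 70-point row from dimensions.xlsx is just under an inch tall.
print(round(row_height_in_inches(70), 3))
# The default 12.75-point row is under a fifth of an inch.
print(round(row_height_in_inches(12.75), 3))
```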

Merging and Unmerging Cells

A rectangular area of cells can be merged into a single cell with the merge_cells() sheet method. Enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_active_sheet()
>>> sheet.merge_cells('A1:D3')
>>> sheet['A1'] = 'Twelve cells merged together.'
>>> sheet.merge_cells('C5:D5')
>>> sheet['C5'] = 'Two merged cells.'
>>> wb.save('merged.xlsx')

The argument to merge_cells() is a single string of the top-left and bottom-right cells of the rectangular area to be merged: 'A1:D3' merges 12 cells into a single cell. To set the value of these merged cells, simply set the value of the top-left cell of the merged group.
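To see where the count of 12 comes from, the range string can be taken apart by hand. The sketch below uses only the standard library; column_index() and cells_in_range() are hypothetical helpers written for this illustration, not openpyxl functions.

```python
import re


def column_index(letters):
    """Convert column letters such as 'A' or 'AB' to a 1-based number."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord('A') + 1)
    return index


def cells_in_range(range_string):
    """Count the cells covered by a range string such as 'A1:D3'."""
    match = re.fullmatch(r'([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)', range_string)
    if match is None:
        raise ValueError('not a cell range: ' + repr(range_string))
    left, top, right, bottom = match.groups()
    width = column_index(right) - column_index(left) + 1
    height = int(bottom) - int(top) + 1
    return width * height


print(cells_in_range('A1:D3'))   # 4 columns x 3 rows
print(cells_in_range('C5:D5'))   # 2 columns x 1 row
```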

When you run this code, merged.xlsx will look like Figure 12-7.

Figure 12-7. Merged cells in a spreadsheet

To unmerge cells, call the unmerge_cells() sheet method. Enter this into the interactive shell.

>>> import openpyxl
>>> wb = openpyxl.load_workbook('merged.xlsx')
>>> sheet = wb.get_active_sheet()
>>> sheet.unmerge_cells('A1:D3')
>>> sheet.unmerge_cells('C5:D5')
>>> wb.save('merged.xlsx')

If you save your changes and then take a look at the spreadsheet, you’ll see that the merged cells have gone back to being individual cells.

Freeze Panes

For spreadsheets too large to be displayed all at once, it’s helpful to “freeze” a few of the top rows or leftmost columns onscreen. Frozen column or row headers, for example, are always visible to the user even as they scroll through the spreadsheet. These are known as freeze panes. In OpenPyXL, each Worksheet object has a freeze_panes attribute that can be set to a Cell object or a string of a cell’s coordinates. Note that all rows above and all columns to the left of this cell will be frozen, but the row and column of the cell itself will not be frozen.

To unfreeze all panes, set freeze_panes to None or 'A1'. Table 12-3 shows which rows and columns will be frozen for some example settings of freeze_panes.

Table 12-3. Frozen Pane Examples

freeze_panes setting                                       Rows and columns frozen
sheet.freeze_panes = 'A2'                                  Row 1
sheet.freeze_panes = 'B1'                                  Column A
sheet.freeze_panes = 'C1'                                  Columns A and B
sheet.freeze_panes = 'C2'                                  Row 1 and columns A and B
sheet.freeze_panes = 'A1' or sheet.freeze_panes = None     No frozen panes
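The rule behind Table 12-3 (everything above and to the left of the named cell is frozen) can be written as a small calculation. The function below is a hypothetical sketch for checking your understanding; it does not call openpyxl, and the names frozen_counts and column_index are made up for this example.

```python
import re


def column_index(letters):
    """Convert column letters such as 'A' or 'AB' to a 1-based number."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord('A') + 1)
    return index


def frozen_counts(freeze_panes):
    """Return (frozen rows, frozen columns) for a freeze_panes setting."""
    if freeze_panes is None:
        return (0, 0)
    match = re.fullmatch(r'([A-Za-z]+)(\d+)', freeze_panes)
    if match is None:
        raise ValueError('not a cell coordinate: ' + repr(freeze_panes))
    letters, row = match.groups()
    # Rows above the cell and columns to its left are frozen; the cell's own
    # row and column are not.
    return (int(row) - 1, column_index(letters) - 1)


for setting in ('A2', 'B1', 'C1', 'C2', 'A1', None):
    print(setting, frozen_counts(setting))
```

For 'C2' this gives one frozen row and two frozen columns, matching the "Row 1 and columns A and B" entry in the table.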

Make sure you have the produce sales spreadsheet from http://nostarch.com/automatestuff/. Then enter the following into the interactive shell:

>>> import openpyxl
>>> wb = openpyxl.load_workbook('produceSales.xlsx')
>>> sheet = wb.get_active_sheet()
>>> sheet.freeze_panes = 'A2'
>>> wb.save('freezeExample.xlsx')

If you set the freeze_panes attribute to 'A2', row 1 will always be viewable, no matter where the user scrolls in the spreadsheet. You can see this in Figure 12-8.

Figure 12-8. With freeze_panes set to 'A2', row 1 is always visible even as the user scrolls down.

Charts

OpenPyXL supports creating bar, line, scatter, and pie charts using the data in a sheet’s cells. To make a chart, you need to do the following:

1. Create a Reference object from a rectangular selection of cells.
2. Create a Series object by passing in the Reference object.
3. Create a Chart object.
4. Append the Series object to the Chart object.
5. Optionally, set the drawing.top, drawing.left, drawing.width, and drawing.height variables of the Chart object.
6. Add the Chart object to the Worksheet object.

The Reference object requires some explaining. Reference objects are created by calling the openpyxl.charts.Reference() function and passing three arguments:

1. The Worksheet object containing your chart data.
2. A tuple of two integers, representing the top-left cell of the rectangular selection of cells containing your chart data: The first integer in the tuple is the row, and the second is the column. Note that 1 is the first row, not 0.
3. A tuple of two integers, representing the bottom-right cell of the rectangular selection of cells containing your chart data: The first integer in the tuple is the row, and the second is the column.

Figure 12-9 shows some sample coordinate arguments.

Figure 12-9. From left to right: (1, 1), (10, 1); (3, 2), (6, 4); (5, 3), (5, 3)
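Because these tuples put the row first and the column second, it can help to translate them into the familiar letter-and-number coordinates. The helpers below are hypothetical, written only for this illustration with the standard library; they are not openpyxl functions.

```python
def column_letter(index):
    """Convert a 1-based column number to letters: 1 -> 'A', 27 -> 'AA'."""
    letters = ''
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord('A') + remainder) + letters
    return letters


def to_coordinate(row_and_column):
    """Convert a (row, column) tuple such as (3, 2) to a string such as 'B3'."""
    row, column = row_and_column
    return column_letter(column) + str(row)


# The three pairs of tuples from Figure 12-9.
for top_left, bottom_right in (((1, 1), (10, 1)),
                               ((3, 2), (6, 4)),
                               ((5, 3), (5, 3))):
    print(to_coordinate(top_left) + ':' + to_coordinate(bottom_right))
```

Read this way, the three selections in Figure 12-9 are A1:A10, B3:D6, and the single cell C5.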

Enter this interactive shell example to create a bar chart and add it to the spreadsheet:

>>> import openpyxl
>>> wb = openpyxl.Workbook()
>>> sheet = wb.get_active_sheet()
>>> for i in range(1, 11):    # create some data in column A
        sheet['A' + str(i)] = i

>>> refObj = openpyxl.charts.Reference(sheet, (1, 1), (10, 1))

>>> seriesObj = openpyxl.charts.Series(refObj, title='First series')

>>> chartObj = openpyxl.charts.BarChart()
>>> chartObj.append(seriesObj)
>>> chartObj.drawing.top = 50       # set the position
>>> chartObj.drawing.left = 100
>>> chartObj.drawing.width = 300    # set the size
>>> chartObj.drawing.height = 200

>>> sheet.add_chart(chartObj)
>>> wb.save('sampleChart.xlsx')

This produces a spreadsheet that looks like Figure 12-10.

Figure 12-10. A spreadsheet with a chart added

We’ve created a bar chart by calling openpyxl.charts.BarChart(). You can also create line charts, scatter charts, and pie charts by calling openpyxl.charts.LineChart(), openpyxl.charts.ScatterChart(), and openpyxl.charts.PieChart().

Unfortunately, in the current version of OpenPyXL (2.1.4), the load_workbook() function does not load charts in Excel files. Even if the Excel file has charts, the loaded Workbook object will not include them. If you load a Workbook object and immediately save it to the same .xlsx filename, you will effectively remove the charts from it.

Summary

Often the hard part of processing information isn’t the processing itself but simply getting the data in the right format for your program. But once you have your spreadsheet loaded into Python, you can extract and manipulate its data much faster than you could by hand.

You can also generate spreadsheets as output from your programs. So if colleagues need your text file or PDF of thousands of sales contacts transferred to a spreadsheet file, you won’t have to tediously copy and paste it all into Excel.

Equipped with the openpyxl module and some programming knowledge, you’ll find processing even the biggest spreadsheets a piece of cake.

Practice Questions

For the following questions, imagine you have a Workbook object in the variable wb, a Worksheet object in sheet, a Cell object in cell, a Comment object in comm, and an Image object in img.

Q: 1. What does the openpyxl.load_workbook() function return?

Q: 2. What does the get_sheet_names() workbook method return?

Q: 3. How would you retrieve the Worksheet object for a sheet named 'Sheet1'?

Q: 4. How would you retrieve the Worksheet object for the workbook’s active sheet?

Q: 5. How would you retrieve the value in the cell C5?

Q: 6. How would you set the value in the cell C5 to "Hello"?

Q: 7. How would you retrieve the cell’s row and column as integers?

Q: 8. What do the get_highest_column() and get_highest_row() sheet methods return, and what is the data type of these return values?

Q: 9. If you needed to get the integer index for column 'M', what function would you need to call?

Q: 10. If you needed to get the string name for column 14, what function would you need to call?

Q: 11. How can you retrieve a tuple of all the Cell objects from A1 to F1?

Q: 12. How would you save the workbook to the filename example.xlsx?

Q: 13. How do you set a formula in a cell?

Q: 14. If you want to retrieve the result of a cell’s formula instead of the cell’s formula itself, what must you do first?

Q: 15. How would you set the height of row 5 to 100?

Q: 16. How would you hide column C?

Q: 17. Name a few features that OpenPyXL 2.1.4 does not load from a spreadsheet file.

Q: 18. What is a freeze pane?

Q: 19. What five functions and methods do you have to call to create a bar chart?

PracticeProjectsForpractice,writeprogramsthatperformthefollowingtasks.

MultiplicationTableMakerCreateaprogrammultiplicationTable.pythattakesanumberNfromthecommandlineandcreatesanN×NmultiplicationtableinanExcelspreadsheet.Forexample,whentheprogramisrunlikethis:

pymultiplicationTable.py6

…itshouldcreateaspreadsheetthatlookslikeFigure12-11.

Figure12-11.Amultiplicationtablegeneratedinaspreadsheet

Row1andcolumnAshouldbeusedforlabelsandshouldbeinbold.

BlankRowInserterCreateaprogramblankRowInserter.pythattakestwointegersandafilenamestringascommandlinearguments.Let’scallthefirstintegerNandthesecondintegerM.StartingatrowN,theprogramshouldinsertMblankrowsintothespreadsheet.Forexample,whentheprogramisrunlikethis:

pythonblankRowInserter.py32myProduce.xlsx

…the“before”and“after”spreadsheetsshouldlooklikeFigure12-12.

Figure12-12.Before(left)andafter(right)thetwoblankrowsareinsertedatrow3

Youcanwritethisprogrambyreadinginthecontentsofthespreadsheet.Then,whenwritingoutthenewspreadsheet,useaforlooptocopythefirstNlines.Fortheremaininglines,addMtotherownumberintheoutputspreadsheet.

Spreadsheet Cell Inverter

Write a program to invert the row and column of the cells in the spreadsheet. For example, the value at row 5, column 3 will be at row 3, column 5 (and vice versa). This should be done for all cells in the spreadsheet. For example, the “before” and “after” spreadsheets would look something like Figure 12-13.

Figure 12-13. The spreadsheet before (top) and after (bottom) inversion

You can write this program by using nested for loops to read in the spreadsheet’s data into a list of lists data structure. This data structure could have sheetData[x][y] for the cell at column x and row y. Then, when writing out the new spreadsheet, use sheetData[y][x] for the cell at column x and row y.
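The list-of-lists idea can be tried out on plain Python data before any spreadsheet is involved. The sketch below is only an illustration, not a full solution: invert_cells is a hypothetical helper name, it stores the data row first (the reverse of the hint's sheetData[x][y] order), and it assumes every row has the same number of cells.

```python
def invert_cells(sheet_data):
    """Return a new list of lists with rows and columns swapped."""
    # zip(*rows) groups the first cell of every row, then the second, and so on.
    return [list(column) for column in zip(*sheet_data)]


# Two rows of three cells each, indexed as before[row][column].
before = [
    ['A1', 'B1', 'C1'],
    ['A2', 'B2', 'C2'],
]
after = invert_cells(before)
print(after)
```

Reading the cells from a worksheet and writing them back out is still left to your program.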

Text Files to Spreadsheet

Write a program to read in the contents of several text files (you can make the text files yourself) and insert those contents into a spreadsheet, with one line of text per row. The lines of the first text file will be in the cells of column A, the lines of the second text file will be in the cells of column B, and so on.

Use the readlines() File object method to return a list of strings, one string per line in the file. For the first file, output the first line to column 1, row 1. The second line should be written to column 1, row 2, and so on. The next file that is read with readlines() will be written to column 2, the next file to column 3, and so on.

Spreadsheet to Text Files

Write a program that performs the tasks of the previous program in reverse order: The program should open a spreadsheet and write the cells of column A into one text file, the cells of column B into another text file, and so on.

Chapter 13. Working with PDF and Word Documents

PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read or write to PDFs or Word documents, you’ll need to do more than simply pass their filenames to open().

Fortunately, there are Python modules that make it easy for you to interact with PDFs and Word documents. This chapter will cover two such modules: PyPDF2 and Python-Docx.

PDF Documents

PDF stands for Portable Document Format and uses the .pdf file extension. Although PDFs support many features, this chapter will focus on the two things you’ll be doing most often with them: reading text content from PDFs and crafting new PDFs from existing documents.

The module you’ll use to work with PDFs is PyPDF2. To install it, run pip install PyPDF2 from the command line. This module name is case sensitive, so make sure the y is lowercase and everything else is uppercase. (Check out Appendix A for full details about installing third-party modules.) If the module was installed correctly, running import PyPDF2 in the interactive shell shouldn’t display any errors.

THE PROBLEMATIC PDF FORMAT

While PDF files are great for laying out text in a way that’s easy for people to print and read, they’re not straightforward for software to parse into plaintext. As such, PyPDF2 might make mistakes when extracting text from a PDF and may even be unable to open some PDFs at all. There isn’t much you can do about this, unfortunately. PyPDF2 may simply be unable to work with some of your particular PDF files. That said, I haven’t found any PDF files so far that can’t be opened with PyPDF2.

Extracting Text from PDFs

PyPDF2 does not have a way to extract images, charts, or other media from PDF documents, but it can extract text and return it as a Python string. To start learning how PyPDF2 works, we’ll use it on the example PDF shown in Figure 13-1.

Figure 13-1. The PDF page that we will be extracting text from

Download this PDF from http://nostarch.com/automatestuff/, and enter the following into the interactive shell:

>>> import PyPDF2
>>> pdfFileObj = open('meetingminutes.pdf', 'rb')
>>> pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
➊ >>> pdfReader.numPages
19
➋ >>> pageObj = pdfReader.getPage(0)
➌ >>> pageObj.extractText()
'OOFFFFIICCIIAALL BBOOAARRDD MMIINNUUTTEESS Meeting of March 7, 2015
\nThe Board of Elementary and Secondary Education shall provide leadership
and create policies for education that expand opportunities for children,
empower families and communities, and advance Louisiana in an increasingly
competitive global market. BOARD of ELEMENTARY and SECONDARY EDUCATION'

First, import the PyPDF2 module. Then open meetingminutes.pdf in read binary mode and store it in pdfFileObj. To get a PdfFileReader object that represents this PDF, call PyPDF2.PdfFileReader() and pass it pdfFileObj. Store this PdfFileReader object in pdfReader.

The total number of pages in the document is stored in the numPages attribute of a PdfFileReader object ➊. The example PDF has 19 pages, but let’s extract text from only the first page.

To extract text from a page, you need to get a Page object, which represents a single page of a PDF, from a PdfFileReader object. You can get a Page object by calling the getPage() method ➋ on a PdfFileReader object and passing it the page number of the page you’re interested in (in our case, 0).

PyPDF2 uses a zero-based index for getting pages: The first page is page 0, the second is page 1, and so on. This is always the case, even if pages are numbered differently within the document. For example, say your PDF is a three-page excerpt from a longer report, and its pages are numbered 42, 43, and 44. To get the first page of this document, you would want to call pdfReader.getPage(0), not getPage(42) or getPage(1).
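If you often work from the page numbers printed on the pages, a small conversion helper can keep the arithmetic in one place. This is a sketch only: printed_to_index is a hypothetical name, and it assumes the printed numbers run consecutively from the first page in the file.

```python
def printed_to_index(printed_page, first_printed_page):
    """Convert a page number printed in the document to a zero-based index."""
    index = printed_page - first_printed_page
    if index < 0:
        raise ValueError('page %s comes before the first page in the file' % printed_page)
    return index


# The three-page excerpt numbered 42, 43, and 44:
print(printed_to_index(42, 42))  # 0, the value to pass to getPage()
print(printed_to_index(44, 42))  # 2
```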

Once you have your Page object, call its extractText() method to return a string of the page’s text ➌. The text extraction isn’t perfect: The text Charles E. “Chas” Roemer, President from the PDF is absent from the string returned by extractText(), and the spacing is sometimes off. Still, this approximation of the PDF text content may be good enough for your program.

Decrypting PDFs

Some PDF documents have an encryption feature that will keep them from being read until whoever is opening the document provides a password. Enter the following into the interactive shell with the PDF you downloaded, which has been encrypted with the password rosebud:

>>> import PyPDF2
>>> pdfReader = PyPDF2.PdfFileReader(open('encrypted.pdf', 'rb'))
➊ >>> pdfReader.isEncrypted
True
>>> pdfReader.getPage(0)
➋ Traceback (most recent call last):
  File "<pyshell#173>", line 1, in <module>
    pdfReader.getPage(0)
  --snip--
  File "C:\Python34\lib\site-packages\PyPDF2\pdf.py", line 1173, in getObject
    raise utils.PdfReadError("file has not been decrypted")
PyPDF2.utils.PdfReadError: file has not been decrypted
➌ >>> pdfReader.decrypt('rosebud')
1
>>> pageObj = pdfReader.getPage(0)

All PdfFileReader objects have an isEncrypted attribute that is True if the PDF is encrypted and False if it isn’t ➊. Any attempt to call a function that reads the file before it has been decrypted with the correct password will result in an error ➋.

To read an encrypted PDF, call the decrypt() method and pass the password as a string ➌. After you call decrypt() with the correct password, you’ll see that calling getPage() no longer causes an error. If given the wrong password, the decrypt() method will return 0 and getPage() will continue to fail. Note that the decrypt() method decrypts only the PdfFileReader object, not the actual PDF file. After your program terminates, the file on your hard drive remains encrypted. Your program will have to call decrypt() again the next time it is run.
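Because decrypt() signals a wrong password with a return value of 0 rather than an error, it can help to check that value before reading any pages. The sketch below is one possible way to do so. The helper name unlock is hypothetical, and FakeReader is only a stand-in so the example runs without a PDF file; with PyPDF2 installed you would pass a real PdfFileReader object instead.

```python
def unlock(pdf_reader, password):
    """Decrypt pdf_reader if needed; return True when its pages can be read."""
    if not pdf_reader.isEncrypted:
        return True  # Nothing to do for an unencrypted PDF.
    # decrypt() returns 0 when the password is wrong.
    return pdf_reader.decrypt(password) != 0


class FakeReader:
    """Stand-in that mimics the isEncrypted attribute and decrypt() method."""
    isEncrypted = True

    def decrypt(self, password):
        return 1 if password == 'rosebud' else 0


print(unlock(FakeReader(), 'rosebud'))    # True
print(unlock(FakeReader(), 'swordfish'))  # False
```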

Creating PDFs

PyPDF2’s counterpart to PdfFileReader objects is PdfFileWriter objects, which can create new PDF files. But PyPDF2 cannot write arbitrary text to a PDF like Python can do with plaintext files. Instead, PyPDF2’s PDF-writing capabilities are limited to copying pages from other PDFs, rotating pages, overlaying pages, and encrypting files.

PyPDF2 doesn’t allow you to directly edit a PDF. Instead, you have to create a new PDF and then copy content over from an existing document. The examples in this section will follow this general approach:

1. Open one or more existing PDFs (the source PDFs) into PdfFileReader objects.
2. Create a new PdfFileWriter object.
3. Copy pages from the PdfFileReader objects into the PdfFileWriter object.
4. Finally, use the PdfFileWriter object to write the output PDF.

Creating a PdfFileWriter object creates only a value that represents a PDF document in Python. It doesn’t create the actual PDF file. For that, you must call the PdfFileWriter’s write() method.

The write() method takes a regular File object that has been opened in write-binary mode. You can get such a File object by calling Python’s open() function with two arguments: the string of what you want the PDF’s filename to be and 'wb' to indicate the file should be opened in write-binary mode.

If this sounds a little confusing, don’t worry; you’ll see how this works in the following code examples.

Copying Pages

You can use PyPDF2 to copy pages from one PDF document to another. This allows you to combine multiple PDF files, cut unwanted pages, or reorder pages.

Download meetingminutes.pdf and meetingminutes2.pdf from http://nostarch.com/automatestuff/ and place the PDFs in the current working directory. Enter the following into the interactive shell:

>>> import PyPDF2
>>> pdf1File = open('meetingminutes.pdf', 'rb')
>>> pdf2File = open('meetingminutes2.pdf', 'rb')
➊ >>> pdf1Reader = PyPDF2.PdfFileReader(pdf1File)
➋ >>> pdf2Reader = PyPDF2.PdfFileReader(pdf2File)
➌ >>> pdfWriter = PyPDF2.PdfFileWriter()

>>> for pageNum in range(pdf1Reader.numPages):
➍         pageObj = pdf1Reader.getPage(pageNum)
➎         pdfWriter.addPage(pageObj)

>>> for pageNum in range(pdf2Reader.numPages):
➏         pageObj = pdf2Reader.getPage(pageNum)
➐         pdfWriter.addPage(pageObj)

➑ >>> pdfOutputFile = open('combinedminutes.pdf', 'wb')
>>> pdfWriter.write(pdfOutputFile)
>>> pdfOutputFile.close()
>>> pdf1File.close()
>>> pdf2File.close()

Open both PDF files in read binary mode and store the two resulting File objects in pdf1File and pdf2File. Call PyPDF2.PdfFileReader() and pass it pdf1File to get a PdfFileReader object for meetingminutes.pdf ➊. Call it again and pass it pdf2File to get a PdfFileReader object for meetingminutes2.pdf ➋. Then create a new PdfFileWriter object, which represents a blank PDF document ➌.

Next, copy all the pages from the two source PDFs and add them to the PdfFileWriter object. Get the Page object by calling getPage() on a PdfFileReader object ➍. Then pass that Page object to your PdfFileWriter’s addPage() method ➎. These steps are done first for pdf1Reader and then again for pdf2Reader. When you’re done copying pages, write a new PDF called combinedminutes.pdf by passing a File object to the PdfFileWriter’s write() method ➑.

NOTE

PyPDF2 cannot insert pages in the middle of a PdfFileWriter object; the addPage() method will only add pages to the end.

You have now created a new PDF file that combines the pages from meetingminutes.pdf and meetingminutes2.pdf into a single document. Remember that the File object passed to PyPDF2.PdfFileReader() needs to be opened in read-binary mode by passing 'rb' as the second argument to open(). Likewise, the File object passed to the PdfFileWriter’s write() method needs to be opened in write-binary mode with 'wb'.
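The two copying loops differ only in which reader they use, so they can be folded into one helper. The following is a minimal sketch under stated assumptions: copy_pages and its start parameter are hypothetical names, and FakeReader and FakeWriter are stand-ins that imitate only numPages, getPage(), and addPage() so the example runs without any PDF files.

```python
class FakeReader:
    """Stand-in for a PdfFileReader; holds page labels instead of real pages."""
    def __init__(self, pages):
        self._pages = pages
        self.numPages = len(pages)

    def getPage(self, page_num):
        return self._pages[page_num]


class FakeWriter:
    """Stand-in for a PdfFileWriter that just collects the pages it is given."""
    def __init__(self):
        self.pages = []

    def addPage(self, page):
        self.pages.append(page)


def copy_pages(pdf_reader, pdf_writer, start=0):
    """Append pages start through the last page of pdf_reader to pdf_writer."""
    for page_num in range(start, pdf_reader.numPages):
        pdf_writer.addPage(pdf_reader.getPage(page_num))


writer = FakeWriter()
copy_pages(FakeReader(['minutes p1', 'minutes p2']), writer)
copy_pages(FakeReader(['minutes2 p1']), writer)
print(writer.pages)
```

With real PyPDF2 objects, the same function would be called once per source PDF before calling write().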

Rotating Pages

The pages of a PDF can also be rotated in 90-degree increments with the rotateClockwise() and rotateCounterClockwise() methods. Pass one of the integers 90, 180, or 270 to these methods. Enter the following into the interactive shell, with the meetingminutes.pdf file in the current working directory:

>>> import PyPDF2
>>> minutesFile = open('meetingminutes.pdf', 'rb')
>>> pdfReader = PyPDF2.PdfFileReader(minutesFile)
➊ >>> page = pdfReader.getPage(0)
➋ >>> page.rotateClockwise(90)
{'/Contents': [IndirectObject(961, 0), IndirectObject(962, 0),
--snip--
}
>>> pdfWriter = PyPDF2.PdfFileWriter()
>>> pdfWriter.addPage(page)
➌ >>> resultPdfFile = open('rotatedPage.pdf', 'wb')
>>> pdfWriter.write(resultPdfFile)
>>> resultPdfFile.close()
>>> minutesFile.close()

Here we use getPage(0) to select the first page of the PDF ➊, and then we call rotateClockwise(90) on that page ➋. We write a new PDF with the rotated page and save it as rotatedPage.pdf ➌.

The resulting PDF will have one page, rotated 90 degrees clockwise, as in Figure 13-2. The return values from rotateClockwise() and rotateCounterClockwise() contain a lot of information that you can ignore.
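Since rotation works in quarter turns, a counterclockwise turn always has a clockwise equivalent: turning 90 degrees one way leaves the page in the same orientation as turning 270 degrees the other way. The helper below is a hypothetical illustration of that arithmetic and does not call PyPDF2.

```python
def clockwise_equivalent(counterclockwise_degrees):
    """Return the clockwise rotation that gives the same page orientation."""
    if counterclockwise_degrees % 90 != 0:
        raise ValueError('rotation must be a multiple of 90 degrees')
    return (360 - counterclockwise_degrees) % 360


print(clockwise_equivalent(90))   # 270
print(clockwise_equivalent(180))  # 180
print(clockwise_equivalent(270))  # 90
```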

Figure 13-2. The rotatedPage.pdf file with the page rotated 90 degrees clockwise

Overlaying Pages

PyPDF2 can also overlay the contents of one page over another, which is useful for adding a logo, timestamp, or watermark to a page. With Python, it’s easy to add watermarks to multiple files and only to pages your program specifies.

Download watermark.pdf from http://nostarch.com/automatestuff/ and place the PDF in the current working directory along with meetingminutes.pdf. Then enter the following into the interactive shell:

>>> import PyPDF2
>>> minutesFile = open('meetingminutes.pdf', 'rb')
➊ >>> pdfReader = PyPDF2.PdfFileReader(minutesFile)
➋ >>> minutesFirstPage = pdfReader.getPage(0)
➌ >>> pdfWatermarkReader = PyPDF2.PdfFileReader(open('watermark.pdf', 'rb'))
➍ >>> minutesFirstPage.mergePage(pdfWatermarkReader.getPage(0))
➎ >>> pdfWriter = PyPDF2.PdfFileWriter()
➏ >>> pdfWriter.addPage(minutesFirstPage)

➐ >>> for pageNum in range(1, pdfReader.numPages):
          pageObj = pdfReader.getPage(pageNum)
          pdfWriter.addPage(pageObj)

>>> resultPdfFile = open('watermarkedCover.pdf', 'wb')
>>> pdfWriter.write(resultPdfFile)
>>> minutesFile.close()
>>> resultPdfFile.close()

Here we make a PdfFileReader object of meetingminutes.pdf ➊. We call getPage(0) to get a Page object for the first page and store this object in minutesFirstPage ➋. We then make a PdfFileReader object for watermark.pdf ➌ and call mergePage() on minutesFirstPage ➍. The argument we pass to mergePage() is a Page object for the first page of watermark.pdf.

Now that we’ve called mergePage() on minutesFirstPage, minutesFirstPage represents the watermarked first page. We make a PdfFileWriter object ➎ and add the watermarked first page ➏. Then we loop through the rest of the pages in meetingminutes.pdf and add them to the PdfFileWriter object ➐. Finally, we open a new PDF called watermarkedCover.pdf and write the contents of the PdfFileWriter to the new PDF.

Figure 13-3 shows the results. Our new PDF, watermarkedCover.pdf, has all the contents of the meetingminutes.pdf, and the first page is watermarked.
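The same pattern extends to watermarking any set of pages your program chooses, not just the first. Here is a sketch of that idea. The function name add_pages_with_watermark and the pages_to_mark parameter are hypothetical, and the Fake classes only imitate numPages, getPage(), mergePage(), and addPage() so the example can run without PDF files.

```python
class FakePage:
    """Stand-in for a Page object that records what was merged onto it."""
    def __init__(self, label):
        self.label = label
        self.overlays = []

    def mergePage(self, other_page):
        self.overlays.append(other_page.label)


class FakeReader:
    """Stand-in for a PdfFileReader built from a list of page labels."""
    def __init__(self, labels):
        self._pages = [FakePage(label) for label in labels]
        self.numPages = len(self._pages)

    def getPage(self, page_num):
        return self._pages[page_num]


class FakeWriter:
    """Stand-in for a PdfFileWriter that collects added pages."""
    def __init__(self):
        self.pages = []

    def addPage(self, page):
        self.pages.append(page)


def add_pages_with_watermark(pdf_reader, watermark_page, pdf_writer, pages_to_mark):
    """Copy every page to pdf_writer, watermarking the zero-based pages listed."""
    for page_num in range(pdf_reader.numPages):
        page = pdf_reader.getPage(page_num)
        if page_num in pages_to_mark:
            page.mergePage(watermark_page)
        pdf_writer.addPage(page)


writer = FakeWriter()
add_pages_with_watermark(FakeReader(['p0', 'p1', 'p2']), FakePage('watermark'), writer, {0, 2})
print([(page.label, page.overlays) for page in writer.pages])
```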

Figure 13-3. The original PDF (left), the watermark PDF (center), and the merged PDF (right)

Encrypting PDFs

A PdfFileWriter object can also add encryption to a PDF document. Enter the following into the interactive shell:

>>> import PyPDF2
>>> pdfFile = open('meetingminutes.pdf', 'rb')
>>> pdfReader = PyPDF2.PdfFileReader(pdfFile)
>>> pdfWriter = PyPDF2.PdfFileWriter()
>>> for pageNum in range(pdfReader.numPages):
        pdfWriter.addPage(pdfReader.getPage(pageNum))

➊ >>> pdfWriter.encrypt('swordfish')
>>> resultPdf = open('encryptedminutes.pdf', 'wb')
>>> pdfWriter.write(resultPdf)
>>> resultPdf.close()

Before calling the write() method to save to a file, call the encrypt() method and pass it a password string ➊. PDFs can have a user password (allowing you to view the PDF) and an owner password (allowing you to set permissions for printing, commenting, extracting text, and other features). The user password and owner password are the first and second arguments to encrypt(), respectively. If only one string argument is passed to encrypt(), it will be used for both passwords.

In this example, we copied the pages of meetingminutes.pdf to a PdfFileWriter object. We encrypted the PdfFileWriter with the password swordfish, opened a new PDF called encryptedminutes.pdf, and wrote the contents of the PdfFileWriter to the new PDF. Before anyone can view encryptedminutes.pdf, they’ll have to enter this password. You may want to delete the original, unencrypted meetingminutes.pdf file after ensuring its copy was correctly encrypted.

Project: Combining Select Pages from Many PDFs

Say you have the boring job of merging several dozen PDF documents into a single PDF file. Each of them has a cover sheet as the first page, but you don’t want the cover sheet repeated in the final result. Even though there are lots of free programs for combining PDFs, many of them simply merge entire files together. Let’s write a Python program to customize which pages you want in the combined PDF.

At a high level, here’s what the program will do:

Find all PDF files in the current working directory.
Sort the filenames so the PDFs are added in order.
Write each page, excluding the first page, of each PDF to the output file.

In terms of implementation, your code will need to do the following:

Call os.listdir() to find all the files in the working directory and remove any non-PDF files.
Call Python’s sort() list method to alphabetize the filenames.
Create a PdfFileWriter object for the output PDF.
Loop over each PDF file, creating a PdfFileReader object for it.
Loop over each page (except the first) in each PDF file.
Add the pages to the output PDF.
Write the output PDF to a file named allminutes.pdf.

For this project, open a new file editor window and save it as combinePdfs.py.

Step 1: Find All PDF Files

First, your program needs to get a list of all files with the .pdf extension in the current working directory and sort them. Make your code look like the following:

#! python3
# combinePdfs.py - Combines all the PDFs in the current working directory into
# a single PDF.

➊ import PyPDF2, os

# Get all the PDF filenames.
pdfFiles = []
for filename in os.listdir('.'):
    if filename.endswith('.pdf'):
➋         pdfFiles.append(filename)
➌ pdfFiles.sort(key=str.lower)

➍ pdfWriter = PyPDF2.PdfFileWriter()

# TODO: Loop through all the PDF files.

# TODO: Loop through all the pages (except the first) and add them.

# TODO: Save the resulting PDF to a file.

After the shebang line and the descriptive comment about what the program does, this code imports the os and PyPDF2 modules ➊. The os.listdir('.') call will return a list of every file in the current working directory. The code loops over this list and adds only those files with the .pdf extension to pdfFiles ➋. Afterward, this list is sorted in alphabetical order with the key=str.lower keyword argument to sort() ➌.

A PdfFileWriter object is created to hold the combined PDF pages ➍. Finally, a few comments outline the rest of the program.
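The file-finding step can be tested on its own with a throwaway folder. The sketch below is a variation on that step rather than the program's code: find_pdfs is a hypothetical helper, and unlike the endswith('.pdf') check above it lowercases each name first, so a file named A.PDF is also picked up. That difference is an assumption about what you might want, not something the project requires.

```python
import os
import tempfile


def find_pdfs(folder='.'):
    """Return the PDF filenames in folder, sorted without regard to case."""
    pdf_files = [name for name in os.listdir(folder)
                 if name.lower().endswith('.pdf')]
    pdf_files.sort(key=str.lower)
    return pdf_files


# Try it on a temporary folder containing empty placeholder files.
with tempfile.TemporaryDirectory() as folder:
    for name in ['b.pdf', 'A.PDF', 'notes.txt', 'c.pdf']:
        open(os.path.join(folder, name), 'w').close()
    found = find_pdfs(folder)

print(found)
```

Sorting with key=str.lower is what keeps A.PDF ahead of b.pdf; a plain sort() would also place uppercase names first, but only because of how character codes happen to be ordered.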

Step 2: Open Each PDF

Now the program must read each PDF file in pdfFiles. Add the following to your program:

#! python3
# combinePdfs.py - Combines all the PDFs in the current working directory into
# a single PDF.

import PyPDF2, os

# Get all the PDF filenames.
pdfFiles = []
--snip--

# Loop through all the PDF files.
for filename in pdfFiles:
    pdfFileObj = open(filename, 'rb')
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
    # TODO: Loop through all the pages (except the first) and add them.

# TODO: Save the resulting PDF to a file.

For each PDF, the loop opens a filename in read-binary mode by calling open() with 'rb' as the second argument. The open() call returns a File object, which gets passed to PyPDF2.PdfFileReader() to create a PdfFileReader object for that PDF file.

Step 3: Add Each Page

For each PDF, you’ll want to loop over every page except the first. Add this code to your program:

#! python3
# combinePdfs.py - Combines all the PDFs in the current working directory into
# a single PDF.

import PyPDF2, os

--snip--

# Loop through all the PDF files.
for filename in pdfFiles:
--snip--
    # Loop through all the pages (except the first) and add them.
➊     for pageNum in range(1, pdfReader.numPages):
        pageObj = pdfReader.getPage(pageNum)
        pdfWriter.addPage(pageObj)

# TODO: Save the resulting PDF to a file.

The code inside the for loop copies each Page object individually to the PdfFileWriter object. Remember, you want to skip the first page. Since PyPDF2 considers 0 to be the first page, your loop should start at 1 ➊ and then go up to, but not include, the integer in pdfReader.numPages.
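It may help to see exactly which page numbers that range() call produces. The helper name below is hypothetical and exists only for the demonstration.

```python
def pages_after_cover(num_pages):
    """Return the zero-based page numbers that remain after skipping page 0."""
    return list(range(1, num_pages))


print(pages_after_cover(4))  # [1, 2, 3]
print(pages_after_cover(1))  # []
```

The second call shows an edge case worth knowing about: a PDF that consists only of a cover sheet contributes no pages at all to the combined file.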

Step 4: Save the Results

After these nested for loops are done looping, the pdfWriter variable will contain a PdfFileWriter object with the pages for all the PDFs combined. The last step is to write this content to a file on the hard drive. Add this code to your program:

#! python3
# combinePdfs.py - Combines all the PDFs in the current working directory into
# a single PDF.

import PyPDF2, os

--snip--

# Loop through all the PDF files.
for filename in pdfFiles:
--snip--
    # Loop through all the pages (except the first) and add them.
    for pageNum in range(1, pdfReader.numPages):
    --snip--

# Save the resulting PDF to a file.
pdfOutput = open('allminutes.pdf', 'wb')
pdfWriter.write(pdfOutput)
pdfOutput.close()

Passing 'wb' to open() opens the output PDF file, allminutes.pdf, in write-binary mode. Then, passing the resulting File object to the write() method creates the actual PDF file. A call to the close() method finishes the program.

Ideas for Similar Programs

Being able to create PDFs from the pages of other PDFs will let you make programs that can do the following:

Cut out specific pages from PDFs.
Reorder pages in a PDF.
Create a PDF from only those pages that have some specific text, identified by extractText().

Word Documents

Python can create and modify Word documents, which have the .docx file extension, with the python-docx module. You can install the module by running pip install python-docx. (Appendix A has full details on installing third-party modules.)

NOTE

When using pip to first install Python-Docx, be sure to install python-docx, not docx. The installation name docx is for a different module that this book does not cover. However, when you are going to import the python-docx module, you’ll need to run import docx, not import python-docx.

If you don’t have Word, LibreOffice Writer and OpenOffice Writer are both free alternative applications for Windows, OS X, and Linux that can be used to open .docx files. You can download them from https://www.libreoffice.org and http://openoffice.org, respectively. The full documentation for Python-Docx is available at https://python-docx.readthedocs.org/. Although there is a version of Word for OS X, this chapter will focus on Word for Windows.

Compared to plaintext, .docx files have a lot of structure. This structure is represented by three different data types in Python-Docx. At the highest level, a Document object represents the entire document. The Document object contains a list of Paragraph objects for the paragraphs in the document. (A new paragraph begins whenever the user presses ENTER or RETURN while typing in a Word document.) Each of these Paragraph objects contains a list of one or more Run objects. The single-sentence paragraph in Figure 13-4 has four runs.

Figure 13-4. The Run objects identified in a Paragraph object

The text in a Word document is more than just a string. It has font, size, color, and other styling information associated with it. A style in Word is a collection of these attributes. A Run object is a contiguous run of text with the same style. A new Run object is needed whenever the text style changes.

Reading Word Documents

Let’s experiment with the python-docx module. Download demo.docx from http://nostarch.com/automatestuff/ and save the document to the working directory. Then enter the following into the interactive shell:

>>> import docx
➊ >>> doc = docx.Document('demo.docx')
➋ >>> len(doc.paragraphs)
7
➌ >>> doc.paragraphs[0].text
'Document Title'
➍ >>> doc.paragraphs[1].text
'A plain paragraph with some bold and some italic'
➎ >>> len(doc.paragraphs[1].runs)
4
➏ >>> doc.paragraphs[1].runs[0].text
'A plain paragraph with some '
➐ >>> doc.paragraphs[1].runs[1].text
'bold'
➑ >>> doc.paragraphs[1].runs[2].text
' and some '
➒ >>> doc.paragraphs[1].runs[3].text
'italic'

At ➊, we open a .docx file in Python, call docx.Document(), and pass the filename demo.docx. This will return a Document object, which has a paragraphs attribute that is a list of Paragraph objects. When we call len() on doc.paragraphs, it returns 7, which tells us that there are seven Paragraph objects in this document ➋. Each of these Paragraph objects has a text attribute that contains a string of the text in that paragraph (without the style information). Here, the first text attribute contains 'Document Title' ➌, and the second contains 'A plain paragraph with some bold and some italic' ➍.

Each Paragraph object also has a runs attribute that is a list of Run objects. Run objects also have a text attribute, containing just the text in that particular run. Let’s look at the text attributes in the second Paragraph object, 'A plain paragraph with some bold and some italic'. Calling len() on this Paragraph object’s runs attribute tells us that there are four Run objects ➎. The first Run object contains 'A plain paragraph with some ' ➏. Then, the text changes to a bold style, so 'bold' starts a new Run object ➐. The text returns to an unbolded style after that, which results in a third Run object, ' and some ' ➑. Finally, the fourth and last Run object contains 'italic' in an italic style ➒.

With Python-Docx, your Python programs will now be able to read the text from a .docx file and use it just like any other string value.
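To make the paragraph-and-run structure concrete, the sketch below walks a paragraph's runs and reports each run's text together with its bold and italic settings. It runs without python-docx by using SimpleNamespace stand-ins. The run texts come from the example above, but the bold and italic values are assumptions made for illustration, and describe_runs is a hypothetical helper name.

```python
from types import SimpleNamespace


def describe_runs(paragraph):
    """Return (text, bold, italic) for each run in a paragraph-like object."""
    return [(run.text, run.bold, run.italic) for run in paragraph.runs]


# Stand-in for the second paragraph of demo.docx (formatting values assumed).
runs = [
    SimpleNamespace(text='A plain paragraph with some ', bold=None, italic=None),
    SimpleNamespace(text='bold', bold=True, italic=None),
    SimpleNamespace(text=' and some ', bold=None, italic=None),
    SimpleNamespace(text='italic', bold=None, italic=True),
]
paragraph = SimpleNamespace(runs=runs, text=''.join(run.text for run in runs))

for text, bold, italic in describe_runs(paragraph):
    print(repr(text), bold, italic)
```

Joining the run texts in order reproduces the paragraph's full text, which is why the paragraph's text attribute is usually enough when styling does not matter.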

Getting the Full Text from a .docx File

If you care only about the text, not the styling information, in the Word document, you can use the getText() function. It accepts a filename of a .docx file and returns a single string value of its text. Open a new file editor window and enter the following code, saving it as readDocx.py:

#! python3

import docx

def getText(filename):
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    return '\n'.join(fullText)

The getText() function opens the Word document, loops over all the Paragraph objects in the paragraphs list, and then appends their text to the list in fullText. After the loop, the strings in fullText are joined together with newline characters.

The readDocx.py program can be imported like any other module. Now if you just need the text from a Word document, you can enter the following:

>>> import readDocx
>>> print(readDocx.getText('demo.docx'))
Document Title
A plain paragraph with some bold and some italic
Heading, level 1
Intense quote
first item in unordered list
first item in ordered list

You can also adjust getText() to modify the string before returning it. For example, to indent each paragraph, replace the append() call in readDocx.py with this:

fullText.append('  ' + para.text)

To add a double space in between paragraphs, change the join() call code to this:

return '\n\n'.join(fullText)

As you can see, it takes only a few lines of code to write functions that will read a .docx file and return a string of its content to your liking.
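Both adjustments can also be exposed as parameters instead of edits to the source. The following sketch assumes only that each paragraph object has a text attribute; join_paragraphs and its indent and separator parameters are hypothetical names, and SimpleNamespace objects stand in for real Paragraph objects so the example runs without a .docx file.

```python
from types import SimpleNamespace


def join_paragraphs(paragraphs, indent='', separator='\n'):
    """Join the text of paragraph-like objects, optionally indenting each one."""
    return separator.join(indent + para.text for para in paragraphs)


paragraphs = [
    SimpleNamespace(text='Document Title'),
    SimpleNamespace(text='Heading, level 1'),
]

print(join_paragraphs(paragraphs))
print(join_paragraphs(paragraphs, indent='  ', separator='\n\n'))
```

Inside getText(), the final line would then become a call such as join_paragraphs(doc.paragraphs, indent='  ').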

Styling Paragraph and Run Objects

In Word for Windows, you can see the styles by pressing CTRL-ALT-SHIFT-S to display the Styles pane, which looks like Figure 13-5. On OS X, you can view the Styles pane by clicking the View ▸ Styles menu item.

Figure 13-5. Display the Styles pane by pressing CTRL-ALT-SHIFT-S on Windows.

Word and other word processors use styles to keep the visual presentation of similar types of text consistent and easy to change. For example, perhaps you want to set body paragraphs in 11-point, Times New Roman, left-justified, ragged-right text. You can create a style with these settings and assign it to all body paragraphs. Then, if you later want to change the presentation of all body paragraphs in the document, you can just change the style, and all those paragraphs will be automatically updated.

For Word documents, there are three types of styles: Paragraph styles can be applied to Paragraph objects, character styles can be applied to Run objects, and linked styles can be applied to both kinds of objects. You can give both Paragraph and Run objects styles by setting their style attribute to a string. This string should be the name of a style. If style is set to None, then there will be no style associated with the Paragraph or Run object.

The string values for the default Word styles are as follows:

'Normal' 'Heading5' 'ListBullet' 'ListParagraph'

'BodyText' 'Heading6' 'ListBullet2' 'MacroText'

'BodyText2' 'Heading7' 'ListBullet3' 'NoSpacing'

'BodyText3' 'Heading8' 'ListContinue' 'Quote'

'Caption' 'Heading9' 'ListContinue2' 'Subtitle'

'Heading1' 'IntenseQuote' 'ListContinue3' 'TOCHeading'

'Heading2' 'List' 'ListNumber' 'Title'

'Heading3' 'List2' 'ListNumber2'

'Heading4' 'List3' 'ListNumber3'

When setting the style attribute, do not use spaces in the style name. For example, while the style name may be Subtle Emphasis, you should set the style attribute to the string value 'SubtleEmphasis' instead of 'Subtle Emphasis'. Including spaces will cause Word to misread the style name and not apply it.

When using a linked style for a Run object, you will need to add 'Char' to the end of its name. For example, to set the Quote linked style for a Paragraph object, you would use paragraphObj.style = 'Quote', but for a Run object, you would use runObj.style = 'QuoteChar'.
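The two naming rules can be captured in a few lines. This is a sketch of the rules as stated in this chapter, not part of Python-Docx: style_string and its linked_on_run parameter are hypothetical names, and the 'Char' suffix should be requested only when a linked style is being applied to a Run object.

```python
def style_string(display_name, linked_on_run=False):
    """Build the string to assign to a style attribute from Word's display name."""
    name = display_name.replace(' ', '')  # Style strings must not contain spaces.
    if linked_on_run:
        name += 'Char'  # Linked styles need this suffix on Run objects.
    return name


print(style_string('Subtle Emphasis'))            # SubtleEmphasis
print(style_string('Quote'))                      # Quote
print(style_string('Quote', linked_on_run=True))  # QuoteChar
```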

In the current version of Python-Docx (0.7.4), the only styles that can be used are the default Word styles and the styles in the opened .docx. New styles cannot be created, though this may change in future versions of Python-Docx.

Creating Word Documents with Nondefault Styles

If you want to create Word documents that use styles beyond the default ones, you will need to open Word to a blank Word document and create the styles yourself by clicking the New Style button at the bottom of the Styles pane (Figure 13-6 shows this on Windows).

This will open the Create New Style from Formatting dialog, where you can enter the new style. Then, go back into the interactive shell and open this blank Word document with docx.Document(), using it as the base for your Word document. The name you gave this style will now be available to use with Python-Docx.

Figure 13-6. The New Style button (left) and the Create New Style from Formatting dialog (right)

Run Attributes

Runs can be further styled using text attributes. Each attribute can be set to one of three values: True (the attribute is always enabled, no matter what other styles are applied to the run), False (the attribute is always disabled), or None (defaults to whatever the run’s style is set to).

Table 13-1 lists the text attributes that can be set on Run objects.

Table 13-1. Run Object text Attributes

Attribute        Description
bold             The text appears in bold.
italic           The text appears in italic.
underline        The text is underlined.
strike           The text appears with strikethrough.
double_strike    The text appears with double strikethrough.
all_caps         The text appears in capital letters.
small_caps       The text appears in capital letters, with lowercase letters two points smaller.
shadow           The text appears with a shadow.
outline          The text appears outlined rather than solid.
rtl              The text is written right-to-left.
imprint          The text appears pressed into the page.
emboss           The text appears raised off the page in relief.
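The three possible values (True, False, and None) amount to a simple override rule: an explicit True or False on the run wins, and None defers to the run's style. The function below is a hypothetical model of that rule for illustration. It is not how Python-Docx is implemented and does not import the module.

```python
def effective_setting(run_value, style_value):
    """Resolve a run attribute: True/False override, None defers to the style."""
    if run_value is None:
        return style_value
    return run_value


# A run whose style is bold:
print(effective_setting(None, True))   # True, inherited from the style
print(effective_setting(False, True))  # False, the run switches bold off
print(effective_setting(True, False))  # True, the run switches bold on
```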

For example, to change the styles of demo.docx, enter the following into the interactive shell:

>>> doc = docx.Document('demo.docx')
>>> doc.paragraphs[0].text
'Document Title'
>>> doc.paragraphs[0].style
'Title'
>>> doc.paragraphs[0].style = 'Normal'
>>> doc.paragraphs[1].text
'A plain paragraph with some bold and some italic'
>>> (doc.paragraphs[1].runs[0].text, doc.paragraphs[1].runs[1].text,
...  doc.paragraphs[1].runs[2].text, doc.paragraphs[1].runs[3].text)
('A plain paragraph with some ', 'bold', ' and some ', 'italic')
>>> doc.paragraphs[1].runs[0].style = 'QuoteChar'
>>> doc.paragraphs[1].runs[1].underline = True
>>> doc.paragraphs[1].runs[3].underline = True
>>> doc.save('restyled.docx')

Here, we use the text and style attributes to easily see what’s in the paragraphs in our document. We can see that it’s simple to divide a paragraph into runs and access each run individually. So we get the first, second, and fourth runs in the second paragraph, style each run, and save the results to a new document.

The words Document Title at the top of restyled.docx will have the Normal style instead of the Title style, the Run object for the text A plain paragraph with some will have the QuoteChar style, and the two Run objects for the words bold and italic will have their underline attributes set to True. Figure 13-7 shows how the styles of paragraphs and runs look in restyled.docx.

Figure 13-7. The restyled.docx file

You can find more complete documentation on Python-Docx’s use of styles at https://python-docx.readthedocs.org/en/latest/user/styles.html.

Writing Word Documents

Enter the following into the interactive shell:

>>> import docx
>>> doc = docx.Document()
>>> doc.add_paragraph('Hello world!')
<docx.text.Paragraph object at 0x0000000003B56F60>
>>> doc.save('helloworld.docx')

To create your own .docx file, call docx.Document() to return a new, blank Word Document object. The add_paragraph() document method adds a new paragraph of text to the document and returns a reference to the Paragraph object that was added. When you’re done adding text, pass a filename string to the save() document method to save the Document object to a file.

This will create a file named helloworld.docx in the current working directory that, when opened, looks like Figure 13-8.

Figure 13-8. The Word document created using add_paragraph('Hello world!')

You can add paragraphs by calling the add_paragraph() method again with the new paragraph’s text. Or to add text to the end of an existing paragraph, you can call the paragraph’s add_run() method and pass it a string. Enter the following into the interactive shell:

>>> import docx
>>> doc = docx.Document()
>>> doc.add_paragraph('Hello world!')
<docx.text.Paragraph object at 0x000000000366AD30>
>>> paraObj1 = doc.add_paragraph('This is a second paragraph.')
>>> paraObj2 = doc.add_paragraph('This is a yet another paragraph.')
>>> paraObj1.add_run(' This text is being added to the second paragraph.')
<docx.text.Run object at 0x0000000003A2C860>
>>> doc.save('multipleParagraphs.docx')

The resulting document will look like Figure 13-9. Note that the text This text is being added to the second paragraph. was added to the Paragraph object in paraObj1, which was the second paragraph added to doc. The add_paragraph() and add_run() methods return Paragraph and Run objects, respectively, to save you the trouble of extracting them as a separate step.

Keep in mind that as of Python-Docx version 0.5.3, new Paragraph objects can be added only to the end of the document, and new Run objects can be added only to the end of a Paragraph object.

The save() method can be called again to save the additional changes you’ve made.

Figure 13-9. The document with multiple Paragraph and Run objects added

Both add_paragraph() and add_run() accept an optional second argument that is a string of the Paragraph or Run object’s style. For example:

>>> doc.add_paragraph('Hello world!', 'Title')

This line adds a paragraph with the text Hello world! in the Title style.

Adding Headings

Calling add_heading() adds a paragraph with one of the heading styles. Enter the following into the interactive shell:

>>> doc = docx.Document()
>>> doc.add_heading('Header 0', 0)
<docx.text.Paragraph object at 0x00000000036CB3C8>
>>> doc.add_heading('Header 1', 1)
<docx.text.Paragraph object at 0x00000000036CB630>
>>> doc.add_heading('Header 2', 2)
<docx.text.Paragraph object at 0x00000000036CB828>
>>> doc.add_heading('Header 3', 3)
<docx.text.Paragraph object at 0x00000000036CB2E8>
>>> doc.add_heading('Header 4', 4)
<docx.text.Paragraph object at 0x00000000036CB3C8>
>>> doc.save('headings.docx')

The arguments to add_heading() are a string of the heading text and an integer from 0 to 4. The integer 0 makes the heading the Title style, which is used for the top of the document. Integers 1 to 4 are for various heading levels, with 1 being the main heading and 4 the lowest subheading. The add_heading() method returns a Paragraph object to save you the step of extracting it from the Document object afterward.
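When headings come from data, such as a list of section titles, a small loop can add them and reject levels outside the 0 to 4 range described here. This is a sketch under stated assumptions: add_outline is a hypothetical helper, and FakeDocument is a stand-in that only records add_heading() calls so the example runs without python-docx. A real Document object from docx.Document() would be passed in its place.

```python
class FakeDocument:
    """Stand-in for a Document object that records the headings added to it."""
    def __init__(self):
        self.headings = []

    def add_heading(self, text, level):
        self.headings.append((text, level))


def add_outline(doc, entries):
    """Add each (text, level) pair to doc as a heading, checking the level first."""
    for text, level in entries:
        if not 0 <= level <= 4:
            raise ValueError('heading level must be from 0 to 4, got %r' % level)
        doc.add_heading(text, level)


doc = FakeDocument()
add_outline(doc, [('Meeting Minutes', 0), ('Attendance', 1), ('Old Business', 1)])
print(doc.headings)
```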

The resulting headings.docx file will look like Figure 13-10.

Figure 13-10. The headings.docx document with headings 0 to 4

Adding Line and Page Breaks

To add a line break (rather than starting a whole new paragraph), you can call the add_break() method on the Run object you want to have the break appear after. If you want to add a page break instead, you need to pass the value docx.text.WD_BREAK.PAGE as a lone argument to add_break(), as is done in the middle of the following example:

>>> doc = docx.Document()
>>> doc.add_paragraph('This is on the first page!')
<docx.text.Paragraph object at 0x0000000003785518>
➊ >>> doc.paragraphs[0].runs[0].add_break(docx.text.WD_BREAK.PAGE)
>>> doc.add_paragraph('This is on the second page!')
<docx.text.Paragraph object at 0x00000000037855F8>
>>> doc.save('twoPage.docx')

This creates a two-page Word document with This is on the first page! on the first page and This is on the second page! on the second. Even though there was still plenty of space on the first page after the text This is on the first page!, we forced the next paragraph to begin on a new page by inserting a page break after the first run of the first paragraph ➊.

Adding Pictures

Document objects have an add_picture() method that will let you add an image to the end of the document. Say you have a file zophie.png in the current working directory. You can add zophie.png to the end of your document with a width of 1 inch and height of 4 centimeters (Word can use both imperial and metric units) by entering the following:

>>> doc.add_picture('zophie.png', width=docx.shared.Inches(1),
                    height=docx.shared.Cm(4))
<docx.shape.InlineShape object at 0x00000000036C7D30>

The first argument is a string of the image’s filename. The optional width and height keyword arguments will set the width and height of the image in the document. If left out, the width and height will default to the normal size of the image.

You’ll probably prefer to specify an image’s height and width in familiar units such as inches and centimeters, so you can use the docx.shared.Inches() and docx.shared.Cm() functions when you’re specifying the width and height keyword arguments.

Summary

Text information isn’t just for plaintext files; in fact, it’s pretty likely that you deal with PDFs and Word documents much more often. You can use the PyPDF2 module to read and write PDF documents. Unfortunately, reading text from PDF documents might not always result in a perfect translation to a string because of the complicated PDF file format, and some PDFs might not be readable at all. In these cases, you’re out of luck unless future updates to PyPDF2 support additional PDF features.

Word documents are more reliable, and you can read them with the python-docx module. You can manipulate text in Word documents via Paragraph and Run objects. These objects can also be given styles, though they must be from the default set of styles or styles already in the document. You can add new paragraphs, headings, breaks, and pictures to the document, though only to the end.

Many of the limitations that come with working with PDFs and Word documents are because these formats are meant to be nicely displayed for human readers, rather than easy to parse by software. The next chapter takes a look at two other common formats for storing information: JSON and CSV files. These formats are designed to be used by computers, and you’ll see that Python can work with these formats much more easily.

Practice Questions

Q: 1. A string value of the PDF filename is not passed to the PyPDF2.PdfFileReader() function. What do you pass to the function instead?

Q: 2. What modes do the File objects for PdfFileReader() and PdfFileWriter() need to be opened in?

Q: 3. How do you acquire a Page object for About This Book from a PdfFileReader object?

Q: 4. What PdfFileReader variable stores the number of pages in the PDF document?

Q: 5. If a PdfFileReader object’s PDF is encrypted with the password swordfish, what must you do before you can obtain Page objects from it?

Q: 6. What methods do you use to rotate a page?

Q: 7. What method returns a Document object for a file named demo.docx?

Q: 8. What is the difference between a Paragraph object and a Run object?

Q: 9. How do you obtain a list of Paragraph objects for a Document object that’s stored in a variable named doc?

Q: 10. What type of object has bold, underline, italic, strike, and outline variables?

Q: 11. What is the difference between setting the bold variable to True, False, or None?

Q: 12. How do you create a Document object for a new Word document?

Q: 13. How do you add a paragraph with the text 'Hello there!' to a Document object stored in a variable named doc?

Q: 14. What integers represent the levels of headings available in Word documents?

Practice Projects

For practice, write programs that do the following.

PDF Paranoia

Using the os.walk() function from Chapter 9, write a script that will go through every PDF in a folder (and its subfolders) and encrypt the PDFs using a password provided on the command line. Save each encrypted PDF with an _encrypted.pdf suffix added to the original filename. Before deleting the original file, have the program attempt to read and decrypt the file to ensure that it was encrypted correctly.

Then, write a program that finds all encrypted PDFs in a folder (and its subfolders) and creates a decrypted copy of the PDF using a provided password. If the password is incorrect, the program should print a message to the user and continue to the next PDF.

Custom Invitations as Word Documents

Say you have a text file of guest names. This guests.txt file has one name per line, as follows:

Prof. Plum
Miss Scarlet
Col. Mustard
Al Sweigart
Robocop

Write a program that would generate a Word document with custom invitations that look like Figure 13-11.

Since Python-Docx can use only those styles that already exist in the Word document, you will have to first add these styles to a blank Word file and then open that file with Python-Docx. There should be one invitation per page in the resulting Word document, so call add_break() to add a page break after the last paragraph of each invitation. This way, you will need to open only one Word document to print all of the invitations at once.

Figure 13-11. The Word document generated by your custom invite script

You can download a sample guests.txt file from http://nostarch.com/automatestuff/.

Brute-Force PDF Password Breaker

Say you have an encrypted PDF that you have forgotten the password to, but you remember it was a single English word. Trying to guess your forgotten password is quite a boring task. Instead you can write a program that will decrypt the PDF by trying every possible English word until it finds one that works. This is called a brute-force password attack. Download the text file dictionary.txt from http://nostarch.com/automatestuff/. This dictionary file contains over 44,000 English words with one word per line.

Using the file-reading skills you learned in Chapter 8, create a list of word strings by reading this file. Then loop over each word in this list, passing it to the decrypt() method. If this method returns the integer 0, the password was wrong and your program should continue to the next password. If decrypt() returns 1, then your program should break out of the loop and print the hacked password. You should try both the uppercase and lowercase form of each word. (On my laptop, going through all 88,000 uppercase and lowercase words from the dictionary file takes a couple of minutes. This is why you shouldn’t use a simple English word for your passwords.)

Chapter 14. Working with CSV Files and JSON Data

In Chapter 13, you learned how to extract text from PDF and Word documents. These files were in a binary format, which required special Python modules to access their data. CSV and JSON files, on the other hand, are just plaintext files. You can view them in a text editor, such as IDLE’s file editor. But Python also comes with the special csv and json modules, each providing functions to help you work with these file formats.

CSV stands for “comma-separated values,” and CSV files are simplified spreadsheets stored as plaintext files. Python’s csv module makes it easy to parse CSV files.

JSON (pronounced “JAY-sawn” or “Jason”; it doesn’t matter how, because either way people will say you’re pronouncing it wrong) is a format that stores information as JavaScript source code in plaintext files. (JSON is short for JavaScript Object Notation.) You don’t need to know the JavaScript programming language to use JSON files, but the JSON format is useful to know because it’s used in many web applications.

The CSV Module

Each line in a CSV file represents a row in the spreadsheet, and commas separate the cells in the row. For example, the spreadsheet example.xlsx from http://nostarch.com/automatestuff/ would look like this in a CSV file:

4/5/2015 13:34,Apples,73
4/5/2015 3:41,Cherries,85
4/6/2015 12:46,Pears,14
4/8/2015 8:59,Oranges,52
4/10/2015 2:07,Apples,152
4/10/2015 18:10,Bananas,23
4/10/2015 2:40,Strawberries,98

I will use this file for this chapter’s interactive shell examples. You can download example.csv from http://nostarch.com/automatestuff/ or enter the text into a text editor and save it as example.csv.

CSV files are simple, lacking many of the features of an Excel spreadsheet. For example, CSV files

Don’t have types for their values; everything is a string
Don’t have settings for font size or color
Don’t have multiple worksheets
Can’t specify cell widths and heights
Can’t have merged cells
Can’t have images or charts embedded in them

The advantage of CSV files is simplicity. CSV files are widely supported by many types of programs, can be viewed in text editors (including IDLE’s file editor), and are a straightforward way to represent spreadsheet data. The CSV format is exactly as advertised: It’s just a text file of comma-separated values.

Since CSV files are just text files, you might be tempted to read them in as a string and then process that string using the techniques you learned in Chapter 8. For example, since each cell in a CSV file is separated by a comma, maybe you could just call the split() method on each line of text to get the values. But not every comma in a CSV file represents the boundary between two cells. CSV files also have their own set of escape characters to allow commas and other characters to be included as part of the values. The split() method doesn’t handle these escape characters. Because of these potential pitfalls, you should always use the csv module for reading and writing CSV files.
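To make the pitfall concrete, here is a minimal sketch that parses the same line both ways. The sample line is made up for illustration, and io.StringIO stands in for a real file so the snippet runs without anything on disk.

```python
import csv
import io

# Hypothetical one-line CSV sample; the second cell contains a comma.
line = 'spam,"Hello, world!",eggs\n'

# Naive approach: split on every comma, including the one inside the quotes.
naive_cells = line.strip().split(',')

# csv.reader understands the quoting rules and keeps the cell intact.
parsed_cells = next(csv.reader(io.StringIO(line)))

print(naive_cells)
print(parsed_cells)
```

The split() version reports four cells and leaves stray quote characters behind, while the csv module reports the three cells that were actually written.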

Reader Objects

To read data from a CSV file with the csv module, you need to create a Reader object. A Reader object lets you iterate over lines in the CSV file. Enter the following into the interactive shell, with example.csv in the current working directory:

➊ >>> import csv
➋ >>> exampleFile = open('example.csv')
➌ >>> exampleReader = csv.reader(exampleFile)
➍ >>> exampleData = list(exampleReader)
➎ >>> exampleData
[['4/5/2015 13:34', 'Apples', '73'], ['4/5/2015 3:41', 'Cherries', '85'],
['4/6/2015 12:46', 'Pears', '14'], ['4/8/2015 8:59', 'Oranges', '52'],
['4/10/2015 2:07', 'Apples', '152'], ['4/10/2015 18:10', 'Bananas', '23'],
['4/10/2015 2:40', 'Strawberries', '98']]

The csv module comes with Python, so we can import it ➊ without having to install it first.

To read a CSV file with the csv module, first open it using the open() function ➋, just as you would any other text file. But instead of calling the read() or readlines() method on the File object that open() returns, pass it to the csv.reader() function ➌. This will return a Reader object for you to use. Note that you don’t pass a filename string directly to the csv.reader() function.

The most direct way to access the values in the Reader object is to convert it to a plain Python list by passing it to list() ➍. Using list() on this Reader object returns a list of lists, which you can store in a variable like exampleData. Entering exampleData in the shell displays the list of lists ➎.

Now that you have the CSV file as a list of lists, you can access the value at a particular row and column with the expression exampleData[row][col], where row is the index of one of the lists in exampleData, and col is the index of the item you want from that list. Enter the following into the interactive shell:

>>> exampleData[0][0]
'4/5/2015 13:34'
>>> exampleData[0][1]
'Apples'
>>> exampleData[0][2]
'73'
>>> exampleData[1][1]
'Cherries'
>>> exampleData[6][1]
'Strawberries'

exampleData[0][0] goes into the first list and gives us the first string, exampleData[0][2] goes into the first list and gives us the third string, and so on.

Reading Data from Reader Objects in a for Loop

For large CSV files, you’ll want to use the Reader object in a for loop. This avoids loading the entire file into memory at once. For example, enter the following into the interactive shell:

>>> import csv
>>> exampleFile = open('example.csv')
>>> exampleReader = csv.reader(exampleFile)
>>> for row in exampleReader:
        print('Row #' + str(exampleReader.line_num) + ' ' + str(row))

Row #1 ['4/5/2015 13:34', 'Apples', '73']
Row #2 ['4/5/2015 3:41', 'Cherries', '85']
Row #3 ['4/6/2015 12:46', 'Pears', '14']
Row #4 ['4/8/2015 8:59', 'Oranges', '52']
Row #5 ['4/10/2015 2:07', 'Apples', '152']
Row #6 ['4/10/2015 18:10', 'Bananas', '23']
Row #7 ['4/10/2015 2:40', 'Strawberries', '98']

After you import the csv module and make a Reader object from the CSV file, you can loop through the rows in the Reader object. Each row is a list of values, with each value representing a cell.

The print() function call prints the number of the current row and the contents of the row. To get the row number, use the Reader object’s line_num variable, which contains the number of the current line.

The Reader object can be looped over only once. To reread the CSV file, you must reopen (or rewind) the file and call csv.reader() again to create a new Reader object.
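The one-pass behavior is easy to check for yourself. The following sketch uses an in-memory io.StringIO object as a stand-in for an open CSV file (the two sample rows are only for illustration) and shows one way to read the rows a second time.

```python
import csv
import io

# An in-memory stand-in for an open CSV file (hypothetical sample data).
exampleFile = io.StringIO('4/5/2015 13:34,Apples,73\n4/5/2015 3:41,Cherries,85\n')

exampleReader = csv.reader(exampleFile)
firstPass = list(exampleReader)     # consumes every row
secondPass = list(exampleReader)    # nothing is left to read

# Rewind the file and build a fresh Reader to read the rows again.
exampleFile.seek(0)
thirdPass = list(csv.reader(exampleFile))

print(len(firstPass), len(secondPass), len(thirdPass))
```

With a real file, closing it and calling open() again would work just as well as seek(0).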

Writer Objects

A Writer object lets you write data to a CSV file. To create a Writer object, you use the csv.writer() function. Enter the following into the interactive shell:

>>> import csv
➊ >>> outputFile = open('output.csv', 'w', newline='')
➋ >>> outputWriter = csv.writer(outputFile)
>>> outputWriter.writerow(['spam', 'eggs', 'bacon', 'ham'])
21
>>> outputWriter.writerow(['Hello, world!', 'eggs', 'bacon', 'ham'])
32
>>> outputWriter.writerow([1, 2, 3.141592, 4])
16
>>> outputFile.close()

First, call open() and pass it 'w' to open a file in write mode ➊. This will create the object you can then pass to csv.writer() ➋ to create a Writer object.

On Windows, you’ll also need to pass a blank string for the open() function’s newline keyword argument. For technical reasons beyond the scope of this book, if you forget to set the newline argument, the rows in output.csv will be double-spaced, as shown in Figure 14-1.

Figure 14-1. If you forget the newline='' keyword argument in open(), the CSV file will be double-spaced.

The writerow() method for Writer objects takes a list argument. Each value in the list is placed in its own cell in the output CSV file. The return value of writerow() is the number of characters written to the file for that row (including newline characters).

This code produces an output.csv file that looks like this:

spam,eggs,bacon,ham
"Hello, world!",eggs,bacon,ham
1,2,3.141592,4

Notice how the Writer object automatically escapes the comma in the value 'Hello, world!' with double quotes in the CSV file. The csv module saves you from having to handle these special cases yourself.
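As a quick check of that escaping, here is a small sketch that writes a few rows to an in-memory buffer and reads them back. The io.StringIO(newline='') object is an assumption standing in for open('output.csv', 'w', newline=''), and the third row is a made-up value containing double quotes.

```python
import csv
import io

# io.StringIO(newline='') mimics a file opened with newline=''.
outputFile = io.StringIO(newline='')
outputWriter = csv.writer(outputFile)
outputWriter.writerow(['spam', 'eggs', 'bacon', 'ham'])
outputWriter.writerow(['Hello, world!', 'eggs', 'bacon', 'ham'])
outputWriter.writerow(['He said "hi"', 1, 2])    # quotes inside a value
contents = outputFile.getvalue()
print(contents)

# Reading the text back recovers the original cells (always as strings).
rows = list(csv.reader(io.StringIO(contents, newline='')))
print(rows)
```

Values that contain commas or quotes come back unchanged, but the numbers return as the strings '1' and '2', since CSV has no types.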

The delimiter and lineterminator Keyword Arguments

Say you want to separate cells with a tab character instead of a comma and you want the rows to be double-spaced. You could enter something like the following into the interactive shell:

>>> import csv
>>> csvFile = open('example.tsv', 'w', newline='')
➊ >>> csvWriter = csv.writer(csvFile, delimiter='\t', lineterminator='\n\n')
>>> csvWriter.writerow(['apples', 'oranges', 'grapes'])
23
>>> csvWriter.writerow(['eggs', 'bacon', 'ham'])
16
>>> csvWriter.writerow(['spam', 'spam', 'spam', 'spam', 'spam', 'spam'])
31
>>> csvFile.close()

This changes the delimiter and line terminator characters in your file. The delimiter is the character that appears between cells on a row. By default, the delimiter for a CSV file is a comma. The line terminator is the character that comes at the end of a row. By default, the line terminator is a newline. You can change characters to different values by using the delimiter and lineterminator keyword arguments with csv.writer().

Passing delimiter='\t' and lineterminator='\n\n' ➊ changes the character between cells to a tab and the character between rows to two newlines. We then call writerow() three times to give us three rows.

This produces a file named example.tsv with the following contents:

apples	oranges	grapes

eggs	bacon	ham

spam	spam	spam	spam	spam	spam

Now that our cells are separated by tabs, we’re using the file extension .tsv, for tab-separated values.
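Reading such a file back takes the same delimiter keyword argument on csv.reader(). The sketch below writes to an in-memory buffer rather than example.tsv (an assumption made so that it runs anywhere) and skips the empty rows that the doubled newlines produce.

```python
import csv
import io

tsvFile = io.StringIO(newline='')
csvWriter = csv.writer(tsvFile, delimiter='\t', lineterminator='\n\n')
csvWriter.writerow(['apples', 'oranges', 'grapes'])
csvWriter.writerow(['eggs', 'bacon', 'ham'])

# Rewind and read with the same delimiter; blank lines come back as [].
tsvFile.seek(0)
rows = [row for row in csv.reader(tsvFile, delimiter='\t') if row]
print(rows)
```

If you left out delimiter='\t' when reading, each row would come back as a single cell with the tab characters still inside it.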

Project: Removing the Header from CSV Files

Say you have the boring job of removing the first line from several hundred CSV files. Maybe you’ll be feeding them into an automated process that requires just the data and not the headers at the top of the columns. You could open each file in Excel, delete the first row, and resave the file, but that would take hours. Let’s write a program to do it instead.

The program will need to open every file with the .csv extension in the current working directory, read in the contents of the CSV file, and write the contents without the first row to a file of the same name in a separate headerRemoved folder. The originals stay untouched, and the new folder holds the headless copies.

NOTE

As always, whenever you write a program that modifies files, be sure to back up the files first, just in case your program does not work the way you expect it to. You don’t want to accidentally erase your original files.

At a high level, the program must do the following:

Find all the CSV files in the current working directory.
Read in the full contents of each file.
Write out the contents, skipping the first line, to a new CSV file.

At the code level, this means the program will need to do the following:

Loop over a list of files from os.listdir(), skipping the non-CSV files.
Create a CSV Reader object and read in the contents of the file, using the line_num attribute to figure out which line to skip.
Create a CSV Writer object and write out the read-in data to the new file.

For this project, open a new file editor window and save it as removeCsvHeader.py.

Step 1: Loop Through Each CSV File

The first thing your program needs to do is loop over a list of all CSV filenames for the current working directory. Make your removeCsvHeader.py look like this:

#! python3
# removeCsvHeader.py - Removes the header from all CSV files in the current
# working directory.

import csv, os

os.makedirs('headerRemoved', exist_ok=True)

# Loop through every file in the current working directory.
for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
➊       continue    # skip non-csv files

    print('Removing header from ' + csvFilename + '...')

    # TODO: Read the CSV file in (skipping first row).

    # TODO: Write out the CSV file.

The os.makedirs() call will create a headerRemoved folder where all the headless CSV files will be written. A for loop on os.listdir('.') gets you partway there, but it will loop over all files in the working directory, so you’ll need to add some code at the start of the loop that skips filenames that don’t end with .csv. The continue statement ➊ makes the for loop move on to the next filename when it comes across a non-CSV file.

Just so there’s some output as the program runs, print out a message saying which CSV file the program is working on. Then, add some TODO comments for what the rest of the program should do.

Step 2: Read in the CSV File

The program doesn’t remove the first line from the CSV file. Rather, it creates a new copy of the CSV file without the first line. The copy has the same filename as the original but is saved in the headerRemoved folder, so the original is left intact.

The program will need a way to track whether it is currently looping on the first row. Add the following to removeCsvHeader.py.

#! python3
# removeCsvHeader.py - Removes the header from all CSV files in the current
# working directory.

--snip--

    # Read the CSV file in (skipping first row).
    csvRows = []
    csvFileObj = open(csvFilename)
    readerObj = csv.reader(csvFileObj)
    for row in readerObj:
        if readerObj.line_num == 1:
            continue    # skip first row
        csvRows.append(row)
    csvFileObj.close()

    # TODO: Write out the CSV file.

The Reader object’s line_num attribute can be used to determine which line in the CSV file it is currently reading. Another for loop will loop over the rows returned from the CSV Reader object, and all rows but the first will be appended to csvRows.

As the for loop iterates over each row, the code checks whether readerObj.line_num is set to 1. If so, it executes a continue to move on to the next row without appending it to csvRows. For every row afterward, the condition will always be False, and the row will be appended to csvRows.

Step 3: Write Out the CSV File Without the First Row

Now that csvRows contains all rows but the first row, the list needs to be written out to a CSV file in the headerRemoved folder. Add the following to removeCsvHeader.py:

  #! python3
  # removeCsvHeader.py - Removes the header from all CSV files in the current
  # working directory.

  --snip--

  # Loop through every file in the current working directory.
➊ for csvFilename in os.listdir('.'):
      if not csvFilename.endswith('.csv'):
          continue    # skip non-CSV files

      --snip--

      # Write out the CSV file.
      csvFileObj = open(os.path.join('headerRemoved', csvFilename), 'w',
                        newline='')
      csvWriter = csv.writer(csvFileObj)
      for row in csvRows:
          csvWriter.writerow(row)
      csvFileObj.close()

The CSV Writer object will write the list to a CSV file in headerRemoved using csvFilename (which we also used in the CSV reader). Because the new file lives in the headerRemoved folder, the original file is not overwritten.

Once we create the Writer object, we loop over the sublists stored in csvRows and write each sublist to the file.

After the code is executed, the outer for loop ➊ will loop to the next filename from os.listdir('.'). When that loop is finished, the program will be complete.

To test your program, download removeCsvHeader.zip from http://nostarch.com/automatestuff/ and unzip it to a folder. Run the removeCsvHeader.py program in that folder. The output will look like this:

Removing header from NAICS_data_1048.csv...
Removing header from NAICS_data_1218.csv...
--snip--
Removing header from NAICS_data_9834.csv...
Removing header from NAICS_data_9986.csv...

This program should print a filename each time it strips the first line from a CSV file.
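If the files are large, you may not want to collect every row in a list first. As an alternative sketch (not the project’s own approach), the header can be discarded with next() and the remaining rows streamed straight to the new file. The remove_header() function name and the sample filenames are hypothetical, and the self-check runs in a temporary folder so no real files are touched.

```python
import csv
import os
import tempfile

def remove_header(src_path, dest_path):
    """Copy the CSV at src_path to dest_path, leaving out the first row."""
    with open(src_path, newline='') as src, \
         open(dest_path, 'w', newline='') as dest:
        reader = csv.reader(src)
        writer = csv.writer(dest)
        next(reader, None)          # discard the header row, if there is one
        writer.writerows(reader)    # stream the remaining rows across

# Small self-check in a throwaway directory (hypothetical sample data).
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, 'sample.csv')
    stripped = os.path.join(folder, 'sample_noheader.csv')
    with open(original, 'w', newline='') as f:
        csv.writer(f).writerows([['fruit', 'count'],
                                 ['Apples', '73'],
                                 ['Pears', '14']])
    remove_header(original, stripped)
    with open(stripped, newline='') as f:
        result = list(csv.reader(f))

print(result)
```

The with statements close both files automatically, even if an error occurs partway through.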

Ideas for Similar Programs

The programs that you could write for CSV files are similar to the kinds you could write for Excel files, since they’re both spreadsheet files. You could write programs to do the following:

Compare data between different rows in a CSV file or between multiple CSV files.
Copy specific data from a CSV file to an Excel file, or vice versa.
Check for invalid data or formatting mistakes in CSV files and alert the user to these errors.
Read data from a CSV file as input for your Python programs.

JSON and APIs

JavaScript Object Notation is a popular way to format data as a single human-readable string. JSON is the native way that JavaScript programs write their data structures and usually resembles what Python’s pprint() function would produce. You don’t need to know JavaScript in order to work with JSON-formatted data.

Here’s an example of data formatted as JSON:

{"name": "Zophie", "isCat": true,
 "miceCaught": 0, "napsTaken": 37.5,
 "felineIQ": null}

JSON is useful to know, because many websites offer JSON content as a way for programs to interact with the website. This is known as providing an application programming interface (API). Accessing an API is the same as accessing any other web page via a URL. The difference is that the data returned by an API is formatted (with JSON, for example) for machines; APIs aren’t easy for people to read.

Many websites make their data available in JSON format. Facebook, Twitter, Yahoo, Google, Tumblr, Wikipedia, Flickr, Data.gov, Reddit, IMDb, Rotten Tomatoes, LinkedIn, and many other popular sites offer APIs for programs to use. Some of these sites require registration, which is almost always free. You’ll have to find documentation for what URLs your program needs to request in order to get the data you want, as well as the general format of the JSON data structures that are returned. This documentation should be provided by whatever site is offering the API; if they have a “Developers” page, look for the documentation there.

Using APIs, you could write programs that do the following:

Scrape raw data from websites. (Accessing APIs is often more convenient than downloading web pages and parsing HTML with Beautiful Soup.)
Automatically download new posts from one of your social network accounts and post them to another account. For example, you could take your Tumblr posts and post them to Facebook.
Create a “movie encyclopedia” for your personal movie collection by pulling data from IMDb, Rotten Tomatoes, and Wikipedia and putting it into a single text file on your computer.

You can see some examples of JSON APIs in the resources at http://nostarch.com/automatestuff/.

The JSON Module

Python’s json module handles all the details of translating between a string with JSON data and Python values with the json.loads() and json.dumps() functions. JSON can’t store every kind of Python value. It can contain values of only the following data types: strings, integers, floats, Booleans, lists, dictionaries, and NoneType. JSON cannot represent Python-specific objects, such as File objects, CSV Reader or Writer objects, Regex objects, or Selenium WebElement objects.

Reading JSON with the loads() Function

To translate a string containing JSON data into a Python value, pass it to the json.loads() function. (The name means “load string,” not “loads.”) Enter the following into the interactive shell:

>>> stringOfJsonData = '{"name": "Zophie", "isCat": true, "miceCaught": 0, "felineIQ": null}'
>>> import json
>>> jsonDataAsPythonValue = json.loads(stringOfJsonData)
>>> jsonDataAsPythonValue
{'isCat': True, 'miceCaught': 0, 'name': 'Zophie', 'felineIQ': None}

After you import the json module, you can call loads() and pass it a string of JSON data; it will return that data as a Python dictionary. Note that JSON strings always use double quotes. Older versions of Python do not keep dictionaries in order (Python 3.7 and later preserve insertion order), so the key-value pairs may appear in a different order when you print jsonDataAsPythonValue.
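Two details are worth seeing in action: how JSON’s true and null are converted, and what happens when a string uses single quotes. This is a minimal sketch; the singleQuotesAccepted variable is just an illustrative name.

```python
import json

stringOfJsonData = '{"name": "Zophie", "isCat": true, "miceCaught": 0, "felineIQ": null}'
value = json.loads(stringOfJsonData)

# JSON true/false/null become Python True/False/None.
print(type(value), value['isCat'], value['felineIQ'])

# Single-quoted strings are valid Python but not valid JSON.
try:
    json.loads("{'name': 'Zophie'}")
    singleQuotesAccepted = True
except json.JSONDecodeError:
    singleQuotesAccepted = False

print(singleQuotesAccepted)
```

Catching json.JSONDecodeError like this is also a reasonable way to handle data from a website that turns out not to be JSON at all.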

Writing JSON with the dumps() Function

The json.dumps() function (which means “dump string,” not “dumps”) will translate a Python value into a string of JSON-formatted data. Enter the following into the interactive shell:

>>> pythonValue = {'isCat': True, 'miceCaught': 0, 'name': 'Zophie',
'felineIQ': None}
>>> import json
>>> stringOfJsonData = json.dumps(pythonValue)
>>> stringOfJsonData
'{"isCat": true, "felineIQ": null, "miceCaught": 0, "name": "Zophie"}'

The value can only be one of the following basic Python data types: dictionary, list, integer, float, string, Boolean, or None.
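The sketch below shows what that restriction looks like in practice, along with two optional keyword arguments of json.dumps(), sort_keys and indent, that make the output easier to read. The 'toys' dictionary is a made-up example of an unsupported value.

```python
import json

pythonValue = {'isCat': True, 'miceCaught': 0, 'name': 'Zophie',
               'felineIQ': None}

# sort_keys and indent are optional; they make the output stable and readable.
prettyJson = json.dumps(pythonValue, sort_keys=True, indent=4)
print(prettyJson)

# Converting back yields an equal dictionary.
roundTripped = json.loads(prettyJson)

# A set is not one of the supported types, so dumps() raises TypeError.
try:
    json.dumps({'toys': {'mouse', 'yarn'}})
    setAccepted = True
except TypeError:
    setAccepted = False

print(setAccepted)
```

Converting the set to a list first (for example with sorted()) would make it acceptable to json.dumps().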

Project: Fetching Current Weather Data

Checking the weather seems fairly trivial: Open your web browser, click the address bar, type the URL to a weather website (or search for one and then click the link), wait for the page to load, look past all the ads, and so on.

Actually, there are a lot of boring steps you could skip if you had a program that downloaded the weather forecast for the next few days and printed it as plaintext. This program uses the requests module from Chapter 11 to download data from the Web.

Overall, the program does the following:

Reads the requested location from the command line.
Downloads JSON weather data from OpenWeatherMap.org.
Converts the string of JSON data to a Python data structure.
Prints the weather for today and the next two days.

So the code will need to do the following:

Join strings in sys.argv to get the location.
Call requests.get() to download the weather data.
Call json.loads() to convert the JSON data to a Python data structure.
Print the weather forecast.

For this project, open a new file editor window and save it as quickWeather.py.

Step 1: Get Location from the Command Line Argument

The input for this program will come from the command line. Make quickWeather.py look like this:

#! python3
# quickWeather.py - Prints the weather for a location from the command line.

import json, requests, sys

# Compute location from command line arguments.
if len(sys.argv) < 2:
    print('Usage: quickWeather.py location')
    sys.exit()
location = ' '.join(sys.argv[1:])

# TODO: Download the JSON data from OpenWeatherMap.org's API.

# TODO: Load JSON data into a Python variable.

In Python, command line arguments are stored in the sys.argv list. After the #! shebang line and import statements, the program will check that there is more than one command line argument. (Recall that sys.argv will always have at least one element, sys.argv[0], which contains the Python script’s filename.) If there is only one element in the list, then the user didn’t provide a location on the command line, and a “usage” message will be provided to the user before the program ends.

Command line arguments are split on spaces. The command line argument San Francisco, CA would make sys.argv hold ['quickWeather.py', 'San', 'Francisco,', 'CA']. Therefore, call the join() method to join all the strings except for the first in sys.argv. Store this joined string in a variable named location.
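You can try the joining step without touching the real command line by passing a list in directly. In this sketch, location_from_argv() is a hypothetical helper name, not part of quickWeather.py.

```python
def location_from_argv(argv):
    """Return the location typed after the script name, or None if missing."""
    if len(argv) < 2:
        return None
    return ' '.join(argv[1:])

print(location_from_argv(['quickWeather.py', 'San', 'Francisco,', 'CA']))
print(location_from_argv(['quickWeather.py']))
```

Joining with a single space puts back the spaces that the command line split on, so the three pieces become San Francisco, CA again.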

Step 2: Download the JSON Data

OpenWeatherMap.org provides real-time weather information in JSON format. Your program simply has to download the page at http://api.openweathermap.org/data/2.5/forecast/daily?q=<Location>&cnt=3, where <Location> is the name of the city whose weather you want. Add the following to quickWeather.py.

#! python3
# quickWeather.py - Prints the weather for a location from the command line.

--snip--

# Download the JSON data from OpenWeatherMap.org's API.
url = 'http://api.openweathermap.org/data/2.5/forecast/daily?q=%s&cnt=3' % (location)
response = requests.get(url)
response.raise_for_status()

# TODO: Load JSON data into a Python variable.

We have location from our command line arguments. To make the URL we want to access, we use the %s placeholder and insert whatever string is stored in location into that spot in the URL string. We store the result in url and pass url to requests.get(). The requests.get() call returns a Response object, which you can check for errors by calling raise_for_status(). If no exception is raised, the downloaded text will be in response.text.
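A location such as San Francisco, CA contains spaces and a comma, which are safer to escape before they go into a URL. One possible sketch uses urlencode() from the standard library; the build_forecast_url() helper is a hypothetical name, and the service may have changed since this was written (for example, by requiring an API key), so check its documentation before relying on the URL.

```python
from urllib.parse import urlencode

def build_forecast_url(location, days=3):
    """Build the forecast URL with the query string safely escaped."""
    base = 'http://api.openweathermap.org/data/2.5/forecast/daily'
    return base + '?' + urlencode({'q': location, 'cnt': days})

url = build_forecast_url('San Francisco, CA')
print(url)
```

The requests module can also do this escaping for you if you pass a dictionary as the params keyword argument of requests.get().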

Step 3: Load JSON Data and Print Weather

The response.text member variable holds a large string of JSON-formatted data. To convert this to a Python value, call the json.loads() function. The JSON data, once loaded, will look something like this:

{'city': {'coord': {'lat': 37.7771, 'lon': -122.42},
          'country': 'United States of America',
          'id': '5391959',
          'name': 'San Francisco',
          'population': 0},
 'cnt': 3,
 'cod': '200',
 'list': [{'clouds': 0,
           'deg': 233,
           'dt': 1402344000,
           'humidity': 58,
           'pressure': 1012.23,
           'speed': 1.96,
           'temp': {'day': 302.29,
                    'eve': 296.46,
                    'max': 302.29,
                    'min': 289.77,
                    'morn': 294.59,
                    'night': 289.77},
           'weather': [{'description': 'sky is clear',
                        'icon': '01d',
--snip--

You can see this data by passing weatherData to pprint.pprint(). You may want to check http://openweathermap.org/ for more documentation on what these fields mean. For example, the online documentation will tell you that the 302.29 after 'day' is the daytime temperature in Kelvin, not Celsius or Fahrenheit.
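If you would rather see familiar units, the conversion from Kelvin is simple arithmetic: subtract 273.15 to get Celsius, then multiply by 9/5 and add 32 to get Fahrenheit. The helper names below are hypothetical.

```python
def kelvin_to_celsius(kelvin):
    return kelvin - 273.15

def kelvin_to_fahrenheit(kelvin):
    return kelvin_to_celsius(kelvin) * 9 / 5 + 32

day_kelvin = 302.29    # the 'day' value from the sample data
print(round(kelvin_to_celsius(day_kelvin), 2))
print(round(kelvin_to_fahrenheit(day_kelvin), 2))
```

For the sample value, that works out to roughly 29 degrees Celsius, or about 84 degrees Fahrenheit.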

The weather descriptions you want are after 'main' and 'description'. To neatly print them out, add the following to quickWeather.py.

  #! python3
  # quickWeather.py - Prints the weather for a location from the command line.

  --snip--

  # Load JSON data into a Python variable.
  weatherData = json.loads(response.text)

  # Print weather descriptions.
➊ w = weatherData['list']
  print('Current weather in %s:' % (location))
  print(w[0]['weather'][0]['main'], '-', w[0]['weather'][0]['description'])
  print()
  print('Tomorrow:')
  print(w[1]['weather'][0]['main'], '-', w[1]['weather'][0]['description'])
  print()
  print('Day after tomorrow:')
  print(w[2]['weather'][0]['main'], '-', w[2]['weather'][0]['description'])

Notice how the code stores weatherData['list'] in the variable w to save you some typing ➊. You use w[0], w[1], and w[2] to retrieve the dictionaries for today, tomorrow, and the day after tomorrow’s weather, respectively. Each of these dictionaries has a 'weather' key, which contains a list value. You’re interested in the first list item, a nested dictionary with several more keys, at index 0. Here, we print the values stored in the 'main' and 'description' keys, separated by a hyphen.
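The three nearly identical print() calls can also be written as a loop. This sketch runs on a small hand-made list shaped like weatherData['list'] (the sample values and the labels list are assumptions for illustration), so it works without downloading anything.

```python
# Hypothetical sample shaped like the 'list' value described in the text.
w = [
    {'weather': [{'main': 'Clear', 'description': 'sky is clear'}]},
    {'weather': [{'main': 'Clouds', 'description': 'few clouds'}]},
    {'weather': [{'main': 'Clear', 'description': 'sky is clear'}]},
]
labels = ['Current weather:', 'Tomorrow:', 'Day after tomorrow:']

lines = []
for label, day in zip(labels, w):
    first = day['weather'][0]    # the nested dictionary at index 0
    lines.append(label)
    lines.append('%s - %s' % (first['main'], first['description']))

print('\n'.join(lines))
```

Because zip() stops at the shorter list, the loop also copes with a response that holds fewer than three days.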

When this program is run with the command line argument quickWeather.py San Francisco, CA, the output looks something like this:

Current weather in San Francisco, CA:
Clear - sky is clear

Tomorrow:
Clouds - few clouds

Day after tomorrow:
Clear - sky is clear

(The weather is one of the reasons I like living in San Francisco!)

Ideas for Similar Programs

Accessing weather data can form the basis for many types of programs. You can create similar programs to do the following:

Collect weather forecasts for several campsites or hiking trails to see which one will have the best weather.
Schedule a program to regularly check the weather and send you a frost alert if you need to move your plants indoors. (Chapter 15 covers scheduling, and Chapter 16 explains how to send email.)
Pull weather data from multiple sites to show all at once, or calculate and show the average of the multiple weather predictions.

Summary

CSV and JSON are common plaintext formats for storing data. They are easy for programs to parse while still being human readable, so they are often used for simple spreadsheets or web app data. The csv and json modules greatly simplify the process of reading and writing to CSV and JSON files.

The last few chapters have taught you how to use Python to parse information from a wide variety of file formats. One common task is taking data from a variety of formats and parsing it for the particular information you need. These tasks are often specific to the point that commercial software is not optimally helpful. By writing your own scripts, you can make the computer handle large amounts of data presented in these formats.

In Chapter 15, you’ll break away from data formats and learn how to make your programs communicate with you by sending emails and text messages.

Practice Questions

Q: 1. What are some features Excel spreadsheets have that CSV spreadsheets don’t?

Q: 2. What do you pass to csv.reader() and csv.writer() to create Reader and Writer objects?

Q: 3. What modes do File objects for Reader and Writer objects need to be opened in?

Q: 4. What method takes a list argument and writes it to a CSV file?

Q: 5. What do the delimiter and lineterminator keyword arguments do?

Q: 6. What function takes a string of JSON data and returns a Python data structure?

Q: 7. What function takes a Python data structure and returns a string of JSON data?

Practice Project

For practice, write a program that does the following.

Excel-to-CSV Converter

Excel can save a spreadsheet to a CSV file with a few mouse clicks, but if you had to convert hundreds of Excel files to CSVs, it would take hours of clicking. Using the openpyxl module from Chapter 12, write a program that reads all the Excel files in the current working directory and outputs them as CSV files.

A single Excel file might contain multiple sheets; you’ll have to create one CSV file per sheet. The filenames of the CSV files should be <excel filename>_<sheet title>.csv, where <excel filename> is the filename of the Excel file without the file extension (for example, 'spam_data', not 'spam_data.xlsx') and <sheet title> is the string from the Worksheet object’s title variable.

This program will involve many nested for loops. The skeleton of the program will look something like this:

for excelFile in os.listdir('.'):
    # Skip non-xlsx files, load the workbook object.
    for sheetName in wb.get_sheet_names():
        # Loop through every sheet in the workbook.
        sheet = wb.get_sheet_by_name(sheetName)

        # Create the CSV filename from the Excel filename and sheet title.
        # Create the csv.writer object for this CSV file.

        # Loop through every row in the sheet.
        for rowNum in range(1, sheet.get_highest_row() + 1):
            rowData = []    # append each cell to this list
            # Loop through each cell in the row.
            for colNum in range(1, sheet.get_highest_column() + 1):
                # Append each cell's data to rowData.

            # Write the rowData list to the CSV file.

        csvFile.close()

Download the ZIP file excelSpreadsheets.zip from http://nostarch.com/automatestuff/, and unzip the spreadsheets into the same directory as your program. You can use these as the files to test the program on.

Chapter 15. Keeping Time, Scheduling Tasks, and Launching Programs

Running programs while you’re sitting at your computer is fine, but it’s also useful to have programs run without your direct supervision. Your computer’s clock can schedule programs to run code at some specified time and date or at regular intervals. For example, your program could scrape a website every hour to check for changes or do a CPU-intensive task at 4 AM while you sleep. Python’s time and datetime modules provide these functions.

You can also write programs that launch other programs on a schedule by using the subprocess and threading modules. Often, the fastest way to program is to take advantage of applications that other people have already written.

The time Module

Your computer’s system clock is set to a specific date, time, and time zone. The built-in time module allows your Python programs to read the system clock for the current time. The time.time() and time.sleep() functions are the most useful in the time module.

The time.time() Function

The Unix epoch is a time reference commonly used in programming: 12 AM on January 1, 1970, Coordinated Universal Time (UTC). The time.time() function returns the number of seconds since that moment as a float value. (Recall that a float is just a number with a decimal point.) This number is called an epoch timestamp. For example, enter the following into the interactive shell:

>>> import time
>>> time.time()
1425063955.068649

Here I’m calling time.time() on February 27, 2015, at 11:05 AM Pacific Standard Time, or 7:05 PM UTC. The return value is how many seconds have passed between the Unix epoch and the moment time.time() was called.

NOTE

TheinteractiveshellexampleswillyielddatesandtimesforwhenIwrotethischapterinFebruary2015.Unlessyou’reatimetraveler,yourdatesandtimeswillbedifferent.

Epoch timestamps can be used to profile code, that is, measure how long a piece of code takes to run. If you call time.time() at the beginning of the code block you want to measure and again at the end, you can subtract the first timestamp from the second to find the elapsed time between those two calls. For example, open a new file editor window and enter the following program:

  import time
➊ def calcProd():
      # Calculate the product of the integers from 1 to 99,999.
      product = 1
      for i in range(1, 100000):
          product = product * i
      return product

➋ startTime = time.time()
  prod = calcProd()
➌ endTime = time.time()
➍ print('The result is %s digits long.' % (len(str(prod))))
➎ print('Took %s seconds to calculate.' % (endTime - startTime))

At ➊, we define a function calcProd() to loop through the integers from 1 to 99,999 and return their product. At ➋, we call time.time() and store it in startTime. Right after calling calcProd(), we call time.time() again and store it in endTime ➌. We end by printing the length of the product returned by calcProd() ➍ and how long it took to run calcProd() ➎.

Save this program as calcProd.py and run it. The output will look something like this:

The result is 456569 digits long.
Took 2.844162940979004 seconds to calculate.

(On Python 3.11 and later, str() refuses to convert an integer this large by default and raises a ValueError; calling sys.set_int_max_str_digits(0) at the start of the program lifts that limit.)

NOTE

Another way to profile your code is to use the cProfile.run() function, which provides a much more informative level of detail than the simple time.time() technique. The cProfile.run() function is explained at https://docs.python.org/3/library/profile.html.
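The start/stop pattern above can be wrapped in a small helper. The following is only a sketch: the names timeCall() and sumTo() are hypothetical, and it swaps in the standard library's time.perf_counter(), a clock intended for measuring short intervals, in place of time.time().

```python
import time

def timeCall(func, *args):
    """Call func(*args) and return (result, elapsedSeconds)."""
    startTime = time.perf_counter()   # clock intended for measuring intervals
    result = func(*args)
    elapsed = time.perf_counter() - startTime
    return result, elapsed

def sumTo(n):
    # A small stand-in workload: add the integers 0 through n - 1.
    return sum(range(n))

total, seconds = timeCall(sumTo, 1000)
print('Result %s took %s seconds.' % (total, round(seconds, 6)))
```

Returning the result together with the elapsed time means the measured function does not need to be called a second time just to get its value.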

The time.sleep() Function

If you need to pause your program for a while, call the time.sleep() function and pass it the number of seconds you want your program to stay paused. Enter the following into the interactive shell:

  >>> import time
  >>> for i in range(3):
➊         print('Tick')
➋         time.sleep(1)
➌         print('Tock')
➍         time.sleep(1)

  Tick
  Tock
  Tick
  Tock
  Tick
  Tock
➎ >>> time.sleep(5)

The for loop will print Tick ➊, pause for one second ➋, print Tock ➌, pause for one second ➍, print Tick, pause, and so on until Tick and Tock have each been printed three times.

The time.sleep() function will block (that is, it will not return and release your program to execute other code) until after the number of seconds you passed to time.sleep() has elapsed. For example, if you enter time.sleep(5) ➎, you’ll see that the next prompt (>>>) doesn’t appear until five seconds have passed.

Be aware that pressing CTRL-C will not interrupt time.sleep() calls in IDLE. IDLE waits until the entire pause is over before raising the KeyboardInterrupt exception. To work around this problem, instead of having a single time.sleep(30) call to pause for 30 seconds, use a for loop to make 30 calls to time.sleep(1).

>>> for i in range(30):
        time.sleep(1)

If you press CTRL-C sometime during these 30 seconds, you should see the KeyboardInterrupt exception raised right away.

Rounding Numbers

When working with times, you’ll often encounter float values with many digits after the decimal. To make these values easier to work with, you can shorten them with Python’s built-in round() function, which rounds a float to the precision you specify. Just pass in the number you want to round, plus an optional second argument representing how many digits after the decimal point you want to round it to. If you omit the second argument, round() rounds your number to the nearest whole integer. Enter the following into the interactive shell:

>>> import time
>>> now = time.time()
>>> now
1425064108.017826
>>> round(now, 2)
1425064108.02
>>> round(now, 4)
1425064108.0178
>>> round(now)
1425064108

After importing time and storing time.time() in now, we call round(now, 2) to round now to two digits after the decimal, round(now, 4) to round to four digits after the decimal, and round(now) to round to the nearest integer.
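One detail of round() can surprise newcomers, so here is a short sketch of it. In Python 3, a value exactly halfway between two integers rounds to the nearest even integer, and calling round() without the second argument returns an int rather than a float. The variable names are only for illustration.

```python
lowHalf = round(2.5)      # 2, not 3: exact halves go to the nearest even integer
highHalf = round(3.5)     # 4
whole = round(7.0)        # an int, because the second argument was omitted
cents = round(7.126, 2)   # 7.13

print(lowHalf, highHalf, type(whole), cents)
```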

Project: Super Stopwatch

Say you want to track how much time you spend on boring tasks you haven’t automated yet. You don’t have a physical stopwatch, and it’s surprisingly difficult to find a free stopwatch app for your laptop or smartphone that isn’t covered in ads and doesn’t send a copy of your browser history to marketers. (It says it can do this in the license agreement you agreed to. You did read the license agreement, didn’t you?) You can write a simple stopwatch program yourself in Python.

At a high level, here’s what your program will do:

- Track the amount of time elapsed between presses of the ENTER key, with each key press starting a new “lap” on the timer.
- Print the lap number, total time, and lap time.

This means your code will need to do the following:

- Find the current time by calling time.time() and store it as a timestamp at the start of the program, as well as at the start of each lap.
- Keep a lap counter and increment it every time the user presses ENTER.
- Calculate the elapsed time by subtracting timestamps.
- Handle the KeyboardInterrupt exception so the user can press CTRL-C to quit.

Open a new file editor window and save it as stopwatch.py.

Step 1: Set Up the Program to Track Times

The stopwatch program will need to use the current time, so you’ll want to import the time module. Your program should also print some brief instructions to the user before calling input(), so the timer can begin after the user presses ENTER. Then the code will start tracking lap times.

Enter the following code into the file editor, writing a TODO comment as a placeholder for the rest of the code:

#! python3
# stopwatch.py - A simple stopwatch program.

import time

# Display the program's instructions.
print('Press ENTER to begin. Afterwards, press ENTER to "click" the stopwatch. Press Ctrl-C to quit.')
input()                    # press Enter to begin
print('Started.')
startTime = time.time()    # get the first lap's start time
lastTime = startTime
lapNum = 1

# TODO: Start tracking the lap times.

We’ve now written the code to display the instructions, start the first lap, note the time, and set our lap count to 1.

Step 2: Track and Print Lap Times

Now let’s write the code to start each new lap, calculate how long the previous lap took, and calculate the total time elapsed since starting the stopwatch. We’ll display the lap time and total time and increase the lap count for each new lap. Add the following code to your program:

  #! python3
  # stopwatch.py - A simple stopwatch program.

  import time

  --snip--

  # Start tracking the lap times.
➊ try:
➋     while True:
          input()
➌         lapTime = round(time.time() - lastTime, 2)
➍         totalTime = round(time.time() - startTime, 2)
➎         print('Lap #%s: %s (%s)' % (lapNum, totalTime, lapTime), end='')
          lapNum += 1
          lastTime = time.time() # reset the last lap time
➏ except KeyboardInterrupt:
      # Handle the Ctrl-C exception to keep its error message from displaying.
      print('\nDone.')

If the user presses CTRL-C to stop the stopwatch, the KeyboardInterrupt exception will be raised, and the program will crash if its execution is not in a try statement. To prevent crashing, we wrap this part of the program in a try statement ➊. We’ll handle the exception in the except clause ➏, so when CTRL-C is pressed and the exception is raised, the program execution moves to the except clause to print Done, instead of the KeyboardInterrupt error message. Until this happens, the execution is inside an infinite loop ➋ that calls input() and waits until the user presses ENTER to end a lap. When a lap ends, we calculate how long the lap took by subtracting the start time of the lap, lastTime, from the current time, time.time() ➌. We calculate the total time elapsed by subtracting the overall start time of the stopwatch, startTime, from the current time ➍.

Since the results of these time calculations will have many digits after the decimal point (such as 4.766272783279419), we use the round() function to round the float value to two digits at ➌ and ➍.

At ➎, we print the lap number, total time elapsed, and the lap time. Since the user pressing ENTER for the input() call will print a newline to the screen, pass end='' to the print() function to avoid double-spacing the output. After printing the lap information, we get ready for the next lap by adding 1 to the count lapNum and setting lastTime to the current time, which is the start time of the next lap.
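The lap arithmetic can be checked without pressing any keys. The sketch below uses a hypothetical helper, lapReport(), that takes made-up timestamps in place of live time.time() calls and applies the same subtraction and rounding as the stopwatch loop.

```python
def lapReport(startTime, pressTimes):
    """Return a list of (lapNum, totalTime, lapTime) tuples.

    pressTimes holds the epoch timestamp of each ENTER press, in order.
    """
    report = []
    lastTime = startTime
    for lapNum, pressTime in enumerate(pressTimes, start=1):
        lapTime = round(pressTime - lastTime, 2)
        totalTime = round(pressTime - startTime, 2)
        report.append((lapNum, totalTime, lapTime))
        lastTime = pressTime   # the next lap starts at this press
    return report

# Made-up timestamps: the stopwatch starts at 100.0 and ENTER is pressed 3 times.
laps = lapReport(100.0, [101.5, 104.0, 104.25])
for lapNum, totalTime, lapTime in laps:
    print('Lap #%s: %s (%s)' % (lapNum, totalTime, lapTime))
```

Separating the calculation from input() in this way is a design choice for testing only; the interactive program does not need it.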

Ideas for Similar Programs

Time tracking opens up several possibilities for your programs. Although you can download apps to do some of these things, the benefit of writing programs yourself is that they will be free and not bloated with ads and useless features. You could write similar programs to do the following:

- Create a simple timesheet app that records when you type a person’s name and uses the current time to clock them in or out.
- Add a feature to your program to display the elapsed time since a process started, such as a download that uses the requests module. (See Chapter 11.)
- Intermittently check how long a program has been running and offer the user a chance to cancel tasks that are taking too long.

The datetime Module

The time module is useful for getting a Unix epoch timestamp to work with. But if you want to display a date in a more convenient format, or do arithmetic with dates (for example, figuring out what date was 205 days ago or what date is 123 days from now), you should use the datetime module.

The datetime module has its own datetime data type. datetime values represent a specific moment in time. Enter the following into the interactive shell:

  >>> import datetime
➊ >>> datetime.datetime.now()
➋ datetime.datetime(2015, 2, 27, 11, 10, 49, 55053)
➌ >>> dt = datetime.datetime(2015, 10, 21, 16, 29, 0)
➍ >>> dt.year, dt.month, dt.day
  (2015, 10, 21)
➎ >>> dt.hour, dt.minute, dt.second
  (16, 29, 0)

Calling datetime.datetime.now() ➊ returns a datetime object ➋ for the current date and time, according to your computer’s clock. This object includes the year, month, day, hour, minute, second, and microsecond of the current moment. You can also retrieve a datetime object for a specific moment by using the datetime.datetime() function ➌, passing it integers representing the year, month, day, hour, minute, and second of the moment you want. These integers will be stored in the datetime object’s year, month, day ➍, hour, minute, and second ➎ attributes.

A Unix epoch timestamp can be converted to a datetime object with the datetime.datetime.fromtimestamp() function. The date and time of the datetime object will be converted for the local time zone. Enter the following into the interactive shell:

>>> datetime.datetime.fromtimestamp(1000000)
datetime.datetime(1970, 1, 12, 5, 46, 40)
>>> datetime.datetime.fromtimestamp(time.time())
datetime.datetime(2015, 2, 27, 11, 13, 0, 604980)

Calling datetime.datetime.fromtimestamp() and passing it 1000000 returns a datetime object for the moment 1,000,000 seconds after the Unix epoch. Passing time.time(), the Unix epoch timestamp for the current moment, returns a datetime object for the current moment. So the expressions datetime.datetime.now() and datetime.datetime.fromtimestamp(time.time()) do the same thing; they both give you a datetime object for the present moment.

NOTE

These examples were entered on a computer set to Pacific Standard Time. If you’re in another time zone, your results will look different.
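If you want a result that does not depend on the computer’s time zone, one option is to ask for UTC explicitly. This sketch relies on the optional tz argument of fromtimestamp() and on datetime.timezone.utc, both from the standard library; the variable name utcMoment is only for illustration.

```python
import datetime

# 1,000,000 seconds after the Unix epoch, expressed in UTC instead of local time.
utcMoment = datetime.datetime.fromtimestamp(1000000, tz=datetime.timezone.utc)

print(utcMoment)   # 1970-01-12 13:46:40+00:00
print(utcMoment.hour, utcMoment.minute, utcMoment.second)
```

The UTC hour is 13, eight hours ahead of the 5 shown in the Pacific Standard Time example.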

datetime objects can be compared with each other using comparison operators to find out which one precedes the other. The later datetime object is the “greater” value. Enter the following into the interactive shell:

➊ >>> halloween2015 = datetime.datetime(2015, 10, 31, 0, 0, 0)
➋ >>> newyears2016 = datetime.datetime(2016, 1, 1, 0, 0, 0)
  >>> oct31_2015 = datetime.datetime(2015, 10, 31, 0, 0, 0)
➌ >>> halloween2015 == oct31_2015
  True
➍ >>> halloween2015 > newyears2016
  False
➎ >>> newyears2016 > halloween2015
  True
  >>> newyears2016 != oct31_2015
  True

Make a datetime object for the first moment (midnight) of October 31, 2015 and store it in halloween2015 ➊. Make a datetime object for the first moment of January 1, 2016 and store it in newyears2016 ➋. Then make another object for midnight on October 31, 2015 and store it in oct31_2015. Comparing halloween2015 and oct31_2015 shows that they’re equal ➌. Comparing newyears2016 and halloween2015 shows that newyears2016 is greater (later) than halloween2015 ➍ ➎.

The timedelta Data Type

The datetime module also provides a timedelta data type, which represents a duration of time rather than a moment in time. Enter the following into the interactive shell:

➊ >>> delta = datetime.timedelta(days=11, hours=10, minutes=9, seconds=8)
➋ >>> delta.days, delta.seconds, delta.microseconds
  (11, 36548, 0)
  >>> delta.total_seconds()
  986948.0
  >>> str(delta)
  '11 days, 10:09:08'

To create a timedelta object, use the datetime.timedelta() function. The datetime.timedelta() function takes keyword arguments weeks, days, hours, minutes, seconds, milliseconds, and microseconds. There is no month or year keyword argument because “a month” or “a year” is a variable amount of time depending on the particular month or year. A timedelta object has the total duration represented in days, seconds, and microseconds. These numbers are stored in the days, seconds, and microseconds attributes, respectively. The total_seconds() method will return the duration in number of seconds alone. Passing a timedelta object to str() will return a nicely formatted, human-readable string representation of the object.

In this example, we pass keyword arguments to datetime.timedelta() to specify a duration of 11 days, 10 hours, 9 minutes, and 8 seconds, and store the returned timedelta object in delta ➊. This timedelta object’s days attribute stores 11, and its seconds attribute stores 36548 (10 hours, 9 minutes, and 8 seconds, expressed in seconds) ➋. Calling total_seconds() tells us that 11 days, 10 hours, 9 minutes, and 8 seconds is 986,948 seconds. Finally, passing the timedelta object to str() returns a string clearly explaining the duration.
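To see how the other keyword arguments are folded into the three stored attributes, here is a small sketch with arbitrarily chosen values. A week becomes 7 days, 25 hours becomes 1 day plus 3,600 seconds, and 1,500 milliseconds becomes 1 second plus 500,000 microseconds.

```python
import datetime

delta = datetime.timedelta(weeks=1, hours=25, milliseconds=1500)

print(delta.days, delta.seconds, delta.microseconds)   # 8 3601 500000
print(delta.total_seconds())                           # 694801.5
print(str(delta))                                      # 8 days, 1:00:01.500000
```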

The arithmetic operators can be used to perform date arithmetic on datetime values. For example, to calculate the date 1,000 days from now, enter the following into the interactive shell:

>>> dt = datetime.datetime.now()
>>> dt
datetime.datetime(2015, 2, 27, 18, 38, 50, 636181)
>>> thousandDays = datetime.timedelta(days=1000)
>>> dt + thousandDays
datetime.datetime(2017, 11, 23, 18, 38, 50, 636181)

First, make a datetime object for the current moment and store it in dt. Then make a timedelta object for a duration of 1,000 days and store it in thousandDays. Add dt and thousandDays together to get a datetime object for the date 1,000 days from now. Python will do the date arithmetic to figure out that 1,000 days after February 27, 2015, will be November 23, 2017. This is useful because when you calculate 1,000 days from a given date, you have to remember how many days are in each month and factor in leap years and other tricky details. The datetime module handles all of this for you.

timedelta objects can be added or subtracted with datetime objects or other timedelta objects using the + and - operators. A timedelta object can be multiplied or divided by integer or float values with the * and / operators. Enter the following into the interactive shell:

➊ >>> oct21st = datetime.datetime(2015, 10, 21, 16, 29, 0)
➋ >>> aboutThirtyYears = datetime.timedelta(days=365 * 30)
  >>> oct21st
  datetime.datetime(2015, 10, 21, 16, 29)
  >>> oct21st - aboutThirtyYears
  datetime.datetime(1985, 10, 28, 16, 29)
  >>> oct21st - (2 * aboutThirtyYears)
  datetime.datetime(1955, 11, 5, 16, 29)

Here we make a datetime object for October 21, 2015 ➊ and a timedelta object for a duration of about 30 years (we’re assuming 365 days for each of those years) ➋. Subtracting aboutThirtyYears from oct21st gives us a datetime object for the date about 30 years before October 21, 2015. The result is October 28, 1985, rather than October 21, because the 365-day approximation ignores leap days. Subtracting 2 * aboutThirtyYears from oct21st returns a datetime object for the date about 60 years before October 21, 2015.
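Subtraction also works between two datetime objects, and the result is a timedelta. The sketch below uses that to count the actual days in the 30 years ending October 21, 2015, which shows where the seven-day drift in the 365-day approximation comes from. The variable names are only for illustration.

```python
import datetime

oct21st2015 = datetime.datetime(2015, 10, 21, 16, 29, 0)
oct21st1985 = datetime.datetime(1985, 10, 21, 16, 29, 0)

span = oct21st2015 - oct21st1985    # subtracting datetimes gives a timedelta
leapDays = span.days - 365 * 30     # days the 365-day approximation misses

print(span.days)   # 10957
print(leapDays)    # 7
```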

Pausing Until a Specific Date

The time.sleep() function lets you pause a program for a certain number of seconds. By using a while loop, you can pause your programs until a specific date. For example, the following code will continue to loop until Halloween 2016:

import datetime
import time
halloween2016 = datetime.datetime(2016, 10, 31, 0, 0, 0)
while datetime.datetime.now() < halloween2016:
    time.sleep(1)

The time.sleep(1) call will pause your Python program so that the computer doesn’t waste CPU processing cycles simply checking the time over and over. Rather, the while loop will just check the condition once per second and continue with the rest of the program after Halloween 2016 (or whenever you program it to stop).
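The same loop can be packaged as a reusable function. This is a minimal sketch, assuming a hypothetical helper name sleepUntil() and a checkInterval parameter that are not part of the book; the demonstration waits for a moment only a fraction of a second away so that it finishes quickly.

```python
import datetime
import time

def sleepUntil(target, checkInterval=1.0):
    """Block until datetime.datetime.now() reaches the target datetime."""
    while True:
        remaining = (target - datetime.datetime.now()).total_seconds()
        if remaining <= 0:
            return
        # Sleep in short chunks, never longer than the time still remaining.
        time.sleep(min(checkInterval, remaining))

wakeTime = datetime.datetime.now() + datetime.timedelta(seconds=0.3)
sleepUntil(wakeTime)
print(datetime.datetime.now() >= wakeTime)
```

Capping each sleep at the remaining time keeps the function from waking up as much as a full interval late.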

Converting datetime Objects into Strings

Epoch timestamps and datetime objects aren’t very friendly to the human eye. Use the strftime() method to display a datetime object as a string. (The f in the name of the strftime() function stands for format.)

The strftime() method uses directives similar to Python’s string formatting. Table 15-1 has a full list of strftime() directives.

Table 15-1. strftime() Directives

strftime directive   Meaning
%Y                   Year with century, as in '2014'
%y                   Year without century, '00' to '99' (1969 to 2068)
%m                   Month as a decimal number, '01' to '12'
%B                   Full month name, as in 'November'
%b                   Abbreviated month name, as in 'Nov'
%d                   Day of the month, '01' to '31'
%j                   Day of the year, '001' to '366'
%w                   Day of the week, '0' (Sunday) to '6' (Saturday)
%A                   Full weekday name, as in 'Monday'
%a                   Abbreviated weekday name, as in 'Mon'
%H                   Hour (24-hour clock), '00' to '23'
%I                   Hour (12-hour clock), '01' to '12'
%M                   Minute, '00' to '59'
%S                   Second, '00' to '59'
%p                   'AM' or 'PM'
%%                   Literal '%' character

Pass strftime() a custom format string containing formatting directives (along with any desired slashes, colons, and so on), and strftime() will return the datetime object’s information as a formatted string. Enter the following into the interactive shell:

>>> oct21st = datetime.datetime(2015, 10, 21, 16, 29, 0)
>>> oct21st.strftime('%Y/%m/%d %H:%M:%S')
'2015/10/21 16:29:00'
>>> oct21st.strftime('%I:%M %p')
'04:29 PM'
>>> oct21st.strftime("%B of '%y")
"October of '15"

Here we have a datetime object for October 21, 2015 at 4:29 PM, stored in oct21st. Passing strftime() the custom format string '%Y/%m/%d %H:%M:%S' returns a string containing 2015, 10, and 21 separated by slashes and 16, 29, and 00 separated by colons. Passing '%I:%M %p' returns '04:29 PM', and passing "%B of '%y" returns "October of '15". Note that strftime() doesn’t begin with datetime.datetime.
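A few more directives from Table 15-1 can be tried on the same date. This is only an illustrative sketch; the weekday and month names assume Python’s default English locale settings, and the variable names are made up for the example.

```python
import datetime

oct21st = datetime.datetime(2015, 10, 21, 16, 29, 0)

longDate = oct21st.strftime('%A, %B %d, %Y')
dayOfYear = oct21st.strftime('%j')
dashed = oct21st.strftime('%Y-%m-%d %H:%M:%S')

print(longDate)    # Wednesday, October 21, 2015
print(dayOfYear)   # 294
print(dashed)      # 2015-10-21 16:29:00
```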

Converting Strings into datetime Objects

If you have a string of date information, such as '2015/10/21 16:29:00' or 'October 21, 2015', and need to convert it to a datetime object, use the datetime.datetime.strptime() function. The strptime() function is the inverse of the strftime() method. A custom format string using the same directives as strftime() must be passed so that strptime() knows how to parse and understand the string. (The p in the name of the strptime() function stands for parse.)

Enter the following into the interactive shell:

➊ >>> datetime.datetime.strptime('October 21, 2015', '%B %d, %Y')
  datetime.datetime(2015, 10, 21, 0, 0)
  >>> datetime.datetime.strptime('2015/10/21 16:29:00', '%Y/%m/%d %H:%M:%S')
  datetime.datetime(2015, 10, 21, 16, 29)
  >>> datetime.datetime.strptime("October of '15", "%B of '%y")
  datetime.datetime(2015, 10, 1, 0, 0)
  >>> datetime.datetime.strptime("November of '63", "%B of '%y")
  datetime.datetime(2063, 11, 1, 0, 0)

To get a datetime object from the string 'October 21, 2015', pass 'October 21, 2015' as the first argument to strptime() and the custom format string that corresponds to 'October 21, 2015' as the second argument ➊. The string with the date information must match the custom format string exactly, or Python will raise a ValueError exception.
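Because a mismatch raises ValueError, code that parses dates typed by a user usually wraps strptime() in a try statement. The following sketch uses a hypothetical helper, parseDate(), that returns None instead of crashing when the text does not fit the format.

```python
import datetime

def parseDate(text, dateFormat='%Y/%m/%d'):
    """Return a datetime object, or None if text does not match dateFormat."""
    try:
        return datetime.datetime.strptime(text, dateFormat)
    except ValueError:
        return None

good = parseDate('2015/10/21')
bad = parseDate('October 21, 2015')   # wrong layout for '%Y/%m/%d'

print(good)   # 2015-10-21 00:00:00
print(bad)    # None
```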

Review of Python’s Time Functions

Dates and times in Python can involve quite a few different data types and functions. Here’s a review of the three different types of values used to represent time:

- A Unix epoch timestamp (used by the time module) is a float or integer value of the number of seconds since 12 AM on January 1, 1970, UTC.
- A datetime object (of the datetime module) has integers stored in the attributes year, month, day, hour, minute, and second.
- A timedelta object (of the datetime module) represents a time duration, rather than a specific moment.

Here’s a review of time functions and their parameters and return values:

- The time.time() function returns an epoch timestamp float value of the current moment.
- The time.sleep(seconds) function stops the program for the amount of seconds specified by the seconds argument.
- The datetime.datetime(year, month, day, hour, minute, second) function returns a datetime object of the moment specified by the arguments. If hour, minute, or second arguments are not provided, they default to 0.
- The datetime.datetime.now() function returns a datetime object of the current moment.
- The datetime.datetime.fromtimestamp(epoch) function returns a datetime object of the moment represented by the epoch timestamp argument.
- The datetime.timedelta(weeks, days, hours, minutes, seconds, milliseconds, microseconds) function returns a timedelta object representing a duration of time. The function’s keyword arguments are all optional and do not include month or year.
- The total_seconds() method for timedelta objects returns the number of seconds the timedelta object represents.
- The strftime(format) method returns a string of the time represented by the datetime object in a custom format that’s based on the format string. See Table 15-1 for the format details.
- The datetime.datetime.strptime(time_string, format) function returns a datetime object of the moment specified by time_string, parsed using the format string argument. See Table 15-1 for the format details.

Multithreading

To introduce the concept of multithreading, let’s look at an example situation. Say you want to schedule some code to run after a delay or at a specific time. You could add code like the following at the start of your program:

import time, datetime

startTime = datetime.datetime(2029, 10, 31, 0, 0, 0)
while datetime.datetime.now() < startTime:
    time.sleep(1)

print('Program now starting on Halloween 2029')
--snip--

This code designates a start time of October 31, 2029, and keeps calling time.sleep(1) until the start time arrives. Your program cannot do anything while waiting for the loop of time.sleep() calls to finish; it just sits around until Halloween 2029. This is because Python programs by default have a single thread of execution.

To understand what a thread of execution is, remember the Chapter 2 discussion of flow control, when you imagined the execution of a program as placing your finger on a line of code in your program and moving to the next line or wherever it was sent by a flow control statement. A single-threaded program has only one finger. But a multithreaded program has multiple fingers. Each finger still moves to the next line of code as defined by the flow control statements, but the fingers can be at different places in the program, executing different lines of code at the same time. (All of the programs in this book so far have been single threaded.)

Rather than having all of your code wait until the time.sleep() function finishes, you can execute the delayed or scheduled code in a separate thread using Python’s threading module. The separate thread will pause for the time.sleep() calls. Meanwhile, your program can do other work in the original thread.

To make a separate thread, you first need to make a Thread object by calling the threading.Thread() function. Enter the following code in a new file and save it as threadDemo.py:

  import threading, time
  print('Start of program.')

➊ def takeANap():
      time.sleep(5)
      print('Wake up!')

➋ threadObj = threading.Thread(target=takeANap)
➌ threadObj.start()

  print('End of program.')

At ➊, we define a function that we want to use in a new thread. To create a Thread object, we call threading.Thread() and pass it the keyword argument target=takeANap ➋. This means the function we want to call in the new thread is takeANap(). Notice that the keyword argument is target=takeANap, not target=takeANap(). This is because you want to pass the takeANap() function itself as the argument, not call takeANap() and pass its return value.

After we store the Thread object created by threading.Thread() in threadObj, we call threadObj.start() ➌ to create the new thread and start executing the target function in the new thread. When this program is run, the output will look like this:

Start of program.
End of program.
Wake up!

This can be a bit confusing. If print('End of program.') is the last line of the program, you might think that it should be the last thing printed. The reason Wake up! comes after it is that when threadObj.start() is called, the target function for threadObj is run in a new thread of execution. Think of it as a second finger appearing at the start of the takeANap() function. The main thread continues to print('End of program.'). Meanwhile, the new thread, which has been executing the time.sleep(5) call, pauses for 5 seconds. After it wakes from its 5-second nap, it prints 'Wake up!' and then returns from the takeANap() function. Chronologically, 'Wake up!' is the last thing printed by the program.

Normally a program terminates when the last line of code in the file has run (or the sys.exit() function is called). But threadDemo.py has two threads. The first is the original thread that began at the start of the program and ends after print('End of program.'). The second thread is created when threadObj.start() is called, begins at the start of the takeANap() function, and ends after takeANap() returns.

A Python program will not terminate until all its threads have terminated. When you ran threadDemo.py, even though the original thread had terminated, the second thread was still executing the time.sleep(5) call.

Passing Arguments to the Thread’s Target Function

If the target function you want to run in the new thread takes arguments, you can pass the target function’s arguments to threading.Thread(). For example, say you wanted to run this print() call in its own thread:

>>> print('Cats', 'Dogs', 'Frogs', sep=' & ')
Cats & Dogs & Frogs

This print() call has three regular arguments, 'Cats', 'Dogs', and 'Frogs', and one keyword argument, sep=' & '. The regular arguments can be passed as a list to the args keyword argument in threading.Thread(). The keyword argument can be specified as a dictionary to the kwargs keyword argument in threading.Thread().

Enter the following into the interactive shell:

>>> import threading
>>> threadObj = threading.Thread(target=print, args=['Cats', 'Dogs', 'Frogs'],
kwargs={'sep': ' & '})
>>> threadObj.start()
Cats & Dogs & Frogs

To make sure the arguments 'Cats', 'Dogs', and 'Frogs' get passed to print() in the new thread, we pass args=['Cats', 'Dogs', 'Frogs'] to threading.Thread(). To make sure the keyword argument sep=' & ' gets passed to print() in the new thread, we pass kwargs={'sep': ' & '} to threading.Thread().

The threadObj.start() call will create a new thread to call the print() function, and it will pass 'Cats', 'Dogs', and 'Frogs' as arguments and ' & ' for the sep keyword argument.

This is an incorrect way to create the new thread that calls print():

threadObj = threading.Thread(target=print('Cats', 'Dogs', 'Frogs', sep=' & '))

What this ends up doing is calling the print() function and passing its return value (print()’s return value is always None) as the target keyword argument. It doesn’t pass the print() function itself. When passing arguments to a function in a new thread, use the threading.Thread() function’s args and kwargs keyword arguments.
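The args and kwargs keyword arguments work the same way for functions you write yourself. In this sketch, describePets() is a hypothetical target function; the join() call at the end is a Thread method that simply waits for the thread to finish before the program moves on.

```python
import threading

def describePets(*names, sep=' '):
    # Runs in the new thread; uses only its own parameters and local values.
    print(sep.join(names))

threadObj = threading.Thread(target=describePets,
                             args=['Cats', 'Dogs', 'Frogs'],
                             kwargs={'sep': ' & '})
threadObj.start()   # prints Cats & Dogs & Frogs from the new thread
threadObj.join()    # wait here until describePets() has returned
```

Note that target is given the function name describePets without parentheses, for the same reason as target=print above.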

Concurrency Issues

You can easily create several new threads and have them all running at the same time. But multiple threads can also cause problems called concurrency issues. These issues happen when threads read and write variables at the same time, causing the threads to trip over each other. Concurrency issues can be hard to reproduce consistently, making them hard to debug.

Multithreaded programming is its own wide subject and beyond the scope of this book. What you have to keep in mind is this: To avoid concurrency issues, never let multiple threads read or write the same variables. When you create a new Thread object, make sure its target function uses only local variables in that function. This will avoid hard-to-debug concurrency issues in your programs.

NOTE

A beginner’s tutorial on multithreaded programming is available at http://nostarch.com/automatestuff/.
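For the rare case where threads really must update one shared variable, the standard library offers threading.Lock. The sketch below goes beyond this chapter’s advice and is shown only to illustrate the idea; the names counter, counterLock, and addMany() are hypothetical.

```python
import threading

counter = 0
counterLock = threading.Lock()

def addMany(times):
    global counter
    for i in range(times):
        with counterLock:      # only one thread at a time may run this block
            counter += 1

threads = [threading.Thread(target=addMany, args=(10000,)) for n in range(4)]
for threadObj in threads:
    threadObj.start()
for threadObj in threads:
    threadObj.join()           # wait for every thread to finish

print(counter)   # 40000
```

Keeping each thread to its own local variables, as recommended above, avoids the need for locks entirely and remains the simpler choice.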

Project: Multithreaded XKCD Downloader

In Chapter 11, you wrote a program that downloaded all of the XKCD comic strips from the XKCD website. This was a single-threaded program: It downloaded one comic at a time. Much of the program’s running time was spent establishing the network connection to begin the download and writing the downloaded images to the hard drive. If you have a broadband Internet connection, your single-threaded program wasn’t fully utilizing the available bandwidth.

A multithreaded program that has some threads downloading comics while others are establishing connections and writing the comic image files to disk uses your Internet connection more efficiently and downloads the collection of comics more quickly. Open a new file editor window and save it as multidownloadXkcd.py. You will modify this program to add multithreading. The completely modified source code is available to download from http://nostarch.com/automatestuff/.

Step 1: Modify the Program to Use a Function

This program will mostly be the same downloading code from Chapter 11, so I’ll skip the explanation for the Requests and BeautifulSoup code. The main changes you need to make are importing the threading module and making a downloadXkcd() function, which takes starting and ending comic numbers as parameters.

For example, calling downloadXkcd(140, 280) would loop over the downloading code to download the comics at http://xkcd.com/140, http://xkcd.com/141, http://xkcd.com/142, and so on, up to http://xkcd.com/279. Each thread that you create will call downloadXkcd() and pass a different range of comics to download.

Add the following code to your multidownloadXkcd.py program:

  #! python3
  # multidownloadXkcd.py - Downloads XKCD comics using multiple threads.

  import requests, os, bs4, threading
➊ os.makedirs('xkcd', exist_ok=True)    # store comics in ./xkcd

➋ def downloadXkcd(startComic, endComic):
➌     for urlNumber in range(startComic, endComic):
          # Download the page.
          print('Downloading page http://xkcd.com/%s...' % (urlNumber))
➍         res = requests.get('http://xkcd.com/%s' % (urlNumber))
          res.raise_for_status()

➎         soup = bs4.BeautifulSoup(res.text, 'html.parser')

          # Find the URL of the comic image.
➏         comicElem = soup.select('#comic img')
          if comicElem == []:
              print('Could not find comic image.')
          else:
➐             comicUrl = comicElem[0].get('src')
              # Download the image.
              print('Downloading image %s...' % (comicUrl))
➑             res = requests.get(comicUrl)
              res.raise_for_status()

              # Save the image to ./xkcd.
              imageFile = open(os.path.join('xkcd', os.path.basename(comicUrl)), 'wb')
              for chunk in res.iter_content(100000):
                  imageFile.write(chunk)
              imageFile.close()

  # TODO: Create and start the Thread objects.
  # TODO: Wait for all threads to end.

After importing the modules we need, we make a directory to store comics in ➊ and start defining downloadXkcd() ➋. We loop through all the numbers in the specified range ➌ and download each page ➍. We use Beautiful Soup to look through the HTML of each page ➎ and find the comic image ➏. If no comic image is found on a page, we print a message. Otherwise, we get the URL of the image ➐ and download the image ➑. Finally, we save the image to the directory we created.

Step 2: Create and Start Threads

Now that we’ve defined downloadXkcd(), we’ll create the multiple threads that each call downloadXkcd() to download different ranges of comics from the XKCD website. Add the following code to multidownloadXkcd.py after the downloadXkcd() function definition:

#! python3
# multidownloadXkcd.py - Downloads XKCD comics using multiple threads.

--snip--

# Create and start the Thread objects.
downloadThreads = []             # a list of all the Thread objects
for i in range(1, 1401, 100):    # loops 14 times, creates 14 threads
    downloadThread = threading.Thread(target=downloadXkcd, args=(i, i + 100))
    downloadThreads.append(downloadThread)
    downloadThread.start()

First we make an empty list downloadThreads; the list will help us keep track of the many Thread objects we’ll create. Then we start our for loop. Each time through the loop, we create a Thread object with threading.Thread(), append the Thread object to the list, and call start() to start running downloadXkcd() in the new thread. Since the for loop moves the i variable from 1 to 1301 in steps of 100, i will be set to 1 on the first iteration, 101 on the second iteration, 201 on the third, and so on. Since we pass args=(i, i + 100) to threading.Thread(), the two arguments passed to downloadXkcd() will be 1 and 101 on the first iteration, 101 and 201 on the second iteration, 201 and 301 on the third, and so on. Because downloadXkcd() loops over range(startComic, endComic), which stops just before endComic, the first thread downloads comics 1 through 100, the second downloads comics 101 through 200, and so on. (Starting at 1 rather than 0 matters: there is no comic 0, so requesting http://xkcd.com/0 would make raise_for_status() raise an exception in that thread.)

As the Thread object’s start() method is called and the new thread begins to run the code inside downloadXkcd(), the main thread will continue to the next iteration of the for loop and create the next thread.

Step 3: Wait for All Threads to End

The main thread moves on as normal while the other threads we create download comics. But say there’s some code you don’t want to run in the main thread until all the threads have completed. Calling a Thread object’s join() method will block until that thread has finished. By using a for loop to iterate over all the Thread objects in the downloadThreads list, the main thread can call the join() method on each of the other threads. Add the following to the bottom of your program:

#! python3
# multidownloadXkcd.py - Downloads XKCD comics using multiple threads.

--snip--

# Wait for all threads to end.
for downloadThread in downloadThreads:
    downloadThread.join()
print('Done.')

The 'Done.' string will not be printed until all of the join() calls have returned. If a Thread object has already completed when its join() method is called, then the method will simply return immediately. If you wanted to extend this program with code that runs only after all of the comics downloaded, you could replace the print('Done.') line with your new code.
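When splitting work among threads, it is easy to skip or repeat an item at the boundary between two ranges. The sketch below uses a hypothetical helper, comicRanges(), to build the (startComic, endComic) pairs and then checks that every comic number is covered exactly once. It assumes, as in downloadXkcd(), that the ending number is not itself downloaded.

```python
def comicRanges(firstComic, lastComic, perThread):
    """Return (startComic, endComic) pairs covering firstComic..lastComic.

    Each endComic is exclusive, matching range(startComic, endComic).
    """
    ranges = []
    for startComic in range(firstComic, lastComic + 1, perThread):
        endComic = min(startComic + perThread, lastComic + 1)
        ranges.append((startComic, endComic))
    return ranges

pairs = comicRanges(1, 250, 100)
print(pairs)   # [(1, 101), (101, 201), (201, 251)]

# Every comic number from 1 to 250 should appear exactly once.
covered = [n for start, end in pairs for n in range(start, end)]
print(covered == list(range(1, 251)))   # True
```

Each pair could then be passed as the args tuple of one Thread object.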

Launching Other Programs from Python

Your Python program can start other programs on your computer with the Popen() function in the built-in subprocess module. (The P in the name of the Popen() function stands for process.) If you have multiple instances of an application open, each of those instances is a separate process of the same program. For example, if you open multiple windows of your web browser at the same time, each of those windows is a different process of the web browser program. See Figure 15-1 for an example of multiple calculator processes open at once.

Every process can have multiple threads. Unlike threads, a process cannot directly read and write another process’s variables. If you think of a multithreaded program as having multiple fingers following source code, then having multiple processes of the same program open is like having a friend with a separate copy of the program’s source code. You are both independently executing the same program.

If you want to start an external program from your Python script, pass the program’s filename to subprocess.Popen(). (On Windows, right-click the application’s Start menu item and select Properties to view the application’s filename. On OS X, CTRL-click the application and select Show Package Contents to find the path to the executable file.) The Popen() function will then immediately return. Keep in mind that the launched program is not run in the same thread as your Python program.

Figure 15-1. Six running processes of the same calculator program

On a Windows computer, enter the following into the interactive shell:

>>> import subprocess
>>> subprocess.Popen('C:\\Windows\\System32\\calc.exe')
<subprocess.Popen object at 0x0000000003055A58>

On Ubuntu Linux, you would enter the following:

>>> import subprocess
>>> subprocess.Popen('/usr/bin/gnome-calculator')
<subprocess.Popen object at 0x7f2bcf93b20>

On OS X, the process is slightly different. See Opening Files with Default Applications.

The return value is a Popen object, which has two useful methods: poll() and wait().

Youcanthinkofthepoll()methodasaskingyourfriendifshe’sfinishedrunningthecodeyougaveher.Thepoll()methodwillreturnNoneiftheprocessisstillrunningatthetimepoll()iscalled.Iftheprogramhasterminated,itwillreturntheprocess’sintegerexitcode.Anexitcodeisusedtoindicatewhethertheprocessterminatedwithouterrors(anexitcodeof0)orwhetheranerrorcausedtheprocesstoterminate(anonzeroexitcode—generally1,butitmayvarydependingontheprogram).

Thewait()methodislikewaitingforyourfriendtofinishworkingonhercodebeforeyoukeepworkingonyours.Thewait()methodwillblockuntilthelaunchedprocesshas

terminated.Thisishelpfulifyouwantyourprogramtopauseuntiltheuserfinisheswiththeotherprogram.Thereturnvalueofwait()istheprocess’sintegerexitcode.

On Windows, enter the following into the interactive shell. Note that the wait() call will block until you quit the launched calculator program.

➊ >>> calcProc = subprocess.Popen('C:\\Windows\\System32\\calc.exe')

➋ >>> calcProc.poll() == None

True

➌ >>> calcProc.wait()

0

>>> calcProc.poll()

0

Here we open a calculator process ➊. While it’s still running, we check if poll() returns None ➋. It should, as the process is still running. Then we close the calculator program and call wait() on the terminated process ➌. wait() and poll() now return 0, indicating that the process terminated without errors.
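The same poll() and wait() behavior can be tried on any operating system by launching a second Python interpreter instead of a calculator. The following is only a sketch: sys.executable (the path of the interpreter that is currently running) and the -c option are assumptions that do not appear in the text above, and the short child programs are invented for illustration.

```python
import subprocess
import sys

# Launch a child Python process that sleeps for a second and then exits normally.
# sys.executable and '-c' are used only so the example runs on any system.
childProc = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(1)'])

stillRunning = childProc.poll() == None   # None means the child has not finished yet
exitCode = childProc.wait()               # blocks until the child terminates
exitCodeAgain = childProc.poll()          # poll() now reports the same exit code

# A child that deliberately ends with a nonzero exit code.
failingProc = subprocess.Popen([sys.executable, '-c', 'import sys; sys.exit(3)'])
failingCode = failingProc.wait()

print(stillRunning, exitCode, exitCodeAgain, failingCode)
```

The nonzero value is whatever the child passes to sys.exit(), which is why the text says the error code may vary from program to program.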

Passing Command Line Arguments to Popen()

You can pass command line arguments to processes you create with Popen(). To do so, you pass a list as the sole argument to Popen(). The first string in this list will be the executable filename of the program you want to launch; all the subsequent strings will be the command line arguments to pass to the program when it starts. In effect, this list will be the value of sys.argv for the launched program.

Most applications with a graphical user interface (GUI) don’t use command line arguments as extensively as command line-based or terminal-based programs do. But most GUI applications will accept a single argument for a file that the applications will immediately open when they start. For example, if you’re using Windows, create a simple text file called C:\hello.txt and then enter the following into the interactive shell:

>>> subprocess.Popen(['C:\\Windows\\notepad.exe', 'C:\\hello.txt'])

<subprocess.Popen object at 0x00000000032DCEB8>

This will not only launch the Notepad application but also have it immediately open the C:\hello.txt file.
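To see how the list passed to Popen() turns into the launched program’s sys.argv, the sketch below starts a child Python process that prints the arguments it received. It assumes sys.executable, the -c option, stdout=subprocess.PIPE, and the communicate() method, none of which are covered in this section; the argument strings are made up.

```python
import subprocess
import sys

# The child program simply prints the command line arguments it was given.
childCode = 'import sys; print(sys.argv[1:])'

# First string: the executable. Everything after the child's code: its arguments.
proc = subprocess.Popen([sys.executable, '-c', childCode, 'hello.txt', '--verbose'],
                        stdout=subprocess.PIPE)
output, _ = proc.communicate()          # wait for the child and collect its output
receivedArgs = output.decode().strip()

print(receivedArgs)
```

Each string in the list arrives as a separate argument, so filenames containing spaces need no extra quoting.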

Task Scheduler, launchd, and cron

If you are computer savvy, you may know about Task Scheduler on Windows, launchd on OS X, or the cron scheduler on Linux. These well-documented and reliable tools all allow you to schedule applications to launch at specific times. If you’d like to learn more about them, you can find links to tutorials at http://nostarch.com/automatestuff/.

Using your operating system’s built-in scheduler saves you from writing your own clock-checking code to schedule your programs. However, use the time.sleep() function if you just need your program to pause briefly. Or instead of using the operating system’s scheduler, your code can loop until a certain date and time, calling time.sleep(1) each time through the loop.
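The loop-until-a-certain-time approach mentioned above might look like the following sketch. The waitUntil() function and its pollSeconds parameter are hypothetical names chosen for this example, and the two-second target is only there to keep the demonstration short.

```python
import datetime
import time

def waitUntil(targetTime, pollSeconds=1):
    """Block until the clock reaches targetTime, checking every pollSeconds."""
    while datetime.datetime.now() < targetTime:
        time.sleep(pollSeconds)

# Example: pause until two seconds from now.
start = datetime.datetime.now()
waitUntil(start + datetime.timedelta(seconds=2), pollSeconds=0.5)
elapsed = (datetime.datetime.now() - start).total_seconds()

print('Waited about', round(elapsed), 'seconds')
```

Sleeping between checks keeps the loop from using the CPU constantly while it waits.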

Opening Websites with Python

The webbrowser.open() function can launch a web browser from your program to a specific website, rather than opening the browser application with subprocess.Popen().

See Project: mapit.py with the webbrowser Module for more details.

Running Other Python Scripts

You can launch a Python script from Python just like any other application. You just have to pass the python.exe executable to Popen() and the filename of the .py script you want to run as its argument. For example, the following would run the hello.py script from Chapter 1:

>>> subprocess.Popen(['C:\\python34\\python.exe', 'hello.py'])

<subprocess.Popen object at 0x000000000331CF28>

Pass Popen() a list containing a string of the Python executable’s path and a string of the script’s filename. If the script you’re launching needs command line arguments, add them to the list after the script’s filename. For the Python versions used in this book, the Python executable on Windows is at C:\python34\python.exe. On OS X, it is at /Library/Frameworks/Python.framework/Versions/3.3/bin/python3. On Linux, it is at /usr/bin/python3. The exact path depends on the version you installed.

Unlike importing the Python program as a module, when your Python program launches another Python program, the two are run in separate processes and will not be able to share each other’s variables.
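The point about separate processes can be checked with a small experiment. This sketch assumes sys.executable as a stand-in for the hard-coded interpreter paths listed above, and it writes a throwaway script (the name child.py is hypothetical) to a temporary folder.

```python
import os
import subprocess
import sys
import tempfile

spam = 'parent value'   # this variable exists only in the parent process

# Write a tiny script that reports whether it can see a variable named spam.
scriptPath = os.path.join(tempfile.mkdtemp(), 'child.py')
with open(scriptPath, 'w') as scriptFile:
    scriptFile.write("print('spam' in globals())\n")

# Launch the script with the same interpreter that is running this code.
proc = subprocess.Popen([sys.executable, scriptPath], stdout=subprocess.PIPE)
childOutput = proc.communicate()[0].decode().strip()

print(childOutput)
```

The child starts with a fresh set of variables, so anything it needs from the parent has to be passed explicitly, for example as command line arguments.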

Opening Files with Default Applications

Double-clicking a .txt file on your computer will automatically launch the application associated with the .txt file extension. Your computer will have several of these file extension associations set up already. Python can also open files this way with Popen().

Each operating system has a program that performs the equivalent of double-clicking a document file to open it. On Windows, this is the start program. On OS X, this is the open program. On Ubuntu Linux, this is the see program. Enter the following into the interactive shell, passing 'start', 'open', or 'see' to Popen() depending on your system:

>>> fileObj = open('hello.txt', 'w')

>>> fileObj.write('Hello world!')

12

>>> fileObj.close()

>>> import subprocess

>>> subprocess.Popen(['start', 'hello.txt'], shell=True)

Here we write Hello world! to a new hello.txt file. Then we call Popen(), passing it a list containing the program name (in this example, 'start' for Windows) and the filename. We also pass the shell=True keyword argument, which is needed only on Windows. The operating system knows all of the file associations and can figure out that it should launch, say, Notepad.exe to handle the hello.txt file.

On OS X, the open program is used for opening both document files and programs. Enter the following into the interactive shell if you have a Mac:

>>> subprocess.Popen(['open', '/Applications/Calculator.app/'])

<subprocess.Popen object at 0x10202ff98>

The Calculator app should open.
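Because the program name and the shell=True argument differ by operating system, a script meant to run on more than one system can pick them at runtime. The following is a minimal sketch; openCommand() and openWithDefaultApp() are hypothetical helper names, and using sys.platform to detect the system is an assumption not made in the text above.

```python
import subprocess
import sys

def openCommand(filename, platform=sys.platform):
    """Return (argument list, shell flag) for opening filename on the given system."""
    if platform.startswith('win'):
        return ['start', filename], True     # start is built into the Windows shell
    elif platform == 'darwin':
        return ['open', filename], False     # OS X
    else:
        return ['see', filename], False      # Ubuntu Linux, as described above

def openWithDefaultApp(filename):
    """Open filename with whatever application the operating system associates with it."""
    args, useShell = openCommand(filename)
    return subprocess.Popen(args, shell=useShell)

print(openCommand('hello.txt', 'win32'))
print(openCommand('hello.txt', 'darwin'))
```

Calling openWithDefaultApp('hello.txt') would then behave like the Popen() calls shown above on each system.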

THE UNIX PHILOSOPHY

Programs well designed to be launched by other programs become more powerful than their code alone. The Unix philosophy is a set of software design principles established by the programmers of the Unix operating system (on which the modern Linux and OS X are built). It says that it’s better to write small, limited-purpose programs that can interoperate, rather than large, feature-rich applications. The smaller programs are easier to understand, and by being interoperable, they can be the building blocks of much more powerful applications.

Smartphone apps follow this approach as well. If your restaurant app needs to display directions to a café, the developers didn’t reinvent the wheel by writing their own map code. The restaurant app simply launches a map app while passing it the café’s address, just as your Python code would call a function and pass it arguments.

The Python programs you’ve been writing in this book mostly fit the Unix philosophy, especially in one important way: They use command line arguments rather than input() function calls. If all the information your program needs can be supplied up front, it is preferable to have this information passed as command line arguments rather than waiting for the user to type it in. This way, the command line arguments can be entered by a human user or supplied by another program. This interoperable approach will make your programs reusable as part of another program.

The sole exception is that you don’t want passwords passed as command line arguments, since the command line may record them as part of its command history feature. Instead, your program should call the input() function when it needs you to enter a password.

You can read more about Unix philosophy at https://en.wikipedia.org/wiki/Unix_philosophy/.

Project: Simple Countdown Program

Just like it’s hard to find a simple stopwatch application, it can be hard to find a simple countdown application. Let’s write a countdown program that plays an alarm at the end of the countdown.

At a high level, here’s what your program will do:

Count down from 60.

Play a sound file (alarm.wav) when the countdown reaches zero.

This means your code will need to do the following:

Pause for one second in between displaying each number in the countdown by calling time.sleep().

Call subprocess.Popen() to open the sound file with the default application.

Open a new file editor window and save it as countdown.py.

Step 1: Count Down

This program will require the time module for the time.sleep() function and the subprocess module for the subprocess.Popen() function. Enter the following code and save the file as countdown.py:

#! python3

# countdown.py - A simple countdown script.

import time, subprocess

➊ timeLeft = 60

while timeLeft > 0:

    ➋ print(timeLeft, end='')

    ➌ time.sleep(1)

    ➍ timeLeft = timeLeft - 1

# TODO: At the end of the countdown, play a sound file.

After importing time and subprocess, make a variable called timeLeft to hold the number of seconds left in the countdown ➊. It can start at 60, or you can change the value here to whatever you need or even have it get set from a command line argument.

In a while loop, you display the remaining count ➋, pause for one second ➌, and then decrement the timeLeft variable ➍ before the loop starts over again. The loop will keep looping as long as timeLeft is greater than 0. After that, the countdown will be over.
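Setting the starting value from a command line argument could be sketched as follows. The parseSeconds() and countdown() functions are hypothetical names, not part of countdown.py as written above; the sleep parameter exists only so the loop can be exercised without actually waiting, and a space is printed after each number to keep them apart.

```python
import time

def parseSeconds(argv, default=60):
    """Return the countdown length from the first command line argument, if any."""
    if len(argv) > 1:
        return int(argv[1])
    return default

def countdown(timeLeft, sleep=time.sleep):
    """Display each remaining second, pausing one second between numbers."""
    while timeLeft > 0:
        print(timeLeft, end=' ')
        sleep(1)
        timeLeft = timeLeft - 1
    print()

# Simulated argument lists, in the same shape as sys.argv.
print(parseSeconds(['countdown.py', '5']))
print(parseSeconds(['countdown.py']))
```

In the real script, you would import sys and call countdown(parseSeconds(sys.argv)), so that running countdown.py 10 counts down from 10 and running it with no argument counts down from 60.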

Step 2: Play the Sound File

While there are third-party modules to play sound files of various formats, the quick and easy way is to just launch whatever application the user already uses to play sound files. The operating system will figure out from the .wav file extension which application it should launch to play the file. This .wav file could easily be some other sound file format, such as .mp3 or .ogg.

You can use any sound file that is on your computer to play at the end of the countdown, or you can download alarm.wav from http://nostarch.com/automatestuff/.

Add the following to your code:

#! python3

# countdown.py - A simple countdown script.

import time, subprocess

--snip--

# At the end of the countdown, play a sound file.

subprocess.Popen(['start', 'alarm.wav'], shell=True)

After the while loop finishes, alarm.wav (or the sound file you choose) will play to notify the user that the countdown is over. On Windows, be sure to include 'start' in the list you pass to Popen() and pass the keyword argument shell=True. On OS X, pass 'open' instead of 'start' and remove shell=True.

Instead of playing a sound file, you could save a text file somewhere with a message like Break time is over! and use Popen() to open it at the end of the countdown. This will effectively create a pop-up window with a message. Or you could use the webbrowser.open() function to open a specific website at the end of the countdown. Unlike some free countdown application you’d find online, your own countdown program’s alarm can be anything you want!

Ideas for Similar Programs

A countdown is a simple delay before continuing the program’s execution. This can also be used for other applications and features, such as the following:

Use time.sleep() to give the user a chance to press CTRL-C to cancel an action, such as deleting files. Your program can print a “Press CTRL-C to cancel” message and then handle any KeyboardInterrupt exceptions with try and except statements.

For a long-term countdown, you can use timedelta objects to measure the number of days, hours, minutes, and seconds until some point (a birthday? an anniversary?) in the future.
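The long-term countdown idea could be sketched like this. The timeUntil() function is a hypothetical helper, and the dates are fixed example values so that the result is reproducible; a real program would let now default to the current time.

```python
import datetime

def timeUntil(target, now=None):
    """Split the time remaining until target into days, hours, minutes, and seconds."""
    if now is None:
        now = datetime.datetime.now()
    remaining = target - now                           # a timedelta object
    hours, leftover = divmod(remaining.seconds, 3600)  # whole hours in the partial day
    minutes, seconds = divmod(leftover, 60)
    return remaining.days, hours, minutes, seconds

# Fixed example dates (invented for illustration).
now = datetime.datetime(2015, 10, 21, 16, 29, 0)
birthday = datetime.datetime(2015, 10, 31, 0, 0, 0)

print(timeUntil(birthday, now))   # (9, 7, 31, 0)
```

A timedelta stores only days, seconds, and microseconds, which is why the hours and minutes are worked out with divmod().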

Summary

The Unix epoch (January 1, 1970, at midnight, UTC) is a standard reference time for many programming languages, including Python. While the time.time() function returns an epoch timestamp (that is, a float value of the number of seconds since the Unix epoch), the datetime module is better for performing date arithmetic and formatting or parsing strings with date information.

The time.sleep() function will block (that is, not return) for a certain number of seconds. It can be used to add pauses to your program. But if you want to schedule your programs to start at a certain time, the instructions at http://nostarch.com/automatestuff/ can tell you how to use the scheduler already provided by your operating system.

The threading module is used to create multiple threads, which is useful when you need to download multiple files or do other tasks simultaneously. But make sure the thread reads and writes only local variables, or you might run into concurrency issues.

Finally, your Python programs can launch other applications with the subprocess.Popen() function. Command line arguments can be passed to the Popen() call to open specific documents with the application. Alternatively, you can use the start, open, or see program with Popen() to use your computer’s file associations to automatically figure out which application to use to open a document. By using the other applications on your computer, your Python programs can leverage their capabilities for your automation needs.

Practice Questions

Q: 1. What is the Unix epoch?

Q: 2. What function returns the number of seconds since the Unix epoch?

Q: 3. How can you pause your program for exactly 5 seconds?

Q: 4. What does the round() function return?

Q: 5. What is the difference between a datetime object and a timedelta object?

Q: 6. Say you have a function named spam(). How can you call this function and run the code inside it in a separate thread?

Q: 7. What should you do to avoid concurrency issues with multiple threads?

Q: 8. How can you have your Python program run the calc.exe program located in the C:\Windows\System32 folder?

Practice Projects

For practice, write programs that do the following.

Prettified Stopwatch

Expand the stopwatch project from this chapter so that it uses the rjust() and ljust() string methods to “prettify” the output. (These methods were covered in Chapter 6.) Instead of output such as this:

Lap #1: 3.56 (3.56)

Lap #2: 8.63 (5.07)

Lap #3: 17.68 (9.05)

Lap #4: 19.11 (1.43)

… the output will look like this:

Lap # 1:   3.56 (  3.56)

Lap # 2:   8.63 (  5.07)

Lap # 3:  17.68 (  9.05)

Lap # 4:  19.11 (  1.43)

Note that you will need string versions of the lapNum, lapTime, and totalTime integer and float variables in order to call the string methods on them.

Next, use the pyperclip module introduced in Chapter 6 to copy the text output to the clipboard so the user can quickly paste the output to a text file or email.

Scheduled Web Comic Downloader

Write a program that checks the websites of several web comics and automatically downloads the images if the comic was updated since the program’s last visit. Your operating system’s scheduler (Task Scheduler on Windows, launchd on OS X, and cron on Linux) can run your Python program once a day. The Python program itself can download the comic and then copy it to your desktop so that it is easy to find. This will free you from having to check the website yourself to see whether it has updated. (A list of web comics is available at http://nostarch.com/automatestuff/.)

Chapter 16. Sending Email and Text Messages

Checking and replying to email is a huge time sink. Of course, you can’t just write a program to handle all your email for you, since each message requires its own response. But you can still automate plenty of email-related tasks once you know how to write programs that can send and receive email.

For example, maybe you have a spreadsheet full of customer records and want to send each customer a different form letter depending on their age and location details. Commercial software might not be able to do this for you; fortunately, you can write your own program to send these emails, saving yourself a lot of time copying and pasting form emails.

You can also write programs to send emails and SMS texts to notify you of things even while you’re away from your computer. If you’re automating a task that takes a couple of hours to do, you don’t want to go back to your computer every few minutes to check on the program’s status. Instead, the program can just text your phone when it’s done, freeing you to focus on more important things while you’re away from your computer.

SMTP

Much like HTTP is the protocol used by computers to send web pages across the Internet, Simple Mail Transfer Protocol (SMTP) is the protocol used for sending email. SMTP dictates how email messages should be formatted, encrypted, and relayed between mail servers, and all the other details that your computer handles after you click Send. You don’t need to know these technical details, though, because Python’s smtplib module simplifies them into a few functions.

SMTP just deals with sending emails to others. A different protocol, called IMAP, deals with retrieving emails sent to you and is described in IMAP.

Sending Email

You may be familiar with sending emails from Outlook or Thunderbird or through a website such as Gmail or Yahoo! Mail. Unfortunately, Python doesn’t offer you a nice graphical user interface like those services. Instead, you call functions to perform each major step of SMTP, as shown in the following interactive shell example.

NOTE

Don’t enter this example in IDLE; it won’t work because smtp.example.com, bob@example.com, MY_SECRET_PASSWORD, and alice@example.com are just placeholders. This code is just an overview of the process of sending email with Python.

>>> import smtplib

>>> smtpObj = smtplib.SMTP('smtp.example.com', 587)

>>> smtpObj.ehlo()

(250, b'mx.example.com at your service, [216.172.148.131]\nSIZE 35882577\n8BITMIME\nSTARTTLS\nENHANCEDSTATUSCODES\nCHUNKING')

>>> smtpObj.starttls()

(220, b'2.0.0 Ready to start TLS')

>>> smtpObj.login('bob@example.com', 'MY_SECRET_PASSWORD')

(235, b'2.7.0 Accepted')

>>> smtpObj.sendmail('bob@example.com', 'alice@example.com', 'Subject: So long.\nDear Alice, so long and thanks for all the fish. Sincerely, Bob')

{}

>>> smtpObj.quit()

(221, b'2.0.0 closing connection ko10sm23097611pbd.52 - gsmtp')

In the following sections, we’ll go through each step, replacing the placeholders with your information to connect and log in to an SMTP server, send an email, and disconnect from the server.

Connecting to an SMTP Server

If you’ve ever set up Thunderbird, Outlook, or another program to connect to your email account, you may be familiar with configuring the SMTP server and port. These settings will be different for each email provider, but a web search for <your provider> smtp settings should turn up the server and port to use.

The domain name for the SMTP server will usually be the name of your email provider’s domain name, with smtp. in front of it. For example, Gmail’s SMTP server is at smtp.gmail.com. Table 16-1 lists some common email providers and their SMTP servers. (The port is an integer value and will almost always be 587, which is the port used with the TLS encryption standard.)

Table 16-1. Email Providers and Their SMTP Servers

Provider                   SMTP server domain name
Gmail                      smtp.gmail.com
Outlook.com/Hotmail.com    smtp-mail.outlook.com
Yahoo Mail                 smtp.mail.yahoo.com
AT&T                       smtp.mail.att.net (port 465)
Comcast                    smtp.comcast.net
Verizon                    smtp.verizon.net (port 465)

Once you have the domain name and port information for your email provider, create an SMTP object by calling smtplib.SMTP(), passing the domain name as a string argument, and passing the port as an integer argument. The SMTP object represents a connection to an SMTP mail server and has methods for sending emails. For example, the following call creates an SMTP object for connecting to Gmail:

>>> smtpObj = smtplib.SMTP('smtp.gmail.com', 587)

>>> type(smtpObj)

<class 'smtplib.SMTP'>

Entering type(smtpObj) shows you that there’s an SMTP object stored in smtpObj. You’ll need this SMTP object in order to call the methods that log you in and send emails. If the smtplib.SMTP() call is not successful, your SMTP server might not support TLS on port 587. In this case, you will need to create an SMTP object using smtplib.SMTP_SSL() and port 465 instead.

>>> smtpObj = smtplib.SMTP_SSL('smtp.gmail.com', 465)

NOTE

If you are not connected to the Internet, Python will raise a socket.gaierror: [Errno 11004] getaddrinfo failed or similar exception.

For your programs, the differences between TLS and SSL aren’t important. You only need to know which encryption standard your SMTP server uses so you know how to connect to it. In all of the interactive shell examples that follow, the smtpObj variable will contain an SMTP object returned by the smtplib.SMTP() or smtplib.SMTP_SSL() function.

Sending the SMTP “Hello” Message

Once you have the SMTP object, call its oddly named ehlo() method to “say hello” to the SMTP email server. This greeting is the first step in SMTP and is important for establishing a connection to the server. You don’t need to know the specifics of these protocols. Just be sure to call the ehlo() method first thing after getting the SMTP object or else the later method calls will result in errors. The following is an example of an ehlo() call and its return value:

>>> smtpObj.ehlo()

(250, b'mx.google.com at your service, [216.172.148.131]\nSIZE 35882577\n8BITMIME\nSTARTTLS\nENHANCEDSTATUSCODES\nCHUNKING')

If the first item in the returned tuple is the integer 250 (the code for “success” in SMTP), then the greeting succeeded.

Starting TLS Encryption

If you are connecting to port 587 on the SMTP server (that is, you’re using TLS encryption), you’ll need to call the starttls() method next. This required step enables encryption for your connection. If you are connecting to port 465 (using SSL), then encryption is already set up, and you should skip this step.

Here’s an example of the starttls() method call:

>>> smtpObj.starttls()

(220, b'2.0.0 Ready to start TLS')

starttls() puts your SMTP connection in TLS mode. The 220 in the return value tells you that the server is ready.

Logging in to the SMTP Server

Once your encrypted connection to the SMTP server is set up, you can log in with your username (usually your email address) and email password by calling the login() method.

>>> smtpObj.login('[email protected]', 'MY_SECRET_PASSWORD')

(235, b'2.7.0 Accepted')

GMAIL’S APPLICATION-SPECIFIC PASSWORDS

Gmail has an additional security feature for Google accounts called application-specific passwords. If you receive an Application-specific password required error message when your program tries to log in, you will have to set up one of these passwords for your Python script. Check out the resources at http://nostarch.com/automatestuff/ for detailed directions on how to set up an application-specific password for your Google account.

Pass a string of your email address as the first argument and a string of your password as the second argument. The 235 in the return value means authentication was successful. Python will raise an smtplib.SMTPAuthenticationError exception for incorrect passwords.

WARNING

Be careful about putting passwords in your source code. If anyone ever copies your program, they’ll have access to your email account! It’s a good idea to call input() and have the user type in the password. It may be inconvenient to have to enter a password each time you run your program, but this approach will prevent you from leaving your password in an unencrypted file on your computer where a hacker or laptop thief could easily get it.

Sending an Email

Once you are logged in to your email provider’s SMTP server, you can call the sendmail() method to actually send the email. The sendmail() method call looks like this:

>>> smtpObj.sendmail('[email protected]', '[email protected]', 'Subject: So long.\nDear Alice, so long and thanks for all the fish. Sincerely, Bob')

{}

The sendmail() method requires three arguments.

Your email address as a string (for the email’s “from” address)

The recipient’s email address as a string or a list of strings for multiple recipients (for the “to” address)

The email body as a string

The email body string must begin with 'Subject: ', followed by the subject text and a '\n' newline character. The '\n' separates the subject line from the main body of the email.

The return value from sendmail() is a dictionary. There will be one key-value pair in the dictionary for each recipient for whom email delivery failed. An empty dictionary means all recipients were successfully sent the email.

Disconnecting from the SMTP Server

Be sure to call the quit() method when you are done sending emails. This will disconnect your program from the SMTP server.

>>> smtpObj.quit()

(221, b'2.0.0 closing connection ko10sm23097611pbd.52 - gsmtp')

The 221 in the return value means the session is ending.

To review all the steps for connecting and logging in to the server, sending email, and disconnection, see Sending Email.
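Put together, the steps from the preceding sections might be wrapped in a single function along these lines. This is only a sketch under stated assumptions: buildMessage() and sendEmail() are hypothetical names, the function decides between SMTP() and SMTP_SSL() purely from the port number as described earlier, and it is not run here because it needs a real server and account.

```python
import smtplib

def buildMessage(subject, body):
    """Join subject and body in the form sendmail() expects: 'Subject: ...' then a newline."""
    return 'Subject: ' + subject + '\n' + body

def sendEmail(host, port, username, password, toAddress, subject, body):
    """Connect, log in, send one message, and disconnect.

    Returns the dictionary from sendmail(); an empty one means no delivery failed.
    """
    if port == 465:
        smtpObj = smtplib.SMTP_SSL(host, port)   # encryption is already set up
    else:
        smtpObj = smtplib.SMTP(host, port)
    try:
        smtpObj.ehlo()                           # say hello first
        if port != 465:
            smtpObj.starttls()                   # switch the connection to TLS mode
        smtpObj.login(username, password)
        return smtpObj.sendmail(username, toAddress, buildMessage(subject, body))
    finally:
        smtpObj.quit()                           # always disconnect

print(repr(buildMessage('So long.', 'Dear Alice, so long and thanks for all the fish.')))
```

Following the earlier warning, the password argument would normally come from input() at run time rather than being written into the source code.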

IMAP

Just as SMTP is the protocol for sending email, the Internet Message Access Protocol (IMAP) specifies how to communicate with an email provider’s server to retrieve emails sent to your email address. Python comes with an imaplib module, but in fact the third-party imapclient module is easier to use. This chapter provides an introduction to using IMAPClient; the full documentation is at http://imapclient.readthedocs.org/.

The imapclient module downloads emails from an IMAP server in a rather complicated format. Most likely, you’ll want to convert them from this format into simple string values. The pyzmail module does the hard job of parsing these email messages for you. You can find the complete documentation for PyzMail at http://www.magiksys.net/pyzmail/.

Install imapclient and pyzmail from a Terminal window. Appendix A has steps on how to install third-party modules.

Retrieving and Deleting Emails with IMAP

Finding and retrieving an email in Python is a multistep process that requires both the imapclient and pyzmail third-party modules. Just to give you an overview, here’s a full example of logging in to an IMAP server, searching for emails, fetching them, and then extracting the text of the email messages from them.

>>> import imapclient

>>> imapObj = imapclient.IMAPClient('imap.gmail.com', ssl=True)

>>> imapObj.login('[email protected]', 'MY_SECRET_PASSWORD')

'[email protected](Success)'

>>> imapObj.select_folder('INBOX', readonly=True)

>>> UIDs = imapObj.search(['SINCE 05-Jul-2014'])

>>> UIDs

[40032, 40033, 40034, 40035, 40036, 40037, 40038, 40039, 40040, 40041]

>>> rawMessages = imapObj.fetch([40041], ['BODY[]', 'FLAGS'])

>>> import pyzmail

>>> message = pyzmail.PyzMessage.factory(rawMessages[40041]['BODY[]'])

>>> message.get_subject()

'Hello!'

>>> message.get_addresses('from')

[('Edward Snowden', '[email protected]')]

>>> message.get_addresses('to')

[('Jane Doe', '[email protected]')]

>>> message.get_addresses('cc')

[]

>>> message.get_addresses('bcc')

[]

>>> message.text_part != None

True

>>> message.text_part.get_payload().decode(message.text_part.charset)

'Follow the money.\r\n\r\n-Ed\r\n'

>>> message.html_part != None

True

>>> message.html_part.get_payload().decode(message.html_part.charset)

'<div dir="ltr"><div>So long, and thanks for all the fish!<br><br></div>-Al<br></div>\r\n'

>>> imapObj.logout()

You don’t have to memorize these steps. After we go through each step in detail, you can come back to this overview to refresh your memory.

Connecting to an IMAP Server

Just like you needed an SMTP object to connect to an SMTP server and send email, you need an IMAPClient object to connect to an IMAP server and receive email. First you’ll need the domain name of your email provider’s IMAP server. This will be different from the SMTP server’s domain name. Table 16-2 lists the IMAP servers for several popular email providers.

Table 16-2. Email Providers and Their IMAP Servers

Provider                   IMAP server domain name
Gmail                      imap.gmail.com
Outlook.com/Hotmail.com    imap-mail.outlook.com
Yahoo Mail                 imap.mail.yahoo.com
AT&T                       imap.mail.att.net
Comcast                    imap.comcast.net
Verizon                    incoming.verizon.net

Once you have the domain name of the IMAP server, call the imapclient.IMAPClient() function to create an IMAPClient object. Most email providers require SSL encryption, so pass the ssl=True keyword argument. Enter the following into the interactive shell (using your provider’s domain name):

>>> import imapclient

>>> imapObj = imapclient.IMAPClient('imap.gmail.com', ssl=True)

In all of the interactive shell examples in the following sections, the imapObj variable will contain an IMAPClient object returned from the imapclient.IMAPClient() function. In this context, a client is the object that connects to the server.

Logging in to the IMAP Server

Once you have an IMAPClient object, call its login() method, passing in the username (this is usually your email address) and password as strings.

>>> imapObj.login('[email protected]', 'MY_SECRET_PASSWORD')

'[email protected](Success)'

WARNING

Remember, never write a password directly into your code! Instead, design your program to accept the password returned from input().

If the IMAP server rejects this username/password combination, Python will raise an imaplib.error exception. For Gmail accounts, you may need to use an application-specific password; for more details, see Gmail’s Application-Specific Passwords.

Searching for Email

Once you’re logged on, actually retrieving an email that you’re interested in is a two-step process. First, you must select a folder you want to search through. Then, you must call the IMAPClient object’s search() method, passing in a string of IMAP search keywords.

Selecting a Folder

Almost every account has an INBOX folder by default, but you can also get a list of folders by calling the IMAPClient object’s list_folders() method. This returns a list of tuples. Each tuple contains information about a single folder. Continue the interactive shell example by entering the following:

>>> import pprint

>>> pprint.pprint(imapObj.list_folders())

[(('\\HasNoChildren',), '/', 'Drafts'),

 (('\\HasNoChildren',), '/', 'Filler'),

 (('\\HasNoChildren',), '/', 'INBOX'),

 (('\\HasNoChildren',), '/', 'Sent'),

--snip--

 (('\\HasNoChildren', '\\Flagged'), '/', '[Gmail]/Starred'),

 (('\\HasNoChildren', '\\Trash'), '/', '[Gmail]/Trash')]

This is what your output might look like if you have a Gmail account. (Gmail calls its folders labels, but they work the same way as folders.) The three values in each of the tuples (for example, (('\\HasNoChildren',), '/', 'INBOX')) are as follows:

A tuple of the folder’s flags. (Exactly what these flags represent is beyond the scope of this book, and you can safely ignore this field.)

The delimiter used in the name string to separate parent folders and subfolders.

The full name of the folder.

To select a folder to search through, pass the folder’s name as a string into the IMAPClient object’s select_folder() method.

>>> imapObj.select_folder('INBOX', readonly=True)

You can ignore select_folder()’s return value. If the selected folder does not exist, Python will raise an imaplib.error exception.

The readonly=True keyword argument prevents you from accidentally making changes or deletions to any of the emails in this folder during the subsequent method calls. Unless you want to delete emails, it’s a good idea to always set readonly to True.

Performing the Search

With a folder selected, you can now search for emails with the IMAPClient object’s search() method. The argument to search() is a list of strings, each formatted to the IMAP’s search keys. Table 16-3 describes the various search keys.

Table 16-3. IMAP Search Keys

Search key    Meaning

'ALL'

Returns all messages in the folder. You may run into imaplib size limits if you request all the messages in a large folder. See Size Limits.

'BEFORE date', 'ON date', 'SINCE date'

These three search keys return, respectively, messages that were received by the IMAP server before, on, or after the given date. The date must be formatted like 05-Jul-2015. Also, while 'SINCE 05-Jul-2015' will match messages on and after July 5, 'BEFORE 05-Jul-2015' will match only messages before July 5 but not on July 5 itself.

'SUBJECT string', 'BODY string', 'TEXT string'

Returns messages where string is found in the subject, body, or either, respectively. If string has spaces in it, then enclose it with double quotes: 'TEXT "search with spaces"'.

'FROM string', 'TO string', 'CC string', 'BCC string'

Returns all messages where string is found in the “from” email address, “to” addresses, “cc” (carbon copy) addresses, or “bcc” (blind carbon copy) addresses, respectively. If there are multiple email addresses in string, then separate them with spaces and enclose them all with double quotes: 'CC "[email protected]@example.com"'.

'SEEN', 'UNSEEN'

Returns all messages with and without the \Seen flag, respectively. An email obtains the \Seen flag if it has been accessed with a fetch() method call (described later) or if it is clicked when you’re checking your email in an email program or web browser. It’s more common to say the email has been “read” rather than “seen,” but they mean the same thing.

'ANSWERED', 'UNANSWERED'

Returns all messages with and without the \Answered flag, respectively. A message obtains the \Answered flag when it is replied to.

'DELETED', 'UNDELETED'

Returns all messages with and without the \Deleted flag, respectively. Email messages deleted with the delete_messages() method are given the \Deleted flag but are not permanently deleted until the expunge() method is called (see Deleting Emails). Note that some email providers, such as Gmail, automatically expunge emails.

'DRAFT', 'UNDRAFT'

Returns all messages with and without the \Draft flag, respectively. Draft messages are usually kept in a separate Drafts folder rather than in the INBOX folder.

'FLAGGED', 'UNFLAGGED'

Returns all messages with and without the \Flagged flag, respectively. This flag is usually used to mark email messages as “Important” or “Urgent.”

'LARGER N', 'SMALLER N'

Returns all messages larger or smaller than N bytes, respectively.

'NOT search-key'

Returns the messages that search-key would not have returned.

'OR search-key1 search-key2'

Returns the messages that match either the first or second search-key.

Note that some IMAP servers may have slightly different implementations for how they handle their flags and search keys. It may require some experimentation in the interactive shell to see exactly how they behave.

You can pass multiple IMAP search key strings in the list argument to the search() method. The messages returned are the ones that match all the search keys. If you want to match any of the search keys, use the OR search key. For the NOT and OR search keys, one and two complete search keys follow the NOT and OR, respectively.

Here are some example search() method calls along with their meanings:

imapObj.search(['ALL']). Returns every message in the currently selected folder.

imapObj.search(['ON 05-Jul-2015']). Returns every message sent on July 5, 2015.

imapObj.search(['SINCE 01-Jan-2015', 'BEFORE 01-Feb-2015', 'UNSEEN']). Returns every message sent in January 2015 that is unread. (Note that this means on and after January 1 and up to but not including February 1.)

imapObj.search(['SINCE 01-Jan-2015', 'FROM alice@example.com']). Returns every message from alice@example.com sent since the start of 2015.

imapObj.search(['SINCE 01-Jan-2015', 'NOT FROM alice@example.com']). Returns every message sent from everyone except alice@example.com since the start of 2015.

imapObj.search(['OR FROM alice@example.com FROM bob@example.com']). Returns every message sent from alice@example.com or bob@example.com.

imapObj.search(['FROM alice@example.com', 'FROM bob@example.com']). Trick example! This search will never return any messages, because messages must match all search keywords. Since there can be only one “from” address, it is impossible for a message to be from both alice@example.com and bob@example.com.

The search() method doesn’t return the emails themselves but rather unique IDs (UIDs) for the emails, as integer values. You can then pass these UIDs to the fetch() method to obtain the email content.
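Since the date search keys need a fixed format such as 05-Jul-2015, it can help to build them from date objects rather than typing them by hand. The sketch below is one way to do that; imapDate() and monthSearchKeys() are hypothetical helper names, and the month abbreviations are spelled out in a list so that the result does not depend on the computer’s language settings.

```python
import datetime

MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

def imapDate(date):
    """Format a date the way the search keys expect, for example 05-Jul-2015."""
    return '%02d-%s-%d' % (date.day, MONTHS[date.month - 1], date.year)

def monthSearchKeys(year, month, unreadOnly=False):
    """Return search keys that match every message received in the given month."""
    first = datetime.date(year, month, 1)
    # First day of the following month; BEFORE does not include that day itself.
    nextFirst = datetime.date(year + month // 12, month % 12 + 1, 1)
    keys = ['SINCE ' + imapDate(first), 'BEFORE ' + imapDate(nextFirst)]
    if unreadOnly:
        keys.append('UNSEEN')
    return keys

print(monthSearchKeys(2015, 1, unreadOnly=True))
```

The returned list has the same shape as the January 2015 example above, so it could be passed directly to imapObj.search().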

Continue the interactive shell example by entering the following:

>>> UIDs = imapObj.search(['SINCE 05-Jul-2015'])

>>> UIDs

[40032, 40033, 40034, 40035, 40036, 40037, 40038, 40039, 40040, 40041]

Here, the list of message IDs (for messages received July 5 onward) returned by search() is stored in UIDs. The list of UIDs returned on your computer will be different from the ones shown here; they are unique to a particular email account. When you later pass UIDs to other function calls, use the UID values you received, not the ones printed in this book’s examples.

Size Limits

If your search matches a large number of email messages, Python might raise an exception that says imaplib.error: got more than 10000 bytes. When this happens, you will have to disconnect and reconnect to the IMAP server and try again.

This limit is in place to prevent your Python programs from eating up too much memory. Unfortunately, the default size limit is often too small. You can change this limit from 10,000 bytes to 10,000,000 bytes by running this code:

>>> import imaplib

>>> imaplib._MAXLINE = 10000000

This should prevent this error message from coming up again. You may want to make these two lines part of every IMAP program you write.

USING IMAPCLIENT’S GMAIL_SEARCH() METHOD

If you are logging in to the imap.gmail.com server to access a Gmail account, the IMAPClient object provides an extra search function that mimics the search bar at the top of the Gmail web page, as highlighted in Figure 16-1.

Figure 16-1. The search bar at the top of the Gmail web page

Instead of searching with IMAP search keys, you can use Gmail’s more sophisticated search engine. Gmail does a good job of matching closely related words (for example, a search for driving will also match drive and drove) and sorting the search results by most significant matches. You can also use Gmail’s advanced search operators (see http://nostarch.com/automatestuff/ for more information). If you are logging in to a Gmail account, pass the search terms to the gmail_search() method instead of the search() method, like in the following interactive shell example:

>>> UIDs = imapObj.gmail_search('meaning of life')

>>> UIDs

[42]

Ah, yes, there’s that email with the meaning of life! I was looking for that.

Fetching an Email and Marking It As Read

Once you have a list of UIDs, you can call the IMAPClient object’s fetch() method to get the actual email content.

The list of UIDs will be fetch()’s first argument. The second argument should be the list ['BODY[]'], which tells fetch() to download all the body content for the emails specified in your UID list.

Let’s continue our interactive shell example.

>>> rawMessages = imapObj.fetch(UIDs, ['BODY[]'])
>>> import pprint
>>> pprint.pprint(rawMessages)
{40040: {'BODY[]': 'Delivered-To: [email protected]\r\n'
                   'Received: by 10.76.71.167 with SMTP id'
--snip--
                   '\r\n'
                   '------=_Part_6000970_707736290.1404819487066--\r\n',
         'SEQ': 5430}}

Import pprint and pass the return value from fetch(), stored in the variable rawMessages, to pprint.pprint() to “pretty print” it, and you’ll see that this return value is a nested dictionary of messages with UIDs as the keys. Each message is stored as a dictionary with two keys: 'BODY[]' and 'SEQ'. The 'BODY[]' key maps to the actual body of the email. The 'SEQ' key is for a sequence number, which has a similar role to the UID. You can safely ignore it.

As you can see, the message content in the 'BODY[]' key is pretty unintelligible. It’s in a format called RFC 822, which is designed for IMAP servers to read. But you don’t need to understand the RFC 822 format; later in this chapter, the pyzmail module will make sense of it for you.
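To get a feel for what the RFC 822 format holds, here is a minimal sketch that parses a tiny raw message with Python’s standard email module. This is a standard-library stand-in for illustration only, not the pyzmail approach this chapter uses, and the raw message and its addresses are made up for the example.

```python
import email
from email import policy

# A made-up raw message: header lines, one blank line, then the body text.
raw = (b'From: Alice Example <alice@example.com>\r\n'
       b'To: Bob Example <bob@example.com>\r\n'
       b'Subject: Hello!\r\n'
       b'\r\n'
       b'So long, and thanks for all the fish!\r\n')

# policy.default selects the modern API, which returns headers as strings.
parsed = email.message_from_bytes(raw, policy=policy.default)

print(parsed['Subject'])
print(parsed['From'])
print(parsed.get_content())
```

The blank line is what separates the headers from the body; everything before it is a Name: value pair.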

When you selected a folder to search through, you called select_folder() with the readonly=True keyword argument. Doing this will prevent you from accidentally deleting an email, but it also means that emails will not get marked as read if you fetch them with the fetch() method. If you do want emails to be marked as read when you fetch them, you will need to pass readonly=False to select_folder(). If the selected folder is already in readonly mode, you can reselect the current folder with another call to select_folder(), this time with the readonly=False keyword argument:

>>> imapObj.select_folder('INBOX', readonly=False)

Getting Email Addresses from a Raw Message

The raw messages returned from the fetch() method still aren’t very useful to people who just want to read their email. The pyzmail module parses these raw messages and returns them as PyzMessage objects, which make the subject, body, “To” field, “From” field, and other sections of the email easily accessible to your Python code.

Continue the interactive shell example with the following (using UIDs from your own email account, not the ones shown here):

>>> import pyzmail
>>> message = pyzmail.PyzMessage.factory(rawMessages[40041]['BODY[]'])

First, import pyzmail. Then, to create a PyzMessage object of an email, call the pyzmail.PyzMessage.factory() function and pass it the 'BODY[]' section of the raw message. Store the result in message. Now message contains a PyzMessage object, which has several methods that make it easy to get the email’s subject line, as well as all sender and recipient addresses. The get_subject() method returns the subject as a simple string value. The get_addresses() method returns a list of addresses for the field you pass it. For example, the method calls might look like this:

>>> message.get_subject()
'Hello!'
>>> message.get_addresses('from')
[('Edward Snowden', '[email protected]')]
>>> message.get_addresses('to')
[('Jane Doe', '[email protected]')]
>>> message.get_addresses('cc')
[]
>>> message.get_addresses('bcc')
[]

Notice that the argument for get_addresses() is 'from', 'to', 'cc', or 'bcc'. The return value of get_addresses() is a list of tuples. Each tuple contains two strings: The first is the name associated with the email address, and the second is the email address itself. If there are no addresses in the requested field, get_addresses() returns a blank list. Here, the 'cc' carbon copy and 'bcc' blind carbon copy fields both contained no addresses and so returned empty lists.
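The same list-of-tuples shape shows up in Python’s standard library. As a rough illustration (using email.utils.getaddresses() rather than pyzmail, with made-up addresses), the sketch below turns a header value into (name, address) pairs and loops over them. An address with no display name gets an empty string as its name.

```python
from email.utils import getaddresses

# A made-up "To" header value containing two recipients.
header_values = ['Jane Doe <jane@example.com>, bob@example.com']

# getaddresses() takes a list of header strings and returns (name, address) tuples.
pairs = getaddresses(header_values)

for name, address in pairs:
    # Fall back to the address itself when no display name was given.
    print(name or address, '->', address)
```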

Getting the Body from a Raw Message

Emails can be sent as plaintext, HTML, or both. Plaintext emails contain only text, while HTML emails can have colors, fonts, images, and other features that make the email message look like a small web page. If an email is only plaintext, its PyzMessage object will have its html_part attribute set to None. Likewise, if an email is only HTML, its PyzMessage object will have its text_part attribute set to None.

Otherwise, the text_part or html_part value will have a get_payload() method that returns the email’s body as a value of the bytes data type. (The bytes data type is beyond the scope of this book.) But this still isn’t a string value that we can use. Ugh! The last step is to call the decode() method on the bytes value returned by get_payload(). The decode() method takes one argument: the message’s character encoding, stored in the text_part.charset or html_part.charset attribute. This, finally, will return the string of the email’s body.
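The decode() step can be tried on its own, without an email account. The sketch below defines a hypothetical helper, decode_payload() (a name invented here), that decodes a bytes value with a given character encoding and falls back to UTF-8 when the encoding is missing or unrecognized. The fallback is an assumption made for this illustration, not something pyzmail does for you.

```python
def decode_payload(payload, charset=None, fallback='utf-8'):
    """Decode a bytes value to a string using charset, or fallback if that fails."""
    try:
        return payload.decode(charset or fallback)
    except (LookupError, UnicodeDecodeError):
        # Unknown encoding name or undecodable bytes: replace bad bytes, don't crash.
        return payload.decode(fallback, errors='replace')

# The same text stored as bytes in two different encodings.
print(decode_payload('café'.encode('latin-1'), 'latin-1'))
print(decode_payload('café'.encode('utf-8'), 'utf-8'))
```

With a PyzMessage object, the call might look like decode_payload(message.text_part.get_payload(), message.text_part.charset).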

Continue the interactive shell example by entering the following:

➊ >>> message.text_part != None
  True
  >>> message.text_part.get_payload().decode(message.text_part.charset)
➋ 'So long, and thanks for all the fish!\r\n\r\n-Al\r\n'
➌ >>> message.html_part != None
  True
➍ >>> message.html_part.get_payload().decode(message.html_part.charset)
  '<div dir="ltr"><div>So long, and thanks for all the fish!<br><br></div>-Al<br></div>\r\n'

The email we’re working with has both plaintext and HTML content, so the PyzMessage object stored in message has text_part and html_part attributes not equal to None ➊ ➌. Calling get_payload() on the message’s text_part and then calling decode() on the bytes value returns a string of the text version of the email ➋. Using get_payload() and decode() with the message’s html_part returns a string of the HTML version of the email ➍.

Deleting Emails

To delete emails, pass a list of message UIDs to the IMAPClient object’s delete_messages() method. This marks the emails with the \Deleted flag. Calling the expunge() method will permanently delete all emails with the \Deleted flag in the currently selected folder. Consider the following interactive shell example:

➊ >>> imapObj.select_folder('INBOX', readonly=False)
➋ >>> UIDs = imapObj.search(['ON 09-Jul-2015'])
  >>> UIDs
  [40066]
  >>> imapObj.delete_messages(UIDs)
➌ {40066: ('\\Seen', '\\Deleted')}
  >>> imapObj.expunge()
  ('Success', [(5452, 'EXISTS')])

Here we select the inbox by calling select_folder() on the IMAPClient object and passing 'INBOX' as the first argument; we also pass the keyword argument readonly=False so that we can delete emails ➊. We search the inbox for messages received on a specific date and store the returned message IDs in UIDs ➋. Calling delete_messages() and passing it UIDs returns a dictionary; each key-value pair is a message ID and a tuple of the message’s flags, which should now include \Deleted ➌. Calling expunge() then permanently deletes messages with the \Deleted flag and returns a success message if there were no problems expunging the emails. Note that some email providers, such as Gmail, automatically expunge emails deleted with delete_messages() instead of waiting for an expunge command from the IMAP client.

Disconnecting from the IMAP Server

When your program has finished retrieving or deleting emails, simply call the IMAPClient’s logout() method to disconnect from the IMAP server.

>>> imapObj.logout()

If your program runs for several minutes or more, the IMAP server may time out, or automatically disconnect. In this case, the next method call your program makes on the IMAPClient object will raise an exception like the following:

imaplib.abort: socket error: [WinError 10054] An existing connection was forcibly closed by the remote host

In this event, your program will have to call imapclient.IMAPClient() to connect again.
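One way to handle such a timeout is to catch the exception, connect again, and retry. The following is only a sketch under stated assumptions: run_with_reconnect() is a hypothetical helper, connect stands for a function you write yourself (for example, one that calls imapclient.IMAPClient(), logs in, and selects a folder), and action is a function that takes the connection and does the work. The exception shown in the error message above is available in code as imaplib.IMAP4.abort.

```python
import imaplib

def run_with_reconnect(connect, action, attempts=3):
    """Call action(connection); if the server dropped us, reconnect and retry.

    connect: function with no arguments that returns a fresh connection.
    action:  function that takes the connection and returns a result.
    """
    connection = connect()
    for attempt in range(attempts):
        try:
            return action(connection)
        except imaplib.IMAP4.abort:
            if attempt == attempts - 1:
                raise  # Out of retries, so let the caller see the error.
            connection = connect()

# Demonstration with stand-ins instead of a real server: the first call fails.
state = {'connects': 0, 'calls': 0}

def fake_connect():
    state['connects'] += 1
    return object()

def flaky_search(connection):
    state['calls'] += 1
    if state['calls'] == 1:
        raise imaplib.IMAP4.abort('socket error')
    return [40032, 40033]

result = run_with_reconnect(fake_connect, flaky_search)
print(result, state)
```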

Whew! That’s it. There were a lot of hoops to jump through, but you now have a way to get your Python programs to log in to an email account and fetch emails. You can always consult the overview in Retrieving and Deleting Emails with IMAP whenever you need to remember all of the steps.

Project: Sending Member Dues Reminder Emails

Say you have been “volunteered” to track member dues for the Mandatory Volunteerism Club. This is a truly boring job, involving maintaining a spreadsheet of everyone who has paid each month and emailing reminders to those who haven’t. Instead of going through the spreadsheet yourself and copying and pasting the same email to everyone who is behind on dues, let’s (you guessed it) write a script that does this for you.

At a high level, here’s what your program will do:

Read data from an Excel spreadsheet.
Find all members who have not paid dues for the latest month.
Find their email addresses and send them personalized reminders.

This means your code will need to do the following:

Open and read the cells of an Excel document with the openpyxl module. (See Chapter 12 for working with Excel files.)
Create a dictionary of members who are behind on their dues.
Log in to an SMTP server by calling smtplib.SMTP(), ehlo(), starttls(), and login().
For all members behind on their dues, send a personalized reminder email by calling the sendmail() method.

Open a new file editor window and save it as sendDuesReminders.py.

Step 1: Open the Excel File

Let’s say the Excel spreadsheet you use to track membership dues payments looks like Figure 16-2 and is in a file named duesRecords.xlsx. You can download this file from http://nostarch.com/automatestuff/.

Figure 16-2. The spreadsheet for tracking member dues payments

This spreadsheet has every member’s name and email address. Each month has a column tracking members’ payment statuses. The cell for each member is marked with the text paid once they have paid their dues.

The program will have to open duesRecords.xlsx and figure out the column for the latest month by calling the get_highest_column() method. (You can consult Chapter 12 for more information on accessing cells in Excel spreadsheet files with the openpyxl module.) Enter the following code into the file editor window:

  #! python3
  # sendDuesReminders.py - Sends emails based on payment status in spreadsheet.

  import openpyxl, smtplib, sys

  # Open the spreadsheet and get the latest dues status.
➊ wb = openpyxl.load_workbook('duesRecords.xlsx')
➋ sheet = wb.get_sheet_by_name('Sheet1')
➌ lastCol = sheet.get_highest_column()
➍ latestMonth = sheet.cell(row=1, column=lastCol).value

  # TODO: Check each member's payment status.

  # TODO: Log in to email account.

  # TODO: Send out reminder emails.

After importing the openpyxl, smtplib, and sys modules, we open our duesRecords.xlsx file and store the resulting Workbook object in wb ➊. Then we get Sheet 1 and store the resulting Worksheet object in sheet ➋. Now that we have a Worksheet object, we can access rows, columns, and cells. We store the highest column in lastCol ➌, and we then use row number 1 and lastCol to access the cell that should hold the most recent month. We get the value in this cell and store it in latestMonth ➍.

Step 2: Find All Unpaid Members

Once you’ve determined the column number of the latest month (stored in lastCol), you can loop through all rows after the first row (which has the column headers) to see which members have the text paid in the cell for that month’s dues. If the member hasn’t paid, you can grab the member’s name and email address from columns 1 and 2, respectively. This information will go into the unpaidMembers dictionary, which will track all members who haven’t paid in the most recent month. Add the following code to sendDuesReminders.py.

  #! python3
  # sendDuesReminders.py - Sends emails based on payment status in spreadsheet.

  --snip--

  # Check each member's payment status.
  unpaidMembers = {}
➊ for r in range(2, sheet.get_highest_row() + 1):
➋     payment = sheet.cell(row=r, column=lastCol).value
      if payment != 'paid':
➌         name = sheet.cell(row=r, column=1).value
➍         email = sheet.cell(row=r, column=2).value
➎         unpaidMembers[name] = email

This code sets up an empty dictionary unpaidMembers and then loops through all the rows after the first ➊. For each row, the value in the most recent column is stored in payment ➋. If payment is not equal to 'paid', then the value of the first column is stored in name ➌, the value of the second column is stored in email ➍, and name and email are added to unpaidMembers ➎.
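The same loop can be tried without a spreadsheet file. In this minimal sketch, the rows are an ordinary list of tuples standing in for the worksheet, with made-up members and months, and find_unpaid() is a hypothetical function that mirrors the logic described above: skip the header row, look at the last column, and collect anyone whose cell is not 'paid'.

```python
# Made-up data standing in for the worksheet: a header row, then one row per member.
rows = [
    ('Member', 'Email', 'Jan 2015', 'Feb 2015'),
    ('Alice', 'alice@example.com', 'paid', 'paid'),
    ('Bob', 'bob@example.com', 'paid', None),   # An empty cell reads as None.
    ('Eve', 'eve@example.com', 'paid', ''),
]

def find_unpaid(rows):
    """Return a dictionary of name -> email for members whose last column isn't 'paid'."""
    unpaid = {}
    for row in rows[1:]:        # Skip the header row.
        name, email, payment = row[0], row[1], row[-1]
        if payment != 'paid':
            unpaid[name] = email
    return unpaid

latest_month = rows[0][-1]
print(latest_month, find_unpaid(rows))
```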

Step 3: Send Customized Email Reminders

Once you have a list of all unpaid members, it’s time to send them email reminders. Add the following code to your program, except with your real email address and provider information:

#! python3
# sendDuesReminders.py - Sends emails based on payment status in spreadsheet.

--snip--

# Log in to email account.
smtpObj = smtplib.SMTP('smtp.gmail.com', 587)
smtpObj.ehlo()
smtpObj.starttls()
smtpObj.login('[email protected]', sys.argv[1])

Create an SMTP object by calling smtplib.SMTP() and passing it the domain name and port for your provider. Call ehlo() and starttls(), and then call login() and pass it your email address and sys.argv[1], which will store your password string. You’ll enter the password as a command line argument each time you run the program, to avoid saving your password in your source code.

Once your program has logged in to your email account, it should go through the unpaidMembers dictionary and send a personalized email to each member’s email address. Add the following to sendDuesReminders.py:

  #! python3
  # sendDuesReminders.py - Sends emails based on payment status in spreadsheet.

  --snip--

  # Send out reminder emails.
  for name, email in unpaidMembers.items():
      # The blank line (\n\n) separates the Subject header from the message text.
➊     body = "Subject: %s dues unpaid.\n\nDear %s,\nRecords show that you have not paid dues for %s. Please make this payment as soon as possible. Thank you!" % (latestMonth, name, latestMonth)
➋     print('Sending email to %s…' % email)
➌     sendmailStatus = smtpObj.sendmail('[email protected]', email, body)

➍     if sendmailStatus != {}:
          print('There was a problem sending email to %s: %s' % (email, sendmailStatus))
  smtpObj.quit()

This code loops through the names and emails in unpaidMembers. For each member who hasn’t paid, we customize a message with the latest month and the member’s name, and store the message in body ➊. We print output saying that we’re sending an email to this member’s email address ➋. Then we call sendmail(), passing it the from address, the member’s email address, and the customized message ➌. We store the return value in sendmailStatus.

Remember that the sendmail() method will return a nonempty dictionary value if the SMTP server reported an error sending that particular email. The last part of the for loop at ➍ checks if the returned dictionary is nonempty, and if it is, prints the recipient’s email address and the returned dictionary.

After the program is done sending all the emails, the quit() method is called to disconnect from the SMTP server.
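As an alternative to assembling the Subject line and body in one string by hand, Python’s standard email.message.EmailMessage class can build the message for you. This is a sketch of that alternative, not the approach the project uses: build_reminder() is a hypothetical helper, and the addresses and month are made up for the example.

```python
from email.message import EmailMessage

def build_reminder(sender, recipient, name, month):
    """Build a dues reminder as an EmailMessage with proper headers and body."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = '%s dues unpaid.' % month
    msg.set_content(
        'Dear %s,\nRecords show that you have not paid dues for %s. '
        'Please make this payment as soon as possible. Thank you!' % (name, month))
    return msg

reminder = build_reminder('treasurer@example.com', 'bob@example.com', 'Bob', 'Jun 2014')
print(reminder)
```

A message built this way can be handed to the SMTP object’s send_message() method (for example, smtpObj.send_message(reminder)) in place of the sendmail() call; the sender and recipient are then taken from the From and To headers.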

When you run the program, the output will look something like this:

Sending email to [email protected]…
Sending email to [email protected]…
Sending email to [email protected]…

The recipients will receive an email that looks like Figure 16-3.

Figure 16-3. An automatically sent email from sendDuesReminders.py

Sending Text Messages with Twilio

Most people are more likely to be near their phones than their computers, so text messages can be a more immediate and reliable way of sending notifications than email. Also, the short length of text messages makes it more likely that a person will get around to reading them.

In this section, you’ll learn how to sign up for the free Twilio service and use its Python module to send text messages. Twilio is an SMS gateway service, which means it’s a service that allows you to send text messages from your programs. Although you will be limited in how many texts you can send per month and the texts will be prefixed with the words Sent from a Twilio trial account, this trial service is probably adequate for your personal programs. The free trial is indefinite; you won’t have to upgrade to a paid plan later.

Twilio isn’t the only SMS gateway service. If you prefer not to use Twilio, you can find alternative services by searching online for free sms gateway, python sms api, or even twilio alternatives.

Before signing up for a Twilio account, install the twilio module. Appendix A has more details about installing third-party modules.

NOTE

This section is specific to the United States. Twilio does offer SMS texting services for countries outside of the United States, but those specifics aren’t covered in this book. The twilio module and its functions, however, will work the same outside the United States. See http://twilio.com/ for more information.

Signing Up for a Twilio Account

Go to http://twilio.com/ and fill out the sign-up form. Once you’ve signed up for a new account, you’ll need to verify a mobile phone number that you want to send texts to. (This verification is necessary to prevent people from using the service to spam random phone numbers with text messages.)

After receiving the text with the verification number, enter it into the Twilio website to prove that you own the mobile phone you are verifying. You will now be able to send texts to this phone number using the twilio module.

Twilio provides your trial account with a phone number to use as the sender of text messages. You will need two more pieces of information: your account SID and the auth (authentication) token. You can find this information on the Dashboard page when you are logged in to your Twilio account. These values act as your Twilio username and password when logging in from a Python program.

Sending Text Messages

Once you’ve installed the twilio module, signed up for a Twilio account, verified your phone number, registered a Twilio phone number, and obtained your account SID and auth token, you will finally be ready to send yourself text messages from your Python scripts.

Compared to all the registration steps, the actual Python code is fairly simple. With your computer connected to the Internet, enter the following into the interactive shell, replacing the accountSID, authToken, myTwilioNumber, and myCellPhone variable values with your real information:

➊ >>> from twilio.rest import TwilioRestClient
  >>> accountSID = 'ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
  >>> authToken = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
➋ >>> twilioCli = TwilioRestClient(accountSID, authToken)
  >>> myTwilioNumber = '+14955551234'
  >>> myCellPhone = '+14955558888'
➌ >>> message = twilioCli.messages.create(body='Mr. Watson - Come here - I want to see you.', from_=myTwilioNumber, to=myCellPhone)

A few moments after typing the last line, you should receive a text message that reads Sent from your Twilio trial account - Mr. Watson - Come here - I want to see you.

Because of the way the twilio module is set up, you need to import it using from twilio.rest import TwilioRestClient, not just import twilio ➊. Store your account SID in accountSID and your auth token in authToken and then call TwilioRestClient() and pass it accountSID and authToken. The call to TwilioRestClient() returns a TwilioRestClient object ➋. This object has a messages attribute, which in turn has a create() method you can use to send text messages. This is the method that will instruct Twilio’s servers to send your text message. After storing your Twilio number and cell phone number in myTwilioNumber and myCellPhone, respectively, call create() and pass it keyword arguments specifying the body of the text message, the sender’s number (myTwilioNumber), and the recipient’s number (myCellPhone) ➌.

The Message object returned from the create() method will have information about the text message that was sent. Continue the interactive shell example by entering the following:

>>> message.to
'+14955558888'
>>> message.from_
'+14955551234'
>>> message.body
'Mr. Watson - Come here - I want to see you.'

The to, from_, and body attributes should hold your cell phone number, Twilio number, and message, respectively. Note that the sending phone number is in the from_ attribute (with an underscore at the end), not from. This is because from is a keyword in Python (you’ve seen it used in the from modulename import * form of import statement, for example), so it cannot be used as an attribute name. Continue the interactive shell example with the following:

>>> message.status
'queued'
>>> message.date_created
datetime.datetime(2015, 7, 8, 1, 36, 18)
>>> message.date_sent == None
True

The status attribute should give you a string. The date_created and date_sent attributes should give you a datetime object if the message has been created and sent. It may seem odd that the status attribute is set to 'queued' and the date_sent attribute is set to None when you’ve already received the text message. This is because you captured the Message object in the message variable before the text was actually sent. You will need to refetch the Message object in order to see its most up-to-date status and date_sent. Every Twilio message has a unique string ID (SID) that can be used to fetch the latest update of the Message object. Continue the interactive shell example by entering the following:

  >>> message.sid
  'SM09520de7639ba3af137c6fcb7c5f4b51'
➊ >>> updatedMessage = twilioCli.messages.get(message.sid)
  >>> updatedMessage.status
  'delivered'
  >>> updatedMessage.date_sent
  datetime.datetime(2015, 7, 8, 1, 36, 18)

Entering message.sid shows you this message’s long SID. By passing this SID to the Twilio client’s get() method ➊, you can retrieve a new Message object with the most up-to-date information. In this new Message object, the status and date_sent attributes are correct.

The status attribute will be set to one of the following string values: 'queued', 'sending', 'sent', 'delivered', 'undelivered', or 'failed'. These statuses are self-explanatory, but for more precise details, take a look at the resources at http://nostarch.com/automatestuff/.
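Because the status changes over time, a program that cares about the outcome has to check more than once. The sketch below is one hedged way to do that: wait_for_final_status() is a hypothetical helper, and treating 'delivered', 'undelivered', and 'failed' as the final statuses is an assumption made for this example. The fetch_status argument is any function that returns the current status string, so the demonstration can run with a stand-in instead of a Twilio account.

```python
import time

# Assumption for this sketch: these statuses mean the message will not change again.
FINAL_STATUSES = {'delivered', 'undelivered', 'failed'}

def wait_for_final_status(fetch_status, tries=5, delay=1.0):
    """Call fetch_status() up to `tries` times, pausing between calls.

    Returns the last status seen, which may not be final if tries ran out.
    """
    status = None
    for attempt in range(tries):
        status = fetch_status()
        if status in FINAL_STATUSES:
            break
        if attempt < tries - 1:
            time.sleep(delay)
    return status

# Stand-in that reports a different status on each call, as a real message might.
fake_statuses = iter(['queued', 'sending', 'delivered'])
final = wait_for_final_status(lambda: next(fake_statuses), delay=0)
print(final)
```

With a real client, fetch_status could be a function that refetches the message by its SID, as in the get() call shown above, and returns its status attribute.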

RECEIVING TEXT MESSAGES WITH PYTHON

Unfortunately, receiving text messages with Twilio is a bit more complicated than sending them. Twilio requires that you have a website running its own web application. That’s beyond the scope of this book, but you can find more details in the resources for this book (http://nostarch.com/automatestuff/).

Project: “Just Text Me” Module

The person you’ll most often text from your programs is probably you. Texting is a great way to send yourself notifications when you’re away from your computer. If you’ve automated a boring task with a program that takes a couple of hours to run, you could have it notify you with a text when it’s finished. Or you may have a regularly scheduled program running that sometimes needs to contact you, such as a weather-checking program that texts you a reminder to pack an umbrella.

As a simple example, here’s a small Python program with a textmyself() function that sends a message passed to it as a string argument. Open a new file editor window and enter the following code, replacing the account SID, auth token, and phone numbers with your own information. Save it as textMyself.py.

  #! python3
  # textMyself.py - Defines the textmyself() function that texts a message
  # passed to it as a string.

  # Preset values:
  accountSID = 'ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
  authToken = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
  myNumber = '+15559998888'
  twilioNumber = '+15552225678'

  from twilio.rest import TwilioRestClient

➊ def textmyself(message):
➋     twilioCli = TwilioRestClient(accountSID, authToken)
➌     twilioCli.messages.create(body=message, from_=twilioNumber, to=myNumber)

This program stores an account SID, auth token, sending number, and receiving number. It then defines textmyself() to take one argument ➊, make a TwilioRestClient object ➋, and call create() with the message you passed ➌.

If you want to make the textmyself() function available to your other programs, simply place the textMyself.py file in the same folder as the Python executable (C:\Python34 on Windows, /usr/local/lib/python3.4 on OS X, and /usr/bin/python3 on Linux). Now you can use the function in your other programs. Whenever you want one of your programs to text you, just add the following (the module name in the import statement must match the filename textMyself.py, including its capitalization):

import textMyself
textMyself.textmyself('The boring task is finished.')

You need to sign up for Twilio and write the texting code only once. After that, it’s just two lines of code to send a text from any of your other programs.

Summary

We communicate with each other on the Internet and over cell phone networks in dozens of different ways, but email and texting predominate. Your programs can communicate through these channels, which gives them powerful new notification features. You can even write programs running on different computers that communicate with one another directly via email, with one program sending emails with SMTP and the other retrieving them with IMAP.

Python’s smtplib provides functions for using SMTP to send emails through your email provider’s SMTP server. Likewise, the third-party imapclient and pyzmail modules let you access IMAP servers and retrieve emails sent to you. Although IMAP is a bit more involved than SMTP, it’s also quite powerful and allows you to search for particular emails, download them, and parse them to extract the subject and body as string values.

Texting is a bit different from email, since, unlike email, more than just an Internet connection is needed to send SMS texts. Fortunately, services such as Twilio provide modules to allow you to send text messages from your programs. Once you go through an initial setup process, you’ll be able to send texts with just a couple lines of code.

With these modules in your skill set, you’ll be able to program the specific conditions under which your programs should send notifications or reminders. Now your programs will have reach far beyond the computer they’re running on!

Practice Questions

Q: 1. What is the protocol for sending email? For checking and receiving email?

Q: 2. What four smtplib functions/methods must you call to log in to an SMTP server?

Q: 3. What two imapclient functions/methods must you call to log in to an IMAP server?

Q: 4. What kind of argument do you pass to imapObj.search()?

Q: 5. What do you do if your code gets an error message that says got more than 10000 bytes?

Q: 6. The imapclient module handles connecting to an IMAP server and finding emails. What is one module that handles reading the emails that imapclient collects?

Q: 7. What three pieces of information do you need from Twilio before you can send text messages?

Practice Projects

For practice, write programs that do the following.

Random Chore Assignment Emailer

Write a program that takes a list of people’s email addresses and a list of chores that need to be done and randomly assigns chores to people. Email each person their assigned chores. If you’re feeling ambitious, keep a record of each person’s previously assigned chores so that you can make sure the program avoids assigning anyone the same chore they did last time. For another possible feature, schedule the program to run once a week automatically.

Here’s a hint: If you pass a list to the random.choice() function, it will return a randomly selected item from the list. Part of your code could look like this:

import random
chores = ['dishes', 'bathroom', 'vacuum', 'walk dog']
randomChore = random.choice(chores)
chores.remove(randomChore)    # this chore is now taken, so remove it
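Building on that hint, here is one possible sketch of the assignment step only (no email is sent). It swaps in random.shuffle() and deals the chores out in turn, which is a different technique from calling random.choice() repeatedly but gives every chore to exactly one person. The function name assign_chores() and the addresses are made up for this example.

```python
import random

def assign_chores(people, chores):
    """Randomly deal every chore out to the people, as evenly as possible."""
    shuffled = list(chores)      # Copy, so the caller's list is left unchanged.
    random.shuffle(shuffled)
    assignments = {person: [] for person in people}
    for index, chore in enumerate(shuffled):
        person = people[index % len(people)]   # Take turns, like dealing cards.
        assignments[person].append(chore)
    return assignments

people = ['alice@example.com', 'bob@example.com']
chores = ['dishes', 'bathroom', 'vacuum', 'walk dog']
for person, tasks in assign_chores(people, chores).items():
    print(person, tasks)
```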

Umbrella Reminder

Chapter 11 showed you how to use the requests module to scrape data from http://weather.gov/. Write a program that runs just before you wake up in the morning and checks whether it’s raining that day. If so, have the program text you a reminder to pack an umbrella before leaving the house.

Auto Unsubscriber

Write a program that scans through your email account, finds all the unsubscribe links in all your emails, and automatically opens them in a browser. This program will have to log in to your email provider’s IMAP server and download all of your emails. You can use BeautifulSoup (covered in Chapter 11) to check for any instance where the word unsubscribe occurs within an HTML link tag.

Once you have a list of these URLs, you can use webbrowser.open() to automatically open all of these links in a browser.
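To show the link-finding step in isolation, the sketch below uses Python’s built-in html.parser module instead of BeautifulSoup, so it runs without any third-party module. The class and function names are invented for this example, and the HTML is a made-up email body. A link counts as a match if the word unsubscribe appears in either its URL or its visible text, which is one reasonable reading of the project description rather than the only one.

```python
from html.parser import HTMLParser

class UnsubscribeLinkFinder(HTMLParser):
    """Collect the href of every <a> tag that mentions 'unsubscribe'."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None     # href of the <a> tag we are inside, if any
        self._text = []       # pieces of visible text inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            searched = (self._href + ' ' + ''.join(self._text)).lower()
            if 'unsubscribe' in searched:
                self.links.append(self._href)
            self._href = None

def find_unsubscribe_links(html):
    finder = UnsubscribeLinkFinder()
    finder.feed(html)
    return finder.links

sample = ('<p><a href="http://example.com/unsubscribe?id=1">Click here</a> '
          '<a href="http://example.com/news">News</a> '
          '<a href="http://example.com/x">Unsubscribe</a></p>')
print(find_unsubscribe_links(sample))
```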

You’ll still have to manually go through and complete any additional steps to unsubscribe yourself from these lists. In most cases, this involves clicking a link to confirm.

But this script saves you from having to go through all of your emails looking for unsubscribe links. You can then pass this script along to your friends so they can run it on their email accounts. (Just make sure your email password isn’t hardcoded in the source code!)

Controlling Your Computer Through Email

Write a program that checks an email account every 15 minutes for any instructions you email it and executes those instructions automatically. For example, BitTorrent is a peer-to-peer downloading system. Using free BitTorrent software such as qBittorrent, you can download large media files on your home computer. If you email the program a (completely legal, not at all piratical) BitTorrent link, the program will eventually check its email, find this message, extract the link, and then launch qBittorrent to start downloading the file. This way, you can have your home computer begin downloads while you’re away, and the (completely legal, not at all piratical) download can be finished by the time you return home.

Chapter 15 covers how to launch programs on your computer using the subprocess.Popen() function. For example, the following call would launch the qBittorrent program, along with a torrent file:

qbProcess = subprocess.Popen(['C:\\Program Files (x86)\\qBittorrent\\qbittorrent.exe', 'shakespeare_complete_works.torrent'])

Of course, you’ll want the program to make sure the emails come from you. In particular, you might want to require that the emails contain a password, since it is fairly trivial for hackers to fake a “from” address in emails. The program should delete the emails it finds so that it doesn’t repeat instructions every time it checks the email account. As an extra feature, have the program email or text you a confirmation every time it executes a command. Since you won’t be sitting in front of the computer that is running the program, it’s a good idea to use the logging functions (see Chapter 10) to write a text file log that you can check if errors come up.

qBittorrent (as well as other BitTorrent applications) has a feature where it can quit automatically after the download completes. Chapter 15 explains how you can determine when a launched application has quit with the wait() method for Popen objects. The wait() method call will block until qBittorrent has stopped, and then your program can email or text you a notification that the download has completed.
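The launch-then-wait pattern can be tried without qBittorrent installed. In this small sketch, the launched program is simply a second copy of the Python interpreter (found through sys.executable) running a one-line script; it stands in for the long-running download. The call to wait() blocks until that child program has quit and then returns its exit code.

```python
import subprocess
import sys

# Launch a short-lived child program; a real script would launch qBittorrent here.
child = subprocess.Popen([sys.executable, '-c', 'print("download finished")'])

# wait() blocks until the child has quit, then returns its exit code (0 means success).
exit_code = child.wait()
print('Child exited with code', exit_code)
```

After wait() returns is the natural point to send the email or text notification.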

There are a lot of possible features you could add to this project. If you get stuck, you can download an example implementation of this program from http://nostarch.com/automatestuff/.

Chapter 17. Manipulating Images

If you have a digital camera or even if you just upload photos from your phone to Facebook, you probably cross paths with digital image files all the time. You may know how to use basic graphics software, such as Microsoft Paint or Paintbrush, or even more advanced applications such as Adobe Photoshop. But if you need to edit a massive number of images, editing them by hand can be a lengthy, boring job.

Enter Python. Pillow is a third-party Python module for interacting with image files. The module has several functions that make it easy to crop, resize, and edit the content of an image. With the power to manipulate images the same way you would with software such as Microsoft Paint or Adobe Photoshop, Python can automatically edit hundreds or thousands of images with ease.

Computer Image Fundamentals

In order to manipulate an image, you need to understand the basics of how computers deal with colors and coordinates in images and how you can work with colors and coordinates in Pillow. But before you continue, install the pillow module. See Appendix A for help installing third-party modules.

Colors and RGBA Values

Computer programs often represent a color in an image as an RGBA value. An RGBA value is a group of numbers that specify the amount of red, green, blue, and alpha (or transparency) in a color. Each of these component values is an integer from 0 (none at all) to 255 (the maximum). These RGBA values are assigned to individual pixels; a pixel is the smallest dot of a single color the computer screen can show (as you can imagine, there are millions of pixels on a screen). A pixel’s RGB setting tells it precisely what shade of color it should display. Images also have an alpha value to create RGBA values. If an image is displayed on the screen over a background image or desktop wallpaper, the alpha value determines how much of the background you can “see through” the image’s pixel.

In Pillow, RGBA values are represented by a tuple of four integer values. For example, the color red is represented by (255, 0, 0, 255). This color has the maximum amount of red, no green or blue, and the maximum alpha value, meaning it is fully opaque. Green is represented by (0, 255, 0, 255), and blue is (0, 0, 255, 255). White, the combination of all colors, is (255, 255, 255, 255), while black, which has no color at all, is (0, 0, 0, 255).

If a color has an alpha value of 0, it is invisible, and it doesn’t really matter what the RGB values are. After all, invisible red looks the same as invisible black.
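To make the role of the alpha value concrete, here is a pure-Python sketch of how a see-through pixel can be mixed with the background behind it. The function blend_over() is a hypothetical helper written for this illustration; it uses the common weighted-average formula for an opaque background, and Pillow’s own compositing may round slightly differently.

```python
def blend_over(foreground, background):
    """Mix an RGBA foreground pixel over an opaque RGB background pixel.

    An alpha of 255 shows only the foreground; an alpha of 0 shows only the background.
    """
    red, green, blue, alpha = foreground
    weight = alpha / 255
    return tuple(round(weight * front + (1 - weight) * back)
                 for front, back in zip((red, green, blue), background))

print(blend_over((255, 0, 0, 255), (0, 0, 255)))   # Fully opaque red over blue
print(blend_over((255, 0, 0, 0), (0, 0, 255)))     # Invisible red over blue
print(blend_over((0, 0, 0, 0), (0, 0, 255)))       # Invisible black over blue
```

The last two calls give the same result, which is the point made above: with an alpha of 0, the RGB values no longer matter.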

Pillow uses the standard color names that HTML uses. Table 17-1 lists a selection of standard color names and their values. Note that the HTML color name green is the darker (0, 128, 0, 255), not the full-intensity green (0, 255, 0, 255) described above.

Table 17-1. Standard Color Names and Their RGBA Values

Name     RGBA Value               Name     RGBA Value
White    (255, 255, 255, 255)     Red      (255, 0, 0, 255)
Green    (0, 128, 0, 255)         Blue     (0, 0, 255, 255)
Gray     (128, 128, 128, 255)     Yellow   (255, 255, 0, 255)
Black    (0, 0, 0, 255)           Purple   (128, 0, 128, 255)

Pillow offers the ImageColor.getcolor() function so you don’t have to memorize RGBA values for the colors you want to use. This function takes a color name string as its first argument, and the string 'RGBA' as its second argument, and it returns an RGBA tuple.

CMYK AND RGB COLORING

In grade school you learned that mixing red, yellow, and blue paints can form other colors; for example, you can mix blue and yellow to make green paint. This is known as the subtractive color model, and it applies to dyes, inks, and pigments. This is why color printers have CMYK ink cartridges: the Cyan (blue), Magenta (red), Yellow, and blacK ink can be mixed together to form any color.

However, the physics of light uses what’s called an additive color model. When combining light (such as the light given off by your computer screen), red, green, and blue light can be combined to form any other color. This is why RGB values represent color in computer programs.

To see how this function works, enter the following into the interactive shell:

➊ >>> from PIL import ImageColor
➋ >>> ImageColor.getcolor('red', 'RGBA')
  (255, 0, 0, 255)
➌ >>> ImageColor.getcolor('RED', 'RGBA')
  (255, 0, 0, 255)
  >>> ImageColor.getcolor('Black', 'RGBA')
  (0, 0, 0, 255)
  >>> ImageColor.getcolor('chocolate', 'RGBA')
  (210, 105, 30, 255)
  >>> ImageColor.getcolor('CornflowerBlue', 'RGBA')
  (100, 149, 237, 255)

First, you need to import the ImageColor module from PIL ➊ (not from Pillow; you’ll see why in a moment). The color name string you pass to ImageColor.getcolor() is case insensitive, so passing 'red' ➋ and passing 'RED' ➌ give you the same RGBA tuple. You can also pass more unusual color names, like 'chocolate' and 'Cornflower Blue'.

Pillow supports a huge number of color names, from 'aliceblue' to 'whitesmoke'. You can find the full list of more than 100 standard color names in the resources at http://nostarch.com/automatestuff/.

CoordinatesandBoxTuplesImagepixelsareaddressedwithx-andy-coordinates,whichrespectivelyspecifyapixel’shorizontalandverticallocationinanimage.Theoriginisthepixelatthetop-leftcorneroftheimageandisspecifiedwiththenotation(0,0).Thefirstzerorepresentsthex-coordinate,whichstartsatzeroattheoriginandincreasesgoingfromlefttoright.Thesecondzerorepresentsthey-coordinate,whichstartsatzeroattheoriginandincreasesgoingdowntheimage.Thisbearsrepeating:y-coordinatesincreasegoingdownward,whichistheoppositeofhowyoumayremembery-coordinatesbeingusedinmathclass.Figure17-1demonstrateshowthiscoordinatesystemworks.

Figure 17-1. The x- and y-coordinates of a 27×26 image of some sort of ancient data storage device

Many of Pillow’s functions and methods take a box tuple argument. This means Pillow is expecting a tuple of four integer coordinates that represent a rectangular region in an image. The four integers are, in order, as follows:

Left: The x-coordinate of the leftmost edge of the box.
Top: The y-coordinate of the top edge of the box.
Right: The x-coordinate of one pixel to the right of the rightmost edge of the box. This integer must be greater than the left integer.
Bottom: The y-coordinate of one pixel lower than the bottom edge of the box. This integer must be greater than the top integer.

Note that the box includes the left and top coordinates and goes up to but does not include the right and bottom coordinates. For example, the box tuple (3, 1, 9, 6) represents all the pixels in the black box in Figure 17-2.

Figure 17-2. The area represented by the box tuple (3, 1, 9, 6)
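The half-open nature of a box tuple is easy to check with a little arithmetic. The following is only a sketch in plain Python (no Pillow needed); the helper names box_size() and box_contains() are made up for this illustration and are not part of Pillow.

```python
def box_size(box):
    """Return (width, height) of a (left, top, right, bottom) box tuple."""
    left, top, right, bottom = box
    return (right - left, bottom - top)


def box_contains(box, x, y):
    """Return True if pixel (x, y) lies inside the box.

    range() excludes its end value, which matches the way a box tuple
    excludes its right and bottom coordinates.
    """
    left, top, right, bottom = box
    return x in range(left, right) and y in range(top, bottom)


box = (3, 1, 9, 6)
print(box_size(box))             # width and height of the boxed region
print(box_contains(box, 3, 1))   # the left/top corner is included
print(box_contains(box, 9, 6))   # the right/bottom coordinates are not
```

For the box tuple (3, 1, 9, 6), the region is 9 - 3 = 6 pixels wide and 6 - 1 = 5 pixels tall.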

Manipulating Images with Pillow

Now that you know how colors and coordinates work in Pillow, let’s use Pillow to manipulate an image. Figure 17-3 is the image that will be used for all the interactive shell examples in this chapter. You can download it from http://nostarch.com/automatestuff/.

Once you have the image file zophie.png in your current working directory, you’ll be ready to load the image of Zophie into Python, like so:

>>> from PIL import Image
>>> catIm = Image.open('zophie.png')

Figure 17-3. My cat Zophie. The camera adds 10 pounds (which is a lot for a cat).

To load the image, you import the Image module from Pillow and call Image.open(), passing it the image’s filename. You can then store the loaded image in a variable like catIm. The module name of Pillow is PIL to make it backward compatible with an older module called Python Imaging Library, which is why you must run from PIL import Image instead of from Pillow import Image. Because of the way Pillow’s creators set up the pillow module, you must use the from PIL import Image form of import statement, rather than simply import PIL.

If the image file isn’t in the current working directory, change the working directory to the folder that contains the image file by calling the os.chdir() function.

>>> import os
>>> os.chdir('C:\\folder_with_image_file')

The Image.open() function returns a value of the Image object data type, which is how Pillow represents an image as a Python value. You can load an Image object from an image file (of any format) by passing the Image.open() function a string of the filename. Any changes you make to the Image object can be saved to an image file (also of any format) with the save() method. All the rotations, resizing, cropping, drawing, and other image manipulations will be done through method calls on this Image object.

To shorten the examples in this chapter, I’ll assume you’ve imported Pillow’s Image module and that you have the Zophie image stored in a variable named catIm. Be sure that the zophie.png file is in the current working directory so that the Image.open() function can find it. Otherwise, you will also have to specify the full absolute path in the string argument to Image.open().

Working with the Image Data Type

An Image object has several useful attributes that give you basic information about the image file it was loaded from: its width and height, the filename, and the graphics format (such as JPEG, GIF, or PNG).

For example, enter the following into the interactive shell:

  >>> from PIL import Image
  >>> catIm = Image.open('zophie.png')
  >>> catIm.size
➊ (816, 1088)
➋ >>> width, height = catIm.size
➌ >>> width
  816
➍ >>> height
  1088
  >>> catIm.filename
  'zophie.png'
  >>> catIm.format
  'PNG'
  >>> catIm.format_description
  'Portable network graphics'
➎ >>> catIm.save('zophie.jpg')

After making an Image object from zophie.png and storing the Image object in catIm, we can see that the object’s size attribute contains a tuple of the image’s width and height in pixels ➊. We can assign the values in the tuple to width and height variables ➋ in order to access the width ➌ and height ➍ individually. The filename attribute describes the original file’s name. The format and format_description attributes are strings that describe the image format of the original file (with format_description being a bit more verbose).

Finally, calling the save() method and passing it 'zophie.jpg' saves a new image with the filename zophie.jpg to your hard drive ➎. Pillow sees that the file extension is .jpg and automatically saves the image using the JPEG image format. Now you should have two images, zophie.png and zophie.jpg, on your hard drive. While these files are based on the same image, they are not identical because of their different formats.

Pillow also provides the Image.new() function, which returns an Image object, much like Image.open(), except the image represented by Image.new()’s object will be blank. The arguments to Image.new() are as follows:

The string 'RGBA', which sets the color mode to RGBA. (There are other modes that this book doesn’t go into.)
The size, as a two-integer tuple of the new image’s width and height.
The background color that the image should start with, as a four-integer tuple of an RGBA value. You can use the return value of the ImageColor.getcolor() function for this argument. Alternatively, Image.new() also supports just passing the string of the standard color name.

For example, enter the following into the interactive shell:

  >>> from PIL import Image
➊ >>> im = Image.new('RGBA', (100, 200), 'purple')
  >>> im.save('purpleImage.png')
➋ >>> im2 = Image.new('RGBA', (20, 20))
  >>> im2.save('transparentImage.png')

Here we create an Image object for an image that’s 100 pixels wide and 200 pixels tall, with a purple background ➊. This image is then saved to the file purpleImage.png. We call Image.new() again to create another Image object, this time passing (20, 20) for the dimensions and nothing for the background color ➋. Invisible black, (0, 0, 0, 0), is the default color used if no color argument is specified, so the second image has a transparent background; we save this 20×20 transparent square in transparentImage.png.
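You can confirm these defaults without opening the saved files. The snippet below is a small sketch that assumes Pillow is installed; it uses the getpixel() method, which is covered later in this chapter, and the variable names purple_im and blank_im are only illustrative.

```python
from PIL import Image

# A named background color versus no color argument at all.
purple_im = Image.new('RGBA', (100, 200), 'purple')
blank_im = Image.new('RGBA', (20, 20))

print(purple_im.size)               # (width, height) as passed to Image.new()
print(purple_im.getpixel((0, 0)))   # RGBA tuple for 'purple'
print(blank_im.getpixel((0, 0)))    # the default "invisible black"
```

Inspecting a single pixel is enough here because Image.new() fills the whole image with one color.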

Cropping Images

Cropping an image means selecting a rectangular region inside an image and removing everything outside the rectangle. The crop() method on Image objects takes a box tuple and returns an Image object representing the cropped image. The cropping does not happen in place; that is, the original Image object is left untouched, and the crop() method returns a new Image object. Remember that a box tuple (in this case, the cropped section) includes the left column and top row of pixels but only goes up to and does not include the right column and bottom row of pixels.

Enter the following into the interactive shell:

>>> croppedIm = catIm.crop((335, 345, 565, 560))
>>> croppedIm.save('cropped.png')

This makes a new Image object for the cropped image, stores the object in croppedIm, and then calls save() on croppedIm to save the cropped image in cropped.png. The new file cropped.png will be created from the original image, like in Figure 17-4.

Figure 17-4. The new image will be just the cropped section of the original image.
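If you don’t have zophie.png handy, the same behavior can be seen on a blank image created in memory. This is only a sketch that assumes Pillow is installed; the 100×100 white image and the box values are arbitrary stand-ins, not anything from the chapter’s files.

```python
from PIL import Image

original = Image.new('RGBA', (100, 100), 'white')   # stand-in for a loaded photo
box = (10, 20, 40, 60)                              # left, top, right, bottom
cropped = original.crop(box)

print(cropped.size)    # (right - left, bottom - top)
print(original.size)   # the original Image object is left untouched
```

The cropped image is 40 - 10 = 30 pixels wide and 60 - 20 = 40 pixels tall, while the original keeps its full size.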

Copying and Pasting Images onto Other Images

The copy() method will return a new Image object with the same image as the Image object it was called on. This is useful if you need to make changes to an image but also want to keep an untouched version of the original. For example, enter the following into the interactive shell:

>>> catIm = Image.open('zophie.png')
>>> catCopyIm = catIm.copy()

The catIm and catCopyIm variables contain two separate Image objects, which both have the same image on them. Now that you have an Image object stored in catCopyIm, you can modify catCopyIm as you like and save it to a new filename, leaving zophie.png untouched. For example, let’s try modifying catCopyIm with the paste() method.

The paste() method is called on an Image object and pastes another image on top of it. Let’s continue the shell example by pasting a smaller image onto catCopyIm.

>>> faceIm = catIm.crop((335, 345, 565, 560))
>>> faceIm.size
(230, 215)
>>> catCopyIm.paste(faceIm, (0, 0))
>>> catCopyIm.paste(faceIm, (400, 500))
>>> catCopyIm.save('pasted.png')

First we pass crop() a box tuple for the rectangular area in zophie.png that contains Zophie’s face. This creates an Image object representing a 230×215 crop, which we store in faceIm. Now we can paste faceIm onto catCopyIm. The paste() method takes two arguments: a “source” Image object and a tuple of the x- and y-coordinates where you want to paste the top-left corner of the source Image object onto the main Image object. Here we call paste() twice on catCopyIm, passing (0, 0) the first time and (400, 500) the second time. This pastes faceIm onto catCopyIm twice: once with the top-left corner of faceIm at (0, 0) on catCopyIm, and once with the top-left corner of faceIm at (400, 500). Finally, we save the modified catCopyIm to pasted.png. The pasted.png image looks like Figure 17-5.

Figure 17-5. Zophie the cat, with her face pasted twice

NOTE

Despite their names, the copy() and paste() methods in Pillow do not use your computer’s clipboard.

Note that the paste() method modifies its Image object in place; it does not return an Image object with the pasted image. If you want to call paste() but also keep an untouched version of the original image around, you’ll need to first copy the image and then call paste() on that copy.

Say you want to tile Zophie’s head across the entire image, as in Figure 17-6. You can achieve this effect with just a couple of for loops. Continue the interactive shell example by entering the following:

  >>> catImWidth, catImHeight = catIm.size
  >>> faceImWidth, faceImHeight = faceIm.size
➊ >>> catCopyTwo = catIm.copy()
➋ >>> for left in range(0, catImWidth, faceImWidth):
➌         for top in range(0, catImHeight, faceImHeight):
              print(left, top)
              catCopyTwo.paste(faceIm, (left, top))

  0 0
  0 215
  0 430
  0 645
  0 860
  0 1075
  230 0
  230 215
  --snip--
  690 860
  690 1075
  >>> catCopyTwo.save('tiled.png')

Here we store the width and height of catIm in catImWidth and catImHeight. At ➊ we make a copy of catIm and store it in catCopyTwo. Now that we have a copy that we can paste onto, we start looping to paste faceIm onto catCopyTwo. The outer for loop’s left variable starts at 0 and increases by faceImWidth (230) ➋. The inner for loop’s top variable starts at 0 and increases by faceImHeight (215) ➌. These nested for loops produce values for left and top to paste a grid of faceIm images over the catCopyTwo Image object, as in Figure 17-6. To see our nested loops working, we print left and top. After the pasting is complete, we save the modified catCopyTwo to tiled.png.

Figure 17-6. Nested for loops used with paste() to duplicate the cat’s face (a duplicat, if you will).
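The grid positions produced by the nested loops can be worked out without any image at all. The sketch below is plain Python; the helper name tile_positions() is made up for this illustration, and the sizes are the ones reported for zophie.png (816×1088) and the cropped face (230×215).

```python
def tile_positions(image_size, tile_size):
    """Return every (left, top) paste position, in the order the
    nested for loops visit them (left in the outer loop, top inside)."""
    image_width, image_height = image_size
    tile_width, tile_height = tile_size
    return [(left, top)
            for left in range(0, image_width, tile_width)
            for top in range(0, image_height, tile_height)]


positions = tile_positions((816, 1088), (230, 215))
print(len(positions))                # 4 columns times 6 rows
print(positions[0], positions[-1])   # first and last paste positions
```

The last column starts at x = 690 and the last row at y = 1075, so the tiles along the right and bottom edges run past the image; paste() simply doesn’t draw the part that falls outside.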

PASTING TRANSPARENT PIXELS

Normally transparent pixels are pasted as white pixels. If the image you want to paste has transparent pixels, pass the Image object as the third argument so that a solid rectangle isn’t pasted. This third argument is the “mask” Image object. A mask is an Image object where the alpha value is significant, but the red, green, and blue values are ignored. The mask tells the paste() function which pixels it should copy and which it should leave transparent. Advanced usage of masks is beyond this book, but if you want to paste an image that has transparent pixels, pass the Image object again as the third argument.
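The effect of the mask argument can be seen on a tiny in-memory example. This is a sketch that assumes Pillow is installed; the 4×4 white background, the 2×2 overlay, and the make_base() helper are all made up for illustration. Because the overlay here is created with Image.new(), its transparent pixels are (0, 0, 0, 0) rather than white, but the point is the same: without a mask, the whole rectangle replaces what was underneath.

```python
from PIL import Image


def make_base():
    """Return a fresh 4x4 opaque white image to paste onto."""
    return Image.new('RGBA', (4, 4), 'white')


# A 2x2 overlay that is fully transparent except for one opaque red pixel.
overlay = Image.new('RGBA', (2, 2))              # defaults to (0, 0, 0, 0)
overlay.putpixel((0, 0), (255, 0, 0, 255))

without_mask = make_base()
without_mask.paste(overlay, (0, 0))              # copies the whole rectangle

with_mask = make_base()
with_mask.paste(overlay, (0, 0), overlay)        # overlay doubles as the mask

print(without_mask.getpixel((1, 1)))   # background overwritten
print(with_mask.getpixel((1, 1)))      # background preserved
print(with_mask.getpixel((0, 0)))      # the opaque pixel is pasted either way
```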

Resizing an Image

The resize() method is called on an Image object and returns a new Image object of the specified width and height. It accepts a two-integer tuple argument, representing the new width and height of the returned image. Enter the following into the interactive shell:

➊ >>> width, height = catIm.size
➋ >>> quartersizedIm = catIm.resize((int(width / 2), int(height / 2)))
  >>> quartersizedIm.save('quartersized.png')
➌ >>> svelteIm = catIm.resize((width, height + 300))
  >>> svelteIm.save('svelte.png')

Here we assign the two values in the catIm.size tuple to the variables width and height ➊. Using width and height instead of catIm.size[0] and catIm.size[1] makes the rest of the code more readable.

The first resize() call passes int(width / 2) for the new width and int(height / 2) for the new height ➋, so the Image object returned from resize() will be half the width and height of the original image, or one-quarter of the original image size overall. The resize() method accepts only integers in its tuple argument, which is why you needed to wrap both divisions by 2 in an int() call.

This resizing keeps the same proportions for the width and height. But the new width and height passed to resize() do not have to be proportional to the original image. The svelteIm variable contains an Image object that has the original width but a height that is 300 pixels taller ➌, giving Zophie a more slender look.

Note that the resize() method does not edit the Image object in place but instead returns a new Image object.

Rotating and Flipping Images

Images can be rotated with the rotate() method, which returns a new Image object of the rotated image and leaves the original Image object unchanged. The argument to rotate() is a single integer or float representing the number of degrees to rotate the image counterclockwise. Enter the following into the interactive shell:

>>> catIm.rotate(90).save('rotated90.png')
>>> catIm.rotate(180).save('rotated180.png')
>>> catIm.rotate(270).save('rotated270.png')

Note how you can chain method calls by calling save() directly on the Image object returned from rotate(). The first rotate() and save() call makes a new Image object representing the image rotated counterclockwise by 90 degrees and saves the rotated image to rotated90.png. The second and third calls do the same, but with 180 degrees and 270 degrees. The results look like Figure 17-7.

Figure 17-7. The original image (left) and the image rotated counterclockwise by 90, 180, and 270 degrees

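Notice that the width and height of the image change when the image is rotated 90 or 270 degrees. (This reflects the Pillow version used for this book; more recent releases appear to keep the original dimensions of non-square images unless expand=True is passed, so check the behavior of your installed version.) If you rotate an image by some other amount, the original dimensions of the image are maintained. On Windows, a black background is used to fill in any gaps made by the rotation, like in Figure 17-8. On OS X, transparent pixels are used for the gaps instead.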

The rotate() method has an optional expand keyword argument that can be set to True to enlarge the dimensions of the image to fit the entire rotated new image. For example, enter the following into the interactive shell:

>>> catIm.rotate(6).save('rotated6.png')
>>> catIm.rotate(6, expand=True).save('rotated6_expanded.png')

The first call rotates the image 6 degrees and saves it to rotated6.png (see the image on the left of Figure 17-8). The second call rotates the image 6 degrees with expand set to True and saves it to rotated6_expanded.png (see the image on the right of Figure 17-8).

Figure 17-8. The image rotated 6 degrees normally (left) and with expand=True (right)

You can also get a “mirror flip” of an image with the transpose() method. You must pass either Image.FLIP_LEFT_RIGHT or Image.FLIP_TOP_BOTTOM to the transpose() method. Enter the following into the interactive shell:

>>> catIm.transpose(Image.FLIP_LEFT_RIGHT).save('horizontal_flip.png')
>>> catIm.transpose(Image.FLIP_TOP_BOTTOM).save('vertical_flip.png')

Like rotate(), transpose() creates a new Image object. Here we pass Image.FLIP_LEFT_RIGHT to flip the image horizontally and then save the result to horizontal_flip.png. To flip the image vertically, we pass Image.FLIP_TOP_BOTTOM and save to vertical_flip.png. The results look like Figure 17-9.

Figure 17-9. The original image (left), horizontal flip (center), and vertical flip (right)

Changing Individual Pixels

The color of an individual pixel can be retrieved or set with the getpixel() and putpixel() methods. These methods both take a tuple representing the x- and y-coordinates of the pixel. The putpixel() method also takes an additional tuple argument for the color of the pixel. This color argument is a four-integer RGBA tuple or a three-integer RGB tuple. Enter the following into the interactive shell:

➊ >>> im = Image.new('RGBA', (100, 100))
➋ >>> im.getpixel((0, 0))
  (0, 0, 0, 0)
➌ >>> for x in range(100):
          for y in range(50):
➍             im.putpixel((x, y), (210, 210, 210))

  >>> from PIL import ImageColor
➎ >>> for x in range(100):
          for y in range(50, 100):
➏             im.putpixel((x, y), ImageColor.getcolor('darkgray', 'RGBA'))

  >>> im.getpixel((0, 0))
  (210, 210, 210, 255)
  >>> im.getpixel((0, 50))
  (169, 169, 169, 255)
  >>> im.save('putPixel.png')

At ➊ we make a new image that is a 100×100 transparent square. Calling getpixel() on some coordinates in this image returns (0, 0, 0, 0) because the image is transparent ➋. To color pixels in this image, we can use nested for loops to go through all the pixels in the top half of the image ➌ and color each pixel using putpixel() ➍. Here we pass putpixel() the RGB tuple (210, 210, 210), a light gray.

Say we want to color the bottom half of the image dark gray but don’t know the RGB tuple for dark gray. The putpixel() method doesn’t accept a standard color name like 'darkgray', so you have to use ImageColor.getcolor() to get a color tuple from 'darkgray'. Loop through the pixels in the bottom half of the image ➎ and pass putpixel() the return value of ImageColor.getcolor() ➏, and you should now have an image that is light gray in its top half and dark gray in the bottom half, as shown in Figure 17-10. You can call getpixel() on some coordinates to confirm that the color at any given pixel is what you expect. Finally, save the image to putPixel.png.

Figure 17-10. The putPixel.png image

Of course, drawing one pixel at a time onto an image isn’t very convenient. If you need to draw shapes, use the ImageDraw functions explained later in this chapter.

Project: Adding a Logo

Say you have the boring job of resizing thousands of images and adding a small logo watermark to the corner of each. Doing this with a basic graphics program such as Paintbrush or Paint would take forever. A fancier graphics application such as Photoshop can do batch processing, but that software costs hundreds of dollars. Let’s write a script to do it instead.

Say that Figure 17-11 is the logo you want to add to the bottom-right corner of each image: a black cat icon with a white border, with the rest of the image transparent.

Figure 17-11. The logo to be added to the image.

At a high level, here’s what the program should do:

Load the logo image.
Loop over all .png and .jpg files in the working directory.
Check whether the image is wider or taller than 300 pixels.
If so, reduce the width or height (whichever is larger) to 300 pixels and scale down the other dimension proportionally.
Paste the logo image into the corner.
Save the altered images to another folder.

This means the code will need to do the following:

Open the catlogo.png file as an Image object.
Loop over the strings returned from os.listdir('.').
Get the width and height of the image from the size attribute.
Calculate the new width and height of the resized image.
Call the resize() method to resize the image.
Call the paste() method to paste the logo.
Call the save() method to save the changes, using the original filename.

Step 1: Open the Logo Image

For this project, open a new file editor window, enter the following code, and save it as resizeAndAddLogo.py:

  #! python3
  # resizeAndAddLogo.py - Resizes all images in current working directory to fit
  # in a 300x300 square, and adds catlogo.png to the lower-right corner.

  import os
  from PIL import Image

➊ SQUARE_FIT_SIZE = 300
➋ LOGO_FILENAME = 'catlogo.png'

➌ logoIm = Image.open(LOGO_FILENAME)
➍ logoWidth, logoHeight = logoIm.size

  # TODO: Loop over all files in the working directory.

  # TODO: Check if image needs to be resized.

  # TODO: Calculate the new width and height to resize to.

  # TODO: Resize the image.

  # TODO: Add the logo.

  # TODO: Save changes.

By setting up the SQUARE_FIT_SIZE ➊ and LOGO_FILENAME ➋ constants at the start of the program, we’ve made it easy to change the program later. Say the logo that you’re adding isn’t the cat icon, or say you’re reducing the output images’ largest dimension to something other than 300 pixels. With these constants at the start of the program, you can just open the code, change those values once, and you’re done. (Or you can make it so that the values for these constants are taken from the command line arguments.) Without these constants, you’d instead have to search the code for all instances of 300 and 'catlogo.png' and replace them with the values for your new project. In short, using constants makes your program more generalized.

The logo Image object is returned from Image.open() ➌. For readability, logoWidth and logoHeight are assigned to the values from logoIm.size ➍.

The rest of the program is a skeleton of TODO comments for now.

Step 2: Loop Over All Files and Open Images

Now you need to find every .png file and .jpg file in the current working directory. Note that you don’t want to add the logo image to the logo image itself, so the program should skip any image with a filename that’s the same as LOGO_FILENAME. Add the following to your code:

  #! python3
  # resizeAndAddLogo.py - Resizes all images in current working directory to fit
  # in a 300x300 square, and adds catlogo.png to the lower-right corner.

  import os
  from PIL import Image

  --snip--

  os.makedirs('withLogo', exist_ok=True)
  # Loop over all files in the working directory.
➊ for filename in os.listdir('.'):
➋     if not (filename.endswith('.png') or filename.endswith('.jpg')) \
         or filename == LOGO_FILENAME:
➌         continue    # skip non-image files and the logo file itself

➍     im = Image.open(filename)
      width, height = im.size

  --snip--

First, the os.makedirs() call creates a withLogo folder to store the finished images with logos, instead of overwriting the original image files. The exist_ok=True keyword argument will keep os.makedirs() from raising an exception if withLogo already exists.

While looping through all the files in the working directory with os.listdir('.') ➊, the long if statement ➋ checks whether each filename doesn’t end with .png or .jpg. If so, or if the file is the logo image itself, then the loop should skip it and use continue ➌ to go to the next file. If filename does end with '.png' or '.jpg' (and isn’t the logo file), you can open it as an Image object ➍ and set width and height.
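The skip condition is easier to reason about when it is flipped around into “which files do we keep?” The following is only a sketch in plain Python; the function name should_process() and the sample filenames are made up for this illustration, and the test is the logical opposite of the if statement above.

```python
LOGO_FILENAME = 'catlogo.png'


def should_process(filename, logo_filename=LOGO_FILENAME):
    """Return True for .png/.jpg files other than the logo itself.

    This is the negation of the program's skip condition:
    not (A or B) or C   becomes   (A or B) and not C.
    """
    is_image = filename.endswith('.png') or filename.endswith('.jpg')
    return is_image and filename != logo_filename


names = ['zophie.png', 'catlogo.png', 'notes.txt', 'photo.jpg', 'zophie.PNG']
print([name for name in names if should_process(name)])
```

Note that zophie.PNG is skipped because endswith() is case sensitive, which is one of the issues the practice projects at the end of this chapter ask you to fix.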

Step 3: Resize the Images

The program should resize the image only if the width or height is larger than SQUARE_FIT_SIZE (300 pixels, in this case), so put all of the resizing code inside an if statement that checks the width and height variables. Add the following code to your program:

  #! python3
  # resizeAndAddLogo.py - Resizes all images in current working directory to fit
  # in a 300x300 square, and adds catlogo.png to the lower-right corner.

  import os
  from PIL import Image

  --snip--

      # Check if image needs to be resized.
      if width > SQUARE_FIT_SIZE or height > SQUARE_FIT_SIZE:
          # Calculate the new width and height to resize to.
          if width > height:
➊             height = int((SQUARE_FIT_SIZE / width) * height)
              width = SQUARE_FIT_SIZE
          else:
➋             width = int((SQUARE_FIT_SIZE / height) * width)
              height = SQUARE_FIT_SIZE

          # Resize the image.
          print('Resizing %s…' % (filename))
➌         im = im.resize((width, height))

  --snip--

If the image does need to be resized, you need to find out whether it is a wide or tall image. If width is greater than height, then the height should be reduced by the same proportion that the width would be reduced ➊. This proportion is the SQUARE_FIT_SIZE value divided by the current width. The new height value is this proportion multiplied by the current height value. Since the division operator returns a float value and resize() requires the dimensions to be integers, remember to convert the result to an integer with the int() function. Finally, the new width value will simply be set to SQUARE_FIT_SIZE.

If the height is greater than or equal to the width (both cases are handled in the else clause), then the same calculation is done, except with the height and width variables swapped ➋.

Once width and height contain the new image dimensions, pass them to the resize() method and store the returned Image object in im ➌.
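The proportional calculation can be tried on its own with a few sample sizes. This is a sketch in plain Python, not the program’s code: the function name fit_within_square() is made up, and it uses integer arithmetic (multiply first, then floor-divide with //) as a variant of the int() conversion above so that no floating-point rounding is involved.

```python
def fit_within_square(width, height, limit=300):
    """Return (width, height) scaled down to fit a limit x limit square.

    Images that already fit are returned unchanged.
    """
    if not (width > limit or height > limit):
        return (width, height)
    if width > height:
        # Wide image: width becomes the limit, height shrinks proportionally.
        return (limit, height * limit // width)
    # Tall or square image: height becomes the limit.
    return (width * limit // height, limit)


print(fit_within_square(816, 1088))   # zophie.png, a tall image
print(fit_within_square(1088, 816))   # the same photo turned sideways
print(fit_within_square(400, 200))    # only the width exceeds the limit
print(fit_within_square(200, 100))    # already small enough
```

For the 816×1088 Zophie image, the new width is 816 × 300 / 1088 = 225, giving a 225×300 result.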

Step 4: Add the Logo and Save the Changes

Whether or not the image was resized, the logo should still be pasted to the bottom-right corner. Where exactly the logo should be pasted depends on both the size of the image and the size of the logo. Figure 17-12 shows how to calculate the pasting position. The left coordinate for where to paste the logo will be the image width minus the logo width; the top coordinate for where to paste the logo will be the image height minus the logo height.

Figure 17-12. The left and top coordinates for placing the logo in the bottom-right corner should be the image width/height minus the logo width/height.
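As a quick worked example of that subtraction, here is a sketch in plain Python. The function name logo_position() is made up, and the 50×40 logo size is a hypothetical value chosen for illustration (the chapter does not state the dimensions of catlogo.png).

```python
def logo_position(image_size, logo_size):
    """Return the (left, top) coordinates that put the logo
    flush against the bottom-right corner of the image."""
    image_width, image_height = image_size
    logo_width, logo_height = logo_size
    return (image_width - logo_width, image_height - logo_height)


# A 225x300 image with a hypothetical 50x40 logo.
print(logo_position((225, 300), (50, 40)))
```

Here the logo’s top-left corner lands at (175, 260), so its bottom-right corner lines up with the image’s corner at (225, 300).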

After your code pastes the logo into the image, it should save the modified Image object. Add the following to your program:

  #! python3
  # resizeAndAddLogo.py - Resizes all images in current working directory to fit
  # in a 300x300 square, and adds catlogo.png to the lower-right corner.

  import os
  from PIL import Image

  --snip--

      # Check if image needs to be resized.
      --snip--

      # Add the logo.
➊     print('Adding logo to %s…' % (filename))
➋     im.paste(logoIm, (width - logoWidth, height - logoHeight), logoIm)

      # Save changes.
➌     im.save(os.path.join('withLogo', filename))

The new code prints a message telling the user that the logo is being added ➊, pastes logoIm onto im at the calculated coordinates ➋, and saves the changes to a filename in the withLogo directory ➌. When you run this program with the zophie.png file as the only image in the working directory, the output will look like this:

Resizing zophie.png…
Adding logo to zophie.png…

The image zophie.png will be changed to a 225×300-pixel image that looks like Figure 17-13. Remember that the paste() method will not preserve the logo’s transparent pixels unless you also pass logoIm as the third argument. This program can automatically resize and “logo-ify” hundreds of images in just a couple of minutes.

Figure 17-13. The image zophie.png resized and the logo added (left). If you forget the third argument, the transparent pixels in the logo will be copied as solid white pixels (right).

Ideas for Similar Programs

Being able to composite images or modify image sizes in a batch can be useful in many applications. You could write similar programs to do the following:

Add text or a website URL to images.
Add timestamps to images.
Copy or move images into different folders based on their sizes.
Add a mostly transparent watermark to an image to prevent others from copying it.

Drawing on Images

If you need to draw lines, rectangles, circles, or other simple shapes on an image, use Pillow’s ImageDraw module. Enter the following into the interactive shell:

>>> from PIL import Image, ImageDraw
>>> im = Image.new('RGBA', (200, 200), 'white')
>>> draw = ImageDraw.Draw(im)

First, we import Image and ImageDraw. Then we create a new image, in this case, a 200×200 white image, and store the Image object in im. We pass the Image object to the ImageDraw.Draw() function to receive an ImageDraw object. This object has several methods for drawing shapes and text onto an Image object. Store the ImageDraw object in a variable like draw so you can use it easily in the following example.

Drawing Shapes

The following ImageDraw methods draw various kinds of shapes on the image. The fill and outline parameters for these methods are optional and will default to white if left unspecified.

Points

The point(xy, fill) method draws individual pixels. The xy argument represents a list of the points you want to draw. The list can be a list of x- and y-coordinate tuples, such as [(x, y), (x, y), ...], or a list of x- and y-coordinates without tuples, such as [x1, y1, x2, y2, ...]. The fill argument is the color of the points and is either an RGBA tuple or a string of a color name, such as 'red'. The fill argument is optional.

Lines

The line(xy, fill, width) method draws a line or series of lines. xy is either a list of tuples, such as [(x, y), (x, y), ...], or a list of integers, such as [x1, y1, x2, y2, ...]. Each point is one of the connecting points on the lines you’re drawing. The optional fill argument is the color of the lines, as an RGBA tuple or color name. The optional width argument is the width of the lines and defaults to 1 if left unspecified.

Rectangles

The rectangle(xy, fill, outline) method draws a rectangle. The xy argument is a box tuple of the form (left, top, right, bottom). The left and top values specify the x- and y-coordinates of the upper-left corner of the rectangle, while right and bottom specify the lower-right corner. The optional fill argument is the color that will fill the inside of the rectangle. The optional outline argument is the color of the rectangle’s outline.

Ellipses

The ellipse(xy, fill, outline) method draws an ellipse. If the width and height of the ellipse are identical, this method will draw a circle. The xy argument is a box tuple (left, top, right, bottom) that represents a box that precisely contains the ellipse. The optional fill argument is the color of the inside of the ellipse, and the optional outline argument is the color of the ellipse’s outline.

Polygons

The polygon(xy, fill, outline) method draws an arbitrary polygon. The xy argument is a list of tuples, such as [(x, y), (x, y), ...], or integers, such as [x1, y1, x2, y2, ...], representing the connecting points of the polygon’s sides. The last pair of coordinates will be automatically connected to the first pair. The optional fill argument is the color of the inside of the polygon, and the optional outline argument is the color of the polygon’s outline.

Drawing Example

Enter the following into the interactive shell:

  >>> from PIL import Image, ImageDraw
  >>> im = Image.new('RGBA', (200, 200), 'white')
  >>> draw = ImageDraw.Draw(im)
➊ >>> draw.line([(0, 0), (199, 0), (199, 199), (0, 199), (0, 0)], fill='black')
➋ >>> draw.rectangle((20, 30, 60, 60), fill='blue')
➌ >>> draw.ellipse((120, 30, 160, 60), fill='red')
➍ >>> draw.polygon(((57, 87), (79, 62), (94, 85), (120, 90), (103, 113)),
      fill='brown')
➎ >>> for i in range(100, 200, 10):
          draw.line([(i, 0), (200, i - 100)], fill='green')

  >>> im.save('drawing.png')

After making an Image object for a 200×200 white image, passing it to ImageDraw.Draw() to get an ImageDraw object, and storing the ImageDraw object in draw, you can call drawing methods on draw. Here we make a thin, black outline at the edges of the image ➊, a blue rectangle with its top-left corner at (20, 30) and bottom-right corner at (60, 60) ➋, a red ellipse defined by a box from (120, 30) to (160, 60) ➌, a brown polygon with five points ➍, and a pattern of green lines drawn with a for loop ➎. The resulting drawing.png file will look like Figure 17-14.

Figure 17-14. The resulting drawing.png image
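To see where the pattern of green lines comes from, it can help to list the endpoints that the for loop passes to draw.line(). This is a plain-Python sketch that needs no image; the variable name segments is made up for this illustration.

```python
# Each pass through the loop draws one line from a point on the top edge,
# (i, 0), to a point on the right edge, (200, i - 100).
segments = [((i, 0), (200, i - 100)) for i in range(100, 200, 10)]

for start, end in segments:
    print(start, '->', end)
```

As i grows from 100 to 190, the starting point slides right along the top edge while the ending point slides down the right edge, which produces the fan of ten lines in the upper-right corner.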

There are several other shape-drawing methods for ImageDraw objects. The full documentation is available at http://pillow.readthedocs.org/en/latest/reference/ImageDraw.html.

Drawing Text

The ImageDraw object also has a text() method for drawing text onto an image. The text() method takes four arguments: xy, text, fill, and font.

The xy argument is a two-integer tuple specifying the upper-left corner of the text box.
The text argument is the string of text you want to write.
The optional fill argument is the color of the text.
The optional font argument is an ImageFont object, used to set the typeface and size of the text. This is described in more detail in the next section.

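Since it’s often hard to know in advance what size a block of text will be in a given font, the ImageDraw module also offers a textsize() method. Its first argument is the string of text you want to measure, and its second argument is an optional ImageFont object. The textsize() method will then return a two-integer tuple of the width and height that the text in the given font would be if it were written onto the image. You can use this width and height to help you calculate exactly where you want to put the text on your image. (In recent Pillow releases, textsize() has been removed in favor of the textbbox() and textlength() methods, so check the documentation for the version you have installed.)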

The first three arguments for text() are straightforward. Before we use text() to draw text onto an image, let’s look at the optional fourth argument, the ImageFont object.

Both text() and textsize() take an optional ImageFont object as their final arguments. To create one of these objects, first run the following:

>>> from PIL import ImageFont

Now that you’ve imported Pillow’s ImageFont module, you can call the ImageFont.truetype() function, which takes two arguments. The first argument is a string for the font’s TrueType file; this is the actual font file that lives on your hard drive. A TrueType file has the .ttf file extension and can usually be found in the following folders:

On Windows: C:\Windows\Fonts
On OS X: /Library/Fonts and /System/Library/Fonts
On Linux: /usr/share/fonts/truetype

You don’t actually need to enter these paths as part of the TrueType file string because Python knows to automatically search for fonts in these directories. But Python will display an error if it is unable to find the font you specified.

The second argument to ImageFont.truetype() is an integer for the font size in points (rather than, say, pixels). Keep in mind that Pillow creates PNG images that are 72 pixels per inch by default, and a point is 1/72 of an inch.

Enter the following into the interactive shell, replacing FONT_FOLDER with the actual folder name your operating system uses:

  >>> from PIL import Image, ImageDraw, ImageFont
  >>> import os
➊ >>> im = Image.new('RGBA', (200, 200), 'white')
➋ >>> draw = ImageDraw.Draw(im)
➌ >>> draw.text((20, 150), 'Hello', fill='purple')
  >>> fontsFolder = 'FONT_FOLDER'    # e.g. '/Library/Fonts'
➍ >>> arialFont = ImageFont.truetype(os.path.join(fontsFolder, 'arial.ttf'), 32)
➎ >>> draw.text((100, 150), 'Howdy', fill='gray', font=arialFont)
  >>> im.save('text.png')

After importing Image, ImageDraw, ImageFont, and os, we make an Image object for a new 200×200 white image ➊ and make an ImageDraw object from the Image object ➋. We use text() to draw Hello at (20, 150) in purple ➌. We didn’t pass the optional fourth argument in this text() call, so the typeface and size of this text aren’t customized.

To set a typeface and size, we first store the folder name (like /Library/Fonts) in fontsFolder. Then we call ImageFont.truetype(), passing it the .ttf file for the font we want, followed by an integer font size ➍. Store the Font object you get from ImageFont.truetype() in a variable like arialFont, and then pass the variable to text() in the final keyword argument. The text() call at ➎ draws Howdy at (100, 150) in gray in 32-point Arial.

The resulting text.png file will look like Figure 17-15.

Figure 17-15. The resulting text.png image

Summary

Images consist of a collection of pixels, and each pixel has an RGBA value for its color and is addressable by x- and y-coordinates. Two common image formats are JPEG and PNG. The pillow module can handle both of these image formats and others.

When an image is loaded into an Image object, its width and height dimensions are stored as a two-integer tuple in the size attribute. Objects of the Image data type also have methods for common image manipulations: crop(), copy(), paste(), resize(), rotate(), and transpose(). To save the Image object to an image file, call the save() method.

If you want your program to draw shapes onto an image, use ImageDraw methods to draw points, lines, rectangles, ellipses, and polygons. The module also provides methods for drawing text in a typeface and font size of your choosing.

Although advanced (and expensive) applications such as Photoshop provide automatic batch processing features, you can use Python scripts to do many of the same modifications for free. In the previous chapters, you wrote Python programs to deal with plaintext files, spreadsheets, PDFs, and other formats. With the pillow module, you’ve extended your programming powers to processing images as well!

Practice Questions

Q: 1. What is an RGBA value?

Q: 2. How can you get the RGBA value of 'CornflowerBlue' from the Pillow module?

Q: 3. What is a box tuple?

Q: 4. What function returns an Image object for, say, an image file named zophie.png?

Q: 5. How can you find out the width and height of an Image object’s image?

Q: 6. What method would you call to get an Image object for a 100×100 image, excluding the lower left quarter of it?

Q: 7. After making changes to an Image object, how could you save it as an image file?

Q: 8. What module contains Pillow’s shape-drawing code?

Q: 9. Image objects do not have drawing methods. What kind of object does? How do you get this kind of object?

Practice Projects

For practice, write programs that do the following.

Extending and Fixing the Chapter Project Programs

The resizeAndAddLogo.py program in this chapter works with PNG and JPEG files, but Pillow supports many more formats than just these two. Extend resizeAndAddLogo.py to process GIF and BMP images as well.

Another small issue is that the program modifies PNG and JPEG files only if their file extensions are set in lowercase. For example, it will process zophie.png but not zophie.PNG. Change the code so that the file extension check is case insensitive.

Finally, the logo added to the bottom-right corner is meant to be just a small mark, but if the image is about the same size as the logo itself, the result will look like Figure 17-16. Modify resizeAndAddLogo.py so that the image must be at least twice the width and height of the logo image before the logo is pasted. Otherwise, it should skip adding the logo.

Figure 17-16. When the image isn’t much larger than the logo, the results look ugly.

IdentifyingPhotoFoldersontheHardDriveIhaveabadhabitoftransferringfilesfrommydigitalcameratotemporaryfolderssomewhereontheharddriveandthenforgettingaboutthesefolders.Itwouldbenicetowriteaprogramthatcouldscantheentireharddriveandfindtheseleftover“photofolders.”

Writeaprogramthatgoesthrougheveryfolderonyourharddriveandfindspotentialphotofolders.Ofcourse,firstyou’llhavetodefinewhatyouconsidera“photofolder”tobe;let’ssaythatit’sanyfolderwheremorethanhalfofthefilesarephotos.Andhowdoyoudefinewhatfilesarephotos?

First,aphotofilemusthavethefileextension.pngor.jpg.Also,photosarelargeimages;aphotofile’swidthandheightmustbothbelargerthan500pixels.Thisisasafebet,sincemostdigitalcameraphotosareseveralthousandpixelsinwidthandheight.

As a hint, here’s a rough skeleton of what this program might look like:

#! python3
# Import modules and write comments to describe this program.

for foldername, subfolders, filenames in os.walk('C:\\'):
    numPhotoFiles = 0
    numNonPhotoFiles = 0
    for filename in filenames:
        # Check if file extension isn't .png or .jpg.
        if TODO:
            numNonPhotoFiles += 1
            continue    # skip to next filename

        # Open image file using Pillow.

        # Check if width & height are larger than 500.
        if TODO:
            # Image is large enough to be considered a photo.
            numPhotoFiles += 1
        else:
            # Image is too small to be a photo.
            numNonPhotoFiles += 1

    # If more than half of files were photos,
    # print the absolute path of the folder.
    if TODO:
        print(TODO)

When the program runs, it should print the absolute path of any photo folders to the screen.

Custom Seating Cards

Chapter 13 included a practice project to create custom invitations from a list of guests in a plaintext file. As an additional project, use the pillow module to create images for custom seating cards for your guests. For each of the guests listed in the guests.txt file from the resources at http://nostarch.com/automatestuff/, generate an image file with the guest name and some flowery decoration. A public domain flower image is available in the resources at http://nostarch.com/automatestuff/.

To ensure that each seating card is the same size, add a black rectangle on the edges of the invitation image so that when the image is printed out, there will be a guideline for cutting. The PNG files that Pillow produces are set to 72 pixels per inch, so a 4×5-inch card would require a 288×360-pixel image.
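The inches-to-pixels arithmetic above can be sketched as a small helper. The function name cardSizeInPixels and its ppi parameter are illustrative assumptions, not part of Pillow; the sketch only multiplies each dimension by the pixels-per-inch value.

```python
def cardSizeInPixels(widthInches, heightInches, ppi=72):
    """Return the (width, height) in pixels of a card measured in inches."""
    return (round(widthInches * ppi), round(heightInches * ppi))

cardSize = cardSizeInPixels(4, 5)
print(cardSize)  # (288, 360)
```

A tuple like this could then serve as the size for a new image, should you choose to build the card that way.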

Chapter 18. Controlling the Keyboard and Mouse with GUI Automation

Knowing various Python modules for editing spreadsheets, downloading files, and launching programs is useful, but sometimes there just aren’t any modules for the applications you need to work with. The ultimate tools for automating tasks on your computer are programs you write that directly control the keyboard and mouse. These programs can control other applications by sending them virtual keystrokes and mouse clicks, just as if you were sitting at your computer and interacting with the applications yourself. This technique is known as graphical user interface automation, or GUI automation for short. With GUI automation, your programs can do anything that a human user sitting at the computer can do, except spill coffee on the keyboard.

Think of GUI automation as programming a robotic arm. You can program the robotic arm to type at your keyboard and move your mouse for you. This technique is particularly useful for tasks that involve a lot of mindless clicking or filling out of forms.

The pyautogui module has functions for simulating mouse movements, button clicks, and scrolling the mouse wheel. This chapter covers only a subset of PyAutoGUI’s features; you can find the full documentation at http://pyautogui.readthedocs.org/.

Installing the pyautogui Module

The pyautogui module can send virtual keypresses and mouse clicks to Windows, OS X, and Linux. Depending on which operating system you’re using, you may have to install some other modules (called dependencies) before you can install PyAutoGUI.

On Windows, there are no other modules to install. On OS X, run sudo pip3 install pyobjc-framework-Quartz, sudo pip3 install pyobjc-core, and then sudo pip3 install pyobjc. On Linux, run sudo pip3 install python3-xlib and sudo apt-get install scrot. (Scrot is a screenshot program that PyAutoGUI uses.)

After these dependencies are installed, run pip install pyautogui (or pip3 on OS X and Linux) to install PyAutoGUI.

Appendix A has complete information on installing third-party modules. To test whether PyAutoGUI has been installed correctly, run import pyautogui from the interactive shell and check for any error messages.

Staying on Track

Before you jump into a GUI automation, you should know how to escape problems that may arise. Python can move your mouse and type keystrokes at an incredible speed. In fact, it might be too fast for other programs to keep up with. Also, if something goes wrong but your program keeps moving the mouse around, it will be hard to tell what exactly the program is doing or how to recover from the problem. Like the enchanted brooms from Disney’s The Sorcerer’s Apprentice, which kept filling (and then overfilling) Mickey’s tub with water, your program could get out of control even though it’s following your instructions perfectly. Stopping the program can be difficult if the mouse is moving around on its own, preventing you from clicking the IDLE window to close it. Fortunately, there are several ways to prevent or recover from GUI automation problems.

Shutting Down Everything by Logging Out

Perhaps the simplest way to stop an out-of-control GUI automation program is to log out, which will shut down all running programs. On Windows and Linux, the logout hotkey is CTRL-ALT-DEL. On OS X, it is ⌘-SHIFT-OPTION-Q. By logging out, you’ll lose any unsaved work, but at least you won’t have to wait for a full reboot of the computer.

Pauses and Fail-Safes

You can tell your script to wait after every function call, giving you a short window to take control of the mouse and keyboard if something goes wrong. To do this, set the pyautogui.PAUSE variable to the number of seconds you want it to pause. For example, after setting pyautogui.PAUSE = 1.5, every PyAutoGUI function call will wait one and a half seconds after performing its action. Non-PyAutoGUI instructions will not have this pause.

PyAutoGUI also has a fail-safe feature. Moving the mouse cursor to the upper-left corner of the screen will cause PyAutoGUI to raise the pyautogui.FailSafeException exception. Your program can either handle this exception with try and except statements or let the exception crash your program. Either way, the fail-safe feature will stop the program if you quickly move the mouse as far up and left as you can. You can disable this feature by setting pyautogui.FAILSAFE = False. Enter the following into the interactive shell:

>>> import pyautogui
>>> pyautogui.PAUSE = 1
>>> pyautogui.FAILSAFE = True

Here we import pyautogui and set pyautogui.PAUSE to 1 for a one-second pause after each function call. We set pyautogui.FAILSAFE to True to enable the fail-safe feature.
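As a minimal sketch of handling the fail-safe with try and except statements, the function below wraps a few mouse movements and stops cleanly when pyautogui.FailSafeException is raised. The function name moveInSquare and its repeats parameter are hypothetical, and the moveTo() function it calls is explained later in this chapter. The import is placed inside the function only so that defining it does not require a display.

```python
def moveInSquare(repeats=3):
    """Move the mouse in a square; return False if the fail-safe stopped it."""
    import pyautogui  # imported here so defining the function needs no display

    pyautogui.PAUSE = 1        # wait one second after each PyAutoGUI call
    pyautogui.FAILSAFE = True  # moving to the upper-left corner raises an exception
    try:
        for _ in range(repeats):
            pyautogui.moveTo(100, 100)
            pyautogui.moveTo(200, 100)
            pyautogui.moveTo(200, 200)
            pyautogui.moveTo(100, 200)
    except pyautogui.FailSafeException:
        print('Fail-safe triggered; stopping.')
        return False
    return True
```

Calling moveInSquare() and then slamming the mouse into the upper-left corner should print the message and return False instead of showing a traceback.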

Controlling Mouse Movement

In this section, you’ll learn how to move the mouse and track its position on the screen using PyAutoGUI, but first you need to understand how PyAutoGUI works with coordinates.

The mouse functions of PyAutoGUI use x- and y-coordinates. Figure 18-1 shows the coordinate system for the computer screen; it’s similar to the coordinate system used for images, discussed in Chapter 17. The origin, where x and y are both zero, is at the upper-left corner of the screen. The x-coordinates increase going to the right, and the y-coordinates increase going down. All coordinates are non-negative integers; there are no negative coordinates.

Figure 18-1. The coordinates of a computer screen with 1920×1080 resolution

Your resolution is how many pixels wide and tall your screen is. If your screen’s resolution is set to 1920×1080, then the coordinate for the upper-left corner will be (0, 0), and the coordinate for the bottom-right corner will be (1919, 1079).
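Because coordinates start at zero, the last valid coordinate is always one less than the resolution. The sketch below illustrates this with two hypothetical helpers, screenCorners() and isOnScreen(); neither is a PyAutoGUI function, and the 1920×1080 resolution is just the example from the text.

```python
def screenCorners(width, height):
    """Return the (x, y) coordinates of the four corners of the screen."""
    right, bottom = width - 1, height - 1
    return {
        'upper-left': (0, 0),
        'upper-right': (right, 0),
        'lower-left': (0, bottom),
        'lower-right': (right, bottom),
    }

def isOnScreen(x, y, width, height):
    """Return True if (x, y) is a valid pixel coordinate on the screen."""
    return 0 <= x < width and 0 <= y < height

corners = screenCorners(1920, 1080)
print(corners['lower-right'])              # (1919, 1079)
print(isOnScreen(1920, 1080, 1920, 1080))  # False
```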

The pyautogui.size() function returns a two-integer tuple of the screen’s width and height in pixels. Enter the following into the interactive shell:

>>> import pyautogui
>>> pyautogui.size()
(1920, 1080)
>>> width, height = pyautogui.size()

pyautogui.size() returns (1920, 1080) on a computer with a 1920×1080 resolution; depending on your screen’s resolution, your return value may be different. You can store the width and height from pyautogui.size() in variables like width and height for better readability in your programs.

Moving the Mouse

Now that you understand screen coordinates, let’s move the mouse. The pyautogui.moveTo() function will instantly move the mouse cursor to a specified position on the screen. Integer values for the x- and y-coordinates make up the function’s first and second arguments, respectively. An optional duration integer or float keyword argument specifies the number of seconds it should take to move the mouse to the destination. If you leave it out, the default is 0 for instantaneous movement. (All of the duration keyword arguments in PyAutoGUI functions are optional.) Enter the following into the interactive shell:

>>> import pyautogui
>>> for i in range(10):
        pyautogui.moveTo(100, 100, duration=0.25)
        pyautogui.moveTo(200, 100, duration=0.25)
        pyautogui.moveTo(200, 200, duration=0.25)
        pyautogui.moveTo(100, 200, duration=0.25)

This example moves the mouse cursor clockwise in a square pattern among the four coordinates provided a total of ten times. Each movement takes a quarter of a second, as specified by the duration=0.25 keyword argument. If you hadn’t passed a third argument to any of the pyautogui.moveTo() calls, the mouse cursor would have instantly teleported from point to point.

The pyautogui.moveRel() function moves the mouse cursor relative to its current position. The following example moves the mouse in the same square pattern, except it begins the square from wherever the mouse happens to be on the screen when the code starts running:

>>> import pyautogui
>>> for i in range(10):
        pyautogui.moveRel(100, 0, duration=0.25)
        pyautogui.moveRel(0, 100, duration=0.25)
        pyautogui.moveRel(-100, 0, duration=0.25)
        pyautogui.moveRel(0, -100, duration=0.25)

pyautogui.moveRel() also takes three arguments: how many pixels to move horizontally to the right, how many pixels to move vertically downward, and (optionally) how long it should take to complete the movement. A negative integer for the first or second argument will cause the mouse to move left or upward, respectively.
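To see why the four relative moves trace a square and end where they began, you can add up the offsets yourself. The following sketch uses a hypothetical helper, positionsAfterMoves(), and an arbitrary starting point of (300, 400); it does not move the real mouse.

```python
def positionsAfterMoves(start, offsets):
    """Return the absolute positions reached by applying each offset in order."""
    x, y = start
    positions = []
    for xOffset, yOffset in offsets:
        x, y = x + xOffset, y + yOffset
        positions.append((x, y))
    return positions

squareOffsets = [(100, 0), (0, 100), (-100, 0), (0, -100)]
positions = positionsAfterMoves((300, 400), squareOffsets)
print(positions)
# [(400, 400), (400, 500), (300, 500), (300, 400)]
```

The last position equals the starting point, which is why the loop can repeat the same four calls ten times without drifting.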

Getting the Mouse Position

You can determine the mouse’s current position by calling the pyautogui.position() function, which will return a tuple of the mouse cursor’s x and y positions at the time of the function call. Enter the following into the interactive shell, moving the mouse around after each call:

>>> pyautogui.position()
(311, 622)
>>> pyautogui.position()
(377, 481)
>>> pyautogui.position()
(1536, 637)

Of course, your return values will vary depending on where your mouse cursor is.

Project: “Where Is the Mouse Right Now?”

Being able to determine the mouse position is an important part of setting up your GUI automation scripts. But it’s almost impossible to figure out the exact coordinates of a pixel just by looking at the screen. It would be handy to have a program that constantly displays the x- and y-coordinates of the mouse cursor as you move it around.

At a high level, here’s what your program should do:

Display the current x- and y-coordinates of the mouse cursor.

Update these coordinates as the mouse moves around the screen.

This means your code will need to do the following:

Call the position() function to fetch the current coordinates.

Erase the previously printed coordinates by printing \b backspace characters to the screen.

Handle the KeyboardInterrupt exception so the user can press CTRL-C to quit.

Open a new file editor window and save it as mouseNow.py.

Step 1: Import the Module

Start your program with the following:

#! python3
# mouseNow.py - Displays the mouse cursor's current position.
import pyautogui
print('Press Ctrl-C to quit.')
#TODO: Get and print the mouse coordinates.

The beginning of the program imports the pyautogui module and prints a reminder to the user that they have to press CTRL-C to quit.

Step 2: Set Up the Quit Code and Infinite Loop

You can use an infinite while loop to constantly print the current mouse coordinates from pyautogui.position(). As for the code that quits the program, you’ll need to catch the KeyboardInterrupt exception, which is raised whenever the user presses CTRL-C. If you don’t handle this exception, it will display an ugly traceback and error message to the user. Add the following to your program:

#! python3
# mouseNow.py - Displays the mouse cursor's current position.
import pyautogui
print('Press Ctrl-C to quit.')
try:
    while True:
        # TODO: Get and print the mouse coordinates.
➊ except KeyboardInterrupt:
➋     print('\nDone.')

To handle the exception, enclose the infinite while loop in a try statement. When the user presses CTRL-C, the program execution will move to the except clause ➊ and Done. will be printed on a new line ➋.

Step 3: Get and Print the Mouse Coordinates

The code inside the while loop should get the current mouse coordinates, format them to look nice, and print them. Add the following code to the inside of the while loop:

#! python3
# mouseNow.py - Displays the mouse cursor's current position.
import pyautogui
print('Press Ctrl-C to quit.')
--snip--
        # Get and print the mouse coordinates.
        x, y = pyautogui.position()
        positionStr = 'X: ' + str(x).rjust(4) + ' Y: ' + str(y).rjust(4)
--snip--

Using the multiple assignment trick, the x and y variables are given the values of the two integers returned in the tuple from pyautogui.position(). By passing x and y to the str() function, you can get string forms of the integer coordinates. The rjust() string method will right-justify them so that they take up the same amount of space, whether the coordinate has one, two, three, or four digits. Concatenating the right-justified string coordinates with 'X: ' and ' Y: ' labels gives us a neatly formatted string, which will be stored in positionStr.

At the end of your program, add the following code:

#! python3
# mouseNow.py - Displays the mouse cursor's current position.
--snip--
        print(positionStr, end='')
➊       print('\b' * len(positionStr), end='', flush=True)

This actually prints positionStr to the screen. The end='' keyword argument to print() prevents the default newline character from being added to the end of the printed line. It’s possible to erase text you’ve already printed to the screen, but only for the most recent line of text. Once you print a newline character, you can’t erase anything printed before it.

To erase text, print the \b backspace escape character. This special character erases a character at the end of the current line on the screen. The line at ➊ uses string replication to produce a string with as many \b characters as the length of the string stored in positionStr, which has the effect of erasing the positionStr string that was last printed.

For a technical reason beyond the scope of this book, always pass flush=True to print() calls that print \b backspace characters. Otherwise, the screen might not update the text as desired.

Since the while loop repeats so quickly, the user won’t actually notice that you’re deleting and reprinting the whole number on the screen. For example, if the x-coordinate is 563 and the mouse moves one pixel to the right, it will look like only the 3 in 563 is changed to a 4.
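The formatting and erasing steps can be tried on their own, without a mouse. This sketch pulls them into two hypothetical helpers, formatPosition() and eraseSequence(); the exact spacing inside the 'X: ' and ' Y: ' labels is an assumption. It shows that rjust(4) makes every formatted string the same length, so the same number of backspaces always erases the previous one.

```python
def formatPosition(x, y):
    """Return a fixed-width string such as 'X:  290 Y:  424'."""
    return 'X: ' + str(x).rjust(4) + ' Y: ' + str(y).rjust(4)

def eraseSequence(text):
    """Return one backspace character for every character in text."""
    return '\b' * len(text)

line = formatPosition(290, 424)
print(repr(line))                                 # 'X:  290 Y:  424'
print(len(line) == len(formatPosition(5, 1079)))  # True
print(len(eraseSequence(line)))                   # 15
```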

When you run the program, there will be only two lines printed. They should look something like this:

Press Ctrl-C to quit.
X:  290 Y:  424

The first line displays the instruction to press CTRL-C to quit. The second line with the mouse coordinates will change as you move the mouse around the screen. Using this program, you’ll be able to figure out the mouse coordinates for your GUI automation scripts.

Controlling Mouse Interaction

Now that you know how to move the mouse and figure out where it is on the screen, you’re ready to start clicking, dragging, and scrolling.

Clicking the Mouse

To send a virtual mouse click to your computer, call the pyautogui.click() function. By default, this click uses the left mouse button and takes place wherever the mouse cursor is currently located. You can pass x- and y-coordinates of the click as optional first and second arguments if you want it to take place somewhere other than the mouse’s current position.

If you want to specify which mouse button to use, include the button keyword argument, with a value of 'left', 'middle', or 'right'. For example, pyautogui.click(100, 150, button='left') will click the left mouse button at the coordinates (100, 150), while pyautogui.click(200, 250, button='right') will perform a right-click at (200, 250).

Enter the following into the interactive shell:

>>> import pyautogui
>>> pyautogui.click(10, 5)

You should see the mouse pointer move to near the top-left corner of your screen and click once. A full “click” is defined as pushing a mouse button down and then releasing it back up without moving the cursor. You can also perform a click by calling pyautogui.mouseDown(), which only pushes the mouse button down, and pyautogui.mouseUp(), which only releases the button. These functions have the same arguments as click(), and in fact, the click() function is just a convenient wrapper around these two function calls.

As a further convenience, the pyautogui.doubleClick() function will perform two clicks with the left mouse button, while the pyautogui.rightClick() and pyautogui.middleClick() functions will perform a click with the right and middle mouse buttons, respectively.

Dragging the Mouse

Dragging means moving the mouse while holding down one of the mouse buttons. For example, you can move files between folders by dragging the folder icons, or you can move appointments around in a calendar app.

PyAutoGUI provides the pyautogui.dragTo() and pyautogui.dragRel() functions to drag the mouse cursor to a new location or a location relative to its current one. The arguments for dragTo() and dragRel() are the same as moveTo() and moveRel(): the x-coordinate/horizontal movement, the y-coordinate/vertical movement, and an optional duration of time. (OS X does not drag correctly when the mouse moves too quickly, so passing a duration keyword argument is recommended.)

To try these functions, open a graphics-drawing application such as Paint on Windows, Paintbrush on OS X, or GNU Paint on Linux. (If you don’t have a drawing application, you can use the online one at http://sumopaint.com/.) I will use PyAutoGUI to draw in these applications.

With the mouse cursor over the drawing application’s canvas and the Pencil or Brush tool selected, enter the following into a new file editor window and save it as spiralDraw.py:

  import pyautogui, time
➊ time.sleep(5)
➋ pyautogui.click()    # click to put drawing program in focus
  distance = 200
  while distance > 0:
➌     pyautogui.dragRel(distance, 0, duration=0.2)   # move right
➍     distance = distance - 5
➎     pyautogui.dragRel(0, distance, duration=0.2)   # move down
➏     pyautogui.dragRel(-distance, 0, duration=0.2)  # move left
      distance = distance - 5
      pyautogui.dragRel(0, -distance, duration=0.2)  # move up

When you run this program, there will be a five-second delay ➊ for you to move the mouse cursor over the drawing program’s window with the Pencil or Brush tool selected. Then spiralDraw.py will take control of the mouse and click to put the drawing program in focus ➋. A window is in focus when it has an active blinking cursor, and the actions you take (like typing or, in this case, dragging the mouse) will affect that window. Once the drawing program is in focus, spiralDraw.py draws a square spiral pattern like the one in Figure 18-2.

Figure 18-2. The results from the pyautogui.dragRel() example

The distance variable starts at 200, so on the first iteration of the while loop, the first dragRel() call drags the cursor 200 pixels to the right, taking 0.2 seconds ➌. distance is then decreased to 195 ➍, and the second dragRel() call drags the cursor 195 pixels down ➎. The third dragRel() call drags the cursor -195 horizontally (195 to the left) ➏, distance is decreased to 190, and the last dragRel() call drags the cursor 190 pixels up. On each iteration, the mouse is dragged right, down, left, and up, and distance is slightly smaller than it was in the previous iteration. By looping over this code, you can move the mouse cursor to draw a square spiral.
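You can check the walkthrough above by computing the offsets without dragging anything. The sketch below repeats the loop from spiralDraw.py but records each pair of offsets in a list instead of calling dragRel(); the function name spiralOffsets() and its step parameter are illustrative assumptions.

```python
def spiralOffsets(distance=200, step=5):
    """Return the (xOffset, yOffset) pairs that the spiral loop would drag."""
    offsets = []
    while distance > 0:
        offsets.append((distance, 0))    # drag right
        distance = distance - step
        offsets.append((0, distance))    # drag down
        offsets.append((-distance, 0))   # drag left
        distance = distance - step
        offsets.append((0, -distance))   # drag up
    return offsets

offsets = spiralOffsets()
print(offsets[:4])   # [(200, 0), (0, 195), (-195, 0), (0, -190)]
print(len(offsets))  # 80
```

The first four pairs match the first iteration described above, and the full spiral takes 80 drags in total.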

You could draw this spiral by hand (or rather, by mouse), but you’d have to work slowly to be so precise. PyAutoGUI can do it in a few seconds!

NOTE

You could have your code draw the image using the pillow module’s drawing functions (see Chapter 17 for more information). But using GUI automation allows you to make use of the advanced drawing tools that graphics programs can provide, such as gradients, different brushes, or the fill bucket.

Scrolling the Mouse

The final PyAutoGUI mouse function is scroll(), which you pass an integer argument for how many units you want to scroll the mouse up or down. The size of a unit varies for each operating system and application, so you’ll have to experiment to see exactly how far it scrolls in your particular situation. The scrolling takes place at the mouse cursor’s current position. Passing a positive integer scrolls up, and passing a negative integer scrolls down. Run the following in IDLE’s interactive shell while the mouse cursor is over the IDLE window:

>>> pyautogui.scroll(200)

You’ll see IDLE briefly scroll upward and then go back down. The downward scrolling happens because IDLE automatically scrolls down to the bottom after executing an instruction. Enter this code instead:

>>> import pyperclip
>>> numbers = ''
>>> for i in range(200):
        numbers = numbers + str(i) + '\n'

>>> pyperclip.copy(numbers)

This imports pyperclip and sets up an empty string, numbers. The code then loops through 200 numbers and adds each number to numbers, along with a newline. After pyperclip.copy(numbers), the clipboard will be loaded with 200 lines of numbers. Open a new file editor window and paste the text into it. This will give you a large text window to try scrolling in. Enter the following code into the interactive shell:

>>> import time, pyautogui
>>> time.sleep(5); pyautogui.scroll(100)

On the second line, you enter two commands separated by a semicolon, which tells Python to run the commands as if they were on separate lines. The only difference is that the interactive shell won’t prompt you for input between the two instructions. This is important for this example because we want the call to pyautogui.scroll() to happen automatically after the wait. (Note that while putting two commands on one line can be useful in the interactive shell, you should still have each instruction on a separate line in your programs.)

After pressing ENTER to run the code, you will have five seconds to click the file editor window to put it in focus. Once the five-second pause is over, the pyautogui.scroll() call will cause the file editor window to scroll up.

Working with the Screen

Your GUI automation programs don’t have to click and type blindly. PyAutoGUI has screenshot features that can create an image file based on the current contents of the screen. These functions can also return a Pillow Image object of the current screen’s appearance. If you’ve been skipping around in this book, you’ll want to read Chapter 17 and install the pillow module before continuing with this section.

On Linux computers, the scrot program needs to be installed to use the screenshot functions in PyAutoGUI. In a Terminal window, run sudo apt-get install scrot to install this program. If you’re on Windows or OS X, skip this step and continue with the section.

Getting a Screenshot

To take screenshots in Python, call the pyautogui.screenshot() function. Enter the following into the interactive shell:

>>> import pyautogui
>>> im = pyautogui.screenshot()

The im variable will contain the Image object of the screenshot. You can now call methods on the Image object in the im variable, just like any other Image object. Enter the following into the interactive shell:

>>> im.getpixel((0, 0))
(176, 176, 175)
>>> im.getpixel((50, 200))
(130, 135, 144)

Pass getpixel() a tuple of coordinates, like (0, 0) or (50, 200), and it’ll tell you the color of the pixel at those coordinates in your image. The return value from getpixel() is an RGB tuple of three integers for the amount of red, green, and blue in the pixel. (There is no fourth value for alpha, because screenshot images are fully opaque.) This is how your programs can “see” what is currently on the screen.

Analyzing the Screenshot

Say that one of the steps in your GUI automation program is to click a gray button. Before calling the click() function, you could take a screenshot and look at the pixel where the script is about to click. If it’s not the same gray as the gray button, then your program knows something is wrong. Maybe the window moved unexpectedly, or maybe a pop-up dialog has blocked the button. At this point, instead of continuing (and possibly wreaking havoc by clicking the wrong thing), your program can “see” that it isn’t clicking on the right thing and stop itself.

PyAutoGUI’s pixelMatchesColor() function will return True if the pixel at the given x- and y-coordinates on the screen matches the given color. The first and second arguments are integers for the x- and y-coordinates, and the third argument is a tuple of three integers for the RGB color the screen pixel must match. Enter the following into the interactive shell:

  >>> import pyautogui
  >>> im = pyautogui.screenshot()
➊ >>> im.getpixel((50, 200))
  (130, 135, 144)
➋ >>> pyautogui.pixelMatchesColor(50, 200, (130, 135, 144))
  True
➌ >>> pyautogui.pixelMatchesColor(50, 200, (255, 135, 144))
  False

After taking a screenshot and using getpixel() to get an RGB tuple for the color of a pixel at specific coordinates ➊, pass the same coordinates and RGB tuple to pixelMatchesColor() ➋, which should return True. Then change a value in the RGB tuple and call pixelMatchesColor() again for the same coordinates ➌. This should return False. This function can be useful to call whenever your GUI automation programs are about to call click(). Note that the color at the given coordinates must exactly match. If it is even slightly different (for example, (255, 255, 254) instead of (255, 255, 255)), then pixelMatchesColor() will return False.
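The exact-match rule is easy to reproduce in plain Python, which helps when you want to reason about a color check away from the screen. The helper below, colorsMatch(), is a hypothetical stand-in written for illustration and is not PyAutoGUI’s own code; its tolerance parameter is an assumption added to show how the strict comparison could be relaxed.

```python
def colorsMatch(actual, expected, tolerance=0):
    """Return True if each RGB channel of actual is within tolerance of expected."""
    return all(abs(a - e) <= tolerance
               for a, e in zip(actual[:3], expected[:3]))

print(colorsMatch((130, 135, 144), (130, 135, 144)))               # True
print(colorsMatch((255, 255, 254), (255, 255, 255)))               # False
print(colorsMatch((255, 255, 254), (255, 255, 255), tolerance=1))  # True
```

In a script, you might compare the tuple returned by getpixel() against the color you expect and skip the click when they differ.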

Project: Extending the mouseNow Program

You could extend the mouseNow.py project from earlier in this chapter so that it not only gives the x- and y-coordinates of the mouse cursor’s current position but also gives the RGB color of the pixel under the cursor. Modify the code inside the while loop of mouseNow.py to look like this:

#! python3
# mouseNow.py - Displays the mouse cursor's current position.
--snip--
        positionStr = 'X: ' + str(x).rjust(4) + ' Y: ' + str(y).rjust(4)
        pixelColor = pyautogui.screenshot().getpixel((x, y))
        positionStr += ' RGB: (' + str(pixelColor[0]).rjust(3)
        positionStr += ', ' + str(pixelColor[1]).rjust(3)
        positionStr += ', ' + str(pixelColor[2]).rjust(3) + ')'
        print(positionStr, end='')
--snip--

Now, when you run mouseNow.py, the output will include the RGB color value of the pixel under the mouse cursor.

Press Ctrl-C to quit.
X:  406 Y:   17 RGB: (161,  50,  50)

This information, along with the pixelMatchesColor() function, should make it easy to add pixel color checks to your GUI automation scripts.
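As an alternative to chaining rjust() calls, the same fixed-width line can be built with an f-string (available in Python 3.6 and later). This is a swapped-in technique rather than the approach used in mouseNow.py, and the function name formatPositionWithColor() is hypothetical; the label spacing is assumed to match the output shown above.

```python
def formatPositionWithColor(x, y, pixelColor):
    """Return the coordinates and RGB color as one fixed-width string."""
    red, green, blue = pixelColor[:3]
    # '>4' right-justifies in a field of width 4, like str(x).rjust(4).
    return f'X: {x:>4} Y: {y:>4} RGB: ({red:>3}, {green:>3}, {blue:>3})'

colorLine = formatPositionWithColor(406, 17, (161, 50, 50))
print(colorLine)
# X:  406 Y:   17 RGB: (161,  50,  50)
```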

Image Recognition

But what if you do not know beforehand where PyAutoGUI should click? You can use image recognition instead. Give PyAutoGUI an image of what you want to click and let it figure out the coordinates.

For example, if you have previously taken a screenshot to capture the image of a Submit button in submit.png, the locateOnScreen() function will return the coordinates where that image is found. To see how locateOnScreen() works, try taking a screenshot of a small area on your screen; then save the image and enter the following into the interactive shell, replacing 'submit.png' with the filename of your screenshot:

>>> import pyautogui
>>> pyautogui.locateOnScreen('submit.png')
(643, 745, 70, 29)

The four-integer tuple that locateOnScreen() returns has the x-coordinate of the left edge, the y-coordinate of the top edge, the width, and the height for the first place on the screen the image was found. If you’re trying this on your computer with your own screenshot, your return value will be different from the one shown here.

If the image cannot be found on the screen, locateOnScreen() will return None. Note that the image on the screen must match the provided image perfectly in order to be recognized. If the image is even a pixel off, locateOnScreen() will return None.

If the image can be found in several places on the screen, locateAllOnScreen() will return a Generator object, which can be passed to list() to return a list of four-integer tuples. There will be one four-integer tuple for each location where the image is found on the screen. Continue the interactive shell example by entering the following (and replacing 'submit.png' with your own image filename):

>>> list(pyautogui.locateAllOnScreen('submit.png'))
[(643, 745, 70, 29), (1007, 801, 70, 29)]

Each of the four-integer tuples represents an area on the screen. If your image is only found in one area, then using list() and locateAllOnScreen() just returns a list containing one tuple.

Once you have the four-integer tuple for the area on the screen where your image was found, you can click the center of this area by passing the tuple to the center() function to return x- and y-coordinates of the area’s center. Enter the following into the interactive shell, replacing the arguments with your own filename, four-integer tuple, and coordinate pair:

>>> pyautogui.locateOnScreen('submit.png')
(643, 745, 70, 29)
>>> pyautogui.center((643, 745, 70, 29))
(678, 759)
>>> pyautogui.click((678, 759))

Once you have center coordinates from center(), passing the coordinates to click() should click the center of the area on the screen that matches the image you passed to locateOnScreen().
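The center calculation itself is simple arithmetic on the four-integer tuple. The sketch below shows one way to compute it by hand, using integer division; boxCenter() is a hypothetical helper written for illustration, not PyAutoGUI’s implementation of center().

```python
def boxCenter(box):
    """Return the integer (x, y) center of a (left, top, width, height) tuple."""
    left, top, width, height = box
    return (left + width // 2, top + height // 2)

centerPoint = boxCenter((643, 745, 70, 29))
print(centerPoint)  # (678, 759)
```

In your own scripts, remember to check that locateOnScreen() did not return None before computing a center or clicking.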

Controlling the Keyboard

PyAutoGUI also has functions for sending virtual keypresses to your computer, which enables you to fill out forms or enter text into applications.

Sending a String from the Keyboard

The pyautogui.typewrite() function sends virtual keypresses to the computer. What these keypresses do depends on what window and text field have focus. You may want to first send a mouse click to the text field you want in order to ensure that it has focus.

As a simple example, let’s use Python to automatically type the words Hello world! into a file editor window. First, open a new file editor window and position it in the upper-left corner of your screen so that PyAutoGUI will click in the right place to bring it into focus. Next, enter the following into the interactive shell:

>>> pyautogui.click(100, 100); pyautogui.typewrite('Hello world!')

Notice how placing two commands on the same line, separated by a semicolon, keeps the interactive shell from prompting you for input between running the two instructions. This prevents you from accidentally bringing a new window into focus between the click() and typewrite() calls, which would mess up the example.

Python will first send a virtual mouse click to the coordinates (100, 100), which should click the file editor window and put it in focus. The typewrite() call will send the text Hello world! to the window, making it look like Figure 18-3. You now have code that can type for you!

Figure 18-3. Using PyAutoGUI to click the file editor window and type Hello world! into it

By default, the typewrite() function will type the full string instantly. However, you can pass an optional second argument to add a short pause between each character. This second argument is an integer or float value of the number of seconds to pause. For example, pyautogui.typewrite('Hello world!', 0.25) will wait a quarter-second after typing H, another quarter-second after e, and so on. This gradual typewriter effect may be useful for slower applications that can’t process keystrokes fast enough to keep up with PyAutoGUI.

For characters such as A or !, PyAutoGUI will automatically simulate holding down the SHIFT key as well.

Key Names

Not all keys are easy to represent with single text characters. For example, how do you represent SHIFT or the left arrow key as a single character? In PyAutoGUI, these keyboard keys are represented by short string values instead: 'esc' for the ESC key or 'enter' for the ENTER key.

Instead of a single string argument, a list of these keyboard key strings can be passed to typewrite(). For example, the following call presses the A key, then the B key, then the left arrow key twice, and finally the X and Y keys:

>>> pyautogui.typewrite(['a', 'b', 'left', 'left', 'X', 'Y'])

Because pressing the left arrow key moves the keyboard cursor, this will output XYab. Table 18-1 lists the PyAutoGUI keyboard key strings that you can pass to typewrite() to simulate pressing any combination of keys.
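If it isn’t obvious why the result is XYab, you can trace the keyboard cursor in plain Python. The following sketch models an empty text field with a hypothetical simulateKeys() function; it understands only single characters and the 'left' and 'right' key strings, which is an assumption made to keep the example short.

```python
def simulateKeys(keys):
    """Return the text produced by typing keys into an empty text field."""
    text = []    # characters typed so far
    cursor = 0   # index where the next character will be inserted
    for key in keys:
        if key == 'left':
            cursor = max(0, cursor - 1)
        elif key == 'right':
            cursor = min(len(text), cursor + 1)
        elif len(key) == 1:
            text.insert(cursor, key)
            cursor += 1
        else:
            raise ValueError('key not modeled in this sketch: ' + key)
    return ''.join(text)

typed = simulateKeys(['a', 'b', 'left', 'left', 'X', 'Y'])
print(typed)  # XYab
```

After ab is typed, the two left arrow presses move the cursor back to the start of the line, so X and Y are inserted in front of ab.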

You can also examine the pyautogui.KEYBOARD_KEYS list to see all possible keyboard key strings that PyAutoGUI will accept. The 'shift' string refers to the left SHIFT key and is equivalent to 'shiftleft'. The same applies for 'ctrl', 'alt', and 'win' strings; they all refer to the left-side key.

Table 18-1. PyKeyboard Attributes

Keyboard key string                      Meaning
'a', 'b', 'c', 'A', 'B', 'C', '1', '2', '3', '!', '@', '#', and so on    The keys for single characters
'enter' (or 'return' or '\n')            The ENTER key
'esc'                                    The ESC key
'shiftleft', 'shiftright'                The left and right SHIFT keys
'altleft', 'altright'                    The left and right ALT keys
'ctrlleft', 'ctrlright'                  The left and right CTRL keys
'tab' (or '\t')                          The TAB key
'backspace', 'delete'                    The BACKSPACE and DELETE keys
'pageup', 'pagedown'                     The PAGE UP and PAGE DOWN keys
'home', 'end'                            The HOME and END keys
'up', 'down', 'left', 'right'            The up, down, left, and right arrow keys
'f1', 'f2', 'f3', and so on              The F1 to F12 keys
'volumemute', 'volumedown', 'volumeup'   The mute, volume down, and volume up keys (some keyboards do not have these keys, but your operating system will still be able to understand these simulated keypresses)
'pause'                                  The PAUSE key
'capslock', 'numlock', 'scrolllock'      The CAPS LOCK, NUM LOCK, and SCROLL LOCK keys
'insert'                                 The INS or INSERT key
'printscreen'                            The PRTSC or PRINT SCREEN key
'winleft', 'winright'                    The left and right WIN keys (on Windows)
'command'                                The Command (⌘) key (on OS X)
'option'                                 The OPTION key (on OS X)

Pressing and Releasing the Keyboard

Much like the mouseDown() and mouseUp() functions, pyautogui.keyDown() and pyautogui.keyUp() will send virtual keypresses and releases to the computer. They are passed a keyboard key string (see Table 18-1) for their argument. For convenience, PyAutoGUI provides the pyautogui.press() function, which calls both of these functions to simulate a complete keypress.

Run the following code, which will type a dollar sign character (obtained by holding the SHIFT key and pressing 4):

>>> pyautogui.keyDown('shift'); pyautogui.press('4'); pyautogui.keyUp('shift')

This line presses down SHIFT, presses (and releases) 4, and then releases SHIFT. If you need to type a string into a text field, the typewrite() function is more suitable. But for applications that take single-key commands, the press() function is the simpler approach.

Hotkey Combinations

A hotkey or shortcut is a combination of keypresses to invoke some application function. The common hotkey for copying a selection is CTRL-C (on Windows and Linux) or ⌘-C (on OS X). The user presses and holds the CTRL key, then presses the C key, and then releases the C and CTRL keys. To do this with PyAutoGUI’s keyDown() and keyUp() functions, you would have to enter the following:

pyautogui.keyDown('ctrl')
pyautogui.keyDown('c')
pyautogui.keyUp('c')
pyautogui.keyUp('ctrl')

This is rather complicated. Instead, use the pyautogui.hotkey() function, which takes multiple keyboard key string arguments, presses them in order, and releases them in the reverse order. For the CTRL-C example, the code would simply be as follows:

pyautogui.hotkey('ctrl', 'c')

This function is especially useful for larger hotkey combinations. In Word, the CTRL-ALT-SHIFT-S hotkey combination displays the Style pane. Instead of making eight different function calls (four keyDown() calls and four keyUp() calls), you can just call hotkey('ctrl', 'alt', 'shift', 's').
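The press-in-order, release-in-reverse rule can be written out as a short sketch. The function below, hotkeyEvents(), is a hypothetical illustration that only builds a list of events; it does not send any keypresses and is not how PyAutoGUI is implemented internally.

```python
def hotkeyEvents(*keys):
    """Return the ('down' or 'up', key) events for a hotkey combination."""
    presses = [('down', key) for key in keys]
    releases = [('up', key) for key in reversed(keys)]
    return presses + releases

copyEvents = hotkeyEvents('ctrl', 'c')
print(copyEvents)
# [('down', 'ctrl'), ('down', 'c'), ('up', 'c'), ('up', 'ctrl')]
print(len(hotkeyEvents('ctrl', 'alt', 'shift', 's')))  # 8
```

The first list matches the four keyDown() and keyUp() calls shown earlier, and the four-key combination expands to the eight calls mentioned above.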

With a new IDLE file editor window in the upper-left corner of your screen, enter the following into the interactive shell (in OS X, replace 'alt' with 'ctrl'):

  >>> import pyautogui, time
  >>> def commentAfterDelay():
➊         pyautogui.click(100, 100)
➋         pyautogui.typewrite('In IDLE, Alt-3 comments out a line.')
          time.sleep(2)
➌         pyautogui.hotkey('alt', '3')

  >>> commentAfterDelay()

This defines a function commentAfterDelay() that, when called, will click the file editor window to bring it into focus ➊, type In IDLE, Alt-3 comments out a line. ➋, pause for 2 seconds, and then simulate pressing the ALT-3 hotkey (or CTRL-3 on OS X) ➌. This keyboard shortcut adds two # characters to the current line, commenting it out. (This is a useful trick to know when writing your own code in IDLE.)

Review of the PyAutoGUI Functions

Since this chapter covered many different functions, here is a quick summary reference:

moveTo(x, y). Moves the mouse cursor to the given x and y coordinates.
moveRel(xOffset, yOffset). Moves the mouse cursor relative to its current position.
dragTo(x, y). Moves the mouse cursor while the left button is held down.
dragRel(xOffset, yOffset). Moves the mouse cursor relative to its current position while the left button is held down.
click(x, y, button). Simulates a click (left button by default).
rightClick(). Simulates a right-button click.
middleClick(). Simulates a middle-button click.
doubleClick(). Simulates a double left-button click.
mouseDown(x, y, button). Simulates pressing down the given button at the position x, y.
mouseUp(x, y, button). Simulates releasing the given button at the position x, y.
scroll(units). Simulates the scroll wheel. A positive argument scrolls up; a negative argument scrolls down.
typewrite(message). Types the characters in the given message string.
typewrite([key1, key2, key3]). Types the given keyboard key strings.
press(key). Presses the given keyboard key string.
keyDown(key). Simulates pressing down the given keyboard key.
keyUp(key). Simulates releasing the given keyboard key.
hotkey([key1, key2, key3]). Simulates pressing the given keyboard key strings down in order and then releasing them in reverse order.
screenshot(). Returns a screenshot as an Image object. (See Chapter 17 for information on Image objects.)

Project: Automatic Form Filler

Of all the boring tasks, filling out forms is the most dreaded of chores. It’s only fitting that now, in the final chapter project, you will slay it. Say you have a huge amount of data in a spreadsheet, and you have to tediously retype it into some other application’s form interface, with no intern to do it for you. Although some applications will have an Import feature that will allow you to upload a spreadsheet with the information, sometimes it seems that there is no other way than mindlessly clicking and typing for hours on end. You’ve come this far in this book; you know that of course there’s another way.

The form for this project is a Google Docs form that you can find at http://nostarch.com/automatestuff. It looks like Figure 18-4.

Figure 18-4. The form used for this project

At a high level, here’s what your program should do:

Click the first text field of the form.
Move through the form, typing information into each field.
Click the Submit button.
Repeat the process with the next set of data.

This means your code will need to do the following:

Call pyautogui.click() to click the form and Submit button.
Call pyautogui.typewrite() to enter text into the fields.
Handle the KeyboardInterrupt exception so the user can press CTRL-C to quit.

Open a new file editor window and save it as formFiller.py.
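The KeyboardInterrupt requirement listed above could be sketched as follows. This is only one possible shape, not the project’s own code: the function name run_until_interrupted() and its task parameter are hypothetical, and the sketch assumes each form entry is handled by a single function call.

```python
def run_until_interrupted(task, items):
    """Call task(item) for each item, stopping cleanly if the user presses CTRL-C.

    Returns the number of items that were fully processed.
    """
    done = 0
    try:
        for item in items:
            task(item)
            done += 1
    except KeyboardInterrupt:
        # CTRL-C raises KeyboardInterrupt in the main thread; report and stop.
        print('Interrupted after %s item(s).' % done)
    return done


# Example: "process" three names by printing them.
run_until_interrupted(print, ['Alice', 'Bob', 'Carol'])
```

Catching the exception outside the loop means the whole run stops at the first CTRL-C instead of skipping only the current entry.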

Step 1: Figure Out the Steps

Before writing code, you need to figure out the exact keystrokes and mouse clicks that will fill out the form once. The mouseNow.py script in Project: “Where Is the Mouse Right Now?” can help you figure out specific mouse coordinates. You need to know only the coordinates of the first text field. After clicking the first field, you can just press TAB to move focus to the next field. This will save you from having to figure out the x- and y-coordinates to click for every field.

Here are the steps for entering data into the form:

1. Click the Name field. (Use mouseNow.py to determine the coordinates after maximizing the browser window. On OS X, you may need to click twice: once to put the browser in focus and again to click the Name field.)
2. Type a name and then press TAB.
3. Type a greatest fear and then press TAB.
4. Press the down arrow key the correct number of times to select the wizard power source: once for wand, twice for amulet, three times for crystal ball, and four times for money. Then press TAB. (Note that on OS X, you will have to press the down arrow key one more time for each option. For some browsers, you may need to press the ENTER key as well.)
5. Press the right arrow key to select the answer to the Robocop question. Press it once for 2, twice for 3, three times for 4, or four times for 5; or just press the spacebar to select 1 (which is highlighted by default). Then press TAB.
6. Type an additional comment and then press TAB.
7. Press the ENTER key to “click” the Submit button.
8. After submitting the form, the browser will take you to a page where you will need to click a link to return to the form page.

Note that if you run this program again later, you may have to update the mouse click coordinates, since the browser window might have changed position. To work around this, always make sure the browser window is maximized before finding the coordinates of the first form field. Also, different browsers on different operating systems might work slightly differently from the steps given here, so check that these keystroke combinations work for your computer before running your program.

Step 2: Set Up Coordinates

Load the example form you downloaded (Figure 18-4) in a browser and maximize your browser window. Open a new Terminal or command line window to run the mouseNow.py script, and then mouse over the Name field to figure out its x- and y-coordinates.

These numbers will be assigned to the nameField variable in your program. Also, find out the x- and y-coordinates and RGB tuple value of the blue Submit button. These values will be assigned to the submitButton and submitButtonColor variables, respectively.

Next, fill in some dummy data for the form and click Submit. You need to see what the next page looks like so that you can use mouseNow.py to find the coordinates of the Submit another response link on this new page.

Make your source code look like the following, being sure to replace all the values in italics with the coordinates you determined from your own tests:

#! python3
# formFiller.py - Automatically fills in the form.

import pyautogui, time

# Set these to the correct coordinates for your computer.
nameField = (648, 319)
submitButton = (651, 817)
submitButtonColor = (75, 141, 249)
submitAnotherLink = (760, 224)

# TODO: Give the user a chance to kill the script.

# TODO: Wait until the form page has loaded.

# TODO: Fill out the Name Field.

# TODO: Fill out the Greatest Fear(s) field.

# TODO: Fill out the Source of Wizard Powers field.

# TODO: Fill out the Robocop field.

# TODO: Fill out the Additional Comments field.

# TODO: Click Submit.

# TODO: Wait until form page has loaded.

# TODO: Click the Submit another response link.

Now you need the data you actually want to enter into this form. In the real world, this data might come from a spreadsheet, a plaintext file, or a website, and it would require additional code to load into the program. But for this project, you’ll just hardcode all this data in a variable. Add the following to your program:

#! python3
# formFiller.py - Automatically fills in the form.

--snip--

formData = [{'name': 'Alice', 'fear': 'eavesdroppers', 'source': 'wand',
             'robocop': 4, 'comments': 'Tell Bob I said hi.'},
            {'name': 'Bob', 'fear': 'bees', 'source': 'amulet', 'robocop': 4,
             'comments': 'n/a'},
            {'name': 'Carol', 'fear': 'puppets', 'source': 'crystal ball',
             'robocop': 1,
             'comments': 'Please take the puppets out of the break room.'},
            {'name': 'Alex Murphy', 'fear': 'ED-209', 'source': 'money',
             'robocop': 5,
             'comments': 'Protect the innocent. Serve the public trust. Uphold the law.'},
           ]

--snip--

The formData list contains four dictionaries for four different names. Each dictionary has names of text fields as keys and responses as values. The last bit of setup is to set PyAutoGUI’s PAUSE variable to wait half a second after each function call. Add the following to your program after the formData assignment statement:

pyautogui.PAUSE = 0.5
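The project hardcodes formData, but as noted above, real data might come from a spreadsheet or text file. As a rough sketch of what that loading code might look like, the following reads the same fields from CSV text with the standard csv module. The function name load_form_data() and the column layout are assumptions made for illustration; an in-memory string stands in for a real file so the example runs on its own.

```python
import csv
import io


def load_form_data(csv_file):
    """Read rows from an open CSV file into a list of dictionaries."""
    rows = []
    for row in csv.DictReader(csv_file):
        row = dict(row)
        # CSV values are always strings; the form code compares 'robocop' to integers.
        row['robocop'] = int(row['robocop'])
        rows.append(row)
    return rows


# Stand-in for a real file opened with open('people.csv', newline='').
SAMPLE_CSV = (
    'name,fear,source,robocop,comments\n'
    'Alice,eavesdroppers,wand,4,Tell Bob I said hi.\n'
    'Bob,bees,amulet,4,n/a\n'
)

formData = load_form_data(io.StringIO(SAMPLE_CSV))
print(formData[0]['name'], formData[0]['robocop'])
```

The resulting list has the same shape as the hardcoded formData, so the rest of the program would not need to change.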

Step 3: Start Typing Data

A for loop will iterate over each of the dictionaries in the formData list, passing the values in the dictionary to the PyAutoGUI functions that will virtually type in the text fields.

Add the following code to your program:

#! python3
# formFiller.py - Automatically fills in the form.

--snip--

for person in formData:
    # Give the user a chance to kill the script.
    print('>>> 5 SECOND PAUSE TO LET USER PRESS CTRL-C <<<')
➊   time.sleep(5)

    # Wait until the form page has loaded.
➋   while not pyautogui.pixelMatchesColor(submitButton[0], submitButton[1],
    submitButtonColor):
        time.sleep(0.5)

--snip--

As a small safety feature, the script has a five-second pause ➊ that gives the user a chance to hit CTRL-C (or move the mouse cursor to the upper-left corner of the screen to raise the FailSafeException exception) to shut the program down in case it’s doing something unexpected. Then the program waits until the Submit button’s color is visible ➋, letting the program know that the form page has loaded. Remember that you figured out the coordinate and color information in step 2 and stored it in the submitButton and submitButtonColor variables. To use pixelMatchesColor(), you pass the coordinates submitButton[0] and submitButton[1], and the color submitButtonColor.
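The wait loop above keeps polling for as long as the Submit button’s color is missing. One possible variation, sketched here as an assumption and not as part of the project, is to give up after a time limit. The helper wait_until() is a hypothetical name; it takes any zero-argument function, so in formFiller.py the condition could be a lambda that calls pyautogui.pixelMatchesColor() with the values from step 2.

```python
import time


def wait_until(condition, timeout=30, interval=0.5):
    """Poll condition() until it returns True or timeout seconds have passed.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.time() + timeout
    while not condition():
        if time.time() >= deadline:
            return False
        time.sleep(interval)
    return True


# Example with a stand-in condition that becomes true on the third check.
checks = []

def page_loaded():
    checks.append(1)
    return len(checks) >= 3

print(wait_until(page_loaded, timeout=5, interval=0.01))
```

A False result lets the program stop with a clear message if the page never loads, instead of waiting silently.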

After the code that waits until the Submit button’s color is visible, add the following:

#! python3
# formFiller.py - Automatically fills in the form.

--snip--

➊   print('Entering %s info...' % (person['name']))
➋   pyautogui.click(nameField[0], nameField[1])

    # Fill out the Name field.
➌   pyautogui.typewrite(person['name'] + '\t')

    # Fill out the Greatest Fear(s) field.
➍   pyautogui.typewrite(person['fear'] + '\t')

--snip--

We add an occasional print() call to display the program’s status in its Terminal window to let the user know what’s going on ➊.

Since the program knows that the form is loaded, it’s time to call click() to click the Name field ➋ and typewrite() to enter the string in person['name'] ➌. The '\t' character is added to the end of the string passed to typewrite() to simulate pressing TAB, which moves the keyboard focus to the next field, Greatest Fear(s). Another call to typewrite() will type the string in person['fear'] into this field and then tab to the next field in the form ➍.

Step 4: Handle Select Lists and Radio Buttons

The drop-down menu for the “wizard powers” question and the radio buttons for the Robocop field are trickier to handle than the text fields. To click these options with the mouse, you would have to figure out the x- and y-coordinates of each possible option. It’s easier to use the keyboard arrow keys to make a selection instead.

Add the following to your program:

#! python3
# formFiller.py - Automatically fills in the form.

--snip--

    # Fill out the Source of Wizard Powers field.
➊   if person['source'] == 'wand':
➋       pyautogui.typewrite(['down', '\t'])
    elif person['source'] == 'amulet':
        pyautogui.typewrite(['down', 'down', '\t'])
    elif person['source'] == 'crystal ball':
        pyautogui.typewrite(['down', 'down', 'down', '\t'])
    elif person['source'] == 'money':
        pyautogui.typewrite(['down', 'down', 'down', 'down', '\t'])

    # Fill out the Robocop field.
➌   if person['robocop'] == 1:
➍       pyautogui.typewrite([' ', '\t'])
    elif person['robocop'] == 2:
        pyautogui.typewrite(['right', '\t'])
    elif person['robocop'] == 3:
        pyautogui.typewrite(['right', 'right', '\t'])
    elif person['robocop'] == 4:
        pyautogui.typewrite(['right', 'right', 'right', '\t'])
    elif person['robocop'] == 5:
        pyautogui.typewrite(['right', 'right', 'right', 'right', '\t'])

--snip--

Once the drop-down menu has focus (remember that you wrote code to simulate pressing TAB after filling out the Greatest Fear(s) field), pressing the down arrow key will move to the next item in the selection list. Depending on the value in person['source'], your program should send a number of down arrow keypresses before tabbing to the next field. If the value at the 'source' key in this user’s dictionary is 'wand' ➊, we simulate pressing the down arrow key once (to select Wand) and pressing TAB ➋. If the value at the 'source' key is 'amulet', we simulate pressing the down arrow key twice and pressing TAB, and so on for the other possible answers.

The radio buttons for the Robocop question can be selected with the right arrow key or, if you want to select the first choice ➌, by just pressing the spacebar ➍.
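The if/elif chains above follow a regular pattern: one more arrow keypress per option. As an alternative sketch (not the project’s own code), the key lists can be computed from the option’s position. The names SOURCE_OPTIONS, source_keys(), and robocop_keys() are hypothetical, and the option order is assumed to match the form’s drop-down menu as described in step 1.

```python
# Assumed to be in the same order as the form's drop-down menu.
SOURCE_OPTIONS = ['wand', 'amulet', 'crystal ball', 'money']


def source_keys(source):
    """Return the keys to select a wizard power source and tab onward."""
    presses = SOURCE_OPTIONS.index(source) + 1  # 'wand' needs one down press
    return ['down'] * presses + ['\t']


def robocop_keys(rating):
    """Return the keys to select a Robocop rating (1 to 5) and tab onward."""
    if rating == 1:
        return [' ', '\t']  # spacebar selects the default first choice
    return ['right'] * (rating - 1) + ['\t']


print(source_keys('amulet'))
print(robocop_keys(4))
```

Each returned list could be passed to pyautogui.typewrite() in place of the matching branch. Note that SOURCE_OPTIONS.index() raises ValueError for an unknown source, which stops the program before it types anything unexpected.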

Step 5: Submit the Form and Wait

You can fill out the Additional Comments field with the typewrite() function by passing person['comments'] as an argument. You can type an additional '\t' to move the keyboard focus to the next field or the Submit button. Once the Submit button is in focus, calling pyautogui.press('enter') will simulate pressing the ENTER key and submit the form. After submitting the form, your program will wait five seconds for the next page to load.

Once the new page has loaded, it will have a Submit another response link that will direct the browser to a new, empty form page. You stored the coordinates of this link as a tuple in submitAnotherLink in step 2, so pass these coordinates to pyautogui.click() to click this link.

With the new form ready to go, the script’s outer for loop can continue to the next iteration and enter the next person’s information into the form.

Complete your program by adding the following code:

#! python3
# formFiller.py - Automatically fills in the form.

--snip--

    # Fill out the Additional Comments field.
    pyautogui.typewrite(person['comments'] + '\t')

    # Click Submit.
    pyautogui.press('enter')

    # Wait until form page has loaded.
    print('Clicked Submit.')
    time.sleep(5)

    # Click the Submit another response link.
    pyautogui.click(submitAnotherLink[0], submitAnotherLink[1])

Once the main for loop has finished, the program will have plugged in the information for each person. In this example, there are only four people to enter. But if you had 4,000 people, then writing a program to do this would save you a lot of time and typing!

Summary

GUI automation with the pyautogui module allows you to interact with applications on your computer by controlling the mouse and keyboard. While this approach is flexible enough to do anything that a human user can do, the downside is that these programs are fairly blind to what they are clicking or typing. When writing GUI automation programs, try to ensure that they will crash quickly if they’re given bad instructions. Crashing is annoying, but it’s much better than the program continuing in error.
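One way to follow the “crash quickly” advice, sketched here under the assumption that the input looks like the form filler’s person dictionaries, is to validate every record before any keystroke is sent. The names REQUIRED_KEYS, VALID_SOURCES, and validate_person() are hypothetical.

```python
REQUIRED_KEYS = ('name', 'fear', 'source', 'robocop', 'comments')
VALID_SOURCES = ('wand', 'amulet', 'crystal ball', 'money')


def validate_person(person):
    """Raise ValueError if a person dictionary cannot be typed into the form."""
    for key in REQUIRED_KEYS:
        if key not in person:
            raise ValueError('Missing field: %s' % key)
    if person['source'] not in VALID_SOURCES:
        raise ValueError('Unknown source: %r' % (person['source'],))
    if person['robocop'] not in range(1, 6):
        raise ValueError('Robocop rating must be 1 to 5: %r' % (person['robocop'],))


# A valid record passes silently; a bad one stops the program early.
validate_person({'name': 'Alice', 'fear': 'eavesdroppers', 'source': 'wand',
                 'robocop': 4, 'comments': 'Tell Bob I said hi.'})
```

Running this check over the whole data list before the main loop starts means a typo in the data fails immediately, not halfway through a long run of form submissions.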

You can move the mouse cursor around the screen and simulate mouse clicks, keystrokes, and keyboard shortcuts with PyAutoGUI. The pyautogui module can also check the colors on the screen, which can provide your GUI automation program with enough of an idea of the screen contents to know whether it has gotten off track. You can even give PyAutoGUI a screenshot and let it figure out the coordinates of the area you want to click.

You can combine all of these PyAutoGUI features to automate any mindlessly repetitive task on your computer. In fact, it can be downright hypnotic to watch the mouse cursor move on its own and see text appear on the screen automatically. Why not spend the time you saved by sitting back and watching your program do all your work for you? There’s a certain satisfaction that comes from seeing how your cleverness has saved you from the boring stuff.

Practice Questions

Q: 1. How can you trigger PyAutoGUI’s fail safe to stop a program?

Q: 2. What function returns the current screen resolution?

Q: 3. What function returns the coordinates for the mouse cursor’s current position?

Q: 4. What is the difference between pyautogui.moveTo() and pyautogui.moveRel()?

Q: 5. What functions can be used to drag the mouse?

Q: 6. What function call will type out the characters of "Hello world!"?

Q: 7. How can you do keypresses for special keys such as the keyboard’s left arrow key?

Q: 8. How can you save the current contents of the screen to an image file named screenshot.png?

Q: 9. What code would set a two-second pause after every PyAutoGUI function call?

Practice Projects

For practice, write programs that do the following.

Looking Busy

Many instant messaging programs determine whether you are idle, or away from your computer, by detecting a lack of mouse movement over some period of time (say, ten minutes). Maybe you’d like to sneak away from your desk for a while but don’t want others to see your instant messenger status go into idle mode. Write a script to nudge your mouse cursor slightly every ten seconds. The nudge should be small enough so that it won’t get in the way if you do happen to need to use your computer while the script is running.

Instant Messenger Bot

Google Talk, Skype, Yahoo Messenger, AIM, and other instant messaging applications often use proprietary protocols that make it difficult for others to write Python modules that can interact with these programs. But even these proprietary protocols can’t stop you from writing a GUI automation tool.

The Google Talk application has a search bar that lets you enter a username on your friend list and open a messaging window when you press ENTER. The keyboard focus automatically moves to the new window. Other instant messenger applications have similar ways to open new message windows. Write a program that will automatically send out a notification message to a select group of people on your friend list. Your program may have to deal with exceptional cases, such as friends being offline, the chat window appearing at different coordinates on the screen, or confirmation boxes that interrupt your messaging. Your program will have to take screenshots to guide its GUI interaction and adopt ways of detecting when its virtual keystrokes aren’t being sent.

NOTE

You may want to set up some fake test accounts so that you don’t accidentally spam your real friends while writing this program.

Game-Playing Bot Tutorial

There is a great tutorial titled “How to Build a Python Bot That Can Play Web Games” that you can find at http://nostarch.com/automatestuff/. This tutorial explains how to create a GUI automation program in Python that plays a Flash game called Sushi Go Round. The game involves clicking the correct ingredient buttons to fill customers’ sushi orders. The faster you fill orders without mistakes, the more points you get. This is a perfectly suited task for a GUI automation program, and a way to cheat to a high score! The tutorial covers many of the same topics that this chapter covers but also includes descriptions of PyAutoGUI’s basic image recognition features.

Appendix A. Installing Third-Party Modules

Beyond the standard library of modules packaged with Python, other developers have written their own modules to extend Python’s capabilities even further. The primary way to install third-party modules is to use Python’s pip tool. This tool securely downloads and installs Python modules onto your computer from https://pypi.python.org/, the website of the Python Software Foundation. PyPI, or the Python Package Index, is a sort of free app store for Python modules.

The pip Tool

The executable file for the pip tool is called pip on Windows and pip3 on OS X and Linux. On Windows, you can find pip at C:\Python34\Scripts\pip.exe. On OS X, it is in /Library/Frameworks/Python.framework/Versions/3.4/bin/pip3. On Linux, it is in /usr/bin/pip3.

While pip comes automatically installed with Python 3.4 on Windows and OS X, you must install it separately on Linux. To install pip3 on Ubuntu or Debian Linux, open a new Terminal window and enter sudo apt-get install python3-pip. To install pip3 on Fedora Linux, enter sudo yum install python3-pip into a Terminal window. You will need to enter the administrator password for your computer in order to install this software.

Installing Third-Party Modules

The pip tool is meant to be run from the command line: You pass it the command install followed by the name of the module you want to install. For example, on Windows you would enter pip install ModuleName, where ModuleName is the name of the module. On OS X and Linux, you’ll have to run pip3 with the sudo prefix to grant administrative privileges to install the module. You would need to type sudo pip3 install ModuleName.

If you already have the module installed but would like to upgrade it to the latest version available on PyPI, run pip install -U ModuleName (or pip3 install -U ModuleName on OS X and Linux).

After installing the module, you can test that it installed successfully by running import ModuleName in the interactive shell. If no error messages are displayed, you can assume the module was installed successfully.
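The import test described above can also be scripted for several modules at once. The sketch below uses the standard importlib.util.find_spec() function, which only checks whether a top-level module can be located and does not actually import it, so it is a slightly weaker test than typing import ModuleName. The helper name is_installed() and the example module list are assumptions for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Use import names here, which can differ from the names passed to pip
# (for example, the beautifulsoup4 package is imported as bs4).
for name in ['json', 'requests', 'bs4', 'pyautogui']:
    status = 'found' if is_installed(name) else 'NOT found'
    print('%s: %s' % (name, status))
```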

You can install all of the modules covered in this book by running the commands listed next. (Remember to replace pip with pip3 if you’re on OS X or Linux.)

pip install send2trash
pip install requests
pip install beautifulsoup4
pip install selenium
pip install openpyxl
pip install PyPDF2
pip install python-docx (install python-docx, not docx)
pip install imapclient
pip install pyzmail
pip install twilio
pip install pillow
pip install pyobjc-core (on OS X only)
pip install pyobjc (on OS X only)
pip install python3-xlib (on Linux only)
pip install pyautogui

NOTE

For OS X users: The pyobjc module can take 20 minutes or longer to install, so don’t be alarmed if it takes a while. You should also install the pyobjc-core module first, which will reduce the overall installation time.

Appendix B. Running Programs

If you have a program open in IDLE’s file editor, running it is a simple matter of pressing F5 or selecting the Run ▸ Run Module menu item. This is an easy way to run programs while writing them, but opening IDLE to run your finished programs can be a burden. There are more convenient ways to execute Python scripts.

Shebang Line

The first line of all your Python programs should be a shebang line, which tells your computer that you want Python to execute this program. The shebang line begins with #!, but the rest depends on your operating system.

On Windows, the shebang line is #! python3.
On OS X, the shebang line is #! /usr/bin/env python3.
On Linux, the shebang line is #! /usr/bin/python3.

You will be able to run Python scripts from IDLE without the shebang line, but the line is needed to run them from the command line.

Running Python Programs on Windows

On Windows, the Python 3.4 interpreter is located at C:\Python34\python.exe. Alternatively, the convenient py.exe program will read the shebang line at the top of the .py file’s source code and run the appropriate version of Python for that script. The py.exe program will make sure to run the Python program with the correct version of Python if multiple versions are installed on your computer.

To make it convenient to run your Python program, create a .bat batch file for running the Python program with py.exe. To make a batch file, make a new text file containing a single line like the following:

@py.exe C:\path\to\your\pythonScript.py %*

Replace this path with the absolute path to your own program, and save this file with a .bat file extension (for example, pythonScript.bat). This batch file will keep you from having to type the full absolute path for the Python program every time you want to run it. I recommend you place all your batch and .py files in a single folder, such as C:\MyPythonScripts or C:\Users\YourName\PythonScripts.

The C:\MyPythonScripts folder should be added to the system path on Windows so that you can run the batch files in it from the Run dialog. To do this, modify the PATH environment variable. Click the Start button and type Edit environment variables for your account. This option should auto-complete after you’ve begun to type it. The Environment Variables window that appears will look like Figure B-1.

From System variables, select the Path variable and click Edit. In the Value text field, append a semicolon, type C:\MyPythonScripts, and then click OK. Now you can run any Python script in the C:\MyPythonScripts folder by simply pressing WIN-R and entering the script’s name. Running pythonScript, for instance, will run pythonScript.bat, which in turn will save you from having to run the whole command py.exe C:\MyPythonScripts\pythonScript.py from the Run dialog.

Figure B-1. The Environment Variables window on Windows

Running Python Programs on OS X and Linux

On OS X, selecting Applications ▸ Utilities ▸ Terminal will bring up a Terminal window. A Terminal window is a way to enter commands on your computer using only text, rather than clicking through a graphic interface. To bring up the Terminal window on Ubuntu Linux, press the WIN (or SUPER) key to bring up Dash and type in Terminal.

The Terminal window will begin in the home folder of your user account. If my username is asweigart, the home folder will be /Users/asweigart on OS X and /home/asweigart on Linux. The tilde (~) character is a shortcut for your home folder, so you can enter cd ~ to change to your home folder. You can also use the cd command to change the current working directory to any other directory. On both OS X and Linux, the pwd command will print the current working directory.

To run your Python programs, save your .py file to your home folder. Then, change the .py file’s permissions to make it executable by running chmod +x pythonScript.py. File permissions are beyond the scope of this book, but you will need to run this command on your Python file if you want to run the program from the Terminal window. Once you do so, you will be able to run your script whenever you want by opening a Terminal window and entering ./pythonScript.py. The shebang line at the top of the script will tell the operating system where to locate the Python interpreter.

Appendix C. Answers to the Practice Questions

This appendix contains the answers to the practice problems at the end of each chapter. I highly recommend that you take the time to work through these problems. Programming is more than memorizing syntax and a list of function names. As when learning a foreign language, the more practice you put into it, the more you will get out of it. There are many websites with practice programming problems as well. You can find a list of these at http://nostarch.com/automatestuff/.

Chapter 1

1. The operators are +, -, *, and /. The values are 'hello', -88.8, and 5.
2. The string is 'spam'; the variable is spam. Strings always start and end with quotes.
3. The three data types introduced in this chapter are integers, floating-point numbers, and strings.
4. An expression is a combination of values and operators. All expressions evaluate (that is, reduce) to a single value.
5. An expression evaluates to a single value. A statement does not.
6. The bacon variable is set to 20. The bacon + 1 expression does not reassign the value in bacon (that would need an assignment statement: bacon = bacon + 1).
7. Both expressions evaluate to the string 'spamspamspam'.
8. Variable names cannot begin with a number.
9. The int(), float(), and str() functions will evaluate to the integer, floating-point number, and string versions of the value passed to them.
10. The expression causes an error because 99 is an integer, and only strings can be concatenated to other strings with the + operator. The correct way is 'I have eaten ' + str(99) + ' burritos.'.

Chapter 2

1. True and False, using capital T and F, with the rest of the word in lowercase
2. and, or, and not
3. True and True is True.
True and False is False.
False and True is False.
False and False is False.
True or True is True.
True or False is True.
False or True is True.
False or False is False.
not True is False.
not False is True.
4. False
False
True
False
False
True
5. ==, !=, <, >, <=, and >=.
6. == is the equal to operator that compares two values and evaluates to a Boolean, while = is the assignment operator that stores a value in a variable.
7. A condition is an expression used in a flow control statement that evaluates to a Boolean value.
8. The three blocks are everything inside the if statement and the lines print('bacon') and print('ham').

print('eggs')
if spam > 5:
    print('bacon')
else:
    print('ham')
print('spam')

9. The code:

if spam == 1:
    print('Hello')
elif spam == 2:
    print('Howdy')
else:
    print('Greetings!')

10. Press CTRL-C to stop a program stuck in an infinite loop.
11. The break statement will move the execution outside and just after a loop. The continue statement will move the execution to the start of the loop.
12. They all do the same thing. The range(10) call ranges from 0 up to (but not including) 10, range(0, 10) explicitly tells the loop to start at 0, and range(0, 10, 1) explicitly tells the loop to increase the variable by 1 on each iteration.
13. The code:

for i in range(1, 11):
    print(i)

and:

i = 1
while i <= 10:
    print(i)
    i = i + 1

14. This function can be called with spam.bacon().

Chapter 3

1. Functions reduce the need for duplicate code. This makes programs shorter, easier to read, and easier to update.
2. The code in a function executes when the function is called, not when the function is defined.
3. The def statement defines (that is, creates) a function.
4. A function consists of the def statement and the code in its def clause. A function call is what moves the program execution into the function, and the function call evaluates to the function’s return value.
5. There is one global scope, and a local scope is created whenever a function is called.
6. When a function returns, the local scope is destroyed, and all the variables in it are forgotten.
7. A return value is the value that a function call evaluates to. Like any value, a return value can be used as part of an expression.
8. If there is no return statement for a function, its return value is None.
9. A global statement will force a variable in a function to refer to the global variable.
10. The data type of None is NoneType.
11. That import statement imports a module named areallyourpetsnamederic. (This isn’t a real Python module, by the way.)
12. This function can be called with spam.bacon().
13. Place the line of code that might cause an error in a try clause.
14. The code that could potentially cause an error goes in the try clause. The code that executes if an error happens goes in the except clause.

Chapter 4

1. The empty list value, which is a list value that contains no items. This is similar to how '' is the empty string value.
2. spam[2] = 'hello' (Notice that the third value in a list is at index 2 because the first index is 0.)
3. 'd' (Note that '3' * 2 is the string '33', which is passed to int() before being divided by 11. This eventually evaluates to 3. Expressions can be used wherever values are used.)
4. 'd' (Negative indexes count from the end.)
5. ['a', 'b']
6. 1
7. [3.14, 'cat', 11, 'cat', True, 99]
8. [3.14, 11, 'cat', True]
9. The operator for list concatenation is +, while the operator for replication is *. (This is the same as for strings.)
10. While append() will add values only to the end of a list, insert() can add them anywhere in the list.
11. The del statement and the remove() list method are two ways to remove values from a list.
12. Both lists and strings can be passed to len(), have indexes and slices, be used in for loops, be concatenated or replicated, and be used with the in and not in operators.
13. Lists are mutable; they can have values added, removed, or changed. Tuples are immutable; they cannot be changed at all. Also, tuples are written using parentheses, ( and ), while lists use the square brackets, [ and ].
14. (42,) (The trailing comma is mandatory.)
15. The tuple() and list() functions, respectively
16. They contain references to list values.
17. The copy.copy() function will do a shallow copy of a list, while the copy.deepcopy() function will do a deep copy of a list. That is, only copy.deepcopy() will duplicate any lists inside the list.

Chapter 5

1. Two curly brackets: {}
2. {'foo': 42}
3. The items stored in a dictionary are unordered, while the items in a list are ordered.
4. You get a KeyError error.
5. There is no difference. The in operator checks whether a value exists as a key in the dictionary.
6. 'cat' in spam checks whether there is a 'cat' key in the dictionary, while 'cat' in spam.values() checks whether there is a value 'cat' for one of the keys in spam.
7. spam.setdefault('color', 'black')
8. pprint.pprint()

Chapter 6

1. Escape characters represent characters in string values that would otherwise be difficult or impossible to type into code.
2. \n is a newline; \t is a tab.
3. The \\ escape character will represent a backslash character.
4. The single quote in Howl's is fine because you’ve used double quotes to mark the beginning and end of the string.
5. Multiline strings allow you to use newlines in strings without the \n escape character.
6. The expressions evaluate to the following:

'e'
'Hello'
'Hello'
'lo world!'

7. The expressions evaluate to the following:

'HELLO'
True
'hello'

8. The expressions evaluate to the following:

['Remember,', 'remember,', 'the', 'fifth', 'of', 'November.']
'There-can-be-only-one.'

9. The rjust(), ljust(), and center() string methods, respectively
10. The lstrip() and rstrip() methods remove whitespace from the left and right ends of a string, respectively.

Chapter 7

1. The re.compile() function returns Regex objects.
2. Raw strings are used so that backslashes do not have to be escaped.
3. The search() method returns Match objects.
4. The group() method returns strings of the matched text.
5. Group 0 is the entire match, group 1 covers the first set of parentheses, and group 2 covers the second set of parentheses.
6. Periods and parentheses can be escaped with a backslash: \., \(, and \).
7. If the regex has no groups, a list of strings is returned. If the regex has groups, a list of tuples of strings is returned.
8. The | character signifies matching “either, or” between two groups.
9. The ? character can either mean “match zero or one of the preceding group” or be used to signify nongreedy matching.
10. The + matches one or more. The * matches zero or more.
11. The {3} matches exactly three instances of the preceding group. The {3,5} matches between three and five instances.
12. The \d, \w, and \s shorthand character classes match a single digit, word, or space character, respectively.
13. The \D, \W, and \S shorthand character classes match a single character that is not a digit, word, or space character, respectively.
14. Passing re.I or re.IGNORECASE as the second argument to re.compile() will make the matching case insensitive.
15. The . character normally matches any character except the newline character. If re.DOTALL is passed as the second argument to re.compile(), then the dot will also match newline characters.
16. The .* performs a greedy match, and the .*? performs a nongreedy match.
17. Either [0-9a-z] or [a-z0-9]
18. 'X drummers, X pipers, five rings, X hens'
19. The re.VERBOSE argument allows you to add whitespace and comments to the string passed to re.compile().
20. re.compile(r'^\d{1,3}(,\d{3})*$') will create this regex, but other regex strings can produce a similar regular expression.
21. re.compile(r'[A-Z][a-z]*\sNakamoto')
22. re.compile(r'(Alice|Bob|Carol)\s(eats|pets|throws)\s(apples|cats|baseballs)\.', re.IGNORECASE)

Chapter 8

1. Relative paths are relative to the current working directory.
2. Absolute paths start with the root folder, such as / or C:\.
3. The os.getcwd() function returns the current working directory. The os.chdir() function changes the current working directory.
4. The . folder is the current folder, and .. is the parent folder.
5. C:\bacon\eggs is the dir name, while spam.txt is the base name.
6. The string 'r' for read mode, 'w' for write mode, and 'a' for append mode
7. An existing file opened in write mode is erased and completely overwritten.
8. The read() method returns the file’s entire contents as a single string value. The readlines() method returns a list of strings, where each string is a line from the file’s contents.
9. A shelf value resembles a dictionary value; it has keys and values, along with keys() and values() methods that work similarly to the dictionary methods of the same names.

Chapter 9

1. The shutil.copy() function will copy a single file, while shutil.copytree() will copy an entire folder, along with all its contents.
2. The shutil.move() function is used for renaming files, as well as moving them.
3. The send2trash functions will move a file or folder to the recycle bin, while shutil functions will permanently delete files and folders.
4. The zipfile.ZipFile() function is equivalent to the open() function; the first argument is the filename, and the second argument is the mode to open the ZIP file in (read, write, or append).

Chapter101. assert(spam>=10,'Thespamvariableislessthan10.')2. assert(eggs.lower()!=bacon.lower(),'Theeggsandbaconvariablesare

thesame!')orassert(eggs.upper()!=bacon.upper(),'Theeggsandbaconvariablesarethesame!')

3. assert(False,'Thisassertionalwaystriggers.')4. Tobeabletocalllogging.debug(),youmusthavethesetwolinesatthestartof

yourprogram:importlogging

logging.basicConfig(level=logging.DEBUG,format='%(asctime)s-

%(levelname)s-%(message)s')

5. TobeabletosendloggingmessagestoafilenamedprogramLog.txtwithlogging.debug(),youmusthavethesetwolinesatthestartofyourprogram:

importlogging

>>>logging.basicConfig(filename='programLog.txt',level=logging.DEBUG,

format='%(asctime)s-%(levelname)s-%(message)s')

6. DEBUG, INFO, WARNING, ERROR, and CRITICAL
7. logging.disable(logging.CRITICAL)
8. You can disable logging messages without removing the logging function calls. You can selectively disable lower-level logging messages. You can create logging messages. Logging messages provide a timestamp.
9. The Step button will move the debugger into a function call. The Over button will quickly execute the function call without stepping into it. The Out button will quickly execute the rest of the code until it steps out of the function it currently is in.
10. After you click Go, the debugger will stop when it has reached the end of the program or a line with a breakpoint.
11. A breakpoint is a setting on a line of code that causes the debugger to pause when the program execution reaches the line.
12. To set a breakpoint in IDLE, right-click the line and select Set Breakpoint from the context menu.
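The assertion and logging answers for Chapter 10 can be exercised together. This is a sketch under a few assumptions: the function check_spam() and the logger name appendix_demo are hypothetical, and messages go to an in-memory stream instead of programLog.txt so that no file is left behind. It also shows why an assert statement should not wrap its condition and message in parentheses.

```python
import io
import logging


def check_spam(spam):
    # Condition, then a comma, then the message, with no surrounding parentheses.
    assert spam >= 10, 'The spam variable is less than 10.'
    return spam


try:
    check_spam(3)
    assertion_message = None
except AssertionError as err:
    assertion_message = str(err)

# Pitfall: a parenthesized (condition, message) pair is a non-empty tuple,
# which is always truthy, so an assert written that way can never fail.
parenthesized_is_truthy = bool((3 >= 10, 'The spam variable is less than 10.'))

# Send log records to an in-memory stream (an assumption made for this sketch).
stream = io.StringIO()
logger = logging.getLogger('appendix_demo')
logger.setLevel(logging.DEBUG)
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))
logger.addHandler(handler)
logger.propagate = False

logger.debug('Some debugging details.')
logging.disable(logging.CRITICAL)   # suppress CRITICAL and everything below it
logger.error('This message is suppressed.')
logging.disable(logging.NOTSET)     # turn logging back on

log_output = stream.getvalue()
```

Only the first message reaches the stream; the second is dropped while logging is disabled, even though the call itself stays in the code.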

Chapter 11

1. The webbrowser module has an open() method that will launch a web browser to a specific URL, and that’s it. The requests module can download files and pages from the Web. The BeautifulSoup module parses HTML. Finally, the selenium module can launch and control a browser.
2. The requests.get() function returns a Response object, which has a text attribute that contains the downloaded content as a string.
3. The raise_for_status() method raises an exception if the download had problems and does nothing if the download succeeded.
4. The status_code attribute of the Response object contains the HTTP status code.
5. After opening the new file on your computer in 'wb' “write binary” mode, use a for loop that iterates over the Response object’s iter_content() method to write out chunks to the file. Here’s an example:

    saveFile = open('filename.html', 'wb')
    for chunk in res.iter_content(100000):
        saveFile.write(chunk)
    saveFile.close()

6. F12 brings up the developer tools in Chrome. Pressing CTRL-SHIFT-C (on Windows and Linux) or ⌘-OPTION-C (on OS X) brings up the developer tools in Firefox.
7. Right-click the element in the page, and select Inspect Element from the menu.
8. '#main'
9. '.highlight'
10. 'div div'
11. 'button[value="favorite"]'
12. spam.getText()
13. linkElem.attrs
14. The selenium module is imported with from selenium import webdriver.
15. The find_element_* methods return the first matching element as a WebElement object. The find_elements_* methods return a list of all matching elements as WebElement objects.
16. The click() and send_keys() methods simulate mouse clicks and keyboard keys, respectively.
17. Calling the submit() method on any element within a form submits the form.
18. The forward(), back(), and refresh() WebDriver object methods simulate these browser buttons.

Chapter 12

1. The openpyxl.load_workbook() function returns a Workbook object.
2. The get_sheet_names() method returns a list of strings of the worksheet names.
3. Call wb.get_sheet_by_name('Sheet1').
4. Call wb.get_active_sheet().
5. sheet['C5'].value or sheet.cell(row=5, column=3).value
6. sheet['C5'] = 'Hello' or sheet.cell(row=5, column=3).value = 'Hello'
7. cell.row and cell.column
8. They return the highest column and row with values in the sheet, respectively, as integer values.
9. openpyxl.cell.column_index_from_string('M')
10. openpyxl.cell.get_column_letter(14)
11. sheet['A1':'F1']
12. wb.save('example.xlsx')
13. A formula is set the same way as any value. Set the cell’s value attribute to a string of the formula text. Remember that formulas begin with the = sign.
14. When calling load_workbook(), pass True for the data_only keyword argument.
15. sheet.row_dimensions[5].height = 100
16. sheet.column_dimensions['C'].hidden = True
17. OpenPyXL 2.0.5 does not load freeze panes, print titles, images, or charts.
18. Freeze panes are rows and columns that will always appear on the screen. They are useful for headers.
19. openpyxl.charts.Reference(), openpyxl.charts.Series(), openpyxl.charts.BarChart(), chartObj.append(seriesObj), and add_chart()
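Answers 9 and 10 depend on converting between column letters and column numbers. The sketch below is a stand-alone illustration of that arithmetic in plain Python; it is not OpenPyXL's own code, and the function names letters_to_index() and index_to_letters() are hypothetical.

```python
def letters_to_index(letters):
    """Convert a column label such as 'M' or 'AA' to a 1-based column number."""
    index = 0
    for char in letters.upper():
        if not 'A' <= char <= 'Z':
            raise ValueError('Invalid column letter: %r' % char)
        # Each letter is a digit from 1 to 26 in a base-26 style number.
        index = index * 26 + (ord(char) - ord('A') + 1)
    return index


def index_to_letters(index):
    """Convert a 1-based column number to its column label."""
    if index < 1:
        raise ValueError('Column numbers start at 1.')
    letters = ''
    while index > 0:
        # Subtract 1 first because the letter "digits" run from 1 to 26, not 0 to 25.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord('A') + remainder) + letters
    return letters


column_number = letters_to_index('M')
column_label = index_to_letters(14)
```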

Chapter 13

1. A File object returned from open()
2. Read-binary ('rb') for PdfFileReader() and write-binary ('wb') for PdfFileWriter()
3. Calling getPage(4) will return a Page object for page 5, since page 0 is the first page.
4. The numPages variable stores an integer of the number of pages in the PdfFileReader object.
5. Call decrypt('swordfish').
6. The rotateClockwise() and rotateCounterClockwise() methods. The number of degrees to rotate is passed as an integer argument.
7. docx.Document('demo.docx')
8. A document contains multiple paragraphs. A paragraph begins on a new line and contains multiple runs. Runs are contiguous groups of characters within a paragraph.
9. Use doc.paragraphs.
10. A Run object has these variables (not a Paragraph).
11. True always makes the Run object bolded and False makes it always not bolded, no matter what the style’s bold setting is. None will make the Run object just use the style’s bold setting.
12. Call the docx.Document() function.
13. doc.add_paragraph('Hello there!')
14. The integers 0, 1, 2, 3, and 4

Chapter 14

1. In Excel, spreadsheets can have values of data types other than strings; cells can have different fonts, sizes, or color settings; cells can have varying widths and heights; adjacent cells can be merged; and you can embed images and charts.
2. You pass a File object, obtained from a call to open().
3. In Python 3, File objects need to be opened in text mode: read mode ('r') for Reader objects and write mode ('w'), with the newline='' keyword argument, for Writer objects.
4. The writerow() method
5. The delimiter argument changes the string used to separate cells in a row. The lineterminator argument changes the string used to separate rows.
6. json.loads()
7. json.dumps()
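To see the delimiter and lineterminator arguments and the two JSON functions at work, here is a small sketch. It assumes an in-memory io.StringIO buffer in place of a File object from open(), and the sample rows and JSON text are made up for the example.

```python
import csv
import io
import json

# An in-memory text buffer stands in for a File object returned by open().
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n\n')
writer.writerow(['apples', 'oranges', 'grapes'])
writer.writerow(['eggs', 'bacon', 'ham'])
tsv_text = buffer.getvalue()

# Read the rows back with the same delimiter. The doubled line terminator
# produces empty rows, which are skipped here.
reader = csv.reader(io.StringIO(tsv_text), delimiter='\t')
rows = [row for row in reader if row]

# json.loads() turns JSON text into a Python value; json.dumps() does the reverse.
json_text = '{"name": "Zophie", "isCat": true, "miceCaught": 0}'
python_value = json.loads(json_text)
round_trip_text = json.dumps(python_value)
```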

Chapter 15

1. A reference moment that many date and time programs use. The moment is January 1st, 1970, UTC.
2. time.time()
3. time.sleep(5)
4. It returns the closest integer to the argument passed. For example, round(2.4) returns 2.
5. A datetime object represents a specific moment in time. A timedelta object represents a duration of time.
6. threadObj = threading.Thread(target=spam)
7. threadObj.start()
8. Make sure that code running in one thread does not read or write the same variables as code running in another thread.
9. subprocess.Popen('c:\\Windows\\System32\\calc.exe')
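The time, datetime, and threading answers fit into one short sketch. The dates and the body of spam() are assumptions made for illustration; the main thread reads the results list only after join() returns, so the two threads never use it at the same moment.

```python
import datetime
import threading
import time

# The Unix epoch is the reference moment: January 1st, 1970, UTC.
epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
epoch_from_timestamp = datetime.datetime.fromtimestamp(0, datetime.timezone.utc)

# time.time() returns the number of seconds elapsed since the epoch.
seconds_since_epoch = time.time()

# round() returns the closest integer to its argument.
rounded = round(2.4)

# A datetime is a moment; a timedelta is a duration that can be added to it.
start = datetime.datetime(2015, 10, 21, 16, 29, 0)
duration = datetime.timedelta(days=1, hours=2)
later = start + duration
elapsed_seconds = duration.total_seconds()

results = []


def spam():
    # Hypothetical target function: it records that it ran.
    results.append('spam ran')


threadObj = threading.Thread(target=spam)
threadObj.start()
threadObj.join()   # wait for the thread to finish before reading results
```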

Chapter 16

1. SMTP and IMAP, respectively
2. smtplib.SMTP(), smtpObj.ehlo(), smtpObj.starttls(), and smtpObj.login()
3. imapclient.IMAPClient() and imapObj.login()
4. A list of strings of IMAP keywords, such as 'BEFORE <date>', 'FROM <string>', or 'SEEN'
5. Assign the variable imaplib._MAXLINE a large integer value, such as 10000000.
6. The pyzmail module reads downloaded emails.
7. You will need the Twilio account SID number, the authentication token number, and your Twilio phone number.

Chapter 17

1. An RGBA value is a tuple of 4 integers, each ranging from 0 to 255. The four integers correspond to the amount of red, green, blue, and alpha (transparency) in the color.
2. A function call to ImageColor.getcolor('CornflowerBlue', 'RGBA') will return (100, 149, 237, 255), the RGBA value for that color.
3. A box tuple is a tuple value of four integers: the left edge x-coordinate, the top edge y-coordinate, the right edge x-coordinate, and the bottom edge y-coordinate, respectively.
4. Image.open('zophie.png')
5. imageObj.size is a tuple of two integers, the width and the height.
6. imageObj.crop((0, 50, 50, 50)). Notice that you are passing a box tuple to crop(), not four separate integer arguments.
7. Call the imageObj.save('new_filename.png') method of the Image object.
8. The ImageDraw module contains code to draw on images.
9. ImageDraw objects have shape-drawing methods such as point(), line(), or rectangle(). They are returned by passing the Image object to the ImageDraw.Draw() function.
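Box tuples are easy to mix up with a position plus a width and height. The following plain-Python sketch assumes the (left, top, right, bottom) ordering that Pillow uses for box tuples; the helper names box_size() and box_from_size() are hypothetical and are not part of Pillow.

```python
# The RGBA value the answers give for CornflowerBlue: red, green, blue, alpha.
CORNFLOWER_BLUE = (100, 149, 237, 255)


def box_size(box):
    """Return the (width, height) covered by a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return (right - left, bottom - top)


def box_from_size(left, top, width, height):
    """Build a (left, top, right, bottom) box from a corner and a size."""
    return (left, top, left + width, top + height)


red, green, blue, alpha = CORNFLOWER_BLUE
is_opaque = alpha == 255   # an alpha of 255 means fully opaque

# The lower-left 50x50 region of a 100x100 image, expressed as a box tuple.
lower_left_box = box_from_size(0, 50, 50, 50)
lower_left_size = box_size(lower_left_box)
```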

Chapter 18

1. Move the mouse to the top-left corner of the screen, that is, the (0, 0) coordinates.
2. pyautogui.size() returns a tuple with two integers for the width and height of the screen.
3. pyautogui.position() returns a tuple with two integers for the x- and y-coordinates of the mouse cursor.
4. The moveTo() function moves the mouse to absolute coordinates on the screen, while the moveRel() function moves the mouse relative to the mouse’s current position.
5. pyautogui.dragTo() and pyautogui.dragRel()
6. pyautogui.typewrite('Hello world!')
7. Either pass a list of keyboard key strings to pyautogui.typewrite() (such as 'left') or pass a single keyboard key string to pyautogui.press().
8. pyautogui.screenshot('screenshot.png')
9. pyautogui.PAUSE = 2

Appendix D. Resources

Visit http://nostarch.com/automatestuff/ for resources, errata, and more information.


Index

A NOTE ON THE DIGITAL INDEX

A link in an index entry is displayed as the section title in which that entry appears. Because some sections have multiple index markers, it is not unusual for an entry to have several links to the same section. Clicking on any link will take you directly to the place in the text in which the marker appears.

Symbols

= (assignment) operator, String Concatenation and Replication, Comparison Operators
$ (dollar sign), Character Classes, Matching Newlines with the Dot Character
. (dot character), The Caret and Dollar Sign Characters
    using in paths, The Current Working Directory
    wildcard matches, The Caret and Dollar Sign Characters
" (double quotes), String Literals
** (exponent) operator, Entering Expressions into the Interactive Shell
== (equal to) operator, Boolean Values, Comparison Operators
/ (forward slash), Files and File Paths
    division operator, Entering Expressions into the Interactive Shell, The Multiple Assignment Trick
\ (backslash), String Literals, Creating Regex Objects, Matching Newlines with the Dot Character, Files and File Paths
    line continuation character, Example Program: Magic 8 Ball with a List
> (greater than) operator, Boolean Values
>= (greater than or equal to) operator, Boolean Values
# (hash character), Multiline Strings with Triple Quotes
// (integer division/floored quotient) operator, Entering Expressions into the Interactive Shell
< (less than) operator, Boolean Values
<= (less than or equal to) operator, Boolean Values
% (modulus/remainder) operator, Entering Expressions into the Interactive Shell, The Multiple Assignment Trick
* (multiplication) operator, Entering Expressions into the Interactive Shell, Getting a List’s Length with len(), The Multiple Assignment Trick
!= (not equal to) operator, Boolean Values
() (parentheses), Mutable and Immutable Data Types, Review of Regular Expression Matching
| (pipe character), Grouping with Parentheses, Managing Complex Regexes
+ (plus sign), Optional Matching with the Question Mark, Matching Newlines with the Dot Character
    addition operator, Entering Expressions into the Interactive Shell, The Integer, Floating-Point, and String Data Types, Getting a List’s Length with len(), The Multiple Assignment Trick
? (question mark), Matching Multiple Groups with the Pipe, Matching Newlines with the Dot Character
' (single quote), String Literals
[] (square brackets), The List Data Type, Matching Newlines with the Dot Character
* (star), Matching Newlines with the Dot Character
    using with wildcard character, The Wildcard Character
    zero or more matches with, Optional Matching with the Question Mark
- (subtraction) operator, Entering Expressions into the Interactive Shell, The Multiple Assignment Trick
^ (caret symbol), Matching Newlines with the Dot Character
    matching beginning of string, Character Classes
    negative character classes, Character Classes
''' (triple quotes), Escape Characters, Managing Complex Regexes
_ (underscore), Variable Names
: (colon), Blocks of Code, while Loop Statements, for Loops and the range() Function, Negative Indexes, Indexing and Slicing Strings
{} (curly brackets), Dictionaries and Structuring Data, Matching Newlines with the Dot Character
    greedy vs. nongreedy matching, Matching One or More with the Plus
    matching specific repetitions with, Matching One or More with the Plus

A

%A directive, Pausing Until a Specific Date
%a directive, Pausing Until a Specific Date
absolute paths, The Current Working Directory
abspath() function, The os.path Module
addition (+) operator, Entering Expressions into the Interactive Shell, The Integer, Floating-Point, and String Data Types, Getting a List’s Length with len(), The Multiple Assignment Trick
additive color model, Colors and RGBA Values
add_heading() method, Writing Word Documents
addPage() method, Creating PDFs
add_paragraph() method, Writing Word Documents
add_picture() method, Adding Headings
add_run() method, Writing Word Documents
algebraic chess notation, Pretty Printing
all_caps attribute, Run Attributes
ALL search key, Selecting a Folder
alpha, defined, Computer Image Fundamentals
and operator, Comparison Operators
ANSWERED search key, Performing the Search
API (application programming interface), Step 3: Write Out the CSV File Without the First Row
append() method, Methods
application-specific passwords, Logging in to the SMTP Server
args keyword, Passing Arguments to the Thread’s Target Function
arguments, function, Comments, def Statements with Parameters
    keyword arguments, Return Values and return Statements
    passing to processes, Launching Other Programs from Python
    passing to threads, Multithreading
assertions, Assertions
assignment (=) operator, String Concatenation and Replication, Comparison Operators
AT&T mail, Connecting to an SMTP Server, Retrieving and Deleting Emails with IMAP
attributes, HTML, A Quick Refresher, Getting Data from an Element’s Attributes
augmented assignment operators, The Multiple Assignment Trick

B

\b backspace escape character, Step 3: Get and Print the Mouse Coordinates
%B directive, Pausing Until a Specific Date
%b directive, Pausing Until a Specific Date
back() method, Sending Special Keys
backslash (\), String Literals, Creating Regex Objects, Matching Newlines with the Dot Character, Files and File Paths
BarChart() function, Charts
basename() function, Handling Absolute and Relative Paths
BCC search key, Performing the Search
Beautiful Soup, Parsing HTML with the BeautifulSoup Module
    (see also bs4 module)
BeautifulSoup objects, Parsing HTML with the BeautifulSoup Module
BEFORE search key, Selecting a Folder
binary files, Finding File Sizes and Folder Contents, Writing to Files
binary operators, Comparison Operators
bitwise or operator, Managing Complex Regexes
blank strings, The Integer, Floating-Point, and String Data Types
blocking execution, The time.time() Function
blocks of code, Mixing Boolean and Comparison Operators
BODY search key, Selecting a Folder
bold attribute, Run Attributes
Boolean data type
    binary operators, Comparison Operators
    flow control and, Flow Control
    in operator, The in and not in Operators
    not in operator, The in and not in Operators
    “truthy” and “falsey” values, continue Statements
    using binary and comparison operators together, Binary Boolean Operators
box tuples, Coordinates and Box Tuples
breakpoints, debugging using, Debugging a Number Adding Program
break statements
    overview, An Annoying while Loop
    using in for loop, for Loops and the range() Function
browser, opening using webbrowser module, Web Scraping
bs4 module
    creating object from HTML, Parsing HTML with the BeautifulSoup Module
    finding element with select() method, Creating a BeautifulSoup Object from HTML
    getting attribute, Getting Data from an Element’s Attributes
    overview, Parsing HTML with the BeautifulSoup Module
built-in functions, Importing Modules
bulleted list, creating in Wiki markup, Project: Adding Bullets to Wiki Markup
    copying and pasting clipboard, Project: Adding Bullets to Wiki Markup
    joining modified lines, Step 3: Join the Modified Lines
    overview, Project: Adding Bullets to Wiki Markup
    separating lines of text, Step 1: Copy and Paste from the Clipboard

C

calling functions, Comments
call stack, defined, Raising Exceptions
camelcase, Variable Names
caret symbol (^), Matching Newlines with the Dot Character
    matching beginning of string, Character Classes
    negative character classes, Character Classes
Cascading Style Sheets (CSS)
    matching with selenium module, Finding Elements on the Page
    selectors, Creating a BeautifulSoup Object from HTML
case sensitivity, Variable Names, Case-Insensitive Matching
CC search key, Performing the Search
Cell objects, Getting Sheets from the Workbook
cells, in Excel spreadsheets, Working with Excel Spreadsheets
    accessing Cell object by its name, Getting Sheets from the Workbook
    merging and unmerging, Setting Row Height and Column Width
    writing values to, Creating and Removing Sheets
center() method, The join() and split() String Methods, Image Recognition
chaining method calls, Rotating and Flipping Images
character classes, The findall() Method, Matching Newlines with the Dot Character
character styles, Styling Paragraph and Run Objects
charts, Excel, Freeze Panes
chdir() function, The Current Working Directory
Chrome, developer tools in, Viewing the Source HTML of a Web Page
clear() method, Finding Elements on the Page
click() function, Clicking the Mouse, Review of the PyAutoGUI Functions, Project: Automatic Form Filler
clicking mouse, Clicking the Mouse
click() method, Finding Elements on the Page
clipboard, using string from, Step 3: Handle the Clipboard Content and Launch the Browser
CMYK color model, Colors and RGBA Values
colon (:), Blocks of Code, while Loop Statements, for Loops and the range() Function, Negative Indexes, Indexing and Slicing Strings
color values
    CMYK vs. RGB color models, Colors and RGBA Values
    RGBA values, Computer Image Fundamentals
column_index_from_string() function, Converting Between Column Letters and Numbers
columns, in Excel spreadsheets
    setting height and width of, Formulas
    slicing Worksheet objects to get Cell objects in, Converting Between Column Letters and Numbers
Comcast mail, Connecting to an SMTP Server, Retrieving and Deleting Emails with IMAP
comma-delimited items, The List Data Type
command line arguments, Step 1: Figure Out the URL
commentAfterDelay() function, Pressing and Releasing the Keyboard
comments
    multiline, Multiline Strings with Triple Quotes
    overview, Comments
comparison operators
    overview, Boolean Values
    using binary operators with, Binary Boolean Operators
compile() function, Creating Regex Objects, Review of Regular Expression Matching, Managing Complex Regexes
compressed files
    backing up folder into, Step 3: Form the New Filename and Rename the Files
    creating ZIP files, Extracting from ZIP Files
    extracting ZIP files, Extracting from ZIP Files
    overview, Walking a Directory Tree
    reading ZIP files, Compressing Files with the zipfile Module
computer screen
    coordinates of, Pauses and Fail-Safes
    resolution of, Controlling Mouse Movement
concatenation
    of lists, Getting a List’s Length with len()
    string, The Integer, Floating-Point, and String Data Types
concurrency issues, Passing Arguments to the Thread’s Target Function
conditions, defined, Mixing Boolean and Comparison Operators
continue statements
    overview, continue Statements
    using in for loop, for Loops and the range() Function
Coordinated Universal Time (UTC), The time Module
coordinates
    of computer screen, Pauses and Fail-Safes
    of an image, Colors and RGBA Values
copy() function, Passing References, Removing Whitespace with strip(), rstrip(), and lstrip(), Organizing Files, Copying and Pasting Images onto Other Images
copytree() function, Organizing Files
countdown project, Project: Simple Countdown Program
    counting down, Project: Simple Countdown Program
    overview, Project: Simple Countdown Program
    playing sound file, Project: Simple Countdown Program
cProfile.run() function, The time.time() Function
crashes, program, Entering Expressions into the Interactive Shell
create_sheet() method, Creating and Removing Sheets
CRITICAL level, Logging Levels
cron, Launching Other Programs from Python
cropping images, Working with the Image Data Type
CSS (Cascading Style Sheets)
    matching with selenium module, Finding Elements on the Page
    selectors, Creating a BeautifulSoup Object from HTML
CSV files
    defined, Working with CSV Files and JSON Data
    delimiter for, The delimiter and lineterminator Keyword Arguments
    format overview, Working with CSV Files and JSON Data
    line terminator for, The delimiter and lineterminator Keyword Arguments
    Reader objects, Reader Objects
    reading data in loop, Reading Data from Reader Objects in a for Loop
    removing header from, The delimiter and lineterminator Keyword Arguments
        looping through CSV files, Project: Removing the Header from CSV Files
        overview, The delimiter and lineterminator Keyword Arguments
        reading in CSV file, Project: Removing the Header from CSV Files
        writing out CSV file, Step 2: Read in the CSV File
    Writer objects, Reading Data from Reader Objects in a for Loop
curly brackets ({}), Dictionaries and Structuring Data, Matching Newlines with the Dot Character
    greedy vs. nongreedy matching, Matching One or More with the Plus
    matching specific repetitions with, Matching One or More with the Plus
current working directory, The Current Working Directory

D

\D character class, The findall() Method
\d character class, The findall() Method
%d directive, Pausing Until a Specific Date
data structures
    algebraic chess notation, Pretty Printing
    tic-tac-toe board, Using Data Structures to Model Real-World Things
data types
    Booleans, Flow Control
    defined, Entering Expressions into the Interactive Shell
    dictionaries, Dictionaries and Structuring Data
    floating-point numbers, The Integer, Floating-Point, and String Data Types
    integers, The Integer, Floating-Point, and String Data Types
    list() function, The Tuple Data Type
    lists, The List Data Type
    mutable vs. immutable, List-like Types: Strings and Tuples
    None value, Return Values and return Statements
    strings, The Integer, Floating-Point, and String Data Types
    tuple() function, The Tuple Data Type
    tuples, Mutable and Immutable Data Types
datetime module
    arithmetic using, The timedelta Data Type
    converting objects to strings, Pausing Until a Specific Date
    converting strings to objects, Converting datetime Objects into Strings
    fromtimestamp() function, The datetime Module
    now() function, The datetime Module
    overview, The datetime Module, Review of Python’s Time Functions
    pausing program until time, Pausing Until a Specific Date
    timedelta data type, The datetime Module
    total_seconds() method, The datetime Module
datetime objects, The datetime Module
    converting to strings, Pausing Until a Specific Date
    converting from strings to, Converting datetime Objects into Strings
debug() function, Using the logging Module
debugging
    assertions, Assertions
    defined, What Is Python?
    getting traceback as string, Raising Exceptions
    in IDLE
        overview, Disabling Logging
        stepping through program, Over
        using breakpoints, Debugging a Number Adding Program
    logging
        disabling, Logging Levels
        to file, Disabling Logging
        levels of, Using the logging Module
        logging module, Using an Assertion in a Traffic Light Simulation
        print() function and, Using the logging Module
    raising exceptions, Debugging
DEBUG level, Using the logging Module
decimal numbers (see floating-point numbers)
decode() method, Getting Email Addresses from a Raw Message
decryption, of PDF files, Extracting Text from PDFs
deduplicating code, Functions
deepcopy() function, Passing References
def statements, Functions
    with parameters, def Statements with Parameters
DELETED search key, Performing the Search
delete_messages() method, Getting the Body from a Raw Message
deleting files/folders
    permanently, Moving and Renaming Files and Folders
    using send2trash module, Permanently Deleting Files and Folders
del statements, Removing Values from Lists with del Statements
dictionaries
    copy() function, Passing References
    deepcopy() function, Passing References
    get() method, Checking Whether a Key or Value Exists in a Dictionary
    in operator, Checking Whether a Key or Value Exists in a Dictionary
    items() method, Dictionaries vs. Lists
    keys() method, Dictionaries vs. Lists
    lists vs., The Dictionary Data Type
    nesting, A Tic-Tac-Toe Board
    not in operator, Checking Whether a Key or Value Exists in a Dictionary
    overview, Dictionaries and Structuring Data
    setdefault() method, The setdefault() Method
    values() method, Dictionaries vs. Lists
directories
    absolute vs. relative paths, The Current Working Directory
    backslash vs. forward slash, Files and File Paths
    copying, Organizing Files
    creating, Absolute vs. Relative Paths
    current working directory, The Current Working Directory
    defined, Reading and Writing Files
    deleting permanently, Moving and Renaming Files and Folders
    deleting using send2trash module, Permanently Deleting Files and Folders
    moving, Copying Files and Folders
    os.path module
        absolute paths in, The os.path Module
        file sizes, Handling Absolute and Relative Paths
        folder contents, Handling Absolute and Relative Paths
        overview, The os.path Module
        path validity, Finding File Sizes and Folder Contents
        relative paths in, The os.path Module
    renaming, Copying Files and Folders
    walking, Safe Deletes with the send2trash Module
dirname() function, Handling Absolute and Relative Paths
disable() function, Logging Levels
division (/) operator, Entering Expressions into the Interactive Shell, The Multiple Assignment Trick
Document objects, Word Documents
dollar sign ($), Character Classes, Matching Newlines with the Dot Character
dot character (.), The Caret and Dollar Sign Characters
    using in paths, The Current Working Directory
    wildcard matches, The Caret and Dollar Sign Characters
dot-star character (.*), The Wildcard Character
doubleClick() function, Clicking the Mouse, Review of the PyAutoGUI Functions
double quotes ("), String Literals
double_strike attribute, Run Attributes
downloading
    files from web, Saving Downloaded Files to the Hard Drive
    web pages, Downloading Files from the Web with the requests Module
    XKCD comics, Step 3: Open Web Browsers for Each Result, Project: Multithreaded XKCD Downloader
DRAFT search key, Performing the Search
dragging mouse, Clicking the Mouse
dragRel() function, Clicking the Mouse, Dragging the Mouse, Review of the PyAutoGUI Functions
dragTo() function, Clicking the Mouse, Review of the PyAutoGUI Functions
drawing on images
    ellipses, Lines
    example program, Lines
    ImageDraw module, Ideas for Similar Programs
    lines, Ideas for Similar Programs
    points, Ideas for Similar Programs
    polygons, Lines
    rectangles, Lines
    text, Drawing Example
dumps() function, Reading JSON with the loads() Function
duration keyword arguments, Controlling Mouse Movement

E

ehlo() method, Connecting to an SMTP Server, Step 3: Send Customized Email Reminders
elements, HTML, Saving Downloaded Files to the Hard Drive
elif statements, else Statements
ellipse() method, Lines
else statements, if Statements
email addresses, extracting, Combining re.IGNORECASE, re.DOTALL, and re.VERBOSE
    creating regex, Project: Phone Number and Email Address Extractor
    finding matches on clipboard, Step 2: Create a Regex for Email Addresses
    joining matches into a string, Step 3: Find All Matches in the Clipboard Text
    overview, Combining re.IGNORECASE, re.DOTALL, and re.VERBOSE
emails
    deleting, Getting the Body from a Raw Message
    disconnecting from server, Getting the Body from a Raw Message
    fetching
        folders, Connecting to an IMAP Server
        getting message content, Size Limits
        logging into server, Connecting to an IMAP Server
        overview, Disconnecting from the SMTP Server
        raw messages, Fetching an Email and Marking It As Read
    gmail_search() method, Size Limits
    IMAP, Disconnecting from the SMTP Server
    marking message as read, Size Limits
    searching, Connecting to an IMAP Server
    sending
        connecting to SMTP server, Connecting to an SMTP Server
        disconnecting from server, Disconnecting from the SMTP Server
        logging into server, Connecting to an SMTP Server
        overview, SMTP
        reminder, Disconnecting from the IMAP Server
        sending “hello” message, Connecting to an SMTP Server
        sending message, Logging in to the SMTP Server
        TLS encryption, Connecting to an SMTP Server
    SMTP, SMTP
emboss attribute, Run Attributes
encryption, of PDF files, Overlaying Pages
endswith() method, The isX String Methods
epoch timestamps, The time Module, The datetime Module, Review of Python’s Time Functions
equal to (==) operator, Boolean Values, Comparison Operators
ERROR level, Logging Levels
errors
    crashes and, Entering Expressions into the Interactive Shell
    help for, Starting IDLE
escape characters, String Literals
evaluation, defined, Entering Expressions into the Interactive Shell
Excel spreadsheets
    application support, Working with Excel Spreadsheets
    charts in, Freeze Panes
    column width, Formulas
    converting between column letters and numbers, Converting Between Column Letters and Numbers
    creating documents, Ideas for Similar Programs
    creating worksheets, Creating and Removing Sheets
    deleting worksheets, Creating and Removing Sheets
    font styles, Setting the Font Style of Cells
    formulas in, Font Objects
    freezing panes, Merging and Unmerging Cells
    getting cell values, Getting Sheets from the Workbook
    getting rows and columns, Converting Between Column Letters and Numbers
    getting worksheet names, Getting Sheets from the Workbook
    merging and unmerging cells, Setting Row Height and Column Width
    opening documents, Reading Excel Documents
    openpyxl module, Working with Excel Spreadsheets
    overview, Working with Excel Spreadsheets
    reading files
        overview, Getting Rows and Columns from the Sheets
        populating data structure, Step 1: Read the Spreadsheet Data
        reading data, Project: Reading Data from a Spreadsheet
        writing results to file, Step 2: Populate the Data Structure
    and reminder emails project, Disconnecting from the IMAP Server
    row height, Formulas
    saving workbooks, Ideas for Similar Programs
    updating, Writing Values to Cells
        overview, Writing Values to Cells
        setup, Project: Updating a Spreadsheet
    workbooks vs., Working with Excel Spreadsheets
    writing values to cells, Creating and Removing Sheets
Exception objects, Raising Exceptions
exceptions
    assertions and, Assertions
    getting traceback as string, Raising Exceptions
    handling, The global Statement
    raising, Debugging
execution, program
    defined, Flow Control
    overview, Blocks of Code
    pausing until specific time, Pausing Until a Specific Date
    terminating program with sys.exit(), Importing Modules
exists() function, Finding File Sizes and Folder Contents
exit codes, Launching Other Programs from Python
expand keyword, Rotating and Flipping Images
exponent (**) operator, Entering Expressions into the Interactive Shell
expressions
    conditions and, Mixing Boolean and Comparison Operators
    in interactive shell, Entering Expressions into the Interactive Shell
expunge() method, Getting the Body from a Raw Message
extensions, file, Reading and Writing Files
extractall() method, Extracting from ZIP Files
extracting ZIP files, Extracting from ZIP Files
extract() method, Extracting from ZIP Files

F

FailSafeException exception, Step 2: Set Up Coordinates
“falsey” values, continue Statements
fetch() method, Performing the Search, Size Limits
file editor, Variable Names
file management
    absolute vs. relative paths, The Current Working Directory
    backslash vs. forward slash, Files and File Paths
    compressed files
        backing up to, Step 3: Form the New Filename and Rename the Files
        creating ZIP files, Extracting from ZIP Files
        extracting ZIP files, Extracting from ZIP Files
        overview, Walking a Directory Tree
        reading ZIP files, Compressing Files with the zipfile Module
    creating directories, Absolute vs. Relative Paths
    current working directory, The Current Working Directory
    multiclipboard project, Step 4: Write Content to the Quiz and Answer Key Files
    opening files, The File Reading/Writing Process
    os.path module
        absolute paths in, The os.path Module
        file sizes, Handling Absolute and Relative Paths
        folder contents, Handling Absolute and Relative Paths
        overview, The os.path Module
        path validity, Finding File Sizes and Folder Contents
        relative paths in, The os.path Module
    overview, Reading and Writing Files
    paths, Reading and Writing Files
    plaintext vs. binary files, Finding File Sizes and Folder Contents
    reading files, Opening Files with the open() Function
    renaming files, date styles, Creating and Adding to ZIP Files
    saving variables with pformat() function, Saving Variables with the shelve Module
    send2trash module, Permanently Deleting Files and Folders
    shelve module, Writing to Files
    shutil module
        copying files/folders, Organizing Files
        deleting files/folders, Moving and Renaming Files and Folders
        moving files/folders, Copying Files and Folders
        renaming files/folders, Copying Files and Folders
    walking directory trees, Safe Deletes with the send2trash Module
    writing files, Reading the Contents of Files
filenames, defined, Reading and Writing Files
File objects, Opening Files with the open() Function
findall() method, Greedy and Nongreedy Matching
find_element_by_* methods, Starting a Selenium-Controlled Browser
find_elements_by_* methods, Starting a Selenium-Controlled Browser
Firefox, developer tools in, Opening Your Browser’s Developer Tools
FLAGGED search key, Performing the Search
flipping images, Rotating and Flipping Images
float() function, The len() Function
floating-point numbers
    integer equivalence, The str(), int(), and float() Functions
    overview, The Integer, Floating-Point, and String Data Types
    rounding, The time.sleep() Function
flow control
    binary operators, Comparison Operators
    blocks of code, Mixing Boolean and Comparison Operators
    Boolean values and, Flow Control
    break statements, An Annoying while Loop
    comparison operators, Boolean Values
    conditions, Mixing Boolean and Comparison Operators
    continue statements, continue Statements
    elif statements, else Statements
    else statements, if Statements
    if statements, Blocks of Code
    overview, Flow Control
    using binary and comparison operators together, Binary Boolean Operators
    while loops, while Loop Statements
folders
    absolute vs. relative paths, The Current Working Directory
    backing up to ZIP file, Step 3: Form the New Filename and Rename the Files
        creating new ZIP file, Step 1: Figure Out the ZIP File’s Name
        figuring out ZIP filename, Project: Backing Up a Folder into a ZIP File
        walking directory tree, Step 1: Figure Out the ZIP File’s Name
    backslash vs. forward slash, Files and File Paths
    copying, Organizing Files
    creating, Absolute vs. Relative Paths
    current working directory, The Current Working Directory
    defined, Reading and Writing Files
    deleting permanently, Moving and Renaming Files and Folders
    deleting using send2trash module, Permanently Deleting Files and Folders
    moving, Copying Files and Folders
    os.path module
        absolute paths in, The os.path Module
        file sizes, Handling Absolute and Relative Paths
        folder contents, Handling Absolute and Relative Paths
        overview, The os.path Module
        path validity, Finding File Sizes and Folder Contents
        relative paths in, The os.path Module
    renaming, Copying Files and Folders
    walking directory trees, Safe Deletes with the send2trash Module
Font objects, Setting the Font Style of Cells
font styles, in Excel spreadsheets, Setting the Font Style of Cells
for loops
    overview, continue Statements
    using dictionary items in, The keys(), values(), and items() Methods
    using lists with, Using for Loops with Lists
format attribute, Working with the Image Data Type
format_description attribute, Working with the Image Data Type
formData list, Step 2: Set Up Coordinates
form filler project, Review of the PyAutoGUI Functions
    overview, Review of the PyAutoGUI Functions
    radio buttons, Step 3: Start Typing Data
    select lists, Step 3: Start Typing Data
    setting up coordinates, Step 1: Figure Out the Steps
    steps in process, Project: Automatic Form Filler
    submitting form, Step 4: Handle Select Lists and Radio Buttons
    typing data, Step 2: Set Up Coordinates
formulas, in Excel spreadsheets, Font Objects
forward() method, Sending Special Keys
forward slash (/), Files and File Paths
FROM search key, Performing the Search
fromtimestamp() function, The datetime Module, Review of Python’s Time Functions
functions, Review of the PyAutoGUI Functions
    (see also names of individual functions)
    arguments, Comments, def Statements with Parameters
    as “black box”, The global Statement
    built-in, Importing Modules
    def statements, def Statements with Parameters
    exception handling, The global Statement
    keyword arguments, Return Values and return Statements
    None value and, Return Values and return Statements
    overview, Functions
    parameters, def Statements with Parameters
    return values, def Statements with Parameters

G

get_active_sheet()method,GettingSheetsfromtheWorkbook

get_addresses()method,GettingEmailAddressesfromaRawMessage

get_attribute()method,FindingElementsonthePage

getcolor()function,ComputerImageFundamentals,WorkingwiththeImageDataType

get_column_letter()function,ConvertingBetweenColumnLettersandNumbers

getcwd()function,TheCurrentWorkingDirectory

get()function

overview,CheckingWhetheraKeyorValueExistsinaDictionary

requestsmodule,DownloadingFilesfromtheWebwiththerequestsModule

get_highest_column()method,GettingCellsfromtheSheets,Step1:OpentheExcelFile

get_highest_row()method,GettingCellsfromtheSheets

get_payload()method,GettingEmailAddressesfromaRawMessage

getpixel()function,ChangingIndividualPixels,ScrollingtheMouse,AnalyzingtheScreenshot

get_sheet_by_name()method,GettingSheetsfromtheWorkbook

get_sheet_names()method,GettingSheetsfromtheWorkbook

getsize()function,HandlingAbsoluteandRelativePaths

get_subject()method,GettingEmailAddressesfromaRawMessage

getText()function,ReadingWordDocuments

GIFformat,WorkingwiththeImageDataType

globalscope,LocalandGlobalVariableswiththeSameName

Gmail,ConnectingtoanSMTPServer,LoggingintotheSMTPServer,RetrievingandDeletingEmailswithIMAP

gmail_search()method,SizeLimits

GoogleMaps,WebScraping

graphicaluserinterfaceautomation(seeGUI(graphicaluserinterface)automation)

greaterthan(>)operator,BooleanValues

greaterthanorequalto(>=)operator,BooleanValues

greedymatching

dot-starfor,TheWildcardCharacter

inregularexpressions,MatchingOneorMorewiththePlus

group()method,CreatingRegexObjects,ReviewofRegularExpressionMatching

groups,regularexpression

matching

greedy,MatchingOneorMorewiththePlus

nongreedy,GreedyandNongreedyMatching

oneormore,OptionalMatchingwiththeQuestionMark

optional,MatchingMultipleGroupswiththePipe

specificrepetitions,MatchingOneorMorewiththePlus

zeroormore,OptionalMatchingwiththeQuestionMark

usingparentheses,ReviewofRegularExpressionMatching

usingpipecharacterin,GroupingwithParentheses

GuesstheNumberprogram,ExceptionHandling

GUI(graphicaluserinterface)automation,ReviewofthePyAutoGUIFunctions

(seealsoformfillerproject)

controllingkeyboard,ImageRecognition

hotkeycombinations,PressingandReleasingtheKeyboard

keynames,SendingaStringfromtheKeyboard

pressingandreleasing,KeyNames

sendingstringfromkeyboard,ImageRecognition

controllingmouse,PausesandFail-Safes,Step3:GetandPrinttheMouseCoordinates

clickingmouse,ClickingtheMouse

draggingmouse,ClickingtheMouse

scrollingmouse,DraggingtheMouse

determiningmouseposition,MovingtheMouse

imagerecognition,Project:ExtendingthemouseNowProgram

installingpyautoguimodule,ControllingtheKeyboardandMousewithGUIAutomation

loggingoutofprogram,ControllingtheKeyboardandMousewithGUIAutomation

overview,ControllingtheKeyboardandMousewithGUIAutomation

screenshots,ScrollingtheMouse

stoppingprogram,ControllingtheKeyboardandMousewithGUIAutomation

H

%Hdirective,PausingUntilaSpecificDate

hashcharacter(#),MultilineStringswithTripleQuotes

headings,Worddocument,WritingWordDocuments

help

askingonline,HowtoFindHelp

forerrormessages,StartingIDLE

hotkeycombinations,PressingandReleasingtheKeyboard

hotkey()function,PressingandReleasingtheKeyboard,ReviewofthePyAutoGUIFunctions

Hotmail.com,ConnectingtoanSMTPServer,RetrievingandDeletingEmailswithIMAP

HTML(HypertextMarkupLanguage)

browserdevelopertoolsand,ViewingtheSourceHTMLofaWebPage

findingelements,UsingtheDeveloperToolstoFindHTMLElements

learningresources,SavingDownloadedFilestotheHardDrive

overview,SavingDownloadedFilestotheHardDrive

viewingpagesource,AQuickRefresher

I

%Idirective,PausingUntilaSpecificDate

idattribute,AQuickRefresher

IDLE(interactivedevelopmentenvironment)

creatingprograms,VariableNames

debuggingin

overview,DisablingLogging

steppingthroughprogram,Over

usingbreakpoints,DebuggingaNumberAddingProgram

expressionsin,EnteringExpressionsintotheInteractiveShell

overview,StartingIDLE

runningscriptsoutsideof,CopyingandPastingStringswiththepyperclipModule

starting,DownloadingandInstallingPython

ifstatements

overview,BlocksofCode

usinginwhileloop,whileLoopStatements

ImageDrawmodule,IdeasforSimilarPrograms

ImageDrawobjects,IdeasforSimilarPrograms

ImageFontobjects,DrawingExample

Imageobjects,ManipulatingImageswithPillow

images

addinglogoto,Project:AddingaLogo

attributesfor,WorkingwiththeImageDataType

boxtuples,CoordinatesandBoxTuples

colorvaluesin,ComputerImageFundamentals

coordinatesin,ColorsandRGBAValues

copyingandpastingin,CopyingandPastingImagesontoOtherImages

cropping,WorkingwiththeImageDataType

drawingon

exampleprogram,Lines

ellipses,Lines

ImageDrawmodule,IdeasforSimilarPrograms

lines,IdeasforSimilarPrograms

points,IdeasforSimilarPrograms

polygons,Lines

rectangles,Lines

text,DrawingExample

flipping,RotatingandFlippingImages

openingwithPillow,CoordinatesandBoxTuples

pixelmanipulation,ChangingIndividualPixels

recognitionof,Project:ExtendingthemouseNowProgram

resizing,CopyingandPastingImagesontoOtherImages

RGBAvalues,ComputerImageFundamentals

rotating,RotatingandFlippingImages

transparentpixels,CopyingandPastingImagesontoOtherImages

IMAP(InternetMessageAccessProtocol)

defined,DisconnectingfromtheSMTPServer

deletingmessages,GettingtheBodyfromaRawMessage

disconnectingfromserver,GettingtheBodyfromaRawMessage

fetchingmessages,SizeLimits

folders,ConnectingtoanIMAPServer

loggingintoserver,ConnectingtoanIMAPServer

searchingmessages,ConnectingtoanIMAPServer

imapclientmodule,DisconnectingfromtheSMTPServer

IMAPClientobjects,RetrievingandDeletingEmailswithIMAP

immutabledatatypes,List-likeTypes:StringsandTuples

importingmodules

overview,ImportingModules

pyautoguimodule,MovingtheMouse

imprintattribute,RunAttributes

imvariable,ScrollingtheMouse

indentation,ExampleProgram:Magic8BallwithaList

indexes

fordictionaries(seekeys,dictionary)

forlists

changingvaluesusing,GettingaList’sLengthwithlen()

gettingvalueusing,TheListDataType

negative,NegativeIndexes

removingvaluesfromlistusing,RemovingValuesfromListswithdelStatements

forstrings,MultilineStringswithTripleQuotes

IndexError,TheDictionaryDataType

index()method,Methods

infiniteloops,AnAnnoyingwhileLoop,continueStatements,Step1:ImporttheModule

INFOlevel,UsingtheloggingModule

inoperator

usingwithdictionaries,CheckingWhetheraKeyorValueExistsinaDictionary

usingwithlists,TheinandnotinOperators

usingwithstrings,IndexingandSlicingStrings

input()function

overview,Comments,Methods

usingforsensitiveinformation,LoggingintotheSMTPServer

installing

openpyxlmodule,WorkingwithExcelSpreadsheets

pyautoguimodule,ControllingtheKeyboardandMousewithGUIAutomation

Python,AboutThisBook

seleniummodule,Step4:SavetheImageandFindthePreviousComic

third-partymodules,InstallingThird-PartyModules

int,TheInteger,Floating-Point,andStringDataTypes

(seealsointegers)

integerdivision/flooredquotient(//)operator,EnteringExpressionsintotheInteractiveShell

integers

floating-pointequivalence,Thestr(),int(),andfloat()Functions

overview,TheInteger,Floating-Point,andStringDataTypes

interactivedevelopmentenvironment(seeIDLE(interactivedevelopmentenvironment))

interactiveshell(seeIDLE)

InternetExplorer,developertoolsin,ViewingtheSourceHTMLofaWebPage

InternetMessageAccessProtocol(seeIMAP(InternetMessageAccessProtocol))

interpreter,Python,DownloadingandInstallingPython

int()function,Thelen()Function

isabs()function,Theos.pathModule

isalnum()method,Theupper(),lower(),isupper(),andislower()StringMethods

isalpha()method,Theupper(),lower(),isupper(),andislower()StringMethods

isdecimal()method,Theupper(),lower(),isupper(),andislower()StringMethods

isdir()function,FindingFileSizesandFolderContents

is_displayed()method,FindingElementsonthePage

is_enabled()method,FindingElementsonthePage

isfile()function,FindingFileSizesandFolderContents

islower()method,Theupper(),lower(),isupper(),andislower()StringMethods

is_selected()method,FindingElementsonthePage

isspace()method,TheisXStringMethods

istitle()method,TheisXStringMethods

isupper()method,Theupper(),lower(),isupper(),andislower()StringMethods

italicattribute,RunAttributes

items()method,Dictionariesvs.Lists

iter_content()method,SavingDownloadedFilestotheHardDrive

J

%jdirective,PausingUntilaSpecificDate

join()method,TheisXStringMethods,FilesandFilePaths,Theos.pathModule,Step2:CreateandStartThreads

JPEGformat,WorkingwiththeImageDataType

JSONfiles

APIsfor,Step3:WriteOuttheCSVFileWithouttheFirstRow

defined,WorkingwithCSVFilesandJSONData

formatoverview,Step3:WriteOuttheCSVFileWithouttheFirstRow

reading,JSONandAPIs

andweatherdataproject,ReadingJSONwiththeloads()Function

writing,ReadingJSONwiththeloads()Function

justifyingtext,Thejoin()andsplit()StringMethods

K

keyboard

controlling,withPyAutoGUI

hotkeycombinations,PressingandReleasingtheKeyboard

pressingandreleasingkeys,KeyNames

sendingstringfromkeyboard,ImageRecognition

keynames,SendingaStringfromtheKeyboard

KeyboardInterruptexception,Step2:TrackandPrintLapTimes,MovingtheMouse,Step1:ImporttheModule

keyDown()function,KeyNames,PressingandReleasingtheKeyboard,ReviewofthePyAutoGUIFunctions

keys,dictionary,DictionariesandStructuringData

keys()method,Dictionariesvs.Lists

keyUp()function,KeyNames,PressingandReleasingtheKeyboard,ReviewofthePyAutoGUIFunctions

keywordarguments,ReturnValuesandreturnStatements

L

LARGERsearchkey,PerformingtheSearch

launchd,LaunchingOtherProgramsfromPython

launchingprograms

andcountdownproject,Project:SimpleCountdownProgram

openingfileswithdefaultapplications,TaskScheduler,launchd,andcron

openingwebsites,TaskScheduler,launchd,andcron

overview,Step2:CreateandStartThreads

passingcommandlineargumentstoprocesses,LaunchingOtherProgramsfromPython

poll()method,LaunchingOtherProgramsfromPython

runningPythonscripts,TaskScheduler,launchd,andcron

scheduling,LaunchingOtherProgramsfromPython

sleep()function,TaskScheduler,launchd,andcron

wait()method,LaunchingOtherProgramsfromPython

len()function,WordDocuments

findingnumberofvaluesinlist,GettingaList’sLengthwithlen()

overview,Theinput()Function

lessthan(<)operator,BooleanValues

lessthanorequalto(<=)operator,BooleanValues

LibreOffice,WorkingwithExcelSpreadsheets,Step4:SavetheResults

linebreaks,Worddocument,AddingHeadings

LineChart()function,Charts

linecontinuationcharacter(\),ExampleProgram:Magic8BallwithaList

line()method,IdeasforSimilarPrograms

linkedstyles,StylingParagraphandRunObjects

Linux

backslashvs.forwardslash,FilesandFilePaths

cron,LaunchingOtherProgramsfromPython

installingPython,DownloadingandInstallingPython

installingthird-partymodules,ThepipTool

launchingprocessesfromPython,LaunchingOtherProgramsfromPython

loggingoutofautomationprogram,ControllingtheKeyboardandMousewithGUIAutomation

openingfileswithdefaultapplications,TaskScheduler,launchd,andcron

piptoolon,InstallingThird-PartyModules

Pythonsupport,WhatIsPython?

runningPythonprogramson,RunningPythonProgramsonOSXandLinux

startingIDLE,StartingIDLE

Unixphilosophy,OpeningFileswithDefaultApplications

listdir()function,HandlingAbsoluteandRelativePaths

list_folders()method,ConnectingtoanIMAPServer

list()function,ReaderObjects,ImageRecognition

lists

append()method,Methods

augmentedassignmentoperators,TheMultipleAssignmentTrick

changingvaluesusingindex,GettingaList’sLengthwithlen()

concatenationof,GettingaList’sLengthwithlen()

copy()function,PassingReferences

deepcopy()function,PassingReferences

dictionariesvs.,TheDictionaryDataType

findingnumberofvaluesusinglen(),GettingaList’sLengthwithlen()

gettingsublistswithslices,NegativeIndexes

gettingvalueusingindex,TheListDataType

index()method,Methods

inoperator,TheinandnotinOperators

insert()method,Methods

list()function,TheTupleDataType

Magic8Ballexampleprogramusing,SortingtheValuesinaListwiththesort()Method

multipleassignmenttrick,TheinandnotinOperators

mutablevs.immutabledatatypes,List-likeTypes:StringsandTuples

negativeindexes,NegativeIndexes

nesting,ATic-Tac-ToeBoard

notinoperator,TheinandnotinOperators

overview,TheListDataType

remove()method,AddingValuestoListswiththeappend()andinsert()Methods

removingvaluesfrom,RemovingValuesfromListswithdelStatements

replicationof,GettingaList’sLengthwithlen()

sort()method,RemovingValuesfromListswithremove()

storingvariablesas,RemovingValuesfromListswithdelStatements

usingwithforloops,UsingforLoopswithLists

ljust()method,Thejoin()andsplit()StringMethods

load_workbook()function,ReadingExcelDocuments

loads()function,JSONandAPIs,Step2:DownloadtheJSONData

localscope,LocalandGlobalScope

locateAllOnScreen()function,ImageRecognition

locateOnScreen()function,Project:ExtendingthemouseNowProgram

locationattribute,FindingElementsonthePage

logging

disabling,LoggingLevels

tofile,DisablingLogging

levelsof,UsingtheloggingModule

print()functionand,UsingtheloggingModule

loggingmodule,UsinganAssertioninaTrafficLightSimulation

loggingout,ofautomationprogram,ControllingtheKeyboardandMousewithGUIAutomation

login()method,ConnectingtoanSMTPServer,ConnectingtoanIMAPServer,Step3:SendCustomizedEmailReminders

logo,addingtoanimage,Project:AddingaLogo

loopingoverfiles,Step1:OpentheLogoImage

openinglogoimage,Project:AddingaLogo

overview,Step3:ResizetheImages

resizingimage,Step2:LoopOverAllFilesandOpenImages

logout()method,GettingtheBodyfromaRawMessage

LogRecordobjects,UsinganAssertioninaTrafficLightSimulation

loops

breakstatements,AnAnnoyingwhileLoop

continuestatements,continueStatements

forloop,continueStatements

range()functionfor,AnEquivalentwhileLoop

readingdatafromCSVfile,ReadingDatafromReaderObjectsinaforLoop

usinglistswith,UsingforLoopswithLists

whileloop,whileLoopStatements

lower()method,Theupper(),lower(),isupper(),andislower()StringMethods

lstrip()method,JustifyingTextwithrjust(),ljust(),andcenter()

M

%Mdirective,PausingUntilaSpecificDate

%mdirective,PausingUntilaSpecificDate

MacOSX(seeOSX)

Magic8Ballexampleprogram,SortingtheValuesinaListwiththesort()Method

makedirs()function,Absolutevs.RelativePaths,Step2:LoopOverAllFilesandOpenImages

maps,openwhenlocationiscopied,WebScraping

figuringoutURL,WebScraping

handlingclipboardcontent,Step3:HandletheClipboardContentandLaunchtheBrowser

handlingcommandlineargument,Step1:FigureOuttheURL

launchingbrowser,Step3:HandletheClipboardContentandLaunchtheBrowser

overview,WebScraping

Matchobjects,CreatingRegexObjects

math

operatorsfor,EnteringExpressionsintotheInteractiveShell

programmingand,WhatIsPython?

mergePage()method,OverlayingPages

Messageobjects,SendingTextMessages

methods

chainingcalls,RotatingandFlippingImages

defined,Methods

dictionary

get()method,CheckingWhetheraKeyorValueExistsinaDictionary

items()method,Dictionariesvs.Lists

keys()method,Dictionariesvs.Lists

setdefault()method,Thesetdefault()Method

values()method,Dictionariesvs.Lists

list

append()method,Methods

index()method,Methods

insert()method,Methods

remove()method,AddingValuestoListswiththeappend()andinsert()Methods

sort()method,RemovingValuesfromListswithremove()

string

center()method,Thejoin()andsplit()StringMethods

copy()method,RemovingWhitespacewithstrip(),rstrip(),andlstrip()

endswith()method,TheisXStringMethods

isalnum()method,Theupper(),lower(),isupper(),andislower()StringMethods

isalpha()method,Theupper(),lower(),isupper(),andislower()StringMethods

isdecimal()method,Theupper(),lower(),isupper(),andislower()StringMethods

islower()method,Theupper(),lower(),isupper(),andislower()StringMethods

isspace()method,TheisXStringMethods

istitle()method,TheisXStringMethods

isupper()method,Theupper(),lower(),isupper(),andislower()StringMethods

join()method,TheisXStringMethods

ljust()method,Thejoin()andsplit()StringMethods

lower()method,Theupper(),lower(),isupper(),andislower()StringMethods

lstrip()method,JustifyingTextwithrjust(),ljust(),andcenter()

paste()method,RemovingWhitespacewithstrip(),rstrip(),andlstrip()

rjust()method,Thejoin()andsplit()StringMethods

rstrip()method,JustifyingTextwithrjust(),ljust(),andcenter()

split()method,TheisXStringMethods

startswith()method,TheisXStringMethods

strip()method,JustifyingTextwithrjust(),ljust(),andcenter()

upper()method,Theupper(),lower(),isupper(),andislower()StringMethods

MicrosoftWindows(seeWindowsOS)

middleClick()function,ClickingtheMouse,ReviewofthePyAutoGUIFunctions

modules

importing,ImportingModules

third-party,installing,ThepipTool

modulus/remainder(%)operator,EnteringExpressionsintotheInteractiveShell,TheMultipleAssignmentTrick

MontyPython,WhatIsPython?

mouse

controlling,PausesandFail-Safes,Step3:GetandPrinttheMouseCoordinates

clickingmouse,ClickingtheMouse

draggingmouse,ClickingtheMouse

scrollingmouse,DraggingtheMouse

determiningpositionof,MovingtheMouse

locating,MovingtheMouse

gettingcoordinates,Step1:ImporttheModule

handlingKeyboardInterruptexception,Step1:ImporttheModule

importingpyautoguimodule,Step1:ImporttheModule

infiniteloop,Step1:ImporttheModule

overview,MovingtheMouse

andpixels,identifyingcolorsof,AnalyzingtheScreenshot

mouseDown()function,ClickingtheMouse,ReviewofthePyAutoGUIFunctions

mouse.position()function,Step1:ImporttheModule

mouseUp()function,ReviewofthePyAutoGUIFunctions

move()function,CopyingFilesandFolders

moveRel()function,ControllingMouseMovement,MovingtheMouse,ClickingtheMouse,ReviewofthePyAutoGUIFunctions

moveTo()function,ControllingMouseMovement,ClickingtheMouse,ReviewofthePyAutoGUIFunctions

movingfiles/folders,CopyingFilesandFolders

multiclipboardproject,Step4:WriteContenttotheQuizandAnswerKeyFiles

listingkeywords,Step2:SaveClipboardContentwithaKeyword

loadingkeywordcontent,Step2:SaveClipboardContentwithaKeyword

overview,Step4:WriteContenttotheQuizandAnswerKeyFiles

savingclipboardcontent,Step1:CommentsandShelfSetup

settingupshelffile,Step1:CommentsandShelfSetup

multilinecomments,MultilineStringswithTripleQuotes

multilinestrings,EscapeCharacters

multipleassignmenttrick,TheinandnotinOperators

multiplication(*)operator,EnteringExpressionsintotheInteractiveShell,GettingaList’sLengthwithlen(),TheMultipleAssignmentTrick

multithreading

concurrencyissues,PassingArgumentstotheThread’sTargetFunction

downloadingmultipleimages,Project:MultithreadedXKCDDownloader

creatingandstartingthreads,Step1:ModifytheProgramtoUseaFunction

usingdownloadXkcd()function,Project:MultithreadedXKCDDownloader

waitingforthreadstoend,Step2:CreateandStartThreads

join()method,Step2:CreateandStartThreads

overview,Multithreading

passingargumentstothreads,Multithreading

start()method,Multithreading,PassingArgumentstotheThread’sTargetFunction

Thread()function,Multithreading

mutabledatatypes,List-likeTypes:StringsandTuples

N

NameError,RemovingValuesfromListswithdelStatements

namelist()method,CompressingFileswiththezipfileModule

negativecharacterclasses,CharacterClasses

negativeindexes,NegativeIndexes

nestedlistsanddictionaries,ATic-Tac-ToeBoard

newlinekeywordargument,ReadingDatafromReaderObjectsinaforLoop

Nonevalue,ReturnValuesandreturnStatements

nongreedymatching

dot,star,andquestionmarkfor,TheWildcardCharacter

inregularexpressions,GreedyandNongreedyMatching

notequalto(!=)operator,BooleanValues

notinoperator

usingwithdictionaries,CheckingWhetheraKeyorValueExistsinaDictionary

usingwithlists,TheinandnotinOperators

usingwithstrings,IndexingandSlicingStrings

notoperator,BinaryBooleanOperators

NOTsearchkey,PerformingtheSearch

now()function,ThedatetimeModule,ReviewofPython’sTimeFunctions

O

ONsearchkey,SelectingaFolder

open()function,TheFileReading/WritingProcess,WebScraping,TaskScheduler,launchd,andcron,ManipulatingImageswithPillow

openingfiles,TheFileReading/WritingProcess

OpenOffice,WorkingwithExcelSpreadsheets,Step4:SavetheResults

openprogram,TaskScheduler,launchd,andcron

openpyxlmodule,installing,WorkingwithExcelSpreadsheets

operators

augmentedassignment,TheMultipleAssignmentTrick

binary,ComparisonOperators

comparison,BooleanValues

defined,EnteringExpressionsintotheInteractiveShell

math,EnteringExpressionsintotheInteractiveShell

usingbinaryandcomparisonoperatorstogether,BinaryBooleanOperators

orderofoperations,EnteringExpressionsintotheInteractiveShell

oroperator,BinaryBooleanOperators

ORsearchkey,PerformingtheSearch

OSX

backslashvs.forwardslash,FilesandFilePaths

installingPython,DownloadingandInstallingPython

installingthird-partymodules,ThepipTool

launchd,LaunchingOtherProgramsfromPython

launchingprocessesfromPython,OpeningFileswithDefaultApplications

loggingoutofautomationprogram,ControllingtheKeyboardandMousewithGUIAutomation

openingfileswithdefaultapplications,TaskScheduler,launchd,andcron

piptoolon,InstallingThird-PartyModules

Pythonsupport,WhatIsPython?

runningPythonprogramson,RunningPythonProgramsonOSXandLinux

startingIDLE,StartingIDLE

Unixphilosophy,OpeningFileswithDefaultApplications

outlineattribute,RunAttributes

Outlook.com,ConnectingtoanSMTPServer,RetrievingandDeletingEmailswithIMAP

P

%pdirective,PausingUntilaSpecificDate

pagebreaks,Worddocument,AddingHeadings

Pageobjects,ExtractingTextfromPDFs

Paragraphobjects,WordDocuments

paragraphs,Worddocument,GettingtheFullTextfroma.docxFile

parameters,function,defStatementswithParameters

parentheses(),MutableandImmutableDataTypes,ReviewofRegularExpressionMatching

parsing,defined,ParsingHTMLwiththeBeautifulSoupModule

passingarguments,Comments

passingreferences,PassingReferences

passwords

application-specific,LoggingintotheSMTPServer

managingproject,CopyingandPastingStringswiththepyperclipModule

command-linearguments,Step1:ProgramDesignandDataStructures

copyingpassword,Step1:ProgramDesignandDataStructures

datastructures,CopyingandPastingStringswiththepyperclipModule

overview,CopyingandPastingStringswiththepyperclipModule

pastebin.com,Summary

paste()method,RemovingWhitespacewithstrip(),rstrip(),andlstrip(),CopyingandPastingImagesontoOtherImages,CopyingandPastingImagesontoOtherImages

paths

absolutevs.relative,TheCurrentWorkingDirectory

backslashvs.forwardslash,FilesandFilePaths

currentworkingdirectory,TheCurrentWorkingDirectory

overview,ReadingandWritingFiles

os.pathmodule

absolutepathsin,Theos.pathModule

filesizes,HandlingAbsoluteandRelativePaths

foldercontents,HandlingAbsoluteandRelativePaths

overview,Theos.pathModule

pathvalidity,FindingFileSizesandFolderContents

relativepathsin,Theos.pathModule

PAUSEvariable,PausesandFail-Safes,Step2:SetUpCoordinates

PdfFileReaderobjects,ExtractingTextfromPDFs

PDFfiles

combiningpagesfrommultiplefiles,EncryptingPDFs

addingpages,EncryptingPDFs

findingPDFfiles,Step1:FindAllPDFFiles

openingPDFs,Step1:FindAllPDFFiles

overview,EncryptingPDFs

savingresults,Step2:OpenEachPDF

creating,DecryptingPDFs

decrypting,ExtractingTextfromPDFs

encrypting,OverlayingPages

extractingtextfrom,PDFDocuments

formatoverview,WorkingwithPDFandWordDocuments

pagesin

copying,CreatingPDFs

overlaying,OverlayingPages

rotating,CopyingPages

PdfFileWriterobjects,DecryptingPDFs

pformat()function

overview,Thesetdefault()Method

savingvariablesintextfilesusing,SavingVariableswiththeshelveModule

phonenumbers,extracting,Combiningre.IGNORECASE,re.DOTALL,andre.VERBOSE

creatingregex,Project:PhoneNumberandEmailAddressExtractor

findingmatchesonclipboard,Step2:CreateaRegexforEmailAddresses

joiningmatchesintoastring,Step3:FindAllMatchesintheClipboardText

overview,Combiningre.IGNORECASE,re.DOTALL,andre.VERBOSE

Pillow

copyingandpastinginimages,CopyingandPastingImagesontoOtherImages

croppingimages,WorkingwiththeImageDataType

drawingonimages

ellipses,Lines

exampleprogram,Lines

ImageDrawmodule,IdeasforSimilarPrograms

lines,IdeasforSimilarPrograms

points,IdeasforSimilarPrograms

polygons,Lines

rectangles,Lines

text,DrawingExample

flippingimages,RotatingandFlippingImages

imageattributes,WorkingwiththeImageDataType

module,ComputerImageFundamentals

openingimages,CoordinatesandBoxTuples

pixelmanipulation,ChangingIndividualPixels

resizingimages,CopyingandPastingImagesontoOtherImages

rotatingimages,RotatingandFlippingImages

transparentpixels,CopyingandPastingImagesontoOtherImages

pipecharacter(|),GroupingwithParentheses,ManagingComplexRegexes

piptool,InstallingThird-PartyModules

pixelMatchesColor()function,AnalyzingtheScreenshot,Step3:StartTypingData

pixels,ComputerImageFundamentals,ChangingIndividualPixels

plaintextfiles,FindingFileSizesandFolderContents

plussign(+),OptionalMatchingwiththeQuestionMark,MatchingNewlineswiththeDotCharacter

PNGformat,WorkingwiththeImageDataType

point()method,IdeasforSimilarPrograms

poll()method,LaunchingOtherProgramsfromPython

polygon()method,Lines

Popen()function,Step2:CreateandStartThreads

openingfileswithdefaultapplications,TaskScheduler,launchd,andcron

passingcommandlineargumentsto,LaunchingOtherProgramsfromPython

position()function,MovingtheMouse,Step3:GetandPrinttheMouseCoordinates

pprint()function,Thesetdefault()Method

precedenceofmathoperators,EnteringExpressionsintotheInteractiveShell

press()function,PressingandReleasingtheKeyboard,ReviewofthePyAutoGUIFunctions,Step4:HandleSelectListsandRadioButtons

print()function,Step3:StartTypingData

loggingand,UsingtheloggingModule

overview,Comments

passingmultipleargumentsto,KeywordArgumentsandprint()

usingvariableswith,Theinput()Function

processes

andcountdownproject,Project:SimpleCountdownProgram

defined,Step2:CreateandStartThreads

openingfileswithdefaultapplications,TaskScheduler,launchd,andcron

openingwebsites,TaskScheduler,launchd,andcron

passingcommandlineargumentsto,LaunchingOtherProgramsfromPython

poll()method,LaunchingOtherProgramsfromPython

Popen()function,Step2:CreateandStartThreads

wait()method,LaunchingOtherProgramsfromPython

profilingcode,ThetimeModule

programming

blocksofcode,MixingBooleanandComparisonOperators

comments,Comments

creativityneededfor,ProgrammingIsaCreativeActivity

deduplicatingcode,Functions

defined,Conventions

exceptionhandling,TheglobalStatement

execution,program,BlocksofCode

functionsas“blackboxes”,TheglobalStatement

globalscope,LocalandGlobalVariableswiththeSameName

indentation,ExampleProgram:Magic8BallwithaList

localscope,LocalandGlobalScope

mathand,WhatIsPython?

Python,WhatIsPython?

terminatingprogramwithsys.exit(),ImportingModules

projects

AddingBulletstoWikiMarkup,Project:AddingBulletstoWikiMarkup

AddingaLogo,Project:AddingaLogo

AutomaticFormFiller,ReviewofthePyAutoGUIFunctions

BackingUpaFolderintoaZIPFile,Step3:FormtheNewFilenameandRenametheFiles

CombiningSelectPagesfromManyPDFs,EncryptingPDFs

DownloadingAllXKCDComics,Step3:OpenWebBrowsersforEachResult

ExtendingthemouseNowProgram,AnalyzingtheScreenshot

FetchingCurrentWeatherData,ReadingJSONwiththeloads()Function

GeneratingRandomQuizFiles,SavingVariableswiththepprint.pformat()Function

“I’mFeelingLucky”GoogleSearch,GettingDatafromanElement’sAttributes

“JustTextMe”Module,Project:“JustTextMe”Module

mapIt.pywiththewebbrowserModule,WebScraping

Multiclipboard,Step4:WriteContenttotheQuizandAnswerKeyFiles

MultithreadedXKCDDownloader,Project:MultithreadedXKCDDownloader

PasswordLocker,CopyingandPastingStringswiththepyperclipModule

PhoneNumberandEmailAddressExtractor,Combiningre.IGNORECASE,re.DOTALL,andre.VERBOSE

ReadingDatafromaSpreadsheet,GettingRowsandColumnsfromtheSheets

RemovingtheHeaderfromCSVFiles,ThedelimiterandlineterminatorKeywordArguments

RenamingFileswithAmerican-StyleDatestoEuropean-StyleDates,CreatingandAddingtoZIPFiles

SendingMemberDuesReminderEmails,DisconnectingfromtheIMAPServer

SimpleCountdownProgram,Project:SimpleCountdownProgram

SuperStopwatch,Thetime.sleep()Function

UpdatingaSpreadsheet,WritingValuestoCells

“WhereIstheMouseRightNow?”,MovingtheMouse

putpixel()method,ChangingIndividualPixels

pyautogui.click()function,Project:AutomaticFormFiller

pyautogui.click()method,ClickingtheMouse

pyautogui.doubleClick()function,ClickingtheMouse

pyautogui.dragTo()function,ClickingtheMouse

pyautogui.FailSafeExceptionexception,PausesandFail-Safes

pyautogui.hotkey()function,PressingandReleasingtheKeyboard

pyautogui.keyDown()function,KeyNames

pyautogui.keyUp()function,KeyNames

pyautogui.middleClick()function,ClickingtheMouse

pyautoguimodule

formfillerproject,ReviewofthePyAutoGUIFunctions

controllingkeyboard,ImageRecognition

hotkeycombinations,PressingandReleasingtheKeyboard

keynames,SendingaStringfromtheKeyboard

pressingandreleasingkeys,KeyNames

sendingstringfromkeyboard,ImageRecognition

controllingmouse,PausesandFail-Safes,Step3:GetandPrinttheMouseCoordinates

clickingmouse,ClickingtheMouse

draggingmouse,ClickingtheMouse

scrollingmouse,DraggingtheMouse

documentationfor,ControllingtheKeyboardandMousewithGUIAutomation

fail-safefeature,PausesandFail-Safes

functions,ReviewofthePyAutoGUIFunctions

imagerecognition,Project:ExtendingthemouseNowProgram

importing,MovingtheMouse

installing,ControllingtheKeyboardandMousewithGUIAutomation

pausingfunctioncalls,PausesandFail-Safes

screenshots,ScrollingtheMouse

pyautogui.mouseDown()function,ClickingtheMouse

pyautogui.moveRel()function,ControllingMouseMovement,MovingtheMouse

pyautogui.moveTo()function,ControllingMouseMovement

pyautogui.PAUSEvariable,PausesandFail-Safes

pyautogui.position()function,Step3:GetandPrinttheMouseCoordinates

pyautogui.press()function,Step4:HandleSelectListsandRadioButtons

pyautogui.rightClick()function,ClickingtheMouse

pyautogui.screenshot()function,ScrollingtheMouse

pyautogui.size()function,ControllingMouseMovement

pyautogui.typewrite()function,ImageRecognition,SendingaStringfromtheKeyboard,Project:AutomaticFormFiller

py.exeprogram,ShebangLine

pyobjcmodule,ThepipTool

PyPDF2module

combiningpagesfrommultiplePDFs,EncryptingPDFs

creatingPDFs,DecryptingPDFs

decryptingPDFs,ExtractingTextfromPDFs

encryptingPDFs,OverlayingPages

extractingtextfromPDFs,PDFDocuments

formatoverview,WorkingwithPDFandWordDocuments

pagesinPDFs

copying,CreatingPDFs

overlaying,OverlayingPages

rotating,CopyingPages

pyperclipmodule,RemovingWhitespacewithstrip(),rstrip(),andlstrip()

Python

datatypes,EnteringExpressionsintotheInteractiveShell

downloading,AboutThisBook

exampleprogram,VariableNames

help,StartingIDLE

installing,AboutThisBook

interactiveshell,StartingIDLE

interpreter,defined,DownloadingandInstallingPython

mathand,WhatIsPython?

overview,WhatIsPython?

programmingoverview,Conventions

startingIDLE,DownloadingandInstallingPython

python-docxmodule,Step4:SavetheResults

pyzmailmodule,DisconnectingfromtheSMTPServer,FetchinganEmailandMarkingItAsRead

PyzMessageobjects,FetchinganEmailandMarkingItAsRead

Q

questionmark(?),MatchingMultipleGroupswiththePipe,MatchingNewlineswiththeDotCharacter

quit()method,SendingSpecialKeys,DisconnectingfromtheSMTPServer,Step3:SendCustomizedEmailReminders

quizgenerator,SavingVariableswiththepprint.pformat()Function

creatingquizfile,Step2:CreatetheQuizFileandShuffletheQuestionOrder

creatingansweroptions,Step3:CreatetheAnswerOptions

overview,SavingVariableswiththepprint.pformat()Function

shufflingquestionorder,Step2:CreatetheQuizFileandShuffletheQuestionOrder

storingquizdataindictionary,Step1:StoretheQuizDatainaDictionary

writingcontenttofiles,Step3:CreatetheAnswerOptions

R

radio buttons, Step 3: Start Typing Data

raise_for_status()method,DownloadingaWebPagewiththerequests.get()Function

raisekeyword,Debugging

range()function,AnEquivalentwhileLoop

rawstrings,EscapeCharacters,CreatingRegexObjects

Readerobjects,ReaderObjects

readingfiles,OpeningFileswiththeopen()Function,CompressingFileswiththezipfileModule

readlines()method,OpeningFileswiththeopen()Function

read()method,OpeningFileswiththeopen()Function

rectangle()method,Lines

Reddit,HowtoFindHelp

Referenceobjects,Charts

references

overview,TheTupleDataType

passing,PassingReferences

refresh()method,SendingSpecialKeys

Regexobjects

creating,FindingPatternsofTextWithoutRegularExpressions

matching,CreatingRegexObjects

regularexpressions

beginningofstringmatches,CharacterClasses

casesensitivity,Case-InsensitiveMatching

characterclasses,Thefindall()Method

creatingRegexobjects,FindingPatternsofTextWithoutRegularExpressions

defined,PatternMatchingwithRegularExpressions

endofstringmatches,CharacterClasses

extracting phone numbers and email addresses, Combining re.IGNORECASE, re.DOTALL, and re.VERBOSE

findall()method,GreedyandNongreedyMatching

findingtextwithout,PatternMatchingwithRegularExpressions

greedymatching,MatchingOneorMorewiththePlus

grouping

matchingspecificrepetitions,MatchingOneorMorewiththePlus

oneormorematches,OptionalMatchingwiththeQuestionMark

optionalmatching,MatchingMultipleGroupswiththePipe

usingparentheses,ReviewofRegularExpressionMatching

usingpipecharacterin,GroupingwithParentheses

zeroormorematches,OptionalMatchingwiththeQuestionMark

HTMLand,OpeningYourBrowser’sDeveloperTools

matchingwith,CreatingRegexObjects

multipleargumentsforcompile()function,ManagingComplexRegexes

nongreedymatching,GreedyandNongreedyMatching

patternsfor,FindingPatternsofTextWithoutRegularExpressions

spreadingovermultiplelines,ManagingComplexRegexes

substitutingstringsusing,Case-InsensitiveMatching

symbolreference,MatchingNewlineswiththeDotCharacter

wildcardcharacter,TheCaretandDollarSignCharacters

relativepaths,TheCurrentWorkingDirectory

relpath()function,Theos.pathModule,HandlingAbsoluteandRelativePaths

remainder/modulus(%)operator,EnteringExpressionsintotheInteractiveShell,TheMultipleAssignmentTrick

remove()method,AddingValuestoListswiththeappend()andinsert()Methods

remove_sheet()method,CreatingandRemovingSheets

renamingfiles/folders,CopyingFilesandFolders

datestyles,CreatingandAddingtoZIPFiles

creatingregexfordates,CreatingandAddingtoZIPFiles

identifyingdatesinfilenames,Step1:CreateaRegexforAmerican-StyleDates

overview,CreatingandAddingtoZIPFiles

renamingfiles,Step3:FormtheNewFilenameandRenametheFiles

replication

oflists,GettingaList’sLengthwithlen()

string,StringConcatenationandReplication

requestsmodule

downloadingfiles,SavingDownloadedFilestotheHardDrive

downloadingpages,DownloadingFilesfromtheWebwiththerequestsModule

resolutionofcomputerscreen,ControllingMouseMovement

Responseobjects,DownloadingFilesfromtheWebwiththerequestsModule

returnvalues,function,defStatementswithParameters

reversekeyword,RemovingValuesfromListswithremove()

RGBAvalues,ComputerImageFundamentals

RGBcolormodel,ColorsandRGBAValues

rightClick()function,ClickingtheMouse,ReviewofthePyAutoGUIFunctions

rjust()method,Thejoin()andsplit()StringMethods,Step3:GetandPrinttheMouseCoordinates

rmdir()function,MovingandRenamingFilesandFolders

rmtree()function,MovingandRenamingFilesandFolders

rotateClockwise()method,CopyingPages

rotateCounterClockwise()method,CopyingPages

rotatingimages,RotatingandFlippingImages

roundingnumbers,Thetime.sleep()Function

rows,inExcelspreadsheets

settingheightandwidthof,Formulas

slicingWorksheetobjectstogetCellobjectsin,ConvertingBetweenColumnLettersandNumbers

rstrip()method,JustifyingTextwithrjust(),ljust(),andcenter()

rtlattribute,RunAttributes

Runobjects,StylingParagraphandRunObjects,RunAttributes

runningprograms

onLinux,RunningPythonProgramsonOSXandLinux

onOSX,RunningPythonProgramsonOSXandLinux

overview,RunningPrograms

onWindows,ShebangLine

shebangline,RunningPrograms

S

\S character class, The findall() Method

\scharacterclass,Thefindall()Method

%Sdirective,PausingUntilaSpecificDate

Safari,developertoolsin,OpeningYourBrowser’sDeveloperTools

save()method,ManipulatingImageswithPillow

scope

global,LocalandGlobalVariableswiththeSameName

local,LocalandGlobalScope

screenshot()function,ScrollingtheMouse,ReviewofthePyAutoGUIFunctions

screenshots

analyzing,AnalyzingtheScreenshot

getting,ScrollingtheMouse

scripts

runningfromPythonprogram,TaskScheduler,launchd,andcron

runningoutsideofIDLE,CopyingandPastingStringswiththepyperclipModule

scroll()function,DraggingtheMouse,ScrollingtheMouse,ReviewofthePyAutoGUIFunctions

scrollingmouse,DraggingtheMouse

searching

email,ConnectingtoanIMAPServer

theWeb,GettingDatafromanElement’sAttributes

findingresults,Step1:GettheCommandLineArgumentsandRequesttheSearchPage

gettingcommandlinearguments,Step1:GettheCommandLineArgumentsandRequesttheSearchPage

openingwebbrowserforresults,Step2:FindAlltheResults

overview,GettingDatafromanElement’sAttributes

requestingsearchpage,Step1:GettheCommandLineArgumentsandRequesttheSearchPage

search()method,CreatingRegexObjects

SEENsearchkey,PerformingtheSearch

seeprogram,TaskScheduler,launchd,andcron

select_folder()method,SelectingaFolder

selectlists,Step3:StartTypingData

select()method,bs4module,CreatingaBeautifulSoupObjectfromHTML

selectors,CSS,CreatingaBeautifulSoupObjectfromHTML,FindingElementsonthePage

seleniummodule

clickingbuttons,SendingSpecialKeys

findingelements,StartingaSelenium-ControlledBrowser

followinglinks,FindingElementsonthePage

installing,Step4:SavetheImageandFindthePreviousComic

sendingspecialkeystrokes,FillingOutandSubmittingForms

submittingforms,FindingElementsonthePage

usingFirefoxwith,Step4:SavetheImageandFindthePreviousComic

send2trashmodule,PermanentlyDeletingFilesandFolders

sendingreminderemails,DisconnectingfromtheIMAPServer

findingunpaidmembers,Step2:FindAllUnpaidMembers

openingExcelfile,DisconnectingfromtheIMAPServer

overview,DisconnectingfromtheIMAPServer

sendingemails,Step2:FindAllUnpaidMembers

send_keys()method,FindingElementsonthePage

sendmail()method,LoggingintotheSMTPServer,Step3:SendCustomizedEmailReminders

sequencenumbers,FetchinganEmailandMarkingItAsRead

sequences,UsingforLoopswithLists

setdefault()method,Thesetdefault()Method

shadowattribute,RunAttributes

shebangline,RunningPrograms

shelvemodule,WritingtoFiles

ShortMessageService(SMS)

sendingmessages,SendingTextMessages

Twilioservice,SendingTextMessageswithTwilio

shutilmodule

deletingfiles/folders,MovingandRenamingFilesandFolders

movingfiles/folders,CopyingFilesandFolders

renamingfiles/folders,CopyingFilesandFolders

SID(stringID),SendingTextMessages

SimpleMailTransferProtocol(seeSMTP(SimpleMailTransferProtocol))

SINCEsearchkey,SelectingaFolder

singlequote(‘),StringLiterals

single-threadedprograms,Multithreading

size()function,ControllingMouseMovement

sleep()function,Thetime.time()Function,PausingUntilaSpecificDate,ReviewofPython’sTimeFunctions,TaskScheduler,launchd,andcron

slices

gettingsublistswith,NegativeIndexes

forstrings,MultilineStringswithTripleQuotes

small_capsattribute,RunAttributes

SMALLERsearchkey,PerformingtheSearch

SMS(ShortMessageService)

sendingmessages,SendingTextMessages

Twilioservice,SendingTextMessageswithTwilio

SMTP(SimpleMailTransferProtocol)

connectingtoserver,ConnectingtoanSMTPServer

defined,SMTP

disconnectingfromserver,DisconnectingfromtheSMTPServer

loggingintoserver,ConnectingtoanSMTPServer

sending“hello”message,ConnectingtoanSMTPServer

sendingmessage,LoggingintotheSMTPServer

TLSencryption,ConnectingtoanSMTPServer

SMTPobjects,ConnectingtoanSMTPServer

sort()method,RemovingValuesfromListswithremove()

soundfiles,playing,Project:SimpleCountdownProgram

sourcecode,defined,Conventions

split()method,TheisXStringMethods,HandlingAbsoluteandRelativePaths,WorkingwithCSVFilesandJSONData

spreadsheets(seeExcelspreadsheets)

squarebrackets[],TheListDataType

StackOverflow,HowtoFindHelp

standardlibrary,ImportingModules

star(*),TheWildcardCharacter,MatchingNewlineswiththeDotCharacter

usingwithwildcardcharacter,TheWildcardCharacter

zeroormorematcheswith,OptionalMatchingwiththeQuestionMark

start()method,Multithreading,PassingArgumentstotheThread’sTargetFunction,Step1:ModifytheProgramtoUseaFunction

startprogram,TaskScheduler,launchd,andcron

startswith()method,TheisXStringMethods

starttls()method,ConnectingtoanSMTPServer,Step3:SendCustomizedEmailReminders

stepargument,AnEquivalentwhileLoop

stopwatchproject,Thetime.sleep()Function

overview,Thetime.sleep()Function

setup,Project:SuperStopwatch

trackinglaptimes,Project:SuperStopwatch

strftime()function,PausingUntilaSpecificDate,ReviewofPython’sTimeFunctions

str() function, The len() Function, The Tuple Data Type, Step 3: Get and Print the Mouse Coordinates

strikeattribute,RunAttributes

stringID(SID),SendingTextMessages

strings

center()method,Thejoin()andsplit()StringMethods

concatenation,TheInteger,Floating-Point,andStringDataTypes

convertingdatetimeobjectsto,PausingUntilaSpecificDate

convertingtodatetimeobjects,ConvertingdatetimeObjectsintoStrings

copyingandpasting,RemovingWhitespacewithstrip(),rstrip(),andlstrip()

doublequotesfor,StringLiterals

endswith()method,TheisXStringMethods

escapecharacters,StringLiterals

extractingPDFtextas,PDFDocuments

gettingtracebackas,RaisingExceptions

indexesfor,MultilineStringswithTripleQuotes

inoperator,IndexingandSlicingStrings

isalnum()method,Theupper(),lower(),isupper(),andislower()StringMethods

isalpha()method,Theupper(),lower(),isupper(),andislower()StringMethods

isdecimal()method,Theupper(),lower(),isupper(),andislower()StringMethods

islower()method,Theupper(),lower(),isupper(),andislower()StringMethods

isspace()method,TheisXStringMethods

istitle()method,TheisXStringMethods

isupper()method,Theupper(),lower(),isupper(),andislower()StringMethods

join()method,TheisXStringMethods

literals,StringLiterals

ljust()method,Thejoin()andsplit()StringMethods

lower()method,Theupper(),lower(),isupper(),andislower()StringMethods

lstrip()method,JustifyingTextwithrjust(),ljust(),andcenter()

multiline,EscapeCharacters

mutablevs.immutabledatatypes,List-likeTypes:StringsandTuples

notinoperator,IndexingandSlicingStrings

overview,TheInteger,Floating-Point,andStringDataTypes

raw,EscapeCharacters

replicationof,StringConcatenationandReplication

rjust()method,Thejoin()andsplit()StringMethods

rstrip()method,JustifyingTextwithrjust(),ljust(),andcenter()

slicing,MultilineStringswithTripleQuotes

split()method,TheisXStringMethods

startswith()method,TheisXStringMethods

strip()method,JustifyingTextwithrjust(),ljust(),andcenter()

substitutingusingregularexpressions,Case-InsensitiveMatching

upper()method,Theupper(),lower(),isupper(),andislower()StringMethods

strip()method,JustifyingTextwithrjust(),ljust(),andcenter()

strptime()function,ConvertingdatetimeObjectsintoStrings,ReviewofPython’sTimeFunctions

strs,TheInteger,Floating-Point,andStringDataTypes

(seealsostrings)

Styleobjects,SettingtheFontStyleofCells

SUBJECTsearchkey,SelectingaFolder

sublists,gettingwithslices,NegativeIndexes

sub()method,Case-InsensitiveMatching

submitButtonColorvariable,Step1:FigureOuttheSteps,Step3:StartTypingData

submitButtonvariable,Step1:FigureOuttheSteps

submit()method,FillingOutandSubmittingForms

subprocessmodule,KeepingTime,SchedulingTasks,andLaunchingPrograms,Step2:CreateandStartThreads

subtraction(-)operator,EnteringExpressionsintotheInteractiveShell,TheMultipleAssignmentTrick

subtractivecolormodel,ColorsandRGBAValues

Sudokupuzzles,WhatIsPython?

sys.exit()function,ImportingModules

T

tag_name attribute, Finding Elements on the Page

Tagobjects,CreatingaBeautifulSoupObjectfromHTML

tags,HTML,SavingDownloadedFilestotheHardDrive

TaskScheduler,LaunchingOtherProgramsfromPython

termination,program,YourFirstProgram,ImportingModules

textattribute,ReadingWordDocuments,RunAttributes

textmessaging

automaticnotifications,Project:“JustTextMe”Module

sendingmessages,SendingTextMessages

Twilioservice,SendingTextMessageswithTwilio

text()method,DrawingExample

TEXTsearchkey,SelectingaFolder

textsize()method,DrawingText

third-partymodules,installing,InstallingThird-PartyModules

Thread()function,Multithreading,Step1:ModifytheProgramtoUseaFunction

threadingmodule,KeepingTime,SchedulingTasks,andLaunchingPrograms,Multithreading

Threadobjects,Multithreading

threads

concurrencyissues,PassingArgumentstotheThread’sTargetFunction

join()method,Step2:CreateandStartThreads

multithreading,Multithreading

imagedownloader,Project:MultithreadedXKCDDownloader

passingargumentsto,Multithreading

processesvs.,Step2:CreateandStartThreads

tic-tac-toeboard,UsingDataStructurestoModelReal-WorldThings

timedeltadatatype,ThedatetimeModule,ReviewofPython’sTimeFunctions

timedeltaobjects,ThedatetimeModule

timemodule

overview,ReviewofPython’sTimeFunctions

sleep()function,Thetime.time()Function,PausingUntilaSpecificDate

stopwatchproject,Thetime.sleep()Function

time()function,ThetimeModule

TLSencryption,ConnectingtoanSMTPServer

top-leveldomains,Step2:CreateaRegexforEmailAddresses

TOsearchkey,PerformingtheSearch

total_seconds()method,ThedatetimeModule,ReviewofPython’sTimeFunctions

traceback,gettingfromerror,RaisingExceptions

transparency,ComputerImageFundamentals,CopyingandPastingImagesontoOtherImages

transpose()method,RotatingandFlippingImages

triple quotes ('''), Escape Characters, Managing Complex Regexes

truetype()function,DrawingText

truthtables,ComparisonOperators

“truthy”values,continueStatements

tupledatatype

overview,MutableandImmutableDataTypes

tuple()function,TheTupleDataType

twiliomodule,SendingTextMessageswithTwilio

TwilioRestClientobjects,SendingTextMessages

Twilioservice

automatictextmessages,Project:“JustTextMe”Module

overview,SendingTextMessageswithTwilio

sendingtextmessages,SendingTextMessages

TypeError,GettingIndividualValuesinaListwithIndexes,List-likeTypes:StringsandTuples

typewrite()function,ImageRecognition,SendingaStringfromtheKeyboard,ReviewofthePyAutoGUIFunctions,Project:AutomaticFormFiller,Step3:StartTypingData,Step4:HandleSelectListsandRadioButtons

U

Ubuntu, Downloading and Installing Python

cron,LaunchingOtherProgramsfromPython

launchingprocessesfromPython,LaunchingOtherProgramsfromPython

openingfileswithdefaultapplications,TaskScheduler,launchd,andcron

UNANSWEREDsearchkey,PerformingtheSearch

UNDELETEDsearchkey,PerformingtheSearch

underlineattribute,RunAttributes

underscore(_),VariableNames

UNDRAFTsearchkey,PerformingtheSearch

UNFLAGGEDsearchkey,PerformingtheSearch

Unicodeencodings,SavingDownloadedFilestotheHardDrive

Unixepoch,ThetimeModule,ThedatetimeModule,ReviewofPython’sTimeFunctions

Unixphilosophy,OpeningFileswithDefaultApplications

unlink()function,MovingandRenamingFilesandFolders

UNSEENsearchkey,PerformingtheSearch

upper()method,Theupper(),lower(),isupper(),andislower()StringMethods

UTC(CoordinatedUniversalTime),ThetimeModule

V

ValueError, The Multiple Assignment Trick, Converting datetime Objects into Strings

values,defined,EnteringExpressionsintotheInteractiveShell,FindingPatternsofTextWithoutRegularExpressions

values()method,Dictionariesvs.Lists

variables,Methods

(seealsolists)

assignmentstatements,StringConcatenationandReplication

defined,StringConcatenationandReplication

global,LocalandGlobalVariableswiththeSameName

initializing,AssignmentStatements

local,LocalandGlobalScope

naming,VariableNames

Nonevalueand,ReturnValuesandreturnStatements

overwriting,AssignmentStatements

references,TheTupleDataType

savingwithshelvemodule,WritingtoFiles

storingaslist,RemovingValuesfromListswithdelStatements

Verizonmail,ConnectingtoanSMTPServer,RetrievingandDeletingEmailswithIMAP

volumes,defined,FilesandFilePaths

W

\W character class, The findall() Method

\wcharacterclass,Thefindall()Method

%wdirective,PausingUntilaSpecificDate

walk()function,SafeDeleteswiththesend2trashModule,LaunchingOtherProgramsfromPython

WARNINGlevel,UsingtheloggingModule

weatherdata,fetching,ReadingJSONwiththeloads()Function

downloadingJSONdata,Step1:GetLocationfromtheCommandLineArgument

gettinglocation,Step1:GetLocationfromtheCommandLineArgument

loadingJSONdata,Step2:DownloadtheJSONData

overview,ReadingJSONwiththeloads()Function

webbrowsermodule

open()function,TaskScheduler,launchd,andcron

openingbrowserusing,WebScraping

WebDriverobjects,StartingaSelenium-ControlledBrowser

WebElementobjects,StartingaSelenium-ControlledBrowser

webscraping

bs4module

creatingobjectfromHTML,ParsingHTMLwiththeBeautifulSoupModule

findingelementwithselect()method,CreatingaBeautifulSoupObjectfromHTML

gettingattribute,GettingDatafromanElement’sAttributes

overview,ParsingHTMLwiththeBeautifulSoupModule

downloading

files,SavingDownloadedFilestotheHardDrive

images,Step3:OpenWebBrowsersforEachResult

pages,DownloadingFilesfromtheWebwiththerequestsModule

andGooglemapsproject,WebScraping

andGooglesearchproject,GettingDatafromanElement’sAttributes

HTML

browserdevelopertoolsand,ViewingtheSourceHTMLofaWebPage

findingelements,UsingtheDeveloperToolstoFindHTMLElements

learningresources,SavingDownloadedFilestotheHardDrive

overview,SavingDownloadedFilestotheHardDrive

viewingpagesource,AQuickRefresher

overview,WebScraping

requestsmodule,DownloadingFilesfromtheWebwiththerequestsModule

selenium module

clicking buttons, Sending Special Keys

findingelements,StartingaSelenium-ControlledBrowser

followinglinks,FindingElementsonthePage

installing,Step4:SavetheImageandFindthePreviousComic

sendingspecialkeystrokes,FillingOutandSubmittingForms

submittingforms,FindingElementsonthePage

usingFirefoxwith,Step4:SavetheImageandFindthePreviousComic

websites,openingfromscript,TaskScheduler,launchd,andcron

whileloops

gettingandprintingmousecoordinatesusing,Step1:ImporttheModule

infinite,Step1:ImporttheModule

overview,whileLoopStatements

whitespace,removing,JustifyingTextwithrjust(),ljust(),andcenter()

wildcardcharacter(.),TheCaretandDollarSignCharacters

WindowsOS

backslashvs.forwardslash,FilesandFilePaths

installingPython,AboutThisBook

installingthird-partymodules,ThepipTool

launchingprocessesfromPython,LaunchingOtherProgramsfromPython

loggingoutofautomationprogram,ControllingtheKeyboardandMousewithGUIAutomation

openingfileswithdefaultapplications,TaskScheduler,launchd,andcron

piptoolon,InstallingThird-PartyModules

Pythonsupport,WhatIsPython?

runningPythonprogramson,ShebangLine

startingIDLE,DownloadingandInstallingPython

TaskScheduler,LaunchingOtherProgramsfromPython

Worddocuments

addingheadings,WritingWordDocuments

creatingdocumentswithnondefaultstyles,StylingParagraphandRunObjects

formatoverview,Step4:SavetheResults

gettingtextfrom,ReadingWordDocuments

line/pagebreaks,AddingHeadings

picturesin,AddingHeadings

python-docxmodule,Step4:SavetheResults

reading,WordDocuments

Runobjectattributes,RunAttributes

stylingparagraphs,GettingtheFullTextfroma.docxFile

writingtofile,RunAttributes

Workbookobjects,ReadingExcelDocuments

workbooks,Excel,WorkingwithExcelSpreadsheets

creatingworksheets,CreatingandRemovingSheets

deletingworksheets,CreatingandRemovingSheets

opening,ReadingExcelDocuments

saving,IdeasforSimilarPrograms

Worksheetobjects,GettingSheetsfromtheWorkbook

write()method,ReadingtheContentsofFiles

Writerobjects,ReadingDatafromReaderObjectsinaforLoop

writerow()method,WriterObjects

X

XKCD comics

downloadingproject,Step3:OpenWebBrowsersforEachResult

designingprogram,Project:DownloadingAllXKCDComics

downloadingwebpage,Step1:DesigntheProgram

overview,Step3:OpenWebBrowsersforEachResult

savingimage,Step4:SavetheImageandFindthePreviousComic

multithreadeddownloadingproject,Project:MultithreadedXKCDDownloader

creatingandstartingthreads,Step1:ModifytheProgramtoUseaFunction

usingdownloadXkcd()function,Project:MultithreadedXKCDDownloader

waitingforthreadstoend,Step2:CreateandStartThreads

Y

%Y directive, Pausing Until a Specific Date

%ydirective,PausingUntilaSpecificDate

Yahoo!Mail,ConnectingtoanSMTPServer,RetrievingandDeletingEmailswithIMAP

Z

zipfile module

creatingZIPfiles,ExtractingfromZIPFiles

extractingZIPfiles,ExtractingfromZIPFiles

andfolders,Step3:FormtheNewFilenameandRenametheFiles

overview,WalkingaDirectoryTree

readingZIPfiles,CompressingFileswiththezipfileModule

ZipFileobjects,CompressingFileswiththezipfileModule

ZipInfoobjects,CompressingFileswiththezipfileModule

Automate the Boring Stuff with Python: Practical Programming for Total Beginners

Albert Sweigart

Copyright © 2015 AUTOMATE THE BORING STUFF WITH PYTHON.

All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher.

19 18 17 16 15   1 2 3 4 5 6 7 8 9

ISBN-10: 1-59327-599-4

ISBN-13: 978-1-59327-599-0

Publisher: William Pollock
Production Editor: Laurel Chun
Cover Illustration: Josh Ellingson
Interior Design: Octopod Studios
Developmental Editors: Jennifer Griffith-Delgado, Greg Poulos, and Leslie Shen
Technical Reviewer: Ari Lacenski
Copyeditor: Kim Wimpsett
Compositor: Susan Glinert Stevens
Proofreader: Lisa Devoto Farrell
Indexer: BIM Indexing and Proofreading Services

For information on distribution, translations, or bulk sales, please contact No Starch Press, Inc. directly:

Library of Congress Control Number: 2014953114

No Starch Press and the No Starch Press logo are registered trademarks of No Starch Press, Inc. Other product and company names mentioned herein may be the trademarks of their respective owners. Rather than use a trademark symbol with every occurrence of a trademarked name, we are using the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

The information in this book is distributed on an “As Is” basis, without warranty. While every precaution has been taken in the preparation of this work, neither the author nor No Starch Press, Inc. shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in it.

No Starch Press

2015-04-16T12:10:03-07:00

Automate the Boring Stuff with Python: Practical Programming for Total Beginners

Table of Contents

Dedication
About the Author
About the Tech Reviewer
Acknowledgments
Introduction

Whom Is This Book For?
Conventions
What Is Programming?

What Is Python?
Programmers Don’t Need to Know Much Math
Programming Is a Creative Activity

About This Book
Downloading and Installing Python
Starting IDLE

The Interactive Shell

How to Find Help
Asking Smart Programming Questions
Summary

I. Python Programming Basics

1. Python Basics

Entering Expressions into the Interactive Shell
The Integer, Floating-Point, and String Data Types
String Concatenation and Replication
Storing Values in Variables

Assignment Statements
Variable Names

Your First Program
Dissecting Your Program

Comments
The print() Function
The input() Function
Printing the User’s Name
The len() Function
The str(), int(), and float() Functions

Summary
Practice Questions

2. Flow Control

Boolean Values
Comparison Operators
Boolean Operators

Binary Boolean Operators
The not Operator

Mixing Boolean and Comparison Operators
Elements of Flow Control

Conditions
Blocks of Code

Program Execution
Flow Control Statements

if Statements
else Statements
elif Statements
while Loop Statements

An Annoying while Loop

break Statements
continue Statements
for Loops and the range() Function

An Equivalent while Loop
The Starting, Stopping, and Stepping Arguments to range()

Importing Modules

from import Statements

Ending a Program Early with sys.exit()
Summary
Practice Questions

3. Functions

def Statements with Parameters
Return Values and return Statements
The None Value
Keyword Arguments and print()
Local and Global Scope

Local Variables Cannot Be Used in the Global Scope
Local Scopes Cannot Use Variables in Other Local Scopes
Global Variables Can Be Read from a Local Scope
Local and Global Variables with the Same Name

The global Statement
Exception Handling
A Short Program: Guess the Number
Summary
Practice Questions
Practice Projects

The Collatz Sequence
Input Validation

4. Lists

The List Data Type

Getting Individual Values in a List with Indexes
Negative Indexes
Getting Sublists with Slices
Getting a List’s Length with len()
Changing Values in a List with Indexes
List Concatenation and List Replication
Removing Values from Lists with del Statements

Working with Lists

Using for Loops with Lists
The in and not in Operators
The Multiple Assignment Trick

Augmented Assignment Operators
Methods

Finding a Value in a List with the index() Method
Adding Values to Lists with the append() and insert() Methods
Removing Values from Lists with remove()
Sorting the Values in a List with the sort() Method

Example Program: Magic 8 Ball with a List
List-like Types: Strings and Tuples

Mutable and Immutable Data Types
The Tuple Data Type
Converting Types with the list() and tuple() Functions

References

Passing References
The copy Module’s copy() and deepcopy() Functions

Summary
Practice Questions
Practice Projects

Comma Code
Character Picture Grid

5. Dictionaries and Structuring Data

The Dictionary Data Type

Dictionaries vs. Lists
The keys(), values(), and items() Methods
Checking Whether a Key or Value Exists in a Dictionary
The get() Method
The setdefault() Method

Pretty Printing
Using Data Structures to Model Real-World Things

A Tic-Tac-Toe Board
Nested Dictionaries and Lists

Summary
Practice Questions
Practice Projects

Fantasy Game Inventory
List to Dictionary Function for Fantasy Game Inventory

6. Manipulating Strings

Working with Strings

String Literals

Double Quotes
Escape Characters
Raw Strings
Multiline Strings with Triple Quotes
Multiline Comments

Indexing and Slicing Strings
The in and not in Operators with Strings

Useful String Methods

The upper(), lower(), isupper(), and islower() String Methods
The isX String Methods
The startswith() and endswith() String Methods
The join() and split() String Methods
Justifying Text with rjust(), ljust(), and center()
Removing Whitespace with strip(), rstrip(), and lstrip()
Copying and Pasting Strings with the pyperclip Module

Project: Password Locker

Step 1: Program Design and Data Structures
Step 2: Handle Command Line Arguments
Step 3: Copy the Right Password

Project: Adding Bullets to Wiki Markup

Step 1: Copy and Paste from the Clipboard
Step 2: Separate the Lines of Text and Add the Star
Step 3: Join the Modified Lines

Summary
Practice Questions
Practice Project

Table Printer

II. Automating Tasks

7. Pattern Matching with Regular Expressions

Finding Patterns of Text Without Regular Expressions

Finding Patterns of Text with Regular Expressions

Creating Regex Objects
Matching Regex Objects
Review of Regular Expression Matching

More Pattern Matching with Regular Expressions

Grouping with Parentheses
Matching Multiple Groups with the Pipe
Optional Matching with the Question Mark
Matching Zero or More with the Star
Matching One or More with the Plus
Matching Specific Repetitions with Curly Brackets

Greedy and Nongreedy Matching
The findall() Method
Character Classes
Making Your Own Character Classes
The Caret and Dollar Sign Characters
The Wildcard Character

Matching Everything with Dot-Star
Matching Newlines with the Dot Character

Review of Regex Symbols
Case-Insensitive Matching
Substituting Strings with the sub() Method
Managing Complex Regexes
Combining re.IGNORECASE, re.DOTALL, and re.VERBOSE
Project: Phone Number and Email Address Extractor

Step 1: Create a Regex for Phone Numbers
Step 2: Create a Regex for Email Addresses
Step 3: Find All Matches in the Clipboard Text
Step 4: Join the Matches into a String for the Clipboard
Running the Program
Ideas for Similar Programs

Summary
Practice Questions
Practice Projects

Strong Password Detection
Regex Version of strip()

8. Reading and Writing Files

Files and File Paths

Backslash on Windows and Forward Slash on OS X and Linux
The Current Working Directory
Absolute vs. Relative Paths
Creating New Folders with os.makedirs()

The os.path Module

Handling Absolute and Relative Paths
Finding File Sizes and Folder Contents
Checking Path Validity

The File Reading/Writing Process

Opening Files with the open() Function
Reading the Contents of Files
Writing to Files

Saving Variables with the shelve Module
Saving Variables with the pprint.pformat() Function
Project: Generating Random Quiz Files

Step 1: Store the Quiz Data in a Dictionary
Step 2: Create the Quiz File and Shuffle the Question Order
Step 3: Create the Answer Options
Step 4: Write Content to the Quiz and Answer Key Files

Project: Multiclipboard

Step 1: Comments and Shelf Setup
Step 2: Save Clipboard Content with a Keyword
Step 3: List Keywords and Load a Keyword’s Content

Summary
Practice Questions
Practice Projects

Extending the Multiclipboard
Mad Libs
Regex Search

9. Organizing Files

The shutil Module

Copying Files and Folders
Moving and Renaming Files and Folders
Permanently Deleting Files and Folders
Safe Deletes with the send2trash Module

Walking a Directory Tree
Compressing Files with the zipfile Module

Reading ZIP Files
Extracting from ZIP Files
Creating and Adding to ZIP Files

Project: Renaming Files with American-Style Dates to European-Style Dates

Step 1: Create a Regex for American-Style Dates
Step 2: Identify the Date Parts from the Filenames
Step 3: Form the New Filename and Rename the Files
Ideas for Similar Programs

Project: Backing Up a Folder into a ZIP File

Step 1: Figure Out the ZIP File’s Name
Step 2: Create the New ZIP File
Step 3: Walk the Directory Tree and Add to the ZIP File
Ideas for Similar Programs

Summary
Practice Questions
Practice Projects

Selective Copy
Deleting Unneeded Files
Filling in the Gaps

10. Debugging

Raising Exceptions
Getting the Traceback as a String
Assertions

Using an Assertion in a Traffic Light Simulation
Disabling Assertions

Logging

Using the logging Module
Don’t Debug with print()
Logging Levels
Disabling Logging
Logging to a File

IDLE’s Debugger

Go
Step
Over
Out
Quit
Debugging a Number Adding Program
Breakpoints

Summary
Practice Questions
Practice Project

Debugging Coin Toss

11. Web Scraping

Project: mapit.py with the webbrowser Module

Step 1: Figure Out the URL
Step 2: Handle the Command Line Arguments
Step 3: Handle the Clipboard Content and Launch the Browser
Ideas for Similar Programs

Downloading Files from the Web with the requests Module

Downloading a Web Page with the requests.get() Function
Checking for Errors

Saving Downloaded Files to the Hard Drive

HTML

Resources for Learning HTML
A Quick Refresher
Viewing the Source HTML of a Web Page
Opening Your Browser’s Developer Tools
Using the Developer Tools to Find HTML Elements

Parsing HTML with the BeautifulSoup Module

Creating a BeautifulSoup Object from HTML
Finding an Element with the select() Method
Getting Data from an Element’s Attributes

Project: “I’m Feeling Lucky” Google Search

Step 1: Get the Command Line Arguments and Request the Search Page
Step 2: Find All the Results
Step 3: Open Web Browsers for Each Result
Ideas for Similar Programs

Project: Downloading All XKCD Comics

Step 1: Design the Program
Step 2: Download the Web Page
Step 3: Find and Download the Comic Image
Step 4: Save the Image and Find the Previous Comic
Ideas for Similar Programs

Controlling the Browser with the selenium Module

Starting a Selenium-Controlled Browser
Finding Elements on the Page
Clicking the Page
Filling Out and Submitting Forms
Sending Special Keys
Clicking Browser Buttons
More Information on Selenium

Summary
Practice Questions
Practice Projects

Command Line Emailer
Image Site Downloader
2048
Link Verification

12. Working with Excel Spreadsheets

Excel Documents
Installing the openpyxl Module
Reading Excel Documents

Opening Excel Documents with OpenPyXL
Getting Sheets from the Workbook
Getting Cells from the Sheets
Converting Between Column Letters and Numbers
Getting Rows and Columns from the Sheets
Workbooks, Sheets, Cells

Project: Reading Data from a Spreadsheet

Step 1: Read the Spreadsheet Data
Step 2: Populate the Data Structure
Step 3: Write the Results to a File
Ideas for Similar Programs

Writing Excel Documents

Creating and Saving Excel Documents
Creating and Removing Sheets
Writing Values to Cells

Project: Updating a Spreadsheet

Step 1: Set Up a Data Structure with the Update Information
Step 2: Check All Rows and Update Incorrect Prices
Ideas for Similar Programs

Setting the Font Style of Cells
Font Objects
Formulas
Adjusting Rows and Columns

Setting Row Height and Column Width
Merging and Unmerging Cells
Freeze Panes

Charts
Summary
Practice Questions

Practice Projects

Multiplication Table Maker
Blank Row Inserter
Spreadsheet Cell Inverter
Text Files to Spreadsheet
Spreadsheet to Text Files

13. Working with PDF and Word Documents

PDF Documents

Extracting Text from PDFs
Decrypting PDFs
Creating PDFs

Copying Pages
Rotating Pages
Overlaying Pages
Encrypting PDFs

Project: Combining Select Pages from Many PDFs

Step 1: Find All PDF Files
Step 2: Open Each PDF
Step 3: Add Each Page
Step 4: Save the Results
Ideas for Similar Programs

Word Documents

Reading Word Documents
Getting the Full Text from a .docx File
Styling Paragraph and Run Objects
Creating Word Documents with Nondefault Styles
Run Attributes
Writing Word Documents
Adding Headings
Adding Line and Page Breaks
Adding Pictures

Summary
Practice Questions
Practice Projects

PDF Paranoia
Custom Invitations as Word Documents
Brute-Force PDF Password Breaker

14. Working with CSV Files and JSON Data

The CSV Module

Reader Objects
Reading Data from Reader Objects in a for Loop
Writer Objects
The delimiter and lineterminator Keyword Arguments

Project: Removing the Header from CSV Files

Step 1: Loop Through Each CSV File
Step 2: Read in the CSV File
Step 3: Write Out the CSV File Without the First Row
Ideas for Similar Programs

JSON and APIs
The JSON Module

Reading JSON with the loads() Function
Writing JSON with the dumps() Function

Project: Fetching Current Weather Data

Step 1: Get Location from the Command Line Argument
Step 2: Download the JSON Data
Step 3: Load JSON Data and Print Weather
Ideas for Similar Programs

Summary
Practice Questions
Practice Project

Excel-to-CSV Converter

15. Keeping Time, Scheduling Tasks, and Launching Programs

The time Module

The time.time() Function
The time.sleep() Function

Rounding Numbers
Project: Super Stopwatch

Step 1: Set Up the Program to Track Times
Step 2: Track and Print Lap Times
Ideas for Similar Programs

The datetime Module

The timedelta Data Type
Pausing Until a Specific Date
Converting datetime Objects into Strings
Converting Strings into datetime Objects

Review of Python’s Time Functions
Multithreading

Passing Arguments to the Thread’s Target Function
Concurrency Issues

Project: Multithreaded XKCD Downloader

Step 1: Modify the Program to Use a Function
Step 2: Create and Start Threads
Step 3: Wait for All Threads to End

Launching Other Programs from Python

Passing Command Line Arguments to Popen()
Task Scheduler, launchd, and cron
Opening Websites with Python
Running Other Python Scripts
Opening Files with Default Applications

Project: Simple Countdown Program

Step 1: Count Down
Step 2: Play the Sound File
Ideas for Similar Programs

Summary
Practice Questions
Practice Projects

Prettified Stopwatch
Scheduled Web Comic Downloader

16. Sending Email and Text Messages

SMTP

Sending Email

Connecting to an SMTP Server
Sending the SMTP “Hello” Message
Starting TLS Encryption
Logging in to the SMTP Server
Sending an Email
Disconnecting from the SMTP Server

IMAP
Retrieving and Deleting Emails with IMAP

Connecting to an IMAP Server
Logging in to the IMAP Server
Searching for Email

Selecting a Folder
Performing the Search
Size Limits

Fetching an Email and Marking It As Read
Getting Email Addresses from a Raw Message
Getting the Body from a Raw Message
Deleting Emails
Disconnecting from the IMAP Server

Project: Sending Member Dues Reminder Emails

Step 1: Open the Excel File
Step 2: Find All Unpaid Members
Step 3: Send Customized Email Reminders

Sending Text Messages with Twilio

Signing Up for a Twilio Account
Sending Text Messages

Project: “Just Text Me” Module
Summary
Practice Questions
Practice Projects

Random Chore Assignment Emailer
Umbrella Reminder
Auto Unsubscriber
Controlling Your Computer Through Email

17. Manipulating Images

Computer Image Fundamentals

Colors and RGBA Values
Coordinates and Box Tuples

Manipulating Images with Pillow

Working with the Image Data Type
Cropping Images
Copying and Pasting Images onto Other Images
Resizing an Image
Rotating and Flipping Images
Changing Individual Pixels

Project: Adding a Logo

Step 1: Open the Logo Image
Step 2: Loop Over All Files and Open Images
Step 3: Resize the Images
Step 4: Add the Logo and Save the Changes
Ideas for Similar Programs

Drawing on Images

Drawing Shapes

Points
Lines
Rectangles
Ellipses
Polygons
Drawing Example

Drawing Text

Summary
Practice Questions
Practice Projects

Extending and Fixing the Chapter Project Programs
Identifying Photo Folders on the Hard Drive
Custom Seating Cards

18. Controlling the Keyboard and Mouse with GUI Automation

Installing the pyautogui Module

Staying on Track

Shutting Down Everything by Logging Out
Pauses and Fail-Safes

Controlling Mouse Movement

Moving the Mouse
Getting the Mouse Position

Project: “Where Is the Mouse Right Now?”

Step 1: Import the Module
Step 2: Set Up the Quit Code and Infinite Loop
Step 3: Get and Print the Mouse Coordinates

Controlling Mouse Interaction

Clicking the Mouse
Dragging the Mouse
Scrolling the Mouse

Working with the Screen

Getting a Screenshot
Analyzing the Screenshot

Project: Extending the mouseNow Program
Image Recognition
Controlling the Keyboard

Sending a String from the Keyboard
Key Names
Pressing and Releasing the Keyboard
Hotkey Combinations

Review of the PyAutoGUI Functions
Project: Automatic Form Filler

Step 1: Figure Out the Steps
Step 2: Set Up Coordinates
Step 3: Start Typing Data
Step 4: Handle Select Lists and Radio Buttons
Step 5: Submit the Form and Wait

Summary
Practice Questions

Practice Projects

Looking Busy
Instant Messenger Bot
Game-Playing Bot Tutorial

A. Installing Third-Party Modules

The pip Tool
Installing Third-Party Modules

B. Running Programs

Shebang Line
Running Python Programs on Windows
Running Python Programs on OS X and Linux

C. Answers to the Practice Questions

Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Chapter 14
Chapter 15
Chapter 16
Chapter 17
Chapter 18

D. Resources
Index
Copyright