R Object-oriented Programming
Table of Contents
R Object-oriented Programming
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Copyright violations
Questions
1. Data Types
Assignment
The workspace
Discrete data types
Integer
Logical
Character
Factors
Continuous data types
Double
Complex
Special data types
Notes on the as and is functions
Summary
2. Organizing Data
Basic data structures
Vectors
Lists
Data frames
Tables
Matrices and arrays
Censoring data
Appending rows and columns
Operations on data structures
The apply commands
apply
lapply and sapply
tapply
mapply
Summary
3. Saving Data and Printing Results
File and directory information
Entering data
Entering data from the command line
Reading tables from files
CSV files
Fixed-width files
Printing results and saving data
Saving a workspace
The cat command
The print, format, and paste commands
Primitive input/output
Network options
Opening a socket
Basic socket operations
Summary
4. Calculating Probabilities and Random Numbers
Overview
Distribution functions
Cumulative distribution functions
Inverse cumulative distribution functions
Generating pseudorandom numbers
Sampling
Summary
5. Character and String Operations
Basic string operations
Six focused tasks
Determining the length of a string
Location of a substring
Extracting or changing a substring
Transforming the case
Splitting strings
Creating formatted strings
Regular expressions
Summary
6. Converting and Defining Time Variables
Introduction and assumptions
Converting strings to time data types
Converting time data types to strings
Operations on time data types
Summary
7. Basic Programming
Conditional execution
Loop constructs
The for loop
The while loop
The repeat loop
Break and next statements
Functions
Defining a function
Arguments to functions
Scope
Executing scripts
Summary
8. S3 Classes
Defining classes and methods
Defining objects and inheritance
Encapsulation
A final note
Summary
9. S4 Classes
Introducing the Ant class
Defining an S4 class
Defining methods for an S4 class
Defining new methods
Polymorphism
Extending the existing methods
Inheritance
Miscellaneous notes
Summary
10. Case Study – Course Grades
Overview
The Course class
The definition of the Course class
Reading grades from a file
The assignment classes
The NumericGrade class
The LetterGrade class
Example – reading grades from a file
Defining indexing operations
Redefining existing functions
Redefining arithmetic operations
Summary
11. Case Study – Simulation
The simulation classes
The Monte-Carlo class
Examples
Summary
A. Package Management
Index
R Object-oriented Programming
Copyright © 2014 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: October 2014
Production reference: 1201014
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-668-2
www.packtpub.com
Credits
Author
Kelly Black
Reviewers
Mzabalazo Z. Ngwenya
Prabhanjan Tattar
Tengfei Yin
Commissioning Editor
Akram Hussain
Acquisition Editor
Richard Brookes-Bland
Content Development Editor
Parita Khedekar
Technical Editor
Tanvi Bhatt
Copy Editors
Simran Bhogal
Sarang Chari
Ameesha Green
Paul Hindle
Karuna Narayanan
Project Coordinator
Neha Thakur
Proofreaders
Simran Bhogal
Maria Gould
Ameesha Green
Paul Hindle
Indexer
Priya Sane
Graphics
Disha Haria
Abhinash Sahu
Production Coordinator
Alwin Roy
Cover Work
Alwin Roy
About the Author
Kelly Black is a faculty member in the Department of Mathematics at Clarkson University. His background is in numerical analysis with a focus on the use of spectral methods and stochastic differential equations. He makes extensive use of R in the analysis of the results of Monte-Carlo simulations.
In addition to using R for his research interests, Kelly also uses the R environment for his statistics classes. He has extensive experience sharing his experiences with R in the classroom. The use of R to explore datasets is an important part of the curriculum.
I would like to thank my wife and daughter for their support and inspiration. You are my life, and we do this together.
I also wish to thank the people who took a direct role in bringing this book to completion. In particular, thanks go to the Content Development Editor, Parita Khedekar, for her work in assembling everything, keeping me on track, and ensuring the overall integrity of this book. Additional thanks go to the Technical Editor, Tanvi Bhatt, for maintaining the integrity of the book as a whole and for technical guidance. Thank you!
I would also like to thank the reviewers. Unfortunately, some of your insights were not able to be integrated into this work. I did read your reviews and valued them. I tried to balance your concerns to the best of my ability. Thank you as well.
About the Reviewers
Mzabalazo Z. Ngwenya has worked extensively in the field of statistical consulting and currently works as a biometrician. He has an MSc in Mathematical Statistics from the University of Cape Town and is presently studying for a PhD. His research interests include statistical computing, machine learning, and spatial statistics. Previously, he was involved in reviewing Packt Publishing’s Learning RStudio for R Statistical Computing, R Statistical Application Development by Example Beginner’s Guide, and Machine Learning with R.
Prabhanjan Tattar is currently working as a Business Analysis Senior Advisor at Dell Global Analytics, Dell. He has 7 years of experience as a statistical analyst. Survival analysis and statistical inference are his main areas of research/interest, and he has published several research papers in peer-reviewed journals and also authored two books on R: R Statistical Application Development by Example, Packt Publishing, and A Course in Statistics with R, Narosa Publishing. He also maintains the R packages gpk and RSADBE.
Tengfei Yin earned his PhD in Molecular, Cellular, and Developmental Biology (MCDB) with a focus on computational biology and bioinformatics from Iowa State University, with a minor in Statistics. His research interests include information visualization, high-throughput biological data analysis, data mining, and applied statistical genetics. He has developed and maintained several software packages in R and Bioconductor.
Preface
The R environment is a powerful software suite that started as a model for the S language originally developed at Bell Laboratories. The original code base was created by Ross Ihaka and Robert Gentleman in 1993. It rapidly grew with the help of others, and it has since become a standard in statistical computing. The software suite itself has grown well beyond an implementation of a language and has become an “environment”. It is extensible, and the wide variety of packages that are available help make it a powerful resource that continues to grow in popularity and power.
Our aim in this book is to provide a resource for programming using the R language, and we assume that you will be making use of the R environment to implement and test your code. The book can be roughly divided into four parts. In the first part, we provide a discussion of the basic ideas and topics necessary to understand how R classifies data and the options that can be used to make calculations from data. In the second part, we provide a discussion of how R organizes data and the options available to keep track of data, display data, and read and save data. In the third part, we provide a discussion on programming topics specific to the R language and the options available for object-oriented programming. In the fourth part, we provide several extended examples as a way to demonstrate how all of the topics can fit together to solve problems.
What this book covers
A list of the chapters is given here. The first three chapters focus on the basic requirements associated with getting data into the system and the most basic tasks associated with calculations on data. The next three chapters focus on the miscellaneous issues that arise in practice when working with and examining data, including the mechanics of dealing with different data types. The next three chapters focus on basic and advanced programming topics. The final three chapters provide more detailed examples to demonstrate how all of the ideas can be brought together to solve problems.
Chapter 1, Data Types, offers a broad overview of the different data types. This includes basic representations such as float, double, complex, factors, and integer representations, and it also includes examples of how to enter vectors through the interactive shell. A brief discussion of the most basic operations and how to interact with the R shell is also given.
Chapter 2, Organizing Data, offers a more detailed look at the way data is organized within the R environment. Additional topics include how to access the data as well as how to perform basic operations on the various data structures. The primary data structures examined are lists, arrays, tables, and data frames.
Chapter 3, Saving Data and Printing Results, offers a detailed look at the ways to bring data into the R environment and builds on the topics discussed in the previous chapter. Additional topics revolve around the ways to display results as well as various ways to save data.
Chapter 4, Calculating Probabilities and Random Numbers, offers a detailed examination of the probability and sampling features of the R language. The R environment includes a number of features to aid in the way data can be analyzed. Any statistical analysis includes an underlying reliance on probability, and it is a topic that cannot be ignored. The availability of a wide variety of probability and sampling options is one of the strengths of the R language, and we explore some of the options in this chapter.
Chapter 5, Character and String Operations, offers a detailed examination of the various options available for examining, testing, and performing operations on strings. This is an important topic because it is not uncommon for datasets to have inconsistencies, and a routine that reads data from a file should include some basic checks.
Chapter 6, Converting and Defining Time Variables, offers a detailed examination of the time data structure. A basic introduction is given in the first chapter, and more details are provided in this chapter. The prevalence of time-related data makes the topic of these data structures too important to ignore.
Chapter 7, Basic Programming, offers a detailed examination of the most basic flow controls and programming features of the R language. The chapter provides details about conditional execution as well as the various looping constructs. Additionally, mundane topics associated with writing programs, execution, and formatting are also discussed.
Chapter 8, S3 Classes, offers a detailed examination of S3 classes. This is the first and most common approach to object-oriented programming. The use of S3 classes can be confusing to people already familiar with object-oriented programming, but their flexibility has made them a popular way to approach object-oriented programming in R.
Chapter 9, S4 Classes, offers a detailed examination of S4 classes. This is a more recent approach to object-oriented programming compared to S3 classes. It is a more structured approach and is more familiar to people who have experience with object-oriented programming.
Chapter 10, Case Study – Course Grades, offers an in-depth example of a grade-tracking application. This is the first of three examples, and it is the simplest example. It was chosen as it is something that is likely to be more familiar to a wider range of people.
Chapter 11, Case Study – Simulation, offers an in-depth example of an application that is used to generate data based on Monte-Carlo simulations. The application demonstrates how an object-oriented approach can be used to create an environment used to execute simulations, organize the results, and perform a basic analysis on the results.
Chapter 12, Case Study – Regression, offers an in-depth example of an application that offers a wide range of options you can use to perform regression. Regression is a common task and occurs in a wide variety of contexts. The application that is developed demonstrates a flexible way to handle both continuous and ordinal data as a way to demonstrate the use of a flexible object-oriented approach. You can download this chapter from https://www.packtpub.com/sites/default/files/downloads/6682OS_Case_Study_Regression.pdf
Appendix, Package Management, gives a brief overview of installing, updating, and removing packages. Packages are libraries that can be added to R that extend its capabilities. Being able to extend R and make use of other libraries represents one of R’s more powerful features.
What you need for this book
It is assumed that you will be working in the R environment, and the example code has been developed and tested for R version 3.0.1 and later. The R environment is a type of free software and is made available through the efforts and generosity of the R Foundation. It can be downloaded from http://www.r-project.org/. The material in the first half of the book assumes that you have access to R and can work from the interactive command line within the R environment. The material in the second half of the book assumes that you are familiar with programming and can write and save computer code. At a minimum, you should have access to a programming editor and should be familiar with directory structures and search paths.
Who this book is for
If you are familiar with programming and wish to gain a basic understanding of the R environment and learn how to create programming applications using the R language, this is the book for you. It is assumed that you have some exposure to the R environment and have a basic understanding of R. This book does not provide extensive motivations for certain approaches and practices, as it assumes that the reader is comfortable with the development of software applications.
Conventions
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: “A list is created using the list command, and a variable can be tested or coerced using the is.list and as.list commands.”
A block of code is set as follows:
> x = rnorm(5, mean=10, sd=3)
> x
[1] 11.172719  8.784284 10.074035  5.735171 10.800138
> pnorm(abs(x-10), mean=0, sd=3) - pnorm(-abs(x-10), mean=0, sd=3)
[1] 0.30413363 0.31469803 0.01968849 0.84486037 0.21030971
>
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
> v <- c(1, 3, 5, 7, -10)
> v
[1]   1   3   5   7 -10
> v[4]
[1] 7
> v[2] <- v[1] - v[5]
> v
[1]   1  11   5   7 -10
New terms and important words are shown in bold.
Note
Warnings or important notes appear in a box like this.
Tip
Tips and tricks appear like this.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
Downloading the example code
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. An additional source for the examples in this book can be found at https://github.com/KellyBlack/R-Object-Oriented-Programming. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Chapter 1. Data Types
In this chapter, we provide a broad overview of the different data types available in the R environment. This material is introductory in nature, and this chapter ensures that important information on implementing algorithms is available to you. There are roughly five parts in this chapter:
Working with variables in the R environment: This section gives you a broad overview of interacting with the R shell, creating variables, deleting variables, saving variables, and loading variables
Discrete data types: This section gives you an overview of the principal data types used to represent discrete data
Continuous data types: This section gives you an overview of the principal data types used to represent continuous data
Introduction to vectors: This section gives you an introduction to vectors and manipulating vectors in R
Special data types: This section gives you a list of other data types that do not fit in the other categories or have other meanings
Assignment
The R environment is an interactive shell. Commands are entered using the keyboard, and the environment should feel familiar to anyone used to MATLAB or the Python interactive interpreter. To assign a value to a variable, you can usually use the = symbol in the same way as in these other interpreters. The difference with R, however, is that there are other ways to assign a variable, and their behavior depends on the context.
Another way to assign a value to a variable is to use the <- symbol (sometimes called an operator). At first glance, it seems odd to have different ways to assign a value, but we will see that variables can be saved in different environments. The same name may be used in different environments, and the name can be ambiguous. We will adopt the use of the <- operator in this text because it is the most common operator, and it is also the least likely to cause confusion in different contexts.
The R environment manages memory and variable names dynamically. To create a new variable, simply assign a value to it, as follows:
> a <- 6
> a
[1] 6
A variable has a scope, and the meaning of a variable name can vary depending on the context. For example, if you refer to a variable within a function (think subroutine) or after attaching a dataset, then there may be multiple variables in the workspace with the same name. The R environment maintains a search path to determine which variable to use, and we will discuss these details as they arise.
The <- operator for assignment will work in any context, while the = operator only works for complete expressions. Another option is to use the <<- operator. The advantage of the <<- operator is that it instructs the R environment to search parent environments to see whether the variable already exists. In some contexts, within a function for example, the <- operator will create a new local variable; however, the <<- operator will make use of an existing variable outside of the function if it is found. If no such variable is found, the <<- operator creates one in the global environment.
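The difference between <- and <<- inside a function can be seen in a minimal sketch such as the following. The names counter, bump_local, and bump_parent are hypothetical and are used only for illustration:

```r
# Hypothetical names, chosen only for this illustration.
counter <- 0

bump_local <- function() {
  counter <- counter + 1    # creates a new local variable; the outer one is untouched
  counter
}

bump_parent <- function() {
  counter <<- counter + 1   # updates the existing variable in an enclosing environment
  counter
}

local_result  <- bump_local()    # 1
after_local   <- counter         # still 0
parent_result <- bump_parent()   # 1
after_parent  <- counter         # now 1
```

Because <<- changes state outside the function, it is usually best reserved for cases where that side effect is intended.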
Another way to assign variables is to use the -> and ->> operators. These operators are similar to those given previously. The only difference is that they reverse the direction of assignment, as follows:
> 14.5 -> a
> 1/12.0 ->> b
> a
[1] 14.5
> b
[1] 0.08333333
The workspace
The R environment keeps track of variables, and it allocates and manages memory as it is requested. One command to list the currently defined variables is the ls command. A variable can be deleted using the rm command. In the following example, the a and b variables are changed, and then the a variable is deleted:
> a <- 17.5
> b <- 99/4
> ls()
[1] "a" "b"
> objects()
[1] "a" "b"
> rm(a)
> ls()
[1] "b"
If you wish to delete all of the variables in the workspace, the list option in the rm command can be combined with the ls command, as follows:
> ls()
[1] "b"
> rm(list=ls())
> ls()
character(0)
A wide variety of other options are available. For example, there are directory options to show and set the current directory, as follows:
> getwd()
[1] "/home/black"
> setwd("/tmp")
> getwd()
[1] "/tmp"
> dir()
[1] "antActivity.R"             "betterS3.R"
[3] "chiSquaredArea.R"          "firstS3.R"
[5] "math100.csv"               "opsTesting.R"
[7] "probabilityExampleOne.png" "s3.R"
[9] "s4Example.R"
Another important task is to save and load a workspace. The save and save.image commands can be used to save the current workspace. The save command allows you to save particular variables, and the save.image command allows you to save the entire workspace. The usage of these commands is as follows:
> save(a, file="a.RData")
> save.image("wholeworkspace.RData")
These commands have a variety of options. For example, the ascii option is a commonly used option to ensure that the data file is in a (nearly) human-readable form. The help command can be used to get more details and see more of the options that are available. In the following example, the variable a is saved in a file, a.RData, and the file is saved in a human-readable format:
> save(a, file="a.RData", ascii=TRUE)
> save.image("wholeworkspace.RData", ascii=TRUE)
> help(save)
As an alternative to the help command, the ? operator can also be used to get the help page for a given command. An additional command is the help.search command, which is used to search the help files for a given string. The ?? operator is also available to perform a search for a given string.
The information in a file can be read back into the workspace using the load command:
> load("a.RData")
> ls()
[1] "a"
> a
[1] 19
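The save and load round trip can be sketched in a self-contained way as follows. The variable name scores is hypothetical, and a temporary file is used here (an assumption made so that the example does not overwrite anything in the working directory):

```r
# Hypothetical variable name; a temporary file keeps the example side-effect free.
scores <- c(17.5, 24.75)
path <- tempfile(fileext = ".RData")

save(scores, file = path)   # write one variable to disk
rm(scores)                  # remove it from the workspace
restored <- load(path)      # load() returns the names of the objects it restored

print(restored)             # the name of the restored variable
print(scores)               # the values are available again
unlink(path)                # delete the temporary file
```

Note that load restores objects under their original names, so it can silently overwrite a variable of the same name that already exists in the workspace.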
Another question that arises with respect to a variable is how it is stored. The two commands to determine this are mode and storage.mode. You should try to use these commands for each of the data types described in the following subsections. Basically, these commands can make it easier to determine whether a variable is a numeric value or another basic data type.
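As a small sketch of how these commands differ, the following compares mode, storage.mode, and typeof for a few values. The list name values and its element names are hypothetical:

```r
# Hypothetical sample values, one per basic type.
values <- list(whole = 5L, real = 5, flag = TRUE, text = "five", imag = 2i)

# Build a character matrix with one column per value and one row per command.
kinds <- sapply(values, function(v) {
  c(mode = mode(v), storage.mode = storage.mode(v), typeof = typeof(v))
})
print(kinds)
```

The integer and the double both report a mode of "numeric", and only storage.mode and typeof distinguish between them.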
The previous commands provide options for saving the values of the variables within a workspace. They do not save the commands that you have entered. These commands are referred to as the history within the R workspace, and you can save your history using the savehistory command. The history can be displayed using the history command, and the loadhistory command can be used to replay the commands in a file.
The last command given here is the command to quit, q(). Some people consider this to be the most important command because without it you would never be able to leave R. The rest of us are not sure why it is necessary.
Discrete data types
One of the features of the R environment is the rich collection of data types that are available. Here, we briefly list some of the built-in data types that describe discrete data. The four data types discussed are the integer, logical, character, and factor data types. We also introduce the idea of a vector, which is the default data structure for any variable. A list of the commands discussed here is given in Table 2 and Table 3.
It should be noted that the default data type in R, for a number, is a double precision number. Strings can be interpreted in a variety of ways, usually as either a string or a factor. You should be careful to make sure that R is storing information in the format that you want, and it is important to double-check this important aspect of how data is tracked.
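A quick check of these defaults might look like the following sketch. The variable names are hypothetical, and the stringsAsFactors argument is passed explicitly because its default value differs between R versions:

```r
x <- 5     # a plain numeric literal is stored as a double
y <- 5L    # the L suffix requests an integer

types <- c(x = typeof(x), y = typeof(y))
print(types)

same_value  <- x == y            # TRUE: the values compare equal
same_object <- identical(x, y)   # FALSE: the storage types differ

# Strings in a data frame are stored as factors only when requested here.
as_factor <- data.frame(g = c("a", "b"), stringsAsFactors = TRUE)
as_string <- data.frame(g = c("a", "b"), stringsAsFactors = FALSE)
print(c(is.factor(as_factor$g), is.character(as_string$g)))
```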
Integer
The first discrete data type examined is the integer type. Values are 32-bit integers. In most circumstances, a number must be explicitly cast as being an integer, as the default type in R is a double precision number. There are a variety of commands used to cast integers as well as allocate space for integers. The integer command takes a number for an argument and will return a vector of integers whose length is given by the argument:
> bubba <- integer(12)
> bubba
[1] 0 0 0 0 0 0 0 0 0 0 0 0
> bubba[1]
[1] 0
> bubba[2]
[1] 0
> bubba[[4]]
[1] 0
> bubba[4] <- 15
> bubba
[1]  0  0  0 15  0  0  0  0  0  0  0  0
In the preceding example, a vector of twelve integers was defined. The default values are zero, and the individual entries in the vector are accessed using square brackets. The first entry in the vector has index 1, so in this example, bubba[1] refers to the initial entry in the vector. Note that there are two ways to access an element in the vector: single versus double brackets. For a vector, the two methods are nearly the same, but when we explore the use of lists as opposed to vectors, the meaning will change. In short, the double brackets return objects of the same type as the elements within the vector, and the single brackets return values of the same type as the variable itself. For example, using single brackets on a list will return a list, while double brackets may return a vector.
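The distinction between single and double brackets is easiest to see on a list. The following is a minimal sketch; the list record and its element names are hypothetical:

```r
# Hypothetical list with an integer and a character vector.
record <- list(id = 7L, tags = c("x", "y"))

outer <- record[2]      # single brackets: a list of length one
inner <- record[[2]]    # double brackets: the element itself

print(class(outer))     # a list
print(class(inner))     # a character vector

# For an atomic vector without names, the two forms give the same value.
v <- c(10, 20, 30)
same <- identical(v[2], v[[2]])
```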
A number can be cast as an integer using the as.integer command. A variable’s type can be checked using the typeof command. The typeof command indicates how R stores the object and is different from the class command, which reports an attribute that you can change or query:
> as.integer(13.2)
[1] 13
> thisNumber <- as.integer(8/3)
> typeof(thisNumber)
[1] "integer"
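Two details of the preceding commands can be sketched as follows: as.integer truncates toward zero rather than rounding, and changing the class attribute does not change how the values are stored. The class name reading is hypothetical:

```r
# as.integer drops the fractional part (truncation toward zero).
truncated <- as.integer(c(13.2, -2.7, 8/3))
print(truncated)

# The class attribute can be changed; the underlying storage type cannot
# be changed this way. "reading" is a hypothetical class name.
x <- truncated
class(x) <- "reading"
print(typeof(x))   # storage type is unchanged
print(class(x))    # the attribute that was just set
```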
Note that a sequence of numbers can be automatically created using either the : operator or the seq command:
> 1:5
[1] 1 2 3 4 5
> myNum <- as.integer(1:5)
> myNum[1]
[1] 1
> myNum[3]
[1] 3
> seq(4, 11, by=2)
[1]  4  6  8 10
> otherNums <- seq(4, 11, by=2)
> otherNums[3]
[1] 8
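A few related ways to build sequences are sketched below. Note that the : operator with whole-number endpoints already returns integers, so the as.integer call in the preceding example is not strictly required:

```r
already_integer <- identical(1:5, as.integer(1:5))   # TRUE

first_four <- seq_len(4)                  # 1 2 3 4
quarters   <- seq(0, 1, length.out = 5)   # five evenly spaced values from 0 to 1

# seq_len(0) is empty, whereas 1:0 counts down from 1 to 0.
empty      <- seq_len(0)
count_down <- 1:0
```

The last two lines matter when a sequence is used to drive a loop over a vector that might be empty.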
A common task is to determine whether or not a variable is of a certain type. For integers, the is.integer command is used to determine whether or not a variable has an integer type:
> a <- 1.2
> typeof(a)
[1] "double"
> is.integer(a)
[1] FALSE
> a <- as.integer(1.2)
> typeof(a)
[1] "integer"
> is.integer(a)
[1] TRUE
Logical
Logical data consists of variables that are either true or false. The words TRUE and FALSE are used to designate the two possible values of a logical variable. (The TRUE value can also be abbreviated to T, and the FALSE value can be abbreviated to F.) The basic commands associated with logical variables are similar to the commands for integers discussed in the previous subsection. The logical command is used to allocate a vector of Boolean values. In the following example, a logical vector of length 10 is created. The default value is FALSE, and the Boolean not operator is used to flip the values to evaluate to TRUE:
> b <- logical(10)
> b
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> b[3]
[1] FALSE
> !b
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> !b[5]
[1] TRUE
> typeof(b)
[1] "logical"
> mode(b)
[1] "logical"
> storage.mode(b)
[1] "logical"
> b[3] <- TRUE
> b
[1] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
To cast a value to a logical type, you can use the as.logical command. Note that zero is mapped to a value of FALSE and other numbers are mapped to a value of TRUE:
> a <- -1:1
> a
[1] -1  0  1
> as.logical(a)
[1]  TRUE FALSE  TRUE
To determine whether or not a value has a logical type, you use the is.logical command:
> b <- logical(4)
> b
[1] FALSE FALSE FALSE FALSE
> is.logical(b)
[1] TRUE
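The standard operators for logical operations are available, and a list of some of the more common operations is given in Table 1. Note that there is a difference between operations such as & and &&. A single & performs an and operation on each pair of corresponding elements of two vectors and returns a vector, while the double && returns a single logical result. In the R versions that this book targets (3.x), && and || silently use only the first element of each vector; since R 4.3.0, passing a vector of length greater than one to && or || is an error, so the && and || lines in the following example run as shown only in older versions:
> l1 <- c(TRUE, FALSE)
> l2 <- c(TRUE, TRUE)
> l1 & l2
[1]  TRUE FALSE
> l1 && l2
[1] TRUE
> l1 | l2
[1] TRUE TRUE
> l1 || l2
[1] TRUE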
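A version-independent way to reduce logical vectors to a single value is sketched below. It assumes only base R; the variable names are hypothetical. The all and any commands collapse an entrywise result, and explicit indexing makes the "first element only" behavior visible:

```r
l1 <- c(TRUE, FALSE)
l2 <- c(TRUE, TRUE)

entrywise  <- l1 & l2           # one result per pair of elements
all_true   <- all(l1 & l2)      # TRUE only if every pair is TRUE
any_true   <- any(l1 & l2)      # TRUE if at least one pair is TRUE
first_only <- l1[1] && l2[1]    # scalar comparison of the first elements

# && and || short-circuit: the right-hand side is not evaluated
# when the left-hand side already decides the result.
guarded <- FALSE && stop("never evaluated")
```

The scalar forms && and || are the usual choice inside if conditions, where exactly one TRUE or FALSE is expected.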
The following table shows various logical operators and their descriptions:
Logical Operator    Description
<                   Less than
>                   Greater than
<=                  Less than or equal to
>=                  Greater than or equal to
==                  Equal to
!=                  Not equal to
|                   Entrywise or
||                  Or
!                   Not
&                   Entrywise and
&&                  And
xor(a,b)            Exclusive or
Table 1 – list of operators for logical variables
Character
One common way to store information is to save data as characters or strings. Character data is defined using either single or double quotes:
> a <- "hello"
> a
[1] "hello"
> b <- 'there'
> b
[1] "there"
> typeof(a)
[1] "character"
The character command can be used to allocate a vector of character-valued strings, as follows:
> many <- character(3)
> many
[1] "" "" ""
> many[2] <- "this is the second"
> many[3] <- 'yo, third!'
> many[1] <- "and the first"
> many
[1] "and the first"      "this is the second" "yo, third!"
A value can be cast as a character using the as.character command, as follows:
> a <- 3.0
> a
[1] 3
> b <- as.character(a)
> b
[1] "3"
Finally, the is.character command takes a single argument, and it returns a value of TRUE if the argument is a string:
> a <- as.character(4.5)
> a
[1] "4.5"
> is.character(a)
[1] TRUE
Factors
Another common way to record data is to provide a discrete set of levels. For example, the results of an individual trial in an experiment may be denoted by a value of a, b, or c. Categorical data of this kind is referred to as a factor in R. The commands and ideas are roughly parallel to the data types described previously. There are some subtle differences with factors, though. Factors are used to designate different levels and can be considered ordered or unordered. There are a large number of options, and it is wise to consult the help pages for factors using the help(factor) command. One thing to note, though, is that the typeof command for a factor will return "integer".
Factors can be defined using the factor command. When the levels are not specified, they are taken from the data in sorted order, as follows:
> lev <- factor(x=c("one","two","three","one"))
> lev
[1] one   two   three one
Levels: one three two
> levels(lev)
[1] "one"   "three" "two"
> sort(lev)
[1] one   one   three two
Levels: one three two
> lev <- factor(x=c("one","two","three","one"), levels=c("one","two","three"))
> lev
[1] one   two   three one
Levels: one two three
> levels(lev)
[1] "one"   "two"   "three"
> sort(lev)
[1] one   one   two   three
Levels: one two three
The techniques used to cast a variable to a factor or test whether a variable is a factor are similar to the previous examples. A variable can be cast as a factor using the as.factor command. Also, the is.factor command can be used to determine whether or not a variable is a factor.
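Because a factor is stored as integer codes with a table of level labels, casting a factor directly to a number returns the codes rather than the labels. The following sketch illustrates this; the variable names sizes and readings are hypothetical:

```r
# Hypothetical factor with explicitly ordered levels.
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"))

print(typeof(sizes))       # the storage type is integer
print(as.integer(sizes))   # the underlying codes, by position in levels(sizes)
print(is.factor(sizes))

# A factor can also be marked as ordered.
ranked <- factor(c("small", "large"),
                 levels = c("small", "medium", "large"), ordered = TRUE)
print(is.ordered(ranked))

# Numbers stored as a factor: convert through character to recover the values.
readings <- factor(c("10", "20", "10"))
codes  <- as.integer(readings)                  # level codes, not the numbers
values <- as.numeric(as.character(readings))    # the original numbers
```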
Continuous data types
The data types used to represent continuous data are given here. These are the double and complex data types. A list of the commands discussed here is given in Table 2 and Table 3.
Double
The default numeric data type in R is a double precision number. The commands are similar to those of the integer data type discussed previously. The double command can be used to allocate a vector of double precision numbers, and the numbers within the vector are accessed using square brackets:
> d <- double(8)
> d
[1] 0 0 0 0 0 0 0 0
> typeof(d)
[1] "double"
> d[3] <- 17
> d
[1]  0  0 17  0  0  0  0  0
The techniques used to cast a variable to a double precision number and test whether a variable is a double precision number are similar to the examples seen previously. A variable can be cast as a double precision number using the as.double command. Also, to determine whether a variable is a double precision number, the is.double command can be used.
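A brief sketch of casting to and testing for double precision numbers follows; the variable names are hypothetical:

```r
checks <- c(
  literal = is.double(5),     # a plain literal is a double
  integer = is.double(5L),    # an integer is not a double
  numeric = is.numeric(5L)    # but both integers and doubles are numeric
)
print(checks)

from_text    <- as.double("3.14")   # strings holding numbers can be converted
from_integer <- as.double(7L)
print(typeof(from_integer))
```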
Complex
Arithmetic for complex numbers is supported in R, and most math functions will react properly when given a complex number. You can append i to the end of a number to force it to be the imaginary part of a complex number, as follows:
> 1i
[1] 0+1i
> 1i*1i
[1] -1+0i
> z <- 3+2i
> z
[1] 3+2i
> z*z
[1] 5+12i
> Mod(z)
[1] 3.605551
> Re(z)
[1] 3
> Im(z)
[1] 2
> Arg(z)
[1] 0.5880026
> Conj(z)
[1] 3-2i
The complex command can also be used to define a vector of complex numbers. There are a number of options for the complex command, so a quick check of the help page, help(complex), is recommended:
> z <- complex(3)
> z
[1] 0+0i 0+0i 0+0i
> typeof(z)
[1] "complex"
> z <- complex(real=c(1,2), imag=c(3,4))
> z
[1] 1+3i 2+4i
> Re(z)
[1] 1 2
The techniques to cast a variable to a complex number and to test whether or not a variable is a complex number are similar to the methods seen previously. A variable can be cast as complex using the as.complex command. Also, to test whether or not a variable is a complex number, the is.complex command can be used.
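One place where the cast matters is the square root of a negative number. As a minimal sketch (the variable names are hypothetical), a negative double yields NaN with a warning, while the same value cast to complex yields an imaginary result:

```r
tests <- c(is.complex(3+2i), is.complex(3))   # TRUE, FALSE

real_root    <- suppressWarnings(sqrt(-4))    # NaN for a double argument
complex_root <- sqrt(as.complex(-4))          # an imaginary result

print(real_root)
print(complex_root)
```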
Special data types
There are two other common data types that are important. We will discuss these two data types and provide a note about objects. The two data types are NA and NULL. These are brief comments, as these are recurring topics that we will revisit many times.
The first data type is a constant, NA. This is a type used to indicate a missing value. It is a constant in R, and a variable can be tested using the is.na command, as follows:
> n <- c(NA, 2, 3, NA, 5)
> n
[1] NA  2  3 NA  5
> is.na(n)
[1]  TRUE FALSE FALSE  TRUE FALSE
> n[!is.na(n)]
[1] 2 3 5
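Missing values propagate through most calculations, so they usually have to be handled explicitly. The following sketch reuses the vector n from the preceding example; the other variable names are hypothetical:

```r
n <- c(NA, 2, 3, NA, 5)

total_raw   <- sum(n)                  # NA: a missing value makes the sum unknown
total_clean <- sum(n, na.rm = TRUE)    # ignore the missing values
average     <- mean(n, na.rm = TRUE)
how_many    <- sum(is.na(n))           # count the missing values
positions   <- which(is.na(n))         # indices of the missing values

# Comparing with == does not detect NA; the result is itself NA.
compared <- n == NA
```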
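Another special type is the NULL type. It plays a role similar to NULL in the C language. It does not hold a value but represents an empty or undefined object, and it is often used to indicate that an object or an element of a list does not exist:
> a <- NULL
> typeof(a)
[1] "NULL"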
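The following sketch shows a few common ways that NULL appears in practice. The list settings and its element names are hypothetical:

```r
a <- NULL
checks <- c(is.null(a), length(a) == 0)   # NULL is tested with is.null and has length zero

# Asking a list for an element that does not exist returns NULL.
settings <- list(verbose = TRUE)
missing_element <- is.null(settings$colour)

# NULL disappears when combined into a vector, while NA occupies a position.
without_null <- c(1, NULL, 2)
with_na      <- c(1, NA, 2)
print(length(without_null))
print(length(with_na))
```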
Finally, we'll quickly explore the term objects. The variables that we defined in all of the preceding examples are treated as objects within the R environment. When we start writing functions and creating classes, it will be important to realize that they are treated like variables. The names used to assign variables are just a shortcut for R to determine where an object is located.
For example, the complex command is used to allocate a vector of complex values. The command is defined to be a set of instructions, and there is an object called complex that points to those instructions:
> complex
function (length.out = 0L, real = numeric(), imaginary = numeric(),
    modulus = 1, argument = 0)
{
    if (missing(modulus) && missing(argument)) {
        .Internal(complex(length.out, real, imaginary))
    }
    else {
        n <- max(length.out, length(argument), length(modulus))
        rep_len(modulus, n) * exp((0+1i) * rep_len(argument,
            n))
    }
}
<bytecode: 0x2489c80>
<environment: namespace:base>
There is a difference between calling the complex() function and referring to the set of instructions located at complex.
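The distinction between the function object and a call to it can be seen in a short sketch. The names f and z are hypothetical and are used only for this illustration.

```r
# Referring to the name without parentheses gives the function object itself
f <- complex                 # f is now a second name for the same instructions
fIsFunction <- is.function(f)    # TRUE

# Adding parentheses calls the function and returns its result
z <- f(real = 1, imaginary = 2)
z                            # 1+2i
zIsFunction <- is.function(z)    # FALSE: z is a complex number, not a function
```

Because functions are ordinary objects, they can be assigned to new names, stored in lists, and passed as arguments to other commands.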
Notes on the as and is functions
Two common tasks are to determine whether a variable is of a given type and to cast a variable to different types. The commands to determine whether a variable is of a given type generally start with the is prefix, and the commands to cast a variable to a different type generally start with the as prefix. The list of commands to determine whether a variable is of a given type is given in the following table:
Type to check    Command
Integer          is.integer
Logical          is.logical
Character        is.character
Factor           is.factor
Double           is.double
Complex          is.complex
NA               is.na
List             is.list
Table 2 – commands to determine whether a variable is of a particular type
The commands used to cast a variable to a different type are given in Table 3. These commands take a single argument and return a variable of the given type. For example, the as.character command can be used to convert a number to a string. There is no as.na command; a missing value is assigned directly using the NA constant.
Type to convert to    Command
Integer               as.integer
Logical               as.logical
Character             as.character
Factor                as.factor
Double                as.double
Complex               as.complex
List                  as.list
Table 3 – commands to cast a variable into a particular type
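The following sketch shows a few of the commands from the two tables working together. The variable names are chosen for illustration only.

```r
x <- 3.7
xIsDouble <- is.double(x)       # TRUE
xIsInteger <- is.integer(x)     # FALSE

# Casting to an integer truncates toward zero
n <- as.integer(x)
n                               # 3
nIsInteger <- is.integer(n)     # TRUE

# Casting a number to a string
s <- as.character(x)
s                               # "3.7"

# A cast that cannot be carried out yields NA (and normally a warning)
bad <- suppressWarnings(as.integer("three"))
is.na(bad)                      # TRUE
```

Note that a failed cast does not stop the program; it quietly produces NA, so it is worth testing the result with is.na when the input is not trusted.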
Summary
In this chapter, we examined some of the data types available in the R environment. These include discrete data types such as integers and factors. They also include continuous data types such as real and complex data types. We also examined ways to test a variable to determine what type it is.
In the next chapter, we look at the data structures that can be used to keep track of data. This includes vectors and data types such as lists and data frames that can be constructed from vectors.
Chapter 2. Organizing Data
In this chapter, we will explore the primary data structures that are used to organize data. Some of the details about accessing information within data structures will be discussed, and some of the ways to apply different operations to parts of the data within a data structure will be discussed too. There are roughly three parts to this chapter:
Basic data structures: This section gives you information on using vectors, lists, data frames, tables, matrices, and time series
Accessing and managing memory: This section gives you an overview of the basic ways to gain access and censor specific elements
Operations on data structures: This section gives you an overview of the operations and methods used to apply operations within the different kinds of data structures
Basic data structures
The basic data structures used to organize data within the R environment include vectors, lists, data frames, tables, and matrices. Here, we provide details for each of these data structures and demonstrate how to create them. This chapter does not include information about how to read data from a file, and the focus is on the data structures themselves. More information about reading from a file can be found in Chapter 3, Saving Data and Printing Results.
Vectors
The default data structure in R is the vector. For example, if you define a variable as a single number, R will treat it as a vector of length one:
> a <- 5
> a[1]
[1] 5
Vectors represent a convenient and straightforward way to store a long list of numbers. Please see Chapter 1, Data Types, to see more examples of creating vectors. One useful and common way to define a vector is to use the c command. The c command concatenates a set of arguments to form a single vector:
> v <- c(1,3,5,7,-10)
> v
[1]   1   3   5   7 -10
> v[4]
[1] 7
> v[2] <- v[1]-v[5]
> v
[1]   1  11   5   7 -10
Two other methods to generate vectors make use of the : notation and the seq command. The : notation is used to create a vector of sequentially numbered values for given start and end values. The seq command does the same thing, but it provides more options to determine the increment between values in the vector:
> 1:5
[1] 1 2 3 4 5
> 10:14
[1] 10 11 12 13 14
> a <- 3:7
> a
[1] 3 4 5 6 7
> b <- seq(3,5)
> b
[1] 3 4 5
> b <- seq(3,10,by=3)
> b
[1] 3 6 9
Lists
Another important type is the list. Lists are flexible and are an unstructured way of organizing information. A list can be thought of as a collection of named objects. A list is created using the list command, and a variable can be tested or coerced using the is.list and as.list commands. A component within a list is accessed using the $ character to denote which object within the list to use. As an example, suppose that we want to create a list to keep track of the parameters for an experiment. The first component, called means, will be a vector of the assumed means. The second component will be the confidence level, and the third component will be the value of a parameter for the experiment called maxRealEigen:
> assumedMeans <- c(1.0,1.5,2.1)
> alpha <- 0.05
> eigenValue <- 3+2i
> l <- list(means=assumedMeans, alpha=alpha, maxRealEigen=eigenValue)
> l
$means
[1] 1.0 1.5 2.1

$alpha
[1] 0.05

$maxRealEigen
[1] 3+2i

> l$means
[1] 1.0 1.5 2.1
> l$means[2]
[1] 1.5
The names and attributes commands can be used to determine the components within a list. The attributes command is a more generic command that can be used to list the components of classes and a wider range of objects. Note that the names command can also be used to rename the components of a list. In the following example, we use the previous example but change the names of the elements:
> l <- list(means=c(1.0,1.5,2.1), alpha=0.05, maxRealEigen=3+2i)
> names(l)
[1] "means"        "alpha"        "maxRealEigen"
> names(l) <- c("assumedMeans","confidenceLevels","maximumRealValue")
> l
$assumedMeans
[1] 1.0 1.5 2.1

$confidenceLevels
[1] 0.05

$maximumRealValue
[1] 3+2i
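The attributes command is mentioned above but not demonstrated. The following sketch shows what it returns for the same list. The double-bracket form of access is an addition to the $ notation described above and is shown here only as an alternative.

```r
l <- list(means = c(1.0, 1.5, 2.1), alpha = 0.05, maxRealEigen = 3+2i)

# For a plain list, the only attribute is the vector of component names
listAttributes <- attributes(l)
names(listAttributes)        # "names"
listAttributes$names         # "means" "alpha" "maxRealEigen"

# Components can also be reached by name or by position with double brackets
byName <- l[["alpha"]]       # 0.05
byPosition <- l[[1]]         # 1.0 1.5 2.1
```

The double-bracket form is convenient when the name of the component is itself stored in a variable, which the $ notation cannot handle.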
Data frames
A data frame is similar to a list, and many of the operations are similar. The primary difference is that all of the components of a data frame must have the same number of elements. This is one of the most common ways to store information, and many of the functions available to read data from a file return a data frame by default. For example, suppose we ask five people two questions. The first question is, "Do you have a pet cat?" The second question is, "How many rooms in your house need new carpet?":
> d <- data.frame(Q1=as.factor(c("y","n","y","y","n")),
+                 Q2=c(2,0,1,2,0))
> d
  Q1 Q2
1  y  2
2  n  0
3  y  1
4  y  2
5  n  0
> d$Q1
[1] y n y y n
Levels: n y
> summary(d)
 Q1          Q2
 n:2   Min.   :0
 y:3   1st Qu.:0
       Median :1
       Mean   :1
       3rd Qu.:2
       Max.   :2
The names and attributes commands have the same behaviors with data frames as with lists. In the following example, we take the data frame defined in the previous example and rename the fields to something more descriptive:
> d <- data.frame(Q1=as.factor(c("y","n","y","y","n")),
+                 Q2=c(2,0,1,2,0))
> names(d) <- c("HaveCat","NumberRooms")
> d
  HaveCat NumberRooms
1       y           2
2       n           0
3       y           1
4       y           2
5       n           0
Tables
Tables can be easily constructed, and R will automatically generate frequency tables from categorical data. The table command has a number of options, but we focus on basic examples here. More details can be found using the help(table) command. In the next example, we take the data from the preceding cat questions and create a table from the answers to the first question:
> d <- data.frame(Q1=as.factor(c("y","n","y","y","n")),
+                 Q2=c(2,0,1,2,0))
> q1Results <- table(d$Q1)
> q1Results

n y
2 3
> summary(q1Results)
Number of cases in table: 5
Number of factors: 1
If you wish to create a two-way table, then simply provide two vectors to the table command to get the cross tabulation. Again, we look at the data from the cat questions. Note that we have to convert the second question to a factor:
> d <- data.frame(Q1=as.factor(c("y","n","y","y","n")),
+                 Q2=c(2,0,1,2,0))
> results <- table(d$Q1, as.factor(d$Q2))
> results

    0 1 2
  n 2 0 0
  y 0 1 2
> summary(results)
Number of cases in table: 5
Number of factors: 2
Test for independence of all factors:
        Chisq = 5, df = 2, p-value = 0.08208
        Chi-squared approximation may be incorrect
The rows and columns of the table have names associated with them, and the rownames and colnames commands can be used to assign the names. These commands are similar to the names command. In the preceding example, the names in the table are not descriptive. In the following example, we build the table and rename the rows and columns:
> d <- data.frame(Q1=as.factor(c("y","n","y","y","n")),
+                 Q2=c(2,0,1,2,0))
> results <- table(d$Q1, as.factor(d$Q2))
> rownames(results) <- c("NoCat","HasCat")
> colnames(results) <- c("Noroom","Oneroom","Tworooms")
> results

         Noroom Oneroom Tworooms
  NoCat       2       0        0
  HasCat      0       1        2
One last note: the table command expects categorical data, that is, factors or values that can be treated as factors. If you have numeric data, it can be quickly transformed to encode which interval contains each number. The cut command takes the numeric data and a vector of break points that indicate the cut-off points between each interval, as follows:
> a <- c(-0.8,-0.7,0.9,-1.4,-0.3,1.2)
> b <- cut(a, breaks=c(-1.5,-1,-0.5,0,0.5,1.0,1.5))
> b
[1] (-1,-0.5] (-1,-0.5] (0.5,1]   (-1.5,-1] (-0.5,0]  (1,1.5]
Levels: (-1.5,-1] (-1,-0.5] (-0.5,0] (0,0.5] (0.5,1] (1,1.5]
> summary(b)
(-1.5,-1] (-1,-0.5]  (-0.5,0]   (0,0.5]   (0.5,1]   (1,1.5]
        1         2         1         0         1         1
> table(b)
b
(-1.5,-1] (-1,-0.5]  (-0.5,0]   (0,0.5]   (0.5,1]   (1,1.5]
        1         2         1         0         1         1
Matrices and arrays
Tables are a special case of an array. An array or a matrix can be constructed directly using either the array or matrix commands. The array command takes a vector and dimensions, and it constructs an array using column major order. If you wish to provide the data in row major order, then the command to transpose the result is t:
> a <- c(1,2,3,4,5,6)
> A <- array(a, c(2,3))
> A
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> t(A)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
You are not limited to two-dimensional arrays. The dim option can include any number of dimensions. In the following example, a three-dimensional array is created by giving three numbers for the dimensions:
> A <- array(1:24, c(2,3,4))
> A
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

, , 3

     [,1] [,2] [,3]
[1,]   13   15   17
[2,]   14   16   18

, , 4

     [,1] [,2] [,3]
[1,]   19   21   23
[2,]   20   22   24

> A[2,3,4]
[1] 24
A matrix is a two-dimensional array and is a special case that can be created using the matrix command. Rather than using the dimensions, the matrix command requires that you specify the number of rows or columns. The command has an additional option to specify whether or not the data is in row major or column major order:
> B <- matrix(1:12, nrow=3)
> B
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> B <- matrix(1:12, nrow=3, byrow=TRUE)
> B
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
Both matrices and arrays can be manipulated to determine or change their dimensions. The dim command can be used to get or set this information:
> C <- matrix(1:12, ncol=3)
> C
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
> dim(C)
[1] 4 3
> dim(C) <- c(3,4)
> C
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
Censoring data
Using a logical vector as an index is useful for limiting the data that is examined. For example, to limit a vector to examine only the positive values in the data set, a logical comparison can be used for the index into the vector:
> u <- 1:6
> v <- c(-1,1,-2,2,-3,3)
> u
[1] 1 2 3 4 5 6
> v
[1] -1  1 -2  2 -3  3
> u[v > 0]
[1] 2 4 6
> u[v < 0] = -2*u[v < 0]
> u
[1]  -2   2  -6   4 -10   6
Another useful aspect of a logical index into a vector is the use of the NA data type. The is.na function and the logical NOT operator (!) can be a useful tool when a vector contains data that is not defined:
> v <- c(1,2,3,NA,4,NA)
> v
[1]  1  2  3 NA  4 NA
> v[is.na(v)]
[1] NA NA
> v[!is.na(v)]
[1] 1 2 3 4
Note that many functions have optional arguments to specify how R should react to data that contains a value with the NA type. Unfortunately, the way this is done is not consistent, and you should use the help command with respect to any particular function:
> v <- c(1,2,3,NA,4,NA)
> v
[1]  1  2  3 NA  4 NA
> mean(v)
[1] NA
> mean(v, na.rm=TRUE)
[1] 2.5
In this last example, the na.rm option in the mean function is set to TRUE to specify that R should ignore the entries in the vector that are NA.
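To illustrate the inconsistency mentioned above, the following sketch shows three base commands that each use a different argument to control the treatment of NA values. The second vector, w, is made up for this illustration.

```r
v <- c(1, 2, 3, NA, 4, NA)

# sum uses na.rm, just like mean
total <- sum(v, na.rm = TRUE)             # 10

# table silently drops NA values unless useNA is given
counted <- table(v)                        # four entries
countedWithNA <- table(v, useNA = "ifany") # five entries, one of them for NA

# cor uses the use argument instead of na.rm
w <- c(2, 4, 6, 8, NA, 12)                 # hypothetical second variable
r <- cor(v, w, use = "complete.obs")       # only positions 1 to 3 are complete
```

Here only the first three positions have values in both v and w, and those values lie on a straight line, so the correlation is 1.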
Appending rows and columns
The cbind and rbind commands can be used to append data to existing objects. These commands can be used on vectors, matrices, and arrays, and they are extended to also act on data frames. The following examples use data frames, as that is a common operation. You should be careful and try the commands on arrays to make sure that the operation behaves in the way you expect.
The cbind command is used to combine the columns of the data given as arguments:
> d <- data.frame(one=c(1,2,3), two=as.factor(c("one","two","three")))
> e <- c("ein","zwei","drei")
> newDataFrame <- cbind(d, third=e)
> newDataFrame
  one   two third
1   1   one   ein
2   2   two  zwei
3   3 three  drei
> newDataFrame$third
[1] ein  zwei drei
Levels: drei ein zwei
If the arguments to the cbind command are two data frames (or two arrays), then the command combines all of the columns from all of the data frames (arrays):
> d <- data.frame(one=c(1,2,3), two=as.factor(c("one","two","three")))
> e <- data.frame(three=c(4,5,6), four=as.factor(c("vier","funf","sechs")))
> newDataFrame <- cbind(d, e)
> newDataFrame
  one   two three  four
1   1   one     4  vier
2   2   two     5  funf
3   3 three     6 sechs
The rbind command concatenates the rows of the objects passed to it. The command uses the names of the columns to determine how to append the data. The number and names of the columns must be identical:
> d <- data.frame(one=c(1,2,3), two=as.factor(c("one","two","three")))
> e <- data.frame(one=c(4,5,6), two=as.factor(c("vier","funf","sechs")))
> newDataFrame <- rbind(d, e)
> newDataFrame
  one   two
1   1   one
2   2   two
3   3 three
4   4  vier
5   5  funf
6   6 sechs
Operations on data structures
The R environment has a rich set of options available for performing operations on data within the various data structures. These operations can be performed in a variety of ways and can be restricted according to various criteria. The focus of this section is the purpose and formats of the various apply commands.
The apply commands are used to instruct R to use a given command on specific parts of a list, vector, or array. Each data type has different versions of the apply commands that are available. Before discussing the different commands, it is important to define the notion of the margins of a table or array. The margins are defined along any dimension, and the dimension used must be specified. The margin.table command can be used to determine the sums along the rows, along the columns, or over the entire array or table:
> A <- matrix(1:12, nrow=3, byrow=TRUE)
> A
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
> margin.table(A)
[1] 78
> margin.table(A,1)
[1] 10 26 42
> margin.table(A,2)
[1] 15 18 21 24
The last two commands specify the optional margin argument. The margin.table(A,1) command specifies that the sums are in the first dimension, that is, the rows. The margin.table(A,2) command specifies that the sums are in the second dimension, that is, the columns. The idea of specifying which dimension to use in a command can be important when using the apply commands.
The apply commands
The various apply commands are used to operate on the different data structures. Each one (apply, lapply, sapply, tapply, and mapply) will be briefly discussed in order in the following sections.
apply
The apply command is used to apply a given function across a given margin of an array or table. For example, to take the sum of a row or column from a two-way table, use the apply command with arguments for the table, the sum command, and which dimension to use:
> A <- matrix(1:12, nrow=3, byrow=TRUE)
> A
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
> apply(A,1,sum)
[1] 10 26 42
> apply(A,2,sum)
[1] 15 18 21 24
You should be able to verify these results using the rowSums and colSums commands as well as the margin.table command discussed previously.
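The verification suggested above can be sketched as follows. The comparison uses == rather than identical because apply returns integers for this matrix while rowSums and colSums return double precision numbers.

```r
A <- matrix(1:12, nrow = 3, byrow = TRUE)

rowTotals <- rowSums(A)      # 10 26 42
colTotals <- colSums(A)      # 15 18 21 24

# Each pair of commands gives the same values
rowsAgree <- all(apply(A, 1, sum) == rowTotals)
colsAgree <- all(apply(A, 2, sum) == colTotals)

# apply is more general: any function can be used, not just sum
rowMaxima <- apply(A, 1, max)    # 4 8 12
```

For plain sums, rowSums and colSums are the more direct choice; apply becomes useful when the function is something other than a sum or a mean.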
lapply and sapply
The lapply command is used to apply a function to each element in a list. The result is a list, where each component of the returned object is the function applied to the object in the original list with the same name:
> theList <- list(one=c(1,2,3), two=c(TRUE,FALSE,TRUE,TRUE))
> sumResult <- lapply(theList, sum)
> sumResult
$one
[1] 6

$two
[1] 3

> typeof(sumResult)
[1] "list"
> sumResult$one
[1] 6
The sapply command is similar to the lapply command, and it performs the same operation. The difference is that the result is coerced to be a vector if possible:
> theList <- list(one=c(1,2,3), two=c(TRUE,FALSE,TRUE,TRUE))
> meanResult <- sapply(theList, mean)
> meanResult
 one  two
2.00 0.75
> typeof(meanResult)
[1] "double"
tapply
The tapply command is used to apply a function to different parts of the data within a vector. The function takes at least three arguments. The first is the data on which to perform the operation, the second is the set of factors that defines how the data is organized with respect to the different levels, and the third is the operation to perform. In the following example, a vector is defined that has the diameters of trees. A second vector is defined, which specifies what kind of tree was measured for each observation. The goal is to find the standard deviation for each type of tree:
> diameters <- c(28.8,27.3,45.8,34.8,25.3)
> tree <- as.factor(c("pine","pine","oak","pine","oak"))
> tapply(diameters, tree, sd)
      oak      pine
14.495689  3.968627
mapply
The last command to examine is the mapply command. The mapply command takes a function to apply and a list of arrays. The function takes the first elements of each array and applies the function to that list. It then takes the second elements of each array and applies the function. This is repeated until it goes through every element. Note that if one of the arrays has fewer elements than the others, the mapply command will reset and start at the beginning of that array to fill in the missing values:
> a <- c(1,2,3)
> b <- c(1,2,3)
> mapply(sum, a, b)
[1] 2 4 6
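The recycling behavior described above is not shown in the example, so here is a minimal sketch of it. The vectors are made up for illustration; the longer length is deliberately a multiple of the shorter one so that the shorter vector is reused a whole number of times.

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20)

# b is shorter, so mapply starts over at its beginning: 10, 20, 10, 20
recycled <- mapply(sum, a, b)
recycled                     # 11 22 13 24

# A single value is reused for every element of the longer argument
squares <- mapply(function(x, y) x^y, 1:3, 2)
squares                      # 1 4 9
```

The first call computes 1+10, 2+20, 3+10, and 4+20, which shows the shorter vector wrapping around.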
Summary
In this chapter, we examined the basic data structures available to help organize data. These data structures include vectors, lists, data frames, tables, and arrays. We examined some of the ways to manage the data structures using the rbind and cbind commands. Finally, we examined some of the methods available to perform calculations on the data within the data structure and examined the various functions available to apply commands to parts of the data within the data structure.
In the next chapter, we will build on these ideas and examine how to get information from a data file and into the various data structures. We will also examine the methods available to produce formatted output to display the results of calculations on data.
Chapter 3. Saving Data and Printing Results
This chapter provides you with a broad overview of the ways to get information into as well as out of the R environment. There are various packages that are available and related to this important function, but we will focus on a subset of the basic, built-in functions. The chapter is divided into the following five sections:
File and directory information: This section gives you a brief overview of how files and directories are organized in the R environment
Input: This section gives you an overview of the methods that can be used to bring data into the R environment
Output: This section gives you an overview of the methods available to get data out of the R environment
Primitive input/output: This section gives you an overview of the methods you can use to write data in binary or character forms in predefined formats
Network options: This section gives you a brief overview of the methods associated with creating and manipulating sockets
File and directory information
Before discussing how to save or read data, we first need to examine R's facilities for getting information about files and directories. We will first discuss the commands used to work with directories and files, and then discuss the commands used to manipulate the current working directory. The basic commands to list directory and file information are the dir, list.dirs, and list.files commands. The basic commands to list and change the current working directory are getwd and setwd.
The dir, list.dirs, and list.files commands are used to get information about directories and files within directories. By default, the commands will get information about the directories in the current working directory:
> dir()
[1] "R"   "bin" "csv"
> d <- dir()
> d[1]
[1] "R"
The preceding commands also accept a wide variety of options. Use of the help command is recommended to see more details:
> list.files('./csv')
[1] "network.csv" "trees.csv"
> f <- list.files('./csv')
> f[2]
[1] "trees.csv"
These commands have an optional parameter for specifying a pattern, and the pattern is a regular expression. The topic of regular expressions is beyond the scope of this discussion, but it offers a very powerful option for specifying a filter to determine the names of files. For example, all of the files that begin with the letter n can be easily determined:
> f <- list.files('./csv', pattern='^n')
> f
[1] "network.csv"
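As one more illustration of the pattern argument, the sketch below selects files by their extension. It builds its own scratch directory under tempdir() so that it does not depend on the files shown above; the directory name csvExample and the file notes.txt are hypothetical.

```r
# Build a throwaway directory with three empty files
exampleDir <- file.path(tempdir(), "csvExample")
dir.create(exampleDir, showWarnings = FALSE)
created <- file.create(file.path(exampleDir,
                                 c("network.csv", "trees.csv", "notes.txt")))

# "\\.csv$" matches names that end in .csv
csvFiles <- list.files(exampleDir, pattern = "\\.csv$")
csvFiles                     # "network.csv" "trees.csv"

# "^n" matches names that begin with n; here two files qualify
nFiles <- list.files(exampleDir, pattern = "^n")
nFiles                       # "network.csv" "notes.txt"
```

The backslash has to be doubled inside an R string, and the dollar sign anchors the match to the end of the file name.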
Another important topic is the idea of the current working directory. When the R environment seeks a file or directory whose name is given in a relative form, it starts from the current working directory. There are several other ways to specify the current directory, and it is part of the majority of graphical interfaces. Unfortunately, it varies across the different interfaces.
The commands to manipulate the current working directory via the command line are the getwd and setwd commands. The names of directories (folders) are separated using forward slashes:
> getwd()
[1] "/tmp/examples"
> d <- getwd()
> d
[1] "/tmp/examples"
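The setwd command is named above but not shown. A minimal sketch of its use follows; it moves to the session's temporary directory, given by tempdir(), purely as a convenient target that is known to exist.

```r
original <- getwd()          # remember where we started

setwd(tempdir())             # change the current working directory
moved <- getwd()             # now reports the temporary directory

setwd(original)              # return to the starting directory
```

Saving the original directory first and restoring it afterwards is a useful habit in scripts, since relative file names used later depend on the current working directory.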
Entering data
Having discussed the ideas associated with directories and files, we can now discuss how to read data. Here, we will provide an overview of the different ways to get information from a file. We will begin with a short overview about entering data from the command line followed by examples for reading a text file in the form of a table, from a csv file, and fixed-width files. Finally, we will discuss more primitive methods to read from a binary file.
It is important to note that we rely on the topics discussed in Chapter 1, Data Types and Chapter 2, Organizing Data. I assume that you are familiar with the various data types given in Chapter 1, Data Types, as well as the data structures discussed in Chapter 2, Organizing Data. In this chapter, we will explore a small number of ways to read data into R. There are a large number of libraries available to read data in a wide variety of formats such as JSON, XML, SAS, Excel, and other file formats. There are also more options available in the base R distribution. To see more options, type read and press the TAB key (no space after the letters read) to see a partial list of other options.
Entering data from the command line
We will examine two ways to read data including reading keyboard input from the command line and reading data from a file. We first examine some techniques used to obtain information through the command line. More details can be found in Chapter 2, Organizing Data, and we will explore additional ways to enter data including the use of the scan and data.entry commands.
In addition to concatenating information with the c command, there are additional commands to make it easier to define data. The first command we will examine is the scan command. If you simply assign a variable using the scan command with no arguments, then you are prompted and required to enter the numbers from the command line. If you enter a blank line (just hit the Enter key), then the previous values are returned as a vector:
> x <- scan()
1: 28.8
2: 27.3
3: 45.8
4: 34.8
5: 23.5
6:
Read 5 items
> x
[1] 28.8 27.3 45.8 34.8 23.5
In this example, the scan command is used to prompt us to enter a set of numbers. After a blank entry is given, the command returns a vector with the previous values.
If you provide a filename, then the scan command will read the values from the file as if you had typed them on the command line. Suppose that we have a file called diameters.csv with the following contents:
28.8
27.3
45.8
34.8
25.3
You can read the contents using the scan command as follows:
> x <- scan("diameters.csv")
Read 5 items
> x
[1] 28.8 27.3 45.8 34.8 25.3
You can read more complex data from a file using the scan command, but you must specify the structure of the file. This means that you have to specify the data types. Here, assume that we have a data file called trees.csv:
pine,28.8
pine,27.3
oak,45.8
pine,34.8
oak,25.3
The first column is the character data, and the second column is the numeric data. The information on each line is separated by a comma. The scan command assumes that the information is separated using whitespace (spaces and tabs), so in this case, we have to specify that a comma is used as the separator within the file. In the following example, the file is read, and the format is given using the what argument:
> x <- scan("trees.csv", what=list("character","double"), sep=",")
Read 5 records
> x
[[1]]
[1] "pine" "pine" "oak"  "pine" "oak"

[[2]]
[1] "28.8" "27.3" "45.8" "34.8" "25.3"
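Notice that the second component above was read as character strings rather than numbers. The scan command takes the type of each entry in the what list, not its text, and "double" is itself a character string. A possible variation, sketched below, passes empty vectors of the desired types instead. The temporary file and the component names kind and diameter are assumptions made for this illustration.

```r
# Write the same contents as trees.csv to a temporary file
treeFile <- tempfile(fileext = ".csv")
writeLines(c("pine,28.8", "pine,27.3", "oak,45.8", "pine,34.8", "oak,25.3"),
           treeFile)

# character() and numeric() tell scan which type each column should have
x <- scan(treeFile,
          what = list(kind = character(), diameter = numeric()),
          sep = ",", quiet = TRUE)

typeof(x$kind)               # "character"
typeof(x$diameter)           # "double"
meanDiameter <- mean(x$diameter)   # 32.4
```

Naming the entries of the what list also names the components of the result, so the columns can be reached with the $ notation.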
Another method for entering data is to use the data.entry command. The command will open up a graphical interface if it is available on your system. The details can vary depending on your operating system and the graphical interface that you are using.
Reading tables from files
One common method used to read data from a file is to read it as a table. This assumes that the file is nicely formatted and arranged in convenient rows and columns. A command to read data in this form is the read.table command. There are a large number of options for this command, and it is highly recommended that you use the help command, help(read.table), to see more complete details about this command.
The first example demonstrates how to read a simple file. It is assumed that you have a file called trialTable.dat in the following format:
1 2 3
3 5 6
The file has no header, and the values are separated by spaces (whitespace). In this simple format, the file can be read with the default options:
> trial <- read.table("trialTable.dat")
> trial
  V1 V2 V3
1  1  2  3
2  3  5  6
> typeof(trial)
[1] "list"
> names(trial)
[1] "V1" "V2" "V3"
The result is a data frame, which R stores internally as a list; this is why the typeof command returns "list". No names were specified for the columns, so the default values for the column names have been used.
CSV files
The read.table command offers a general way to read the data from a file with a known structure. One common file structure is a file where the values are separated by commas, or a csv file. The command to read a csv file is the read.csv command. There is an alternate version of the command, read.csv2, which has a different set of defaults. The difference is that the defaults for read.csv2 are defined to allow a simple way of reading a file in which the decimal mark is a comma and the values are separated by semicolons, which is a convention more commonly used in some European countries.
The read.csv command is similar to the read.table command, and both return a data frame. The primary difference lies in the defaults: read.csv assumes that the file has a header line and that the values are separated by commas.
In the preceding example, a simple file was read into the workspace using the read.table command. A file in the same format can be read using the read.csv command, provided the defaults are overridden. Here, the trialTable.csv file does not have a header, and the numbers are separated using spaces:
> trial <- read.csv("trialTable.csv", header=FALSE, sep='')
> trial
  V1 V2 V3
1  1  2  3
2  3  5  6
> typeof(trial)
[1] "list"
In the next example, we have a data file in which each line has the same number of columns, and the data fields are separated by commas. The file was downloaded from http://www.bea.gov/. The first six lines of the file are used to identify information about the data in a human-readable form, but that information should be ignored by the read.csv function. The other thing to note is that the seventh line is a header; it has information that defines the label used to refer to the columns. The last thing to note is that the numbers are separated by commas. All of these details must be specified if we want to read the file using the read.csv command. This second file, inventories.csv, can be read using the read.csv command, as follows:
> inventories <- read.csv("inventories.csv",
+                         skip=6, header=TRUE, sep=",")
> typeof(inventories)
[1] "list"
> names(inventories)
 [1] "Line"    "X"       "X1994.1" "X1994.2" "X1994.3" "X1994.4" "X1995.1"
 [8] "X1995.2" "X1995.3" "X1995.4" "X1996.1" "X1996.2" "X1996.3" "X1996.4"
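The effect of the skip and header options, and the reason the column names above begin with X, can be seen on a much smaller example. The sketch below does not use the inventories.csv file; it reads a few made-up lines through the text argument, and every value in it is hypothetical.

```r
# Two descriptive lines, then a header line, then the data
raw <- paste(c("Hypothetical title line",
               "Hypothetical notes line",
               "Line,1994.1,1994.2",
               "1,10.5,11.0",
               "2,20.1,19.8"),
             collapse = "\n")

# skip = 2 ignores the descriptive lines; the next line becomes the header
small <- read.csv(text = raw, skip = 2, header = TRUE)

names(small)                 # "Line" "X1994.1" "X1994.2"
nrow(small)                  # 2
```

A column label such as 1994.1 is not a valid R variable name, so by default an X is placed in front of it. The check.names option of read.csv controls this behavior.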
Fixed-width files
Another common file format is a fixed-width file. In a fixed-width file, every line has the same format and the information within a given line is strictly organized by columns. A file in this format can be read using the read.fwf command. To use the read.fwf command, you must specify the name of the file and the width of each column. You can instruct R to ignore a column by providing a negative value for the width of the column.
In this example, we assume that a file with the name trialFWF.dat is in the current working directory. The contents of the file are as follows:
12312345121234
B100ZZ18
C200YY20
D300XX22
The first three columns are assumed to contain letters, the next five columns contain numbers, the next two columns have letters, and the last four columns are numbers. In the example file, the top row should be ignored as it is used to demonstrate how the file is organized. The skip option is used to indicate how many lines to ignore at the top of the file:
> trial <- read.fwf('trialFWF.dat', c(3,5,2,4), skip=1)
> trial
  V1  V2 V3 V4
1  B 100 ZZ 18
2  C 200 YY 20
3  D 300 XX 22
> trial$V1
[1] B C D
Levels: B C D
Note that when a width is given as a negative number, that column is ignored:
> trial <- read.fwf('trialFWF.dat', c(3,-5,2,4), skip=1)
> trial
  V1 V2 V3
1  B ZZ 18
2  C YY 20
3  D XX 22
Printing results and saving data
We will explore the options available to take information stored within the R environment and express that information in either human- or machine-readable forms. We will start with a brief discussion on saving the workspace in an R environment. Next, we will discuss various commands that can be used to print information to either the screen or a file. Finally, we will discuss the primitive commands that can be used for basic file operations.
Saving a workspace
There are two commands used to save the information in the current workspace. The first is the save command, which allows you to save particular variables. The second is the save.image command, which allows you to save all the variables within the workspace.
The save command requires a list of variables to save, and the name of a file in which to save the variables. There are a wide variety of options, but in the most basic form you simply save specific variables from the current workspace. Here, we use the ls command to first list the variables in the current workspace and then use the save command to save two variables, inventories and trees:
> dir()
[1] "diameters.csv"   "inventories.csv" "network.csv"     "trees.csv"
[5] "trialFWF.dat"    "trialTable.csv"
> ls()
[1] "a"           "d"           "f"           "inventories" "trees"
[6] "trial"       "x"           "y"
> save(inventories, trees, file="theInventories.RData")
> dir()
[1] "diameters.csv"        "inventories.csv"      "network.csv"
[4] "theInventories.RData" "trees.csv"            "trialFWF.dat"
[7] "trialTable.csv"
The save.image command requires only one argument: the name of the file used to save the information:
> dir()
[1] "diameters.csv"        "inventories.csv"      "network.csv"
[4] "theInventories.RData" "trees.csv"            "trialFWF.dat"
[7] "trialTable.csv"
> save.image("wholeShebang.RData")
> dir()
[1] "diameters.csv"        "inventories.csv"      "network.csv"
[4] "theInventories.RData" "trees.csv"            "trialFWF.dat"
[7] "trialTable.csv"       "wholeShebang.RData"
If you start a new R session, the information that has been saved using a save or save.image command can be read using the load command:
> ls()
character(0)
> dir()
[1] "diameters.csv"        "inventories.csv"      "network.csv"
[4] "theInventories.RData" "trees.csv"            "trialFWF.dat"
[7] "trialTable.csv"       "wholeShebang.RData"
> load("theInventories.RData")
> ls()
[1] "inventories" "trees"
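The whole round trip can be tried without touching the files listed above. The following sketch saves a made-up variable to a temporary file, removes it from the workspace, and loads it again. The variable name treeDiameters and the use of tempfile() are assumptions made for this illustration.

```r
treeDiameters <- c(28.8, 27.3, 45.8)       # hypothetical data to save
rdataFile <- tempfile(fileext = ".RData")

save(treeDiameters, file = rdataFile)      # write the variable to the file
rm(treeDiameters)                          # remove it from the workspace

# load recreates the variable and returns the names of what it restored
restored <- load(rdataFile)
restored                                   # "treeDiameters"
treeDiameters                              # 28.8 27.3 45.8
```

Because load restores variables under their original names, it can silently overwrite a variable of the same name that already exists in the workspace.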
The cat command
The cat command can be used to take a list of variables, convert them to a text form, and concatenate the results. If no file or connector is specified, the result is printed to the screen; otherwise, the connector is used to determine how the result is handled. Note that there is an additional set of commands, the various write commands, but those commands are convenient routines that allow a shorthand notation to access the cat commands. These commands are primarily used in scripts:
> one <- "A"
> two <- "B"
> cat(one, two, "\n")
A B
The cat command allows you to specify a number of options. For example, you can specify the separator between variables, labels to be used, or whether or not to append to a given file:
> cat(one, two, "\n", sep="-")
A-B-
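The file and append options mentioned above can be sketched as follows. The output goes to a temporary file created with tempfile(), which is an assumption made so that the example does not write into the working directory.

```r
logFile <- tempfile(fileext = ".txt")

# The first call creates (or overwrites) the file
cat("first line\n", file = logFile)

# With append = TRUE, the text is added to the end instead
cat("second line\n", file = logFile, append = TRUE)

contents <- readLines(logFile)
contents                     # "first line" "second line"
```

Without append = TRUE, the second call would have replaced the contents of the file rather than extending them.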
The print, format, and paste commands
We examine three ways to display information using the print, format, and paste commands. These can be used by programs to display the formatted output. The three commands provide numerous options to ensure that the information appears in human-readable forms.
The print command is used to display the contents of a single variable, as follows:
> one <- "A"
> print(one)
[1] "A"
The paste command takes a list of variables, converts them to characters, and concatenates the result. This is a useful command to dynamically create annotations for plots and figures, as follows:
> one <- "A"
> two <- "B"
> numbers <- paste(one, two)
> numbers
[1] "A B"
> numbers <- paste(one, two, sep="/")
> numbers
[1] "A/B"
In this example, the numbers variable is a string, and it is the result of converting the arguments to a string and concatenating the results. In the second part of the example, the separator was changed from the default, a space, to a forward slash.
The format command converts an R object to a string, and it allows a large number of options to specify the appearance of the object:
> three <- exp(1)
> nice <- format(three, digits=2)
> nice
[1] "2.7"
> nice <- format(three, digits=12)
> nice
[1] "2.71828182846"
> nice <- format(three, digits=3, width=5, justify="right")
> nice
[1] " 2.72"
> nice <- format(three, digits=3, width=8, justify="right", decimal.mark="#")
> nice
[1] "    2#72"
In this example, the format command is used in various ways to convert a numeric variable to a string. The options that set the number of digits and the total number of characters have been changed to refine the results.
Primitive input/output
There are a number of primitive commands that offer fine-grained control for reading and writing information to and from a file. We do not provide extensive examples here because these commands are more useful when combined with the programming commands that are explored in later chapters.
Before discussing these commands, it is important to discuss the idea of a connector. A connector is a generic way to treat a data source. This can be a file, an HTTP connection, a database connection, or another network connection. In this section, we only explore one type of connector, that is, the basic text file connector. More information can be found using the help command, help(file). The file command is used to create a connector to a file. The arguments to the file command are similar to those of the fopen command found in the C language.
The most basic use of the file command requires that you provide the name of a file and the mode that will be used in manipulating the file. The mode can tell R whether the file will be used to read or write as well as whether or not it is a binary file. In this first example, we will open a file and write a double precision number and then a character string. In the example that follows, we will open the file and read the information back into the workspace. To write the information, we will first use the file command to open a file, called twoBinaryValues.dat, and use the binary mode. We will then use the writeBin command to write the two values. Here, the size argument is set to 4, so the number is stored in four bytes (single precision) rather than in the eight bytes that R uses for a double precision number in memory:
>fileConnector=file("twoBinaryValues.dat",open="wb")
>theNumber=as.double(2.72)
>writeBin(theNumber,fileConnector,size=4)
>note<-"hellothere!"
>nchar(note)
[1]12
>writeBin(note,fileConnector,size=nchar(note))
>close(fileConnector)
Inthisexample,afileconnectoriscreatedtowriteinformationinabinaryformat.Twovariablesarethenwrittentothefile,andthefileisclosed.Thesameinformationisreadinthenextexample.ThereadBincommandisusedtoreadtheinformationfromthefile:
> fileConnector = file("twoBinaryValues.dat", open="rb")
> value <- readBin(fileConnector, double(), 1, size=4)
> value
[1] 2.72
> note <- readBin(fileConnector, character(), 12, size=1)
> note
[1] "hello there!"
> close(fileConnector)
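The effect of the size argument can be checked directly. The following is a minimal sketch, assuming a scratch file created with tempfile (not part of the original example), that writes the same number with four and with eight bytes and compares what is read back:

```r
# Hypothetical scratch file used only for this sketch.
tmp <- tempfile(fileext = ".dat")
theNumber <- 2.72

con <- file(tmp, open = "wb")
writeBin(theNumber, con, size = 4)   # stored as a 4-byte single precision value
writeBin(theNumber, con, size = 8)   # stored as an 8-byte double precision value
close(con)

con <- file(tmp, open = "rb")
single <- readBin(con, double(), n = 1, size = 4)
full   <- readBin(con, double(), n = 1, size = 8)
close(con)

print(file.size(tmp))             # total number of bytes written
print(single == theNumber)        # the 4-byte copy is only an approximation
print(full == theNumber)          # the 8-byte copy is recovered exactly
print(abs(single - theNumber))    # size of the rounding error
```

The four-byte value still prints as 2.72 at the default number of digits, which is why the loss of precision is easy to overlook.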
There are a number of commands that can be used to read and write character data. The writeChar and readChar commands are used to write and read character data in a similar way to the writeBin and readBin commands. The writeLines and readLines commands can be used to write and read whole lines as character strings at one time.
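As a minimal sketch of the line-oriented commands, assuming a scratch file created with tempfile, the following writes two lines through a text-mode connector and reads them back:

```r
tmp <- tempfile(fileext = ".txt")   # hypothetical scratch file

con <- file(tmp, open = "w")        # "w" opens the connector for text-mode writing
writeLines(c("first line", "second line"), con)
close(con)

con <- file(tmp, open = "r")        # "r" opens the connector for text-mode reading
textLines <- readLines(con)         # one element of the vector per line
close(con)

print(textLines)
```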
Network options
Another way to read information is through a network connection using sockets. The methods available to manipulate sockets will be briefly explored in this section. We first explore the high-level socket commands that make use of the socketConnection command to create a connector. Next, some of the more basic options are briefly stated. This is an advanced topic beyond the scope of this book, but it is provided here as a matter of completeness.
Opening a socket
The socketConnection command will create a network connection to a given host using a port number. The command returns a connector that can be treated the same as a file connector. In the following example, a connection is opened to the waterdata.usgs.gov website using the standard HTTP port, 80. It then sends the HTTP header necessary to request the data for the daily flow rates for the South Colton station on the Raquette River in northern New York:
> usgs <- socketConnection(host="waterdata.usgs.gov", 80)
> writeLines("GET /ny/nwis/dv?cb_00060=on&format=rdb&site_no=04267500&referred_module=sw&period=&begin_date=2013-05-08&end_date=2014-05-08 HTTP/1.1", con=usgs)
> writeLines("Host: waterdata.usgs.gov", con=usgs)
> writeLines("\n\n", con=usgs)
> lines = readLines(usgs)
> lines
[1] "HTTP/1.1 200 OK"
[2] "Date: Fri, 09 May 2014 16:28:26 GMT"
[3] "Server: Apache"
[4] "AMF-Ver: 4.02"
[5] "Connection: close"
[6] "Transfer-Encoding: chunked"
[7] "Content-Type: text/plain"
[8] ""
[9] "3bc"
[10] "# ---------------------------------- WARNING ----------------------------------------"
[11] "# The data you have obtained from this automated U.S. Geological Survey database"
[12] "# have not received Director's approval and as such are provisional and subject to"
[13] "# revision. The data are released on the condition that neither the USGS nor the"
[14] "# United States Government may be held liable for any damages resulting from its use."
[15] "# Additional info: http://waterdata.usgs.gov/ny/nwis/?provisional"
[16] "#"
[17] "# File-format description: http://waterdata.usgs.gov/nwis/?tab_delimited_format_info"
[18] "# Automated-retrieval info: http://help.waterdata.usgs.gov/faq/automated-retrievals"
[19] "#"
[20] "# Contact: [email protected]"
[21] "# retrieved: 2014-05-09 12:28:35 EDT (vaww01)"
[22] "#"
[23] "# Data for the following 1 site(s) are contained in this file"
[24] "# USGS 04267500 RAQUETTE RIVER AT SOUTH COLTON NY"
...[Deleted Lines]...
[425] "USGS\t04267500\t2014-05-07\t5810\tP"
[426] "USGS\t04267500\t2014-05-08\t5640\tP"
[427] ""
[428] "0"
[429] ""
> close(usgs)
In this example, a socket connection is created. It is used to make a connection to a host on a given port. The socket connector is then used to send an HTTP header to request a particular page from the given host. The resulting page is read using the readLines command. The readLines command reads every line as a string, and the information in the resulting vector will have to be parsed to transform it into a usable form.
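One way to parse such a vector is sketched below. It assumes a small stand-in for the lines vector (the two data rows are copied from the output above) and assumes column names that are purely illustrative; the real file defines its own header:

```r
# Stand-in for the vector returned by readLines; only a few entries are kept.
lines <- c("# retrieved: 2014-05-09 12:28:35 EDT (vaww01)",
           "USGS\t04267500\t2014-05-07\t5810\tP",
           "USGS\t04267500\t2014-05-08\t5640\tP",
           "", "0", "")

# Keep only the tab-delimited data rows, which start with the agency code.
dataRows <- grep("^USGS\t", lines, value = TRUE)

# The column names here are assumptions made for this sketch.
flow <- read.table(text = dataRows, sep = "\t",
                   col.names = c("agency", "site", "date", "flow", "code"),
                   colClasses = c("character", "character", "character",
                                  "numeric", "character"))
flow$date <- as.Date(flow$date)   # convert the ISO date strings to Date objects

print(flow)
```

Reading the site number as a character string preserves its leading zero.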
Basic socket operations
Aside from the higher-level option given in the previous section, there are also more primitive commands for creating and using a socket. The commands examined are the make.socket, read.socket, write.socket, and close.socket commands. Sockets are scarce resources, and checks need to be put in place to ensure that they are released when something goes wrong. For that reason, a socket is usually created within a function with extra checks. The example here is basic, and it is provided simply to demonstrate the commands. You should see the help pages for the socket commands for a more comprehensive example.
To replicate the preceding example, the socket is opened, and the HTTP request is submitted:
> socketRead <- make.socket("waterdata.usgs.gov", 80)
> write.socket(socketRead, "GET /ny/nwis/dv?cb_00060=on&format=rdb&site_no=04267500&referred_module=sw&period=&begin_date=2013-05-08&end_date=2014-05-08 HTTP/1.1\n");
> write.socket(socketRead, "Host: waterdata.usgs.gov\n\n\n");
> incoming <- read.socket(socketRead);
> close.socket(socketRead)
[1] FALSE
The incoming variable will now contain the result of the operation.
Summary
In this chapter, we have explored some of the facilities available to read and write information to a file as well as ways to enter data from the command line. We examined the ways to change between directories and get the directory information.
After examining ways to read data, we explored some of the ways that the information can be displayed. We first examined how to print information on the command line, and we then explored how to save data to a local file. We also explored how the current workspace could be saved and recalled for use in a later session.
One important topic explored was how to work with files that do not follow a basic format. We examined the commands necessary to read data, in both text and binary forms, where the data file follows a predefined structure.
The final topic explored was how to use network connections to read information. This included the use of a socket connector to allow access to relatively well-structured information. We also explored more primitive options that allow us to create sockets and manipulate them in a more basic form.
In the previous chapters, we examined the basic ways to store data. In the chapter that follows, Chapter 4, Calculating Probabilities and Random Numbers, we get our first glimpse of the functions available to help us understand how to interpret data. We will explore some of the options available to work with various predefined probability distributions.
Chapter 4. Calculating Probabilities and Random Numbers
In this chapter, we provide a broad overview of the functions related to probability distributions. This includes functions associated with probability distributions, Random Number Generation, and issues associated with sampling. The chapter is divided into five parts:
Distribution functions: This section gives you a brief overview of the ideas and concepts behind random variables and approximating the height of a probability mass function or a probability density function for a given distribution
The cumulative distribution function: This section gives you an overview of how to approximate the cumulative distribution for a given distribution
The inverse cumulative distribution function: This section gives you an overview of how to approximate the inverse cumulative distribution function for a given distribution
Random Number Generation (RNG): This section gives you an overview of how R generates pseudorandom numbers with examples of how to generate random numbers for a given distribution
Sampling: This section gives you an overview of sampling data from a given vector
Overview
The base R environment has options for approximating many of the properties associated with probability distributions. This is not a complete discussion, and a more complete list can be found within the R environment using the help(Distributions) command. In this chapter, we will discuss how R can be used to approximate distribution functions, how to approximate cumulative distribution functions, how to approximate inverse cumulative distribution functions, Random Number Generation (RNG), and sampling.
The commands used for the first set of topics have a common format, and each command has the form of a prefix and a suffix. The suffix specifies the distribution by name. For example, the norm suffix refers to the normal distribution. A list of the distributions available in the base R installation is given in Table 1. The prefix is one of the following:
d: This determines the value of the distribution function, for example, dnorm is the height of the normal distribution's probability density function
p: This determines the cumulative distribution, for example, pnorm is the cumulative distribution function for the normal distribution
q: This determines the inverse cumulative distribution, for example, qnorm is the inverse cumulative distribution function for the normal distribution
r: This generates random numbers according to the distribution, for example, rnorm generates random numbers that follow a normal distribution
As an example, to determine the probability that a Poisson distribution will return a given value, the command to use is dpois, and the command to get the probability that a Poisson distribution is less than or equal to a particular value is ppois.
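The four prefixes can be seen side by side for the Poisson distribution. This is a minimal sketch; the parameter value 10 and the seed are arbitrary choices made for illustration:

```r
lambda <- 10

pEqual3  <- dpois(3, lambda)         # P(X = 3), the probability mass function
pAtMost3 <- ppois(3, lambda)         # P(X <= 3), the cumulative distribution
bySum    <- sum(dpois(0:3, lambda))  # the same probability, summed term by term
med      <- qpois(0.5, lambda)       # inverse cumulative distribution at 0.5

set.seed(1)                          # arbitrary seed so the draws are repeatable
draws <- rpois(5, lambda)            # five pseudorandom Poisson values

print(c(pEqual3, pAtMost3, bySum))
print(med)
print(draws)
```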
In this chapter, we assume that you are familiar with the idea of a random variable. In short, a random variable is a function, and the function assigns a number to each outcome in the sample space associated with an experiment. A continuous random variable is a function that can include a continuous range of values in the values it can return. A discrete random variable can only return numbers from a discrete set of values. The following table shows the distributions available in R:
Discrete                        Continuous
Name               Suffix       Name          Suffix
Binomial           binom        Beta          beta
Geometric          geom         Cauchy        cauchy
Hypergeometric     hyper        χ²            chisq
Multinomial        multinom     Exponential   exp
Negative Binomial  nbinom       F             f
Poisson            pois         Gamma         gamma
                                Log Normal    lnorm
                                Normal        norm
                                Student t     t
                                Uniform       unif
                                Weibull       weibull
Table 1: a list of distributions and the suffix used in R to refer to them
Distribution functions
We will first discuss the way to calculate the value of a distribution function in R, looking at discrete distributions and then continuous distributions. The distribution function is used to determine the probability that a particular event will occur. In the case of a discrete distribution, the function is called a probability mass function, and for a continuous distribution it is called a probability density function. For a discrete distribution, the probabilities are calculated using a sum where f(i) is the probability mass function:
$$P(a \le X \le b) = \sum_{i=a}^{b} f(i)$$
Each distribution has its own parameters associated with it, and judicious use of the help command is highly recommended. For example, to get more information about the Poisson distribution, the help(dpois) command can be used. In the case of the Poisson distribution, there are two arguments required for the dpois command. The first is x and the second is lambda. The function returns the probabilities that a Poisson random variable with the lambda parameter returns the values given by x.
For example, to plot the probabilities for a Poisson distribution with parameter 10, the following commands are used to generate the values that can be returned (called x), and a bar plot is used to display them:
> x <- 0:20
> probabilities <- dpois(x, 10.0)
> barplot(probabilities, names.arg=x, xlab="x", ylab="p",
+         main="Poisson Dist, l=10.0")
For a continuous distribution, the random variable can take on a range of values, and instead of adding, we find the area under a curve, f(s), called the probability density function:
$$P(a \le X \le b) = \int_{a}^{b} f(s)\,ds$$
In the next example, we use the χ² distribution. The idea behind the χ² distribution is that you sample n independent random variables that follow a standard normal distribution. You then square the values that were sampled and add them up. The result is defined to follow a χ² distribution. Note that there is one parameter, the number of degrees of freedom, which is equal to n, the number of squared terms in the sum. The number of degrees of freedom is often referred to as df.
For this example, the command to obtain an approximation for the probability density function is dchisq, and it takes two arguments, x and df. Here, we plot the probability density function for two χ² distributions with two different parameters. First a range of values, x, is defined, and then the height of the probability density function for a χ² distribution with df=30 is plotted. Next, another χ² distribution with df=35 is plotted on the same plot using the points command:
> x <- seq(0.0, 100.0, by=0.1)
> prob <- dchisq(x, df=30)
> plot(x, prob, main="Chi Squared Dist.", xlab='x', ylab='p', col=2, type="l")
> probTwo <- dchisq(x, df=35)
> points(x, probTwo, col=3, type="l")
Cumulative distribution functions
Another important tool used to approximate probabilities is the cumulative distribution function. In many situations, we are required to determine the probability that a random variable will return a value within some given range of values. The cumulative distribution allows us to use a shortcut to calculate the resulting probability. As in the previous section, we look at discrete and continuous distributions separately. We examine the definition and look at examples for both cases.
For a discrete distribution, the cumulative distribution function is defined to be the following equation:
$$F(x) = P(X \le x) = \sum_{i \le x} f(i)$$
From this definition, the probability that a random variable is between two numbers can be determined by the following equation:
$$P(a < X \le b) = F(b) - F(a)$$
Please note the details in the inequality. In the case of discrete distributions, it matters whether less than or equal is used as opposed to less than.
For a Poisson distribution, the command to determine the cumulative distribution function is the ppois command, and it has the same arguments as the dpois command discussed earlier. The following example mirrors the example in the previous section, and the cumulative distribution function is plotted for a Poisson distribution with parameter 10.0:
>x<-0:20
>cdf<-ppois(x,10.0)
>barplot(cdf,names.arg=x)
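The difference between the strict and the inclusive inequality can be checked numerically. This is a small sketch in which the end points a and b are arbitrary values chosen for illustration:

```r
a <- 7
b <- 12
lambda <- 10

viaCdf    <- ppois(b, lambda) - ppois(a, lambda)      # P(a <  X <= b)
viaPmf    <- sum(dpois((a + 1):b, lambda))            # same event, summed directly
inclusive <- ppois(b, lambda) - ppois(a - 1, lambda)  # P(a <= X <= b)

print(c(viaCdf, viaPmf, inclusive))
print(inclusive - viaCdf)   # the gap is exactly P(X = a)
```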
The cumulative distribution function for a continuous random variable is defined in the same way as that of a discrete random variable. The only difference is that instead of a sum we use an integral, as follows:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(s)\,ds$$
This definition gives the same result as before, and the probability that a random variable is between two numbers can be determined by using the following equation:
$$P(a < X \le b) = F(b) - F(a)$$
The following example compares the cumulative distribution functions for two χ² distributions where the first has 30 degrees of freedom and the second has 35 degrees of freedom:
>x<-seq(0.0,100.0,by=0.1)
>cdf<-pchisq(x,df=30)
>plot(x,cdf,main="ChiSquaredDist.",xlab='x',ylab='p',col=2,
+type="l")
>cdfTwo<-pchisq(x,df=35)
>points(x,cdfTwo,col=3,type="l")
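For a continuous distribution, the difference of two values of the cumulative distribution function should agree with the area under the density curve. The following sketch checks this with the base integrate command; the end points 25 and 35 are arbitrary:

```r
a <- 25
b <- 35

# Probability from the cumulative distribution function.
viaCdf <- pchisq(b, df = 30) - pchisq(a, df = 30)

# The same probability as a numerical integral of the density;
# the df argument is passed through integrate to dchisq.
viaIntegral <- integrate(dchisq, lower = a, upper = b, df = 30)$value

print(c(viaCdf, viaIntegral))
```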
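Inverse cumulative distribution functions
The cumulative distribution function is often used to calculate probabilities, but in other circumstances the goal is to find a range of values given the probability. In this case, the inverse cumulative distribution function can be used to determine a value of the random variable that corresponds to a given probability. The idea is that given the probability, p, you want to solve for the value of a in the expression:
$$P(X \le a) = F(a) = p$$
For example, suppose that we have a Poisson random variable with parameter 10.0 and wish to find the value of a for which the probability that the random variable is less than or equal to a is 0.5. For a discrete distribution there may be no value that gives the probability exactly, so the qpois command returns the smallest value of a for which the cumulative probability is at least the requested value: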
>a<-qpois(0.5,10.0)
>a
[1]10
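This result indicates that 10 is the smallest value for which the probability that a Poisson random variable with parameter 10.0 is less than or equal to it is at least 0.5; put another way, the median is 10. The median equals the mean of the random variable here, but that does not mean the distribution is symmetric: a Poisson distribution is always skewed to the right, although the skew is modest for a parameter of 10.0.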
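The value returned by qpois can be checked against ppois. In this short sketch the cumulative probabilities on either side of the returned value straddle 0.5:

```r
below <- ppois(9, 10.0)    # P(X <= 9)
atTen <- ppois(10, 10.0)   # P(X <= 10)

print(c(below, atTen))
print(qpois(0.5, 10.0))    # smallest value whose cumulative probability reaches 0.5
```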
We now do the same for the χ² distribution with 30 degrees of freedom. The qchisq command can be used to determine the median for this distribution:
>a<-qchisq(0.5,df=30)
>a
[1]29.33603
In this case, we see that the median is less than the mean, which is 30, so the χ² distribution with 30 degrees of freedom is skewed to the right: the long right tail pulls the mean above the median.
Generating pseudorandom numbers
A common task in simulations is to generate random numbers that follow a given distribution. We explore this important topic, but it is important to make a few notes about Random Number Generation. First, and foremost, despite the nomenclature, the numbers are not random because they are generated using a deterministic algorithm. Secondly, when debugging and comparing code used to simulate a stochastic process, it is important to be able to generate numbers in a repeatable way to ensure that the results are consistent.
Before discussing generating random numbers, we provide some minimal background information about how R generates random numbers. This is a complex topic, and you can find more details using the help(RNG) command. One thing to note is that the .Random.seed variable has the value of the current seed, but it is not defined until you do so explicitly or a command is called that requires that a random number be generated. The variable can be set directly, but it is better to change it using the set.seed command. Also, the algorithm that is used to generate random numbers can be set or obtained using the RNGkind command. Note that the save command can be used as a convenient way to save the seed if repeatability of your results is important as you make changes to established code.
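Repeatability with set.seed can be demonstrated in a few lines. This is a minimal sketch in which the seed value 2014 is an arbitrary choice:

```r
set.seed(2014)       # arbitrary seed chosen for this sketch
first <- rnorm(3)

set.seed(2014)       # resetting the seed restarts the same sequence
second <- rnorm(3)

print(identical(first, second))   # the two sets of draws match
print(RNGkind())                  # names of the generators currently in use
```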
Random Number Generation can be a daunting subject, but we primarily focus on how to generate random numbers according to a given distribution. As before, we first examine a discrete distribution, the Poisson distribution with parameter 10.0. We then examine a χ² distribution with 30 degrees of freedom. In each case, we generate one hundred random numbers and create a histogram of the results.
First we examine the discrete distribution. The rpois command can be used to generate the numbers. It takes two arguments, the number of values to generate and the parameter associated with the distribution:
> numbers <- rpois(100, 10.0)
> hist(numbers, main="100 Samples of a Poisson Dist.", xlab="x")
Likewise, a χ² distribution can also be sampled, and the results are plotted using a histogram:
> numbers <- rchisq(100, df=30)
> hist(numbers, main="100 Samples from a Chi-Squared Dist.", xlab="x")
Sampling
The final topic that we will discuss is sampling. This can also be a complicated subject, and it is often used in bootstrapping and a wide variety of other techniques. Because of its prevalence, we provide it as a separate section.
The sole focus of this section is on the sample command. It may seem odd to grant such attention to a single command, but sampling is a complex topic with more opinions associated with it than there are statisticians. The sample command requires at least one argument, a vector or a number, and it returns a set of values chosen at random. The options for the command allow you to specify how many samples to take, whether or not to use replacement, and a probability distribution if you do not wish to use a uniform mass function.
The sample function's behavior depends on whether you give it a vector or a number. If you pass a number to it (that is, a vector of length 1), it will sample from the set of positive whole numbers less than or equal to that number:
> sample(3)
[1] 3 2 1
> sample(5)
[1] 2 3 1 4 5
> sample(5.6)
[1] 5 1 4 3 2
If you pass it a vector whose length is greater than 1, it will sample from the elements in the vector:
> x <- c(1, 3, 5, 7)
> sample(x)
[1] 7 1 5 3
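The two behaviors can collide when a vector happens to have a single element. The sketch below shows the surprise and one way around it; safeSample is a hypothetical helper written for this illustration, not a base R function:

```r
x <- c(10)                  # a data vector that happens to have one element

set.seed(1)                 # arbitrary seed
surprising <- sample(x)     # treated as a number: a permutation of 1:10

# Hypothetical helper: sample positions, then index into the vector,
# so a vector of length one is never mistaken for a count.
safeSample <- function(v, ...) v[sample.int(length(v), ...)]
safe <- safeSample(x)

print(surprising)
print(safe)
```

Sampling positions rather than values is the usual way to guard against this in code where the length of the vector is not known in advance.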
If you do not specify the number of samples to take, it will use the number of objects passed to it. The size parameter allows you to specify a different number of samples:
> x <- c(1, 3, 5, 7)
> sample(x, size=2)
[1] 1 7
> sample(x, size=3)
[1] 1 3 5
> sample(x, size=8)
Error in sample.int(length(x), size, replace, prob) :
cannot take a sample larger than the population when 'replace = FALSE'
In the preceding example, the number of samples is larger than the number of elements available. To avoid an error, you have to specify sampling with replacement:
> x <- c(1, 3, 5, 7)
> sample(x, size=8, replace=TRUE)
[1] 3 5 5 5 3 7 5 1
In the previous examples, the samples were found using the default algorithm. The default is to use a uniform probability mass function, meaning every element has the same likelihood of being chosen. You can change this behavior by specifying a vector of probabilities that gives the likelihood of choosing each particular element of the vector:
> x <- c(1, 3, 5, 7)
> sample(x, size=8, replace=TRUE, prob=c(0.05, .10, .15, .70))
[1] 7 3 3 7 7 7 7 5
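With only eight draws it is hard to see the effect of the probabilities. As a rough check, the following sketch takes a much larger sample and compares the observed proportions with the requested ones; the seed and the sample size are arbitrary choices:

```r
set.seed(42)                        # arbitrary seed
x <- c(1, 3, 5, 7)
weights <- c(0.05, 0.10, 0.15, 0.70)

draws <- sample(x, size = 10000, replace = TRUE, prob = weights)

# Proportion of draws equal to each element of x, in the order of x.
observed <- table(factor(draws, levels = x)) / length(draws)

print(observed)
print(weights)
```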
Summary
In this chapter, we examined a broad overview of some of the probability functions in the base R installation. These include functions to approximate the distribution function, the cumulative distribution function, and the inverse cumulative distribution. We also examined how to generate pseudorandom numbers for the various distributions. The final topic explored was the use of the sample command, which is used for sampling from a given data set stored as a vector.
In the next chapter, we step back from the more mathematical ideas explored in this chapter and look at the programming facilities that can be used to manipulate strings. This is an important topic as it is not uncommon for data sets to include string variables, and it is often necessary to extract or add information to the variables within a data set.
Chapter 5. Character and String Operations
This chapter will provide you with a broad overview of the operations available for the manipulation of character and string objects. This is a relatively concise chapter, and the focus is on basic operations. There are roughly two parts in this chapter:
Basic string operations: This section gives you a broad overview of the most basic string operations
Regular expressions: This section gives you a brief introduction to three commands that make use of regular expressions
Basic string operations
In some situations, data is kept in the form of characters or strings, whereas in some other situations the strings must be parsed or investigated as part of a statistical analysis. Because of the prevalence of data in the form of strings, the R language has a rich set of options available for manipulating strings. In this chapter, we investigate some of the options available, and our investigation is divided into two parts. They are as follows:
In the first part, we examine a number of basic string operations whose function is focused on particular operations
In the second part, we examine the functions that are based on regular expressions that offer a powerful set of wide-ranging operations
Regular expressions represent a powerful set of tools for string manipulation, but the cost is greater complexity. In this chapter, we only focus on the most basic uses of these functions as their most common uses tend to be parts of more complex code that may use a complicated combination of the commands.
As a way to make the connection between the commands, we assume a common data set throughout this chapter. In particular, we assume that we have a set of URLs:
>urls<-c("https://duckduckgo.com?q=Johann+Carl+Friedrich+Gauss",
"https://search.yahoo.com/search?p=Jean+Baptiste+Joseph+Fourier",
"http://www.bing.com/search?q=Isaac+Newton",
"http://www.google.com/search?q=Brahmagupta")
We will examine various ways to pull information from each URL. Such a task may be necessary either to pull information from a website or to perform an analysis on the URLs themselves.
Six focused tasks
We first examine six specific tasks. These operations are to determine the length of a string, determine the location of a substring, extract or replace a substring, change the case of a string, split a string into separate parts, and express a combination of objects as a single string. Please note that the examples will make extensive use of the urls vector that is defined in the previous section.
Determining the length of a string
The nchar command is used to determine the length of a string. This option can be used as part of a simple statistic for a set of strings or can be part of a programming tool when it is necessary to iterate over a string. You can specify how the count is calculated with options to count the number of characters, bytes, or width of the resulting string. In most cases, these values will be the same but can differ depending on the Unicode settings for your environment:
> nchar(urls)
[1] 52 62 41 42
The nchar command tries to coerce its argument to a character vector. In the output shown here, a missing value (NA) is treated as the string NA and counted as two characters; recent versions of R (3.3.0 and later) return NA for a missing string by default, and the keepNA=FALSE option restores the behavior shown. Another side effect is that it can return an error when used on factors:
> nchar(c("one", NA, 1234))
[1] 3 2 4
>nchar(as.factor(c("a","b","a","a","b","c")))
Errorinnchar(as.factor(c("a","b","a","a","b","c"))):
'nchar()'requiresacharactervector
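The error for factors can be avoided by converting the factor to a character vector first. This is a minimal sketch using a small factor invented for the illustration:

```r
f <- as.factor(c("a", "bb", "a", "ccc"))   # hypothetical factor

valueLengths <- nchar(as.character(f))     # one length per observation
levelLengths <- nchar(levels(f))           # one length per distinct level

print(valueLengths)
print(levelLengths)
```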
If you are running R within a Unicode environment such as UTF-8, this command may return an error when a string is not valid in that encoding. If this is the case, try the allowNA=TRUE option to see whether the results are appropriate.
One last note: there is an additional command, nzchar, which tests whether a string has zero length. This function returns a logical value, and it is true if the string does not have a zero length. As an example, suppose you have a list of file types with an empty string being an unrecognized file type. You may want to use nzchar to create a mask to skip or ignore an empty string:
> fileTypes <- c("txt", "", "html", "txt")
> nzchar(fileTypes)
[1] TRUE FALSE TRUE TRUE
> fileTypes[nzchar(fileTypes)]
[1] "txt" "html" "txt"
Location of a substring
In some circumstances, it is necessary to determine the location within a string for the occurrence of a substring. For example, it might be necessary to search a set of filenames to determine whether they contain a match for a predetermined parameter. The command to return the location of a substring is the regexpr command. This command has many options, and it is explored in more detail in the following section. Here, we use it in its most basic form to determine the location of a substring.
Using the definition of the vector, urls, as an example, we may wish to find the location of the colons in the URLs. The colons are the delimiter between the protocol and the hostname, and we cannot assume that they will always be in the same position:
> colons <- regexpr(":", urls)
> colons
[1] 6 6 5 5
attr(,"match.length")
[1] 1 1 1 1
attr(,"useBytes")
[1] TRUE
> colons[2]
[1] 6
> colons[3]
[1] 5
Note that the assumed position of the first character is one and not zero.
Extracting or changing a substring
There are two commands available to change a substring within a string. The two commands are substr and substring, and their arguments are identical. The substring command is compatible with S, but we focus on the R command substr. The command has two forms. One form can be used to extract a substring, and the other form can be used to change the value of a substring.
First, we examine the option to extract a substring. The command takes a string, the location of the start of the substring, and the location of the end of the substring. Making use of the previous example, we now use the urls vector defined at the start of the chapter and determine the protocol for each URL:
> protocols <- substr(urls, 1, colons-1)
> protocols
[1] "https" "https" "http" "http"
A substring can be replaced by combining the substr command and the assignment operator. Unfortunately, the replacement never changes the length of the string: if the new value is longer than the substring being replaced it is truncated, which can return nonintuitive results. Here, we replace each of the protocols in the urls vector with a mailto protocol:
>colons<-regexpr(":",urls)
>mailto<-urls
>substr(mailto,1,colons-1)<-c("mailto","mailto","mailto")
>mailto
[1]"mailt://duckduckgo.com?q=Johann+Carl+Friedrich+Gauss"
[2]"mailt://search.yahoo.com/search?p=Jean+Baptiste+Joseph+Fourier"
[3]"mail://www.bing.com/search?q=Isaac+Newton"
[4]"mail://www.google.com/search?q=Brahmagupta"
Note that the strings "https" and "http" have fewer characters than the string "mailto", and the command has truncated the new string that is substituted into the original string.
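When the replacement has a different length, the string can be rebuilt instead of modified in place. The sketch below shows two options, one with paste0 and substr and one with sub; it uses a shortened copy of the urls vector so that it is self-contained:

```r
urls <- c("https://duckduckgo.com?q=Johann+Carl+Friedrich+Gauss",
          "http://www.bing.com/search?q=Isaac+Newton")

colons <- regexpr(":", urls)

# Keep everything from the colon onward and glue the new protocol in front.
mailto <- paste0("mailto", substr(urls, colons, nchar(urls)))

# Alternative: replace the leading run of non-colon characters.
viaSub <- sub("^[^:]*", "mailto", urls)

print(mailto)
```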
Transforming the case
In some analyses, the case of a letter may not be considered important. In these situations, it may be necessary to convert the case of a string. The tolower and toupper commands can be used to ensure that a string is in the expected form:
> tolower(urls[1])
[1] "https://duckduckgo.com?q=johann+carl+friedrich+gauss"
> toupper(urls[2])
[1] "HTTPS://SEARCH.YAHOO.COM/SEARCH?P=JEAN+BAPTISTE+JOSEPH+FOURIER"
An additional command, chartr, enables more fine-grained control of character replacement. The idea is that there are some characters in the original string that we want to replace, and we know what the replacement characters are. For example, going back to the urls vector defined previously, we may want to replace the occurrences of = with a # character to delimit the end of the URL and the start of the argument list. Suppose we also want to replace the + characters with spaces. We can do so by using the chartr command, where the first argument is a string whose characters will be changed, the second argument is a string that supplies a replacement for each of those characters in the same order, and the third argument is the string to change:
> chartr("=+", "# ", urls)
[1] "https://duckduckgo.com?q#Johann Carl Friedrich Gauss"
[2] "https://search.yahoo.com/search?p#Jean Baptiste Joseph Fourier"
[3] "http://www.bing.com/search?q#Isaac Newton"
[4] "http://www.google.com/search?q#Brahmagupta"
Splitting strings
A string can be divided into multiple parts using the strsplit command. This is useful if you have data in a predefined format and wish to divide the strings into component pieces for a separate analysis. Using the urls vector defined earlier, we may wish to divide each URL into its protocol and the rest of the information:
>splitURL<-strsplit(urls,":")
>splitURL
[[1]]
[1]"https"
[2]"//duckduckgo.com?q=Johann+Carl+Friedrich+Gauss"
[[2]]
[1]"https"
[2]"//search.yahoo.com/search?p=Jean+Baptiste+Joseph+Fourier"
[[3]]
[1]"http"
[2]"//www.bing.com/search?q=Isaac+Newton"
[[4]]
[1]"http"
[2]"//www.google.com/search?q=Brahmagupta"
>splitURL[[1]]
[1]"https"
[2]"//duckduckgo.com?q=Johann+Carl+Friedrich+Gauss"
>splitURL[[1]][2]
[1]"//duckduckgo.com?q=Johann+Carl+Friedrich+Gauss"
Note that it returns a list, and each entry in the list is a vector of strings that have been split.
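Because the result is a list, one of the apply commands is a convenient way to collect the same piece from every entry. A minimal sketch, using a shortened copy of the urls vector:

```r
urls <- c("https://duckduckgo.com?q=Johann+Carl+Friedrich+Gauss",
          "http://www.bing.com/search?q=Isaac+Newton")

splitURL <- strsplit(urls, ":")

# Pull the first and second piece out of every entry in the list.
protocols <- sapply(splitURL, function(parts) parts[1])
remainder <- sapply(splitURL, function(parts) parts[2])

print(protocols)
print(remainder)
```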
Creating formatted strings
The sprintf function allows you to take a combination of objects and express them as a formatted string. The paste, format, and print commands have been examined in Chapter 3, Saving Data and Printing Results, and those functions can be used to accomplish similar results. The primary difference is that the sprintf function acts like the C language's sprintf function. For example, suppose we have a loop that performs an analysis on the text found at each of the URLs found in the urls vector defined earlier. As part of the analysis, we may have the result of a calculation stored in a variable and wish to use that number in the title for a graph. In the following example, we simply set the value to keep the example more streamlined:
> n <- 1
> calculation <- 123.0
> theTitle <- sprintf("URL: %s, Count=%d", urls[n], calculation)
> theTitle
[1] "URL: https://duckduckgo.com?q=Johann+Carl+Friedrich+Gauss, Count=123"
The %s characters refer to a string in the argument list, and %d refers to an integer next in the argument list. These definitions follow the same specification as the C language definition of sprintf.
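A few other common conversions are sketched below. The values are arbitrary; note that %d accepts a double only when it holds a whole number, so a value such as 123.5 raises an error:

```r
fixed    <- sprintf("%5.2f", pi)            # width 5, two digits after the point
padded   <- sprintf("%03d", 7L)             # zero-padded integer
leftJust <- sprintf("%-8s|", "abc")         # left-justified in a field of width 8
labels   <- sprintf("Item %d of %d", 1:2, 2L)   # vectorized over its arguments

# %d with a non-whole double is an error; capture it rather than stop.
bad <- tryCatch(sprintf("%d", 123.5), error = function(e) e)

print(c(fixed, padded, leftJust))
print(labels)
print(inherits(bad, "error"))
```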
Regular expressions
The commands examined in the previous section lack flexibility, but they are straightforward in their implementation. The use of regular expressions, on the other hand, offers a more elegant approach in many circumstances, but they can be more complex. We briefly examine a few commands that allow string manipulations via regular expressions, and we assume that you are familiar with regular expressions. To get more information about regular expressions in R you can use the help(regex) command.
In this section, we will focus on the gregexpr and sub commands. There are a number of other commands that are listed when you enter the help(gregexpr) command. Also, the commands have a number of options, but we will only examine their most basic forms. As a quick note, the pattern submitted to the regexpr command discussed earlier can also be a regular expression.
The gregexpr command is a general command that returns a number of results with respect to the regular expression. In particular, it returns the location of each match (or -1 when no match is found) and the number of characters in each match. The first argument to the function is the pattern, followed by the strings to search. The function returns a list that has information about the matches. In the following example, we examine the results of searching for the = delimiter in the first entry in our urls vector defined earlier:
>loc<-gregexpr("=",urls[[1]])
>loc
[[1]]
[1]25
attr(,"match.length")
[1]1
attr(,"useBytes")
[1]TRUE
>loc[[1]][1]
[1]25
Note that the function returns a list. Since the use of a single bracket returns another list, we use the double brackets to ensure that we return the element within the list as a vector.
The sub command can be used to replace the first occurrence of a pattern within each string; the related gsub command replaces all occurrences. Both functions take three arguments: the pattern, the replacement string, and the strings that are used to perform the operation:
> sub("\\?.*$", "", urls)
[1] "https://duckduckgo.com" "https://search.yahoo.com/search"
[3] "http://www.bing.com/search" "http://www.google.com/search"
Note that two backslashes must be used in the previous example. Within an R string, the backslash is itself an escape character, so the pair \\ is read as a single backslash. That single backslash is then passed to the regular expression, where it indicates that the question mark should be matched literally rather than interpreted as a special symbol.
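The difference between replacing one match and replacing every match is easiest to see with a pattern that occurs several times. The following sketch works on the first URL and also uses gregexpr to locate every + character:

```r
url <- "https://duckduckgo.com?q=Johann+Carl+Friedrich+Gauss"

firstOnly <- sub("\\+", " ", url)    # only the first + is replaced
allPlus   <- gsub("\\+", " ", url)   # every + is replaced

# Positions of every + in the string, as a plain integer vector.
plusLoc <- as.integer(gregexpr("\\+", url)[[1]])

print(firstOnly)
print(allPlus)
print(plusLoc)
```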
Summary
In this chapter, we have explored a variety of ways to manipulate string variables. Two broad categories have been explored. The first set consists of functions with a narrow range of functionality. These functions are relatively straightforward but must often be used together to accomplish complex tasks. The second set of functions makes use of regular expressions, and they can be used to accomplish a complex set of tasks using a small number of steps.
In the next chapter, we move on to the topic of working with variables associated with time. This includes translating strings into R's built-in time types, and it also includes the kind of operations that can be performed on time variables.
Chapter 6. Converting and Defining Time Variables
This chapter provides a broad overview of the ways to convert and store time variables. This is a mundane topic, but it is common to have data that contains date and time information in a wide variety of forms. There are roughly three parts in this chapter:
Converting strings to time data types: This section gives you an introduction to the methods available to take a time stamp in text format and convert it into one of R's internal time formats
Converting time data types to strings: This section gives you an introduction to the methods available to take a time data type and convert it to a string so that it can be saved to a file in a standard format
Operations on time data types: This section gives you an overview of the methods and techniques used to perform basic arithmetic on time data types
Introduction and assumptions
Date and time information is often saved as part of the information within a data set. Converting the information into a date or time variable is one of the less exciting chores to perform, and it is something that requires a great deal of care. In this chapter, we will discuss the ways of transforming strings and data types and demonstrate how to perform basic arithmetic operations. It is important to note that working with time and date data occurs in a number of different contexts, and there are a number of different libraries, such as chron, lubridate, and date, to help you work with time and date variables.
Our focus here, though, is on R's built-in functions used to work with time and date data. It is important to note that the commands explored here can be sensitive to small variations within a data file, and you should always double check your work especially with respect to time data. It can be a tedious task, but it is important to make sure that the data is correct. If you commonly work with time data, you should read the details found using the help(DateTimeClasses) command.
Another complication is that time zones can change, and the general practices associated with time data can change. For example, time zones can be changed, created, or removed, or a region can change its time zone. You should always ensure that your practices associated with time data are consistent with the practice used to generate the data that you have.
Converting strings to time data types
The first task to examine is to take a string and convert it to each of the internal time formats. The strptime command will take a string and convert it to a POSIXlt time variable. If you wish to convert a string to a POSIXct data type, you can cast the result of strptime using the as.POSIXct command. We first focus on converting a string to a POSIXlt data type and provide an example at the end of this section that converts the result to a POSIXct data type.
To convert a string to a time data type, the format for the string must be specified, and the formatting options must conform to the ISO C99/POSIX standard. The format string includes a sequence of literal characters and conversion substrings, a partial list of which is given in Table 1. For example, the %Y-%m-%d %H:%M:%S string indicates that the date should look like 2014-05-14 09:54:10 when referring to May 14, 2014 at 9:54 in the morning. A format string can include any number of the predefined conversion substrings. Anything else is considered to be a literal character that must appear exactly as it appears in the format string.
Once the format string has been specified, the strptime command can be used to convert a string into the POSIXlt data type. The arguments for the command are the strings to convert, followed by the format string:
>theTime<-c("08:30:001867-07-01","18:15:001864-10-27")
>converted<-strptime(theTime,"%H:%M:%S%Y-%m-%d")
>converted
[1]"1867-07-0108:30:00""1864-10-2718:15:00"
>typeof(converted[1])
[1]"list"
>converted[1]-converted[2]
Timedifferenceof976.5938days
The command also accepts an optional time zone argument, as follows:
> theTime <- c("08:30:00 1867-07-01", "18:15:00 1864-10-27")
> converted <- strptime(theTime, "%H:%M:%S %Y-%m-%d",
+     tz="Canada/Eastern")
> converted
[1] "1867-07-01 08:30:00 EST" "1864-10-27 18:15:00 EST"
> converted[1] - converted[2]
Time difference of 976.5938 days
The results returned by the strptime command are of the POSIXlt data type. They can be converted to the POSIXct data type using the as.POSIXct command:
> theTime <- c("08:30:00 1867-07-01", "18:15:00 1864-10-27")
> converted <- strptime(theTime, "%H:%M:%S %Y-%m-%d", tz="Canada/Eastern")
> converted
[1] "1867-07-01 08:30:00 EST" "1864-10-27 18:15:00 EST"
> typeof(converted[1])
[1] "list"
> otherTime <- as.POSIXct(converted)
> otherTime
[1] "1867-07-01 08:30:00 EST" "1864-10-27 18:15:00 EST"
> typeof(otherTime[1])
[1] "double"
> cat(otherTime[1], "\n")
-3234681000
Note that when the date variables are printed, they are converted to a human-readable format. This is convenient, but it can hide the underlying data type. Again, you should be careful about variables that have a time data type since it is easy to lose track of how the R environment is actually treating the values.
Another important concern is that the strptime command will not give a visible indication when an error occurs. It will return NA in place of each string that it cannot convert, as follows:
> aTime <- c("2014-05-05 08:00:00", "2014/05/05 08:00:00")
> internal <- strptime(aTime, "%Y/%m/%d %H:%M:%S")
> internal
[1] NA                    "2014-05-05 08:00:00"
> is.na(internal)
[1]  TRUE FALSE
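Because the failure is silent, it can help to test for NA values immediately after a conversion. The following is a minimal sketch rather than a prescribed method; the variable name bad and the use of the UTC time zone are assumptions made for illustration:

```r
# Sketch: report the strings that strptime could not convert.
aTime <- c("2014-05-05 08:00:00", "2014/05/05 08:00:00")
internal <- strptime(aTime, "%Y/%m/%d %H:%M:%S", tz = "UTC")

# is.na is TRUE for every entry that did not match the format string
bad <- is.na(internal)
if (any(bad)) {
    cat("Could not convert:", paste(aTime[bad], collapse = ", "), "\n")
}
```

Here only the first string, which uses dashes rather than slashes, is flagged.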
Another way to save the date and time information in a file is to specify the date and time in one column. If the information from the file is stored as a data frame, then a new column can be added that contains the information saved in an internal format:
> fileInfo <- data.frame(time=c("2014-01-01 00:00:00",
+     "2013-12-31 23:59:50", "2013-12-31 23:55:12"),
+     happiness=c(1.0,0.9,0.8))
> fileInfo
                 time happiness
1 2014-01-01 00:00:00       1.0
2 2013-12-31 23:59:50       0.9
3 2013-12-31 23:55:12       0.8
> fileInfo$internalTime <- strptime(fileInfo$time, "%Y-%m-%d %H:%M:%S")
> fileInfo
                 time happiness        internalTime
1 2014-01-01 00:00:00       1.0 2014-01-01 00:00:00
2 2013-12-31 23:59:50       0.9 2013-12-31 23:59:50
3 2013-12-31 23:55:12       0.8 2013-12-31 23:55:12
> summary(fileInfo)
                  time     happiness      internalTime
 2013-12-31 23:55:12:1   Min.   :0.80   Min.   :2013-12-31 23:55:12
 2013-12-31 23:59:50:1   1st Qu.:0.85   1st Qu.:2013-12-31 23:57:31
 2014-01-01 00:00:00:1   Median :0.90   Median :2013-12-31 23:59:50
                         Mean   :0.90   Mean   :2013-12-31 23:58:20
                         3rd Qu.:0.95   3rd Qu.:2013-12-31 23:59:55
                         Max.   :1.00   Max.   :2014-01-01 00:00:00
The various commands used to convert a string to a time or date variable have a large variety of options. These options can be found within R using the help(strptime) command. A list of the options can also be found in the following table:
Format string   Meaning
%a              Abbreviation for the name of the day of the week
%A              Full name of the day of the week
%b              Abbreviation for the name of the month
%B              Full name of the month
%c              The date and time in the format %a %b %e %H:%M:%S %Y
%d              Day of the month as a number (01 through 31)
%H              Hours as a number (00-23); note that 24 is allowed as an exception when used as 24:00:00
%I              Hours in 12 hour format as a number (01-12)
%j              Day of the year as a number (001-366)
%m              Month as a number (01-12)
%M              Minute as a number (00-59)
%p              Indicator for “A.M.” or “P.M.”
%S              Seconds as a number (00-61); leap seconds are allowed
%U              Week of the year as a number (00-53)
%w              Weekday as a number (0=Sunday, 1=Monday, …, 6=Saturday)
%x              Date in the form “%y/%m/%d”
%X              Time in the format “%H:%M:%S”
%y              Year as two digits (“00-99”); 00-68 refer to the 2000s while 69-99 refer to the 1900s
%Y              Year as four digits (>=1582)
%z              Offset from UTC (-0500 is five hours behind UTC)
%Z              Time zone as a string (only available for converting time to a string)
Table 1 – characters used to define the formats of dates
More options are available and can be viewed using the help(strftime) command within the R environment.
Converting time data types to strings
A date and time variable can be converted to a string using the strftime command. Its format is similar to that of the strptime command, except that it has one additional option to determine whether or not to include time zone information in the resulting string. The command is a convenience function, and it calls either the format.POSIXlt or format.POSIXct command depending on the data type of the time variable. We focus on the strftime command because it is familiar to people from a wider variety of programming backgrounds.
The strftime command requires a time variable and a format. It returns a string in the given format (some format options are given in Table 1):
> theTime <- c("08:30:00 1867-07-01", "18:15:00 1864-10-27")
> converted <- strptime(theTime, "%H:%M:%S %Y-%m-%d",
+     tz="Canada/Eastern")
> converted
[1] "1867-07-01 08:30:00 EST" "1864-10-27 18:15:00 EST"
> typeof(converted)
[1] "list"
> backAgain <- strftime(converted, "%j-%B")
> backAgain
[1] "182-July"    "301-October"
> typeof(backAgain[1])
[1] "character"
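As a further illustration of the format strings in Table 1, the sketch below converts a single POSIXct value into several strings. The time zone is fixed to UTC so that the results do not depend on the local settings; the variable names are made up for this example, and usetz is assumed to be the option that appends the time zone to the result:

```r
# Sketch: format one time value in several ways.
theTime <- as.POSIXct("2014-05-14 09:54:10", tz = "UTC")

dayOfYear <- strftime(theTime, "%j", tz = "UTC")     # day of the year
clock <- strftime(theTime, "%H:%M", tz = "UTC")      # hours and minutes

# usetz = TRUE asks for the time zone to be appended to the string
withZone <- strftime(theTime, "%Y-%m-%d %H:%M:%S", tz = "UTC", usetz = TRUE)

print(c(dayOfYear, clock, withZone))
```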
Operations on time data types
A variety of arithmetic operations are available for the time data types, and you should be especially wary when performing any operations. The units that are returned can vary depending on the context. It is extremely easy to lose track of the units and make a spurious comparison. In this section, I’ll introduce some of the basic operations and then discuss the difftime command. It is possible to perform any operation without the difftime command, but the command has an important advantage: it allows you to explicitly define the units.
When you perform simple arithmetic on time data types, it acts in the way you might expect:
> earlier <- strptime("2014-01-01 00:00:00", "%Y-%m-%d %H:%M:%S")
> later <- strptime("2014-01-02 00:00:00", "%Y-%m-%d %H:%M:%S")
> later - earlier
Time difference of 1 days
> timeDiff <- later - earlier
> timeDiff
Time difference of 1 days
> as.double(timeDiff)
[1] 1
> earlier + timeDiff
[1] "2014-01-02 EST"
Note that in the previous example, the result is given in days. One small change, though, results in different units:
> earlier <- strptime("2014-01-01 00:00:00", "%Y-%m-%d %H:%M:%S")
> later <- strptime("2014-01-01 12:00:00", "%Y-%m-%d %H:%M:%S")
> later - earlier
Time difference of 12 hours
> timeDiff <- later - earlier
> timeDiff
Time difference of 12 hours
> as.double(timeDiff)
[1] 12
The R environment will keep track of the difference and its units. If you look at the difftime variable in the previous example, it contains information about what it means:
> attributes(timeDiff)
$units
[1] "hours"
$class
[1] "difftime"
> attr(timeDiff, "units")
[1] "hours"
You can specify the units by casting the result as a numeric value and providing the units to use:
> earlier <- strptime("2014-01-01 00:00:00", "%Y-%m-%d %H:%M:%S")
> later <- strptime("2014-01-01 12:00:00", "%Y-%m-%d %H:%M:%S")
> timeDiff <- later - earlier
> timeDiff
Time difference of 12 hours
> as.numeric(timeDiff, units="weeks")
[1] 0.07142857
> as.numeric(timeDiff, units="secs")
[1] 43200
Because of potential ambiguities, it is usually advantageous to use the difftime command. The difftime command offers a wide range of options, and you should carefully read its help page, help(difftime), to see the options and details. In its most basic use, you can find the difference between two times, and you can specify what units to use:
> earlier <- strptime("2014-01-01 00:00:00", "%Y-%m-%d %H:%M:%S")
> later <- strptime("2014-01-01 12:00:00", "%Y-%m-%d %H:%M:%S")
> timeDiff <- difftime(later, earlier, units="sec")
> timeDiff
Time difference of 43200 secs
> timeDiff <- difftime(later, earlier, units="day")
> timeDiff
Time difference of 0.5 days
The units that are available are auto, secs, mins, hours, days, or weeks.
One final note: the difftime data type offers a convenient way to perform some time arithmetic operations. The as.difftime command can be used to specify a time interval, and you can specify the units, so it is more likely that the results are consistent with your expectations:
> later <- strptime("2014-01-01 12:00:00", "%Y-%m-%d %H:%M:%S")
> oneHour = as.difftime(1, units="hours")
> later + oneHour
[1] "2014-01-01 13:00:00 EST"
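To make the point about units concrete, the following sketch compares an elapsed time against a limit by converting both to the same, explicitly named units before comparing. It is an illustration under assumed names (elapsed, limit, tooLong), not a required idiom:

```r
# Sketch: compare time intervals only after fixing the units explicitly.
earlier <- as.POSIXct("2014-01-01 00:00:00", tz = "UTC")
later <- as.POSIXct("2014-01-01 12:00:00", tz = "UTC")

elapsed <- difftime(later, earlier, units = "mins")
elapsedMinutes <- as.numeric(elapsed, units = "mins")

# A limit of one day, expressed as a difftime
limit <- as.difftime(1, units = "days")

# Convert both sides to hours so the comparison is unambiguous
tooLong <- as.numeric(elapsed, units = "hours") > as.numeric(limit, units = "hours")
cat("Elapsed minutes:", elapsedMinutes, " Exceeds the limit:", tooLong, "\n")
```

Comparing the bare numbers 720 (minutes) and 1 (days) would give the opposite, and wrong, answer.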
Summary
A broad overview of the options available to convert between strings and the two time data types was given in this chapter. The strptime command is used to convert a string into a POSIXlt variable. The strftime command is used to convert a time data type into a string. Finally, the basic operators used to perform arithmetic operations between two time variables were discussed, with an emphasis on using the difftime command. We also explored the use of the difftime data type.
Chapter 7. Basic Programming
In the previous chapters, we explored the basic aspects of how R stores information and the different ways to organize information. We will now explore the way that operations can be defined and executed, and write a program in R. The ability to create algorithms that combine functions to complete complicated tasks is one of R’s best features. We continue the exploration of programming in the next two chapters and focus on object-oriented approaches. This chapter is divided into four parts:
Conditional execution: In this section, we will introduce if-then-else blocks and discuss logical operators
Loop constructs: In this section, we will explore three different ways to implement loops
Functions: In this section, we will discuss how to define functions in R and explore some of the important considerations associated with functions
Script execution: In this section, we will discuss how to execute a set of commands that have been saved in a file
Conditional execution
The first control construct examined is the if statement. An if statement will determine whether or not a code block should be executed, and it has an optional else statement. An additional if statement can be chained to an else statement to create a cascade of options.
In its most basic form, an if statement has the following form:
if(condition)
    code block
The condition can be a logical statement constructed from the operators given in the next table, or it can be a numeric result. If the condition is numeric, then zero is considered to be FALSE, otherwise the statement is TRUE. The code block can either be a single statement or a set of statements enclosed in braces:
> # check if 1 is less than 2
> if(1 < 2)
+     cat("one is smaller than two.\n")
one is smaller than two.
> if(2 > 1) {
+     cat("But two is bigger yet.\n")
+ }
But two is bigger yet.
Note that a comment is included in the previous code. The # character is used to denote a comment. Anything after the # character on the same line is ignored.
The if statement can be extended using an else statement. The else statement allows you to specify a code block to execute if the condition does not evaluate to TRUE:
> if(1 > 2) {
+     cat("One is the most biggest number\n")   # Say something wrong
+ } else {
+     cat("One is the loneliest number\n")      # Say something less wrong
+ }
One is the loneliest number
The first thing to note about these examples is that a Kernighan and Ritchie (K&R) indentation style is adopted. We adopted the K&R style because, when code is entered at the command line, the else statement must be on the same line as the closing brace. The second thing to note is that another if statement can be appended to the else statement so that a standard if-then-else block can be constructed:
> if(0) {
+     cat("Yes, that is FALSE.\n")
+ } else if(1) {
+     cat("Yes that is TRUE\n")
+ } else {
+     cat("Whatever")
+ }
Yes that is TRUE
One potential issue is that R does not have a scalar type and assumes most data types are arranged in vectors. This can lead to potential problems with a logical statement. The first element in the vector is used to decide whether a whole statement is TRUE or FALSE. Fortunately, R will give a warning if the condition evaluates to a vector of length greater than one (in R 4.2.0 and later this is an error rather than a warning):
> x <- c(1,2)
> if(x < 2) {
+     cat("Oh yes it is\n")
+ }
Oh yes it is
Warning message:
In if (x < 2) { :
  the condition has length > 1 and only the first element will be used
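This is something to keep in mind when deciding which logical operator to use in an if statement. For example, the | operator will perform a logical OR on all elements of the vectors, but the || operator will only perform a logical OR on the first elements of the vectors (in R 4.3.0 and later, passing a vector of length greater than one to || or && is an error):
> x <- c(FALSE,FALSE,TRUE)
> y <- c(FALSE,TRUE,TRUE)
> x|y
[1] FALSE  TRUE  TRUE
> x||y
[1] FALSE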
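One way to avoid the ambiguity is to reduce a logical vector to a single value explicitly before it reaches the if statement. The sketch below uses the any and all commands for this purpose and the vectorized ifelse command for element-by-element decisions; the variable names are chosen only for this example:

```r
# Sketch: make the intent of a vector-valued condition explicit.
x <- c(1, 2)

someSmall <- any(x < 2)   # TRUE if at least one entry is less than 2
allSmall <- all(x < 2)    # TRUE only if every entry is less than 2

if (someSmall && !allSmall) {
    cat("Some, but not all, of the values are less than two.\n")
}

# ifelse makes a separate decision for each element of the vector
labels <- ifelse(x < 2, "small", "large")
print(labels)
```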
A variety of logical operators are recognized. Some provide comparisons between all entries in a vector, and others compare only the first elements in the vectors. A list of the operators is given in the following table:
Operator    Description
<           Less than (vector)
>           Greater than (vector)
<=          Less than or equal (vector)
>=          Greater than or equal (vector)
==          Equal to (vector)
!=          Not equal to (vector)
|           Or (vector)
||          Or (first entry only)
!           Not (vector)
&           And (vector)
&&          And (first entry only)
xor(a,b)    Exclusive or (vector)
Table 1 – The logical operators including comparison operators
Loop constructs
Another important programming task is to create a set of instructions that can be repeated in a structured way. There are three loop constructs in R: for, while, and repeat loops. We will explore each of these loop constructs and discuss the break and next commands, which can control how the instructions within a loop’s code block are executed. More details are available using the help(Control) command.
The for loop
A for loop takes a vector, and it repeats a block of code for each value in the vector. The syntax is as follows:
for(var in vector)
    code block
The for loop repeats a set of instructions for every value within a vector in the appropriate order. It can be memory intensive if you need to repeat the loop a lot of times and the vector is not already stored in the workspace. An example of a simple for loop is given here:
> for(lupe in seq(1,2,by=0.33))
+ {
+     cat("The value of lupe is",lupe,"\n")
+ }
The value of lupe is 1
The value of lupe is 1.33
The value of lupe is 1.66
The value of lupe is 1.99
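A common variation is to loop over the positions of a vector rather than its values, which makes it possible to store a result for each entry. The following sketch assumes that we want the square of each value; the names values and squares are made up for this example:

```r
# Sketch: loop over the indices of a vector and store one result per entry.
values <- seq(1, 2, by = 0.33)

# Allocate the result once instead of growing it inside the loop
squares <- numeric(length(values))

for (i in seq_along(values)) {
    squares[i] <- values[i]^2
}
print(squares)
```

The seq_along command returns the indices 1, 2, and so on up to the length of the vector, and it returns an empty vector if the vector itself is empty, so the loop body is then skipped.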
The while loop
A while loop will execute a code block as long as a given expression is true. The syntax is as follows:
while(condition)
    code block
While loops are often used when the number of iterations is not known in advance and the loop is repeated until some criterion is met. Also, the while loop has some advantages over the for loop. It can be more efficient, especially with respect to memory, and it is more flexible (please refer to Hadley Wickham, Memory, 2014, http://adv-r.had.co.nz/memory.html, for more information on this topic). For example, instead of constructing a large vector to iterate over its elements, a single index can be used. On the downside, it can be harder to read, and it can require a little more care when writing the code. An example is given here:
> lupe <- 1.0;
> while(lupe <= 2.0)
+ {
+     cat("The value of lupe is",lupe,"\n")
+     lupe <- lupe + 0.33
+ }
The value of lupe is 1
The value of lupe is 1.33
The value of lupe is 1.66
The value of lupe is 1.99
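The case where the number of iterations is not known in advance can be sketched with a simple iteration that stops once a tolerance is reached. The example below approximates a square root using Newton's method; the tolerance and the variable names are assumptions chosen for illustration:

```r
# Sketch: repeat an update until the estimate is close enough.
target <- 2.0       # find the square root of this number
estimate <- 1.0     # initial guess
iterations <- 0

while (abs(estimate^2 - target) > 1e-10) {
    # Newton's update for the equation estimate^2 = target
    estimate <- 0.5 * (estimate + target / estimate)
    iterations <- iterations + 1
}
cat("Estimate:", estimate, "after", iterations, "iterations\n")
```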
The repeat loop
A repeat loop is used to denote a block of code that will be repeatedly executed until an explicit break out of the block is executed. The primary advantage of the repeat loop is that the start of the code block will always be executed. One disadvantage is that it can be difficult to read. The syntax is relatively simple:
repeat
    code block
You must use a break command to tell R to exit the code block. The break command is described in more detail in the next subsection. The following example of a repeat loop is a brief simulation of a random walk in the complex plane:
positions <- complex(0)    # Initialize the history of positions
currentPos <- 0.0+0.0i     # Start at the origin
NUMBERSTEPS <- 50          # Number of steps to take
angleFacing <- 0.0         # direction it is facing
stdDev <- 1.0              # std dev. of the change in the angle
step <- as.integer(0)
repeat
{
    ## Update the current time step
    step <- step + as.integer(1)
    ## Check to see if it is time to stop
    if(step > NUMBERSTEPS)
        break
    ## Turn by a random amount and take a unit step in that direction
    angleFacing <- angleFacing + rnorm(1,0.0,stdDev)
    currentPos <- currentPos + exp(angleFacing*1.0i)
    positions <- c(positions,currentPos)
}
plot(Re(positions),Im(positions),type="l")
Break and next statements
The break and next statements are used to influence which part of the code in the current loop will be executed. The break statement will move to the very end of the current block, and it will stop the execution of the loop. The next statement will act as if the end of the code block was reached and start over at the beginning of the code block to begin the next iteration.
As a demonstration, we build on the simulation of the random walk in the previous example. We alter the model by placing a restriction on the position. If a step moves to the left-hand part of the plane, it is ignored:
positions <- complex(0)    # Initialize the history of positions
currentPos <- 0.0+0.0i     # Start at the origin
NUMBERSTEPS <- 50          # Number of steps to take
angleFacing <- 0.0         # direction it is facing
stdDev <- 1.0              # std dev. of the change in the angle
step <- as.integer(0)
repeat
{
    ## Propose a random turn and a unit step in that direction
    newAngle <- angleFacing + rnorm(1,0.0,stdDev)
    proposedStep <- currentPos + exp(newAngle*1.0i)
    if(Re(proposedStep) < 0.0)
        next   # Ignore this step. It moves to neg. real parts
    ## update the position
    angleFacing <- newAngle
    currentPos <- proposedStep
    positions <- c(positions,currentPos)
    ## Update the current time step
    step <- step + as.integer(1)
    ## Check to see if it is time to stop
    if(step >= NUMBERSTEPS)
        break
}
plot(Re(positions),Im(positions),type="l")
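The break and next statements work the same way inside for and while loops. The short sketch below, which is separate from the random walk and uses made-up names, adds up the odd numbers in a sequence and stops once the numbers get too large:

```r
# Sketch: next skips the rest of one iteration, break leaves the loop.
total <- 0
for (k in 1:10) {
    if (k %% 2 == 0)
        next     # Skip even numbers and move on to the next value of k
    if (k > 7)
        break    # Stop the loop entirely once k is bigger than 7
    total <- total + k
}
cat("The total is", total, "\n")   # 1 + 3 + 5 + 7 = 16
```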
Functions
Another common programming task is to define a function or subroutine that can be executed with a single call. Defining and using functions can be complicated because of the technical details associated with working with variables that exist in different contexts but may have the same name. Another problem that arises is that everything in R is an object. Up to this point, we have quietly ignored this issue, but it is a technical issue that we must now consider. In this section, we will first demonstrate how to define a function. We will then discuss the details about how arguments are passed to a function. Finally, we discuss the technical details of how R determines what a variable name means.
Before we get into those details, we will provide a note about how R keeps track of functions. When we define a variable, R treats that variable as an object that can be accessed using the name we assign to the variable. Likewise, when you define a new function, it is assigned a variable name, and the variable is an object.
Defining a function
As previously mentioned, when a function is defined, it is assigned to an object and treated like any other variable. The format for a function definition is as follows:
function(arg1,arg2,…)
    code block
This will create an object, and you must assign a variable name to the object. If you print out the value of the variable, it will print out the definition of the function. In the following example, suppose we need a function used to simulate a random walk in the complex plane. The function takes the current position and adds a unit step in a random direction:
> updatePosition <- function(currentPos)
+ {
+     newDirection <- exp(1i*runif(1,0.0,2.0*pi))
+     currentPos + newDirection
+ }
>
> updatePosition(0.0)
[1] 0.9919473-0.1266517i
> updatePosition
function(currentPos)
{
    newDirection <- exp(1i*runif(1,0.0,2.0*pi))
    currentPos + newDirection
}
One oddity associated with functions is that the value a function returns is the value of the last expression evaluated within its code block.
There are times when you want a function to perform operations that impact more than one variable. In such cases, you may need to return a combination of results. For difficult results that cannot be expressed as a vector, you can return the result as a list. In the following example, we extend the previous example and wish to return the new position as well as the updated direction of movement:
> updatePosition <- function(currentPos,angle,stdDev)
+ {
+     angle <- angle + rnorm(1,0,stdDev)
+     list(newPos=currentPos+exp(angle*1.0i),
+          newAngle=angle)
+ }
>
> pos <- updatePosition(2.0,0.0,1.0)
> pos
$newPos
[1] 2.986467+0.163962i
$newAngle
[1] 0.164706
> pos$newPos
[1] 2.986467+0.163962i
It is possible to explicitly specify the value returned by a function using the return command. The return command takes at most one argument. The command will exit the function and return the value given in the argument if it exists. In the following example, we build on our example of a random walk in the complex plane. Here, we assume that the left-hand side of the plane is not reachable, and if the real part of a step is negative, then the step moves in the opposite direction:
> updatePosition <- function(currentPos,angle,stdDev)
+ {
+     angle <- angle + rnorm(1,0.0,stdDev)
+     newStep <- exp(angle*1.0i)
+     if(Re(currentPos+newStep) < 0.0)
+     {
+         # This would be a move in the left hand part of the
+         # plane.
+         # Move in the opposite direction.
+         return(list(newPos=currentPos-newStep,
+                     newAngle=angle+pi))
+     }
+     # All is good. Accept this move.
+     return(list(newPos=currentPos+newStep,
+                 newAngle=angle))
+ }
>
> pos <- updatePosition(-0.1+2i,0.0,1.0)
> pos$newPos
[1] 0.459425+2.828881i
Arguments to functions
We have discussed how to define a new function and briefly discussed how to pass arguments to a function. We now focus on some details about passing arguments to a function. First, we note that the arguments that are passed to a function are passed as values and not references. Any changes you make to an argument do not impact the variable outside the function. In the following example, we go back to our function to update the position for a random walk in the complex plane. We pass the angle to the function, and it is changed within the function but not outside of the function:
> updatePosition <- function(currentPos,angle,stdDev)
+ {
+     angle <- angle + rnorm(1,0.0,stdDev)
+     currentPos + exp(1i*angle)
+ }
>
> angle <- 0.0
> updatePosition(1+2i,angle,1.0)
[1] 0.250178+2.661639i
> angle
[1] 0
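Since changes made inside a function are not visible to the caller, any state that must carry over from one call to the next has to be returned and reassigned. The sketch below assumes a version of updatePosition that returns a list with the entries newPos and newAngle, and it uses a hypothetical variable named state to hold the walk between calls:

```r
# Sketch: carry state between calls by reassigning the returned list.
set.seed(42)   # Only so that repeated runs give the same walk

updatePosition <- function(currentPos, angle, stdDev)
{
    angle <- angle + rnorm(1, 0.0, stdDev)
    list(newPos = currentPos + exp(angle * 1.0i),
         newAngle = angle)
}

state <- list(newPos = 0.0 + 0.0i, newAngle = 0.0)
for (step in seq_len(50)) {
    # The caller's copy only changes because of this assignment
    state <- updatePosition(state$newPos, state$newAngle, 1.0)
}
print(state$newPos)
```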
Another important point is that you can provide default values for some arguments. If a default value is given for an argument, then the argument is not required when calling the function:
> updatePosition <- function(currentPos,angle=0.0)
+ {
+     print(noquote(paste("Angle is",angle)))
+     angle <- angle + runif(1,0.0,2.0*pi)
+     currentPos + exp(1i*angle)
+ }
>
> updatePosition(1+2i)
[1] Angle is 0
[1] 0.091507+2.417901i
> updatePosition(1+2i,pi)
[1] Angle is 3.14159265358979
[1] 0.029198+2.239884i
There are circumstances in which you might wish to check whether a particular argument has been specified when the function is called. This can be done using the missing command. In the next example, we test whether or not the angle is provided:
> updatePosition <- function(currentPos,angle=0.0)
+ {
+     if(missing(angle))
+     {
+         warning("Using the default drift: ",angle)
+     }
+     angle <- angle + runif(1,0.0,2.0*pi)
+     currentPos + exp(1i*angle)
+ }
>
> updatePosition(1+2i)
[1] 0.552725+1.105604i
Warning message:
In updatePosition(1 + (0+2i)) : Using the default drift: 0
> updatePosition(1+2i,pi)
[1] 0.054151+1.675394i
Note that the warning command was used to print out a message. When executing a function, it may be beneficial to print a warning or to stop the execution of the function due to an error condition. The stop and warning commands can be used for these situations. The warning command prints out a warning and continues execution as normal. The stop command will print out a message and exit the function:
> updatePosition <- function(currentPos,angle=0.0)
+ {
+     if(abs(angle)>2.0*pi)
+     {
+         stop("I arbitrarily do not like angles that big")
+     }
+     angle <- angle + runif(1,0.0,2.0*pi)
+     currentPos + exp(1i*angle)
+ }
>
> pos1 <- updatePosition(1+2i)
> pos2 <- updatePosition(1+2i,3.0*pi)
Error in updatePosition(1 + (0+2i), 3 * pi) :
  I arbitrarily do not like angles that big
> pos2
Error: object 'pos2' not found
Note that in the last call, the stop command was executed and the function did not return a value. The result is that the variable pos2 does not exist. Be careful though: if the variable pos2 had been previously defined, it would retain its previous value.
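If you would rather handle an error than have it interrupt your work, one option in base R is the tryCatch command. The following is a hedged sketch: the decision to fall back to a missing value (NA_complex_) is an assumption made for illustration, and other recovery strategies may suit your problem better:

```r
# Sketch: catch the error raised by stop and substitute a fallback value.
updatePosition <- function(currentPos, angle = 0.0)
{
    if (abs(angle) > 2.0 * pi)
    {
        stop("I arbitrarily do not like angles that big")
    }
    angle <- angle + runif(1, 0.0, 2.0 * pi)
    currentPos + exp(1i * angle)
}

pos2 <- tryCatch(updatePosition(1+2i, 3.0 * pi),
                 error = function(e) {
                     # conditionMessage extracts the text given to stop
                     cat("Caught an error:", conditionMessage(e), "\n")
                     NA_complex_
                 })
print(pos2)
```

With this approach, pos2 is always assigned, so a stale value from an earlier calculation cannot be mistaken for a new result.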
In the preceding examples, there is an assumption about the order of the arguments when calling a function. The arguments are matched according to the order in which they appear in the function call. You can circumvent this convention by specifying the name of each argument when you call the function. The caveat is that R does not require that the name should match exactly, and it will try to match the names using the first characters in the name. If the match is ambiguous, you will get an error message:
> matching <- function(argOne,argTwo)
+ {
+     return(paste("I got this:",argOne,'',argTwo))
+ }
> matching(argTwo="second",argOne="First")
[1] "I got this: First  second"
> matching(argT="2nd",argO="1st")
[1] "I got this: 1st  2nd"
> matching(argT="two",arg="one")
Error in matching(argT = "two", arg = "one") :
  argument 2 matches multiple formal arguments
The last issue to discuss is how to limit the potential values that an argument may have. The default value for an argument can be given as a vector of values. If no argument is given, it defaults to the first entry in the vector. If you wish to limit the values to be one of the values in the vector, you can use the match.arg function to test the value:
> updatePosition <- function(currentPos,angle=0.0,
+                            dist=c("uniform","normal"))
+ {
+     dist <- match.arg(dist)
+     print(dist)
+     # Update position code would go below
+ }
>
> updatePosition(0.0,0.0)
[1] "uniform"
> updatePosition(0.0,0.0,"uniform")
[1] "uniform"
> updatePosition(0.0,0.0,"neither")
Error in match.arg(dist) : 'arg' should be one of "uniform", "normal"
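One possible way to fill in the update step is to choose the random turn with the switch command once match.arg has validated the choice. This is a sketch under assumptions: the standard deviation of 1.0 for the normal case and the variable names are invented for illustration and are not taken from the example above:

```r
# Sketch: use the validated argument to pick how the turn is generated.
updatePosition <- function(currentPos, angle = 0.0,
                           dist = c("uniform", "normal"))
{
    dist <- match.arg(dist)
    turn <- switch(dist,
                   uniform = runif(1, 0.0, 2.0 * pi),
                   normal  = rnorm(1, 0.0, 1.0))
    currentPos + exp(1i * (angle + turn))
}

set.seed(1)
stepUniform <- updatePosition(0+0i)                    # default is "uniform"
stepNormal <- updatePosition(0+0i, dist = "normal")

# An unsupported choice is rejected by match.arg before any work is done
rejected <- inherits(try(updatePosition(0+0i, dist = "neither"),
                         silent = TRUE), "try-error")
```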
Scope
An important question when dealing with a function is how to decide what a symbol means. This idea is referred to as scope, and the language used to describe the ideas associated with scope can be confusing. Unfortunately, it is something that needs to be considered, and we try to discuss some of the ideas here. It is also important to note that the details discussed here represent an area in which R is not consistent with the S-PLUS language, so be careful about generalizing these ideas. For further information, you can enter the demo(scoping) command in the R environment, and a brief demonstration of the notion of scope is given. If you enter the help(environment) command, you can also find more details.
The basic idea is that R maintains a hierarchy of environments. Each environment has a list of symbols that are associated with that environment. You can create a new environment that is embedded within another environment. The new.env command is used to create an environment. This environment holds its own variables, and variables can be created using the assign command. The values can be obtained using the get command:
> envOne <- new.env()
> typeof(envOne)
[1] "environment"
> ls()
[1] "envOne"
> ls(envOne)
character(0)
>
>
> assign("bubba",12,envir=envOne)
> ls()
[1] "envOne"
> ls(envOne)
[1] "bubba"
> envOne$bubba
[1] 12
> get("bubba",envOne)
[1] 12
> bubba
Error: object 'bubba' not found
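The hierarchy can be seen directly by creating one environment inside another. In the sketch below, the environment named child is hypothetical and is given envOne as its parent; a lookup that starts in the child moves on to the parent when the symbol is not found locally:

```r
# Sketch: a lookup that fails in a child environment continues in its parent.
envOne <- new.env()
assign("bubba", 12, envir = envOne)

child <- new.env(parent = envOne)

# Not stored in the child itself ...
inChild <- exists("bubba", envir = child, inherits = FALSE)
# ... but get searches the enclosing environments by default
found <- get("bubba", envir = child)

cat("Stored in child:", inChild, " Value found:", found, "\n")
```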
Note that in the preceding example, we used the optional environment argument to the ls command, which specifies which environment to use. The environment is used to guide R in how to interpret the meaning of a symbol. The R environment maintains a path that it uses to search in a particular order to find a symbol. One way to manipulate the path is to use the attach and detach commands. The attach and detach commands have numerous options, but we focus on how to use them with environments. We also provide a warning that using these commands can lead to confusion about the meaning of a symbol, and you should exercise caution when using these commands:
> ls()
character(0)
> one <- 2
> ls()
[1] "one"
> envTwo <- new.env()
> assign("two",3,envir=envTwo)
> two
Error: object 'two' not found
> attach(envTwo)
> ls()
[1] "envTwo" "one"
> two
[1] 3
> detach(envTwo)
> two
Error: object 'two' not found
The reason we explore this topic here is that when a function is called, a new environment is created for the variables used within the function. When you use a symbol within a function, it can be ambiguous as to what it means. If that symbol has been previously defined within the function, then it is treated as a local variable. If that symbol exists in the parent environment, then it is possible to get access to it or change its value. In Chapter 1, Data Types, it was briefly noted that the <- operator is used to assign a variable in the local context. The <<- operator is used to tell R to first search the parent environment:
> one <- 2
> changeOne <- function(a)
+ {
+     one <- a
+     return(one)
+ }
> changeOne(3)
[1] 3
> one
[1] 2
> realyChangeOne <- function(a)
+ {
+     one <<- a
+     return(one)
+ }
> realyChangeOne(3)
[1] 3
> one
[1] 3
Again, <<- tells R to use the parent of the current environment. That means that if you create a function within a function, the use of the <<- operator within the innermost function will look for a variable in the original (outermost) function. This idea is examined in the following example:
> market <- function(rutabagas)
+ {
+     money <- 0
+     return(list(
+         numberRutabagas = function()
+         {
+             return(rutabagas)
+         },
+         revenue = function()
+         {
+             return(money)
+         },
+         harvestRutabagas = function(amount)
+         {
+             rutabagas <<- rutabagas + amount
+         },
+         sellRutabagas = function(amount)
+         {
+             if(rutabagas >= amount)
+             {
+                 rutabagas <<- rutabagas - amount
+                 money <<- money + amount*0.5
+             }
+             else
+             {
+                 warning("We do not have that many rutabagas")
+             }
+             return(rutabagas)
+         }))
+ }
> farmerJoe <- market(20)
> farmerJoe$numberRutabagas()
[1] 20
> farmerJoe$sellRutabagas(6)
[1] 14
> farmerJoe$numberRutabagas()
[1] 14
> farmerJoe$sellRutabagas(15)
[1] 14
Warning message:
In farmerJoe$sellRutabagas(15) : We do not have that many rutabagas
> farmerJoe$harvestRutabagas(10)
> farmerJoe$numberRutabagas()
[1] 24
Note that the preceding example gives us our first taste of object-oriented programming. We will explore this in more detail in Chapter 8, S3 Classes, and will build on the idea.
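The same mechanism can be seen in a smaller setting. The sketch below is not part of the rutabaga example; it defines a hypothetical makeCounter function whose inner functions share a single variable, count, through the <<- operator:

```r
# Sketch: inner functions share, and update, a variable of the outer function.
makeCounter <- function()
{
    count <- 0
    list(
        increment = function()
        {
            count <<- count + 1   # Changes count in the enclosing function
            invisible(count)
        },
        value = function()
        {
            return(count)
        })
}

first <- makeCounter()
second <- makeCounter()
first$increment()
first$increment()
second$increment()
cat("first:", first$value(), " second:", second$value(), "\n")
```

Each call to makeCounter creates a new environment, so the two counters keep separate totals.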
Executing scripts
The final topic is how to execute a set of commands that have been saved in a file using the command line in an interactive R session. All of our examples have been contrived, and the reason for this is to try to focus on a specific idea. The real power of R, though, is the ability to put together a set of commands and have them executed in order. This can be accomplished using the source command.
We need to have a file to execute. We assume that you have the file given here. You can create this file using any editor capable of saving simple text files, and we assume that the name of the file is simpleExecute.R:
# File simpleExecute.R
# This is a simple example used to demonstrate the source command.
# This script will prompt the person running it to enter a number,
# and it will find the square root of the number.
# It tests the original
# number to make sure it is positive and prints out an appropriate
# warning message if it is negative.
x <- as.double(readline("What is the value of x? "))   # Read in a number
cat("I got the number",format(x,digits=6),".\n")
if(x < 0)
{
    # The number is negative. What are they thinking?
    print("Why would you give me a negative number?")
    x <- abs(x)
}
# Find the square root and assign it the variable "y."
y <- sqrt(x)
You can execute the file using the source command. One important thing is that the R environment must be able to find the file on your local machine. You can either specify the full path to the file or set the current working directory. The easiest way to do this depends on how you are running R, the interface you are using, and your operating system. We assume that you can specify the current working directory (folder), and the file given earlier is in that directory. Once you specify the current working directory, you can execute the commands using the source command, as follows:
> source('simpleExecute.R')
What is the value of x? 2.3
I got the number 2.3 .
> source('simpleExecute.R')
What is the value of x? -2.3
I got the number -2.3 .
[1] "Why would you give me a negative number?"
> source('simpleExecute.R',echo=TRUE)
> x <- as.double(readline("What is the value of x? "))
What is the value of x? -2.3
> cat("I got the number",format(x,digits=6),".\n")
I got the number -2.3 .
> if(x < 0)
+ {
+     print("Why would you give me a negative number?")
+     x <- abs(x)
+ }
[1] "Why would you give me a negative number?"
> y <- sqrt(x)
The command has a number of options. One option not explored here is the verbose option. This is a helpful option for debugging, and you should try adding the verbose=TRUE option. An example is omitted because referring to it as verbose is an understatement.
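Another option worth knowing about is local, which controls the environment in which the commands are evaluated. The sketch below writes a tiny two-line script to a temporary file, so the file name and its contents are invented for this illustration, and then runs it inside a separate environment so that its variables do not clutter the workspace:

```r
# Sketch: run a script inside its own environment using the local option.
scriptFile <- tempfile(fileext = ".R")
writeLines(c("x <- 16",
             "y <- sqrt(x)"), scriptFile)

scriptEnv <- new.env()
source(scriptFile, local = scriptEnv)   # Variables are created in scriptEnv

result <- get("y", envir = scriptEnv)
cat("The script computed y =", result, "\n")
unlink(scriptFile)                      # Remove the temporary file
```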
Summary
This chapter introduced basic ideas to specify optional execution of certain commands and the three basic loop constructs. We had to take a side trip to discuss the idea of scope and explore how R finds and interprets the meaning of a variable name. You can combine these ideas to create and implement algorithms and execute commands in a file.
This chapter also includes our first taste of object-oriented programming in the sense of an S3 class. We build on this idea in the next chapter where the S3 class is formally defined. In doing so, we explore how existing functions can be extended to accommodate arguments that include a class that we have constructed.
Chapter 8. S3 Classes
This is the second chapter in our introduction to programming. In the preceding chapter, we explored the basic control structures that help us to define the code that is executed, and we had our first taste of objects. We will now build on the idea of object-oriented programming, concentrating on S3 classes. There are two approaches, S3 and S4 classes. It is common for some people to use only one exclusively.
This chapter is divided into three parts:
Defining classes and methods: This section will give a general idea of how methods are defined whose function depends on the class name of the primary argument
Objects and inheritance: In this section, we will discuss the way in which objects of a given class can be defined; we will also introduce the idea of inheritance in the context of S3 classes
Encapsulation: In this section, we will discuss the importance of encapsulation with respect to a class and how it is handled within the context of an R class
At first glance, S3 objects do not appear to behave like objects as defined in other languages. The definition is an odd implementation compared to Java or C++. On the plus side, S3 objects are relatively simple and can offer a powerful way to deal with a wide variety of circumstances.
We have seen a variety of data structures as well as functions, and in this chapter, we will see how the class attribute can be used to dictate how a function responds when a list is passed to a function. The idea is that the class attribute for an object is a vector of names, and the vector represents an ordered set of names to search when deciding what action a function should take. We will build on and extend one example throughout this chapter. The idea is that we wish to create a set of classes that can be used to simulate a random variable, which follows a geometric distribution. There will be two classes. The first class is for a fair coin, in which we flip the coin until heads is tossed. The second class is for a fair, six-sided die, in which we roll until a 1 is rolled.
DefiningclassesandmethodsTheclasscommandissimilartootherattributecommands,anditcanbeusedtoeithersetorgetinformationaboutanobject’sclass.Anobject’sclassisavector,andeachiteminthevectoristhenameofaclass.Thefirstelementintheclassvectoristheobject’sbaseclass,anditinheritsfromtheotherclassesasyoureadfromlefttoright.
We first focus on the situation where an object has a single class and will examine inheritance in the section that follows. The example examined throughout this chapter is used to simulate one experiment that follows a geometric distribution. The idea is that you repeat some experiment and stop when the first success occurs. First, we examine two classes, and we construct a function that will take an action depending on the class name. The first class is used to represent a fair, six-sided die. The die will be rolled, giving an integer between 1 and 6 inclusive, and the experiment stops when a 1 is returned. The second class represents a fair coin. The coin will be flipped returning either an H or a T, and the experiment stops when H is returned.
The two class definitions are illustrated in the following figure. Each class keeps track of the trials, and the results are kept in a vector. The two methods include a method to reset the history, but more will be added when we examine inheritance. In this example, we are not creating methods in the traditional sense but are creating functions that take appropriate action based on the class name of the argument passed to them. Have a look at the following diagram:
The methods associated with the die and coin classes
First, we define the two classes. Each class is composed of a list, and the class names are set to Die and Coin respectively. (The names are strings that we make up.) Each class consists of a list with a single character vector that initially has a length of zero. In each of the following cases, the list is created manually, and a class name is defined. We could have used a vector, but we used a list so that the examples are consistent with the way we extend the classes later:
> oneDie <- list(trials = character(0))
> class(oneDie) <- "Die"
> oneCoin <- list(trials = character(0))
> class(oneCoin) <- "Coin"
Next, we define two sets of functions. The first set of functions resets the history, and the second set performs a single Bernoulli trial. We first focus on a routine to reset and initialize the history, and define a function called reset. The reset function is supported by three other functions, one for each class and one default. The reset function itself uses the UseMethod command, which tells R to search for the appropriate function to call. The decision is based on the class name of the object passed to it as the first argument. The UseMethod command looks for other functions whose names have the form reset.class_name, where the class_name suffix must exactly match the name of the class. The exception is the default suffix, which is executed if no other function is found:
reset <- function(theObject)
{
    UseMethod("reset", theObject)
    print("Reset the Trials")
}
reset.default <- function(theObject)
{
    print("Uh oh, not sure what to do here!\n")
    return(theObject)
}
reset.Die <- function(theObject)
{
    theObject$trials <- character(0)
    print("Reset the die\n")
    return(theObject)
}
reset.Coin <- function(theObject)
{
    theObject$trials <- character(0)
    print("Reset the coin\n")
    return(theObject)
}
Note that the functions return the object passed to them. Recall that R passes arguments by value. Any changes you make to the variable are local to the function, so the new value must be returned. We can now call the reset function, and it will decide which function to call, given the argument passed to it. Have a look at the following code:
> oneDie$trials = c("3", "4", "1")
> oneDie$trials
[1] "3" "4" "1"
> oneDie <- reset(oneDie)
[1] "Reset the die\n"
> oneDie
$trials
character(0)
attr(,"class")
[1] "Die"
> oneCoin$trials = c("H", "H", "T")
> oneCoin <- reset(oneCoin)
[1] "Reset the coin\n"
> oneCoin$trials
character(0)
> # Look at an example that will fail and use the default function.
> v <- c(1, 2, 3)
> v <- reset(v)
[1] "Uh oh, not sure what to do here!\n"
> v
[1] 1 2 3
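Note that the print command after the UseMethod command in the reset function is not executed. The UseMethod command does not return control to the function that called it. The value returned by the selected method becomes the result of the call, so any commands that follow the UseMethod command are never executed.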
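The behavior can be checked with a short, self-contained sketch. The describe function and the Widget class are hypothetical names used only for this illustration.

```r
# A generic with a statement placed after UseMethod.
describe <- function(x) {
  UseMethod("describe", x)
  message("this line is never reached")  # skipped: UseMethod does not return here
}
describe.default <- function(x) "default"
describe.Widget  <- function(x) "widget"

w <- structure(list(), class = "Widget")
describe(w)    # "widget"
describe(42)   # "default"
```

Neither call prints the message, which is why any work that must happen on every call belongs in the methods themselves rather than after the UseMethod command.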
Defining objects and inheritance
The examples given in the previous section should invoke a twinge of shame for those familiar with object-oriented principles, and you should be assured that I felt appropriately embarrassed to share them. It was done, though, to keep the introduction to S3 classes as simple as possible. One issue is that the two classes are closely related, and the functions include a great deal of repeated code. We will now examine how inheritance can be used to avoid this problem.
In this section, we define a base class, GeometricTrial, and then redefine the routines so that the Die and Coin classes can be derived from the base class. In doing so, we can demonstrate how inheritance is implemented in the context of an S3 class. Additionally, we respect the idea of encapsulation, which is the principle that an object of a given class should update its own elements using methods from within the class. We explore this issue in greater detail in the section that follows.
We will now rethink the whole class structure. The die and the coin are closely related, and the only difference is the result returned from a single trial. We reimagine the classes to take advantage of the commonalities between the coin and the die. The new class structure is shown in the following diagram:
In addition to the change in the classes, we also change the way in which the classes are defined. In this case, we define functions that will act as constructors for each class. Each constructor will use the class command to append the name of the class to the object’s class attribute. As previously mentioned, the class attribute for an object is a vector. When you call the UseMethod command, R will search for a function whose class suffix matches the first element in the vector. If it does not find that function, it looks for a function that matches the second element, and it proceeds until it reaches the last element in the vector. If it does not find anything, it calls the default function. With this in mind, we now examine new definitions of the classes. Rather than manually creating the class, we define functions that will create a list representing the class, append a class name to the class attribute, and then return the list. There are three classes, and we will define one function for each class. The first function is used to define a constructor for an object of the GeometricTrial class:
GeometricTrial <- function()
{
    # Create the basic data structure - a list that keeps track of
    # a set of trials.
    # Create the basic methods as part of a list to be returned.
    me = list(
        # Define the history to keep track of the trials.
        history = character(0)
    )
    # Define my class identifier and return the list.
    class(me) <- append(class(me), "GeometricTrial")
    return(me)
}
Prior to returning the list, the append function is used to add the new class name to the end of the current class attribute. This idea is used in classes that are derived from the GeometricTrial class as well. The constructors for the Die and Coin classes can now be defined, and both constructors explicitly call the constructor for the parent class, perform any actions associated with the current class, and then append the current class name to the class attribute:
Die <- function()
{
    # Define the object by first calling the constructor for the base class
    me <- GeometricTrial()
    # Add the class name to the end of the list of class names
    class(me) <- append(class(me), "Die")
    return(me)
}
Coin <- function()
{
    # Define the object by calling the constructor for the base class
    me <- GeometricTrial()
    # Add the class name to the end of the list of class names
    class(me) <- append(class(me), "Coin")
    return(me)
}
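To see what these constructors produce, the following sketch repeats the GeometricTrial and Die constructors in condensed form and inspects the resulting class vector. One detail worth noticing is that calling class on a bare list returns the implicit class name list, so append keeps that name at the front of the vector.

```r
# Condensed copies of the constructors shown above.
GeometricTrial <- function() {
  me <- list(history = character(0))
  class(me) <- append(class(me), "GeometricTrial")
  me
}
Die <- function() {
  me <- GeometricTrial()
  class(me) <- append(class(me), "Die")
  me
}

aDie <- Die()
class(aDie)                        # "list" "GeometricTrial" "Die"
inherits(aDie, "GeometricTrial")   # TRUE
```

With this ordering, a method for GeometricTrial is found before a method for Die when the UseMethod command searches the vector from left to right.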
The GeometricTrial class includes four methods. The reset method behaves exactly like the reset method discussed in the previous section. The getHistory method is an accessor for a data element and is discussed in the following section. We will now discuss the simulation method, and a discussion on the singleTrial method will follow.
The simulation method is used to simulate a single experiment. The history is first cleared, and the singleTrial method is repeatedly called until a successful result is returned. We first define the base simulation function, the default simulation function, and then the simulation function used by the GeometricTrial class, as follows:
simulation <- function(theObject)
{
    UseMethod("simulation", theObject)
}
simulation.default <- function(theObject)
{
    warning("Default simulation method called on unrecognized object.")
    return(theObject)
}
## Define a method to run a simulation of a geometric trial.
simulation.GeometricTrial = function(theObject)
{
    theObject <- reset(theObject)    # Reset the history
                                     # before the trial.
    repeat
    {
        ## perform a single trial and add it to the history
        thisTrial <- singleTrial(theObject)
        theObject <- appendEvent(theObject, thisTrial$result)
        if (thisTrial$success)
        {
            break    # The trial resulted in a success. Time
                     # to stop!
        }
    }    # The trial was not a success. Keep going.
    return(theObject)
}
The effort to define a default function may not appear to be a worthwhile endeavor. However, this practice is generally employed to ensure that the system can responsibly react if the methods you define are called by mistake.
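The simulation method relies on an appendEvent method and on a reset method for the GeometricTrial class, neither of which is listed in the text. The following is only a minimal sketch of what they might look like, assuming the history is stored as a character vector in the history element; the book's own example code may differ.

```r
# Hypothetical helpers assumed by simulation.GeometricTrial.
reset <- function(theObject) UseMethod("reset", theObject)
reset.GeometricTrial <- function(theObject) {
  theObject$history <- character(0)   # clear the stored trials
  theObject
}

appendEvent <- function(theObject, event) UseMethod("appendEvent", theObject)
appendEvent.default <- function(theObject, event) {
  warning("appendEvent called on an unrecognized object")
  theObject
}
appendEvent.GeometricTrial <- function(theObject, event) {
  # Store every result as a character string so coins and dice share one type.
  theObject$history <- c(theObject$history, as.character(event))
  theObject
}

# Quick check on a hand-built object.
trial <- structure(list(history = character(0)), class = "GeometricTrial")
trial <- appendEvent(trial, 6)
trial <- appendEvent(trial, 1)
trial$history   # "6" "1"
```

Both helpers return the modified object, so the caller must assign the result back, just as the simulation method does.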
The final step is to define the singleTrial methods. This method is executed by the child classes, Die and Coin. Again, the base and default methods are created. In this case, though, there are also methods for each of the three classes. The base function calls the UseMethod function, which scrolls through the class attribute for the first function to call. We use a method for the GeometricTrial class to demonstrate the order of the calls as well as the NextMethod function. The NextMethod function continues the search in the class attribute and will call the next function based on the class names that follow the current class:
## Base function described in the text (it follows the same
## pattern as the simulation function).
singleTrial <- function(theObject)
{
    UseMethod("singleTrial", theObject)
}
singleTrial.default = function(theObject)
{
    ## Just generate a default success
    warning("Unrecognized object found for the singleTrial method")
    return(list(result = "1", success = TRUE))
}
singleTrial.GeometricTrial = function(theObject)
{
    NextMethod("singleTrial", theObject)
}
singleTrial.Coin = function(theObject)
{
    ## Perform a single coin flip
    value <- as.character(
        cut(as.integer(1 + trunc(runif(1, 0, 2))), c(0, 1, 2), labels = c("H", "T")))
    return(list(result = value, success = (value == "H")))
}
singleTrial.Die = function(theObject)
{
    ## Perform a single die roll
    value <- as.integer(1 + trunc(runif(1, 0, 6)))
    return(list(result = value, success = (value == 1)))
}
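The way NextMethod hands control to the next class name can be seen in isolation with a small sketch. The greet function and the Parent and Child class names are hypothetical; the class vector lists the parent first to mirror the ordering produced by the constructors in this chapter.

```r
greet <- function(x) UseMethod("greet", x)
greet.default <- function(x) "default"
greet.Parent <- function(x) {
  # Continue the search with the class names that follow "Parent".
  paste("Parent ->", NextMethod())
}
greet.Child <- function(x) "Child"

both <- structure(list(), class = c("Parent", "Child"))
only <- structure(list(), class = "Parent")

greet(both)   # "Parent -> Child"
greet(only)   # "Parent -> default"
```

When no class names remain after the current one, NextMethod falls back to the default method, as the second call shows.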
With these methods defined and the getHistory method defined in the following section, the class will be complete. Objects of the Coin and Die class can be created, and simulations can be executed, as follows:
> coin <- Coin()
> coin <- simulation(coin)
> getHistory(coin)
[1] H
Levels: H
> coin <- simulation(coin)
> getHistory(coin)
[1] T T H
Levels: H T
>
> die <- Die()
> die <- simulation(die)
> getHistory(die)
[1] 1
Levels: 1
> die <- simulation(die)
> getHistory(die)
[1] 6 5 5 6 2 1
Levels: 1 2 5 6
Encapsulation
The final method for the GeometricTrial class, getHistory, will now be defined. It is defined in a separate section to stress an important point. An S3 object is generally a basic data structure, such as a vector or a list, that has an additional class attribute defined. The functions that are defined for the class react to the class attribute in a predictable way.
One side effect is that every element of an object from a given class is public data. The elements contained within an object can always be accessed. The result is that when programming in R, we must take extra steps to maintain discipline with respect to accessing the data elements maintained by an object. Code that directly accesses data elements within an object may work when first written, but any change to the class constructor risks breaking code in the other methods defined for a class.
With respect to our previous example, we have an accessor, the getHistory method. If we have an object, called oneDie, from the Die class, we can easily get the history using oneDie$history. If we later decide to change the data structure used to store the history, then any code directly accessing this variable is likely to fail.
Instead, we write an accessor method, getHistory, which is designed to return the history in the form of a factor. It is important to maintain discipline and only use this method to get a copy of the history. Have a look at the following code:
getHistory <- function(theObject)
{
    UseMethod("getHistory", theObject)
}
getHistory.default <- function(theObject)
{
    return(factor())    # Just return an empty vector of factors
}
getHistory.GeometricTrial <- function(theObject)
{
    return(as.factor(theObject$history))
}
A final note
There is one final note to share about S3 classes. If you have used R, you most likely have used them. Many functions are defined to react according to the class name of their first argument. Have a look at the following diagram:
A common example of this is the plot command. If you type the plot command without arguments, you can see its definition, as follows:
> plot
function (x, y, ...)
UseMethod("plot")
<bytecode: 0x32fdd50>
<environment: namespace:graphics>
>
The plot command will react differently depending on what kind of object you passed to it. If you wish to see what classes the plot command can handle, you can use the methods command to list them:
> methods(plot)
 [1] plot.HoltWinters*   plot.TukeyHSD*      plot.acf*
 [4] plot.data.frame*    plot.decomposed.ts* plot.default
 [7] plot.dendrogram*    plot.density*       plot.ecdf
[10] plot.factor*        plot.formula*       plot.function
[13] plot.hclust*        plot.histogram*     plot.isoreg*
[16] plot.lm*            plot.medpolish*     plot.mlm*
[19] plot.ppr*           plot.prcomp*        plot.princomp*
[22] plot.profile.nls*   plot.spec*          plot.stepfun
[25] plot.stl*           plot.table*         plot.ts
[28] plot.tskernel*
   Non-visible functions are asterisked
>
One of the greatest advantages of the S3 class definition is that it is simple to build on what is already available. In the example from the previous section, I would like to have the plot command react appropriately according to whether or not I pass it an object of the type Die or Coin. Assuming that I have the previous classes, Die and Coin, defined, I merely have to define two new plot functions, as follows:
> plot.Die <- function(theDie, theTitle)
+ {
+     plot(getHistory(theDie),
+          xlab = "Value After A Die Roll", ylab = "Frequency",
+          main = theTitle)
+ }
>
> plot.Coin <- function(theCoin, theTitle)
+ {
+     plot(getHistory(theCoin),
+          xlab = "Value After Coin Flip", ylab = "Frequency",
+          main = theTitle)
+ }
> plot(aCoin, "This Here Trial")
> plot(aDie, "A More Better Trial")
It is common to use this idea to extend a number of commands. Some common examples include the print and the format functions.
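As an illustration of extending print, the following sketch defines a print method for objects whose class vector includes GeometricTrial. It is not taken from the book's example code, and it assumes the history is stored in a character vector named history, as in the constructor shown earlier.

```r
# Hypothetical print method for GeometricTrial objects.
print.GeometricTrial <- function(x, ...) {
  cat("Geometric trial with", length(x$history), "recorded event(s):",
      paste(x$history, collapse = " "), "\n")
  invisible(x)   # print methods conventionally return their argument invisibly
}

# Build an object by hand so the sketch stands on its own.
trial <- structure(list(history = c("T", "T", "H")), class = "GeometricTrial")
print(trial)
shown <- capture.output(print(trial))
```

Because R calls print automatically when an object is evaluated at the prompt, typing the name of the object would display the same summary.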
Summary
We have explored how to create S3 classes, and we did so in the context of two examples. The first example focused on how to define functions that will react based on the class name of the first argument given to the function. The first example did not make full use of basic object-oriented principles, as it is an attempt to simply introduce the idea of S3 classes. The second example extended the first example to provide a simple example of how inheritance is implemented in the context of an S3 class. It also provided a demonstration of how encapsulation is implemented under the framework of an S3 class.
One downside to the approach is that there is little type checking. It is possible to make changes to an object that can make it inconsistent with the original definition. When a change is made to an object, no checks are implemented to ensure that an object has the properties that are expected of it.
One way to avoid this issue is to make use of S4 classes. The approach associated with S4 classes is examined in the next chapter. Another advantage is that the S4 approach will look more familiar to those already familiar with object-oriented approaches to programming.
Chapter 9. S4 Classes
This chapter is the third part in our introduction to programming. We examined S3 classes in the previous chapter. We will now examine S4 classes. The approach associated with S3 classes is more flexible, and the approach associated with S4 classes is a more formal and structured definition.
This chapter is roughly divided into four parts:
Class definition: This section gives you an overview of how a class is defined and how the data (slots) associated with the class are specified
Class methods: This section gives you an overview of how methods that are associated with a class are defined
Inheritance: This section gives you an overview of how child classes that build on the definition of a parent class can be defined
Miscellaneous commands: This section explains four commands that can be used to explore a given object or class
Introducing the Ant class
We defined the idea of control flow structures in Chapter 7, Basic Programming, and introduced the idea of an S3 class in Chapter 8, S3 Classes. We will now introduce the idea of S4 classes, which is a more formal way to implement classes in R. One of the odd quirks of S4 classes is that you first define the class along with its data, and then you define the methods separately.
As a result of this separation in the way a class is defined, we will first discuss the general idea of how to define a class and its data. We will then discuss how to add a method to an existing class. Next, we will discuss how inheritance is implemented. Finally, we will provide a few notes about other options that do not fit nicely in the categories mentioned earlier.
In the previous chapter, we took an example and then modified it. The approach associated with an S4 class is less flexible and requires a bit more forethought in terms of how a class is defined. We will take a different approach in this chapter and create a complete class from the beginning. In this case, we will build on an idea proposed by Cole and Cheshire. The authors proposed a cellular automata simulation to mimic how ants move within a colony.
As part of a simulation, we will assume that we need an Ant class. We will depart from the paper and assume that the ants are not homogeneous. We will then assume that there are male (drones) and female ants, and the females can be either workers or soldiers. We will need an ant base class, which is discussed in the first two sections of this chapter as a means to demonstrate how to create an S4 class. In the third section, we will define a hierarchy of classes based on the original Ant class. This hierarchy includes male and female classes. The worker class will then inherit from the female class, and the soldier class will inherit from the worker class.
Defining an S4 class
We will define the base Ant class called Ant. The class is represented in the following figure. The class is used to represent the fundamental aspects that we need to track for an ant, and we focus on creating the class and data. The methods are constructed in a separate step and are examined in the next section.
A class is created using the setClass command. When creating the class, we specify the data in a character vector using the slots argument. The slots argument is a vector of character objects and represents the names of the data elements. These elements are often referred to as the slots within the class.
Some of the arguments that we will discuss here are optional, but it is a good practice to use them. In particular, we will specify a set of default values (the prototype) and a function to check whether the data is consistent (a validity function). Also, it is a good practice to keep all of the steps necessary to create a class within the same file. To that end, we assume that you will not be entering the commands from the command line. They are all found within a single file, so the formatting of the examples will reflect the lack of the R workspace markers.
The first step is to define the class using the setClass command. This command defines a new class by name, and it also returns a generator that can be used to construct an object for the new class. The first argument is the name of the class followed by the data to be included in the class. We will also include the default initial values and the definition of the function used to ensure that the data is consistent. The validity function can be set separately using the setValidity command. The data types for the slots are character values that match the names of the R data types which will be returned by the class command:
# Define the base Ant class.
Ant <- setClass(
    # Set the name of the class
    "Ant",
    # Name the data types (slots) that the class will track
    slots = c(
        Length = "numeric",         # the length (size) of this ant.
        Position = "numeric",       # the position of this ant.
                                    # (a 3 vector!)
        pA = "numeric",             # Probability that an ant will
                                    # transition from active to
                                    # inactive.
        pI = "numeric",             # Probability that an ant will
                                    # transition from inactive to
                                    # active.
        ActivityLevel = "numeric"   # The ant's current activity
                                    # level.
    ),
    # Set the default values for the slots. (optional)
    prototype = list(
        Length = 4.0,
        Position = c(0.0, 0.0, 0.0),
        pA = 0.05,
        pI = 0.1,
        ActivityLevel = 0.5
    ),
    # Make a function that can test to see if the data is consistent.
    # (optional)
    validity = function(object)
    {
        # Check to see if the activity level and length is
        # non-negative.
        # See the discussion on the @ notation in the text below.
        if (object@ActivityLevel < 0.0) {
            return("Error: The activity level is negative")
        } else if (object@Length < 0.0) {
            return("Error: The length is negative")
        }
        return(TRUE)
    }
)
With this definition, there are two ways to create an Ant object: one is using the new command and the other is using the Ant generator, which is created after the successful execution of the setClass command. Note that in the following examples, the default values can be overridden when a new object is created:
> ant1 <- new("Ant")
> ant1
An object of class "Ant"
Slot "Length":
[1] 4
Slot "Position":
[1] 0 0 0
Slot "pA":
[1] 0.05
Slot "pI":
[1] 0.1
Slot "ActivityLevel":
[1] 0.5
We can override the default values when creating a new object.
> ant2 <- new("Ant", Length = 4.5)
> ant2
An object of class "Ant"
Slot "Length":
[1] 4.5
Slot "Position":
[1] 0 0 0
Slot "pA":
[1] 0.05
Slot "pI":
[1] 0.1
Slot "ActivityLevel":
[1] 0.5
The object can also be created using the generator that is defined when creating the class using the setClass command.
> ant3 <- Ant(Length = 5.0, Position = c(3.0, 2.0, 1.0))
> ant3
An object of class "Ant"
Slot "Length":
[1] 5
Slot "Position":
[1] 3 2 1
Slot "pA":
[1] 0.05
Slot "pI":
[1] 0.1
Slot "ActivityLevel":
[1] 0.5
> class(ant3)
[1] "Ant"
attr(,"package")
[1] ".GlobalEnv"
> getClass(ant3)
An object of class "Ant"
Slot "Length":
[1] 5
Slot "Position":
[1] 3 2 1
Slot "pA":
[1] 0.05
Slot "pI":
[1] 0.1
Slot "ActivityLevel":
[1] 0.5
When the object is created and a validity function is defined, the validity function will determine whether the given initial values are consistent:
> ant4 <- Ant(Length = -1.0, Position = c(3.0, 2.0, 1.0))
Error in validObject(.Object) :
  invalid class "Ant" object: Error: The length is negative
> ant4
Error: object 'ant4' not found
In the last steps, the attempted creation of ant4, an error message is displayed. The new variable, ant4, was not created. If you wish to test whether the object was created, you must be careful to ensure that the variable name used does not exist prior to the attempted creation of the new object. Also, the validity function is only executed when a request to create a new object is made. If you change the values of the data later, the validity function is not called.
Before we move on to discuss methods, we need to figure out how to get access to the data within an object. The syntax is different from other data structures, and we use @ to indicate that we want to access an element from within the object. This can be used to get a copy of the value or to set the value of an element:
> adomAnt <- Ant(Length = 5.0, Position = c(-1.0, 2.0, 1.0))
> adomAnt@Length
[1] 5
> adomAnt@Position
[1] -1 2 1
> adomAnt@ActivityLevel = -5.0
> adomAnt@ActivityLevel
[1] -5
Note that in the preceding example, we set a value for the activity level that is not allowed according to the validity function. Since it was set after the object was created, no check is performed. The validity function is only executed during the creation of the object or if the validObject function is called.
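A short sketch shows how the validObject function can be called by hand to catch this kind of change. The Gauge class and its level slot are hypothetical stand-ins, used so that the example does not depend on the Ant definition.

```r
library(methods)

# Hypothetical class with a single slot that must be non-negative.
setClass("Gauge",
         slots = c(level = "numeric"),
         prototype = list(level = 0.5),
         validity = function(object) {
           if (object@level < 0) return("level is negative")
           TRUE
         })

g <- new("Gauge", level = 0.2)
g@level <- -5   # accepted: direct slot assignment does not run the validity function

# Calling validObject explicitly runs the check and signals an error.
result <- tryCatch(validObject(g), error = function(e) conditionMessage(e))
result
```

The error message captured in result includes the string returned by the validity function, which makes an explicit validObject call a convenient final step in any method that changes a slot.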
One final note: it is generally bad form to work directly with an element within an object, and a better practice is to create methods that obtain or change an individual element within an object. It is a best practice to be careful about the encapsulation of an object’s slots. The R environment does not recognize the idea of private versus public data, and the onus is on the programmer to maintain discipline with respect to this important principle.
Defining methods for an S4 class
When a new class is defined, the data elements are defined, but the methods associated with the class are defined in a separate stage. Methods are implemented in a manner similar to the one used for S3 classes. A function is defined, and the way the function reacts depends on its arguments. If a method is used to change one of the data components of an object, then it must return a copy of the object, just as we saw with S3 classes.
The creation of new methods is discussed in two steps. We will first discuss how to define a method for a class where the method does not yet exist. Next, we will discuss some predefined methods that are available and how to extend them to accommodate a new class.
Defining new methods
The first step to create a new method is to reserve the name. Some functions are included by default, such as the initialize, print, or show commands, and we will later see how to extend them. To reserve a new name, you must first use the setGeneric command. At the very least, you need to give this command the name of the function as a character string. As in the previous section, we will use more options as an attempt to practice safe programming.
The methods to be created are shown in the preceding figure. There are a number of methods, but we will only define four here. All of the methods are accessors; they are used to either get or set values of the data components. We will only define the methods associated with the length slot in this text, and you can see the rest of the code in the examples available on the website. The other methods closely follow the code used for the length slot. There are two methods to set the activity level, and that code is examined separately to provide an example of how a method can be overloaded.
First, we will define the methods to get and set the length. We will first create the method to get the length, as it is a little more straightforward. The first step is to tell R that a new function will be defined, and the name is reserved using the setGeneric command. The method that is called when an Ant object is passed to the command is defined using the setMethod command:
setGeneric(name = "GetLength",
           def = function(antie)
           {
               standardGeneric("GetLength")
           }
)
setMethod(f = "GetLength",
          signature = "Ant",
          definition = function(antie)
          {
              return(antie@Length)
          }
)
Now that the GetLength function is defined, it can be used to get the length component for an Ant object:
> ant2 <- new("Ant", Length = 4.5)
> GetLength(ant2)
[1] 4.5
The method to set the length is similar, but there is one difference. The method must return a copy of the object passed to it, and it requires an additional argument:
setGeneric(name = "SetLength",
           def = function(antie, newLength)
           {
               standardGeneric("SetLength")
           }
)
setMethod(f = "SetLength",
          signature = "Ant",
          definition = function(antie, newLength)
          {
              if (newLength > 0.0) {
                  antie@Length = newLength
              } else {
                  warning("Error - invalid length passed")
              }
              return(antie)
          }
)
When setting the length, the new object must be set using the object that is passed back from the function:
> ant2 <- new("Ant", Length = 4.5)
> ant2@Length
[1] 4.5
> ant2 <- SetLength(ant2, 6.25)
> ant2@Length
[1] 6.25
Polymorphism
The definition of S4 classes allows methods to be overloaded. That is, multiple functions that have the same name can be defined, and the function that is executed is determined by the arguments’ types. We will now examine this idea in the context of defining the methods used to set the activity level in the Ant class.
Two or more functions can have the same name, but the types of the arguments passed to them differ. There are two methods to set the activity level. One takes a floating point number and sets the activity level based on the value passed to it. The other takes a logical value and sets the activity level to zero if the argument is FALSE; otherwise, it sets it to a default value.
The idea is to use the signature option in the setMethod command. It is set to a vector of class names, and the order of the class names is used to determine which function should be called for a given set of arguments. An important thing to note, though, is that the prototype defined in the setGeneric command defines the names of the arguments, and the argument names in both methods must be exactly the same and in the same order:
setGeneric(name = "SetActivityLevel",
           def = function(antie, activity)
           {
               standardGeneric("SetActivityLevel")
           }
)
setMethod(f = "SetActivityLevel",
          signature = c("Ant", "logical"),
          definition = function(antie, activity)
          {
              if (activity) {
                  antie@ActivityLevel = 0.1
              } else {
                  antie@ActivityLevel = 0.0
              }
              return(antie)
          }
)
setMethod(f = "SetActivityLevel",
          signature = c("Ant", "numeric"),
          definition = function(antie, activity)
          {
              if (activity >= 0.0) {
                  antie@ActivityLevel = activity
              } else {
                  warning("The activity level cannot be negative")
              }
              return(antie)
          }
)
Once the two methods are defined, R will use the class names of the arguments to determine which function to call in a given context:
> ant2 <- SetActivityLevel(ant2, 0.1)
> ant2@ActivityLevel
[1] 0.1
> ant2 <- SetActivityLevel(ant2, FALSE)
> ant2@ActivityLevel
[1] 0
There are two additional data types recognized by the signature option: ANY and missing. These can be used to match any data type or a missing argument. Also note that we have left out the use of ellipses (…) for the arguments in the preceding examples. The … argument must be the last argument and is used to indicate that any remaining parameters are passed as they appear in the original call to the function. Ellipses can make the overloaded functions more flexible than the examples here indicate. More information can be found using the help(dotsMethods) command.
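The following sketch shows one way the ANY and missing types might be used in a signature. The ScaleBy generic and its argument names are hypothetical and are not part of the Ant example.

```r
library(methods)

# Hypothetical generic with two arguments.
setGeneric("ScaleBy", function(x, amount) standardGeneric("ScaleBy"))

# Called when the second argument is omitted.
setMethod("ScaleBy", signature("numeric", "missing"),
          function(x, amount) x * 2)

# Called when the second argument is numeric.
setMethod("ScaleBy", signature("numeric", "numeric"),
          function(x, amount) x * amount)

# Fallback for any other type of second argument.
setMethod("ScaleBy", signature("numeric", "ANY"),
          function(x, amount) stop("amount must be numeric"))

ScaleBy(3)       # 6
ScaleBy(3, 4)    # 12
```

A call such as ScaleBy(3, "a") is routed to the ANY method, because a character value matches neither the missing nor the numeric signature.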
Extending the existing methods
There are a number of generic functions defined in a basic R session, and we will examine how to extend an existing function. For example, the show command is a generic function whose behavior depends on the class name of the object passed to it. Since the function name is already reserved, the setGeneric command is not used to reserve the function name.
The show command is a standard example. The command takes an object and converts it to a character value to be displayed. The command defines how other commands print out and express an object. In the following example, a new class called Coordinate is defined; this keeps track of two values, x and y, for a coordinate, and we will add one method to set the values of the coordinate:
# Define the base coordinates class.
Coordinate <- setClass(
    # Set the name of the class
    "Coordinate",
    # Name the data types (slots) that the class will track
    slots = c(
        x = "numeric",   # the x position
        y = "numeric"    # the y position
    ),
    # Set the default values for the slots. (optional)
    prototype = list(
        x = 0.0,
        y = 0.0
    ),
    # Make a function that can test to see if the data is consistent.
    # (optional)
    # This is not called if you have an initialize function defined!
    validity = function(object)
    {
        # Check to see if the coordinate is outside of a circle of
        # radius 100
        print("Checking the validity of the point")
        if (object@x * object@x + object@y * object@y > 100.0 * 100.0) {
            return(paste("Error: The point is too far",
                         "away from the origin."))
        }
        return(TRUE)
    }
)
# Add a method to set the value of a coordinate
setGeneric(name = "SetPoint",
           def = function(coord, x, y)
           {
               standardGeneric("SetPoint")
           }
)
setMethod(f = "SetPoint",
          signature = "Coordinate",
          def = function(coord, x, y)
          {
              print("Setting the point")
              coord@x = x
              coord@y = y
              return(coord)
          }
)
We will now extend the show method so that it can properly react to a coordinate object. As it is reserved, we do not have to use the setGeneric command but can simply define it:
setMethod(f = "show",
          signature = "Coordinate",
          def = function(object)
          {
              cat("The coordinate is X: ", object@x, " Y: ", object@y, "\n")
          }
)
As noted previously, the signature option must match the original definition of a function that you wish to extend. You can use the getMethod('show') command to examine the signature for the function. With the new method in place, the show command is used to convert a coordinate object to a string when it is printed:
> point <- Coordinate(x = 1, y = 5)
[1] "Checking the validity of the point"
> print(point)
The coordinate is X:  1  Y:  5
> point
The coordinate is X:  1  Y:  5
Another important predefined method is the initialize command. If the initialize command is created for a class, then it is called when a new object is created. That is, you can define an initialize function to act as a constructor. If an initialize function is defined for a class in the way shown here, the validator is not called. You have to manually call the validator using the validObject command. Also note that the prototype for the initialize command requires the first argument to be named .Object, and the default values are given for the remaining arguments in case a new object is created without specifying any values for the slots:
setMethod(f = "initialize",
          signature = "Coordinate",
          def = function(.Object, x = 0.0, y = 0.0)
          {
              print("Checking the point")
              .Object = SetPoint(.Object, x, y)
              validObject(.Object)   # you must explicitly call the
                                     # inspector
              return(.Object)
          }
)
Now, when you create a new object, the new initialize function is called immediately:
> point <- Coordinate(x = 2, y = 3)
[1] "Checking the point"
[1] "Setting the point"
[1] "Checking the validity of the point"
> point
The coordinate is X:  2  Y:  3
Using the initialize and validity functions together can result in surprising code paths. This is especially true when inheriting from one class and calling the initialize function of a parent class from the child class. It is important to test your code to ensure that it is executing in the order that you expect. Personally, I try to use either a validator or a constructor, but not both.
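One alternative arrangement, offered here only as a sketch and not as the approach used in this chapter, is to have a custom initialize method pass its arguments on to the default initialize method with callNextMethod. The default method fills in the slots and, when slot values are supplied, runs the validity function, so the custom method does not need to call validObject itself. The Temperature class and its kelvin slot are hypothetical.

```r
library(methods)

setClass("Temperature",
         slots = c(kelvin = "numeric"),
         prototype = list(kelvin = 273.15),
         validity = function(object) {
           if (any(object@kelvin < 0)) return("kelvin must be non-negative")
           TRUE
         })

# Custom constructor that delegates slot assignment to the default method.
setMethod("initialize", "Temperature",
          function(.Object, ...) {
            .Object <- callNextMethod(.Object, ...)
            # Additional set-up steps could go here.
            .Object
          })

ok  <- new("Temperature", kelvin = 300)
bad <- tryCatch(new("Temperature", kelvin = -5),
                error = function(e) conditionMessage(e))
```

In this sketch, ok is created normally, while bad holds the error message produced when the validity function rejects the negative value.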
Inheritance
The Ant class discussed in the first section of this chapter provided an example of how to define a class and then define the methods associated with the class. We will now extend the class by creating new classes that inherit from the base class. The original Ant class is shown in the preceding figure, and now, we will propose four classes that inherit from the base class. Two new classes that inherit from Ant are the Male and Female classes. The Worker class inherits from the Female class, while the Soldier class inherits from the Worker class. The relationships are shown in the following figure. The code for all of the new classes is included in our example codes available at our website, but we will only focus on two of the new classes in the text to keep our discussion more focused.
Relationships between the classes that inherit from the base Ant class
When a new class is created, it can inherit from an existing class by setting the contains parameter. This can be set to a vector of classes for multiple inheritance. However, we will focus on single inheritance here to avoid discussing the complications associated with determining how R finds a method when there are collisions. Assuming that the Ant base class given in the first section has already been defined in the current session, the child classes can be defined. The details for the two classes, Female and Worker, are discussed here.
First, the FemaleAnt class is defined. It adds a new slot, Food, and inherits from the Ant class. Before defining the FemaleAnt class, we add a caveat about the Ant class. The base Ant class should have been a virtual class. We would not ordinarily create an object of the Ant class. We did not make it a virtual class in order to simplify our introduction. We are wiser now and wish to demonstrate how to define a virtual class. The FemaleAnt class will be a virtual class to demonstrate the idea. We will make it a virtual class by including the VIRTUAL character string in the contains parameter, and it will not be possible to create an object of the FemaleAnt class:
# Define the female ant class.
FemaleAnt <- setClass(
    # Set the name of the class
    "FemaleAnt",
    # Name the data types (slots) that the class will track
    slots = c(
        Food = "numeric"   # The number of food units carried
    ),
    # Set the default values for the slots. (optional)
    prototype = list(
        Food = 0
    ),
    # Make a function that can test to see if the data is consistent.
    # (optional)
    # This is not called if you have an initialize function defined!
    validity = function(object)
    {
        print("Validity: FemaleAnt")
        # Check to see if the number of food units is non-negative.
        if (object@Food < 0) {
            return("Error: The number of food units is negative")
        }
        return(TRUE)
    },
    # This class inherits from the Ant class
    contains = c("Ant", "VIRTUAL")
)
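The effect of a virtual class can be checked with a short, independent sketch. The Shape and Circle classes are hypothetical, and the sketch declares the virtual class with representation("VIRTUAL"), which is a different spelling from the contains form used in the text.

```r
library(methods)

# A virtual parent class and a concrete child class (hypothetical names).
setClass("Shape", representation("VIRTUAL"))
setClass("Circle", contains = "Shape", slots = c(r = "numeric"))

isVirtualClass("Shape")    # TRUE

# Trying to instantiate the virtual class signals an error.
attempt <- tryCatch(new("Shape"), error = function(e) "cannot instantiate")

# The concrete child class can be created and still counts as a Shape.
circle <- new("Circle", r = 1)
is(circle, "Shape")        # TRUE
```

The same pattern applies to the FemaleAnt class: only classes derived from it can be instantiated.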
Now, we will define a WorkerAnt class that inherits from the FemaleAnt class:
# Define the worker ant class.
WorkerAnt <- setClass(
    # Set the name of the class
    "WorkerAnt",
    # Name the data types (slots) that the class will track
    slots = c(
        Foraging = "logical",   # Whether or not the ant is actively
                                # looking for food
        Alarm = "logical"       # Whether or not the ant is actively
                                # announcing an alarm.
    ),
    # Set the default values for the slots. (optional)
    prototype = list(
        Foraging = FALSE,
        Alarm = FALSE
    ),
    # Make a function that can test to see if the data is consistent.
    # (optional)
    # This is not called if you have an initialize function defined!
    validity = function(object)
    {
        print("Validity: WorkerAnt")
        return(TRUE)
    },
    # This class inherits from the FemaleAnt class
    contains = "FemaleAnt"
)
When a new worker is created, it inherits from the FemaleAnt class:
> worker <- WorkerAnt(Position = c(-1, 3, 5), Length = 2.5)
> worker
An object of class "WorkerAnt"
Slot "Foraging":
[1] FALSE
Slot "Alarm":
[1] FALSE
Slot "Food":
[1] 0
Slot "Length":
[1] 2.5
Slot "Position":
[1] -1 3 5
Slot "pA":
[1] 0.05
Slot "pI":
[1] 0.1
Slot "ActivityLevel":
[1] 0.5
> worker <- SetLength(worker, 3.5)
> GetLength(worker)
[1] 3.5
We have not defined the relevant methods in the preceding examples. The code is available in our set of examples, and we will not discuss most of it to keep this discussion more focused. We will examine the initialize method, though. The reason to do so is to explore the callNextMethod command. The callNextMethod command is used to request that R search for and execute a method of the same name that is a member of a parent class.
We chose the initialize method because a common task is to build a chain of constructors in which each constructor initializes the data that belongs to its own class. We have not yet created any of the initialize methods and start with the base Ant class:
setMethod(f="initialize",
    signature="Ant",
    def=function(.Object,Length=4,Position=c(0.0,0.0,0.0))
    {
        print("Ant initialize")
        .Object = SetLength(.Object,Length)
        .Object = SetPosition(.Object,Position)
        #validObject(.Object) # you must explicitly call the
                              # inspector
        return(.Object)
    }
    )
The constructor takes three arguments: the object itself (.Object), the length, and the position of the ant. Default values are given in case none are provided when a new object is created. The validObject command is commented out. You should try uncommenting the line and creating new objects to see whether the validator can in turn call the initialize method. Another important feature is that the initialize method returns a copy of the object.
The initialize method is then created for the FemaleAnt class. The arguments to the initialize method of the parent class should be respected when the request is made through callNextMethod:
setMethod(f="initialize",
    signature="FemaleAnt",
    def=function(.Object,Length=4,Position=c(0.0,0.0,0.0))
    {
        print("FemaleAnt initialize")
        .Object <- callNextMethod(.Object,Length,Position)
        #validObject(.Object) # you must explicitly call the
                              # inspector
        return(.Object)
    }
    )
The callNextMethod command is used to call the initialize method associated with the Ant class. The arguments are arranged to match the definition of the initialize method for the Ant class, and it returns a new copy of the current object.
Finally, the initialize function for the WorkerAnt class is created. It also makes use of callNextMethod to ensure that the method of the same name associated with the parent class is also called:
setMethod(f="initialize",
    signature="WorkerAnt",
    def=function(.Object,Length=4,Position=c(0.0,0.0,0.0))
    {
        print("WorkerAnt initialize")
        .Object <- callNextMethod(.Object,Length,Position)
        #validObject(.Object) # you must explicitly call the
                              # inspector
        return(.Object)
    }
    )
Now, when a new object of the WorkerAnt class is created, the initialize method associated with the WorkerAnt class is called, and each associated method for each parent class is called in turn:
> worker <- WorkerAnt(Position=c(-1,3,5),Length=2.5)
[1] "WorkerAnt initialize"
[1] "FemaleAnt initialize"
[1] "Ant initialize"
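The same chaining idea can be tried on a much smaller hierarchy. The sketch below is not the book's code: the DemoBase and DemoChild classes and the callOrder vector are hypothetical names introduced for illustration. It also uses a slightly different convention from the methods above, passing unrecognized arguments up the chain through the ... argument rather than by position:

```r
library(methods)

# Hypothetical two-level hierarchy used only to trace the call order.
setClass("DemoBase", slots = c(Length = "numeric"))
setClass("DemoChild", slots = c(Food = "numeric"), contains = "DemoBase")

callOrder <- character(0)   # records which initialize methods ran

setMethod("initialize", "DemoBase",
    function(.Object, ..., Length = 4) {
        callOrder <<- c(callOrder, "DemoBase")
        .Object <- callNextMethod(.Object, ...)   # default initialize
        .Object@Length <- Length
        .Object
    })

setMethod("initialize", "DemoChild",
    function(.Object, ..., Food = 0) {
        callOrder <<- c(callOrder, "DemoChild")
        # Anything in ... (here, Length) is handed to the parent class.
        .Object <- callNextMethod(.Object, ...)
        .Object@Food <- Food
        .Object
    })

obj <- new("DemoChild", Length = 2.5, Food = 1)
print(callOrder)
print(obj@Length)
```

The child method runs first and the parent method runs when callNextMethod is reached, which is the same order as the printed messages for the WorkerAnt object.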
Miscellaneous notes
In the previous sections, we discussed how to create a new class as well as how to define a hierarchy of classes. We will now discuss four commands that are helpful when working with classes: the slotNames, getSlots, getClass, and slot commands. Each command is briefly discussed in turn, and it is assumed that the Ant, FemaleAnt, and WorkerAnt classes that are given in the previous section are defined in the current workspace.
The first command, the slotNames command, is used to list the data components of an object of some class. It returns the names of the components as a character vector:
> worker <- WorkerAnt(Position=c(1,2,3),Length=5.6)
> slotNames(worker)
[1] "Foraging"      "Alarm"         "Food"          "Length"
[5] "Position"      "pA"            "pI"            "ActivityLevel"
The getSlots command is similar to the slotNames command. The difference is that the argument is a character string that is the name of the class you want to investigate, and the result also includes the type of each slot:
> getSlots("WorkerAnt")
     Foraging         Alarm          Food        Length      Position
    "logical"     "logical"     "numeric"     "numeric"     "numeric"
           pA            pI ActivityLevel
    "numeric"     "numeric"     "numeric"
The getClass command has two forms. If the argument is an object, the command will print out the details for the object. If the argument is a character string, then it will print out the details for the class whose name is the same as the argument:
> worker <- WorkerAnt(Position=c(1,2,3),Length=5.6)
> getClass(worker)
An object of class "WorkerAnt"
Slot "Foraging":
[1] FALSE
Slot "Alarm":
[1] FALSE
Slot "Food":
[1] 0
Slot "Length":
[1] 5.6
Slot "Position":
[1] 1 2 3
Slot "pA":
[1] 0.05
Slot "pI":
[1] 0.1
Slot "ActivityLevel":
[1] 0.5
> getClass("WorkerAnt")
Class "WorkerAnt" [in ".GlobalEnv"]
Slots:
Name:       Foraging         Alarm          Food        Length      Position
Class:       logical       logical       numeric       numeric       numeric
Name:             pA            pI ActivityLevel
Class:       numeric       numeric       numeric
Extends:
Class "FemaleAnt", directly
Class "Ant", by class "FemaleAnt", distance 2
Known Subclasses: "SoldierAnt"
Finally, we will examine the slot command. The slot command is used to retrieve the value of a slot for a given object based on the name of the slot:
> worker <- WorkerAnt(Position=c(1,2,3),Length=5.6)
> slot(worker,"Position")
[1] 1 2 3
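The four commands can also be tried without the ant classes. The following sketch assumes a small hypothetical class, DemoPoint, that is not part of the book's code. It shows that slot(object, "name") returns the same value as the @ operator, and that the replacement form of slot can be used to change a value:

```r
library(methods)

# Hypothetical class used only to exercise the inspection commands.
setClass("DemoPoint",
         slots = c(x = "numeric", label = "character"),
         prototype = list(x = 0, label = "origin"))

p <- new("DemoPoint", x = c(1, 2, 3))

print(slotNames(p))              # names of the slots of an object
print(getSlots("DemoPoint"))     # slot names and types of a class
print(slot(p, "x"))              # same value as p@x
print(identical(slot(p, "x"), p@x))

# The replacement form sets a slot by name.
slot(p, "label") <- "moved"
print(p@label)
```

The slot command is most useful when the name of the slot is only known at run time, for example when it is held in a character variable.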
Summary
We introduced the idea of an S4 class and provided several examples. An S4 class is constructed in at least two stages. The first stage is to define the name of the class and the associated data components. The methods associated with the class are then defined in a separate step.
In addition to defining a class and its methods, the idea of inheritance was explored. A partial example was given in this chapter; it built on a base class defined in the first section of the chapter. Additionally, the way to call the associated methods in parent classes was explored, and the example made use of the constructor (or initialize method) to demonstrate how to build a chain of constructors.
Finally, four useful commands were explained. The four commands offer different ways to get information about a class or about an object of a given class.
For more information, you can refer to Mobile Cellular Automata Models of Ant Behavior: Movement Activity of Leptothorax allardycei, Blaine J. Cole and David Cheshire, The American Naturalist.
Chapter 10. Case Study – Course Grades
This chapter is our first case study. We bring together the ideas from the previous chapters and provide an extended example. Some new ideas are introduced, and they should make more sense in the context of a full example.
This chapter is roughly divided into four parts:
The Course class: This section gives you an overview of the S4 class that will contain a list of grades for a course. The grades are kept in a list with each graded task as a separate object.
The assignment classes: This section gives you an overview of the S4 classes used to keep the grades for a specific graded task. The base class has two derived classes. One derived class is to keep track of grades that have a numeric score, and the other is to keep track of grades that consist of letters.
Extending existing functions: This section includes a brief discussion of how existing functions can be extended to react in an appropriate way to an object that is one of the assignment classes. We focus on the summary, plot, and show commands.
Extending operations: This section includes a discussion on how to extend arithmetic and access functions by building on existing methods.
Overview
All the previous chapters focus on specific topics, and here we bring together a number of different topics to examine an extended example. All of the classes examined here are S4 classes. These classes are used to read in a CSV file that contains the grades for students in a class. There is one class that defines how to read a grade file and how to interpret a column. Another set of classes is used to track the information for a single assignment. An object of one of these classes holds the scores of all students for that assignment.
We first discuss the new Course class, which is used to keep track of all the assignments. Next, we discuss the assignment classes that are used to keep track of the grades for a single assignment. Once the classes are defined, the details on extending the summary, plot, and show commands are given to demonstrate how common commands can be extended to react when passed a newly defined object. Finally, some basic arithmetic and accessor operators are redefined so that the R system will react in an appropriate manner when using familiar operations, such as addition or multiplication.
The Course class includes objects whose type is one of the assignment classes, and it seems more natural to define the assignment classes first. The actions of the assignment classes are more intricate and include more details, however, while the Course class is relatively straightforward. To keep the introduction to the classes more gentle, we'll first discuss the Course class.
The Course class
The Course class is used to keep track of all of the grades for a course. This class will read in the information from a file, decide whether grades are numeric grades or letter grades, and create an appropriate assignment object to hold the grades. The details for the class are shown in the following figure. We'll first discuss the data and then the methods associated with the class. We'll then give some details on how to define the class. Most of the code associated with the accessors is omitted for the sake of brevity, but the full code is available on the website associated with this book.
First, there are three data structures (slots) associated with the class. The first is the name of the CSV file that contains all of the grades for the class. The second is a vector of prefixes used to determine what kind of graded tasks are in the file. The last is a list that contains all of the assignments.
There are seven methods. All but one are used to set and get the values for the three slots. The accessor routines to get and set the filename are given here, and the others are not included in the text. The last routine is used to read the grades. It is a complex method that reads the data from the file, determines whether each assignment has numerical or letter grades, and then creates an appropriate assignment object.
The definition of the Course class
The Course class definition is given and then the accessors for the slots are stated. It is an S4 class. To reduce the complexity of the example, we do not provide any checks to ensure that the information in the slots is consistent:
###############################################
# Create the Course class to keep track of all grades
Course <- setClass(
    # Set the name for the class
    "Course",
    # Define the slots
    slots = c(
        GradesFile = "character",
        GradeTypes = "character",
        Grades = "list"
        ),
    # Set the default values for the slots. (optional)
    prototype = list(
        GradesFile = "",
        GradeTypes = c("test","hw","quiz","project"),
        Grades = list()
        ),
    # Make a function that can test to see if the data is consistent.
    # This is not called if you have an initialize function defined!
    validity = function(object)
    {
        return(TRUE)
    }
    )
Each of the slots has a set of accessor functions to assist in retrieving or setting the information tracked by the class. We only provide the details for one set of accessors. The methods to set and get the filename are given. Neither of these methods exists in the default R environment, so each generic function must be created first, as follows:
# Define the methods used to retrieve or set the values within a Course
# object.
setGeneric(name="GetFileName",
    def=function(course)
    {
        standardGeneric("GetFileName")
    }
    )
setMethod(f="GetFileName",
    signature="Course",
    definition=function(course)
    {
        return(course@GradesFile)
    }
    )
setGeneric(name="SetFileName",
    def=function(course,fileName)
    {
        standardGeneric("SetFileName")
    }
    )
setMethod(f="SetFileName",
    signature="Course",
    definition=function(course,fileName)
    {
        course@GradesFile = fileName
        return(course)
    }
    )
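One detail of the accessor pattern is easy to overlook: a setter cannot change the object in place. It changes a copy and returns it, so the caller must assign the result back. The following sketch illustrates this with a hypothetical class, DemoFileHolder, and a hypothetical generic, DemoSetFile, neither of which is part of the book's code:

```r
library(methods)

# Hypothetical class and setter used only to show the copy semantics.
setClass("DemoFileHolder",
         slots = c(GradesFile = "character"),
         prototype = list(GradesFile = ""))

setGeneric("DemoSetFile",
    function(holder, fileName) standardGeneric("DemoSetFile"))

setMethod("DemoSetFile", "DemoFileHolder",
    function(holder, fileName) {
        holder@GradesFile <- fileName   # changes the local copy only
        holder                          # the modified copy is returned
    })

holder <- new("DemoFileHolder")

# The returned copy is discarded, so holder itself is unchanged.
invisible(DemoSetFile(holder, "ignored.csv"))
unchanged <- holder@GradesFile

# Assigning the result back keeps the change.
holder <- DemoSetFile(holder, "shortList.csv")
print(unchanged)
print(holder@GradesFile)
```

This is why every call to a setter in this chapter has the form course <- SetFileName(course, ...).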
Reading grades from a file
The last method examined is the method to read the data from a given file. We will examine the Grades list within the Course class before providing a listing. This slot is a list, and each element within the list is an assignment object. (The assignment classes are discussed in the next section of this chapter.) The name of the object within the list is the same as the name found in the first row of the CSV file.
The ReadGrades method first reads the CSV file. It then goes through the columns that were read from the file, and the names of the columns are assumed to be in the first row of the file. If the name of a column starts with one of the strings in the vector found in the GradeTypes slot, then it is assumed that the column represents a graded item. There are two kinds of assignments: numerical grades or letter grades. If a column from the data file is determined to be a graded item, then its type is checked. If the type is a numeric type, then the column is treated as a numeric grade (an object from the NumericGrade class); otherwise, it is treated as a letter grade (an object from the LetterGrade class):
setGeneric(name="ReadGrades",
    def=function(course)
    {
        standardGeneric("ReadGrades")
    }
    )
setMethod(f="ReadGrades",
    signature="Course",
    definition=function(course)
    {
        grades <- read.csv(GetFileName(course))
        convertedGrades <- list()
        courseTypes <- GetGradeTypes(course)
        for(gradeItem in names(grades))
        {
            # Go through each column from the file.
            for(type in courseTypes)
            {
                # Go through each course type and determine if
                # this column is a quiz/test/hw/?
                # The "^" anchors the match to the start of the name.
                if(length(grep(paste("^",type,sep=""),gradeItem))>0)
                {
                    # The prefix for the name matches one of
                    # the predefined types.
                    if(is.numeric(grades[[gradeItem]])) {
                        thisItem <- NumericGrade()
                        #print(paste("This item,",gradeItem,
                        #            ", is a numeric grade item.",
                        #            class(thisItem)))
                    } else {
                        thisItem <- LetterGrade()
                        #print(paste("This item,",gradeItem,
                        #            ", is a letter grade.",
                        #            class(thisItem)))
                    }
                    # Convert the values into their
                    # respective grades.
                    thisItem <- SetValue(thisItem,
                                         grades[[gradeItem]])
                    #print(paste("class: ",class(thisItem)))
                    convertedGrades[[gradeItem]] <- thisItem
                }
            }
        }
        return(SetGrades(course,convertedGrades))
    }
    )
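The decision made for each column can be examined separately from the classes. The following sketch is an illustration rather than the book's method: the classifyColumn helper is a hypothetical name, the prefix test uses the base R startsWith function, and the data consists of the first three rows of the shortList.csv example used in this chapter, read from a string instead of a file:

```r
# A few rows in the layout of shortList.csv, held in a string so
# that the sketch does not depend on a file being present.
csvText <- "section,test1,project1,final
3,75,B+,89
2,68,B,94
3,98,B+,110"

grades <- read.csv(text = csvText)
gradeTypes <- c("test", "hw", "quiz", "project", "final")

# Hypothetical helper: decide how one column would be treated.
classifyColumn <- function(columnName, column, types) {
    # A column is a graded item if its name starts with a known prefix.
    if (!any(startsWith(columnName, types))) {
        return("ignored")
    }
    # Numeric columns become numeric grades; anything else is a letter.
    if (is.numeric(column)) "numeric" else "letter"
}

kinds <- vapply(names(grades),
                function(n) classifyColumn(n, grades[[n]], gradeTypes),
                character(1))
print(kinds)
```

The section column is ignored because its name does not start with any of the prefixes, while test1 and final are numeric and project1 is a letter grade.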
The Course class is used to organize the grades for a whole class. The scores for an individual assignment are kept in one of the assignment classes, which are examined in the following section.
The assignment classes
There are two assignment classes, and they are both derived from the Assignment class. The first class is the NumericGrade class, which keeps track of a numeric grade. The second class is the LetterGrade class, which keeps track of a letter grade. The details of the classes are shown in the next figure. The definition of the Assignment class is given in the following figure, and the details about the accessors are omitted for the sake of brevity. The complete code is available on the website for this book.
The details of the NumericGrade and LetterGrade classes are given in separate subsections in this chapter. Once the classes are given, examples are provided to demonstrate how to use the Course class to read the grades from a file and create the necessary assignment objects.
The Assignment class is the base class for the NumericGrade and LetterGrade classes. The class only has two slots: the name and a number. The name is used to display the information related to the assignment, and the number can further identify the set of grades. For example, the grades for test 3 from a class might have the name Test and the number set to the value of 3.
The definition for the class is given here:
###############################################
# Create the base assignment class
#
# This is used to represent the grades for one assignment.
Assignment <- setClass(
    # Set the name for the class
    "Assignment",
    # Define the slots
    slots = c(
        Name = "character",
        Number = "numeric"
        ),
    # Set the default values for the slots. (optional)
    prototype = list(
        Name = "Test",
        Number = as.integer(1)
        ),
    # Make a function that can test to see if the data is consistent.
    # This is not called if you have an initialize function defined!
    validity = function(object)
    {
        if(object@Number < 0) {
            # Note: this only changes the local copy of the object.
            # A validity function cannot modify the object it checks.
            object@Number <- 0
            warning(paste("A negative number for the assignment",
                          "number is passed. It is set to zero."))
        }
        return(TRUE)
    }
    )
The primary purpose of the Assignment class is to act as the base class for other types of assignments. The two classes, NumericGrade and LetterGrade, that inherit from the Assignment class are discussed in the following sections.
The NumericGrade class
The first class we examine that inherits from the Assignment class is the NumericGrade class. This class is used to retain the grades for an assignment whose grades are numbers. The class has only one slot of its own, a numeric vector of grades. The definition is given here:
##############################################
# Create the class to keep track of the grades that are numeric in
# nature.
#
# This is used to represent the numeric grades for one assignment.
NumericGrade <- setClass(
    # Set the name for the class with a numeric grade associated with
    # it.
    "NumericGrade",
    # Define the slots
    slots = c(
        Value = "numeric"
        ),
    # Set the default values for the slots. (optional)
    prototype = list(
        Value = 0.0
        ),
    # Make a function that can test to see if the data is consistent.
    # (optional)
    # This is not called if you have an initialize function defined!
    validity = function(object)
    {
        # Value is a vector, so test whether any entry is negative.
        if(any(object@Value < 0)) {
            # Note: this only changes the local copy of the object.
            object@Value <- 0
            warning(paste("A negative number for the assignment value",
                          "is passed. It is set to zero."))
        }
        return(TRUE)
    },
    # This class inherits from the Assignment class
    contains = "Assignment"
    )
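The validity functions above issue a warning and return TRUE. A common alternative, sketched below under the assumption that invalid grades should be rejected rather than tolerated, is to return a character string that describes the problem. The DemoNumericGrade class is a hypothetical stand-in and is not part of the book's code:

```r
library(methods)

# Hypothetical class whose validity function rejects negative grades.
setClass("DemoNumericGrade",
    slots = c(Value = "numeric"),
    prototype = list(Value = 0),
    validity = function(object) {
        if (any(object@Value < 0)) {
            # Returning a string marks the object as invalid.
            return("grades must be non-negative")
        }
        TRUE
    })

# Valid data: the object is created.
ok <- new("DemoNumericGrade", Value = c(75, 68, 98))

# Invalid data: new() stops with an error that includes the message.
bad <- tryCatch(new("DemoNumericGrade", Value = c(75, -1)),
                error = function(e) conditionMessage(e))

# Direct slot assignment is not validated, so validObject() must be
# called explicitly to check the object again.
ok@Value[2] <- -5
recheck <- tryCatch(validObject(ok),
                    error = function(e) conditionMessage(e))
print(bad)
print(recheck)
```

The last step shows why the comments in the class definitions stress that the check must be requested explicitly in some situations: assigning to a slot with @ does not run the validity function.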
The methods for the class include the routines to set and get the values of the grades as well as a method to print out a report. The method to get the grades is given here as an example:
# Create the methods to retrieve and set the values of the NumericGrade
# class.
setGeneric(name="GetValue",
    def=function(assignment)
    {
        standardGeneric("GetValue")
    }
    )
setMethod(f="GetValue",
    signature="NumericGrade",
    definition=function(assignment)
    {
        return(assignment@Value)
    }
    )
There is one other method to be defined, and it is used to print out a report based on the scores. The method for the grade report prints out a summary of the grades (the five-number summary together with the mean), the frequency of the grades divided into ten percent intervals (90-100 percent, 80-90 percent, 70-80 percent, and so on), a stem-and-leaf plot of the grades, and a sorted list of the grades. The report method includes a call to the summary command to get the summary for the grades. The summary function is overridden, and the details are given in a later section, Redefining existing functions.
The definition of the report method is given here:
setGeneric(name="GradeReport",
    def=function(assignment,maxGrade=NA,div=10)
    {
        standardGeneric("GradeReport")
    }
    )
setMethod(f="GradeReport",
    signature="Assignment",
    definition=function(assignment,maxGrade=NA,div=10)
    {
        print(noquote(paste("Grade report for",
                            GetName(assignment))))
        print(noquote(''))
        print(summary(assignment))   # Print out a five
                                     # point summary for the data
        values <- GetValue(assignment)  # Get the raw scores.
        if(is.na(maxGrade)) {
            # The maxGrade was not set. Assume the max score
            # from the data is the maximum possible.
            maxGrade <- max(values)
            warning(paste("The max grade is not set, and it is",
                          "assumed to be",maxGrade))
        }
        skip <- maxGrade*div/100   # Set the width of the intervals.
        # Move the max grade up to make sure that the left sided
        # cut will have an interval to contain the top scores.
        while(maxGrade <= max(values))
        {
            maxGrade = maxGrade + skip
        }
        # Determine the number of intervals.
        numLower <- ceiling((maxGrade-min(values))/skip)
        # Determine all of the cutoff points
        bins = c(seq(maxGrade-numLower*skip,
                     max(c(values,maxGrade)),by=skip))
        # Convert the data into factors
        levs <- cut(values,breaks=bins,right=FALSE)
        # Determine the frequencies for the different levels.
        gradeFreqs <- table(levs)
        print(noquote(''))
        print(noquote("Stem Leaf plot of grades:"))
        stem(values)   # stem prints the plot itself
        print(noquote(''))
        print(noquote("Grade Frequencies:"))
        print(gradeFreqs)
        print(noquote(''))
        print(noquote("Sorted Grades:"))
        print(sort(values))
    }
    )
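The interval calculation at the heart of the report can be followed with a small set of numbers. The sketch below repeats the same steps outside of the method, using the nine test1 scores from the shortList.csv example with maxGrade set to 100 and div set to 10. The loop that raises maxGrade is left out because no score reaches 100 in this data:

```r
# The test1 scores from the shortList.csv example.
values <- c(75, 68, 98, 76, 96, 81, 19, 52, 88)
maxGrade <- 100
div <- 10

# Width of each interval: 10 percent of the maximum grade.
skip <- maxGrade * div / 100

# Number of intervals needed to reach down to the lowest score.
numLower <- ceiling((maxGrade - min(values)) / skip)

# Cutoff points, from the lowest interval up to the maximum grade.
bins <- seq(maxGrade - numLower * skip,
            max(c(values, maxGrade)), by = skip)

# Left-closed intervals: a score of 80 falls in [80,90).
levs <- cut(values, breaks = bins, right = FALSE)
gradeFreqs <- table(levs)

print(bins)
print(gradeFreqs)
```

With these values the cutoffs run from 10 to 100 in steps of 10, and the single score of 19 is the reason that three empty intervals appear between [10,20) and [50,60).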
The LetterGrade class
The second class that inherits from the Assignment class is the LetterGrade class. This class is used to keep track of grades that are assigned as letter grades. It is similar to the NumericGrade class, except it has two slots. The first slot is a character vector with the grades. The second slot is a list in which the name of each element is a possible letter grade, and the value of each element is the numeric value used for calculations with that letter grade.
The definition of the LetterGrade class is given here:
#############################################################
# Create the class to keep track of the grades that are letters in
# nature.
#
# This is used to represent the letter grades for one assignment.
LetterGrade <- setClass(
    # Set the name for the class with a letter grade associated
    # with it.
    "LetterGrade",
    # Define the slots
    slots = c(
        Value = "character",
        Scale = "list"
        ),
    # Set the default values for the slots. (optional)
    prototype = list(
        Value = "F",
        Scale = list(
            'A+'=98,'A'=95,'A-'=92,
            'B+'=88,'B'=85,'B-'=83,
            'C+'=78,'C'=75,'C-'=73,
            'D+'=68,'D'=65,'D-'=63,
            'F+'=58,'F'=55,'F-'=53,
            "NA"=0)
        ),
    # Make a function that can test to see if the data is consistent.
    # (optional)
    # This is not called if you have an initialize function defined!
    validity = function(object)
    {
        pos <- grep(paste(object@Value,"$",sep=""),
                    names(object@Scale))
        if(length(pos) != 1) {
            # Note: this only changes the local copy of the object.
            object@Value <- 'F-'
            warning("The grade is not recognized.")
        }
        return(TRUE)
    },
    # This class inherits from the Assignment class
    contains = "Assignment"
    )
The class has the usual accessor methods to get and set the values of the slots. We give the method to set the value of the grades because it is a more complex exercise. The method must convert each grade to a character string because the grades may be passed as a factor. The method must also go through the grades and make sure that each one is a grade that is already defined. It does this using the grep command and the names of the list in the Scale slot, as follows:
setMethod(f="SetValue",
    signature="LetterGrade",
    definition=function(assignment,value)
    {
        # Loop through each item in the vector of values. Also,
        # convert the value to a character vector. We need the
        # scale list inside the loop so grab a copy now for
        # later use.
        lupe <- 1
        value <- as.character(value)
        theScale <- GetScale(assignment)
        theNames <- names(theScale)
        while(lupe <= length(value))
        {
            # Determine whether this item can be found in the
            # list of scale items. Express it as a regular
            # expression and make sure it is an exact match
            # by using first and last place markers.
            thePattern <- paste("^",sub("\\+","\\\\+",
                                value[lupe]),"$",sep="")
            pos <- grep(thePattern,theNames)
            if(value[lupe] == "") {
                # An empty string was passed.
                value[lupe] <- "NA"
            } else if(length(pos) != 1) {
                # This item was not found. Print a warning and
                # make it an NA
                warning(paste("The grade \"",value[lupe],
                              "\" is not recognized. It is set to NA.",
                              sep=""))
                value[lupe] <- "NA"
            }
            lupe <- lupe + 1
        }
        assignment@Value <- value
        return(assignment)
    }
    )
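The method above builds a regular expression for each grade and has to escape the plus sign. For comparison, the same exact-match check can be written without regular expressions by using the %in% operator. This is a sketch of an alternative technique, not the approach taken in the book's code; the normalizeGrades function and the shortened scaleNames vector are hypothetical:

```r
# A shortened list of recognized grades (the names of a scale list).
scaleNames <- c("A+", "A", "A-", "B+", "B", "B-", "NA")

# Hypothetical helper: replace empty or unrecognized grades with "NA".
normalizeGrades <- function(value, validNames) {
    value <- as.character(value)       # handles factors as well
    value[is.na(value) | value == ""] <- "NA"
    unknown <- !(value %in% validNames)
    for (grade in unique(value[unknown])) {
        warning(paste("The grade \"", grade,
                      "\" is not recognized. It is set to NA.",
                      sep = ""))
    }
    value[unknown] <- "NA"
    value
}

rawGrades <- factor(c("B+", "A", "", "Z", "A-"))
cleaned <- suppressWarnings(normalizeGrades(rawGrades, scaleNames))
print(cleaned)
```

The %in% operator compares whole strings, so a grade such as "B+" needs no escaping, and the whole vector is checked at once instead of one element at a time.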
The final method examined here is the method to print out a grade report. It is a simpler method compared to the same method in the NumericGrade class. In this case, the only results to print are the frequency of occurrences for each possible letter grade:
setMethod(f="GradeReport",
    signature="LetterGrade",
    definition=function(assignment,maxGrade=NA,div=10)
    {
        print(noquote(''))
        print(noquote(paste("Grade report for",
                            GetName(assignment))))
        print(noquote(''))
        print(summary(assignment))   # Print out the frequency
                                     # of each letter grade
    }
    )
Example – reading grades from a file
A short example is given to demonstrate how to read a class file. We assume that a CSV file, called shortList.csv, is in the current working directory. The first task to accomplish is to execute the file that contains the class definitions, grades.R. Once the classes are defined, the filename is set, and the file is read.
First, in this example, the shortList.csv file with the grades is assumed to be the following:
section,test1,project1,final
3,75,B+,89
2,68,B,94
3,98,B+,110
2,76,A-,93
1,96,A+,112
1,81,B+,91
2,19,A,70
2,52,B+,70
2,88,A,71
The default value for the GradeTypes slot in the Course class is the following:
GradeTypes=c("test","hw","quiz","project")
Any column in the file whose name starts with one of the strings in the GradeTypes vector is assumed to be a recognized grade. For example, columns with names such as test1, test2, hw1, hw2, hw3, quiz1, quiz2, and project1 are recognized as graded items. The column whose name is final does not match any prefix in the default vector and is not recognized, so the GradeTypes slot should be replaced:
> source('grades.R')
> dir(pattern="csv$")
[1] "math100.csv"   "shortList.csv"
> course <- Course()
> course <- SetGradeTypes(course,c("test","hw","quiz","project",
                                   "final"))
> course <- SetFileName(course,"shortList.csv")
> course <- ReadGrades(course)
> course
An object of class "Course"
Slot "GradesFile":
[1] "shortList.csv"
Slot "GradeTypes":
[1] "test"    "hw"      "quiz"    "project" "final"
Slot "Grades":
$test1
An object of class "NumericGrade"
Slot "Value":
[1] 75 68 98 76 96 81 19 52 88
Slot "Name":
[1] "Test"
Slot "Number":
[1] 1
$project1
An object of class "LetterGrade"
Slot "Value":
[1] "B+" "B"  "B+" "A-" "A+" "B+" "A"  "B+" "A"
Slot "Scale":
$'A+'
[1] 98
$A
[1] 95
$'A-'
[1] 92
$'B+'
[1] 88
$B
[1] 85
$'B-'
[1] 83
$'C+'
[1] 78
$C
[1] 75
$'C-'
[1] 73
$'D+'
[1] 68
$D
[1] 65
$'D-'
[1] 63
$'F+'
[1] 58
$F
[1] 55
$'F-'
[1] 53
$'NA'
[1] 0
Slot "Name":
[1] "Test"
Slot "Number":
[1] 1
$final
An object of class "NumericGrade"
Slot "Value":
[1]  89  94 110  93 112  91  70  70  71
Slot "Name":
[1] "Test"
Slot "Number":
[1] 1
Defining indexing operations
An object of the Course class can have many assignment objects in its Grades slot. We did not define a special method to get a particular assignment, and no method is defined to save an assignment. We examine how to do this in this section, and the discussion revolves around redefining the [ operation.
First we redefine the [ operation used to get an assignment. The idea is that we want to get a copy of an assignment by enclosing the name of the assignment, as defined in the original file, within square brackets. To do this, we can redefine the operation. In this case, we want to be able to pass any kind of object within the brackets, which will allow us to also use integers to index by location in the list:
setMethod("[",
    signature(x="Course",i="ANY"),
    definition=function(x,i=1)
    {
        #print(paste("Get grade item",i))
        return(x@Grades[[i]])
    }
    )
We would also like to be able to get a grade within an assignment. In this case, we will pass two arguments within the brackets. The first task is to get the assignment using the previous definition and then get the grade within the assignment whose index matches the second argument:
setMethod("[",
    signature(x="Course",i="ANY",j="numeric"),
    definition=function(x,i=1,j=1)
    {
        #print(paste("course value",i,j))
        allGrades <- GetValue(x[i])
        return(allGrades[[j]])
    }
    )
We assume that the operations are defined in a file called ops.R. Once these methods are read and executed, the [ operation is used in the following example to get a copy of test1 or to examine the third grade in test1:
> source('grades.R')
> source('ops.R')
> source('overriding.R')
> course <- Course()
> course <- SetFileName(course,"shortList.csv")
> course <- ReadGrades(course)
> course['test1']
[1] Assignment: Test
[1] (9) Grades:
[1] 75 68 98 76 96 81 19 52 88
> course['test1',3]
[1] 98
>
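The two indexing methods can be reproduced on a class that holds nothing but a list of numeric vectors. The following sketch uses a hypothetical class, DemoGradeBook, that is not part of the book's code. One difference from the listing above is an assumption made for clarity: the methods here spell out the full argument list of the [ operator, and the one-index case is declared explicitly with j = "missing":

```r
library(methods)

# Hypothetical class: a named list of numeric vectors.
setClass("DemoGradeBook", slots = c(Grades = "list"))

# book["test1"] or book[1]: return one whole set of grades.
setMethod("[",
    signature(x = "DemoGradeBook", i = "ANY", j = "missing"),
    function(x, i, j, ..., drop = TRUE) {
        x@Grades[[i]]
    })

# book["test1", 3]: return a single grade from one set.
setMethod("[",
    signature(x = "DemoGradeBook", i = "ANY", j = "numeric"),
    function(x, i, j, ..., drop = TRUE) {
        x@Grades[[i]][[j]]
    })

book <- new("DemoGradeBook",
            Grades = list(test1 = c(75, 68, 98), quiz1 = c(9, 10)))

print(book["test1"])      # all three test1 grades
print(book["test1", 3])   # third test1 grade
print(book[2, 1])         # first grade of the second item
```

Because the first index has the type ANY, the same method serves both names and positions, which is the behavior described for the Course class.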
The final topic to examine in this section is the use of the [ operation to set the value of an entry. The approach is similar to the methods used to get information given earlier. The only difference is that instead of using the setMethod function, we use the setReplaceMethod function, and the last argument to the method is the value to assign to the corresponding entry in the appropriate object:
setReplaceMethod("[",
    signature("NumericGrade"),
    definition=function(x,i,value)
    {
        #print(paste("grade value",i,value))
        x@Value[i] = value
        return(x)
    }
    )
setReplaceMethod("[",
    signature("Course"),
    definition=function(x,i,j,value)
    {
        #print(paste("course grade",i,j,value))
        grades <- x@Grades[[i]]
        grades[j] <- value
        # Use [[ ]] so the assignment object itself is stored
        # as element i of the list.
        x@Grades[[i]] <- grades
        return(x)
    }
    )
With these definitions in place, a value within the course for a specific grade can be easily set, as follows:
> source('grades.R')
> source('ops.R')
> source('overriding.R')
>
> course <- Course()
> course <- SetFileName(course,"shortList.csv")
> course <- ReadGrades(course)
> print(course['test1'])
[1] Assignment: Test
[1] (9) Grades:
[1] 75 68 98 76 96 81 19 52 88
> print(course['test1',3])
[1] 98
>
> course['test1',3] <- 99.1
> print(course['test1',3])
[1] 99.1
>
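A replacement method can be tested on its own with a very small class. The sketch below assumes a hypothetical class, DemoScores, that is not part of the book's code, and writes out the full argument list of the replacement form of the [ operator:

```r
library(methods)

# Hypothetical class holding one numeric vector of scores.
setClass("DemoScores", slots = c(Value = "numeric"))

# Define what scores[i] <- value means for a DemoScores object.
setReplaceMethod("[",
    signature(x = "DemoScores"),
    function(x, i, j, ..., value) {
        x@Value[i] <- value   # change the local copy
        x                     # the modified object must be returned
    })

scores <- new("DemoScores", Value = c(75, 68, 98))
scores[3] <- 99.1
print(scores@Value)
```

R rewrites the statement scores[3] <- 99.1 as a call to the replacement method followed by an assignment of the result back to scores, which is why the method has to return the modified object.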
Redefining existing functions
As mentioned previously, the summary, show, and plot commands are extended to react in an appropriate way when passed a NumericGrade or LetterGrade object. Extending these functions is done by simply defining a new method for these functions using the setMethod command. Each of these commands already exists, so it is not necessary to reserve the names using the setGeneric command. That is, we simply define the method to associate with the command when passed an object that is a member of the NumericGrade or LetterGrade class.
We extend the summary command in the first example. In this case, the function should behave differently depending on whether the object passed to it is a NumericGrade object or a LetterGrade object. For a LetterGrade object, the summary command retrieves the grades and returns a frequency table for the grades:
setMethod(f="summary",
    signature="LetterGrade",
    definition=function(object,...)
    {
        # Get the letter grades as factors and return the
        # frequency table.
        values <- GetLetterGrade(object)
        return(summary(as.factor(values)))
    }
    )
In this example, the only other kind of assignment that can be created is NumericGrade, but in the future, the set of classes may be extended. Instead of creating a summary function solely for the NumericGrade class, we create a summary function for its base class, Assignment. In this way, when the argument to the summary command is an object of any class derived from the Assignment class that does not have its own summary method, the method obtains the object's grades, and the grades are assumed to be a numeric vector. The summary command can then be invoked on the vector:
setMethod(f="summary",
    signature="Assignment",
    definition=function(object,...)
    {
        # Get the grade values and return the five point
        # summary.
        values <- GetValue(object)
        return(summary(values))
    }
    )
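The text mentions that the show command is extended as well, although its definition is not listed. The following sketch shows how both commands might be extended for a simplified class. The DemoAssignment class and the exact layout printed by its show method are assumptions made for illustration and are not taken from the book's code:

```r
library(methods)

# Hypothetical, simplified assignment class.
setClass("DemoAssignment",
         slots = c(Name = "character", Value = "numeric"))

# summary: delegate to the summary of the numeric grades.
setMethod("summary", "DemoAssignment",
    function(object, ...) {
        summary(object@Value, ...)
    })

# show: called whenever the object is printed at the prompt.
setMethod("show", "DemoAssignment",
    function(object) {
        cat("Assignment:", object@Name, "\n")
        cat("(", length(object@Value), ") Grades:\n", sep = "")
        print(object@Value)
    })

test1 <- new("DemoAssignment", Name = "Test",
             Value = c(75, 68, 98, 76, 96, 81, 19, 52, 88))
test1                        # uses the show method
gradeSummary <- summary(test1)
print(gradeSummary)
```

The scores are the test1 column of the shortList.csv example, so the summary can be compared with the console output that follows in the text.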
In the following example, the necessary files are read, and the information from the shortList.csv file is read. Copies of two different assignments, test1 and project1, are obtained, and a summary for each object is printed:
> source('grades.R')
> source('ops.R')
> source('overriding.R')
Creating a generic function for 'plot' from package 'graphics' in the
global environment
Creating a generic function for 'summary' from package 'base' in the global
environment
>
> course <- Course()
> course <- SetFileName(course,"shortList.csv")
> course <- ReadGrades(course)
> x <- course['test1']
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  19.00   68.00   76.00   72.56   88.00   98.00
> p <- course['project1']
> summary(p)
 A A+ A-  B B+
 2  1  1  1  4
>
The files available in the repository for this book contain definitions for the show, summary, and plot commands. We examine one more here. The plot command for a NumericGrade object is a bit more complicated than the other examples. The plot command obtains the grades from the object passed to it, draws a histogram, adds a boxplot at the top of the histogram, and uses the rug command to display the data values on the same plot.
The arguments to the command include the maximum grade for the assignment. This argument is included because in some circumstances, extra credit is available and some students may achieve a higher score than is allocated for the rest of the class, and in other situations, no student may achieve a perfect score. If this argument is not provided, then the maximum score found in the data is used.
A second argument is included that indicates what percentages to use. The default is 10, which indicates that the breaks in the histogram are made at the 10 percent marks for the scores. For example, if the default of 10 is used, then the scores in the 90-100 range are counted together.
The final argument is the ellipsis (...), which indicates that other arguments can be passed to the function. The same symbol is used in the call to the hist command within the method. The idea is that all of the extra parameters are passed on to that command. This allows us to set a wide array of plot parameters through the method without having to catch any special cases:
setMethod(f="plot",
    signature="Assignment",
    definition=function(x,maxGrade=NA,div=10,...)
    {
        #print("Plotting an assignment")
        values <- GetValue(x)   # Get the raw scores
        if(is.na(maxGrade)) {
            # The maxGrade was not set. Assume the max score from
            # the data is the maximum possible.
            maxGrade <- max(values)
            warning(paste("The max grade is not set,",
                          "and it is assumed to be",
                          maxGrade))
        }
        skip <- maxGrade*div/100   # Set the width of the
                                   # intervals.
        # Determine the number of intervals.
        numLower <- ceiling((maxGrade-min(values))/skip)
        # Determine the cutoff values between the bins in the
        # histogram.
        bins = c(seq(maxGrade-numLower*skip,
                     max(c(values,maxGrade)),by=skip))
        if(max(bins) < max(values)) {
            # The bins do not include the maximum value. Adjust
            # the bound on the uppermost bin.
            bins[length(bins)] <- max(values)
        }
        # Convert the data into factors
        levs <- cut(values,breaks=bins,right=FALSE)
        # Determine the frequencies for the different levels.
        gradeFreqs <- table(levs)
        # Get the max frequency
        top <- max(gradeFreqs)
        # Plot the histogram.
        hist(values,breaks=bins,
             freq=TRUE,
             ylim=c(0,top+1),axes=FALSE,
             col=grey((seq(length(bins)-1,1,by=-1)/
                       (length(bins)-1))),
             right=FALSE,...)
        # Add a boxplot across the top
        boxplot(values,horizontal=TRUE,at=top+0.5,add=TRUE,
                axes=FALSE)
        # Plot the raw data as a strip chart across the bottom
        rug(values)
        # Turn on only the left and lower axes.
        axis(side=1,at=bins)
        axis(side=2,at=seq(0,top+1,by=1))
    }
    )
In the following example, a larger data file is read. The results from test2 are plotted, and the method automatically generates the plot as described earlier:
> source('grades.R')
> source('ops.R')
> source('overriding.R')
>
> course <- Course()
> course <- SetFileName(course,"math100.csv")
> course <- ReadGrades(course)
>
> x <- course['test2']
> plot(x,maxGrade=100,main='Student Scores From Test 2')
Warning message:
In plot.histogram(r, freq = freq1, col = col, border = border, angle =
angle, :
the AREAS in the plot are wrong -- rather use 'freq = FALSE'
>
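Note that a warning message is printed. The maximum grade is specified as 100, but there are some students in the class who achieved a 102 because of extra credit. The method stretches the upper bound of the top interval so that it includes the students who achieved a grade higher than the maximum grade. That interval therefore has a different width than the others, and the hist command warns that bar heights based on counts are misleading when the intervals do not all have the same width. The resulting plot is shown in the following screenshot:
The histogram created by the plot command given the information in the math100.csv grade file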
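The cause of the warning can be confirmed without drawing anything. The sketch below uses a few made-up scores, which are not taken from the math100.csv file, and calls the hist command with plot=FALSE so that only the computed histogram object is returned. The equidist component of that object records whether all of the intervals have the same width:

```r
# Made-up scores, including one above the nominal maximum of 100.
values <- c(55, 72, 88, 95, 102)

# Equal-width breaks versus breaks whose top interval is stretched
# to include the extra-credit score.
evenBreaks <- seq(50, 110, by = 10)
unevenBreaks <- c(50, 60, 70, 80, 90, 102)

hEven <- hist(values, breaks = evenBreaks, right = FALSE, plot = FALSE)
hUneven <- hist(values, breaks = unevenBreaks, right = FALSE,
                plot = FALSE)

print(hEven$equidist)      # all intervals have the same width
print(hUneven$equidist)    # the top interval is wider
print(hUneven$counts)
```

When the intervals are not equidistant and counts are requested with freq=TRUE, the plotting step issues the warning shown above. The plot is still drawn, but the wider top interval is drawn with more area than its count alone would justify.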
Redefining arithmetic operations
The last topic that we're going to examine is how to define basic arithmetic operations on assignments. In particular, we'll examine how to perform arithmetic operations on two NumericGrade objects and how to combine a set of values with a NumericGrade object. This can be done in the R environment by extending one of the generic groups that are used to collect similar operations. The different groups include the Arith, Compare, Ops, Logic, Math, Math2, Summary, and Complex groups. The Ops group includes the Arith, Compare, and Logic operations, and we extend this group to include objects from the NumericGrade class.
We use the setMethod function for the Ops group to define operations involving NumericGrade objects. The resulting function obtains the grades and performs the requisite operation using the callGeneric command. The quirk of the system is that it must return a NumericGrade object, and the last step is to set the value of the grades in the NumericGrade object that was sent to the function (the first one, if there are two):
setMethod("Ops", signature(e1="NumericGrade", e2="NumericGrade"),
          function(e1, e2) {
              ## Apply the operator to the two sets of grades
              theSum <- callGeneric(GetValue(e1), GetValue(e2))
              return(SetValue(e1, theSum))
          }
)
setMethod("Ops", signature(e1="NumericGrade", e2="numeric"),
          function(e1, e2) {
              ## Apply the operator to the grades and a numeric value
              theSum <- callGeneric(GetValue(e1), e2)
              return(SetValue(e1, theSum))
          }
)
setMethod("Ops", signature(e1="numeric", e2="NumericGrade"),
          function(e1, e2) {
              ## e1 is a plain numeric here, so the result is stored in e2
              theSum <- callGeneric(e1, GetValue(e2))
              return(SetValue(e2, theSum))
          }
)
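The same pattern can be tried without the grade files. The following is a minimal sketch, assuming a hypothetical Score class with a single numeric slot as a stand-in for NumericGrade. It extends the narrower Arith group rather than Ops, so only the arithmetic operators are redefined:

```r
library(methods)

# Hypothetical stand-in for NumericGrade: a single numeric slot.
setClass("Score", representation(values = "numeric"))

# Score op Score: apply the operator to the underlying vectors.
setMethod("Arith", signature(e1 = "Score", e2 = "Score"),
          function(e1, e2) {
              e1@values <- callGeneric(e1@values, e2@values)
              e1
          })

# Score op numeric: the Score is the first operand.
setMethod("Arith", signature(e1 = "Score", e2 = "numeric"),
          function(e1, e2) {
              e1@values <- callGeneric(e1@values, e2)
              e1
          })

# numeric op Score: the Score is the second operand, so it is the
# object that is updated and returned.
setMethod("Arith", signature(e1 = "numeric", e2 = "Score"),
          function(e1, e2) {
              e2@values <- callGeneric(e1, e2@values)
              e2
          })

a <- new("Score", values = c(80, 90))
b <- new("Score", values = c(70, 100))
avg <- (a + b) / 2       # Score + Score, then Score / numeric
missed <- 100 - a        # numeric - Score
print(avg@values)
print(missed@values)
```

Restricting the methods to Arith is a design choice for this sketch: with the full Ops group, a comparison such as a == b would also be routed through the same function and would try to store logical values in the object.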
In the following example, the grades from the math100.csv file are read. A copy of the scores taken from the test1 and test2 assignments is obtained, and the simple average is found by adding them and dividing by two:
> source('grades.R')
> source('ops.R')
> source('overriding.R')
>
> course <- Course()
> course <- SetFileName(course, "math100.csv")
> course <- ReadGrades(course)
> x <- course['test1']
> y <- course['test2']
> z <- (x+y)/2
> print(z)
[1] Assignment: Test
[1] (327) Grades:
  [1]  80.0  76.0  95.5  83.0  98.0  87.0  45.5  66.5  94.0  76.5  63.5  83.5
 [13]  66.0  52.5  85.5  35.5  82.5  90.5  89.5  51.0  71.0  53.5  80.0  75.5
 [25]  39.0  61.5   0.0  72.5  84.0  70.0  97.5  76.0  52.0  73.0  68.5  55.5
 [37]  44.0  86.0  64.0  59.0  61.5  58.0  43.5  95.5  53.5  60.0  77.5  56.5
 [49]  93.5  52.5  97.5  55.0  98.0  68.0  63.0  46.5  74.5  26.5  78.5  69.0
 [61]  84.5  92.5  78.5  78.0  84.5  11.0  51.0  96.0  73.0  95.0  94.0  82.5
 [73]  93.0   0.0  86.0  44.5  75.0  51.5  57.5 100.0  48.0  64.5  57.0  65.0
 [85]  75.5  76.5  62.0  19.0  52.0  52.5  79.0  71.5  63.0  24.0  91.0  99.5
 [97]   0.0  73.0  62.0   0.0  66.5  88.5  77.5  91.5  48.0  63.5  53.5  87.5
[109]  80.5   0.0  55.5  85.0  40.5  79.0   0.0  92.5  60.5  91.5  41.5  51.0
[121]  67.0  76.0  76.5  60.5  85.5  64.5  87.5  31.0  87.0   0.0  98.5  60.5
[133]  91.0  86.5  48.0  54.5  54.0  34.0  56.5   0.0  61.5  84.5  44.5  64.0
[145]  64.0  59.5  75.5  60.5  61.5  83.0  66.0  52.5  72.0  66.5  72.5  81.0
[157]  98.0  64.5  78.5  77.0  19.0  80.5  33.0  53.5  44.0  74.0  74.0  58.5
[169]  45.0  83.0  60.0  78.0  70.0  79.0  60.0  96.5  68.5  92.0  63.5  93.5
[181]  62.5  64.0  82.5  90.0  57.5  71.0  54.0  29.5  45.0  79.5  27.5  94.5
[193]  74.5  71.0   0.0   0.0  31.0  77.0  49.0  48.5  52.0 100.5  75.5  90.5
[205]  79.0  62.0  92.5  57.5  83.0   0.0  45.5  72.5  76.5  85.0  34.5  90.5
[217] 100.5  74.5  82.0   0.0  63.0  79.5  47.5  67.5  80.5  58.0  82.0  60.0
[229]  68.0  61.5  62.0  54.5  68.5  69.0  87.5  44.0  65.5  53.0  47.0  66.0
[241]  64.0  93.0  52.0  51.5  81.0  67.0  92.5  85.0  36.0  76.5  78.0  49.5
[253]  42.5  86.0  76.0  43.0  74.0  90.5  44.5  40.5  45.0  50.5  48.0  80.0
[265]  87.0  88.0  59.5  59.5  48.0  62.0  94.0  80.0  71.0  64.0  65.0  43.5
[277]  61.0   0.0  57.0  33.0  91.0 100.0  66.0  91.5  98.0  70.0  60.0  91.5
[289]  91.0  81.5  42.0  80.0  67.0  64.5  64.5  55.5  88.5  73.5  81.5  99.5
[301]  59.5  65.0  77.5  58.5  89.5 100.5  44.5  85.5   0.0  70.5  90.0  85.5
[313]  81.0  64.5  51.0  76.5  54.5  88.0  58.5  56.0  23.0  36.0  88.5  71.5
[325]  56.0  66.5  16.5
>
Summary
An extended example making use of S4 classes was examined in this chapter, and a set of classes was defined to read and track the grades for a class. The Course class keeps track of a number of assignments. It reads the contents of a CSV file and automatically determines which columns are grades and whether they are numerical grades or letter grades.
The grades for a particular assignment are kept in one of two classes. The NumericGrade class keeps track of numerical grades, and the LetterGrade class keeps track of letter grades. Both classes are derived from the Assignment base class.
A number of examples were given, along with highlights from the code. The full set of code can be found on the website associated with this book, and we recommend that you closely examine the code. The three files that include the definitions for these classes are grades.R, ops.R, and overriding.R.
In the next chapter, another set of classes is developed. The classes in that chapter provide an example of a set of classes that can be used to generate the results from a stochastic process and manage the results from a large number of simulations. The classes can be used to generate results from either a discrete or continuous process, and the distribution of the results can be explored.
Chapter 11. Case Study – Simulation
We will now examine a set of S3 classes designed to implement a Monte-Carlo simulation. The example scripts include a class to generate a single simulation as well as a class to run and collect the results from multiple simulations. The simulation class includes two derived classes. One is used for simulations of a discrete stochastic process, and the other is for simulations of a stochastic differential equation.
This chapter is roughly divided into the following parts:
The simulation classes
The Monte-Carlo class
Examples
We will first examine a set of simulation classes that are designed to generate a single simulation of a stochastic system. The classes make use of a base class to manage the parameters. Two classes are derived from the base class. The first derived class is used to simulate a discrete stochastic system. The second derived class is used to approximate a stochastic differential equation.
After examining the simulation classes, a master class is used to manage the Monte-Carlo simulations. The Monte-Carlo class accepts a simulation class and collects the results from multiple simulations. Our focus is on the basic structure of the classes, so we do not discuss statistical methods for the data.
We briefly examine an example using the classes in the last section of this chapter. The example focuses on how to create an object from the discrete stochastic simulation class and how to generate results.
The simulation classes
The simulation classes consist of three parts. The Simulation class is the base class, and two classes, DiscreteSimulation and ContinuousSimulation, are derived from the base class. We assume that the definitions for these classes are kept in a separate file, simulationS3.R.
The base class, Simulation, is used to manage the parameters and results for a single simulation. The data includes the final time used in the simulation, and accessor methods are defined for the data:
######################################################################
## Create the base simulation class
##
## This is used to represent a single simulation
Simulation <- function()
{
    ## Create the list used to represent an
    ## object for this class
    me <- list(
        simulationResults = matrix(0)
    )
    ## Set the name for the class
    class(me) <- append(class(me), "Simulation")
    return(me)
}

## Set the data values that are the result of a simulation.
setSimulation <- function(theSimulation, simulationResults)
{
    UseMethod("setSimulation", theSimulation)
}
setSimulation.Simulation <- function(theSimulation, simulationResults)
{
    ## Set the value of the variable in theSimulation
    theSimulation$simulationResults <- simulationResults
    return(theSimulation)
}

## Method to return the data from the current set of results.
getFinalValues <- function(theSimulation)
{
    UseMethod("getFinalValues", theSimulation)
}
getFinalValues.Simulation <- function(theSimulation)
{
    ## Get the value of the data pair at the last time step
    size <- dim(theSimulation$simulationResults)
    return(c(theSimulation$simulationResults[size[1], 1],
             theSimulation$simulationResults[size[1], 2]))
}
Note that the class includes an additional method, getFinalValues, which is used to retrieve the last values approximated in the simulation. This method is required by the Monte-Carlo class defined in the next section. The results are used to determine the requisite statistics.
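The S3 dispatch behind such a method can be seen in isolation with a small sketch. The Trajectory class and the lastPair generic below are hypothetical names used only for illustration. The point is that the method is named generic.class, which is how UseMethod locates it:

```r
# Hypothetical stand-in for the Simulation class: a list holding a
# two-column matrix of results, one row per time step.
Trajectory <- function(results)
{
    me <- list(simulationResults = results)
    class(me) <- append(class(me), "Trajectory")
    return(me)
}

# The generic dispatches on the class of its first argument.
lastPair <- function(obj)
{
    UseMethod("lastPair", obj)
}

# The method name is <generic>.<class>. Without the class suffix this
# definition would overwrite the generic instead of extending it.
lastPair.Trajectory <- function(obj)
{
    res <- obj$simulationResults
    return(res[nrow(res), 1:2])
}

# Three time steps: the first column is x, the second column is y.
steps <- matrix(c(1.0, 0.9, 0.8,
                  2.0, 1.5, 1.2), ncol = 2)
traj <- Trajectory(steps)
final <- lastPair(traj)
print(final)
```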
The next two classes are the DiscreteSimulation and ContinuousSimulation classes. The DiscreteSimulation class is used to generate an approximation to the discrete stochastic system given by the following equation:
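Read off the update step in the singleSimulation.DiscreteSimulation method later in this section, the system can be written as follows. This is a reconstruction from the code, with the parameters alpha, beta, gamma, and delta written as Greek letters, noiseOne and noiseTwo written as N1 and N2, and a fresh pair of random values drawn at every step:

```latex
\begin{aligned}
x_{n+1} &= \alpha\, x_n + \beta\, x_n y_n + N_1 W_1, \\
y_{n+1} &= \gamma\, x_n + \delta\, y_n + N_2 W_2 .
\end{aligned}
```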
In the preceding equation, α, β, N1, γ, δ, and N2 are constants, and W1 and W2 are normally distributed random variables with a mean of zero and a standard deviation of one. The ContinuousSimulation class is used to generate an approximation, using the Milstein scheme, to the stochastic differential equation:
In the preceding equation, α, β, N1, γ, δ, and N2 are constants, and W1 and W2 are independent Wiener processes.
The complete scripts for all of the classes can be found in the code that accompanies this text. In the interest of brevity, we only look at one derived class: the DiscreteSimulation class. The definition of the class is given here:
######################################################################
## Create a simulation class for a discrete simulation.
##
## This is used to represent the results from a discrete simulation.
DiscreteSimulation <- function()
{
    ## Create an object from the base class and add the extra field
    me <- Simulation()
    me$N <- 0
    ## Set the name for the class
    class(me) <- append(class(me), "DiscreteSimulation")
    return(me)
}
The final method specific to this class is the singleSimulation method, which is used to conduct one simulation and save the results. The methods necessary to define the discrete simulation class are as follows:
# the methods to do the actual simulations.
singleSimulation <- function(simulation, N, T, x0, y0, alpha, beta,
                             gamma, delta, noiseOne, noiseTwo)
{
    UseMethod("singleSimulation", simulation)
}
singleSimulation.DiscreteSimulation <- function(
    simulation, N, T, x0, y0, alpha, beta, gamma, delta, noiseOne, noiseTwo)
{
    ## Make an approximation for one run of the discrete model
    ## with the given parameters. Store the approximation in
    ## the simulation slot when done.

    ## Initialize the necessary variables.
    x <- matrix(data=double(N*2), nrow=N, ncol=2)
    x[1,1] <- x0
    x[1,2] <- y0
    lupe <- 2

    ## Go through and make N iterations of the stochastic model.
    while(lupe <= N)
    {
        dW <- rnorm(2, mean=0, sd=1)  # Generate two random numbers
                                      # with a normal dist.
        ## Take one step of the discrete model
        x[lupe,1] <- alpha*x[lupe-1,1] + beta*x[lupe-1,1]*x[lupe-1,2] +
            noiseOne*dW[1]
        x[lupe,2] <- gamma*x[lupe-1,1] + delta*x[lupe-1,2] +
            noiseTwo*dW[2]
        lupe <- lupe + 1
    }

    ## Save the simulation and return the result.
    simulation <- setSimulation(simulation, x)
    return(simulation)
}
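The iteration can also be exercised on its own, outside the class machinery. The sketch below uses a hypothetical helper, runDiscrete, that repeats the same update step. It shows two things: fixing the seed with set.seed makes a run reproducible, and setting both noise terms to zero reduces the model to a deterministic map that can be checked by hand. The parameter values are the ones used in the Examples section of this chapter:

```r
# Hypothetical helper (not one of the book's files): iterate the
# discrete model and return an N x 2 matrix of (x, y) pairs.
# Assumes N is at least 1.
runDiscrete <- function(N, x0, y0, alpha, beta, gamma, delta,
                        noiseOne, noiseTwo)
{
    x <- matrix(0, nrow = N, ncol = 2)
    x[1, ] <- c(x0, y0)
    for (n in seq_len(N)[-1])      # n = 2, ..., N (empty when N is 1)
    {
        dW <- rnorm(2, mean = 0, sd = 1)
        x[n, 1] <- alpha*x[n-1, 1] + beta*x[n-1, 1]*x[n-1, 2] +
            noiseOne*dW[1]
        x[n, 2] <- gamma*x[n-1, 1] + delta*x[n-1, 2] +
            noiseTwo*dW[2]
    }
    return(x)
}

# The same seed gives the same sequence of random numbers.
set.seed(42)
first <- runDiscrete(100, 1.0, 2.0, 1.2, -0.3, 0.65, 0.2, 0.03, 0.04)
set.seed(42)
second <- runDiscrete(100, 1.0, 2.0, 1.2, -0.3, 0.65, 0.2, 0.03, 0.04)
print(identical(first, second))

# With no noise, the second row is 1.2*1 - 0.3*1*2 and 0.65*1 + 0.2*2.
noiseless <- runDiscrete(3, 1.0, 2.0, 1.2, -0.3, 0.65, 0.2, 0, 0)
print(noiseless[2, ])
```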
The Monte-Carlo class
The final class examined is the MonteCarlo class. The MonteCarlo class is used to keep track of the results from multiple simulations. Here, we provide a partial listing of the code for the class and then provide the methods used to generate multiple simulations.
First, the code required to define the class is given here:
######################################################################
# Create the MonteCarlo class
#
# This class is used to make many simulations
MonteCarlo <- function()
{
    # Define the slots
    me <- list(
        ## First define the parameters for the stochastic model
        N = 0,
        T = 0,
        x0 = 0,
        y0 = 0,
        alpha = 0,
        beta = 0,
        gamma = 0,
        delta = 0,
        noiseOne = 0,
        noiseTwo = 0,
        ## Define the data to track and the number of trials
        xData = 0,
        yData = 0
    )
    ## Set the name for the class
    class(me) <- append(class(me), "MonteCarlo")
    return(me)
}

# Define the method used to initialize the data prior to a run.
prepare <- function(monteCarlo, number)
{
    UseMethod("prepare", monteCarlo)
}
prepare.MonteCarlo <- function(monteCarlo, number)
{
    ## Set the number of trials and initialize the values to
    ## zeroes.
    monteCarlo$xData <- double(number)
    monteCarlo$yData <- double(number)
    return(monteCarlo)
}
The class requires an additional method. The simulations method is used to create multiple simulations and record the results:
simulations <- function(monteCarlo, number, simulation)
{
    UseMethod("simulations", monteCarlo)
}
simulations.MonteCarlo <- function(monteCarlo, number, simulation)
{
    ## Set the number of trials and initialize the values
    monteCarlo <- prepare(monteCarlo, number)
    params <- getParams(monteCarlo)  # get the parameters

    ## Perform the simulations
    lupe <- 0
    while(lupe < number)
    {
        lupe <- lupe + 1  # increment the count
        ## Perform a single simulation.
        simulation <- singleSimulation(
            simulation,
            params[1], params[2], params[3], params[4], params[5],
            params[6], params[7], params[8], params[9], params[10])
        ## Get the last values of the simulation and record them.
        values <- getFinalValues(simulation)
        monteCarlo <- setValue(monteCarlo, values[1], values[2], lupe)
    }
    return(monteCarlo)
}
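The simulations method relies on accessors such as getParams and setValue that are not part of the partial listing. The following is a hypothetical sketch of what they could look like. It was written only to match the way they are called in this chapter and should not be read as the code that accompanies the text. A compact stand-in for the constructor is included so that the block runs on its own:

```r
# Compact stand-in for the MonteCarlo constructor shown earlier.
MonteCarlo <- function()
{
    me <- list(N = 0, T = 0, x0 = 0, y0 = 0,
               alpha = 0, beta = 0, gamma = 0, delta = 0,
               noiseOne = 0, noiseTwo = 0,
               xData = 0, yData = 0)
    class(me) <- append(class(me), "MonteCarlo")
    return(me)
}

# Assumed parameter order: the argument order of singleSimulation.
paramNames <- c("N", "T", "x0", "y0", "alpha", "beta",
                "gamma", "delta", "noiseOne", "noiseTwo")

setParams <- function(monteCarlo, ...) UseMethod("setParams", monteCarlo)
setParams.MonteCarlo <- function(monteCarlo, ...)
{
    values <- c(...)
    stopifnot(length(values) == length(paramNames))
    monteCarlo[paramNames] <- as.list(values)
    return(monteCarlo)
}

getParams <- function(monteCarlo) UseMethod("getParams", monteCarlo)
getParams.MonteCarlo <- function(monteCarlo)
{
    ## Return the ten parameters as a named numeric vector
    return(unlist(monteCarlo[paramNames]))
}

setValue <- function(monteCarlo, x, y, number) UseMethod("setValue", monteCarlo)
setValue.MonteCarlo <- function(monteCarlo, x, y, number)
{
    ## Record the final (x, y) pair of trial number `number`
    monteCarlo$xData[number] <- x
    monteCarlo$yData[number] <- y
    return(monteCarlo)
}

getValues <- function(monteCarlo) UseMethod("getValues", monteCarlo)
getValues.MonteCarlo <- function(monteCarlo)
{
    ## One row per trial: column 1 is x, column 2 is y
    return(cbind(monteCarlo$xData, monteCarlo$yData))
}

monty <- MonteCarlo()
monty <- setParams(monty, 100, 1, 1.0, 2.0, 1.2, -0.3, 0.65, 0.2, 0.03, 0.04)
monty$xData <- double(2)
monty$yData <- double(2)
monty <- setValue(monty, 0.82, 0.66, 1)
print(getParams(monty))
print(getValues(monty))
```

Because S3 objects built on lists are copied rather than modified in place, every setter returns the updated object and the caller has to reassign it.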
Examples
We will now provide a brief example to demonstrate how to use the Simulation and MonteCarlo classes described in the previous sections. We focus on the discrete simulation class, but additional examples can be found in the code that accompanies this text. In this example, we assume that the definitions for the simulation and MonteCarlo classes are contained in two files, simulationS3.R and monteCarloS3.R.
The Monte-Carlo simulations can be created by first creating an object from the MonteCarlo class and setting the values of the parameters as follows:
> source('simulationS3.R')
> source('monteCarloS3.R')
> monty <- MonteCarlo()
> monty <- setParams(monty, 100, 1,
                     1.0, 2.0,
                     1.2, -0.3, 0.65, 0.2,
                     0.03, 0.04)
Now that an object from the MonteCarlo class is defined, an object from the DiscreteSimulation class is created, and the simulation object is used to generate the results from 500 simulations:
> a <- DiscreteSimulation()
> monty <- simulations(monty, 500, a)
At this point, the monty object has the results from 500 simulations. The results can be found using the getValues method, as follows:
> results <- getValues(monty)
> summary(results[,1])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.6480  0.7792  0.8231  0.8213  0.8595  1.0110
> summary(results[,2])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.4978  0.6232  0.6590  0.6632  0.7064  0.8918
Alternatively, methods can be defined that extend existing methods. For example, the hist function can be extended to accommodate an object from the MonteCarlo class:
# the methods to plot the results
hist.MonteCarlo <- function(x, main="", ...)
{
    par(mfrow=c(2,1))
    values <- getValues(x)
    ## Use the element-wise & here; && is not vectorized
    isValid <- (!is.na(values[,1])) & (!is.infinite(values[,1]))
    hist(values[isValid,1], xlab="x", main=main, ...)
    hist(values[isValid,2], xlab="y", main="", ...)
}
With this definition, a histogram can be easily created:
> hist(monty, main="Results from a Discrete Simulation")
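The validity mask in a method like hist.MonteCarlo has to be computed element by element, one TRUE or FALSE per simulation. That requires the vectorized & operator, because && expects operands of length one. The following is a small sketch with made-up values, not output from the simulations above. It also checks both columns, which is an extension of the method as given:

```r
# Made-up values for illustration: one NA and one infinite entry
# in the first column.
values <- cbind(c(0.8, NA, 0.7, Inf),
                c(0.6, 0.5, 0.9, 0.4))

# is.finite() is FALSE for NA, NaN, Inf, and -Inf, so one call
# covers both the is.na and the is.infinite checks.
isValid <- is.finite(values[, 1]) & is.finite(values[, 2])

# drop = FALSE keeps the result a matrix even if one row survives.
clean <- values[isValid, , drop = FALSE]
print(isValid)
print(clean)
```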
Summary
This chapter presented a set of S3 classes that can be used to keep track of a set of simulations. The classes include a separate class used to create a single simulation, which can be an approximation of either a discrete or continuous stochastic process. Another class keeps track of the results from a large number of simulations.
In the next chapter, we look at another extended example. The focus in the next chapter is on creating a set of S3 classes to provide a general way to handle regression tasks for a variety of data types. You can download this chapter from https://www.packtpub.com/sites/default/files/downloads/6682OS_Case_Study_Regression.pdf
Appendix A. Package Management
A brief overview of working with packages is provided here. It is a reference to the basic commands used to manage the packages associated with your installation of R, and it is not exhaustive.
The appendix has four parts:
An overview on how to access a package
An overview on how to install a package
An overview on how to remove a package
An overview on how to update packages
One of R’s greatest strengths is the ability to use specialized packages, and a wide range of packages is available. Some of the packages are included in the regular installation, and some packages must be installed and maintained separately. We provide a brief overview of how to install, remove, and upgrade the installed packages on your system.
We first discuss two commands that are commonly used when working with packages. The first is the installed.packages command. This command will list all of the packages that are part of your installation. The other command is the library command. The library command is used to tell R to make use of the commands available in a given package. In the following example, the first command displays information about the splines package, and the second command must be entered before using the splines package:
> library(help="splines")
> library(splines)
If a package is not part of your installation, you need to install it. The command to install a package is the install.packages command. In the following example, the car package, which is used for additional regression options, is installed. The full details are not provided. You may be asked to reply to a series of questions, and then R will automatically fetch and install the package for you:
> install.packages("car")
A package can also be easily removed using the remove.packages command. In the following example, we remove the car package:
> remove.packages("car")
The last topic discussed is how to update your packages. To update all of your packages, simply use the update.packages command. In the following example, the command is entered and the full details are not provided. After submitting the command, you are given a list of packages that can be updated and you must decide which packages will be updated on your system:
> update.packages()
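The installed.packages and library commands can be combined to check for a package before trying to load it. The helper names below, isInstalled and loadIfAvailable, are hypothetical and serve only as a sketch of the idea:

```r
# Hypothetical helper: TRUE when the named package is installed.
isInstalled <- function(pkg)
{
    pkg %in% rownames(installed.packages())
}

# Load a package only when it is available. The function returns
# TRUE or FALSE instead of stopping with an error.
loadIfAvailable <- function(pkg)
{
    if (!isInstalled(pkg)) {
        message("Package '", pkg, "' is not installed; ",
                "try install.packages(\"", pkg, "\")")
        return(FALSE)
    }
    ## character.only = TRUE lets library accept the name as a string
    suppressPackageStartupMessages(
        library(pkg, character.only = TRUE))
    return(TRUE)
}

# The splines package ships with the regular installation of R.
print(loadIfAvailable("splines"))
```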