Getting started with R
Chiara Gambi
Tuesday 24 September 2013
Step 1 - Download
• Download the latest version of R and R studio from the websites listed on the course page onto your laptop.
• Installation should proceed smoothly; just follow the on-screen instructions.
• Download R first
• Then download R studio
Step 2 - R studio basics: Console & Script
• Open up R studio.
• This will automatically start an R session in the window on the bottom left corner: this window is the Console.
• You can type commands directly into the Console, which is a good way to try things out when you're not sure if they will work.
• Once you know what you're doing, you should use an R script.
• This is displayed in the top left corner within R studio. If one is not already open, go to File --> New --> R script.
IMPORTANT NOTE
• Always type the commands you want to save for later on in the Script.
• You can save the output of R calculations, i.e., the content of the Workspace, at any time (go to Session --> Save Workspace As), but you cannot save the list of commands you used to get that output unless you type them in a Script.
• Also, it's good to get into the habit of commenting on the commands as you go along, so that you can remember what each does. Anything that is preceded by # in the script will be treated by R as a comment and will not run.
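As a small illustration of commenting, a script might look like the sketch below. The object names and the numbers are made up for this example.

```r
# Everything after a # is ignored by R, so it can be used for notes.
reaction_times <- c(512, 430, 618)  # hypothetical reaction times in ms

# Compute the mean of the three values and store it
mean_rt <- mean(reaction_times)
mean_rt  # typing the name of an object prints its value
```

Only the two assignments and the last line are run; the comments are there for the reader.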
• When the commands are typed into the Console just press ENTER to run.
• When the commands are typed into the Script, position your cursor on the command line (or highlight a group of lines) and hit RUN in the top right corner of the Script window.
• Alternatively, press CMD + ENTER (Mac users) or CTRL + ENTER (Windows users).
• If a command is incomplete, R signals that it is waiting for more input by showing a + in the Console. To start anew when this happens, just hit ESC.
• If you realize a command you have used before is useful, but you forgot to type it in the script...
• you can still find it (provided you did not quit the session) in the History tab on the top right corner.
Step 3 - R studio basics: Working Directory & Projects
• When working with R, it is crucial to know in which directory you want R to look for input and save output.
• Typically, you want your script and your data to be in the same directory.
• If you stick to this rule, when you open a saved script in R studio, all you have to do is to go to Session --> Set working directory --> To source file location
• which will tell R to look for files within the same directory of the current active script.
• If you want to avoid this step, you can create a project in R studio.
• Anytime you open a project, R will be directed towards the right folder.
• In any case, you can:
• check what the current directory is:
• use the command: getwd()
• set the current directory to any path within your machine:
• use the command: setwd("~/path") on Mac; setwd("C:/path") on Windows
(By default, R works in the home directory: /Users/YourUsername on Mac, which can be abbreviated as ~).
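The two commands can be tried out safely with the sketch below. It uses tempdir() only so that the example runs on any machine; in practice you would pass the path to your own folder. The name old_dir is made up for this example.

```r
# Remember where R is currently working
old_dir <- getwd()

# tempdir() stands in for your own project folder in this sketch
setwd(tempdir())
getwd()          # now reports the temporary folder

# Go back to the directory we started from
setwd(old_dir)
getwd()
```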
Step 3 - R studio basics: Workspace
• The objects you create during an R session are temporarily stored in the workspace.
• You can check the contents of the workspace in the top right corner.
• Before quitting a session, R studio will prompt you to save the contents of the workspace. When you resume the session, you can load the contents of a previously saved workspace.
• IMPORTANT NOTE
• R is very much like a calculator. It will not store anything (not even temporarily) unless you tell it to! Whenever you wish to store the output of a calculation for use in following calculations, you need to assign the calculated expression to an object (a variable). The assignment operator is <-
variable <- assigned expression
e.g., x <- 3*10+4/2
x
[1] 32
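The difference between calculating and storing can be seen in this short sketch (y is a made-up name used only here):

```r
3 * 10 + 4 / 2        # the result is printed in the Console, but not stored

x <- 3 * 10 + 4 / 2   # the result is stored in x; nothing is printed
x                     # typing the name prints the stored value

y <- x / 4            # x can now be reused in later calculations
y
```

After running these lines, both x and y appear in the workspace.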
Step 3 - R studio basics: Installing & loading packages
• The basic R installation allows you to do quite a lot, but the real power of R comes from the extensive range of freely available packages.
• Whatever you need to do, R has a package for it!
• Baayen's "Analyzing linguistic data with R" comes with a package, languageR, which you should install before the course starts. Here is how to do it in R studio.
Go to Tools --> Install packages
If prompted to create a package repository, just follow the instructions.
Then, type the package name under Packages, and hit Install.
IMPORTANT NOTE: your computer needs to be connected to the Internet for this to work!!!
• To load a package in the current session, type the following in the Console:
library(PackageName)
e.g., library(languageR)
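If you prefer typing to using the menus, installing and loading can also be done from the Console. The sketch below wraps both steps in load_or_install(), a hypothetical helper written for this example (it is not a function that comes with R). It is demonstrated on "stats", which ships with R, so nothing is downloaded.

```r
# load_or_install() is a made-up helper for this sketch
load_or_install <- function(pkg) {
  # requireNamespace() returns FALSE if the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs an Internet connection
  }
  # character.only = TRUE lets library() accept the name as a string
  library(pkg, character.only = TRUE)
}

load_or_install("stats")        # already installed, so it is only loaded
# load_or_install("languageR")  # the package used in this course
```

Remove the # in front of the last line to apply the same steps to languageR.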
Step 4 - R basics: Reading in the data
• The next two sections are based on Baayen (2008). Analyzing linguistic data with R. Chapter 1 (up to and including section 1.3)
• You should have installed and loaded languageR before you continue!
• Please, go through these brief exercises and read up to p. 10 of Baayen's book before the course starts.
• First, run the following line:
• write.table(verbs, file = "/users/yourusername/yourfolder/dativeS.csv") #for Mac
• write.table(verbs, file = "C:/yourfolder/dativeS.csv") # for Windows
• This will create a file (dativeS.csv) at the specified location on your computer.
• If you open the file using any text editor, Excel, or Numbers, you will see that it contains 5 columns and several rows.
• The first row does not contain data, but gives the column names: RealizationOfRec, Verb, AnimacyOfRec, AnimacyOfTheme, LengthOfTheme.
• You will normally have your data in .csv or .txt format saved in the same directory as your R script. Make sure this directory is set as the current working directory (see Step 3).
• Then, read the data as follows:
• verbs <- read.table("dativeS.csv", header=TRUE) # NOTE: this only works if yourfolder (as specified in the write.table command) is the current working directory!
• This creates an R object called verbs (you can see it in your Workspace now) that is of a special kind, namely a data frame.
• By specifying header=TRUE (which can be abbreviated to header=T) you tell R that the first row in the file dativeS.csv is to be treated differently from all other rows, as it gives the column names.
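The write-then-read round trip can be tried without languageR using the sketch below. The data frame toy and its values are made up, and tempfile() is used only so that the example runs on any machine.

```r
# A small made-up data frame standing in for verbs
toy <- data.frame(Verb = c("give", "send", "offer"),
                  LengthOfTheme = c(1.1, 2.3, 0.7))

# Write it to a temporary file (in practice: a file in your working directory)
toy_file <- tempfile(fileext = ".csv")
write.table(toy, file = toy_file)

# header = TRUE (short form: T) says the first row holds the column names
toy_back <- read.table(toy_file, header = TRUE)
toy_back
names(toy_back)
```

The object toy_back is again a data frame with the same two columns.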
• In R, you can also create and work with other types of objects: common ones include vectors, matrices, and lists. You can check what type of object you are dealing with using:
• class(objectname)
• Now, if you type names(verbs), R will return the names of the columns in the data frame (as a vector of strings).
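To see class() and names() at work, here is a sketch with one made-up object of each kind (the names v, m, l and d are used only in this example):

```r
v <- c(1.5, 2, 3.5)                        # a numeric vector
m <- matrix(1:6, nrow = 2)                 # a matrix with 2 rows and 3 columns
l <- list(label = "a", values = v)         # a list
d <- data.frame(Verb = c("give", "send"),  # a small data frame
                LengthOfTheme = c(1.1, 2.3))

class(v)
class(m)
class(l)
class(d)

names(d)   # the column names of the data frame
```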
• Also, in R studio, if you type verbs$ and then press the Tab key, you will be presented with the list of column names.
• Because verbs is a data frame you can refer to the columns within it with this syntax:
• dataframename$columnname
• e.g., verbs$Verb
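A column selected with $ can be used like any other vector. A minimal sketch, with a made-up data frame toy standing in for verbs:

```r
toy <- data.frame(Verb = c("give", "send", "offer"),
                  LengthOfTheme = c(1.1, 2.3, 0.7))

toy$Verb                  # all the values in the column Verb
toy$LengthOfTheme         # all the values in the column LengthOfTheme
mean(toy$LengthOfTheme)   # a column can be passed to a function
```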
• Alternatively, you can refer to any combination of columns and rows in a data frame using subscripting, with the following syntax:
• dataframename[rowID,columnID]
• You can refer to a row or column using the row/column number submitted as an integer:
Examples
verbs[1,1] # returns the element at the intersection of the first row and first column
verbs[1,] # returns all the elements in the first row (from all columns)
verbs[,1] # returns all the elements from the first column (from all rows)
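The same subscripts can be tried on a small made-up data frame (toy and its values are assumptions for this sketch, not part of languageR):

```r
toy <- data.frame(Verb = c("give", "send", "offer"),
                  AnimacyOfTheme = c("inanimate", "animate", "inanimate"),
                  LengthOfTheme = c(1.1, 2.3, 0.7))

toy[1, 1]   # the element in the first row and first column
toy[2, ]    # the whole second row (still a data frame)
toy[, 3]    # the whole third column (a vector)
```

Note that leaving the position before or after the comma empty means "all rows" or "all columns".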
• You can refer to a subset of rows/columns in (at least) three ways:
a. 1:4 # all rows/columns from (and including) 1 to (and including) 4
b. c(2,6) # rows/columns 2 and 6
NOTE: c is the concatenate function, useful to create vectors. Vectors can be numeric or string:
c(4,7:10,6) is the vector: (4,7,8,9,10,6)
c("This","is","a","string","vector") is the vector: ("This","is","a","string","vector")
NOTE: for columns, you can also use a string vector of column names
e.g., verbs[1:5,c("Verb","LengthOfTheme")]
c. Use logical subscripting. This is very powerful, and will be described in the next section
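Options a and b, and the use of column names, can be sketched as follows. The data frame toy and the vector idx are made up for this example.

```r
idx <- c(4, 7:10, 6)   # c() glues single numbers and ranges into one vector
idx

toy <- data.frame(Verb = c("give", "send", "offer", "pay", "tell", "lend"),
                  LengthOfTheme = c(1.1, 2.3, 0.7, 3.6, 1.9, 4.2))

toy[1:4, ]                               # a. rows 1 to 4, all columns
toy[c(2, 6), ]                           # b. rows 2 and 6, all columns
toy[1:3, c("Verb", "LengthOfTheme")]     # rows 1 to 3, columns chosen by name
```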
Step 4 - R basics: Logical subscripting
This allows you to select only those rows of a data frame that meet certain conditions.
For example, you can select only the rows for which Length of Theme is greater than 3.4
verbs[verbs$LengthOfTheme>3.4,]
Or, you can select only the rows that refer to verbs with animate themes
verbs[verbs$AnimacyOfTheme == "animate",]
Some logical operators:
- identity, equal to: == (DO NOT USE: =)
- not equal to: !=
- AND: &
- OR: | (i.e., the vertical bar)
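These operators can be combined inside the square brackets. The sketch below uses a made-up data frame toy in place of verbs; the object names long_theme, both, either and not_animate are also assumptions for this example.

```r
toy <- data.frame(Verb = c("give", "send", "offer", "pay"),
                  AnimacyOfTheme = c("inanimate", "animate", "inanimate", "animate"),
                  LengthOfTheme = c(1.1, 3.6, 4.2, 0.7))

# A condition returns one TRUE or FALSE per row
long_theme <- toy$LengthOfTheme > 3.4
long_theme

# Rows where the condition is TRUE are kept
toy[long_theme, ]

# AND: both conditions must hold
both <- toy[toy$LengthOfTheme > 3.4 & toy$AnimacyOfTheme == "animate", ]

# OR: at least one condition must hold
either <- toy[toy$LengthOfTheme > 3.4 | toy$AnimacyOfTheme == "animate", ]

# Not equal to
not_animate <- toy[toy$AnimacyOfTheme != "animate", ]
```

Printing both, either, or not_animate in the Console shows which rows were selected in each case.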