Using R. 4/10/2012. Geoff Black, Matthew Goglia.
TRANSCRIPT
What is R?
R is a free program for statistical analysis and graphical display of data.
R uses code; these saved scripts can be easily used again to perform calculations on new data
It is particularly strong at performing matrix calculations
Gives us the ability to quickly pull various data in a usable format
Before Beginning R
1. Download R: http://cran.cnr.berkeley.edu/
2. Demonstration
   Basic Arithmetic
   Matrices
   Treasury Bill data
Before Beginning R
1. Create an Excel Spreadsheet
   Input all ETF symbols that you want data for into one column
   Save as a .CSV file, for example “symbols.csv”
2. Make sure your spreadsheet and R are in the same directory
Create a new folder on your desktop and save your spreadsheet and a copy of R in it
Set R to start in the same directory as your spreadsheet by editing “properties” for PC or “preferences” for Mac
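If you would rather not edit the shortcut, the working directory can also be set from inside R. A minimal sketch: the folder name "R-ETF" is hypothetical, and tempdir() is used only so the example runs on any machine. Replace project_dir with the folder that holds your spreadsheet.

```r
# Hypothetical folder; in practice point this at the folder with symbols.csv.
project_dir <- file.path(tempdir(), "R-ETF")
dir.create(project_dir, showWarnings = FALSE)

setwd(project_dir)                 # make it the working directory
print(getwd())                     # confirm where R is now looking
print(file.exists("symbols.csv"))  # TRUE once the spreadsheet is saved there
```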
Install and Load Yahoo Finance Package
Open R and type in the following:
>if (!require(fImport)) install.packages('fImport')
>library("fImport")
You are now able to import market data from Yahoo.
(“>” is the command prompt in R; it is not part of the code, so do not type it.)
Pull Symbols from Excel Spreadsheet
Type in the following:
>symbols <- scan("symbols.csv",what=character(),sep = ",")
“<-” is the assignment operator: it stores the value on the right under the name on the left.
Now R knows which ETFs to pull data for. Make sure to use the name of the spreadsheet that you created if you did not name it “symbols.csv”.
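To see what scan() returns, here is a small sketch that writes a one-column file and reads it back. The three tickers and the temporary file are made up for illustration; your own “symbols.csv” takes their place.

```r
# Write a stand-in for symbols.csv with one (illustrative) ticker per row.
demo_file <- tempfile(fileext = ".csv")
writeLines(c("SPY", "EFA", "AGG"), demo_file)

# Same call as in the slide, pointed at the demo file.
demo_symbols <- scan(demo_file, what = character(), sep = ",", quiet = TRUE)
print(demo_symbols)   # a character vector, one element per ticker
```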
Load Yahoo Data for ETFs
Type in the following:
>stockdata <- yahooSeries(symbols,nDaysBack = 365*3)
This code pulls the last 3 years of data for each ETF in your Excel spreadsheet. You can change the number of years by changing the last digit in the code.
Remove Unnecessary Data
Type in the following:
>c <- 1:(ncol(stockdata)/6)*6
>stockadj <- stockdata[,c]
Yahoo gives us six columns of data for each symbol, and we only need the 6th (the adjusted closing price, hence the name “stockadj”). The first line defines “c” as the positions of every 6th column: 6, 12, 18, and so on.
The second line defines “stockadj” as just those columns of the Yahoo data we received.
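The index expression works because “:” is evaluated before “*” in R, so 1:(n/6)*6 means (1, 2, ..., n/6) times 6. A small sketch, assuming a data set with 18 columns (three symbols); the variable names here are illustrative only.

```r
n_cols <- 18                  # assumed: 3 symbols x 6 columns each
idx <- 1:(n_cols / 6) * 6     # same expression as in the slide
print(idx)                    # positions of every 6th column

# Equivalent and more explicit:
idx2 <- seq(6, n_cols, by = 6)
print(all(idx == idx2))
```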
Calculate Daily Returns
Type in the following:
>returns <- stockadj/lag(stockadj,k=1)-1
This code defines returns as: (price/previous day’s price)-1
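The same arithmetic on a plain vector, as a sketch. The three prices are made up, and the shifted vector “prev” is a base-R stand-in for what lag() does to the price series.

```r
prices <- c(100, 110, 99)                 # made-up prices for three days
prev   <- c(NA, prices[-length(prices)])  # previous day's price; none for day 1
demo_returns <- prices / prev - 1         # (price / previous day's price) - 1
print(demo_returns)                       # first entry is NA: no previous day
```

The first day has no previous price, so its return is missing (NA).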
Write to File
Type in the following:
>write.table(returns, "returns.csv", sep=",", col.names=NA)
This code will produce a CSV file, which Excel can open, in your working directory under the name “returns.csv”. You now have 3 years of daily returns for each ETF in your original spreadsheet.
Final Code
if (!require(fImport)) install.packages('fImport')
library("fImport")
symbols <- scan("symbols.csv",what=character(),sep = ",")
stockdata <- yahooSeries(symbols,nDaysBack = 365*3)
c <- 1:(ncol(stockdata)/6)*6
stockadj <- stockdata[,c]
returns <- stockadj/lag(stockadj,k=1)-1
write.table(returns, "returns.csv", sep=",", col.names=NA)
Now that you know how to write the code, you can just copy and paste the above into R.
You are now ready to use R to solve for the global minimum variance portfolio (GMVP).
A Review
1 Make a Covariance Matrix of returns above the risk-free rate
2 Make an Inverse Matrix
3 Make a Vector of Ones
4 Multiply the Inverse Matrix by the Vector of Ones
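The four steps above can be sketched in a few lines of R. The 2x2 covariance matrix is made up for illustration, and the last line adds one step the list leaves implicit: dividing by the total so the weights sum to 1.

```r
# Step 1: a made-up covariance matrix for two assets.
demo_sigma <- matrix(c(0.04, 0.01,
                       0.01, 0.09), nrow = 2, byrow = TRUE)

demo_inv  <- solve(demo_sigma)       # Step 2: inverse matrix
demo_ones <- rep(1, 2)               # Step 3: vector of ones
raw <- demo_inv %*% demo_ones        # Step 4: inverse times ones

demo_w <- as.numeric(raw / sum(raw)) # rescale so the weights sum to 1
print(demo_w)
```

This unconstrained version allows short positions and any weight size; the quadratic programming approach that follows is what adds the no-short and upper-bound constraints.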
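This is the basic equation we need to solve, in the form used by the quadprog package:
$$\min_b \left( -d^T b + \frac{1}{2} b^T D b \right) \quad \text{subject to} \quad A^T b \ge b_0$$
Here D is the covariance matrix, d is a vector of zeros, and b holds the portfolio weights.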
Source: cran.r-project.org/web/packages/quadprog/quadprog.pdf
Install the Quadratic Programming Package
>if (!require(quadprog)) install.packages('quadprog')
>library("quadprog")
Definitions
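>n <- dim(returns)[1]
>p <- dim(returns)[2]
>ub <- rep(.1,p)
>zeros <- numeric(p)
>ones <- zeros +1
>dim(ub) <- c(p,1)
>dim(ones) <- c(p,1)
>dim(zeros) <- c(p,1)
n is the number of rows (days), p is the number of columns (ETFs), ub is the upper bound on each weight (10%), and zeros and ones are column vectors of length p.
With a 10% cap on every weight you need at least 10 ETFs; with fewer, the weights cannot sum to 1 and the problem has no solution.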
Construct the A-Matrix and B-Vector
>Atmp <- rbind(t(ones),diag(p),-diag(p))
>Amat <- t(Atmp)
>bvec <- rbind(1,zeros,-ub)
Constraints: Ax >= b
Sum of stock weights = 1
No shorts
Weights must be below the upper bound
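To see what these objects look like, here is a sketch with p = 3 and a 50% upper bound. Both values are chosen only to keep the example small; the demo_ names and the trial weights are illustrative.

```r
demo_p     <- 3
demo_ub    <- matrix(0.5, demo_p, 1)   # assumed 50% cap, for illustration
demo_zeros <- matrix(0, demo_p, 1)
demo_ones  <- matrix(1, demo_p, 1)

demo_Atmp <- rbind(t(demo_ones), diag(demo_p), -diag(demo_p))
demo_Amat <- t(demo_Atmp)                      # p rows, 1 + 2p columns
demo_bvec <- rbind(1, demo_zeros, -demo_ub)    # 1 + 2p rows

print(demo_Atmp)   # row 1: sum of weights; rows 2-4: w >= 0; rows 5-7: -w >= -ub

# Check a trial set of weights against every constraint row.
trial_w  <- c(0.5, 0.3, 0.2)
feasible <- all(demo_Atmp %*% trial_w - demo_bvec >= -1e-12)
print(feasible)
```

The upper bound is written as -w >= -ub because every row has to be in "greater than or equal" form.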
Build the Covariance Matrix
>sigma<- cov(returns, use="pairwise")
Pairwise means each covariance is calculated from the days on which both ETFs have a return. Days where either value is missing are left out for that pair.
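A small sketch of the difference, using made-up returns with one missing value in each column. “pairwise” in the slide is an abbreviation of the full option name used here.

```r
# Made-up daily returns for two ETFs, each with one missing day.
demo_m <- cbind(A = c(NA, 0.01, 0.02, -0.01, 0.03),
                B = c(0.02, NA, 0.01, 0.00, 0.02))

cov_default  <- cov(demo_m)                               # NAs propagate
cov_pairwise <- cov(demo_m, use = "pairwise.complete.obs")

print(cov_default)
print(cov_pairwise)
```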
Solve for GMVP and Write to Table
>gmvp=solve.QP(sigma,zeros,Amat,bvec,meq=1)
>gmvp$solution <- round(gmvp$solution,2)
>soln<-matrix(cbind(symbols,gmvp$solution),nrow=p,ncol=2)
>write.table(soln, "soln.csv", sep=",",col.names=F,row.names=F)
meq = 1 means the first constraint (the weights sum to 1) is an equality; all the other constraints are inequalities.
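A toy run of the solver, as a sketch. The three-asset covariance matrix and the 50% upper bound are made up so the result is easy to check by hand, and the requireNamespace() guard simply skips the example if quadprog is not installed.

```r
toy_p     <- 3
toy_sigma <- diag(c(0.04, 0.09, 0.16))  # made-up, uncorrelated assets
toy_ub    <- rep(0.5, toy_p)            # assumed 50% cap per asset

# Same constraint layout as above: sum = 1, w >= 0, -w >= -ub.
toy_Amat <- t(rbind(rep(1, toy_p), diag(toy_p), -diag(toy_p)))
toy_bvec <- c(1, rep(0, toy_p), -toy_ub)

if (requireNamespace("quadprog", quietly = TRUE)) {
  toy_fit <- quadprog::solve.QP(Dmat = toy_sigma, dvec = rep(0, toy_p),
                                Amat = toy_Amat, bvec = toy_bvec, meq = 1)
  toy_w <- toy_fit$solution
  print(round(toy_w, 2))   # weights sum to 1 and none exceeds 0.5
}
```

The least risky asset would take more than half the portfolio without the cap, so the solver holds it at the upper bound and splits the rest between the other two.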
Final Code
if (!require(quadprog)) install.packages('quadprog')
library("quadprog")
n <- dim(returns)[1]
p <- dim(returns)[2]
ub <- rep(.1,p)
zeros <- numeric(p)
ones <- zeros +1
dim(ub) <- c(p,1)
dim(ones) <- c(p,1)
dim(zeros) <- c(p,1)
Atmp <- rbind(t(ones),diag(p),-diag(p))
Amat <- t(Atmp)
bvec <- rbind(1,zeros,-ub)
sigma<- cov(returns, use="pairwise")
gmvp=solve.QP(sigma,zeros,Amat,bvec,meq=1) #meq=1 means first row of constraints is an equality
gmvp$solution <- round(gmvp$solution,2)
soln<-matrix(cbind(symbols,gmvp$solution),nrow=p,ncol=2)
write.table(soln, "soln.csv", sep=",",col.names=F,row.names=F)
Now that you know how to write the code, you can just copy and paste the above into R to solve for GMVP.