R is an open-source programming language that is popular among statisticians for its computing and graphical prowess.  With that said, it has a high start-up cost.  There aren't many good "how-to" guides or manuals out there.  And although the open-source model makes the number of packages you can install almost infinite, there is no guarantee that they will work or give you the correct results.  If you have a question, searching on Google doesn't help much.  Instead, I rely on Nabble.

To get rolling, download the software from a CRAN mirror and install the program.  You might also want to install a friendly work console called RStudio.  

When you open R up, if you have not read a tutorial or used a similar program like S or Matlab before, you might be lost.  Below are some pointers that will help get you up and running.  Eventually, I hope to write up a series of lectures that gives an all-around crash course in the basics of data manipulation for both R and Stata.  I'm also working on writing up parts of textbooks in both languages so that students can pick their preferred software and work at getting better.  Personally, I love the open-source idea of R, but I've become so comfortable with the ease of Stata that it's hard to let go...and yet when I do go back to Stata, I find myself longing for the object-oriented arrays of R.  Obviously, there are tradeoffs to everything.

Getting started with R

The basic idea is that you've got to load libraries (aka packages).  Some come pre-installed (see the drop-down menus) and others are floating about on the Internet.  If a library came pre-installed or you have already downloaded it, you can load it by typing

> library(tools)     # calls up the "tools" package
> library(help="tools")  # lets you figure out what's in the tools package
> ?makeLazyLoading   # gives you the help file for the command "makeLazyLoading" that is in the "tools" package so you can determine its syntax
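
If a package did not come pre-installed, it has to be installed once before library() will find it.  Here is a minimal sketch of how I might wrap that up; load_or_install is a hypothetical helper name of my own, not something built into R, and the install step assumes you have an Internet connection and a CRAN mirror selected.

```r
# Hypothetical helper (not part of base R): load a package,
# installing it from CRAN first only if it is not already available.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs an Internet connection and a CRAN mirror
  }
  library(pkg, character.only = TRUE)   # pkg is a string, hence character.only
  invisible(TRUE)
}

load_or_install("tools")   # "tools" ships with R, so nothing is downloaded here
```

The same call with the name of a contributed package would trigger the download the first time and simply load the package after that.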

Notice that the typical way to denote lines of R code is with a ">" (this is the R prompt, so you don't type it yourself) as opposed to the "." in Stata.  The comment character is the # sign instead of the % or * used in other programs like LaTeX and Stata.  The other important tidbit you should know is that file paths in R use forward slashes (/), the opposite of the backslashes (\) that Windows uses; if you want to keep the backslashes, you have to double them (\\).

> setwd("C:/my files/fall classes/homework")
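
If you'd rather not type the slashes by hand, R can build the path for you.  The sketch below is only an illustration: the folder and file names are made up, and it switches to a temporary folder (rather than a real homework folder) so that it runs on any machine.

```r
# Remember where we started so we can switch back afterwards.
old_dir <- getwd()

# Move to a folder that is guaranteed to exist (a temporary one here).
setwd(tempdir())

# file.path() glues the pieces together with forward slashes on every system.
# The folder and file names are hypothetical.
hw_path <- file.path("fall classes", "homework", "ibm.txt")
print(hw_path)

# Return to the original working directory.
setwd(old_dir)
```

Saving the old directory first is just a habit of mine; it keeps a script from leaving you somewhere unexpected.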

The working directory matters whenever you save or load files.  R script files are saved with the extension .R, and you may have to type the extension yourself when saving.  There are a few ways to get data into R (assuming you aren't typing in vectors or matrices by hand, which is where most manuals start you out).  The first step is to get some data, so go onto a financial website, like Yahoo! Finance, and download a history of stock prices.  Here are some alternatives for reading it in:

> data=matrix(scan(file='ibm19502007.txt'), nrow=5, byrow=FALSE)   # reads a file of numbers into a 5-row matrix, filled column by column
> data=read.table("ibm19502007.txt")   # reads the file into a data frame
> data=read.table("ibm19502007.txt",header=TRUE)   # works if 1st row is var names
> ibm=t(data[3,])      # this defines ibm as the transpose of the 3rd row
> ibm<- t(data[3,])    # same as above; "<-" is the more common assignment operator in R than the equal sign
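
If you don't have a data file handy yet, here is a small self-contained sketch of the read.table route.  The three rows of numbers are made up purely for illustration (they are not real IBM returns), and the file lives in a temporary location rather than your working directory.

```r
# Write a tiny whitespace-separated file with a header row.
# The values are invented for illustration only.
tmp <- tempfile(fileext = ".txt")
writeLines(c("date ibm",
             "19500131 0.012",
             "19500228 -0.004",
             "19500331 0.021"), tmp)

# header = TRUE tells R that the first row holds the variable names.
toy <- read.table(tmp, header = TRUE)
str(toy)             # shows the structure: 3 observations of 2 variables

# Pull one column out by name; this gives a plain numeric vector.
ibm_toy <- toy$ibm
summary(ibm_toy)
```

Picking a column by name with the $ sign is often easier to read than picking a row or column by its position.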

Once you've got the data in R, you can get to work.  Commands are functions: each one needs a set of parentheses, with any arguments and options inside them.  Even if the help file doesn't mention an option you've seen elsewhere, you still might be able to use it...which is part of why I say the "help" files aren't great, whereas you would get actual support if you paid for a product like S.  Try these things out:

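> summary(ibm)   # min, quartiles, mean, and max
> plot(ibm)      # quick plot on the screen
> jpeg(filename = "ibmreturns.jpg")   # opens a JPEG file; plots made from here until dev.off() go into it
> colnames(data) <- c("date", "ibm")  # names the two columns of data
> # How do you combine multiple graphs/plots into one?
> par(mfrow=c(2,1))   # 2 rows, 1 column of plots on the same page
> ibm.first = ibm[1:15]
> ibm.second = ibm[16:30]
> plot(ibm.first)
> plot(ibm.second)
> title(main="Different time periods of IBM stocks")   # adds a title to the most recent plot
> dev.off()   # closes the file so the image is actually written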
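
Here is a self-contained sketch of the same stacked-plot idea that you can run without any data file.  A few assumptions on my part: the series is simulated noise rather than real stock data, I write to a PDF in a temporary folder (pdf() works even on machines where the JPEG device isn't available), and I use mtext with an outer margin to get one title over both panels, since title() only labels the most recent plot.

```r
set.seed(1)                                    # make the toy data reproducible
toy_returns <- rnorm(30, mean = 0, sd = 0.05)  # simulated, not real returns

out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)                           # open the graphics file

# Two stacked panels, with room left in the outer top margin for a shared title.
par(mfrow = c(2, 1), oma = c(0, 0, 2, 0))
plot(toy_returns[1:15],  type = "l", main = "Observations 1-15")
plot(toy_returns[16:30], type = "l", main = "Observations 16-30")
mtext("Two sub-periods of a toy series", outer = TRUE)

dev.off()                                      # close the file so it is written
```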

You might find it helpful to write your own functions.  Try this for skewness:

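> skewness <-  function(x) {
>        m3 <- mean((x-mean(x))^3)   # third central moment
>        skew <- m3/(sd(x)^3)        # scaled by the cube of the sample standard deviation
>        skew                        # the last expression is what the function returns
>       }  # I like to indent or add space so I can see that my loops/functions balance
> skewness(ibm)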
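
A quick way to convince yourself that a home-made function behaves is to feed it inputs where you already know roughly what to expect.  The sketch below repeats the function so it runs on its own; the two test vectors are my own toy examples.

```r
# Same definition as above, repeated so this block runs by itself.
skewness <- function(x) {
  m3 <- mean((x - mean(x))^3)   # third central moment
  m3 / (sd(x)^3)                # scaled by the cube of the sample sd
}

symmetric  <- c(1, 2, 3, 4, 5)  # evenly spread around its mean
right_tail <- c(1, 1, 1, 10)    # one large value pulls the right tail out

skewness(symmetric)    # zero, since the cubed deviations cancel out
skewness(right_tail)   # positive, since the long tail is on the right
```

Note that sd() divides by n-1, so this version can differ slightly from packaged skewness functions that use a different scaling.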

Hint: If you're going to use R for financial stuff, you might want to install "FinTS" and "Rmetrics."  To display data on the screen, use the print or cat function.  You might find the paste function useful for gluing text and numbers together, and it helps to know that you can set the number of significant digits with digits=2 (inside print, for example) and add line breaks with "\n" (inside cat).  Another cool thing to know is that the "up" and "down" arrows on your keyboard let you scroll back and forth through previous commands.  When you've saved your work (and script file) and are ready to exit, just type q().
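
To make the display functions concrete, here is a small sketch; the number and the wording of the message are arbitrary choices of mine.

```r
x <- 3.14159

print(x, digits = 2)            # print with 2 significant digits

# paste() joins its pieces into one string, separated by a space by default.
msg <- paste("The value is", round(x, 2))

# cat() writes text as-is, so the line break has to be given explicitly.
cat(msg, "\n")

# format() gives the rounded number back as a string instead of printing it.
short <- format(x, digits = 2)
cat("Rounded:", short, "\n")
```

The main difference in practice is that print shows the value the way the console would, while cat gives you full control over the text and line breaks.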

Finally, I have devoted one of my Stata pages to making maps.  Although a little harder to do in R, sharp(er) images can be produced using packages like maptools, rgdal, and PBSmapping (others include RColorBrewer and ggplot2).