
Lecture 4

More on Functions

Let’s add default arguments and error checking to our discussion of user-defined functions.

Default Arguments for User-Defined Functions

To give an argument a default value that is used when the caller does not specify that argument, put an equals sign and the default value after the argument name inside function() in the function definition.

square5 = function(x=5) {   # including =5 makes the default
     return(x^2)
} # the function is now defined with x=5 as a default

# let's test the function in both situations: with an indicated argument and without an indicated argument
square5(9) # argument x=9 is given, so default is ignored
## [1] 81
square5() # no argument is given, so the default x=5 is used
## [1] 25
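
Defaults can also be mixed with required arguments. The following is a minimal sketch, not part of the original lecture code: the function name power and its argument p are hypothetical, chosen only to show one required argument next to one argument with a default.

```r
# Hypothetical function: raise x to the power p, with p=2 as the default
power = function(x, p=2) {
     return(x^p)
}

power(3)         # p is not given, so the default p=2 is used
## [1] 9
power(3, p=3)    # p=3 is given, so the default is ignored
## [1] 27
power(p=3, x=2)  # named arguments may be given in any order
## [1] 8
```

Since x has no default, calling power() with no arguments at all produces an error.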

Alter the square5 function so that the default output is 64, and choose an appropriate new name for the function in place of square5.

Error Checking for User-Defined Functions

When someone applies a user-defined function, they may not know which inputs are valid inputs. So the creator of a function can code an extra step into the function that checks if an input is a valid input, and notifies the user when the user calls the function on an invalid input. This coding principle is called error checking for a function. It is useful for debugging when the function is used deep inside a program.

square = function(x) {
     if (!is.numeric(x)) {                   # these three
          stop("The input must be numeric.", # lines code     
               call.=FALSE)                  # error checking
     } else {
          return(x^2)
     }
}

#square("this input is not numeric") # uncomment this and try it!
square(10)
## [1] 100
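
A function can carry more than one check, and the caller can capture the error message instead of letting it halt the program. The sketch below assumes a hypothetical function safe_sqrt (not from the lecture) and uses base R's tryCatch() and conditionMessage() to retrieve the message.

```r
# Hypothetical function: square root with two input checks
safe_sqrt = function(x) {
     if (!is.numeric(x)) {
          stop("The input must be numeric.", call.=FALSE)
     }
     if (any(x < 0, na.rm=TRUE)) {
          stop("The input must be non-negative.", call.=FALSE)
     }
     return(sqrt(x))
}

safe_sqrt(16)
## [1] 4

# tryCatch() runs the call and, if an error occurs, hands the error to our
# handler, which returns the message as a string instead of stopping the script
msg = tryCatch(safe_sqrt(-4), error = function(e) conditionMessage(e))
msg
## [1] "The input must be non-negative."
```

Checking early with stop() means a bad input is reported where it enters the function, rather than surfacing later as a confusing NaN or warning.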

Scoping Rules

The environment inside a function is different from the global environment. In particular, a variable defined inside a function body is unknown to the global environment.

square = function(x) {
     y=2  # here y is defined as 2 in the function body 
     return(x^2)
}

# y # uncomment this to see y is not found in the global environment

If we define a variable inside a function body using a name that is already defined in the global environment, there is no conflict: the global variable keeps its value.

y=5   # this defines y = 5 in the global environment
square = function(x) {
     y=2 # here y is defined as 2 in the function body
     return(x^2)
}

y # y remains 5 in the global environment even though in the function body environment y is 2 
## [1] 5

Inside a function body, we can use variables defined in the global environment. When a variable is used inside a function body, R first looks for it in the function body. If it is found there, that value is used. If it is not found there, R looks next in the global environment.

Exercise. Define y=5 in the global environment, and define a function f(x) that returns the value of x+y. Test your function by computing f(15).
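
The lookup order can be seen in a small sketch. The names rate, add_fraction, and shadow are hypothetical and the value 0.5 is made up; they are deliberately different from the names in the exercise.

```r
rate = 0.5   # defined in the global environment (made-up value)

add_fraction = function(x) {
     # rate is not defined in this function body, so R finds the global rate
     return(x + rate * x)
}
add_fraction(10)
## [1] 15

shadow = function(x) {
     rate = 2   # a local rate hides the global one inside this function body
     return(x + rate * x)
}
shadow(10)
## [1] 30

rate   # the global rate is unchanged by the call to shadow()
## [1] 0.5
```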

Getting and Setting your Working Directory

In order to access files from RStudio and the console, you need to either be in the same directory as the file, or provide a path to the file.

To see your current working directory, type getwd() in the console (with empty parentheses).

The simplest way to change your working directory is to use RStudio. In the menu bar select Session, then Set Working Directory, then Choose Directory.

Exercise. Determine your working directory from the console. Change it to the desktop using RStudio, and confirm it worked by checking from the console. Then change the working directory to your Stats 535 folder using RStudio. Confirm it worked by checking from the console.

You can also set the working directory from the console (or in a script) using the command setwd("path_to_directory"). Notice that the quotation marks are needed. Windows users can copy the directory path from the address bar of a folder in File Explorer. However, the pasted path uses backslashes, which R treats as the start of an escape sequence inside a string. So, before pasting a path into setwd(" "), Windows users must either change the backslashes \ to forward slashes /, or change the backslashes \ to double backslashes \\.

# setwd("C:\Users\Me\Desktop") # Pasted from File Explorer, will not work! Backslashes wrong! (commented out so the rest of this chunk still runs)
# So manually change backslashes to forward slashes as follows
setwd("C:/Users/Me/Desktop") # will work

# Or manually change backslashes to double backslashes as follows
setwd("C:\\Users\\Me\\Desktop") # will also work
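
As a hedged alternative to editing slashes by hand, base R's file.path() builds a path from its pieces. The sketch below reuses the made-up path from the example above, and it changes directory to tempdir() only so that it runs on any machine; substitute your own folder in practice.

```r
# file.path() joins the pieces with forward slashes on every platform,
# so no backslashes ever need to be typed or escaped
desktop_path = file.path("C:", "Users", "Me", "Desktop")
desktop_path
## [1] "C:/Users/Me/Desktop"

# Change the working directory temporarily, then restore the original one
old_dir = getwd()    # remember where we started
setwd(tempdir())     # tempdir() always exists, so this line runs anywhere
new_dir = getwd()    # confirm the change
setwd(old_dir)       # go back to where we started
```

Saving the result of getwd() before calling setwd() makes it easy to return to the original directory afterwards.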

Exercise. Determine your working directory from the console. Use setwd(" ") in the console to change the working directory to the desktop, and confirm it worked by checking from the console. Then use setwd(" ") in the console to change the working directory to your Stats 535 folder. Confirm it worked by checking from the console.

Warning: Setting the working directory in the console does not change where a knitted Rmd file looks for files! When knitted, an Rmd file looks first in the folder in which it is located.

Reading Data Files

Preparing your Excel File for Read-in

In everyday work, your data will often be in an Excel file. To get it into R, clean it up inside Excel and then save it as a csv file. Read this on how to clean up your Excel file before saving to csv for read-in. The abbreviation csv stands for comma-separated values. After you save the cleaned file as a csv file, you can use the following method to read it into a data frame in R.

Reading a csv File from Your Local Computer to a Data Frame in R

Put the csv file in the same folder as your Rmd file. Or, if you’re working interactively in the console, set your working directory to be the folder which contains the csv file. Then use the following command to read the csv file to a data frame in the global environment.

dataframename = read.csv("filename.csv")

For illustration purposes, I saved this excerpt from Chetty et al., taken from Professor Ben Hansen’s website, to my local folder.

Download the excerpt, save it in the same folder as this Rmd file, and look at the excerpt using a text editor such as NotePad or WordPad, etc. Notice that commas separate the entries in each line, but there is no comma at the end of a line. Notice the first line consists of column names, i.e. the first line is a header.

I read it in as follows.

chetty_etal_excerpt0 = read.csv('mobility0.csv') # notice quotation marks!

To make sure it is now in a data frame in the global environment, I can look at it by typing View(chetty_etal_excerpt0).
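
View() opens an interactive viewer, so it is mainly useful inside RStudio. As a sketch of non-interactive checks that also work in a knitted document, the base R functions head(), dim(), and str() summarize a data frame. The example uses the built-in mtcars data frame rather than the excerpt, only so that it runs without any download.

```r
first_rows = head(mtcars, 3)   # the first three rows of the data frame
first_rows

size = dim(mtcars)             # number of rows, then number of columns
size
## [1] 32 11

str(mtcars)                    # each column's name, type, and first few values
```

The same three calls can be applied to any data frame produced by read.csv().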

Exercise: The foregoing was excerpt 0. Repeat the procedure for excerpt 1. In other words, download and save this file to the same folder as this Rmd file, and read it in to a data frame called chetty_etal_excerpt1, and look at it by typing View(chetty_etal_excerpt1).

If a csv file does not have a header, use the following.

dataframename = read.csv("filename.csv", header=FALSE)
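
The effect of the header option can be checked on a tiny file. The sketch below writes a made-up three-line csv file to a temporary location (the names and scores are toy data, not from any real source) and reads it both ways.

```r
# Write a toy csv file: one header line followed by two data lines
csv_path = tempfile(fileext = ".csv")
writeLines(c("name,score", "Ann,90", "Bob,85"), csv_path)

# Default: the first line is treated as column names
with_header = read.csv(csv_path)
names(with_header)
## [1] "name"  "score"
nrow(with_header)
## [1] 2

# header=FALSE: the first line is treated as data, and R names the columns V1, V2
no_header = read.csv(csv_path, header=FALSE)
names(no_header)
## [1] "V1" "V2"
nrow(no_header)
## [1] 3
```

Reading a file that has a header with header=FALSE turns the column names into a row of data, so a quick look at the number of rows and the column names is a cheap way to catch the mistake.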

Reading a csv File from the Web

If the csv file is on the web, it is better to read it directly from the web instead of downloading it and then reading it.

Reading directly from the web, as follows, avoids all the unnecessary file downloading above.

chetty_etal_excerpt0 = read.csv('http://dept.stat.lsa.umich.edu/~bbh/s485/data/mobility0.csv')

As before, we use the option header=FALSE if there is no header.

How to Save a Graph to a jpeg or png File for Drag and Drop into a Report or Presentation, and How to Print a Graph to a pdf File for Another Purpose

For your job (and this class!), you will want to paste R graphs into a report or presentation. Often reports and presentations use

All of these file types allow you to drag and drop jpeg files and png files and resize them as needed. They don’t allow you to drag and drop pdf files, but in LaTeX you can use pdf as well as jpeg and png.

Here is the syntax to output a scatterplot to a jpeg file for pasting into a report or presentation.

jpeg("fuel_vs_weight.jpg")    # opens jpeg file, notice jpeg vs jpg!
plot(mtcars$wt, mtcars$mpg)
dev.off()  # closes jpeg file and saves it in same directory as this Rmd file!
           # Even when you run it to the console with CTRL+Shift+Enter,
           # it is saved in the same directory as this Rmd file!

In the foregoing example, you can change jpeg and jpg to png in order to print to a png file, or change both to pdf in order to print to a pdf file.
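
As a sketch of the pdf variant, the following writes the same scatterplot to a pdf file and then checks that the file exists. Writing to tempdir() and the 6 by 4 inch size are assumptions made only so the example runs anywhere; in a report workflow you would use a file name in your own folder.

```r
pdf_path = file.path(tempdir(), "fuel_vs_weight.pdf")   # assumed location
pdf(pdf_path, width=6, height=4)    # opens pdf file; width and height are in inches
plot(mtcars$wt, mtcars$mpg,
     xlab="Weight (1000 lbs)", ylab="Miles per gallon")
dev.off()                           # closes pdf file and saves it

saved = file.exists(pdf_path)       # TRUE if the file was written
saved
## [1] TRUE
```

If dev.off() is forgotten, the file stays open and later plots keep going into it, so it is worth confirming the file after closing the device.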

Exercise: Alter the foregoing example to save the scatter plot to a png file, and paste it into a Word file or google doc file.

Probability Review

Sample spaces, probability measures, random variables, and probability mass functions.

See notes on Canvas.