Thursday, July 10, 2014

Introduction to R: Other Textual Formats in R

Last time we discussed how to save a .csv file. In this post we will discuss the other textual formats available in R, that is, the functions that handle formats other than .csv or .txt.

There are two functions for writing other textual formats in R: dump() and dput(). They are important because they store data as R code in plain text, which makes the result editable and recoverable. The downside of dump() and dput() is that their output is not space efficient.

The dput() Function

This function is another way to pass data around in R, by deparsing R objects. A file written by dput() can be read back with dget(). dput() takes an R object and generates R code that will reconstruct that object in R.

Example:
We will create a small data frame with two columns. The first column will be named "d" and the second column will be named "f". The value of d will be set to 10, while the value of f is set to "d".

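The example just described can be sketched roughly as follows. This is a minimal reconstruction from the description in this post, not the original console session. The file is written to a temporary directory (an assumption made here so the sketch leaves the working directory untouched), whereas the post writes to "z.R" directly.

```r
# Build the data frame: column d holds 10, column f holds "d".
z <- data.frame(d = 10, f = "d")

# Print the deparsed R code that would rebuild z.
dput(z)

# Write the same code to a file. The temporary path is an assumption
# for this sketch; the post uses file = "z.R".
path <- file.path(tempdir(), "z.R")
dput(z, file = path)

# Read the file back and rebuild the object under a new name.
x <- dget(path)

# Compare the rebuilt object with the original.
all.equal(x, z)
```

The exact text printed by dput(z) depends on the R version, for example on whether f is stored as a factor or as a character column, so no output is shown here.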

In the first part of our example, we have the expression z<-data.frame(d=10, f="d"). When the data frame is passed to dput(), the function prints R code that rebuilds it as a list with two elements. The attributes come at the end of that code, including class="data.frame". To retrieve the object later, we must dput it into a file, as in dput(z, file="z.R"). The file can then be read back with the dget() function, in the example x<-dget("z.R"). In short, dput() writes R code that can be used to reconstruct an R object.

Dumping Objects in R

If we have multiple objects that we want to deparse in R, we can use the dump(c(), file=" ") function. Dumping is quite similar to dput(); the main difference is that dump() is used for multiple objects while dput() is used for a single object. dump() takes a character vector that contains the names of the objects to write.

Example:
We create two objects, x and y. We assign the character vector "owl" to x and a data frame (a=10, b="a") to y.
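
A minimal sketch of this example is given below. The file name "data.R" and the use of a temporary directory are assumptions made for illustration. A dumped file is read back with source(), which re-runs the assignments stored in the file.

```r
# Create the two objects described above.
x <- "owl"
y <- data.frame(a = 10, b = "a")

# Dump both objects to one file. dump() takes their names as a character
# vector. The file name is hypothetical.
path <- file.path(tempdir(), "data.R")
dump(c("x", "y"), file = path)

# Remove the objects, then restore them by sourcing the dumped file.
rm(x, y)
source(path)

# Inspect the restored objects.
x
str(y)
```

Unlike dget(), which returns a single object that we assign to a name of our choice, source() recreates the objects under the names they had when they were dumped.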