Thursday, June 19, 2014

Introduction to R: Sub-Setting Part II

Previously we discussed how to sub-set an object or an element in R. Now we will discuss how to sub-set objects and elements from a list. The principles are the same: the operators [], [[]] and $ can be used.

Let's proceed to the exercises. Create a list with two elements. Name the first element after your first name and assign it the sequence 1 to 5. Name the second element after your favourite fruit and assign your favourite number as its value.
a. Extract the first element in list form.
b. Extract the first element in sequence form.
c. Extract the second element using $ and [[]] operators.
d. Extract the second element in list form by using the name of the element of interest.
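One possible solution to the exercise is sketched below. The names gerard and durian and the value 7 are taken from the examples used later in this post; the variable names a, b, c1, c2 and d are only illustrative.

```r
# A list with two elements: a sequence and a single number
x <- list(gerard = 1:5, durian = 7)

a  <- x[1]         # a. first element in list form (a list of length 1)
b  <- x[[1]]       # b. first element in sequence form (the vector 1:5)
c1 <- x$durian     # c. second element using $
c2 <- x[[2]]       # c. second element using [[ ]]
d  <- x["durian"]  # d. second element in list form, selected by name

print(a)
print(b)
print(c1)
print(c2)
print(d)
```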

The [] operator returns an output of the same class as the original object. Since x is a list, the expression x[1] returns a list that contains the sequence 1 to 5. The [[ ]] operator returns the element itself, so x[[1]] is just the sequence of numbers 1 to 5.
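The difference in class can be checked directly with class(). This is a small sketch that assumes the same two-element list as in the exercise; the variable names cls_single and cls_double are only illustrative.

```r
x <- list(gerard = 1:5, durian = 7)

cls_single <- class(x[1])    # [] keeps the list wrapper
cls_double <- class(x[[1]])  # [[ ]] returns the element itself

print(cls_single)  # "list"
print(cls_double)  # "integer"
```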

You can also use the name of an element inside the [] operator. In the x["durian"] example, we use the name "durian" instead of the number index 2.

The [] operator can also extract multiple elements from a list when you pass it a vector of indices built with the c() function. Let's proceed to the exercises:
Create a list with three elements. Name the first element after your first name and assign it the sequence 1 to 5. Name the second element after your favourite fruit and assign your favourite number as its value. Name the third element after your favourite animal and assign it a value equal to the number of its legs.
a. Extract the elements 1 and 3 from the list that you have created.
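A possible solution is sketched below, using the names gerard, durian and owl and the values that appear elsewhere in this post. The variable names by_index and by_name are only illustrative.

```r
x <- list(gerard = 1:5, durian = 7, owl = 2)

# Extract elements 1 and 3 by number index; the result is still a list
by_index <- x[c(1, 3)]
print(by_index)

# The same elements can also be selected by name
by_name <- x[c("gerard", "owl")]
print(identical(by_index, by_name))  # TRUE
```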

Here we used the expression x[c(1,3)], where 1 is the number index of the first element, named gerard, and 3 is the number index of the third element in the list, named owl.

It should be noted, however, that the [[ ]] and $ operators behave differently when used to retrieve an element from a list. The [[ ]] operator can be used with a computed index, such as a variable that holds a name, while the $ operator can only be used with a literal name. Let's take the above example:
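A minimal sketch of this comparison, assuming the list x <- list(gerard=1:5, durian=7, owl=2) quoted later in this post and a variable called name that holds the string "gerard":

```r
x <- list(gerard = 1:5, durian = 7, owl = 2)
name <- "gerard"          # a variable holding the name of an element

via_computed <- x[[name]]   # [[ ]] evaluates name first, so this is x[["gerard"]]
via_literal  <- x$name      # $ looks for an element literally called "name"
via_gerard   <- x$gerard    # $ with the literal name of the element

print(via_computed)  # 1 2 3 4 5
print(via_literal)   # NULL
print(via_gerard)    # 1 2 3 4 5
```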

Here we have created a new variable, name <- "gerard", which holds the name of the element gerard=1:5. After this assignment, the variable name evaluates to the string "gerard". Notice that the list x only contains the elements "gerard", "durian" and "owl", and there is no element called "name". Because the [[ ]] operator evaluates its index first, x[[name]] is equivalent to x[["gerard"]].

However, the $ operator takes the name that follows it literally. x$gerard looks for an element named "gerard" in the list, while x$name looks for an element named "name" and does not use the value stored in the variable name. This is why x$name returns NULL: the name "name" does not exist in the original list x<-list(gerard=1:5, durian=7, owl=2).

SUB-SETTING ELEMENTS FROM A LIST NESTED INTO ANOTHER LIST

The [[ ]] operator can also take a vector of number indices, as in [[c(number index, number index)]], to extract an element from a list that is nested within another list. To illustrate this, we go to our next exercise.

Say, we want to extract the number 5 from the expression:

x<-list(gerard=list(1,3,5), durian=list(2,4,6), owl=list(7,8,9))

As you can observe, the number 5 is the third element of a list named "gerard". Furthermore, the list named "gerard" is the first element of another list, "x". To extract the number 5 we need to use the number index method we previously discussed. The number index of the number 5 in the list "gerard" is 3, while the number index of the list "gerard" in the original list "x" is 1.

Using the [[c()]] operator, the expression becomes x[[c(1,3)]]: 1 is the number index of "gerard" and 3 is the number index of "5".
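The nested extraction can be tried with the list given above. The sketch also shows the equivalent two-step form x[[1]][[3]], which is not part of the original exercise; the variable names nested and stepwise are only illustrative.

```r
x <- list(gerard = list(1, 3, 5),
          durian = list(2, 4, 6),
          owl    = list(7, 8, 9))

# First index picks the list "gerard", second index picks its third element
nested <- x[[c(1, 3)]]
print(nested)  # 5

# Equivalent: apply [[ ]] twice, one level at a time
stepwise <- x[[1]][[3]]
print(identical(nested, stepwise))  # TRUE
```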

PARTIAL MATCHING IN R

If an element in a list has an incredibly long name and you want to save time typing, you can use partial matching with the $ and [[ ]] operators.

For example, say we have a list x with two elements. The first is named pneumonoultramicroscopicsilicovolcanosis and holds the sequence 1 to 5. The second has another long name, monosodiumglutamate, and holds the sequence 6 to 9. Since it would take time to type the word pneumonoultramicroscopicsilicovolcanosis every time we want to know its value, we can use the expression x$p instead.

The $p expression searches the list for an element whose name begins with the letter p. Partial matching only works when exactly one name matches; if several names begin with p, the result is NULL.
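A short sketch of partial matching with $, using the two long names described above. The variable names first and second are only illustrative.

```r
x <- list(pneumonoultramicroscopicsilicovolcanosis = 1:5,
          monosodiumglutamate = 6:9)

first  <- x$p   # only one name begins with "p", so it is matched
second <- x$m   # only one name begins with "m", so it is matched

print(first)   # 1 2 3 4 5
print(second)  # 6 7 8 9
```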

Care should be taken because R is case sensitive. x$P returns NULL because no name in the list begins with an uppercase P, and X$p fails with an error unless an object named X exists, because X and x are different objects. The [[ ]] operator takes a different approach. Typing x[["p"]] returns NULL, because by default the [[ ]] operator searches for an exact match, and the list x does not have an element named "p". To allow partial matching, add the argument exact=FALSE inside the operator, as in x[["p", exact=FALSE]].
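These cases can be checked as follows. The sketch assumes the same list x as above and that no object called X exists in the session; tryCatch() is used only so that the error from X$p does not stop the script, and the variable names are illustrative.

```r
x <- list(pneumonoultramicroscopicsilicovolcanosis = 1:5,
          monosodiumglutamate = 6:9)

wrong_case  <- x$P                        # NULL: names are case sensitive
exact_only  <- x[["p"]]                   # NULL: [[ ]] needs an exact match by default
partial_ok  <- x[["p", exact = FALSE]]    # partial matching switched on

# Assumes no object named X exists, so evaluating X$p raises an error
wrong_object <- tryCatch(X$p, error = function(e) "error")

print(wrong_case)
print(exact_only)
print(partial_ok)
print(wrong_object)
```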
