Wednesday, June 11, 2014

Introduction to R

Now that we have integrated Google Analytics with our Facebook page, let's start learning R. The R language is probably the most powerful and versatile free software for data analytics. It offers comprehensive packages supported by millions of programmers working in academia and research around the world. Commercial analytics programs such as "Oracle R" have recognized the value of this free software and have adopted the R language and its environment to help statisticians, data analysts, and data scientists perform advanced analytics. Oracle R uses the R language to generate sophisticated graphics. The R language has also been integrated with other commercial analytics software such as Adobe Analytics, Google Analytics, and SQL. Even IBM, the maker of SPSS, has used R to extend SPSS's functionality. SAP and TIBCO Spotfire have joined the bandwagon in integrating R. The newest commercial software to integrate R is Tableau 8.1.


Because R is free and is supported by millions of programmers in academia and research, its functionality is probably more powerful and flexible than that of SAS. An R vs. SAS debate has been ongoing in the data analytics community for years. For me, as an avid SPSS user who moved to Stata, then to SAS, and then to R, R is more appealing because of its readily available packages and more flexible environment.

Now let's move on to the more serious discussion. The basic 'atomic classes of objects' in R are:
- Character ("a", "b", "c")
- Complex (1+1i, 1-2i)
- Logical (TRUE/FALSE)
- Integer (1L, 2L, 3L, ...)
- Numeric (real numbers: 1, 2.5, 3.14, ...)
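As a quick check, each atomic class can be inspected with class(). This is only a minimal sketch; the variable names are made up for illustration.

```r
# One example value per atomic class (names are illustrative only).
ch  <- "a"       # character: text must be quoted
cx  <- 1 + 2i    # complex: the imaginary part needs a number before i
lg  <- TRUE      # logical: TRUE or FALSE, written in capitals
int <- 5L        # integer: note the L suffix
num <- 3.14      # numeric: a real number

class(ch)    # "character"
class(cx)    # "complex"
class(lg)    # "logical"
class(int)   # "integer"
class(num)   # "numeric"
```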

The vector is the most basic object in R. It contains one or more values that all belong to the same atomic class. For example, x <- c(1, 2, 3, 4, 5) or x <- 1 is a numeric vector; y <- c("a", "b", "c", "d", "e", "f") or y <- "a" is a character vector; z <- TRUE is a logical vector; and w <- c(1L, 2L, 3L) or w <- 1L is an integer vector. An empty vector, on the other hand, can be made with the vector function:

vector()
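The vectors described above might be created and inspected like this. It is a small sketch, and the variable names e and n are only for illustration.

```r
x <- c(1, 2, 3, 4, 5)      # numeric vector
y <- c("a", "b", "c")      # character vector
z <- c(TRUE, FALSE)        # logical vector
w <- c(1L, 2L, 3L)         # integer vector

# Called with no arguments, vector() gives an empty logical vector.
e <- vector()
length(e)    # 0
class(e)     # "logical"

# A mode and a length can also be supplied; numeric vectors start at 0.
n <- vector("numeric", length = 3)
n            # 0 0 0
```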

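The list is a special kind of vector because it can contain objects of different classes. For example, x <- list("a", 1, 2, 1i, 2L, TRUE) is a list. Note that combining mixed values with c() does not produce a list: R coerces them to a single class, so c("a", 1, 2L, TRUE) is a character vector.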
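To see the difference between a list and an ordinary vector, here is a short sketch that builds both from mixed values (the names mixed and items are illustrative):

```r
# c() keeps a single class, so mixed values are coerced to character.
mixed <- c("a", 1, 2L, TRUE)
class(mixed)            # "character"

# list() keeps every element in its own class.
items <- list("a", 1, 1i, 2L, TRUE)
class(items)            # "list"
sapply(items, class)    # "character" "numeric" "complex" "integer" "logical"
```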

Numbers (numerics) are the most important atomic class in R. To express a number as an integer, simply add the suffix L. For example, 1 is a numeric value, but 1L is an integer in the R language.
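A few lines at the console show how the L suffix changes the way R stores a number. This is just a sketch of some common checks:

```r
is.integer(1)       # FALSE: a plain 1 is stored as numeric
is.integer(1L)      # TRUE
1L == 1             # TRUE: the values are equal
identical(1L, 1)    # FALSE: the storage types differ

as.integer(3.9)     # 3: conversion drops the fractional part
class(5L / 2L)      # "numeric": dividing integers gives a numeric result
```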

Inf, or infinity, is also treated as a number: 1/0 = Inf and 1/Inf = 0. NaN ("not a number"), on the other hand, represents an undefined value, for example 0/0 = NaN. R also counts NaN as missing (is.na(NaN) is TRUE), although the dedicated marker for a missing value in a vector is NA.
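These special values can be tried directly at the R console. The sketch below uses only base R functions:

```r
1 / 0              # Inf
1 / Inf            # 0
0 / 0              # NaN

is.finite(1 / 0)   # FALSE
is.nan(0 / 0)      # TRUE

# NaN counts as missing, but NA is not a NaN.
is.na(NaN)         # TRUE
is.nan(NA)         # FALSE
```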

So, this ends our introduction to R. Our next topic will be the basics of R.
