# Basic Tutorial in R

R is a relatively approachable language to learn and use. This tutorial walks through its basics step by step.

**What is R?**

R is a programming language developed by Ross Ihaka and Robert Gentleman, who decided to call their creation, simply, R. It is commonly used in data analytics and scientific research.

It is one of the most popular languages among data analysts, who use it to collect, clean and transform data in order to support decisions and make predictions.

**Basics of R**

We will learn the following topics in this tutorial:

- help() function
- print() function
- Comments
- Variables
  - Vectors
  - Lists
  - Matrices
  - Arrays
  - Factors
  - Data Frames
- Deleting Variables (using rm() function)
- Operations
  - Arithmetic Operations
  - Relational Operations
  - Logical Operations
  - Assignment Operations
  - Miscellaneous Operations
- Decision Making
- Loops
- Packages
- Importing Dataset
- Graphs
  - Pie chart
  - Bar plot
  - Histogram
  - Scatter plot
  - Box plot
- Functions
  - Predefined Functions
    - Numeric Functions
    - Statistical Functions

The RStudio workspace has a console window where you can run commands directly, a plots window for viewing different types of graphs, and a help pane for searching R topics, among other panes.

Now let us learn about each of these topics.

**help() function-** This is a useful function and should be understood first. It shows the documentation for any R topic you want to look up. Note that R is case-sensitive, so the function name is written in lowercase. Its command is:

**help(topic you want to search)**
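As a small sketch of how this looks in practice, the snippet below asks for the help page of `mean` and uses the `?` shorthand. The help calls are wrapped in `interactive()` because they only make sense in a live session; `apropos()` is added here as a related lookup and is not part of the original text.

```r
# Open help pages (only meaningful in an interactive session).
if (interactive()) {
  help("mean")   # full form
  ?median        # shorthand for help("median")
}

# apropos() lists objects whose names match a pattern,
# which helps when you only remember part of a name.
matches <- apropos("^read\\.csv")
print(matches)
```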

**print() function-** You can use the print() function to display a value on the console.
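A minimal illustration, with made-up values:

```r
# print() displays a value on the console.
print("Hello, R!")

total <- 3 + 4
print(total)

# At the console, typing a name on its own also prints its value.
total
```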

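**Comments-** You can put comments in your code using the # symbol. R has no dedicated multi-line comment syntax, so a longer comment is written as several lines that each start with #. Wrapping text in single or double quotes is sometimes used as a workaround, but that creates a string that R still evaluates rather than a true comment.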
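For example (variable names are only illustrative):

```r
# This is a single-line comment.
x <- 10  # A comment can also follow code on the same line.

# A longer note is written as several lines,
# each one starting with the # symbol.
y <- x * 2
print(y)
```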

**Variables-** In programming languages we need variables to store information. A variable is a name that refers to a value held in memory.

The values stored in variables can be organized in different data structures:

- Vectors
- Lists
- Matrices
- Arrays
- Factors
- Data Frames

R also has different data types to store different kinds of values:

- Logical
- Numeric
- Character (strings)
- Complex
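The sketch below creates one value of each type listed above and checks it with `class()`. The variable names and values are illustrative assumptions.

```r
is_ready <- TRUE      # logical
height   <- 1.75      # numeric
name     <- "Ada"     # character (string)
z        <- 3 + 2i    # complex

# class() reports the type of each value.
types <- c(class(is_ready), class(height), class(name), class(z))
print(types)
```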

**Vectors-** We use the c() function to combine values into a vector.

We can also access a particular element of a vector with square brackets. Indexing in R starts at 1.
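A short example with made-up data:

```r
# Build vectors with c().
fruits <- c("apple", "banana", "cherry")
scores <- c(90, 85, 72)

# Access elements with square brackets (the first element is index 1).
second_fruit <- fruits[2]
first_two    <- scores[1:2]

print(second_fruit)
print(first_two)
```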

**Lists-** A list can hold elements of different types, such as vectors, strings, numbers and even other lists.
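As a sketch, here is a list mixing a string, a numeric vector and a logical value (the field names are hypothetical):

```r
person <- list(name = "Ada", scores = c(90, 85), passed = TRUE)

# Access elements by name with $ or by name/position with [[ ]].
print(person$name)
print(person[["scores"]])
print(person[[3]])

n_elements <- length(person)
```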

**Matrices-** A matrix is a 2-D rectangular data set. It can be created by passing a vector to the matrix() function.

You can set the number of rows and columns using the nrow and ncol arguments respectively.
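A small sketch: by default `matrix()` fills column by column, and the `byrow` argument (not mentioned above) switches to row-wise filling.

```r
# 2 rows and 3 columns, filled column by column (the default).
m <- matrix(1:6, nrow = 2, ncol = 3)
print(m)

# The same values filled row by row.
m_byrow <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(m_byrow)

# Access an element as [row, column].
print(m[1, 2])
```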

**Arrays-** Matrices, which we studied above, are confined to two dimensions. Arrays can have any number of dimensions.

If we want to create an array holding 2 matrices of 3 rows and 2 columns each, we pass the dimensions 3, 2 and 2 to the array() function.
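One way to write this, using the numbers 1 to 12 as placeholder data:

```r
# dim = c(rows, columns, number of matrices)
arr <- array(1:12, dim = c(3, 2, 2))
print(arr)

# The first 3 x 2 matrix in the array.
first_slice <- arr[, , 1]
print(first_slice)

# A single element: row 3, column 2 of the second matrix.
print(arr[3, 2, 2])
```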

**Factors-** Factors are data objects that store categorical data. The distinct categories are called levels.
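A minimal sketch with made-up categories:

```r
sizes <- c("small", "large", "medium", "small", "large")
size_factor <- factor(sizes)

# The distinct categories, sorted alphabetically by default.
size_levels <- levels(size_factor)
print(size_levels)

# How many times each level occurs.
print(table(size_factor))
```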

**Data Frames-** A data frame is a 2-dimensional, table-like structure in which each column holds the values of one variable and each row holds one observation. Unlike a matrix, its columns can have different types.
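As an illustration, with hypothetical column names and values:

```r
students <- data.frame(
  name   = c("Ada", "Ben", "Cy"),
  age    = c(21, 23, 22),
  passed = c(TRUE, FALSE, TRUE)
)
print(students)

# A column is accessed with $, a single cell with [row, column].
print(students$age)
print(students[2, "name"])

# Dimensions of the data frame.
print(nrow(students))
print(ncol(students))
```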

**Deleting Variables-** Variables can be removed or deleted using the **remove()** function or the **rm()** function.

**Operations-** When we have numbers or vectors of numbers, we sometimes need to perform basic mathematical operations on them. R provides different types of operators:
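A quick sketch, using `exists()` (an addition for illustration) to confirm the variable is gone:

```r
temp_value <- 42
before <- exists("temp_value")   # TRUE: the variable is defined

rm(temp_value)                   # delete it
after <- exists("temp_value")    # FALSE: it no longer exists

print(before)
print(after)
```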

- Arithmetic Operators
- Relational Operators
- Logical Operators
- Assignment Operators
- Miscellaneous Operators

**Arithmetic Operators**

Let us look at some of the arithmetic operators:

| Operator | Description |
| --- | --- |
| `+` | Addition |
| `-` | Subtraction |
| `*` | Multiplication |
| `/` | Division |
| `%%` | Remainder |
| `%/%` | Quotient (integer division) |
| `^` | Power |
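The operators can be tried out as follows (the values 17 and 5 are arbitrary):

```r
a <- 17
b <- 5

results <- c(
  sum        = a + b,    # 22
  difference = a - b,    # 12
  product    = a * b,    # 85
  division   = a / b,    # 3.4
  remainder  = a %% b,   # 2
  quotient   = a %/% b,  # 3
  power      = 2 ^ 3     # 8
)
print(results)

# Arithmetic operators work element by element on vectors.
doubled <- c(1, 2, 3) * 2
print(doubled)
```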

**Relational Operators**

Relational operators help us to compare vectors. They return TRUE or FALSE as output.

| Operator | Description |
| --- | --- |
| `>` | Checks if each element of the first vector is greater than the corresponding element of the second vector |
| `<` | Checks if each element of the first vector is less than the corresponding element of the second vector |
| `==` | Checks if each element of the first vector is equal to the corresponding element of the second vector |
| `<=` | Checks if each element of the first vector is less than or equal to the corresponding element of the second vector |
| `>=` | Checks if each element of the first vector is greater than or equal to the corresponding element of the second vector |
| `!=` | Checks if each element of the first vector is not equal to the corresponding element of the second vector |
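A short sketch comparing two made-up vectors element by element:

```r
v1 <- c(2, 5, 8)
v2 <- c(3, 5, 6)

greater   <- v1 > v2    # FALSE FALSE  TRUE
equal     <- v1 == v2   # FALSE  TRUE FALSE
not_equal <- v1 != v2   #  TRUE FALSE  TRUE
less_eq   <- v1 <= v2   #  TRUE  TRUE FALSE

print(greater)
print(equal)
print(not_equal)
print(less_eq)
```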

**Logical Operators**

| Operator | Description |
| --- | --- |
| `&` | Element-wise logical AND operator. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if both elements are TRUE. |
| `\|` | Element-wise logical OR operator. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if either of them is TRUE. |
| `!` | Logical NOT operator. It takes each element of the vector and gives the opposite logical value. |
| `&&` | Logical AND operator for single values. It gives TRUE if both operands are TRUE and is typically used in if conditions. Older versions of R used only the first element of longer vectors; since R 4.3.0, operands longer than one element are an error. |
| `\|\|` | Logical OR operator for single values. It gives TRUE if either operand is TRUE, with the same restriction to single values as `&&`. |
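To illustrate the difference between the element-wise and single-value forms (values are arbitrary):

```r
p <- c(TRUE, TRUE, FALSE)
q <- c(TRUE, FALSE, FALSE)

and_result <- p & q   #  TRUE FALSE FALSE
or_result  <- p | q   #  TRUE  TRUE FALSE
not_result <- !p      # FALSE FALSE  TRUE

# && and || work on single values, as in an if condition.
x <- 7
in_range <- x > 0 && x < 10   # TRUE
either   <- x < 0 || x > 5    # TRUE

print(and_result)
print(or_result)
print(not_result)
print(in_range)
print(either)
```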

**Assignment Operators**

These operators are used to assign values to variables.

| Operator | Description |
| --- | --- |
| `<-`, `=`, `<<-` | Left assignment |
| `->`, `->>` | Right assignment |

**Left Assignment**
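A minimal sketch of the three left-assignment forms. The `increment()` helper is hypothetical and is only there to show that `<<-` assigns to a variable outside the function.

```r
x <- 5     # the usual assignment operator
y = 10     # also works at the top level

counter <- 0
increment <- function() {
  # <<- modifies the existing 'counter' outside this function.
  counter <<- counter + 1
}
increment()

print(x)
print(y)
print(counter)
```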

**Right Assignment**
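The right-assignment forms put the value on the left and the name on the right. In this sketch the `mark_done()` helper is a hypothetical name used to show `->>`.

```r
15 -> a                 # same as a <- 15
c(1, 2, 3) -> values    # same as values <- c(1, 2, 3)

done <- FALSE
mark_done <- function() {
  # ->> modifies the existing 'done' outside this function.
  TRUE ->> done
}
mark_done()

print(a)
print(values)
print(done)
```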

**Miscellaneous Operators**

| Operator | Description |
| --- | --- |
| `:` | Colon operator: creates a series of numbers in sequence |
| `%in%` | Checks if an element belongs to a vector |
| `%*%` | Matrix multiplication, for example multiplying a matrix by its transpose |
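A brief sketch of the three operators, using a small made-up matrix:

```r
# Colon operator: a sequence from 1 to 5.
one_to_five <- 1:5
print(one_to_five)

# %in% tests membership.
has_three <- 3 %in% c(1, 2, 3)
has_nine  <- 9 %in% c(1, 2, 3)
print(has_three)
print(has_nine)

# %*% multiplies matrices; here a matrix by its transpose.
m <- matrix(1:4, nrow = 2)
product <- m %*% t(m)
print(product)
```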

**Decision Making**

Decision making means that if a certain condition is true one task is carried out, and if it is not true some other task is carried out.

There are various decision making statements possible in R:

**If statement:-**
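A minimal sketch of an if statement, with an arbitrary threshold:

```r
temperature <- 30
status <- "mild"

# The body runs only when the condition is TRUE.
if (temperature > 25) {
  status <- "warm"
}
print(status)
```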

**If else statement:-**
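For example, checking whether a number is even or odd. Note that `else` is placed on the same line as the closing brace of the if block.

```r
number <- 7

if (number %% 2 == 0) {
  parity <- "even"
} else {
  parity <- "odd"
}
print(parity)
```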

**Switch:-**

The general form is switch(expression, list), where the expression selects which of the listed alternatives is returned.
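A sketch of both ways to use `switch()`: with a number it picks an alternative by position, and with a string it picks by name. The `day_type()` helper and its labels are hypothetical.

```r
# Numeric expression: selects the alternative at that position.
second <- switch(2, "red", "green", "blue")
print(second)

# Character expression: selects the alternative with a matching name.
# An empty alternative ("Sat" = ,) falls through to the next one,
# and the final unnamed value is the default.
day_type <- function(day) {
  switch(day,
    "Sat" = ,
    "Sun" = "weekend",
    "weekday"
  )
}
print(day_type("Sun"))
print(day_type("Mon"))
```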

**Loops-** Sometimes we need to execute a statement several times. Writing the same statement again and again is impractical, so in such cases we use loops.

We have different types of loops:

**For loop:-**
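A for loop runs its body once for each element of a sequence. As a small sketch, this adds up the numbers 1 to 5:

```r
total <- 0
for (i in 1:5) {
  total <- total + i
}
print(total)
```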

**While loop:-**

while (cond) expr
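A while loop repeats its body as long as the condition stays TRUE. A minimal sketch:

```r
count <- 1
while (count <= 3) {
  print(count)
  count <- count + 1   # without this the loop would never end
}
```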

**Packages**

Packages are collections of R functions, data and compiled code. Installed packages are stored in a directory called the **library**. A set of standard packages is installed by default along with R.

You can see all the packages already installed with the command **library()**.

You can also install a new package as needed with the following command:

**install.packages("package name")**
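The sketch below lists installed packages and loads one that ships with R. The `install.packages()` call is left commented out because it needs a network connection, and `ggplot2` is only an example of a package name.

```r
# Names of all packages installed in the library.
pkgs <- rownames(installed.packages())
print(head(pkgs))

# Load an installed package so its functions can be used.
library(stats)

# Install a new package from CRAN (uncomment to run).
# install.packages("ggplot2")
```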

**Importing Dataset**

We have files stored on our systems and sometimes we need to use them. They can be in any format, such as CSV, XML or Excel. To read a file by name alone it must be in the current working directory (the folder you are currently working in); otherwise you need to give its full path.

With the command **getwd()** you can get the directory in which RStudio is currently working, and with **setwd()** you can set your own directory.

**CSV file-** CSV stands for comma-separated values. In a CSV file each line holds one record, with its values separated by commas.

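**Reading a CSV File-** To read a CSV file in R we use the read.csv() function, which returns a data frame.

Note: In case you want to import an Excel file, use the command **read.xlsx(path of file)**. This function is not part of base R; it is provided by add-on packages such as openxlsx.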
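To keep the example self-contained, this sketch first writes a tiny CSV file to a temporary location and then reads it back. The file contents are made up.

```r
# Create a small CSV file to read.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Ada,21", "Ben,23"), csv_path)

# Read it into a data frame; the first line becomes the column names.
people <- read.csv(csv_path)
print(people)
str(people)
```

With a real file you would pass its name or path instead, for example a file in the directory reported by `getwd()`.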

**Plots-** Graphs often make data easier to analyze. R provides different types of graphs:

**Pie chart**

We will create a pie chart using the **pie()** function. It takes a vector of non-negative values as input.
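A minimal sketch with made-up budget shares:

```r
shares <- c(40, 35, 25)
categories <- c("Rent", "Food", "Other")

# One slice per value, labelled with the category names.
pie(shares, labels = categories, main = "Monthly budget")
```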

**Bar plot**

Bar plots are among the most commonly used graphs. They show the relationship between a numerical value and a categorical value.
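As an illustration, using the `barplot()` function with hypothetical counts:

```r
counts <- c(apples = 12, bananas = 7, cherries = 15)

# One bar per category; the names of the vector label the bars.
bar_positions <- barplot(counts,
                         main = "Fruit sold",
                         xlab = "Fruit",
                         ylab = "Count")
```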

**Histograms**

A histogram shows the distribution of a numeric variable. The area of each bar is proportional to the frequency of values falling in a class interval, and the width of the bar equals that interval.
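A small sketch using `hist()` with made-up ages and class intervals of width 5:

```r
ages <- c(21, 22, 22, 23, 25, 27, 27, 28, 30, 34)

# breaks defines the class intervals: 20-25, 25-30 and 30-35.
h <- hist(ages,
          breaks = seq(20, 35, by = 5),
          main = "Distribution of ages",
          xlab = "Age")

# The frequency in each interval.
print(h$counts)
```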

**Scatter plot**

It is a type of graph where 2 variables are plotted along 2 axes, and the resulting pattern indicates the correlation between the variables.
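For example, plotting hypothetical study hours against marks with `plot()`, and using `cor()` to put a number on the relationship:

```r
hours <- c(1, 2, 3, 4, 5)
marks <- c(52, 58, 65, 70, 78)

plot(hours, marks,
     main = "Marks vs. study hours",
     xlab = "Hours studied",
     ylab = "Marks")

# Correlation coefficient between the two variables.
r <- cor(hours, marks)
print(r)
```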

**Box plot**

A box plot represents the data with a rectangle (the box) that marks the quartiles, with a line inside the box at the median.

The plotting functions share some common arguments, written in lowercase:

**xlab-** It sets the label on the x-axis.

**ylab-** It sets the label on the y-axis.

**main-** It gives the heading to our plot.
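A minimal sketch using `boxplot()` on made-up marks. The function also returns the statistics it draws, which is used here to read off the median.

```r
marks <- c(52, 58, 65, 70, 78, 81, 95)

b <- boxplot(marks,
             main = "Distribution of marks",
             ylab = "Marks")

# b$stats holds the whisker ends, box edges and median;
# the third value is the median.
box_median <- b$stats[3, 1]
print(box_median)
```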

**Functions**

Functions are groups of statements that perform certain tasks as defined by the programmer. R has a set of predefined functions to make our work easier, such as log(x), exp(x), mean(x), median(x) and many others. You can also define your own function in the given format:

**Function_name <- function (argument list……)**

**{**

**Function body**

**}**

Let us understand with an example.

To create a script in RStudio, follow the steps: **File** -> **New File** -> **R Script**, or use **Ctrl+Shift+N**.

This opens a new, empty script.

Suppose we are making an addition function in an R script:
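One possible version is sketched below; the name `add_numbers` and its arguments are illustrative.

```r
# Define a function that adds two numbers.
add_numbers <- function(a, b) {
  a + b   # the last evaluated expression is the return value
}

# Call the function.
result <- add_numbers(3, 4)
print(result)
```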

**Default Arguments**

Default arguments are values given to arguments when we define the function. They are used whenever the caller does not supply that argument. Let us look at an example:
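In this sketch, the hypothetical `power()` function squares its input unless another exponent is given:

```r
# 'exponent' defaults to 2 when the caller leaves it out.
power <- function(base, exponent = 2) {
  base ^ exponent
}

squared <- power(5)                  # uses the default: 25
cubed   <- power(5, 3)               # overrides it: 125
named   <- power(2, exponent = 10)   # overrides it by name: 1024

print(squared)
print(cubed)
print(named)
```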

**Predefined Functions**

R has a set of predefined functions which can be categorized as follows:

**Numeric Functions:-**

Some of the numeric functions which R provides are: abs(x), sqrt(x), ceiling(x), floor(x), trunc(x), round(x), signif(x), cos(x), sin(x), tan(x), exp(x).
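A few of these in action, with arbitrary inputs. `round()` and `signif()` are shown with their optional `digits` argument.

```r
numeric_results <- c(
  abs_val   = abs(-4.5),           # 4.5
  sqrt_val  = sqrt(16),            # 4
  ceil_val  = ceiling(4.2),        # 5
  floor_val = floor(4.8),          # 4
  trunc_val = trunc(-4.8),         # -4 (drops the fractional part)
  round_val = round(3.14159, 2),   # 3.14
  sig_val   = signif(123456, 2),   # 120000
  exp_val   = exp(0),              # 1
  cos_val   = cos(0)               # 1
)
print(numeric_results)
```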

**Statistical Functions:-** We have some statistical functions such as mean(x), median(x), sd(x), min(x), max(x) and many others.

The **mean** function gives the mean value of the argument, and **median** gives the median.

**sd** stands for standard deviation. It calculates the standard deviation of the argument.

**min** and **max** give the minimum and maximum value of the argument.
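A short sketch on a made-up vector. Note that `sd()` computes the sample standard deviation, dividing by n - 1.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

x_mean   <- mean(x)     # 5
x_median <- median(x)   # 4.5
x_sd     <- sd(x)       # sample standard deviation
x_min    <- min(x)      # 2
x_max    <- max(x)      # 9

print(c(mean = x_mean, median = x_median, sd = x_sd,
        min = x_min, max = x_max))
```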