R has several data structures, including:

- Vectors
- Matrices
- Dataframes
- Lists

A vector is the simplest data structure: a one-dimensional sequence of values, like a single column of data.
The simplest way to define a numeric vector is with the `c()` function:

X <- c(1, 2, 3, 5, 6) # numeric vector
X

## [1] 1 2 3 5 6

R has a *colon notation* to create series of numbers:

X <- c(1:6) # numeric vector
X

## [1] 1 2 3 4 5 6

`seq()`

A more explicit alternative is the function `seq(from = X, to = Y, by = Z)`:

X <- seq(from = 1, to = 3, by = 0.25) # numeric vector
X

## [1] 1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00

`seq()` can also be used with the argument `length.out` (abbreviated here to `length`), which sets the number of values instead of the step:

X <- seq(from = 1, to = 3, length = 9) # numeric vector
X

## [1] 1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00
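The two `seq()` calls above give the same values because, when the number of values is fixed, the step follows from the end points: (to - from) / (length.out - 1) = (3 - 1) / (9 - 1) = 0.25. A small sketch to check this; the variable names `by_step`, `by_len` and `step` are illustrative only:

```r
# Same sequence defined by step size and by number of values
by_step <- seq(from = 1, to = 3, by = 0.25)
by_len  <- seq(from = 1, to = 3, length.out = 9)

# Step implied by the end points and the requested length
step <- (3 - 1) / (9 - 1)
step

# all.equal() compares numeric vectors with a small tolerance
all.equal(by_step, by_len)
```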

`rep()`

You can create vectors containing repetitions with the function `rep()`:

X <- rep(1:2, times = 3) # numeric vector
X

## [1] 1 2 1 2 1 2

The function `rep()` also accepts the argument `each`, which repeats every element before moving on to the next one:

X <- rep(1:3, each = 3) # numeric vector
X

## [1] 1 1 1 2 2 2 3 3 3

Both arguments, `times` and `each`, can be used together:

X <- rep(1:3, each = 2, times = 2) # numeric vector
X

## [1] 1 1 2 2 3 3 1 1 2 2 3 3
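When both arguments are given, `each` is applied first and `times` then repeats the whole result, so the length is 3 values x 2 x 2 = 12. A minimal sketch of that reading, with illustrative variable names:

```r
# each and times in a single call
r <- rep(1:3, each = 2, times = 2)
length(r)

# Equivalent two-step version: repeat each element, then the whole vector
r_steps <- rep(rep(1:3, each = 2), times = 2)
identical(r, r_steps)
```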

A vector may contain text:

X <- c("A", "B", "C") # character vector
X

## [1] "A" "B" "C"

A vector may contain logical values:

X <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE) # logical vector
X

## [1] TRUE TRUE TRUE FALSE TRUE FALSE

A vector contains data of a single type. If values of different types are mixed, R converts them to a common type. In the following example, R interprets all the data as characters:

X <- c(1, "A", TRUE) # R will interpret all these values as text
X

## [1] "1" "A" "TRUE"
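The conversion follows a fixed order: logical values can become numbers, and anything can become text. A short sketch that checks the resulting type with `class()`; the names `v_num` and `v_chr` are illustrative only:

```r
# Logical and numeric values mixed: TRUE becomes 1, FALSE becomes 0
v_num <- c(1, TRUE, FALSE)
class(v_num)

# A single character value turns every element into text
v_chr <- c(1, "A", TRUE)
class(v_chr)
```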

You can access the *n*-th element of a vector by its index, `X[n]` (indices start at 1):

X <- c(1:6)
X[1]

## [1] 1

You can select multiple elements of a vector by passing a vector of indices, like `X[c(x, y, z)]`:

X <- c(1:6)
X[c(2, 4)]

## [1] 2 4

Elements can also be excluded with a negative index, like `X[-n]`:

X <- c(1:6)
X[-c(1:3)]

## [1] 4 5 6

Indices can also be used to replace the value of an element:

X[2] <- 20
X

## [1] 1 20 3 4 5 6

Logical indices can be used to select the elements of a vector that satisfy a condition:

X <- c(1, 3, 7, 4, 9, 2) # define the vector X
X[X > 4] # select only those elements with values higher than...

## [1] 7 9
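It can help to look at the intermediate steps: the comparison produces a logical vector, and that vector does the selecting. The sketch below reuses the same `X`; the names `mask`, `pos` and `between` are illustrative only:

```r
X <- c(1, 3, 7, 4, 9, 2)

# The comparison returns one TRUE/FALSE per element of X
mask <- X > 4
mask

# which() gives the positions of the TRUE values
pos <- which(mask)
pos

# Conditions can be combined with & (and) and | (or)
between <- X[X > 2 & X < 9]
between
```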

A matrix is a collection of elements of the same type, arranged in a two-dimensional rectangular layout. It can be seen as a set of vectors of equal length placed side by side.

`cbind()`

A simple way to define a matrix is with the `cbind()` function, which binds a series of vectors column-wise:

x <- c(1, 2, 3, 4, 5) # numeric vector
y <- c(1, 2, 3, 4, 5) # numeric vector
z <- c(1, 2, 3, 4, 5) # numeric vector
M <- cbind(x, y, z)
M

##      x y z
## [1,] 1 1 1
## [2,] 2 2 2
## [3,] 3 3 3
## [4,] 4 4 4
## [5,] 5 5 5

Similar to `cbind()`, there is also the `rbind()` function, which binds vectors row-wise, one row per vector:

M <- rbind(x, y, z)
M

##   [,1] [,2] [,3] [,4] [,5]
## x    1    2    3    4    5
## y    1    2    3    4    5
## z    1    2    3    4    5
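The two results hold the same data in transposed layouts. A quick sketch that compares them with the transpose function `t()`; the names `by_col` and `by_row` are illustrative only:

```r
x <- c(1, 2, 3, 4, 5)
y <- c(1, 2, 3, 4, 5)
z <- c(1, 2, 3, 4, 5)

by_col <- cbind(x, y, z) # one column per vector
by_row <- rbind(x, y, z) # one row per vector

dim(by_col) # rows, columns
dim(by_row)

# Transposing the column-wise matrix gives the row-wise one
all.equal(t(by_col), by_row)
```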

`matrix()`

Matrices can also be defined with the `matrix()` function:

A <- matrix(c(1:6),       # the data elements to be filled in
            nrow = 2,     # the number of rows,
            ncol = 3,     # the number of columns, and
            byrow = TRUE) # filling the matrix row-wise (one row at a time).
A

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
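Without `byrow = TRUE`, `matrix()` fills the values column by column, which is its default. A minimal sketch of the difference, with illustrative variable names:

```r
# Same data, filled row-wise and column-wise
by_rows <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
by_cols <- matrix(1:6, nrow = 2, ncol = 3) # byrow defaults to FALSE

by_rows[1, ] # first row when filled row-wise
by_cols[1, ] # first row when filled column-wise
```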

An element of an `n x m` matrix can be selected by its row and column indices, `M[i, j]`, similarly to vectors. Consider the following matrix (the values are random, so they will differ from run to run):

M <- matrix(round(runif(12, 5, 10), 0), # generate random integers
            nrow = 3,                   # and fill into a matrix of 3 rows
            ncol = 4)                   # and 4 columns.
M

##      [,1] [,2] [,3] [,4]
## [1,]    9    9    6   10
## [2,]    6    7    9    6
## [3,]    7    9    6    8

To get the values of the first column:

M[, 1]

## [1] 9 6 7

To get the values of the second row:

M[2, ]

## [1] 6 7 9 6

To get the values of the first and third column:

M[, c(1, 3)]

##      [,1] [,2]
## [1,]    9    6
## [2,]    6    9
## [3,]    7    6

To get the value of a specific element:

M[2, 3]

## [1] 9

To get the elements that satisfy a condition:

M[M > 7] # this returns a vector

## [1] 9 9 9 9 10 8

To leave out a row or column, you can use a negative index. This returns a new matrix; `M` itself is unchanged unless you assign the result back:

M[, -2]

##      [,1] [,2] [,3]
## [1,]    9    6   10
## [2,]    6    9    6
## [3,]    7    6    8
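Selecting a single row or column returns a plain vector, as seen above. If a one-column matrix is needed instead, the `drop = FALSE` argument of `[` keeps the dimensions. The sketch below rebuilds the matrix shown above with fixed values (rather than random ones) so the results are reproducible; `col_vec`, `col_mat` and `rows_sel` are illustrative names:

```r
# The example matrix, entered column by column
M <- matrix(c(9, 6, 7, 9, 7, 9, 6, 9, 6, 10, 6, 8), nrow = 3, ncol = 4)

col_vec <- M[, 1]               # a plain vector
col_mat <- M[, 1, drop = FALSE] # still a 3 x 1 matrix
is.matrix(col_vec)
dim(col_mat)

# A logical condition on one column can select whole rows
rows_sel <- M[M[, 1] > 6, ]
rows_sel
```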

`dim()`

Given a matrix:

M <- matrix(rep(c(1:4), times = 3), 3, 4)
M

##      [,1] [,2] [,3] [,4]
## [1,]    1    4    3    2
## [2,]    2    1    4    3
## [3,]    3    2    1    4

The dimensions of a matrix are given by the function `dim()`:

dim(M) # return the number of rows and columns

## [1] 3 4

The function `dim()` can also be used to change the dimensions:

dim(M) <- c(4, 3)
M

##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    2    2    2
## [3,]    3    3    3
## [4,]    4    4    4

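The same syntax can be used to reshape a matrix into a single column. The result is a 12 x 1 matrix, which behaves like a column vector but still has two dimensions:

dim(M) <- c(12, 1)
M

##       [,1]
##  [1,]    1
##  [2,]    2
##  [3,]    3
##  [4,]    4
##  [5,]    1
##  [6,]    2
##  [7,]    3
##  [8,]    4
##  [9,]    1
## [10,]    2
## [11,]    3
## [12,]    4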
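If a plain vector without dimensions is wanted, one option is `as.vector()`, which drops the `dim` attribute. A minimal sketch, where `v` is an illustrative name:

```r
M <- matrix(rep(c(1:4), times = 3), 3, 4)
dim(M) <- c(12, 1)

is.matrix(M) # the reshaped object is still a matrix

# as.vector() removes the dimensions and keeps the values in column order
v <- as.vector(M)
dim(v)
length(v)
```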

`dimnames()`

Sometimes it is simpler to refer to names instead of numerical indices. For this, you can define the names of rows and columns with the function `dimnames()`:

M <- matrix(rep(c(1:3), times = 4), 3, 4)
dimnames(M) <- list(
  c("row1", "row2", "row3"),         # row names
  c("col1", "col2", "col3", "col4")) # column names
M

##      col1 col2 col3 col4
## row1    1    1    1    1
## row2    2    2    2    2
## row3    3    3    3    3
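Once names are defined, they can be used inside the square brackets in place of numbers. A short sketch using the same matrix; `one_value` and `one_col` are illustrative names:

```r
M <- matrix(rep(c(1:3), times = 4), 3, 4)
dimnames(M) <- list(
  c("row1", "row2", "row3"),
  c("col1", "col2", "col3", "col4"))

# Select a single element by row and column name
one_value <- M["row2", "col3"]
one_value

# Select a whole column by name; the result keeps the row names
one_col <- M[, "col1"]
one_col

# rownames() and colnames() return the two parts of dimnames()
rownames(M)
colnames(M)
```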

A `list` is the most flexible container of objects in R. Its elements can be unrelated, of any type and size.

mylist <- list(name = c("A", "B", "C"),
               numb = c(1, 2, 3, 4),
               matr = cbind(c(2, 1), c(1, 2)),
               vect = c(5, 3, 4, 5, 6, 2))
mylist

## $name
## [1] "A" "B" "C"
##
## $numb
## [1] 1 2 3 4
##
## $matr
##      [,1] [,2]
## [1,]    2    1
## [2,]    1    2
##
## $vect
## [1] 5 3 4 5 6 2

An object contained in the list can be accessed by name with the dollar (`$`) notation:

mylist$name

## [1] "A" "B" "C"
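Besides `$`, list elements can be reached with double or single square brackets, and the two behave differently: `[[ ]]` returns the element itself, while `[ ]` returns a shorter list. A minimal sketch with a reduced version of `mylist`; the names `elem`, `sub` and `third` are illustrative only:

```r
mylist <- list(name = c("A", "B", "C"),
               numb = c(1, 2, 3, 4))

elem <- mylist[["numb"]] # the numeric vector itself
sub  <- mylist[2]        # a list of length one containing that vector
class(elem)
class(sub)

# Indexing can be chained to reach a value inside an element
third <- mylist$numb[3]
third
```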

A dataframe sits between a list and a matrix. It is like a list because its columns can contain different types of objects (e.g. text, numbers, factors). It is like a matrix because the data are arranged as a table, with all columns of the same length. To create a dataframe from scratch:

origin <- c("ITA", "AUT", "FRA")
protein <- c(2, 3, 2)
sugar <- c(8, 12, 10)
mydata <- data.frame(origin, protein, sugar)
mydata

##   origin protein sugar
## 1    ITA       2     8
## 2    AUT       3    12
## 3    FRA       2    10
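Because a dataframe combines list and matrix behaviour, both kinds of indexing shown earlier work on it. A short sketch using the same data; `sweet` and `p2` are illustrative names:

```r
mydata <- data.frame(origin  = c("ITA", "AUT", "FRA"),
                     protein = c(2, 3, 2),
                     sugar   = c(8, 12, 10))

# List-style access: one column as a vector
mydata$sugar

# Matrix-style access: rows that satisfy a condition, all columns
sweet <- mydata[mydata$sugar > 8, ]
sweet

# A single cell by row number and column name
p2 <- mydata[2, "protein"]
p2
```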

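To edit the data manually, use the command `edit(mydata)`. Note that `edit()` returns the modified copy, so the result has to be assigned back, e.g. `mydata <- edit(mydata)`.

In the dataframe `mydata`, the variable `origin` should be treated as a categorical factor: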

mydata[,1] <- factor(mydata[,1])

This statement stores the vector internally as the integer codes (3, 1, 2) and associates them with the levels 1 = AUT, 2 = FRA and 3 = ITA, which are sorted alphabetically by default.

str(mydata) # structure of an object

## 'data.frame':    3 obs. of  3 variables:
##  $ origin : Factor w/ 3 levels "AUT","FRA","ITA": 3 1 2
##  $ protein: num  2 3 2
##  $ sugar  : num  8 12 10
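The levels and the integer codes reported by `str()` can also be inspected directly. A minimal sketch on a stand-alone factor built from the same three values; `origin_f` is an illustrative name:

```r
origin_f <- factor(c("ITA", "AUT", "FRA"))

# The distinct categories, sorted alphabetically by default
levels(origin_f)

# The integer code stored for each observation
as.integer(origin_f)

# The original labels can be recovered as text
as.character(origin_f)
```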

class(mydata) # class or type of an object

## [1] "data.frame"

names(mydata) # names

## [1] "origin" "protein" "sugar"