# Data Analysis

Data Analysis with R

# 1 Introduction

This course is intended for students and experimenters who would like, for the first time, to get in touch with advanced statistical software. Accordingly, experienced R users or programmers will find this course of limited interest. Instead, the course aims to help students of scientific disciplines improve their statistical skills, the quality of their graphical output, and the readability of their reports.

The course is structured in the following Units:

2. Data-frames
3. Data management
4. User defined functions
5. Looping
6. Data consistency
7. Summarize data
8. Plot data
9. Simple comparisons
10. Analysis of variance
11. Contrast analysis
12. Linear regression models
13. Non linear regression models
14. Model validation
15. Principal component analysis
16. Multivariate regression models
17. Linear discriminant analysis
18. Non parametric modelling
19. Quality control

This course mostly uses R base packages (such as base, stats and graphics). The use of external packages will be limited as much as possible.
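As a quick check, the following sketch lists the packages attached to a fresh R session and confirms that the three packages named above are available without any installation. The variable names `attached`, `core` and `available` are illustrative only.

```r
# Packages attached to the current session, in search order.
attached <- search()
print(attached)

# Check that the base packages used in this course are already attached.
core <- c("base", "stats", "graphics")
available <- paste0("package:", core) %in% attached
names(available) <- core
print(available)
```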

The course takes the form of a tutorial. Each unit starts with a clear problem to solve, and the sections of the unit provide guidelines for reaching the solution.

# 2 Requirements

## 2.1 What is R?

R is a free language and environment for data manipulation, statistical analysis and graphical display.

## 2.2 Who needs R?

If you only have to analyze a small dataset, your model fitting is limited to linear regression, and your statistics are mostly limited to the mean and standard deviation, then R does not pay back the time needed to learn it.

Instead, R becomes an interesting option if:

• You need to re-apply the same procedure or the same plot a number of times
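A minimal sketch of what re-applying the same procedure can look like: a small function is written once and then applied to several columns of the built-in `iris` dataset. The helper name `describe` is hypothetical and chosen only for this illustration.

```r
# Hypothetical helper: summarize one numeric vector.
describe <- function(x) {
  c(mean = mean(x), sd = sd(x), n = length(x))
}

# Re-apply the same procedure to the four numeric columns of iris.
results <- sapply(iris[, 1:4], describe)
print(round(results, 2))
```

The result is a matrix with one column per variable, so adding a variable to the analysis does not require writing any new code.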

Finally, R is your choice if:

• You need to apply a statistical model over a number of variables;
• Your dataset is composed of thousands of columns (variables) and thousands of rows (samples);
• Advanced statistics (e.g. multivariate analysis) are needed.
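To illustrate the first point, the sketch below fits the same simple linear model once per predictor and collects the coefficient of determination of each fit. It uses the built-in `mtcars` dataset; the choice of predictors and the names `predictors` and `r_squared` are assumptions made for this example only.

```r
# Candidate predictors of fuel consumption (mpg) in the built-in mtcars dataset.
predictors <- c("wt", "hp", "disp")

# Fit one simple linear model per predictor and keep its R-squared.
r_squared <- sapply(predictors, function(v) {
  fit <- lm(reformulate(v, response = "mpg"), data = mtcars)
  summary(fit)$r.squared
})
print(round(r_squared, 3))
```

Extending the analysis to more variables only requires a longer `predictors` vector.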

## 2.3 Which kind of people use R?

• R programmers. They have previous knowledge of programming languages, such as C, C++ or Java. Execution time and CPU usage are their primary concerns.
• R advanced users. Develop user-defined functions to automate their work.
• R users. Write scripts. Summarize data. Draw plots.
• R executors. Copy and paste code written by others to draw graphs or summary tables.

## 2.4 How to install R?

Installers of R for Linux, macOS or Windows are available here:

## 2.5 How to install RStudio?

RStudio is an editor for R. You can download the open source edition of RStudio Desktop from the following website:

## 2.6 Finding help

There are many ways to get help. The simplest is to type help() at the R prompt.
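A few common ways of asking for help from the R prompt are sketched below. The topics (`mean`, `sd`, `regression`) and the variable name `matches` are chosen only for illustration.

```r
help("mean")                    # open the documentation page for mean()
?sd                             # shorthand for help("sd")
help.search("regression")       # keyword search across installed documentation

# List the visible objects whose names start with "read."
matches <- apropos("^read\\.")
print(matches)
```

`apropos()` is handy when only part of a function name is remembered, while `help.search()` looks inside the documentation itself.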

Official manuals are at the following website:

Support for working with RStudio is available here: