Data Analysis

Data Analysis with R

1 Introduction

This course is intended for students and experimenters who would like to work with advanced statistical software for the first time. Accordingly, experienced R users or programmers will find it of limited interest. Instead, the course aims to motivate students of scientific disciplines and to help them improve their statistical skills, the quality of their graphical output and the readability of their reports.

The course is structured in the following Units:

  1. Read and save data
  2. Data-frames
  3. Data management
  4. User defined functions
  5. Looping
  6. Data consistency
  7. Summarize data
  8. Plot data
  9. Simple comparisons
  10. Analysis of variance
  11. Contrast analysis
  12. Linear regression models
  13. Non linear regression models
  14. Model validation
  15. Principal component analysis
  16. Multivariate regression models
  17. Linear discriminant analysis
  18. Non parametric modelling
  19. Quality control

This course mostly uses packages distributed with R (such as base, stats and graphics). The use of external packages is kept to a minimum.

The approach used is in the form of a tutorial. Each Unit starts with a clear problem to solve. The sections of each unit will provide guidelines to achieve the solution.

2 Requirements

2.1 What is R?

R is a free language and environment for data manipulation, statistical analysis and graphical display.

2.2 Who needs R?

If you only have to analyze a small dataset, your model fitting is limited to linear regression, and your statistics mostly involve means and standard deviations, then R does not pay back the time needed to learn it.

Instead, R becomes an interesting option if:

  • You need to re-apply the same procedure or the same plot a number of times

Finally, R is your choice if:

  • You need to apply a statistical model over a number of variables;
  • Your dataset is composed of thousands of columns (variables) and thousands of rows (samples);
  • Advanced statistics (e.g. multivariate analysis) are needed.
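
As an illustration of applying one statistical model over several variables, here is a minimal sketch. It assumes the built-in mtcars dataset and an arbitrary choice of response variables (mpg, hp, qsec) regressed on wt. These names are picked only for the example and are not part of the course material.

```r
# Response variables to model (an arbitrary choice for this sketch)
responses <- c("mpg", "hp", "qsec")

# Fit the same simple linear regression, response ~ wt, for each variable
models <- lapply(responses, function(response) {
  lm(reformulate("wt", response = response), data = mtcars)
})
names(models) <- responses

# Collect the estimated slope of wt from every fitted model
slopes <- sapply(models, function(m) coef(m)[["wt"]])
print(slopes)
```

Adding a variable name to responses is enough to extend the analysis, which is the kind of repetition that is tedious to do by hand.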

2.3 Which kind of people use R?

  • R programmers. They have previous knowledge of programming languages such as C, C++ or Java. Execution time and CPU usage are their primary concerns.
  • R advanced users. They develop user-defined functions to automate their work.
  • R users. They write scripts, summarize data and draw plots.
  • R executors. They copy and paste code written by others to draw graphs or summary tables.

2.4 How to install R?

R installers for Linux, macOS or Windows are available here:

2.5 How to install RStudio?

RStudio is an editor for R. You can download the open-source RStudio Desktop edition from the following website:

2.6 Finding help

There are many ways to get help. The simplest is to type help() at the R console, for example help(mean).
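
A few common ways of asking R for help can be sketched as follows. The function mean and the search pattern are only examples chosen for illustration. All calls come from the packages attached by default in a standard R session.

```r
# Open the documentation page of a function (here: mean)
help("mean")

# The question mark is a shorthand for help()
?mean

# List the objects on the search path whose name starts with "read."
read_functions <- apropos("^read\\.")
print(read_functions)

# Run the examples included in a documentation page
example("mean")
```

When the name of the function is unknown, help.search("keyword") searches the installed documentation by topic instead.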

Official manuals are at the following website:

Support for working with RStudio is available here:

