Training Courses

Introduction

Our training offer ranges from courses for R beginners to specialization courses. The specialist courses cover both specific areas of analysis and advanced programming techniques for getting the desired results out of the technology.

Our trainers have experience in both corporate and academic training, and they actively use the techniques covered in these courses in business solutions and research.

Data Science Courses (2-day courses)

1. R for Data Science

This course introduces attendees to modern R for data analysis. RStudio and the Tidyverse are the core tools used for data manipulation, visualization, and modeling, and for exporting the results as clean, simple reports.

2. Analysis Communication: Data Visualization & Automated Reports

Communication is the last and most delicate step of an analysis: sophisticated results have to reach people who are not technical and who do not have deep statistical skills. It is therefore important to know how to convey the message simply, with the help of expressive, well-thought-out graphics. Finally, we will look at the R technologies that make reports reproducible on updated data with a single click.

This course will teach you how to use ggplot2 as well as interactive JS-based visualization tools. The second part will show you how to build customized, automated PDF or HTML reports and slides with R Markdown.

3. Web Dashboard with R

R and Shiny are an effective way to build a pilot program: a small-scale feasibility study, made of short-term and inexpensive experiments, that helps an organization discover whether a project could be useful for its business. In this course we will learn how to build a pilot application with Shiny, an R framework suited to expressing your data workflow and making it accessible as a polished dashboard to anyone with a web browser.

4. R and Big Data

When the data grows, it is important to find the solution that best suits your needs. In this course we start by using databases to analyze data much larger than what is accessible with plain R. We then increase the data size step by step until we understand how distributed architectures work and how to use Apache Hadoop and Apache Spark from R with sparklyr.

Programming Specialization (1-day courses)

1. Professional Programming

Writing code means creating a piece of software functionality. Writing good code, however, has stronger requirements in terms of reliability, robustness, reusability, and extensibility. This means the problem must be solved correctly across a wide variety of cases, in a readable and extensible way. This course describes best practices for the following working methods: debugging, tracing, error handling, logging, assertions, documentation, and unit testing.

2. Functional programming in R

This class introduces functional programming. It starts from the mathematical concepts underlying the paradigm and then presents the ready-to-use packages that implement it. The class explains the differences between pure functional programming and programming with mutable state through closures, the map-reduce paradigm, and the common ground between OOP (Object Oriented Programming) and FP (Functional Programming).

3. Parallel computing

This class covers the basics of parallel programming: its main constraints, and the main parallelization paradigms and how they differ (shared memory, distributed computation, map-reduce, futures, …). We will do exercises with several R packages and measure the time gained. (R packages: “future”, “parallel”, “foreach” and others)

4. Tidy evaluation and its advantages with dplyr

Tidy evaluation is widely used in R, especially in the Tidyverse. Those who are familiar with dplyr already have an intuition of how it works. Very often, however, you find yourself writing project-specific functions that have to work together with the Tidyverse. For consistency, and to exploit the full potential of the framework, you will want those new functions to use the same non-standard evaluation syntax. This course presents the theoretical and practical foundations of this approach.

5. Optimize R code

In computationally expensive projects, it is important to know how to optimize the performance of your algorithm. That requires a tool for finding the bottlenecks and a sense of direction among the possible solutions. In a nutshell: can I solve this quickly by optimizing the R code, or do I need to rewrite the affected part in a compiled language? In this course we will see how to profile code (analyze its run time) directly from RStudio and what the main rules for writing efficient R code are. We will also cover the basics of the interaction between C++ and R, and try to understand how much performance it gains and at what price.

IT Training (2-day course)

1. DevOps R

This course is not for data scientists but for operations staff. It will teach you how to set up a server with any R service: RStudio Server, Shiny Server (free or Pro version), ShinyProxy (the open source Shiny server based on Docker), and custom services built with R.

We will install this software on a Linux system and see how best to use the operating system's features to provide services to end users with appropriate security. Finally, we will make everything automated and reproducible with Ansible and set up a CI/CD pipeline that automatically installs the latest version of the software only if the automated tests pass.

Trainers

Andrea Melloncelli

Andrea is an expert R and Shiny developer and consultant. He has solid experience in Python and Scala development, along with extensive skills in Unix system administration, IT automation tools, cloud technologies, and big-data platforms such as Hadoop and Spark. He has taught R basics and advanced topics in live and remote courses at universities, in master's programs, and in companies. Andrea graduated in Physics from the Università Degli Studi Di Milano.

Linkedin

Mariachiara Fortuna

CV

Linkedin

Contact form