Introduction to Sparklyr

Open to: Government analysts
Training category: Analytical
Type of training: Online
Length: 2 days
Organiser: Data Science Campus Faculty
Provider: Data Science Campus Faculty
Location: Online

This course will give you an understanding of Sparklyr, the R interface to the distributed processing engine Apache Spark. Sparklyr helps you to process, query and manipulate data sets that are too large to handle comfortably in memory on a single machine.

The course will:

  • cover distributed processing
  • give a strong introduction to the main data structure of Sparklyr
  • teach you how to investigate data, combine it, query it and run complex transformations on it

This is a practical course. You will write a lot of code throughout, and there will be plenty of opportunities to practise what you are learning.

Learning outcomes

On this course you will:

  • gain confidence using Sparklyr
  • gain an understanding of distributed programming
  • learn how to import and export data
  • learn how to investigate data sets
  • learn how to manipulate data sets
  • learn how to draw conclusions from data
  • learn how to perform basic visualisation
  • gain the knowledge to handle large data sets with efficient code

How to book

Please use your Learning Hub account to enrol on this course.

If you do not have a Learning Hub account, please contact Data.Science.Campus.Faculty@ons.gov.uk.

Contact

If you have any questions about this course, please contact us at Data.Science.Campus.Faculty@ons.gov.uk.

Related courses

Introduction to Pyspark