Introduction to Sparklyr
- Open to: Government analysts
- Training category: Analytical
- Type of training: Online
- Length: 2 days
- Organiser: Data Science Campus Faculty
- Provider: Data Science Campus Faculty
- Location: Online
Description
This course will give you an understanding of Sparklyr, the R interface to Apache Spark, a distributed processing engine. With it, you will be able to process, query, and manipulate data sets that are too large to handle comfortably in memory on a single machine.
Over two days, the course will cover the why and how of distributed processing, give a strong introduction to the key data structure of Sparklyr, and teach you how to investigate data, combine it, query it, and run complex transformations on it. The approach is hands-on: you will write code throughout the material and immediately try out what you have just learnt.
By the end of the course, you will be able to ingest, investigate, and manipulate very large data sets and draw meaningful conclusions from them. You will also be able to produce simple visualisations of your data, and you will know how to handle data efficiently and effectively.
Learning outcomes
By the end of the course you will:
- be confident using Sparklyr
- have an understanding of distributed programming
- be able to import and export data
- be able to investigate data sets
- be able to manipulate data sets
- be able to draw conclusions from data
- be able to perform basic visualisation
- have the knowledge to handle large data sets with efficient code
How to book
Please use your Learning Hub account to enrol on this course.
If you do not have a Learning Hub account, please contact Data.Science.Campus.Faculty@ons.gov.uk.
Contact
If you have any questions about this course, please contact us at Data.Science.Campus.Faculty@ons.gov.uk.