Ignite your interest in Apache Spark with an introduction to the core concepts that make this general-purpose processing engine an essential tool for working with big data. Get hands-on experience with Spark in our lab exercises, hosted in the cloud.
A solid understanding of, and experience with, the core tools in any field promotes excellence and innovation. Apache Spark, a general engine for large-scale data processing, is such a tool in the big data realm. This learning path addresses the fundamentals of Spark's design and its everyday application.
About this learning path
Have you ever left a report running overnight, only to come back to your computer in the morning and find it still running? When the heat is on and you have a deadline, that is a sign something is not working. As data sets grow larger, you need to be fluent in the right tools to meet your commitments. This learning path is your opportunity to learn about Spark from industry leaders. It provides hands-on exercises and projects to build your confidence with this tool set.
Building on your foundational knowledge of Spark, take this opportunity to move your skills to the next level. With a focus on Spark Resilient Distributed Dataset (RDD) operations, this course exposes you to concepts that are critical to your success in this field.
Spark provides a machine learning library known as MLlib. MLlib includes machine learning algorithms for classification, regression, clustering, and collaborative filtering. It also provides tools for featurization, pipelines, and persistence, along with utilities for linear algebra, statistics, and data handling.
Spark provides a graph-parallel computation library called GraphX. Graph-parallel computation is a paradigm that lets you represent your data as vertices and edges. GraphX provides a set of fundamental operators, along with a growing collection of algorithms and builders, to simplify graph analytics tasks.
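To make the vertices-and-edges idea concrete, here is a minimal sketch in plain Python. It does not use Spark or the GraphX API; the sample names, the edge list, and the degrees helper are all hypothetical and chosen only for illustration. It stores a tiny directed graph as a vertex table and an edge list, then counts incoming and outgoing edges per vertex, which is the kind of basic question a graph library answers with built-in operators.

```python
from collections import defaultdict

# Hypothetical sample data: vertex id -> name, and directed (source, destination) edges.
vertices = {1: "alice", 2: "bob", 3: "carol"}
edges = [(1, 2), (1, 3), (2, 3)]


def degrees(edge_list):
    """Count in-degree and out-degree per vertex from a list of directed edges."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edge_list:
        out_deg[src] += 1
        in_deg[dst] += 1
    return dict(in_deg), dict(out_deg)


in_deg, out_deg = degrees(edges)

# Vertices with no matching edges simply report a degree of zero.
for vid, name in vertices.items():
    print(name, "in:", in_deg.get(vid, 0), "out:", out_deg.get(vid, 0))
```

In a distributed setting the vertex table and edge list would be partitioned across a cluster rather than held in local Python objects, but the data model is the same.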
A single firework launched into the night sky is only so effective, but when it is clustered together with others, magic happens and the sky comes to life. In this course, learn how to apply a similar type of magic when working with Apache Spark to analyze big data in R. Don't blink or you might miss it!