Cognitive Class

Exploring Spark’s GraphX

Apache Spark provides a graph-parallel computation library in GraphX. Graph-parallel is a paradigm that allows representation of your data as vertices and edges. Spark GraphX provides a set of fundamental operators in addition to a growing collection of algorithms and builders to simplify graph analytics tasks.

About This Spark GraphX Course

In this course, you will learn about Spark GraphX components and the background of graph-parallel operations. You will see how Spark implements this with RDDs and how it compares with Data Parallel operations. You will also explore how to visualize your data using various graph operators.

Course Syllabus

Module 1 - Introduction to Graph-Parallel

  1. Learn about GraphX components, construction, and background
  2. See how Data Parallel, Graph-Parallel, and RDDs tie in with GraphX

Module 2 - Visualizing Spark GraphX and Exploring Graph Operators

  1. Learn how GraphX handles visualizations, create views, and look at alternative options
  2. Take a look at a few introductory Graph Operators and PageRank

Module 3 - Modifying Spark GraphX

  1. Understand how GraphX deals with modifications and RDDs
  2. Take a look at Property Operators and Structural Operators, and how to use them

Module 4 - Neighborhood Aggregation and Caching

  1. Learn about Neighborhood Aggregation - aggregateMessages
  2. Learn how to cache with GraphX and take a look at other graph

General Information

  • This course is free.
  • It is self-paced.
  • It can be taken at any time.
  • It can be audited as many times as you wish.

Recommended skills prior to taking this course

  • None

Requirements

  • None

Course Staff

Kevin Wong, Spark GraphX Course Instructor

Kevin Wong is a Technical Curriculum Developer. He enjoys developing courses that focus on education in the Big Data field. Kevin updates courses to be compatible with the newest software releases, recreates courses on the new cloud environment, and develops new courses such as Introduction to Machine Learning. In addition to contributing to the content on Cognitive Class, he has worked with various components that deal with Big Data, including Hadoop, Pig, Hive, Phoenix, HBase, MapReduce & YARN, Sqoop, and Oozie. Kevin is working on obtaining a degree in Computer Engineering from the University of Alberta.
