Cognitive Class

Controlling Hadoop Jobs Using Oozie

Are you an Oozie? Or do you just appreciate their skills? Come and learn how this elephant keeper, or rather workflow controller, keeps Big Data projects in line.

About this Course

Learn how Oozie controls Hadoop jobs.

  • See the components required to code a workflow as well as optional components such as case statements, forks, and joins.
  • Learn how to use the Oozie coordinator to schedule a workflow. You will quickly notice that workflows are coded in XML, which tends to get verbose.
  • Learn about a graphical workflow editor tool designed to simplify the work in generating a workflow.
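
To give a feel for the workflow components mentioned above, here is a minimal sketch that uses Python's standard library to generate a small workflow definition containing a fork and a join. It is an illustration, not course material: the function name build_workflow, the node names (split, merge, fail), the directory paths, and the schema version in WORKFLOW_NS are all assumptions chosen for the example, and a real workflow would normally use full HDFS paths and more substantial actions.

```python
import xml.etree.ElementTree as ET

# Assumed schema version; check the version your Oozie installation supports.
WORKFLOW_NS = "uri:oozie:workflow:0.5"


def build_workflow(name, branch_dirs):
    """Return an Oozie-style workflow definition as an XML string.

    branch_dirs maps a hypothetical action name to a directory that the
    action creates. All actions run in parallel between a fork and a join.
    """
    root = ET.Element("workflow-app", {"name": name, "xmlns": WORKFLOW_NS})
    ET.SubElement(root, "start", {"to": "split"})

    # The fork lists one path per parallel branch.
    fork = ET.SubElement(root, "fork", {"name": "split"})
    for action_name in branch_dirs:
        ET.SubElement(fork, "path", {"start": action_name})

    # Each branch is a simple file-system action that creates a directory.
    for action_name, directory in branch_dirs.items():
        action = ET.SubElement(root, "action", {"name": action_name})
        fs = ET.SubElement(action, "fs")
        ET.SubElement(fs, "mkdir", {"path": directory})
        ET.SubElement(action, "ok", {"to": "merge"})
        ET.SubElement(action, "error", {"to": "fail"})

    # The join waits for every branch, then moves on to the end node.
    ET.SubElement(root, "join", {"name": "merge", "to": "end"})
    kill = ET.SubElement(root, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = "A branch failed"
    ET.SubElement(root, "end", {"name": "end"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_workflow("demo-wf", {"make-a": "/tmp/a", "make-b": "/tmp/b"}))
```

Even this two-branch example produces well over a dozen XML elements, which shows why hand-written workflows get verbose and why a graphical editor can help.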

Course Syllabus

  • Module 1 - Introduction to Oozie Workflows
    1. Explain the use for Oozie workflows
    2. Describe a workflow
    3. List some of the workflow elements
  • Module 2 - Oozie Coordinator
    1. Explain the use for the Oozie coordinator
    2. List some of the coordinator elements
    3. Describe how to submit a workflow job and a coordinator job
  • Module 3 - BigInsights Workflow Editor
    1. Explain how to publish an application
    2. Describe how to define a recurring schedule for an application
    3. Explain how to link multiple applications to form a new application

General Information

  • This course is free.
  • It is self-paced.
  • It can be taken at any time.
  • It can be audited as many times as you wish.
  • Labs can be performed on the Cloud or on a 64-bit system. If you use a 64-bit system, you can either install the required software (Linux only) or use the supplied VMware image. More details are provided in the course.

Recommended skills prior to taking this course


Course Staff

Glen R.J. Mules

Glen R.J. Mules is a Senior Instructor and Principal Consultant with IBM Information Management World-Wide Education and works from New Rochelle, NY. Glen joined IBM in 2001 as a result of IBM's acquisition of Informix Software. He has worked at IBM, and previously at Informix Software, as an instructor, a course developer, and in the enablement of instructors worldwide. He teaches courses in Big Data (BigInsights and Streams), Optim, Guardium, and the DB2 and Informix databases. He has a BSc in Mathematics from the University of Adelaide, South Australia; an MSc in Computer Science from the University of Birmingham, England; and has just completed a PhD in Education (Educational Technology) at Walden University. His early work life was as a high school teacher in Australia. In the 1970s he designed, programmed, and managed banking systems in Manhattan and Boston. In the 1980s he was a VP in Electronic Payments for Bank of America in San Francisco and New York. In the early 1990s he was an EVP in Marketing for a software development company and chaired the ANSI X12C Standards Committee on Data Security for Electronic Data Interchange (EDI).

Warren Pettit

Warren Pettit has been with IBM for over 30 years. For the last 16 years, he has worked in Information Management education where he has been both an instructor and a course developer in the Data Warehouse and Big Data curriculums. For the nine years prior to his joining IBM, he was an application programmer and was responsible for developing a training program for newly hired programmers.