Big Data University

MapReduce and YARN

  • Classes Start
    Any time, Self-paced
  • Estimated Effort
    4 hours


Learn why Apache Hadoop is one of the most popular tools for big data processing.

  • Learn why it is reliable, scalable, and cost-effective.
  • Learn about its revolutionary features, including Yet Another Resource Negotiator (YARN), HDFS Federation, and high availability.
  • Learn how job execution is controlled in the MapReduce framework.
  • Get insights into the design and implementation of YARN.


  • Module 1 - About MapReduce
    1. The MapReduce model v1
  • Module 2 - Limitations
    1. Limitations of Hadoop 1 and MapReduce 1
  • Module 3 - Classes and Access
    1. Review of the Java code required to handle the Mapper class, the Reducer class, and the program driver needed to access MapReduce
  • Module 4 - About YARN
    1. The YARN model
  • Module 5 - Comparisons
    1. Comparison of YARN / Hadoop 2 / MR2 vs Hadoop 1 / MR1
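
As a rough illustration of the three roles named in Module 3 (mapper, reducer, and driver), here is a minimal word-count sketch in plain Java. It deliberately does not use the Hadoop API. The class name WordCountSketch and its methods are hypothetical and only mimic the shape of a MapReduce job in memory; they are not taken from the course material.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free sketch of the mapper / reducer / driver roles.
// Assumes Java 9 or later (Map.entry, List.of).
public class WordCountSketch {

    // Mapper role: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reducer role: combine all values emitted for a single key.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver role: wire the phases together. Grouping values by key here
    // stands in for the shuffle-and-sort step between map and reduce.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {big=2, data=2, tools=1}
        System.out.println(run(List.of("big data big", "data tools")));
    }
}
```

In a real Hadoop job the same three pieces would typically extend or configure the framework's own Mapper, Reducer, and Job classes, and the grouping step would be performed by the framework across the cluster rather than in a local map.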


  • This course is free.
  • It is self-paced.
  • It can be taken at any time.
  • It can be audited as many times as you wish.


  • Basic understanding of Apache Hadoop and Big Data.
  • Basic Linux Operating System knowledge.
  • Basic understanding of the Scala, Python, or Java programming languages.


  • None


Glen R.J. Mules

Glen R.J. Mules is a Senior Instructor and Principal Consultant with IBM Information Management World-Wide Education and works from New Rochelle, NY. Glen joined IBM in 2001 as a result of IBM's acquisition of Informix Software. He has worked at IBM, and previously at Informix Software, as an instructor, a course developer, and in the enablement of instructors worldwide. He teaches courses in Big Data (BigInsights and Streams), Optim, Guardium, and the DB2 and Informix databases. He has a BSc in Mathematics from the University of Adelaide, South Australia; an MSc in Computer Science from the University of Birmingham, England; and has just completed a PhD in Education (Educational Technology) at Walden University. His early work life was as a high school teacher in Australia. In the 1970s he designed, programmed, and managed banking systems in Manhattan and Boston. In the 1980s he was a VP in Electronic Payments for Bank of America in San Francisco and New York. In the early 1990s he was an EVP in Marketing for a software development company and chaired the ANSI X12C Standards Committee on Data Security for Electronic Data Interchange (EDI).


Joe Byers

Joe Byers is a Senior Technical Curriculum Developer with IBM World-Wide Education and develops training on various media formats for Business Intelligence, Predictive Analytics, and Information Management. Joe came to IBM in 2007. Prior to that he was a Technical Manager for Oracle Corporation, where he spent 10 years engineering and architecting data warehouses. Joe performed his undergraduate and graduate studies in Computer Science at Indiana University, after which he became a database programmer for Sears Craftsman Tools in Blue Ash, Ohio. In the 1990s, prior to joining Oracle, Joe was the Data Administrator at The Andrew Jergens Company in Cincinnati, Ohio, responsible for the entire company's data. While at Andrew Jergens, Joe became an Oracle Master and earned his Oracle Certified DBA (OCP). Joe regularly leads BI sessions at IBM's annual Insight conference in Las Vegas. Joe's other professional certifications include five different certifications in IBM Cognos products, two certifications in IBM TM1, as well as certifications and badges in IBM DB2, Spark, Hadoop, and IBM Big Data.



Kevin Wong

Kevin Wong is a Technical Curriculum Developer. He enjoys developing courses that focus on education in the Big Data field. Kevin updates courses to be compatible with the newest software releases, recreates courses on the new cloud environment, and develops new courses such as Introduction to Machine Learning. In addition to contributing to the transition of BDU, he has worked with various components that deal with Big Data, including Hadoop, Pig, Hive, Phoenix, HBase, MapReduce and YARN, Sqoop, and Oozie. Kevin is from the University of Alberta, where he has completed his third year of Computer Engineering Co-op, and is currently an IBM Co-op Student.


Leons Petrazickis

Leons Petrazickis is the Ombud for Hadoop content on IBM Big Data U as well as the Platform Architect for Big Data U Labs. As a senior software developer at IBM, he uses Ruby, Python, and JavaScript to develop microservices and web applications, as well as to manage containerized infrastructure.