Data Science Tools

Learn and try out the most popular data science tools, such as Jupyter Notebooks, RStudio IDE, Apache Zeppelin, OpenRefine, and more. The tools are available to use directly in the cloud at no charge.

Note: This is an updated version of the old course. If you were enrolled in the older version (before September 18, 2017), you can continue your progress here: link.

About This Course

Get started with some of the most popular tools for collaborative data science, including RStudio IDE, Jupyter Notebooks, Apache Zeppelin notebooks, Seahorse, and OpenRefine. Use the tools directly on Cognitive Class Labs, a free virtual lab environment that brings powerful open data science tools together so you can analyze, visualize, explore, and clean data, run models, and create apps.

Course Syllabus

  • Module 1 - Introducing Cognitive Class Labs
    • What is Cognitive Class Labs?
    • Data Scientist Workbench Account features
    • Creating a Data Scientist Workbench account
    • Managing data within My Data
    • LAB: Getting Started with Cognitive Class Labs
  • Module 2 - Introducing Jupyter Notebooks
    • What are Jupyter notebooks?
    • Getting started with Jupyter
    • Data and Notebooks in Jupyter
    • Sharing your Jupyter Notebooks and data
    • Apache Spark in Jupyter Notebooks
    • LAB: Getting Started with Jupyter Notebooks
  • Module 3 - Introducing Zeppelin Notebooks
    • What are Zeppelin Notebooks?
    • Zeppelin for Scala
    • Getting started with Zeppelin
    • Managing your Interpreters in Zeppelin
    • Apache Spark in Zeppelin Notebooks
    • LAB: Getting Started with Apache Zeppelin Notebooks
  • Module 4 - Introducing RStudio IDE
    • What is RStudio IDE?
    • Uploading files, installing packages, and loading libraries in RStudio IDE
    • Getting started with RStudio IDE
    • RStudio Environment and History
    • Apache Spark in RStudio IDE
    • LAB: Getting Started with RStudio IDE
  • Module 5 - Introducing Seahorse
    • What is Seahorse?
    • A Glimpse of Seahorse's Features
    • Getting started with Seahorse on Cognitive Class Labs
    • Creating and uploading Seahorse Workflows on Cognitive Class Labs
    • Exporting and Cloning the Seahorse Examples on Cognitive Class Labs
    • LAB: Getting Started with Seahorse
  • Module 6 - Introducing OpenRefine
    • Preparing data with OpenRefine
    • LAB: Getting Started with OpenRefine

General Information

  • This course is free.
  • It is self-paced.
  • It can be taken at any time.
  • It can be audited as many times as you wish.

Recommended skills prior to taking this course

  • None


Course Staff

Polong Lin, Data Science Bootcamp instructor

Polong Lin

Polong Lin is a Data Scientist at IBM in Canada. Under the Emerging Technologies division, Polong is responsible for educating the next generation of data scientists through BDU. Polong is a regular speaker at conferences and meetups, and holds an M.Sc. in Cognitive Psychology.

Dr. Saeed Aghabozorgi, Data Science Bootcamp instructor

Saeed Aghabozorgi

Saeed Aghabozorgi, PhD, is a Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase clients’ ability to turn data into actionable knowledge. He is a researcher in the data mining field and an expert in developing advanced analytic methods, such as machine learning and statistical modelling, on large datasets.