Data Science Methodology

  • Classes Start
    Any time, Self-paced
  • Estimated Effort
    3 hours


Despite the increase in computing power and access to data over the last couple of decades, our ability to use data within the decision-making process is too often either lost or not maximized. All too often, we don't have a solid understanding of the questions being asked or of how to apply the data correctly to the problem at hand.

This course has one purpose: to share a methodology that can be used within data science to ensure that the data used in problem solving is relevant and properly manipulated to address the question at hand.

Accordingly, in this course, you will learn:

  • The major steps involved in tackling a data science problem.
  • The major steps involved in practicing data science, from forming a concrete business or research problem, to collecting and analyzing data, to building a model, to understanding the feedback after model deployment.
  • How data scientists think!
Please note that version 3.0 of this course was released on September 15, 2017. Please refer to the Change Log section in the course for a detailed description of the changes and updates.

You can start creating your own data science projects and collaborating with other data scientists using IBM Data Science Experience. When you sign up, you get free access to Data Science Experience. Start now and take advantage of this platform.


Module 1: From Problem to Approach

  • Business Understanding
  • Analytic Approach

Module 2: From Requirements to Collection 

  • Data Requirements
  • Data Collection

Module 3: From Understanding to Preparation 

  • Data Understanding
  • Data Preparation

Module 4: From Modeling to Evaluation

  • Modeling
  • Evaluation

Module 5: From Deployment to Feedback

  • Deployment
  • Feedback


  • This course is free.
  • It is self-paced.
  • It can be taken at any time.
  • It can be audited as many times as you wish.


  • Data Science Hands-on with Open Source Tools (DS0105EN)


  • Passion for Data Science


John B. Rollins, Instructor of Data Science Methodology

John B. Rollins

John B. Rollins, Ph.D., P.E., is a Data Scientist at IBM. He is part of the IBM Analytics group and holds a Ph.D. in Petroleum Engineering and Economics from Texas A&M University. With a background in engineering consulting and as a professor and researcher, he has authored many patents, books, and papers. IBM has recognized him with honors and awards as an IBM Second Plateau Inventor. He has extensive experience in data science methodology.

Polong Lin, Data Science Bootcamp instructor

Polong Lin

Polong Lin is a Data Scientist and Lead Data Science Advocate at IBM in Canada. Polong co-organizes the largest data science meetup group in Canada and regularly speaks at conferences about data science. Polong holds an M.Sc. in Cognitive Psychology.


Dr. Alex Aklson, Data Visualization with Python Course Instructor

Alex Aklson

Alex Aklson, Ph.D., is a data scientist in the Digital Business Group at IBM Canada. Alex has been intensively involved in many exciting data science projects such as designing a smart system that could detect the onset of dementia in older adults using longitudinal trajectories of walking speed and home activity. Before joining IBM, Alex worked as a data scientist at Datascope Analytics, a data science consulting firm in Chicago, IL, where he designed solutions and products using a human-centred, data-driven approach. Alex received his Ph.D. in Biomedical Engineering from the University of Toronto.