Offered By: IBM
Data Analysis with Python
Data analysis has long been an important field and an in-demand skill. Until recently, it was practiced mostly with closed, expensive, and limited tools such as Excel or Tableau. Python, pandas, and other open-source libraries have changed data analysis and have become must-have tools for anyone looking to build a career as a data analyst.
DA0101EN
Course
Data Analysis
17.1k+ Enrolled
At a Glance
You will learn how to:
- Import data sets
- Clean and prepare data for analysis
- Manipulate pandas DataFrames
- Summarize data
- Build machine learning models using scikit-learn
- Build data pipelines
Data Analysis with Python is delivered through lectures, hands-on labs, and assignments. It includes the following parts:
- Data Analysis libraries: You will learn to use pandas DataFrames, NumPy multi-dimensional arrays, and SciPy libraries to work with various datasets. We will introduce you to pandas, an open-source library, and use it to load, manipulate, analyze, and visualize datasets. Then we will introduce you to another open-source library, scikit-learn, and use some of its machine learning algorithms to build models and make predictions.
- Learning Objectives
- Understanding the Domain
- Understanding the Dataset
- Python Packages for Data Science
- Importing and Exporting Data in Python
- Basic Insights from Datasets
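As a rough illustration of the importing, exporting, and basic-insights topics above, the sketch below loads a small table with pandas and inspects it. The inline CSV and its column names (`make`, `horsepower`, `price`) are made up for this example and are not the course dataset.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file path or URL.
csv_text = """make,horsepower,price
audi,102,13950
bmw,101,16430
toyota,62,5348
"""

# Import: read_csv accepts a path, a URL, or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Basic insights: column types and summary statistics.
print(df.dtypes)
print(df.describe())

# Export: to_csv with no path returns the CSV text as a string.
exported = df.to_csv(index=False)
```

In practice you would pass a file name to `read_csv` and `to_csv` instead of working with in-memory strings.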
Module 2 - Cleaning and Preparing the Data
- Identify and Handle Missing Values
- Data Formatting
- Data Normalization
- Binning
- Indicator Variables
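The cleaning and preparation steps listed above could look roughly like this in pandas. This is a minimal sketch on an invented four-row table; the choices shown (mean imputation, min-max scaling, three equal-width bins) are common options assumed for illustration, not necessarily the ones the course uses.

```python
import numpy as np
import pandas as pd

# Invented data: numbers stored as strings, with one missing value.
df = pd.DataFrame({
    "fuel": ["gas", "diesel", "gas", "gas"],
    "horsepower": ["100", "60", None, "140"],
})

# Data formatting: convert the string column to a numeric dtype.
df["horsepower"] = pd.to_numeric(df["horsepower"], errors="coerce")

# Missing values: replace NaN with the column mean.
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].mean())

# Normalization: min-max scaling to the range [0, 1].
hp = df["horsepower"]
df["horsepower_scaled"] = (hp - hp.min()) / (hp.max() - hp.min())

# Binning: three equal-width bins with readable labels.
edges = np.linspace(hp.min(), hp.max(), 4)
df["horsepower_bin"] = pd.cut(
    hp, bins=edges, labels=["low", "medium", "high"], include_lowest=True
)

# Indicator variables: one column per category of "fuel".
df = pd.concat([df, pd.get_dummies(df["fuel"], prefix="fuel")], axis=1)
print(df)
```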
Module 3 - Summarizing the Data Frame
- Descriptive Statistics
- Basics of Grouping
- ANOVA
- Correlation
- More on Correlation
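The summarizing topics above map onto a handful of pandas and SciPy calls. The following is only a sketch on invented data (the `drive`, `engine_size`, and `price` columns are assumptions for the example): it computes descriptive statistics, a grouped mean, a Pearson correlation, and a one-way ANOVA across groups.

```python
import pandas as pd
from scipy import stats

# Invented data: two drive types with different price levels.
df = pd.DataFrame({
    "drive": ["fwd", "fwd", "fwd", "rwd", "rwd", "rwd"],
    "engine_size": [90, 100, 110, 150, 170, 190],
    "price": [7000, 8000, 9000, 15000, 17000, 19000],
})

# Descriptive statistics for one column.
summary = df["price"].describe()

# Grouping: mean price per drive type.
mean_price = df.groupby("drive")["price"].mean()

# Correlation: Pearson coefficient and its p-value.
r, p_value = stats.pearsonr(df["engine_size"], df["price"])

# ANOVA: do the group means of price differ across drive types?
groups = [g["price"].to_numpy() for _, g in df.groupby("drive")]
f_stat, anova_p = stats.f_oneway(*groups)

print(summary)
print(mean_price)
print(r, p_value, f_stat, anova_p)
```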
Module 4 - Model Development
- Simple and Multiple Linear Regression
- Model Evaluation Using Visualization
- Polynomial Regression and Pipelines
- R-squared and MSE for In-Sample Evaluation
- Prediction and Decision Making
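To make the model development topics above more concrete, here is a hedged sketch using scikit-learn on synthetic data. It fits a simple linear regression and a polynomial regression wrapped in a `Pipeline`, then compares in-sample R-squared and MSE. The data-generating formula and the pipeline step names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data: a noisy quadratic relationship.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 60).reshape(-1, 1)
y = 2.0 * x[:, 0] ** 2 + 1.0 + rng.normal(scale=0.5, size=60)

# Simple linear regression on the raw feature.
linear = LinearRegression().fit(x, y)

# Polynomial regression: scale, expand to degree-2 features, then fit.
poly = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("model", LinearRegression()),
]).fit(x, y)

# In-sample evaluation: R-squared and mean squared error.
scores = {}
for name, model in [("linear", linear), ("polynomial", poly)]:
    pred = model.predict(x)
    scores[name] = (r2_score(y, pred), mean_squared_error(y, pred))
    print(name, scores[name])
```

Because the data are quadratic, the polynomial pipeline should fit far better in-sample than the straight line; in-sample scores alone do not show how either model generalizes.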
Module 5 - Model Evaluation
- Model Evaluation
- Over-fitting, Under-fitting, and Model Selection
- Ridge Regression
- Grid Search
- Model Refinement
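The model evaluation topics above can be sketched with scikit-learn as follows. This example assumes synthetic data and an arbitrary grid of `alpha` values: it holds out a test set, tunes a Ridge regression with cross-validated grid search, and scores the refined model on unseen data.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data: five features, two of which have no effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Hold out a test set so evaluation uses data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Grid search: try several regularization strengths with 5-fold CV.
alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
search = GridSearchCV(Ridge(), param_grid={"alpha": alphas}, cv=5)
search.fit(X_train, y_train)

# Out-of-sample R-squared of the best model found by the search.
test_r2 = search.score(X_test, y_test)
print(search.best_params_, test_r2)
```

A large gap between training and test scores would suggest over-fitting, while low scores on both would suggest under-fitting.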
- This course is self-paced.
- It can be taken at any time.
- It can be audited as many times as you wish.
- Python programming, Statistics
- Some Python experience is expected
- Python for Data Science
Estimated Effort: 3 Hours
Level: Beginner
Skills You Will Learn: Data Analysis, Python, Data Science
Language: English