About This Course
Learn why data cleaning, preparation and enrichment take up an enormous amount of time, yet remain a crucial stage in the data science methodology.
Tools available for data transformation have not fully caught up with the new data science landscape.
- Learn why domain experts need powerful yet easy-to-use interfaces to explore new data sets, normalize them, and process them through innovative services that are often available only via an API.
- Learn how OpenRefine helps you minimize the time spent on data preparation, so you can spend more time building a model. It offers the best of both worlds: a self-service, agile, and iterative interface for data discovery and preparation, and an easy-to-learn scripting language.
- Module 1 - Introduction to OpenRefine
  - Introduction to Data Quality and Integration
  - Moving toward an Agile Data Process
  - OpenRefine History and Community
  - OpenRefine Interface Tour
  - Installing OpenRefine and Getting Started
- Module 2 - Data Mining and Discovery
  - Data Mining and Discovery Text Based Facet
  - Data Mining and Discovery Date and Number Based Facet
  - Data Mining and Discovery Text Combining Facet
  - Data Mining and Discovery Sorting Data
- Module 3 - Data Preparation and Normalization
  - Data Preparation and Normalization Clustering
  - Data Preparation and Normalization Remove Duplicate
  - Data Preparation and Normalization Split Multi Valued Cells
  - Data Preparation and Normalization Concatenation
  - Data Preparation and Normalization Using OpenRefine History Do/Undo
- Module 4 - General Refine Expression Language
  - General Refine Expression Language Introduction
  - General Refine Expression Language Replace Function
  - General Refine Expression Language Split and Work with Array
  - General Refine Expression Language String Comparison and If Condition
  - General Refine Expression Language Calculate with Refine
- Module 5 - Data Enrichment
  - Data Enrichment
  - Data Enrichment Joining Two OpenRefine Projects
  - Data Enrichment Working with API Introduction
  - Data Enrichment Calling an API with Refine
  - Data Enrichment Parsing API Results
- This course is free.
- It is self-paced.
- It can be taken at any time.
- It can be audited as many times as you wish.
Recommended skills prior to taking this course
- Familiarity with spreadsheet software and different data types (text, date, number)
Martin Magdinier is the founder of RefinePro, a company that provides OpenRefine software and services. He has a Master's degree (MA) in IT Management, Knowledge Management, and Competitive Intelligence. In 2010, he attended the Workplace Communication in Canada program at Ryerson University in Toronto, Ontario. He started his first project, OpenRefine, in July 2011. From July 2012 to August 2013, he worked on another project for the Toronto Transit Commission, the TTC Pass. In 2013, he published an introductory book on using OpenRefine. Coming from a business background, he focuses on data management and transformation tools that empower the business user.