Course details

Building ML Training Sets: Introduction

Overview/Description

There are numerous options available to scale and encode the features and labels in your dataset to get the best out of machine learning algorithms. Explore techniques such as standardizing, normalizing, one-hot encoding, and more.
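
As a rough illustration of two of the scaling techniques named above, the following pure-Python sketch shows the arithmetic behind standardizing and min-max scaling. It is not the scikit-learn API itself; the function names and the `heights_cm` values are hypothetical and chosen only for this example.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance.

    Uses the population standard deviation. Assumes the values are not
    all identical (otherwise the standard deviation would be zero).
    """
    mean = fmean(values)
    std = pstdev(values)
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1.

    Assumes the values are not all identical.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample column, used only for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]

standardized = standardize(heights_cm)
scaled = min_max_scale(heights_cm)

print(standardized)
print(scaled)
```

Standardizing centers a column around zero, while min-max scaling squeezes it into a fixed range, which is why the latter is handy for putting two similar columns on the same scale.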



Expected Duration (hours)
1.1

Lesson Objectives

  • Course Overview
  • use the Pandas library to load a dataset in the form of a CSV file and perform some exploratory analysis on its features
  • transform the continuous data in a series to binary values by using scikit-learn's Binarizer
  • apply the MinMaxScaler to a dataset so that two similar columns share the same range of values
  • standardize multiple columns in your dataset using scikit-learn's StandardScaler
  • distinguish between the Normalizer and other scaling techniques and apply this scaler to the continuous features of a dataset
  • represent the values in a column as a proportion of the maximum absolute value by using the MaxAbsScaler
  • apply label encoding to the features and target in your dataset and recognize its limitations when applied to input features
  • use the Pandas library to one-hot encode one or more features of your dataset and distinguish between this technique and label encoding
  • transform a continuous series into a categorical (binary) one, distinguish between normalization and other scaling techniques, score each product as a proportion of the top product’s sales, and encode the “VehicleType” field, which contains the values [“Hatchback”, “Sedan”, “SUV”]
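
The tasks in the final objective can be sketched in plain Python to show what each transformation computes. This is a minimal sketch of the underlying arithmetic, not the scikit-learn or pandas API; all function names and sample values (`ages`, `product_sales`, the threshold of 18) are assumptions made for illustration. Only the “VehicleType” categories come from the objective itself.

```python
import math


def binarize(values, threshold):
    """Map values strictly greater than the threshold to 1, all others to 0."""
    return [1 if v > threshold else 0 for v in values]


def l2_normalize_row(row):
    """Scale one row (sample) to unit Euclidean length.

    Unlike the column-wise scalers, this works across the features
    of a single sample. Assumes the row is not all zeros.
    """
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]


def max_abs_scale(values):
    """Express each value as a proportion of the column's maximum absolute value."""
    top = max(abs(v) for v in values)
    return [v / top for v in values]


def label_encode(values):
    """Replace each category with the index of its position among the sorted classes."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


def one_hot_encode(values):
    """Give each category its own 0/1 indicator, one dict per input value."""
    classes = sorted(set(values))
    return [{c: int(v == c) for c in classes} for v in values]


# Hypothetical data, for illustration only.
ages = [12.0, 35.0, 18.0, 64.0]
over_threshold = binarize(ages, threshold=18.0)

unit_row = l2_normalize_row([3.0, 4.0])

product_sales = [250.0, 1000.0, 500.0]
sales_share = max_abs_scale(product_sales)

vehicle_types = ["Sedan", "SUV", "Hatchback", "Sedan"]
codes, classes = label_encode(vehicle_types)
one_hot = one_hot_encode(vehicle_types)

print(over_threshold, unit_row, sales_share, codes, classes, one_hot)
```

The integer codes from label encoding imply an ordering among the vehicle types that does not really exist, which is the limitation the objectives mention for input features; the one-hot form avoids this by giving each category its own indicator.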

Course Number
it_mlbmltdj_01_enus

Expertise Level
Beginner