SKILL BENCHMARK

AWS Certified Machine Learning Specialty: Exploratory Data Analysis Competency (Intermediate Level)

34m
34 questions

The AWS Certified Machine Learning Specialty: Exploratory Data Analysis Competency benchmark measures your ability to explore, sanitize, and prepare data for modeling; perform feature engineering; and analyze and visualize data for machine learning. A learner who scores high on this benchmark demonstrates the skills needed to identify and handle dirty data, transform data, recognize labeled data and identify mitigation strategies, identify and extract features from data sets, graph data, interpret descriptive statistics, and perform clustering.

Topics covered

define Bernoulli, uniform, and binomial data distributions
define binning and discretization as the process of transforming numerical variables into categorical counterparts
define data scaling and normalization and describe why it is important to standardize independent variables
define normal, Poisson, and exponential data distributions
define the main Amazon QuickSight processes and terms
describe the bag-of-words model and compare it to TF-IDF
describe how Amazon SageMaker Ground Truth works and name its major benefits
describe how data outliers impact data analysis and name common ways to deal with outliers
describe how dimensions and features are linked to each other, specifying their impacts on building accurate ML models
describe how missing data impacts ML models and name ways to deal with missing data
describe how tables, databases, and data catalogs work in Amazon Athena and how to query data from other AWS services in Athena
describe how the Apache Spark open-source framework works with Amazon Elastic MapReduce (EMR) and its real-world use cases
describe how to perform one-hot encoding and its main purpose
describe how to use Amazon SageMaker Feature Store to fully manage repositories for ML features
describe the concept of n-grams and why they are used for machine learning
describe the process of term frequency-inverse document frequency (TF-IDF) and its uses in text mining
describe the use cases for Amazon Elastic MapReduce (EMR), recognize when to deploy it, and compare EMR and Glue
differentiate between categorical and numerical data types
name and describe modern graphic types used in data analysis
name and describe traditional graphic types used in data analysis
outline data shuffling and define its role in removing biases and building more robust training models
outline how data transformation can be used to make data more useful for data analysis
outline how the Apache Hadoop open-source framework works with Amazon Elastic MapReduce (EMR) and its real-world use cases
recognize the basic principles behind text feature engineering
recognize what's meant by advanced time series analysis concepts, such as trends, seasonality, and autocorrelation
specify how skewed data can affect ML classification and ways to address it
use Spark and EMR workflows to prepare data for a TF-IDF problem
work with Amazon Athena to create databases and tables and run queries
work with Amazon QuickSight to create a simple multi-visual analysis and a dashboard
work with Amazon SageMaker Feature Store to achieve feature consistency and standardization
work with commonly used feature engineering techniques on real data
work with data, analyses, visuals, ML insights, and dashboards in Amazon QuickSight
work with Python toolkits to implement various types of data visualization
work with time series data in Python, implementing data analysis pipelines
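
As an illustration of one topic in the list above (comparing the bag-of-words model to TF-IDF), here is a minimal sketch in plain Python. It is not taken from the benchmark or its courses: the toy corpus, the `tf_idf` helper, and the unsmoothed IDF formula (log of document count over document frequency) are all assumptions chosen for brevity. Libraries such as scikit-learn use smoothed variants that give different numbers.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented purely for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# Whitespace tokenization is an assumption; real pipelines usually
# normalize case, strip punctuation, and remove stop words first.
tokenized = [d.split() for d in docs]

# Bag-of-words: raw term counts per document.
bow = [Counter(tokens) for tokens in tokenized]

# Document frequency: number of documents that contain each term.
df = Counter(term for tokens in tokenized for term in set(tokens))
n_docs = len(docs)


def tf_idf(term, doc_index):
    """Unsmoothed TF-IDF: (count / doc length) * log(N / document frequency)."""
    if df[term] == 0:
        return 0.0  # term never appears in the corpus
    tf = bow[doc_index][term] / len(tokenized[doc_index])
    idf = math.log(n_docs / df[term])
    return tf * idf


# Raw counts favor the frequent word "the"; TF-IDF down-weights it
# because it appears in most documents.
print(bow[0]["the"], bow[0]["cat"])
print(tf_idf("the", 0), tf_idf("cat", 0))
```

In the first document, "the" has the higher raw count, but "cat" receives the higher TF-IDF weight because it appears in only one document of the corpus.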

RECENTLY ADDED COURSES

AWS Certified Machine Learning: Feature Engineering Techniques
AWS Certified Machine Learning: Data Analysis Fundamentals
AWS Certified Machine Learning: Feature Engineering Overview
