Statistical Analysis and Modeling in R: Performing Classification

R Programming | Expert

13 videos | 1h 36m 32s
Includes Assessment
Earns a Badge

From Channel:

R Programming

From Journey:

Data Analysis with R

Classification models categorize data points into two or more categories. Learn how these models work and how to evaluate them using the confusion matrix and metrics such as accuracy, precision, and recall. During this course, you'll perform classification using logistic regression, including on an imbalanced dataset, and examine why precision or recall may be better metrics than accuracy for evaluating such models. You'll also build a classification model using decision trees, visualize the tree structure, and explore the variable importance the tree assigns in order to understand and interpret the model. When you've finished this course, you'll be able to confidently use logistic regression and decision trees to build classification models and evaluate them using accuracy, precision, and recall.

WHAT YOU WILL LEARN

Discover the key concepts covered in this course

Recall the key metrics to evaluate classifiers

Fit and interpret the s-curve of logistic regression

Train and evaluate a logistic regression model

Train and evaluate a logistic model using all predictors

Train a model on an imbalanced dataset

Interpret the significance of coefficients, confidence intervals, and odds ratios

Evaluate a model built using an imbalanced dataset

Use resampling techniques to improve the model

Recall the basic structure of decision tree models

Explore and pre-process data before model fitting

Use decision tree models for prediction

Summarize the key concepts covered in this course

IN THIS COURSE

2m 9s

In this video, you’ll learn more about your instructor and this course. You’ll learn how classification models work and how to evaluate a classification model using the confusion matrix and metrics such as accuracy, precision, and recall. Then, you’ll perform classification using logistic regression and compute probabilities for outcomes. You’ll also perform classification using an imbalanced dataset. Finally, you’ll build a classification model using decision trees.
8m 21s

In this video, you’ll learn more about classification models. Classification models are used to categorize data points into discrete output categories. The output of a classification model is a categorical variable: it cannot take any value in a range, only one of a fixed set of allowed values.
3. Interpreting Logistic Regression Using R

8m 12s

In this video, you’ll watch a demo and learn how to train and use a logistic regression model for classification. Logistic regression is a classification model that fits an S-curve to your data. This S-curve can be used to predict binary outcomes, and logistic regression can be extended to perform multiclass classification.
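The S-curve described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the data is simulated, and the names `logistic`, `x`, `y`, `fit`, and `grid` are hypothetical rather than taken from the course demo.

```r
# Logistic (sigmoid) function: maps any real number to a probability in (0, 1).
logistic <- function(z) 1 / (1 + exp(-z))

# Simulated (hypothetical) data: one numeric predictor and a binary outcome.
set.seed(3)
x <- runif(200, min = 0, max = 10)
y <- rbinom(200, size = 1, prob = logistic(-4 + 0.9 * x))

# Fit a logistic regression with a single predictor.
fit <- glm(y ~ x, family = "binomial")

# Predicted probabilities over a grid of x values trace out the S-curve.
grid <- data.frame(x = seq(0, 10, by = 0.5))
grid$p <- predict(fit, newdata = grid, type = "response")
print(head(grid))
```

Plotting `grid$p` against `grid$x` would show the fitted curve flattening toward 0 on one side and toward 1 on the other.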
4. Training and Evaluating a Logistic Regression Model

10m 12s

In this video, you’ll watch a demo. You’ll learn to train your logistic regression model to perform classification. First, you’ll split your data into training data and test data. You’ll use the training data to train your model and the test data to evaluate it. To ensure your split is replicable, you’ll set your random seed to 3. Next, you’ll invoke the sample.split function, which will split your data.
5. Building a Logistic Model in R Using all Predictors

6m 23s

In this video, you’ll watch a demo. You’ll learn how to build another logistic regression model using all the predictors you have available. You'll invoke the glm function, which fits generalized linear models and can be used to build different families of models. Onscreen, this is a logistic regression model because you’ve specified family = "binomial". This will fit the logistic function, or S-curve, to your data and output probability scores.
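A minimal sketch of a `glm` call that uses every available predictor might look like the following. The data frame `df` and its columns are simulated stand-ins, not the dataset used in the course, and the Wald intervals from `confint.default` are one of several ways to obtain confidence intervals.

```r
# Simulated (hypothetical) data: three numeric predictors and a binary outcome.
set.seed(3)
n <- 300
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
df$outcome <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * df$x1 - 0.8 * df$x2))

# "outcome ~ ." means: model outcome using all remaining columns as predictors.
model <- glm(outcome ~ ., data = df, family = "binomial")

# Coefficient estimates, standard errors, z values, and p-values.
print(summary(model)$coefficients)

# Exponentiated coefficients are odds ratios; Wald confidence intervals
# are shown on the same (odds ratio) scale.
print(exp(coef(model)))
print(exp(confint.default(model)))

# type = "response" returns probability scores rather than log-odds.
probs <- predict(model, newdata = df, type = "response")
print(head(probs))
```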
6. Using R to Train a Model with Imbalanced Data

8m 31s

In this video, you’ll watch a demo. You’ll learn more about imbalanced data. Imbalanced data is one kind of skewed data. Skewness is a measure of the asymmetry of the probability distribution of a random variable. If a probability distribution is symmetric about its center, it is a symmetric distribution. If the distribution tilts to the left or the right, it is a skewed distribution. You’ll see two examples of skewed distributions onscreen.
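One quick way to check whether a label column is imbalanced is to look at its class proportions. The sketch below uses a made-up label vector (`labels` is hypothetical) with 95 records in one class and 5 in the other.

```r
# Hypothetical labels: 95 "no" records and 5 "yes" records.
labels <- rep(c("no", "yes"), times = c(95, 5))

# Counts and proportions per class.
counts <- table(labels)
proportions <- prop.table(counts)
print(counts)
print(proportions)

# A model that always predicts the majority class would be right
# for every "no" record, so its accuracy equals the majority share.
baseline.accuracy <- max(proportions)
print(baseline.accuracy)
```

The high baseline accuracy here is the reason accuracy alone can be misleading on imbalanced data.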
7. Building and Evaluating Models with R

7m 9s

In this video, you’ll watch a demo. You’ll split your dataset into training and test data to build and evaluate your model. So that you can replicate your splits, you’ll set the seed to 3. Once that's done, you’ll invoke the sample.split function to split the data. The SplitRatio onscreen is 80%: you'll use 80% of your data to train your model and 20% to evaluate it.
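The split described above can be sketched as follows. `sample.split` comes from the caTools package, so this example assumes that package may or may not be installed and falls back to a plain base R random split (which, unlike `sample.split`, does not preserve class ratios). The data frame `df` is simulated and hypothetical.

```r
# Simulated (hypothetical) data with a binary label column.
set.seed(3)  # fixed seed so the split can be replicated
df <- data.frame(x = rnorm(100), label = rep(c(0, 1), times = c(80, 20)))

if (requireNamespace("caTools", quietly = TRUE)) {
  # Logical vector: TRUE for rows assigned to the training set.
  in.train <- caTools::sample.split(df$label, SplitRatio = 0.8)
} else {
  # Assumption: base R stand-in when caTools is unavailable (not stratified).
  train.rows <- sample(nrow(df), size = round(0.8 * nrow(df)))
  in.train <- seq_len(nrow(df)) %in% train.rows
}

training.data <- df[in.train, ]
test.data <- df[!in.train, ]
print(c(train = nrow(training.data), test = nrow(test.data)))
```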
8. Using R to Evaluate Imbalanced Data Model Types

6m 27s

In this video, you’ll watch a demo. You’ll learn about prediction. You’ll invoke the predict function, pass in logistic.model, specify test.data, and set type = "response". This gives you the prediction.probabilities, which you’ll use to compute the prediction labels. First, you’ll look at the prediction.probabilities onscreen.
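To make the step from probabilities to labels and metrics concrete, here is a small sketch with hand-written values. The vectors `actual` and `prediction.probabilities` are invented for illustration (in the demo the probabilities would come from `predict`), and the 0.5 threshold is an assumption.

```r
# Hypothetical imbalanced test set: 2 positives among 10 records.
actual <- c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1)
prediction.probabilities <- c(0.05, 0.10, 0.20, 0.15, 0.30,
                              0.08, 0.60, 0.12, 0.70, 0.35)

# Convert probabilities to labels using an assumed threshold of 0.5.
prediction.labels <- ifelse(prediction.probabilities > 0.5, 1, 0)

# Confusion matrix: rows are predicted labels, columns are actual labels.
confusion <- table(
  Predicted = factor(prediction.labels, levels = c(0, 1)),
  Actual = factor(actual, levels = c(0, 1))
)
print(confusion)

tp <- confusion["1", "1"]  # true positives
fp <- confusion["1", "0"]  # false positives
fn <- confusion["0", "1"]  # false negatives
tn <- confusion["0", "0"]  # true negatives

accuracy <- (tp + tn) / sum(confusion)  # 0.8
precision <- tp / (tp + fp)             # 0.5
recall <- tp / (tp + fn)                # 0.5
print(c(accuracy = accuracy, precision = precision, recall = recall))
```

Accuracy looks respectable at 0.8, yet the model finds only half of the positive records, which is what precision and recall expose.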
9. Using Resampling Techniques on Imbalanced Data in R

10m 35s

In this video, you’ll watch a demo. You’ll learn resampling techniques for imbalanced data in R. You’ll see your current model is not able to identify what you need. You’ll fix this by resampling your original data with replacement, which artificially increases the number of records you have to work with and gives you a larger sample.
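Resampling with replacement can be sketched in base R as below. This is one possible approach (oversampling the minority class until it matches the majority), not necessarily the exact technique used in the demo, and the data frame `df` is simulated. In practice, resampling is usually applied to the training data only, so the test data still reflects the original class balance.

```r
# Simulated (hypothetical) imbalanced data: 90 negatives, 10 positives.
set.seed(3)
df <- data.frame(x = rnorm(100), label = rep(c(0, 1), times = c(90, 10)))

majority <- df[df$label == 0, ]
minority <- df[df$label == 1, ]

# Draw minority rows with replacement until they match the majority count.
resampled.rows <- sample(nrow(minority), size = nrow(majority), replace = TRUE)
minority.oversampled <- minority[resampled.rows, ]

# Combine into a balanced dataset.
balanced <- rbind(majority, minority.oversampled)
print(table(balanced$label))
```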
10. Recognizing Decision Tree Models

8m 1s

In this video, you’ll learn to recognize decision tree models. The logistic regression algorithm you already know fits an S-curve to your data and uses it to compute the probabilities of the outcome variable. Different machine learning algorithms use different techniques to predict outcomes. Now, you’ll see how to perform classification using decision trees.
11. Using R to Explore and Process Data

6m 56s

In this video, you’ll learn how to use R to explore and process data. First, you’ll invoke ggplot to draw a bar plot of the different styles of wine. Onscreen, you’ll see you have red wine and white wine. You’ll build a classification model that uses the attributes of the wine to predict its quality. Onscreen, there are seven categories for quality, ranging from 3 through 9.
12. Visualizing Decision Trees and Performing Prediction

11m 11s

In this video, you’ll train your decision tree model. Training it involves invoking the rpart function. rpart stands for recursive partitioning and regression trees, which is a certain type of decision tree. This fits a tree.model on your data. The formula onscreen shows that the model predicts quality and uses all remaining variables as predictors. The data you’ll use to train this model is your training.data.
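A rough sketch of training and using an `rpart` classification tree follows. The course uses a wine quality dataset. Because that data is not reproduced here, this example substitutes R's built-in `iris` dataset as a stand-in, so the formula and column names differ from the demo. It assumes the rpart package is installed, which is the case for standard R distributions.

```r
library(rpart)

# Stand-in data: built-in iris dataset (the course uses wine data instead).
set.seed(3)
train.rows <- sample(nrow(iris), size = 120)
training.data <- iris[train.rows, ]
test.data <- iris[-train.rows, ]

# "Species ~ ." predicts Species using all remaining variables as predictors.
# method = "class" requests a classification tree.
tree.model <- rpart(Species ~ ., data = training.data, method = "class")

# Text view of the tree structure (splits and leaf nodes).
print(tree.model)

# Variable importance assigned by the fitted tree.
print(tree.model$variable.importance)

# Predict class labels for the held-out data and compute accuracy.
predicted <- predict(tree.model, newdata = test.data, type = "class")
accuracy <- mean(predicted == test.data$Species)
print(accuracy)
```

For a graphical view, `plot(tree.model)` followed by `text(tree.model)` draws the tree with base graphics.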
13. Course Summary

2m 25s

In this video, you’ll summarize what you’ve learned in this course. You’ve learned that classification models are used to predict the output categories of data points, classifying records into classes or categories. You also saw how classification models can be evaluated using metrics such as accuracy, precision, and recall, and you performed classification on a real-world dataset using a logistic regression model. You learned that linear regression fits a straight line to your data.

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.


