Statistical Analysis and Modeling in R: Performing Classification

R Programming 4.0+    |    Expert
  • 13 videos | 1h 36m 32s
  • Includes Assessment
  • Earns a Badge
Classification models categorize data points into two or more categories. Learn how these models work and how to evaluate them using the confusion matrix and metrics such as accuracy, precision, and recall. During this course, you'll perform classification using logistic regression, including on an imbalanced dataset. You'll also examine why precision or recall may be better metrics than accuracy for evaluating such models. Furthermore, you'll build a classification model using decision trees, visualize the tree structure, and explore the variable importance the tree assigns in order to understand and interpret the model. When you've finished this course, you'll be able to confidently use logistic regression and decision trees to build classification models and evaluate them using accuracy, precision, and recall.
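As a rough illustration of the evaluation metrics named above, the sketch below builds a confusion matrix in base R and derives accuracy, precision, and recall from it. The label vectors are made up for the example and are not taken from the course.

```r
# Hypothetical labels: 1 = positive class, 0 = negative class
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# Confusion matrix: rows are predictions, columns are actual labels
confusion <- table(Predicted = predicted, Actual = actual)
print(confusion)

tp <- sum(predicted == 1 & actual == 1)  # true positives
fp <- sum(predicted == 1 & actual == 0)  # false positives
fn <- sum(predicted == 0 & actual == 1)  # false negatives
tn <- sum(predicted == 0 & actual == 0)  # true negatives

accuracy  <- (tp + tn) / length(actual)  # share of all predictions that are correct
precision <- tp / (tp + fp)              # share of predicted positives that are real
recall    <- tp / (tp + fn)              # share of real positives that were found
```

Precision and recall look only at the positive class, which is why they can be more informative than accuracy when one class is rare.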


  • discover the key concepts covered in this course
  • recall the key metrics to evaluate classifiers
  • fit and interpret the S-curve of logistic regression
  • train and evaluate a logistic regression model
  • train and evaluate a logistic model using all predictors
  • train a model on an imbalanced dataset
  • interpret the significance of coefficients, confidence intervals, and odds ratios
  • evaluate a model built using an imbalanced dataset
  • use resampling techniques to improve the model
  • recall the basic structure of decision tree models
  • explore and pre-process data before model fitting
  • use decision tree models for prediction
  • summarize the key concepts covered in this course


  • 2m 9s
    In this video, you’ll learn more about your instructor and this course. In this course, you’ll learn how classification models work and how to evaluate your classification model using the confusion matrix and metrics such as accuracy, precision, and recall. Then, you’ll perform classification using logistic regression and compute probabilities for outcomes. You’ll also perform classification using an imbalanced dataset. Finally, you’ll build a classification model using decision trees.
  • 8m 21s
    In this video, you’ll learn more about classification models, which categorize data points into discrete output categories. The output of a classification model is a categorical variable: it cannot take any value in a range, only one of a fixed set of allowed values.
  • 3.  Interpreting Logistic Regression Using R
    8m 12s
    In this video, you’ll watch a demo of how to train and use a logistic regression model for classification. Logistic regression fits an S-curve to your data, and this curve can be used to predict binary outcomes. Logistic regression can also be extended to perform multiclass classification.
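A minimal sketch of the S-curve idea, assuming the built-in mtcars data as a stand-in for the course dataset (which is not named here). It fits a one-predictor logistic regression and shows that the predicted probabilities follow the logistic function of the linear predictor.

```r
# Built-in mtcars data: am is 0 (automatic) or 1 (manual), wt is car weight
model <- glm(am ~ wt, data = mtcars, family = "binomial")

# Predicted probabilities over a grid of weights trace out the S-curve
grid <- data.frame(wt = seq(1, 6, by = 0.5))
grid$prob <- predict(model, newdata = grid, type = "response")
print(grid)

# The same curve by hand: logistic function applied to intercept + slope * wt
b <- coef(model)
by_hand <- 1 / (1 + exp(-(b[1] + b[2] * grid$wt)))
```

Every predicted value lies strictly between 0 and 1, which is what lets the curve be read as a probability of the binary outcome.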
  • 4.  Training and Evaluating a Logistic Regression Model
    10m 12s
    In this video, you’ll watch a demo of how to train a logistic regression model to perform classification. First, you’ll split your data into training data and test data. You’ll use the training data to train your model and the test data to evaluate it. To ensure your split is replicable, you’ll set your random seed to 3. Next, you’ll invoke the sample.split function, which will split your data.
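The split-train-evaluate flow described above might look roughly like the sketch below. It assumes sample.split comes from the caTools package, uses simulated stand-in data, and picks a 0.7 split ratio and a 0.5 cutoff purely for illustration; only the seed value 3 comes from the description.

```r
# Hypothetical stand-in data: one predictor x and a binary outcome y
set.seed(100)
n <- 200
records <- data.frame(x = rnorm(n))
records$y <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * records$x))

set.seed(3)  # seed value from the video; makes the split replicable

if (requireNamespace("caTools", quietly = TRUE)) {
  # sample.split returns a logical vector (TRUE = training row) and keeps
  # the ratio of outcome classes similar in both parts
  in_train <- caTools::sample.split(records$y, SplitRatio = 0.7)
} else {
  # Fallback if caTools is not installed: plain random split, not stratified
  in_train <- seq_len(n) %in% sample(n, size = round(0.7 * n))
}

train <- records[in_train, ]
test  <- records[!in_train, ]

# Train on the training rows only
model <- glm(y ~ x, data = train, family = "binomial")

# Evaluate on the held-out test rows
test_prob  <- predict(model, newdata = test, type = "response")
test_label <- ifelse(test_prob > 0.5, 1, 0)
accuracy   <- mean(test_label == test$y)
print(accuracy)
```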
  • 5.  Building a Logistic Model in R Using all Predictors
    6m 23s
    In this video, you’ll watch a demo of how to build another logistic regression model using all the predictors you have available. You’ll invoke the glm function, which fits generalized linear models and can be used to build different families of models. Onscreen, this is a logistic regression model because you’ve specified family = "binomial". This will fit the logistic function, or S-curve, to your data and output probability scores.
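A short sketch of fitting a logistic model on all available predictors, assuming a simulated data frame with hypothetical columns x1, x2, x3, and y. The formula y ~ . tells glm to use every other column as a predictor.

```r
# Hypothetical data: three numeric predictors and a binary outcome y
set.seed(1)
n <- 300
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
df$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * df$x1 - 0.8 * df$x2))

# "y ~ ." means: model y using all remaining columns as predictors
model <- glm(y ~ ., data = df, family = "binomial")
print(coef(model))

# type = "response" returns probability scores rather than log-odds
prob <- predict(model, type = "response")
head(prob)
```

Changing the family argument (for example to "gaussian" or "poisson") makes the same glm call fit a different kind of model.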
  • 6.  Using R to Train a Model with Imbalanced Data
    8m 31s
    In this video, you’ll watch a demo and learn more about imbalanced data. Imbalanced data is one kind of skewed data. Skewness is a measure of the asymmetry of the probability distribution of a random variable. If a probability distribution is symmetric about its center, it’s a symmetric distribution. If the distribution tilts to the left or the right, it’s a skewed distribution. You’ll see two examples of skewed distributions onscreen.
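To make the idea of skewness concrete, the sketch below computes a simple moment-based skewness for a symmetric and a right-skewed sample, then shows what class imbalance looks like as class proportions. The helper function, the sample sizes, and the 95/5 label split are all assumptions chosen for illustration.

```r
set.seed(42)

# Moment-based sample skewness: third central moment over variance^(3/2)
skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

symmetric    <- rnorm(10000)  # bell-shaped, symmetric about its center
right_skewed <- rexp(10000)   # long tail to the right

print(skewness(symmetric))     # expected to be near zero
print(skewness(right_skewed))  # expected to be clearly positive

# Class imbalance: one category dominates the outcome variable
labels <- factor(c(rep("no", 950), rep("yes", 50)))
proportions <- prop.table(table(labels))
print(proportions)
```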
  • 7.  Building and Evaluating Models with R
    7m 9s
    In this video, you’ll watch a demo. You’ll split your dataset into training and test data to build and evaluate your model. So that you can replicate your split, you’ll set the seed to 3. Once that’s done, you’ll invoke the sample.split function to split the data. The SplitRatio onscreen is 80%, so you’ll use 80% of your data to train your model and 20% to evaluate it.
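Once a model is trained on the 80% portion, its coefficients can be read as odds ratios with confidence intervals, as the course objectives mention. The sketch below assumes simulated data and uses a plain base-R random split as a stand-in for sample.split; the Wald intervals from confint.default are one of several possible interval choices.

```r
set.seed(3)  # seed value from the video

# Hypothetical data: x1 affects the outcome, x2 does not
n <- 500
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- rbinom(n, size = 1, prob = plogis(-1 + 0.9 * df$x1))

# 80/20 split with base R (stand-in for sample.split with SplitRatio = 0.8)
train_idx <- sample(n, size = round(0.8 * n))
train <- df[train_idx, ]
test  <- df[-train_idx, ]

model <- glm(y ~ x1 + x2, data = train, family = "binomial")

# Estimates, standard errors, z values, and p-values for each coefficient
coef_table <- summary(model)$coefficients
print(coef_table)

# Exponentiated coefficients are odds ratios; values above 1 raise the odds
odds_ratios <- exp(coef(model))

# Wald 95% confidence intervals, moved to the odds-ratio scale
ci_odds <- exp(confint.default(model))
print(cbind(odds_ratio = odds_ratios, ci_odds))
```

An interval that includes 1 on the odds-ratio scale suggests the predictor's effect is not clearly distinguishable from no effect.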
  • 8.  Using R to Evaluate Imbalanced Data Model Types
    6m 27s
    In this video, you’ll watch a demo and learn about prediction. You’ll invoke the predict function, passing in logistic.model and the data to predict on, and set type = "response". This gives you the prediction.probabilities that you’ll use to compute the prediction labels. First, you’ll look at the prediction.probabilities onscreen.
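The prediction step above can be sketched as follows. The names logistic.model and prediction.probabilities follow the description; the simulated imbalanced data, the prediction.labels name, and the 0.5 cutoff are assumptions for illustration.

```r
# Hypothetical imbalanced data: positives are rare
set.seed(7)
n <- 2000
df <- data.frame(x = rnorm(n))
df$y <- rbinom(n, size = 1, prob = plogis(-3.5 + 1.0 * df$x))

logistic.model <- glm(y ~ x, data = df, family = "binomial")

# type = "response" returns probabilities between 0 and 1
prediction.probabilities <- predict(logistic.model, newdata = df, type = "response")

# Turn probabilities into labels with an assumed cutoff of 0.5
prediction.labels <- ifelse(prediction.probabilities > 0.5, 1, 0)

# Fix both factor levels so the table is 2 x 2 even if a label never occurs
confusion <- table(
  Predicted = factor(prediction.labels, levels = c(0, 1)),
  Actual    = factor(df$y, levels = c(0, 1))
)
print(confusion)

accuracy <- sum(diag(confusion)) / sum(confusion)
recall   <- confusion["1", "1"] / sum(confusion[, "1"])
print(c(accuracy = accuracy, recall = recall))
```

On data like this, accuracy can look high simply because the model predicts the majority class almost everywhere, while recall shows how few of the rare positives it actually finds.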
  • 9.  Using Resampling Techniques on Imbalanced Data in R
    10m 35s
    In this video, you’ll watch a demo of resampling techniques for imbalanced data in R. You’ll see your current model is not able to identify what you need. You’ll fix this by resampling your original data with replacement, which artificially increases the number of records and gives you a larger sample to work with.
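One common way to resample with replacement is to oversample the rare class until the classes are balanced. The sketch below assumes that approach, which may differ from the exact steps in the demo, and uses a made-up data frame with a 950/50 class split.

```r
set.seed(11)

# Hypothetical imbalanced data: 950 negatives, 50 positives
df <- data.frame(x = rnorm(1000), y = c(rep(0, 950), rep(1, 50)))

majority <- df[df$y == 0, ]
minority <- df[df$y == 1, ]

# Draw minority rows with replacement until they match the majority count
extra_idx <- sample(nrow(minority), size = nrow(majority), replace = TRUE)
oversampled_minority <- minority[extra_idx, ]

balanced <- rbind(majority, oversampled_minority)
print(table(balanced$y))
```

Because the duplicated rows carry no new information, it is usually safer to resample only the training portion so that copies of the same record do not appear in both training and test data.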
  • 10.  Recognizing Decision Tree Models
    8m 1s
    In this video, you’ll learn to recognize decision tree models. Different machine learning algorithms use different techniques to predict outcomes. The logistic regression algorithm you already know fits an S-curve to your data and uses it to compute the probabilities of the outcome variable. Now, you’ll see how to perform classification using decision trees.
  • 11.  Using R to Explore and Process Data
    6m 56s
    In this video, you’ll learn how to use R to explore and process data. First, you’ll invoke ggplot to draw a bar plot of the different styles of wine; onscreen, you’ll see red wine and white wine. The classification model you’ll build uses the attributes of a wine to predict its quality. Onscreen, there are seven categories for quality, ranging from 3 through 9.
  • 12.  Visualizing Decision Trees and Performing Prediction
    11m 11s
    In this video, you’ll train your decision tree model. Training the model involves invoking the rpart function. rpart refers to recursive partitioning and regression trees, which is a certain type of decision tree. This fits a tree model to your data. You’ll use this model to predict the quality, with all remaining variables as predictors, as shown by the formula onscreen. The data you’ll use to train this model is our
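A minimal sketch of training and using a decision tree with rpart. The wine data is not available here, so the built-in iris data stands in, with Species playing the role of wine quality; the tree.model name mirrors the description, and everything else is an assumption.

```r
library(rpart)  # recursive partitioning; a recommended package bundled with most R installs

# "Species ~ ." predicts Species from all remaining columns
tree.model <- rpart(Species ~ ., data = iris, method = "class")

# Text view of the splits; plot(tree.model) followed by text(tree.model) draws the tree
print(tree.model)

# type = "class" returns the predicted category for each row
predicted <- predict(tree.model, newdata = iris, type = "class")
accuracy  <- mean(predicted == iris$Species)
print(accuracy)

# Importance scores the fitted tree assigns to each predictor
print(tree.model$variable.importance)
```

The accuracy here is measured on the same rows the tree was trained on, so it is optimistic; a held-out test set gives a fairer estimate.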
  • 13.  Course Summary
    2m 25s
    In this video, you’ll summarize what you’ve learned in this course. You’ve learned that classification models are used to predict the output categories of data points, classifying records into classes or categories. You also saw how classification models can be evaluated using metrics such as accuracy, precision, and recall, and you performed classification on a real-world dataset using a logistic regression model. You learned that linear regression fits a straight line to your data.


Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.

