Statistical Analysis and Modeling in R: Building Regularized Models & Ensemble Models

R Programming    |    Expert
  • 14 videos | 1h 31m 11s
  • Includes Assessment
  • Earns a Badge
Rating 4.4 of 17 users (17)
Understanding the bias-variance trade-off allows data scientists to build generalizable models that perform well on test data. Machine learning models are considered a good fit if they can extract general patterns or dominant trends in the training data and use these to make predictions on unseen instances. Use this course to discover what it means for your model to be a good fit for the training data. Identify underfit and overfit models and what the bias-variance trade-off represents in machine learning. Mitigate overfitting on training data using regularized regression models, train and evaluate models built using ridge regression, lasso regression, and ElasticNet regression, and implement ensemble learning using the random forest model. When you're done with this course, you'll have the skills and knowledge to train models that learn general patterns using regularized models and ensemble learning.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Recall characteristics of overfitted and underfitted models
  • Describe the bias-variance trade-off
  • Examine and interpret the data for regression
  • Perform ordinary least squares (OLS) regression
  • Prepare data to build regularized regression models
  • Perform and evaluate ridge regression
  • Perform and evaluate lasso regression
  • Perform and evaluate ElasticNet regression
  • Outline the main characteristics of ensemble learning
  • Examine and visualize data for regression
  • Perform regression using decision trees
  • Perform regression using random forest
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 2m 7s
    In this video, you’ll learn more about the course and your instructor. In this course, you’ll learn what it means for your model to be a good fit for the training data. You’ll discover the characteristics of underfit and overfit models and what the bias-variance trade-off represents in machine learning. You’ll learn to mitigate overfitting on the training data using regularized regression models. You’ll use ridge regression, lasso regression, and ElasticNet regression. FREE ACCESS
  • 9m 51s
    In this video, you’ll learn more about machine learning models. These models are said to be good models if they’re a good fit for your underlying data. Machine learning algorithms can be divided into two categories: supervised learning and unsupervised learning algorithms. Regression and classification are examples of supervised learning techniques which are trained using labeled training data. FREE ACCESS
  • Locked
    3.  The Bias-Variance Trade-off
    6m 18s
    In this video, you’ll learn more about the bias-variance trade-off. First, you’ll look at the different kinds of errors that can exist within machine learning models. Errors in machine learning models can be categorized into three broad categories. There are bias errors, variance, and irreducible errors. Bias errors are erroneous assumptions made by machine learning algorithms. FREE ACCESS
  • Locked
    4.  Exploring and Understanding Data for Regression
    9m 21s
    In this video, you’ll watch a demo. Here, you’ll explore regularized regression models. You’ll see regularized regression models tweak the objective function that OLS regression uses. In addition to the original objective function for OLS regression, regularized models add a penalty term to this objective function. This penalty term will penalize complex coefficients in your model. Regularized regression models allow you to mitigate the effects of overfitting on the training data. FREE ACCESS
  • Locked
    5.  Performing Ordinary Least Squares (OLS) Regression
    5m 41s
    In this video, you’ll watch a demo. You’ll split the data you’re working with into training data and test data. First, you’ll invoke set.seed and pass in 1. Then, you’ll use sample.split to split the admission.data. Your target variable, Chance.of.Admit, is specified as an input argument, and the SplitRatio is 80%. This sample.split call will generate a mask that has true values for 80% of the records. FREE ACCESS
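The split-then-fit workflow described above can be sketched as follows. The `admission.data` frame, `Chance.of.Admit` target, `set.seed(1)`, and 80% split ratio come from the demo; the `lm` call is a plausible OLS fit, assuming all remaining columns serve as predictors:

```r
library(caTools)  # provides sample.split

set.seed(1)
# Generate a logical mask: TRUE for roughly 80% of the records
mask <- sample.split(admission.data$Chance.of.Admit, SplitRatio = 0.8)
train_data <- admission.data[mask, ]
test_data  <- admission.data[!mask, ]

# Fit an OLS model predicting Chance.of.Admit from all other columns
ols_model <- lm(Chance.of.Admit ~ ., data = train_data)
summary(ols_model)
```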
  • Locked
    6.  Preparing Data for Regularized Regression Models
    4m 31s
    In this video, you’ll watch a demo covering regularized regression models. First, you’ll need to pre-process your data to fit into your Ridge, Lasso, and ElasticNet models. These are the functions you'll be using for your regularized regression model. These functions require data in the form of a model matrix. A model.matrix creates a design matrix by expanding factors in the data frame to a set of dummy variables, expanding interactions similarly. FREE ACCESS
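A minimal sketch of that pre-processing step, assuming the training data frame is named `train_data` and the target is `Chance.of.Admit` as in the earlier demo:

```r
# model.matrix expands factors into dummy variables and returns
# the numeric design matrix that regularized-regression fitters expect
x_train <- model.matrix(Chance.of.Admit ~ ., data = train_data)[, -1]  # drop the intercept column
y_train <- train_data$Chance.of.Admit
```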
  • Locked
    7.  Performing Ridge Regression in R
    10m 42s
    In this video, you’ll learn more about performing ridge regression in R. You’ll see regularized regression models mitigate overfitting on the training data. This allows the model to learn more general patterns that exist in the data rather than specific patterns which don’t have much predictive ability. FREE ACCESS
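In R, ridge regression is commonly fit with the glmnet package, which takes a model matrix like the one prepared above; whether or not the course uses glmnet, the shape of the fit looks like this. Setting `alpha = 0` selects the pure L2 (ridge) penalty:

```r
library(glmnet)

# Cross-validate to choose the penalty strength lambda
ridge_cv <- cv.glmnet(x_train, y_train, alpha = 0)  # alpha = 0 -> ridge (L2 penalty)

# Refit at the lambda with the lowest cross-validated error
ridge_model <- glmnet(x_train, y_train, alpha = 0, lambda = ridge_cv$lambda.min)
coef(ridge_model)  # coefficients are shrunk toward zero, but not exactly to zero
```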
  • Locked
    8.  Performing Lasso Regression in R
    9m 19s
    In this video, you’ll learn more about performing lasso regression in R. Here, you’ll start with ordinary least squares regression. You’ve learned the objective function of this regression is to minimize the sum of squared errors. Lasso regression is a regularized regression model, so its objective function is similar to that of ordinary least squares regression but with an additional penalty on complex coefficients. FREE ACCESS
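Under the same assumptions as the ridge sketch (glmnet, with `x_train` and `y_train` built from a model matrix), the only change for lasso is the mixing parameter: `alpha = 1` selects the pure L1 penalty:

```r
library(glmnet)

lasso_cv <- cv.glmnet(x_train, y_train, alpha = 1)  # alpha = 1 -> lasso (L1 penalty)
lasso_model <- glmnet(x_train, y_train, alpha = 1, lambda = lasso_cv$lambda.min)
coef(lasso_model)  # the L1 penalty can shrink some coefficients exactly to zero
```

Because lasso can zero out coefficients entirely, it doubles as a feature-selection tool, which is the practical difference from ridge.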
  • Locked
    9.  Performing ElasticNet Regression in R
    6m 43s
    In this video, you’ll learn more about performing ElasticNet regression in R. ElasticNet regression can be thought of as a combination of lasso and ridge regression: the penalty it uses is some combination of L1 regularization and L2 regularization. You still need to specify the tuning parameter lambda, but lambda is then multiplied by a mix of the L1 and L2 penalty terms. ElasticNet regularization incorporates penalties from both. FREE ACCESS
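Continuing the same glmnet-based sketch, an `alpha` strictly between 0 and 1 mixes the two penalties; `alpha = 0.5`, which weights them equally, is an illustrative choice rather than one taken from the course:

```r
library(glmnet)

# alpha between 0 and 1 blends the L1 (lasso) and L2 (ridge) penalties
enet_cv <- cv.glmnet(x_train, y_train, alpha = 0.5)
enet_model <- glmnet(x_train, y_train, alpha = 0.5, lambda = enet_cv$lambda.min)

# x_test would be built with model.matrix in the same way as x_train
predict(enet_model, newx = x_test)
```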
  • Locked
    10.  Recognizing Ensemble Learning
    7m 36s
    In this video, you’ll learn more about ensemble learning. You’ll learn that in statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than can be obtained from any of the individual learning algorithms alone. Ensemble learning techniques try to harness the wisdom of crowds: with multiple models, the combined prediction is better than the prediction of any individual model. FREE ACCESS
  • Locked
    11.  Using R to Explore and Visualize Data
    4m 54s
    In this video, you’ll watch a demo. Onscreen, you’ll see you need to use ggplot to set up a bar plot showing how many countries you have from different regions. This gives you a quick overview of the countries in your dataset. You’ll see the different colors represent the different regions: Western Europe, Sub-Saharan Africa, Asia, and so on. Here, you’ll see how the HappinessScore is distributed across regions. FREE ACCESS
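A sketch of that bar plot, assuming the data frame is named `happiness` (a hypothetical name; the `Region` and `HappinessScore` columns are from the demo):

```r
library(ggplot2)

# Count of countries per region, with one fill color per region
ggplot(happiness, aes(x = Region, fill = Region)) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # keep long region names readable
```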
  • Locked
    12.  Performing Regression Using Decision Trees in R
    5m 45s
    In this video, you’ll watch a demo. You’ll first build a decision tree regression model, and then move on to the random forest model. First, you’ll invoke set.seed and pass in (3). Then, you’ll invoke sample.split, specify your target variable, which is HappinessScore, and your SplitRatio of 0.8. This will give you a mask with true and false values. You’ll see you have 125 records to train in your regression model. FREE ACCESS
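The split and tree fit described above might look like the following; `set.seed(3)`, `HappinessScore`, and the 0.8 split ratio come from the demo, while rpart is one common choice of decision-tree package and the `happiness` data frame name is assumed:

```r
library(caTools)
library(rpart)  # a common package for decision-tree regression in R

set.seed(3)
mask <- sample.split(happiness$HappinessScore, SplitRatio = 0.8)
train_data <- happiness[mask, ]   # TRUE rows -> training set
test_data  <- happiness[!mask, ]  # FALSE rows -> test set

# Regression tree predicting HappinessScore from all other columns
tree_model <- rpart(HappinessScore ~ ., data = train_data)
predict(tree_model, newdata = test_data)
```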
  • Locked
    13.  Performing Regression Using Random Forest in R
    6m 2s
    In this video, you’ll watch a demo. Here, you’ll run a random forest regression. To build a random forest model, you’ll need to install an additional package. You’ll invoke install.packages for the package "randomForest". This can be found on line 115. Now, you’ll include this package in your current program by invoking the library function for randomForest. This contains the function you’ll use to build your random forest model. FREE ACCESS
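Reusing the train/test split from the decision-tree demo, the random forest fit might look like this; the randomForest package is named in the demo, while `ntree = 500` simply makes the package's default forest size explicit:

```r
# install.packages("randomForest")  # one-time install, as in the demo
library(randomForest)

set.seed(3)
# An ensemble of 500 regression trees, each trained on a bootstrap sample
rf_model <- randomForest(HappinessScore ~ ., data = train_data, ntree = 500)
predict(rf_model, newdata = test_data)
```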
  • Locked
    14.  Course Summary
    2m 22s
    In this video, you’ll summarize what you’ve learned in the course. In this course, you’ve learned to keep your models from overfitting on the training data. You also learned about the bias-variance trade-off that every data engineer has to keep in mind while building and training machine learning models. You saw how to mitigate overfitting for regression models using regularization. Finally, you explored ensemble learning. FREE ACCESS

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
