Statistical Analysis and Modeling in R: Performing Clustering

R Programming    |    Expert
  • 7 videos | 49m 44s
  • Includes Assessment
  • Earns a Badge
Rating 4.3 of 12 users
Clustering is an unsupervised learning technique that discovers patterns in data on its own and helps identify logical groupings. Use this course to distinguish between supervised and unsupervised learning and recognize how regression and classification algorithms differ from clustering. Examine the basic principles of clustering models and how k-means clustering finds logical groupings in your data. Learn the evaluation techniques used in clustering and find the optimal number of clusters in your data using both the elbow method and the silhouette score. Perform clustering on a dataset with multiple attributes and visualize clusters in your data using principal components. When you've completed this course, you'll be able to find groupings in your data using k-means clustering and compute the optimal number of clusters for your data.
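
As a rough illustration of the two evaluation techniques named above, the following base-R sketch computes the total within-cluster sum of squares (the quantity plotted in the elbow method) and the average silhouette width for several values of k. It is not taken from the course: the built-in iris measurements stand in for the course data, the range 2 to 6 and the seed are arbitrary choices, and silhouette comes from the cluster package that ships with standard R distributions.

```r
library(cluster)  # provides silhouette(); bundled with standard R installs

set.seed(42)                  # k-means starts from random centers
x <- scale(iris[, 1:4])       # assumption: iris stands in for the course data
d <- dist(x)                  # pairwise distances, needed by silhouette()
k_values <- 2:6               # candidate cluster counts (arbitrary range)

# Fit k-means once per candidate k, with several random restarts each
fits <- lapply(k_values, function(k) kmeans(x, centers = k, nstart = 25))

# Elbow method: total within-cluster sum of squares for each k
wss <- sapply(fits, function(fit) fit$tot.withinss)

# Silhouette score: mean silhouette width for each k
avg_sil <- sapply(fits, function(fit) {
  mean(silhouette(fit$cluster, d)[, "sil_width"])
})

results <- data.frame(k = k_values, wss = wss, avg_silhouette = avg_sil)
print(results)

# Look for the bend in this curve; past it, extra clusters help little
plot(k_values, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")

# k with the highest mean silhouette width
best_k <- k_values[which.max(avg_sil)]
```

The elbow is read off the plot by eye, whereas the silhouette score gives a single number per k, so the two methods can disagree; treating them as complementary checks is a common choice.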

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Recall the techniques used to evaluate clustering models
  • Investigate and visualize data before fitting a model
  • Perform k-means clustering and interpret clustering results
  • Find the optimal number of clusters using the elbow method and silhouette score
  • Perform k-means clustering on multi-attribute data
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 2m 10s
    In this video, you’ll learn more about your instructor and this course. In this course, you’ll learn the differences between supervised and unsupervised learning and see how regression and classification algorithms differ from clustering in how they learn patterns from data. You’ll discover the basic principles of clustering models and get an overview of how k-means clustering works to find logical groupings in your data. You’ll also learn evaluation techniques used in clustering.
  • 11m 55s
    In this video, you’ll learn more about clustering. Clustering is a machine learning technique used to find patterns or logical groupings in data, and it is fundamentally different from regression and classification algorithms. Both regression and classification are supervised machine learning techniques: they learn from training data in which each example carries a label or known value.
  • 3.  Investigating and Visualizing Clustering Data in R
    3m 46s
    In this video, you’ll watch a demo. In this demo, you'll see how to perform k-means clustering on data that has just two variables, which can be plotted on the two axes of a coordinate plane. First, you’ll invoke rm(list = ls()) to clear existing objects from your R workspace and start afresh. Before you perform cluster analysis, you'll visualize and explore your data using ggplot2.
  • 4.  Performing K-means Clustering, Interpreting Results
    10m 25s
    In this video, you’ll watch a demo. In this demo, you'll run the k-means clustering algorithm. You’ll notice that the k-means algorithm requires you to specify upfront the number of clusters you want to find. You’ll invoke the kmeans function, passing in the data you want to cluster, income.score.data, with iter.max set to 10. This means the algorithm will run for at most 10 iterations.
  • 5.  Using R to Find the Optimal Number of Clusters
    10m 56s
    In this video, you’ll watch a demo. In this demo, you'll see how to use the elbow method to determine the right number of clusters for your data. You'll be using functions from the tidyverse collection of data science packages. First, you’ll install the tidyverse package in your current R environment. Once the package has been installed, you’ll load it into your session using the library function.
  • 6.  Using K-means Clustering on Multi-attribute Data
    8m 23s
    In this video, you’ll watch a demo. In this demo, you'll learn how to apply the k-means clustering algorithm to data with multiple variables. You’ll learn how to find and visualize clusters in such data. You’ll start by invoking rm(list = ls()) to remove objects from your R workspace. Then you’ll load the ggplot2, factoextra, and cluster packages using the library function.
  • 7.  Course Summary
    2m 9s
    In this video, you’ll review what you’ve learned in this course. You learned to apply an unsupervised learning technique to find logical patterns in unlabeled data. You learned the difference between supervised and unsupervised learning. You also saw that regression and classification models are both examples of supervised learning and use labeled training data to train the model. You learned that clustering is an unsupervised learning technique that self-discovers patterns in data.
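
The multi-attribute workflow described in the demos above can be sketched in base R as follows. This is a hedged illustration, not the course's own code: the course's income.score.data is not available here, so the four numeric columns of the built-in iris dataset stand in for it, and the choice of 3 clusters, the seed, and nstart = 20 are assumptions. Base graphics are used instead of ggplot2 and factoextra to keep the sketch dependency-free.

```r
set.seed(1)                       # k-means starts from random centers

# Assumption: iris stands in for a dataset with multiple numeric attributes
features <- scale(iris[, 1:4])    # standardize so no attribute dominates

# iter.max caps the iterations; the algorithm stops earlier if it converges
fit <- kmeans(features, centers = 3, iter.max = 10, nstart = 20)

print(fit$size)                   # number of observations in each cluster
print(fit$iter)                   # iterations actually used by the best run

# Project the data onto its first two principal components for a 2-D view
pca <- prcomp(features)
scores <- pca$x[, 1:2]

# Color each point by its assigned cluster
plot(scores, col = fit$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "k-means clusters on principal components")
```

Clustering is done on all four attributes; the principal components are used only for plotting, so the picture is a two-dimensional summary of the clusters rather than the space they were found in.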

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
