Datasets in R: Selecting, Filtering, Ordering, & Grouping Data

R Programming 4.0+    |    Intermediate
  • 12 videos | 1h 34m 30s
  • Includes Assessment
  • Earns a Badge
Data analysis often requires performing a series of complex transformations. R simplifies this with the forward pipe operator for chaining operations, data selection and filtering based on conditional operations, and grouping and aggregating options to compute summaries. Learn how to carry out all these operations in this course. Tasks you'll carry out include using logical and relational operators to perform conditional filtering, sampling records at random, and computing the top N records based on values in a variable. You'll also learn to use the forward pipe operator in the magrittr package and tibbles, the next-generation data frame, to store and transform your data. You'll round this course off by performing ordering, grouping, and aggregations on your data. When you're finished, you'll have a solid grasp of complex operations on data frames and be able to apply these concepts using the R programming language.
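
As a rough illustration of the chaining described above, the following minimal sketch filters, selects, groups, and aggregates a small tibble in one pipeline. The customers table and its column names are hypothetical and are not taken from the course files; the sketch assumes the dplyr and tibble packages are installed.

```r
library(dplyr)   # filter(), select(), group_by(), summarize(), and the %>% pipe
library(tibble)  # tibble()

# Hypothetical customer data; names and values are illustrative only
customers <- tibble(
  id      = 1:6,
  segment = c("A", "B", "A", "B", "A", "B"),
  balance = c(1200, 300, 0, 2500, 800, 1500)
)

# Chain the steps: filter rows, keep two columns, group, then aggregate
segment_summary <- customers %>%
  filter(balance > 500) %>%
  select(segment, balance) %>%
  group_by(segment) %>%
  summarize(n = n(), mean_balance = mean(balance))

print(segment_summary)
```

Each step receives the result of the previous one as its first argument, which is what lets the pipeline read top to bottom.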


  • discover the key concepts covered in this course
  • edit data frame columns to be of the right data type
  • select variables from data frames
  • filter data using relational operators
  • use the select() function and chaining to filter data in tibbles
  • use the %>% operator and the filter() function to filter tibbles
  • sample rows using sample() and select top N rows using top_n()
  • change columns to be of their logically correct data type
  • use the order() and arrange() functions to sort data frames
  • create crosstabs and view the aggregate statistics of data frames
  • view aggregate statistics of tibbles with summarize() and group_by()
  • summarize the key concepts covered in this course
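
Several of the base R objectives above can be sketched in a few lines. This is only an illustration under stated assumptions: the car_data data frame, its columns, and its values are hypothetical and are not the data set used in the course.

```r
# Hypothetical data frame; names and values are illustrative only
car_data <- data.frame(
  model = c("a", "b", "c", "d"),
  fuel  = c("gas", "diesel", "gas", "gas"),
  price = c("21000", "18500", "30500", "15000"),  # numbers stored as text
  stringsAsFactors = FALSE
)

# Give columns their logically correct data types
car_data$price <- as.numeric(car_data$price)
car_data$fuel  <- factor(car_data$fuel)

# Select variables by name
prices <- car_data[, c("model", "price")]

# Filter rows with a relational operator
expensive <- car_data[car_data$price > 20000, ]

# Sort rows with order(); ascending is the default
by_price <- car_data[order(car_data$price), ]

# Count records per category with a contingency table
fuel_counts <- table(car_data$fuel)
print(fuel_counts)
```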


  • 2m 5s
    In this video, you’ll learn about the instructor and the course. In this course, you’ll learn to select and filter only the data you want to work with using different selection criteria and filtering techniques. You’ll use logical and relational operators to perform conditional filtering. You’ll sample records at random and compute the top N records based on the values in a variable. You’ll also learn how you can chain operations on your data frame.
  • 7m 6s
    In this video, you’ll watch a demo. In this demo, you’ll perform a range of operations in R that allow you to select and filter data stored in data frames or in data-frame-like formats. The file where you’ll write your code is called SelectionAndFiltering.R. First, you’ll run the rm(list = ls()) command to get rid of any objects that are currently in your R memory.
  • 3. Selecting Specific Rows and Columns
    6m 22s
    In this video, you’ll watch a demo. In this demo, you’ll perform selection and filtering operations using raw data frames. First, you’ll take a look at the column names in your data. You’ll invoke the colnames function and pass in your data frame. You’ll see there are a total of 13 columns starting from CLIENTNUM. The 13th column is the total revolving balance of a customer.
  • 4. Filtering Operations on Data Frame Rows
    10m 49s
    In this video, you’ll watch a demo. In this demo, you’ll see that row selection within a data frame also lets you select rows based on a condition. Rather than specifying the index values of rows, you can specify that you’d like to select rows that match a certain condition. Onscreen, you’ll see square brackets used to index into the bank churners data frame.
  • 5. Selecting and Filtering Using Packages in tidyverse
    10m 27s
    In this video, you’ll watch a demo. In this demo, you’ll use functions and packages from the tidyverse to perform selection and filtering on your data. Instead of working with data frames, you’ll work with the tibble format. In the tidyverse, a collection of R packages for data science, data is stored in tibbles rather than data frames. You’ll see tibbles and data frames are alike because they store data in a tabular format.
  • 6. Using the dplyr filter() Function
    8m 19s
    In this video, you’ll watch a demo. In this demo, you’ll learn about the different functions available in the dplyr package. These allow you to slice and filter your data. The first function you’ll look at is the slice function. Slice allows you to specify the index values of the records you want to select. The slice operation is performed on the bank.churners tibble, and you’ll specify this using the forward pipe operator.
  • 7. Retrieving Samples and Top N Results
    7m 17s
    In this video, you’ll watch a demo. In this demo, you’ll explore functions from the dplyr package which allow you to sample records at random. You’ll discover the sample_n function allows you to sample any number of records from the original data frame. Onscreen you’ll see how to feed in the bank.churners tibble to sample_n using the forward pipe operator. sample_n takes as an input argument the number of samples you want from the original data.
  • 8. Specifying the Correct Data Types for Columns
    7m 17s
    In this video, you’ll watch a demo. In this demo, you’ll cover a range of operations you can perform on R data frames and tibbles. You’ll learn to order your data, group your data, and then perform aggregations. You’ll start by looking at different techniques you can use to order or sort your data, beginning with clean data frames and moving on to tibbles. First, you’ll run rm(list=ls()) to get rid of any existing objects in memory.
  • 9. Sorting Using Order and Arrange
    9m 21s
    In this video, you’ll watch a demo. In this demo, you’ll use the order function to sort the records in your data frame. You’ll see you can use it to order by any column. Onscreen, you’ll see the data frame records ordered by the price of the car. The default is ascending order, from the lowest price to the highest.
  • 10. Grouping and Aggregations on Data Frames
    12m 8s
    In this video, you’ll watch a demo. In this demo, you’ll move on to grouping operations. First, you’ll perform grouping without using functions from tidyverse packages, relying instead on functions from the base R package. You’ll see that if you want to group records and see the counts of records in different categories, the easiest way to do this is to build a contingency table using the table function.
  • 11. Grouping and Aggregation Using dplyr
    11m 21s
    In this video, you’ll watch a demo. In this demo, you’ll learn how to compute aggregations on your data using functions from the dplyr package. The dplyr package offers functions to quickly summarize and aggregate your data. You’ll look at functions in the dplyr package that work with data frames and with tibbles. Since you’re working in the tidyverse, you’ll use the tibble for all your aggregations.
  • 12. Course Summary
    1m 59s
    In this video, you’ll summarize what you’ve learned in the course. In this course, you’ve learned a wide variety of data transformation and manipulation techniques, using R data frames and tibbles from the tidyverse. You explored the different techniques used to select columns or variables in a data frame. You learned the data types associated with the variables in your data and performed indexing operations to select specific rows and columns.
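
The dplyr functions named in the video descriptions above (slice, sample_n, top_n, and arrange) can be sketched as follows. The accounts tibble and its values are hypothetical stand-ins, not the bank.churners data from the demos, and the sketch assumes the dplyr and tibble packages are installed.

```r
library(dplyr)
library(tibble)

# Hypothetical account data; names and values are illustrative only
accounts <- tibble(
  id      = 1:8,
  balance = c(50, 900, 120, 430, 777, 10, 640, 305)
)

# Select rows by position
first_three <- accounts %>% slice(1:3)

# Sample three rows at random; the seed only makes the draw repeatable
set.seed(42)
random_three <- accounts %>% sample_n(3)

# Keep the two rows with the largest balance
top_two <- accounts %>% top_n(2, balance)

# Sort by balance, highest first
sorted <- accounts %>% arrange(desc(balance))
```

In dplyr 1.0.0 and later, sample_n and top_n are superseded by slice_sample and slice_max; the older names still work and are used here because they are the ones the course mentions.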


Skillsoft is providing you with the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.

Digital badges are yours to keep, forever.