SKILL BENCHMARK

Acquiring and Cleaning Data in R Competency (Intermediate Level)

  • 17m
  • 17 questions

The Acquiring and Cleaning Data in R Competency benchmark measures whether a learner has exposure to and experience with gathering data, identifying dirty data, and cleaning data in R. A learner who scores high on this benchmark demonstrates knowledge of and experience with acquiring data in various formats, understanding that data, and cleaning it using R libraries for data analysis.

Topics covered

  • apply a summary function using dplyr
  • combine two related datasets using a join operation
  • export tabular data from R to a CSV file
  • export tabular data from R to an Excel spreadsheet
  • export tabular data from R to an HTML table
  • fetch a JSON document over HTTP and load it using dplyr
  • handle common errors encountered when reading CSV data
  • load multiple sheets from an Excel document
  • read data from a CSV formatted text file
  • read data from an Excel spreadsheet
  • read data from a relational database using a SQL query
  • read tabular data from an HTML file
  • recognize criteria for ensuring data quality
  • recognize types of unclean data
  • reshape tabular data by spreading values from rows to columns
  • use a regular expression to extract data into a new column
  • use mean imputation to replace missing values