Data Analysis with R Proficiency

  • 20m
  • 20 questions
The Data Analysis with R Proficiency benchmark measures whether a learner has had significant exposure to, and experience in, performing data analysis operations using R. A learner who scores high on this benchmark demonstrates the ability to work independently with various R libraries for data analysis, model building, and deployment.

Topics covered

  • create a reference class that inherits or derives from another reference class
  • create lists containing data of different types
  • examine how to fit a straight line to data to build a regression model and evaluate the model
  • explore the ANOVA (analysis of variance) test to compare the means of two or more groups
  • find the optimal number of clusters using the elbow method and silhouette score
  • implement closures, which comprise a function's environment, body, and input arguments
  • interpret QQ plots for normally and non-normally distributed data
  • perform joins on data frames using the merge() function
  • perform regression using random forest
  • recall how string values can be stored as factors in data frames
  • recall which functions print() invokes depending on the type of its input argument
  • recall the use of R environments as bindings of variable names to values
  • reformat a real-world dataset
  • run the two-sample t-test assuming equal variances
  • run the two-way ANOVA test for additive and interaction models
  • sample rows using sample() and select top N rows using top_n()
  • train a model on an imbalanced dataset
  • use the melt() and dcast() functions to reformat data frames
  • use the sapply(), vapply(), and tapply() functions to apply functions to elements in vectors
  • use vignettes for help on packages
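
As a rough illustration of a few of the topics above (string columns as factors, joins with merge(), and the sapply(), vapply(), and tapply() functions), here is a minimal sketch in base R. The data frames scores and labels and their column names are hypothetical, made up for this example; they are not taken from the benchmark itself.

```r
# Hypothetical data, made up for illustration only.
scores <- data.frame(
  id    = c(1, 2, 3, 4),
  group = c("a", "b", "a", "b"),
  score = c(10, 20, 30, 40),
  stringsAsFactors = TRUE  # store the string column as a factor
)
labels <- data.frame(
  id   = c(1, 2, 3),
  name = c("x", "y", "z")
)

# Inner join on the shared "id" column; id 4 has no match and is dropped.
joined <- merge(scores, labels, by = "id")

# Left join: all.x = TRUE keeps every row of scores, filling name with NA.
left_joined <- merge(scores, labels, by = "id", all.x = TRUE)

# sapply() simplifies the result when it can; vapply() also checks that
# each result matches the declared template (here, one numeric value).
squares_s <- sapply(scores$score, function(v) v^2)
squares_v <- vapply(scores$score, function(v) v^2, numeric(1))

# tapply() applies a function to the scores within each level of group.
group_means <- tapply(scores$score, scores$group, mean)
```

In this sketch vapply() is the stricter choice, since it raises an error if a result does not match the declared type and length, whereas sapply() silently returns whatever shape it can simplify to.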