Big Data Awareness (Entry Level)

  • 15m
  • 15 questions
The Big Data Awareness benchmark measures whether a learner has been exposed to core big data concepts: what big data is, where it comes from, the formats it takes, and the applications and use cases for big data analytics. A learner who scores high on this benchmark demonstrates foundational knowledge of big data.

Topics covered

  • briefly describe traditional data and data warehousing architecture
  • compare and contrast parallel and distributed computing systems
  • compare the key differences between ETL (extract, transform, load) and ELT (extract, load, transform) systems, and describe how ETL is used with traditional data architectures and ELT with modern ones
  • compare structured and unstructured data and describe how the ability to extract value from unstructured data is important when dealing with big data
  • define the seven characteristics (the "7 Vs") of big data: volume, velocity, variety, variability, veracity, visualization, and value
  • describe how business intelligence analytics has developed from traditional to modern approaches
  • describe the concept of big data and the history behind it
  • describe the difference between data warehousing and big data and specify the impact that big data has had on data warehousing
  • describe the difference between horizontal and vertical scaling and specify why horizontal scaling is generally the better choice for big data
  • distinguish between raw data, information, applicable knowledge, and general wisdom
  • identify the sources that are capable of generating big data
  • list and describe the limitations of traditional data architecture, including limitations on speed, scalability, compatibility, and consumption
  • list and describe the limitations of using ETL systems when working with data, including limitations on performance, scalability, and structure
  • list the most commonly used data sources and formats
  • specify why real-time processing is advantageous when dealing with large amounts of data