Getting Started with Hive: Optimizing Query Executions

Overview/Description

Apache Hive is one of the most popular data warehouse systems used for data science. It supports analysis of big data through a simple query interface. In this Skillsoft Aspire course, you will explore the optimizations that allow Hive to process data in parallel, and the ways users can further improve query performance.



Expected Duration (hours)
0.7

Lesson Objectives

Getting Started with Hive: Optimizing Query Executions

  • Course Overview
  • recognize how Hive translates queries to Hadoop MapReduce operations
  • identify the different options available in Hive to optimize query execution
  • recall how partitioning of a dataset can help queries run efficiently and identify the types of partitioning available in Hive
  • specify how bucketing improves query performance and compare it with partitioning a dataset
  • identify how to join tables in Hive to ensure the best performance of your query
  • apply techniques to improve query performance, working with partitioning, bucketing, and structured queries

Course Number
it_dsgshvdj_04_enus

Expertise Level
Intermediate