Optimizing Query Executions with Hive

Apache Hive 2.3.2
  • 7 Videos | 44m 39s
  • Includes Assessment
  • Earns a Badge
In this 7-video Skillsoft Aspire course, learners explore the optimizations that allow Apache Hive to process data in parallel, and the ways users can further improve query performance. Learners should have previous experience with Hive and be familiar with querying big data for analysis purposes. The course focuses only on concepts; no queries are run. Hive democratizes access to data stored in a Hadoop cluster: it exposes files stored in Hadoop as tables and makes them accessible through the Hive query language, so users do not need to know MapReduce to process cluster data. The course begins with the different options available in Hive for querying data efficiently. It then covers how to split data into smaller chunks through partitioning and bucketing, so that queries need not scan the full data set each time. Finally, see how to structure queries, including joins, to reduce the number of MapReduce operations that Hive generates and so speed up query execution.

WHAT YOU WILL LEARN

  • recognize how Hive translates queries to Hadoop MapReduce operations
  • identify the different options available in Hive to optimize query execution
  • recall how partitioning of a dataset can help queries run efficiently and identify the types of partitioning available in Hive
  • specify how bucketing improves query performance and compare it with partitioning a dataset
  • identify how to join tables in Hive to ensure the best performance of your query
  • work with techniques to improve performance and work with partitioning, bucketing and structured queries
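
As a rough illustration of the partitioning and bucketing objectives above, the following Python sketch imitates the idea outside of Hive. It is a simplified model under stated assumptions, not how Hive is implemented: the sample rows, the function names, and the bucket count are all hypothetical, and bucketing is modeled as the integer key modulo the number of buckets.

```python
from collections import defaultdict

# Hypothetical sample rows: (order_id, country, amount). Not from the course.
ROWS = [
    (1, "US", 20.0),
    (2, "DE", 35.5),
    (3, "US", 12.0),
    (4, "IN", 50.0),
    (5, "DE", 8.25),
    (6, "US", 99.0),
]

NUM_BUCKETS = 2  # assumed bucket count, chosen only for illustration


def partition_by_country(rows):
    """Group rows by country, loosely like one directory per partition value."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[1]].append(row)
    return partitions


def bucket_by_order_id(rows, num_buckets=NUM_BUCKETS):
    """Place each row in a fixed bucket: order_id modulo num_buckets."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[row[0] % num_buckets].append(row)
    return buckets


def total_for_country(partitions, country):
    """Sum amounts for one country while reading only that partition."""
    scanned = partitions.get(country, [])
    return sum(r[2] for r in scanned), len(scanned)


partitions = partition_by_country(ROWS)
total, rows_scanned = total_for_country(partitions, "US")
buckets = bucket_by_order_id(ROWS)
print(total, rows_scanned, [len(b) for b in buckets])
```

In this model, a filter on the partition column reads 3 of the 6 rows instead of all of them, while bucketing spreads rows across a fixed number of groups regardless of how many distinct key values exist.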

IN THIS COURSE

  1. Course Overview (2m 18s)
  2. Hive Queries as MapReduce Jobs (4m 52s)
  3. Techniques to Improve Query Performance in Hive (6m 50s)
  4. Partitioning Tables in Hive (8m 36s)
  5. Bucketing Tables in Hive (7m 17s)
  6. Structuring Join Queries in Hive (4m 36s)
  7. Exercise: Optimizing Query Execution in Hive (7m 40s)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of this course. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
