Using Hive to Optimize Query Executions with Partitioning

Apache Hive 2.3.2    |    Intermediate
  • 10 videos | 1h 47s
  • Includes Assessment
  • Earns a Badge
Rating 4.8 of 14 users
Continue to explore the versatility of Apache Hive, one of today's most popular data warehouses, in this 10-video Skillsoft Aspire course. Learners are shown ways to optimize query execution, including the powerful technique of partitioning data sets. This hands-on course assumes previous experience working with Hive tables using the Hive query language and processing complex data types, along with a theoretical understanding of how partitioning very large data sets improves query performance. Demonstrations focus on the basics of partitioning and on how to create partitions and load data into them. Learners work with both Hive-managed tables and external tables to see how partitioning works for each, then watch how to navigate to the shell of the Hadoop master node and create new directories in the Hadoop file system. Observe dynamic partitioning of tables and how it simplifies loading data into partitions. Finally, explore how multiple columns in a table can be used to partition its data. By the end of the course, learners will have a sound understanding of how large data sets can be partitioned into smaller chunks to improve query performance.

WHAT YOU WILL LEARN

  • Use the Google Cloud Platform's Dataproc service to provision a Hadoop cluster (not required if you already have a Hadoop environment set up with Hive)
  • Define a table which will contain data partitioned based on the value in one of its columns
  • Insert data into partitions of a Hive table and explore the partition and its data on HDFS
  • Load data into table partitions from files
  • Create and populate partitions in an external table
  • Alter the definition of a partition to modify its contents
  • Define and work with dynamic partitions on your Hive tables
  • Configure a table to use more than one column to define partitions and explore the partition on HDFS
  • Use partitioning to boost query performance in HDFS
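
As a rough illustration of why partitioning helps, the sketch below mimics in plain Python how a partitioned table is laid out as one directory per partition value (named `column=value`), and how a filter on a partition column lets a query skip whole directories. It is a toy model and does not use Hive. The `sales` table, its columns, and the helper names `partition_path`, `write_partitioned`, and `prune` are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def partition_path(table_dir, partition_cols, row):
    """Build a Hive-style partition directory, e.g. sales/country=US/year=2017."""
    segments = [f"{col}={row[col]}" for col in partition_cols]
    return "/".join([table_dir, *segments])


def write_partitioned(table_dir, partition_cols, rows):
    """Route each row to the directory implied by its partition column values."""
    layout = defaultdict(list)
    for row in rows:
        path = partition_path(table_dir, partition_cols, row)
        # Partition values live in the directory name, not in the stored record.
        record = {k: v for k, v in row.items() if k not in partition_cols}
        layout[path].append(record)
    return dict(layout)


def prune(layout, **filters):
    """Return only the partition directories whose values match every filter."""
    selected = []
    for path in layout:
        values = dict(seg.split("=", 1) for seg in path.split("/") if "=" in seg)
        if all(values.get(col) == str(val) for col, val in filters.items()):
            selected.append(path)
    return sorted(selected)


# Hypothetical sample rows for an imaginary "sales" table.
rows = [
    {"order_id": 1, "amount": 20.0, "country": "US", "year": 2017},
    {"order_id": 2, "amount": 35.5, "country": "US", "year": 2018},
    {"order_id": 3, "amount": 12.0, "country": "IN", "year": 2018},
    {"order_id": 4, "amount": 8.25, "country": "IN", "year": 2018},
]

# Two partition columns produce nested directories: country first, then year.
layout = write_partitioned("sales", ["country", "year"], rows)

# A filter on a partition column narrows the scan to the matching directories.
for path in prune(layout, country="IN"):
    print(path, layout[path])
```

Deriving each row's directory from its own column values is loosely analogous to dynamic partitioning, where the target partition is not named explicitly for every load. Using two partition columns shows the nested layout that results when more than one column defines the partitions.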

IN THIS COURSE

  • 2m 26s
  • 4m 52s
    Learn how to use the Google Cloud Platform's Dataproc service to provision a Hadoop cluster. This is not required if you already have a Hadoop environment set up with Hive.
  • 3.  Creating a Partitioned Table in Hive
    6m 16s
    Learn how to define a table which will contain data partitioned based on the value in one of its columns.
  • 4.  Working with Partitions in Hive
    7m 2s
    In this video, you will learn how to insert data into partitions of a Hive table and explore the partition and its data on HDFS.
  • 5.  Populating Partitions in Hive
    7m 43s
    In this video, you will learn how to load data into table partitions from files.
  • 6.  Partitioning External Tables in Hive
    7m 21s
    In this video, find out how to create and populate partitions in an external table.
  • 7.  Modifying Partitions in Hive
    4m 28s
    In this video, you will change the definition of a partition to modify its contents.
  • 8.  Dynamic Partitions in Hive
    7m 12s
    In this video, you will define and work with dynamic partitions on your Hive tables.
  • 9.  Using Multiple Columns for Partitioning in Hive
    7m 47s
    Find out how to configure a table to use more than one column to define partitions and explore the partitions on HDFS.
  • 10.  Exercise: Optimize Executions with Partitioning
    5m 41s
    In this video, you will use partitioning to improve query performance in HDFS.

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
