Loading & Querying Data with Hive

Apache Hive 2.3.2    |    Beginner
  • 13 Videos | 1h 24m 53s
  • Includes Assessment
  • Earns a Badge
Apache Hive, one of the most popular data warehouses used in data science, simplifies working with large data sets stored in files by representing them as tables. In this 13-video Skillsoft Aspire course, learners explore how to create, load, and query Hive tables. This hands-on course assumes a conceptual understanding of Hive and its basic components, prior experience querying tables with SQL (Structured Query Language), and familiarity with the command line. Key concepts covered include provisioning a cluster, joining tables, and modifying tables. Demonstrations include using the Beeline client for Hive to perform simple operations: creating tables, loading them with data, and running queries against them. Only tables with primitive data types are used, with data loaded from both HDFS (Hadoop Distributed File System) and the local file system of the Hive client machine. Learners also work with the Hive metastore and with temporary tables, and see how each can be used. By the end, you will be familiar with the basics of the Hive query language and comfortable working with HDFS.

WHAT YOU WILL LEARN

  • use the Google Cloud Platform's Dataproc service to provision a Hadoop cluster
  • define and create a simple table in Hive using the Beeline client
  • load a few rows of data into a table and query it with simple select statements
  • run Hive queries from the shell of a host where a Hive client is installed
  • define and run a join query involving two related tables
  • describe the structure of the Hive Metastore on the Hadoop Distributed File System (HDFS)
  • create, load data into, and query an external table in Hive and contrast it with a Hive-managed table
  • use the alter table statement to change the definition of a Hive table
  • work with temporary tables that are only valid for a single Hive session and recognize how they differ from regular tables
  • populate Hive tables with data in files on both HDFS and the file system of the Hive client
  • load data into multiple tables from the contents of another table
  • use the Hadoop shell to execute Hive query scripts and work with Hive tables

IN THIS COURSE

  1. Course Overview (2m 14s)
  2. Setting up a Hadoop Cluster on the Google Cloud (6m 22s)
  3. Creating a Hive Table (6m 23s)
  4. Running Simple Queries in Hive (7m 15s)
  5. Executing Hive Queries from the Shell (3m 56s)
  6. Joining Tables in Hive (4m 22s)
  7. Exploring the Hive Warehouse (9m 23s)
  8. External Tables in Hive (8m 52s)
  9. Modifying Tables in Hive (5m 37s)
  10. Temporary Tables in Hive (5m 42s)
  11. Loading Data into Tables in Hive (9m 24s)
  12. Populating Multiple Tables in Hive (4m 15s)
  13. Exercise: Loading and Querying Data in Hive (5m 38s)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of this course. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
