Getting Started with Hive: Loading and Querying Data

Overview/Description

Apache Hive is a popular data warehouse system that is widely used in data science. It simplifies working with large datasets stored in files by representing them as tables, which can then be queried with a simple, SQL-like query language. In this Skillsoft Aspire course, you will explore how to create, load, and query Hive tables.



Expected Duration (hours)
1.3

Lesson Objectives

  • Course Overview
  • use the Google Cloud Platform's Dataproc service to provision a Hadoop cluster
  • define and create a simple table in Hive using the Beeline client
  • load a few rows of data into a table and query it with simple select statements
  • run Hive queries from the shell of a host where a Hive client is installed
  • define and run a join query involving two related tables
  • describe the structure of the Hive Metastore on the Hadoop Distributed File System (HDFS)
  • create, load data into, and query an external table in Hive and contrast it with a Hive-managed table
  • use the alter table statement to change the definition of a Hive table
  • work with temporary tables that are only valid for a single Hive session and recognize how they differ from regular tables
  • populate Hive tables with data in files on both HDFS and the file system of the Hive client
  • load data into multiple tables from the contents of another table
  • use the Hadoop shell to execute Hive query scripts and work with Hive tables

Course Number
it_dsgshvdj_02_enus

Expertise Level
Beginner