Data Warehousing: Apache Hive 2.3.2 Beginner

Explore Data Warehousing (EDW), a well-established approach that enhances business productivity by facilitating reporting and data analysis.

GETTING STARTED

Scalable Data Architectures: Getting Started

  1. Course Overview | 2m 40s
  2. Scalable Architectures with Distributed Computing | 9m 14s

GETTING STARTED

Data Warehouse Essential: Concepts

  1. Role of Data Warehouse in Strategic Information | 8m 31s
  2. OLAP and Data Warehouse | 7m 5s

GETTING STARTED

Getting Started with Hive

  1. Course Overview | 2m 24s
  2. Hive as a Data Warehouse | 4m 57s

GETTING STARTED

Data Warehousing with Azure: Architecture & Modeling Techniques

  1. Data Warehouse Features | 5m 26s
  2. Data Warehouse Architectures | 5m 56s

GETTING STARTED

Optimizing Query Executions with Hive

  1. Course Overview | 2m 21s
  2. Hive Queries as MapReduce Jobs | 4m 55s

COURSES INCLUDED

Scalable Data Architectures: Getting Started
Explore theoretical foundations of the need for and characteristics of scalable data architectures in this 8-video course. Learn to use data warehouses to store, process, and analyze big data. Key concepts covered here include how to recognize the need to scale architectures to keep up with needs for storage and processing of big data; how to identify characteristics of data warehouses ideally suiting them to tasks of big data analysis and processing; and how to distinguish between relational databases and data warehouses. Next, learn to recognize specific characteristics of systems meant for online transaction processing and online analytical processing, and how data warehouses are an example of online analytical processing (OLAP) systems. Then, learn to identify various components of data warehouses enabling them to work with varied sources, extract and transform big data, and generate reports of analysis operations efficiently. Finally, study features of Amazon Redshift enabling big data to be processed at scale; features of data warehouses, contrasted with those of relational databases; and two options available to scale compute capacity.
8 videos | 52m | Assessment available | Badge
Scalable Data Architectures: Using Amazon Redshift
Using a hands-on lab approach, explore how to use Amazon Redshift to set up and configure a data warehouse on the cloud in this 9-video course. Discover how to interact with the Redshift service using both the console and the Amazon Web Services (AWS) Command Line Interface (CLI). Key concepts covered here include how to use the Amazon Redshift Quick Launch feature to provision a data warehouse; provisioning a Redshift cluster with the default cluster; and tool configuration options for a Redshift cluster, and metrics available to optimize a cluster configuration. Next, learn how to create Identity and Access Management (IAM) roles on AWS that include necessary permissions to interact with Redshift and S3 services; to provision an IAM user that can connect to and interact with AWS using the CLI; and to install the AWS command-line interface to create and delete Redshift clusters. Then learn to use the Redshift Query Editor to create tables, load data, and run queries; and learn features of Amazon Redshift and commands and configurations needed to work with Redshift by using the CLI.
9 videos | 54m | Assessment available | Badge
Scalable Data Architectures: Using Amazon Redshift & QuickSight
In this 12-video course, explore the loading of data from an external source such as Amazon S3 into a Redshift cluster, as well as configuration of snapshots and resizing of clusters. Discover how to use Amazon QuickSight to visualize data. Key concepts covered in this course include using the AWS console to load data sets to Amazon S3 and then into a table provisioned on a Redshift cluster; running queries on data in a Redshift cluster with the query evaluation feature; and working with SQL Workbench to connect to and query data in a Redshift cluster. Learn how to disable automated snapshots for a Redshift cluster and configure a table to be excluded from snapshots; recover an individual table from the snapshot of an entire cluster; and create a security group rule enabling access from Amazon's QuickSight servers to a Redshift cluster. Next, configure Amazon QuickSight to load data from a table in a Redshift cluster for analysis; and use the QuickSight dashboard to generate a time series plot to visualize sales at a retailer over time.
12 videos | 1h 18m | Assessment available | Badge
Traditional Data Architectures: Relational Databases
Databases are essential in working with large amounts of data. Managers, leaders, and decision-makers need to choose the right approach when working on a large data project, distinguishing among multiple database types and their use cases. A relational database is a primary traditional data architecture commonly used by most businesses. Working with relational databases has some key advantages but also poses certain limitations. In this course, learn how to critically evaluate and work with relational databases. Explore normalization and denormalization of datasets along with specific use cases of these opposite approaches. Examine two main online information processing systems, Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) systems. Finally, investigate the concepts of data warehousing, data marts, and data mining. Upon completion, you'll be able to identify when and how to use a relational database.
12 videos | 35m | Assessment available | Badge
Traditional Data Architectures: Data Warehousing and ETL Systems
Data warehouses are actively used for business intelligence and, because they integrate data from multiple sources, have advantages over simple databases in many instances. Because modern companies often have ETL-based data warehousing systems, decision-makers need to understand how these systems operate and how to manage them appropriately. In this course, learn the necessary concepts and processes required to work with and manage projects related to data warehousing. Study data warehousing architectures and schemas and investigate some core data warehouse elements, such as dimensions, fact tables, and keys. Furthermore, examine the extract, transform, and load (ETL) approach for working with data warehouses, specifying process flow, tools, and software as well as best practices. When you're done, you'll know how to adopt data warehousing and ETL systems for your business intelligence and data management needs.
12 videos | 39m | Assessment available | Badge

COURSES INCLUDED

Data Warehouse Essential: Concepts
Discover the fundamentals of data warehousing and approaches to implementing it. Explore Data Warehouse planning, processes, schemes, and terms. You will also examine global and local Data Warehouses and compare Data Warehouses with RDBMS and Data Lakes.
18 videos | 1h 43m | Assessment available | Badge
Data Warehouse Essential: Architecture Frameworks & Implementation
Examine architectures of data warehouse implementations, including logical and physical design. How to effectively implement and manage data warehousing projects is also covered.
11 videos | 1h | Assessment available | Badge

COURSES INCLUDED

Getting Started with Hive
This 9-video Skillsoft Aspire course focuses solely on theory and involves no programming or query execution. Learners begin by examining what a data warehouse is and how it differs from a relational database. The distinction matters because Apache Hive is primarily a data warehouse, even though it offers a SQL-like interface for querying data. Hive facilitates work on very large data sets, stored as files in the Hadoop Distributed File System, and lets users perform operations in parallel on data in these files by effectively transforming Hive queries into MapReduce operations. Next, you will learn about the types of data and operations that data warehouses and relational databases handle, before moving on to basic components of the Hadoop architecture. Finally, the course discusses the features of Hive that make it popular among data analysts. The concluding exercise asks learners to recall differences between online transaction processing and online analytical processing systems; identify Hadoop’s three major components; list Hadoop offerings on three major cloud platforms (AWS, Microsoft Azure, and Google Cloud Platform); and list benefits of Hive for data analysts.
10 videos | 56m | Assessment available | Badge
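
The description above mentions that Hive transforms queries into MapReduce operations. As a rough illustration only, the following Python sketch mimics the map, shuffle, and reduce phases for an aggregation conceptually similar to `SELECT region, SUM(amount) ... GROUP BY region`. The sample rows, column names, and function names are hypothetical and are not taken from the course; the sketch runs in a single process and does not reflect how Hive or Hadoop actually plan or distribute a job.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for records read from files in HDFS.
ROWS = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]


def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for row in rows:
        yield row["region"], row["amount"]


def shuffle_phase(pairs):
    """Group values by key, as happens between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single aggregate."""
    return {key: sum(values) for key, values in grouped.items()}


def total_by_region(rows):
    """Chain the three phases to compute a per-region total."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(total_by_region(ROWS))
```

In a real cluster the map and reduce steps would run in parallel on different nodes; here they run sequentially to keep the idea visible.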
Loading & Querying Data with Hive
Among the market’s most popular data warehouses used for data science, Apache Hive simplifies working with large data sets in files by representing them as tables. In this 12-video Skillsoft Aspire course, learners explore how to create, load, and query Hive tables. For this hands-on course, learners should have a conceptual understanding of Hive and its basic components, and prior experience with querying data from tables using SQL (structured query language) and with using the command line. Key concepts covered include cluster, joining tables, and modifying tables. Demonstrations include using the Beeline client for Hive for simple operations, creating tables, loading them with data, and running queries against them. Only tables with primitive data types are used here, with data loaded into these tables from HDFS (Hadoop Distributed File System) and from local machines. Learners will work with the Hive metastore and temporary tables and see how they can be used. You will become familiar with the basics of the Hive query language and comfortable working with HDFS.
13 videos | 1h 20m | Assessment available | Badge
Viewing & Querying Complex Data with Hive
Learners explore working with complex data types in Apache Hive in this Skillsoft Aspire course, which assumes previous work with Hive tables using the Hive query language, and comfort using a command-line interface or Hive client to run queries. Learners begin this 12-video, hands-on course by working with Hive tables whose columns are of complex data types (arrays, maps, and structs). Watch demonstrations of set operations and of transforming complex types into tabular form with the explode operation. Then use lateral views to add more data to exploded outputs. Course labs use the Beeline client; the instructor’s Beeline terminal runs on the master node of a Hadoop cluster provisioned on Google Cloud Platform using its Dataproc service, and learners are assumed to have access to a Hadoop cluster and Beeline, on-premises or in the cloud. Finally, learners observe how to use views to aggregate contents of multiple columns. As the course concludes, you should be comfortable working with all types of data in Hive and performing analysis tasks on tables with both primitive types and complex data.
12 videos | 1h 13m | Assessment available | Badge
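
To make the explode and lateral view ideas from the description above more concrete, here is a small Python sketch of their semantics. It is an analogy, not Hive code: the `ORDERS` rows, the column names, and both helper functions are hypothetical. It assumes the common behavior in which a row whose array is empty contributes no output rows, and for brevity it leaves the original array column out of the result.

```python
# Hypothetical table: one primitive column and one array column per row.
ORDERS = [
    {"customer": "ana", "items": ["pen", "ink"]},
    {"customer": "bo", "items": ["pad"]},
    {"customer": "cy", "items": []},
]


def explode(values):
    """Turn one array value into one output value per element."""
    for value in values:
        yield value


def lateral_view_explode(rows, array_col, alias):
    """Pair each row's remaining columns with every exploded element.

    Rows whose array is empty yield nothing, so they drop out of the result.
    """
    for row in rows:
        rest = {k: v for k, v in row.items() if k != array_col}
        for element in explode(row[array_col]):
            yield {**rest, alias: element}


if __name__ == "__main__":
    for out_row in lateral_view_explode(ORDERS, "items", "item"):
        print(out_row)
```

The point of the lateral view step is that exploded elements stay attached to the other columns of the row they came from, which a bare explode does not provide.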

COURSES INCLUDED

Data Warehousing with Azure: Architecture & Modeling Techniques
Explore the fundamentals of data warehousing and the essential architectures and components being implemented to manage data.
15 videos | 1h 18m has