Data Warehousing: Apache Hive 2.3.2 intermediate


5 Courses | 4h 18m 1s
2 Books | 6h 18m

2 Courses | 2h 44m 30s
4 Books | 31h 37m

3 Courses | 3h 28m 4s
3 Books | 6h 4m

2 Courses | 2h 1m 36s

6 Courses | 6h 33m 45s
2 Books | 14h 36m

3 Courses | 2h 46m 10s
3 Books | 6h 4m


Explore Data Warehousing (EDW), a well-established approach that enhances business productivity by providing systems that facilitate reporting and data analysis.


COURSES INCLUDED

Scalable Data Architectures: Getting Started

Explore theoretical foundations of the need for and characteristics of scalable data architectures in this 8-video course. Learn to use data warehouses to store, process, and analyze big data. Key concepts covered here include how to recognize the need to scale architectures to keep up with needs for storage and processing of big data; how to identify characteristics of data warehouses ideally suiting them to tasks of big data analysis and processing; and how to distinguish between relational databases and data warehouses. Next, learn to recognize specific characteristics of systems meant for online transaction processing and online analytical processing, and how data warehouses are an example of online analytical processing (OLAP) systems. Then, learn to identify various components of data warehouses enabling them to work with varied sources, extract and transform big data, and generate reports of analysis operations efficiently. Finally, study features of Amazon Redshift enabling big data to be processed at scale; features of data warehouses, contrasted with those of relational databases; and two options available to scale compute capacity.

8 videos | 52m Assessment Badge

Scalable Data Architectures: Using Amazon Redshift

Using a hands-on lab approach, explore how to use Amazon Redshift to set up and configure a data warehouse on the cloud in this 9-video course. Discover how to interact with Redshift service with both the console and Amazon Web Services (AWS) Command Line Interface (CLI). Key concepts covered here include how to use the Amazon Redshift Quick Launch feature to provision a data warehouse; provisioning a Redshift cluster with the default cluster; and tool configuration options for a Redshift cluster, and metrics available to optimize a cluster configuration. Next, learn how to create Identity and Access Management (IAM) roles on AWS that include necessary permissions to interact with Redshift and S3 services; to provision an IAM user that can connect to and interact with AWS using the CLI; and to install the AWS command-line interface to create and delete Redshift clusters. Then learn to use Redshift Query Editor to create tables, load data, and run queries; and learn features of Amazon Redshift and commands and configurations needed to work with Redshift by using the CLI.

9 videos | 54m Assessment Badge

Scalable Data Architectures: Using Amazon Redshift & QuickSight

In this 12-video course, explore the loading of data from an external source such as Amazon S3 into a Redshift cluster, as well as configuration of snapshots and resizing of clusters. Discover how to use Amazon QuickSight to visualize data. Key concepts covered in this course include using the AWS console to load data sets to Amazon S3 and then into a table provisioned on a Redshift cluster; running queries on data in a Redshift cluster with the query evaluation feature; and working with SQL Workbench to connect to and query data in a Redshift cluster. Learn how to disable automated snapshots for a Redshift cluster and configure a table to be excluded from snapshots; recover an individual table from the snapshot of an entire cluster; and create a security group rule enabling access from Amazon's QuickSight servers to a Redshift cluster. Next, configure Amazon QuickSight to load data from a table in a Redshift cluster for analysis; and use the QuickSight dashboard to generate a time series plot to visualize sales at a retailer over time.

12 videos | 1h 17m Assessment Badge

Traditional Data Architectures: Relational Databases

Databases are essential in working with large amounts of data. Managers, leaders, and decision-makers need to choose the right approach when working on a large data project, distinguishing among multiple database types and their use cases. A relational database is a primary traditional data architecture commonly used by most businesses. Working with relational databases has some key advantages but also poses certain limitations. In this course, learn how to critically evaluate and work with relational databases. Explore normalization and denormalization of datasets along with specific use cases of these opposite approaches. Examine two main online information processing systems, Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) systems. Finally, investigate the concepts of data warehousing, data marts, and data mining. Upon completion, you'll be able to identify when and how to use a relational database.

12 videos | 34m Assessment Badge

Traditional Data Architectures: Data Warehousing and ETL Systems

Data warehouses are actively used for business intelligence and, because they integrate data from multiple sources, offer advantages over simple databases in many instances. Because modern companies often have ETL-based data warehousing systems, decision-makers need to understand how these systems operate and how to manage them appropriately. In this course, learn the concepts and processes required to work with and manage projects related to data warehousing. Study data warehousing architectures and schemas and investigate some core data warehouse elements, such as dimension tables, fact tables, and keys. Furthermore, examine the extract, transform, and load (ETL) approach for working with data warehouses, specifying process flow, tools, and software as well as best practices. When you're done, you'll know how to adopt data warehousing and ETL systems for your business intelligence and data management needs.

12 videos | 38m Assessment Badge

FREE ACCESS

COURSES INCLUDED

Data Warehouse Essential: Concepts

Discover the fundamentals of data warehousing and the approaches to implementing it. Explore data warehouse planning, processes, schemas, and terms. You will also examine global and local data warehouses and compare data warehouses with RDBMSs and data lakes.

18 videos | 1h 43m Assessment Badge

Data Warehouse Essential: Architecture Frameworks & Implementation

Examine architectures of data warehouse implementations, including logical and physical design. How to effectively implement and manage data warehousing projects is also covered.

11 videos | 1h 1m Assessment Badge

FREE ACCESS

COURSES INCLUDED

Getting Started with Hive

This 9-video Skillsoft Aspire course focuses solely on theory and involves no programming or query execution. Learners begin by examining what a data warehouse is, and how it differs from a relational database, important because Apache Hive is primarily a data warehouse, despite giving a SQL-like interface to query data. Hive facilitates work on very large data sets, stored as files in the Hadoop Distributed File System, and lets users perform operations in parallel on data in these files by effectively transforming Hive queries into MapReduce operations. Next, you will hear about types of data and operations which data warehouses and relational databases handle, before moving on to basic components of the Hadoop architecture. Finally, the course discusses features of Hive making it popular among data analysts. The concluding exercise recalls differences between online transaction processing and online analytical processing systems, asking learners to identify Hadoop's three major components; list Hadoop offerings on three major cloud platforms (AWS, Microsoft Azure, and Google Cloud Platform); and list benefits of Hive for data analysts.

10 videos | 55m Assessment Badge

Loading & Querying Data with Hive

Among the market's most popular data warehouses used for data science, Apache Hive simplifies working with large data sets in files by representing them as tables. In this 12-video Skillsoft Aspire course, learners explore how to create, load, and query Hive tables. For this hands-on course, learners should have a conceptual understanding of Hive and its basic components, and prior experience with querying data from tables using SQL (Structured Query Language) and with using the command line. Key concepts covered include clusters, joining tables, and modifying tables. Demonstrations include using the Beeline client for Hive for simple operations, creating tables, loading them with data, and then running queries against them. Only tables with primitive data types are used here, with data loaded into these tables from HDFS (Hadoop Distributed File System) and from local machines. Learners will work with the Hive metastore and temporary tables and see how they can be used. You will become familiar with the basics of the Hive query language and comfortable working with HDFS.

13 videos | 1h 19m Assessment Badge

Viewing & Querying Complex Data with Hive

Learners explore working with complex data types in Apache Hive in this Skillsoft Aspire course, which assumes previous work with Hive tables using the Hive query language, and comfort using a command-line interface or Hive client to run queries. Learners begin this 12-video, hands-on course by working with Hive tables whose columns are of complex data types (arrays, maps, and structs). Watch demonstrations of set operations and of transforming complex types into tabular form with the explode operation. Then use lateral views to add more data to exploded outputs. Course labs use the Beeline client; the instructor's Beeline terminal runs on the master node of a Hadoop cluster provisioned on Google Cloud Platform using its Dataproc service, and learners are assumed to have access to a Hadoop cluster and Beeline, on-premises or in the cloud. Finally, learners observe how to use views to aggregate contents of multiple columns. As the course concludes, you should be comfortable working with all types of data in Hive and performing analysis tasks on tables with both primitive types and complex data.

12 videos | 1h 12m Assessment Badge
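
As a rough illustration of the explode and lateral view operations mentioned in the course description above, the following Python sketch mimics what they do to a table with an array column. It is not Hive code and is not taken from the course; the rows, column names, and function names are all hypothetical. It assumes the plain (non-OUTER) lateral view behavior, in which rows whose array is empty produce no output.

```python
from typing import Iterator

# Hypothetical rows standing in for a Hive table with columns
# (name STRING, skills ARRAY<STRING>).
rows = [
    {"name": "ana", "skills": ["sql", "hive"]},
    {"name": "raj", "skills": ["spark"]},
    {"name": "kim", "skills": []},
]


def explode(values: list) -> Iterator[str]:
    """Yield one output row per array element."""
    yield from values


def lateral_view_explode(table: list, array_col: str, alias: str) -> list:
    """Pair each exploded element with the other columns of its source row."""
    out = []
    for row in table:
        for element in explode(row[array_col]):
            # Copy every column except the array, then add the element.
            new_row = {k: v for k, v in row.items() if k != array_col}
            new_row[alias] = element
            out.append(new_row)
    return out


result = lateral_view_explode(rows, "skills", "skill")
for r in result:
    print(r)
```

The row for "kim" disappears because its array is empty, which is why an outer variant of the lateral view exists for cases where such rows must be kept.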

FREE ACCESS

COURSES INCLUDED

Modern Data Warehouses

In today's world, data warehouses have become necessary for making informed business decisions. The wide availability of data comes at an increased cost of storing it efficiently - a necessity for any business working with large amounts of data. Learn more about the key concepts, architecture, stages, use cases, and available solutions for data warehouses using this course. You will examine data warehousing solutions, architecture, and techniques, discover Amazon Redshift and Google BigQuery, and explore the concepts, such as batch, stream, and real-time analytics. This course will also help highlight the considerations for implementing a data warehouse for a business and the implementation steps and best practices required. After completing this course, you will have a foundational knowledge of implementing a data warehousing solution for your business.

12 videos | 1h 4m Assessment Badge

Azure Databricks & Data Pipelines

Azure Databricks is a data analytics platform optimized to work with Microsoft Azure cloud services and is an example of a cloud platform designed to serve business analytics needs. Use this course to explore the architecture, features, advantages, and disadvantages of Azure Databricks - a leading cloud-based tool used for data engineering, and Snowflake - a data warehouse-as-a-service. Examine different types of data pipelines and their components and advantages. You will also compare various data pipeline tools and learn more about building a data pipeline through a case study. Upon finishing this course, you will be able to recognize the capabilities of different data warehouses and the steps required for building data pipelines.

12 videos | 56m Assessment Badge

FREE ACCESS

COURSES INCLUDED

Data Warehousing with Azure: Architecture & Modeling Techniques

Explore the fundamentals of data warehousing and the essential architectures and components being implemented to manage data.

15 videos | 1h 17m Assessment Badge

Data Warehousing with Azure: Implementing Azure SQL Data Warehouse

Explore the practical implementation of Azure SQL Data Warehouse. Examine how to design, model, and apply ELT approaches of extracting, loading, and transforming data.

11 videos | 1h 7m Assessment Badge

Data Warehousing with Azure: Working with SQL Data Warehouse Objects

Explore how to create and utilize SQL Data Warehouse objects and work with T-SQL to implement tables of diversified categories.

17 videos | 1h 26m Assessment Badge

Data Warehousing with Azure: Analytics & Reporting

Discover how to use Azure Analysis Services and Power BI to prepare reports that can be used to analyze data in SQL Data Warehouse and Azure Data Lake.

10 videos | 44m Assessment Badge

Data Warehousing with Azure: Data Lake Implementation Using Azure

Explore the fundamentals of data lakes and approaches for building and using data lakes. How to build and use an Azure Data Lake using Gen1 and Gen2 implementation approaches is also covered.

13 videos | 58m Assessment Badge

Data Warehousing with Azure: Managing Azure Data Lake

Explore the advanced features of Azure Data Lakes with additional focus on managing various scenarios of data ingestion. Securing and tuning an Azure Data Lake for performance enhancement is also covered.

13 videos | 59m Assessment Badge

FREE ACCESS

COURSES INCLUDED

Optimizing Query Executions with Hive

In this 7-video Skillsoft Aspire course, learners can explore optimizations allowing Apache Hive to handle parallel processing of data, while users can still contribute to improving query performance. For this course, learners should have previous experience with Hive and familiarity with querying big data for analysis purposes. The course focuses only on concepts; no queries are run. Learners start by exploring the different options available in Hive to query data in an optimal manner. Discuss how to split data into smaller chunks, specifically through partitioning and bucketing, so that queries need not scan full data sets each time. Hive truly democratizes access to data stored in a Hadoop cluster, eliminating the need to know MapReduce to process cluster data, and makes data accessible using the Hive query language. All files in Hadoop are exposed in the form of tables. Learn about structuring queries to reduce the number of MapReduce operations generated by Hive and to speed up query executions. Other concepts covered include partitioning, bucketing, and joins.

7 videos | 42m Assessment Badge

Using Hive to Optimize Query Executions with Partitioning

Continue to explore the versatility of Apache Hive, among today's most popular data warehouses, in this 10-video Skillsoft Aspire course. Learners are shown ways to optimize query executions, including the powerful technique of partitioning data sets. The hands-on course assumes previous work with Hive tables using the Hive query language and in processing complex data types, along with theoretical understanding of improving query performance by partitioning very large data sets. Demonstrations focus on basics of partitioning and how to create partitions and load data into them. Learners work with both Hive-managed tables and external tables to see how partitioning works for each; then watch how to navigate to the shell of the Hadoop master node and create new directories in the Hadoop file system. Observe dynamic partitioning of tables and how this simplifies loading of data into partitions. Finally, you explore how multiple columns in a table can be used to partition the data within it. During this course, learners will acquire a sound understanding of how large data sets can be partitioned into smaller chunks, improving query performance.

10 videos | 1h Assessment Badge
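
To make the idea behind partitioning concrete, here is a minimal Python sketch, not Hive code and not from the course, of why a filter on the partition column lets a query skip most of the data. The sample rows, the `category=<value>` key layout, and the function names are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows: (product, category, price).
# "category" plays the role of the partition column.
rows = [
    ("laptop", "electronics", 900),
    ("novel", "books", 12),
    ("phone", "electronics", 600),
    ("atlas", "books", 30),
    ("sofa", "furniture", 450),
]


def write_partitioned(rows: list) -> dict:
    """Group rows by partition value, like one directory per category."""
    partitions = defaultdict(list)
    for product, category, price in rows:
        partitions[f"category={category}"].append((product, price))
    return dict(partitions)


def query_total(partitions: dict, category: str) -> tuple:
    """Sum prices for one category, scanning only the matching partition."""
    scanned = partitions.get(f"category={category}", [])
    return sum(price for _, price in scanned), len(scanned)


partitions = write_partitioned(rows)
total, rows_scanned = query_total(partitions, "books")
print(sorted(partitions))   # one key per distinct category
print(total, rows_scanned)  # only the "books" rows were read
```

The query touches two of the five rows because the other partitions are never opened; on very large data sets that skipped work is the performance gain.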

Bucketing & Window Functions with Hive

Learners explore how Apache Hive query executions can be optimized, including techniques such as bucketing data sets, in this Skillsoft Aspire course. Using windowing functions to extract meaningful insights from data is also covered. This 10-video course assumes previous work with partitions in Hive, as well as conceptual understanding of how buckets can improve query performance. Learners begin by focusing on how to use the bucketing technique to process big data efficiently. Then take a look at HDFS (Hadoop Distributed File System) by navigating to the shell of the Hadoop master node; from there, use the hadoop fs -ls command to examine the contents of the directory. Observe three subdirectories corresponding to three partitions based on the value of the category column. You will then explore how to combine the partitioning and bucketing techniques to further improve query performance. Finally, learners will explore the concept of windowing, which helps users analyze a subset of ordered data, and then see how this technique can be implemented in Hive.

9 videos | 1h 3m Assessment Badge
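
The two techniques named in the course description above can be sketched in a few lines of Python. This is an illustration only, not Hive code: the bucket count, the sample rows, and the function names are hypothetical, and CRC32 is used as a stand-in for whatever hash function Hive applies to the bucketing column.

```python
import zlib
from itertools import groupby

NUM_BUCKETS = 4  # assumed bucket count for this sketch


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Bucketing: deterministic hash of the column value, modulo bucket count."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def row_number_over(rows: list, partition_key: str, order_key: str) -> list:
    """Windowing: number rows within each partition, highest order_key first."""
    ordered = sorted(rows, key=lambda r: (r[partition_key], -r[order_key]))
    out = []
    for _, group in groupby(ordered, key=lambda r: r[partition_key]):
        for n, row in enumerate(group, start=1):
            out.append({**row, "row_number": n})
    return out


# Hypothetical sales rows.
sales = [
    {"product": "laptop", "category": "electronics", "price": 900},
    {"product": "phone", "category": "electronics", "price": 600},
    {"product": "novel", "category": "books", "price": 12},
    {"product": "atlas", "category": "books", "price": 30},
]

ranked = row_number_over(sales, "category", "price")
for r in ranked:
    print(r["category"], r["product"], r["row_number"], bucket_for(r["product"]))
```

Because the same key always hashes to the same bucket, rows with equal keys end up in the same file, which is what makes bucketed joins and sampling cheaper. The window function, in contrast, leaves every row in place and only adds a per-partition ordering column.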

FREE ACCESS

EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE COURSES

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.

BOOKS INCLUDED

Book

Scalable Big Data Architecture: A Practitioner's Guide to Choosing Relevant Big Data Architecture

Covering real-world, concrete industry use cases, this book is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a big data project and which tools to integrate into that pattern.

1h 51m By Bahaaldine Azarmi

Book

Data Architecture: A Primer for the Data Scientist: Big Data, Data Warehouse and Data Vault

Drawing upon years of practical experience and using numerous examples and an easy to understand framework, this timely guide defines the importance of data architecture and how it can be used effectively to harness big data within existing systems.

4h 27m By Daniel Linstedt, W.H. Inmon

FREE ACCESS

BOOKS INCLUDED

Book

The Kimball Group Reader: Relentlessly Practical Tools for Data Warehousing and Business Intelligence Remastered Collection, Second Edition

Organized for quick navigation and easy reference, this vital resource is the essential reference for data warehouse and business intelligence design, packed with best practices, design tips, and valuable insight from industry pioneer Ralph Kimball and the Kimball Group.

21h 34m By Margy Ross, Ralph Kimball

Book

Data Mining and Data Warehousing, BPB Publications (c) 2015

Providing comprehensive coverage of various aspects of data mining and warehousing concepts, this book offers examples, diagrams, and questions in a simple language, a crystal clear approach, and a straightforward comprehensible presentation.

2h 2m By Akash Saxena, Khushboo Saxena, Sandeep Saxena

Book

Enterprise Business Intelligence and Data Warehousing: Program Management Essentials

Covering best practices for managing and leading an enterprise-scale business intelligence (BI) and data warehousing (DW) program, this essential book describes what the Enterprise Program Manager must accomplish to orchestrate the many moving parts involved.

1h 22m By Alan Simon

Book

Data Warehousing in the Age of Big Data

Helping you navigate through the complex layers of Big Data and data warehousing, this practical and timely book provides information on how to effectively think about using all of the technologies and architectures to design the next-generation data warehouse.

6h 39m By Krish Krishnan

FREE ACCESS

BOOKS INCLUDED

Book

Hadoop Architecture and SQL: The Best HiveQL Book in the Universe

Including hundreds of pages of SQL examples and explanations, this book is perfect for anyone who wants to query Hadoop with SQL and educates readers on how to create tables, how the data is distributed, and how the system processes the data.

1h 32m By Jason Nolander, Tom Coffing

Book

Practical Hive: A Guide to Hadoop's Data Warehouse System

From deploying Hive on your hardware or virtual machine and setting up its initial configuration to learning how Hive interacts with Hadoop, MapReduce, Tez and other big data technologies, this go-to resource gives you a detailed treatment of the software.

3h 57m By Andreas François Vermeulen, Ankur Gupta, David Kjerrumgaard, Scott Shaw

Book

Apache Hive: 34 Most Asked Questions On Apache Hive

Offering a thorough view of key knowledge and detailed insight, this all-embracing guide provides comprehensive answers and extensive details and references for everything you want to know about Apache Hive.

35m By Jacqueline Douglas

FREE ACCESS

BOOKS INCLUDED

Book

Microsoft Azure SQL Data Warehouse: Architecture and SQL (Book 18)

Including numerous SQL examples and explanations, this book details the architecture of the Azure SQL Data Warehouse and the SQL commands available, and educates readers on how to create tables and indexes, how the data is distributed, and how the system processes the data.

3h 22m By Todd Wilson, Tom Coffing

Book

Data Warehouse Systems: Design and Implementation

With extensive coverage of all data warehouse issues, ranging from basic technologies to the most recent findings and systems, this book illustrates the concepts with an ongoing example based on the Northwind database using Microsoft Analysis Services and Pentaho Business Analytics.

11h 14m By Alejandro Vaisman, Esteban Zimányi

FREE ACCESS

BOOKS INCLUDED

Book

Hadoop Architecture and SQL: The Best HiveQL Book in the Universe

1h 32m By Jason Nolander, Tom Coffing

Book

Practical Hive: A Guide to Hadoop's Data Warehouse System

3h 57m By Andreas François Vermeulen, Ankur Gupta, David Kjerrumgaard, Scott Shaw

Book

Apache Hive: 34 Most Asked Questions On Apache Hive

35m By Jacqueline Douglas

FREE ACCESS

SKILL BENCHMARKS INCLUDED

Data Warehousing Competency (Intermediate Level)

The Data Warehousing Competency (Intermediate Level) benchmark assesses your recognition of core data warehousing concepts. You will be evaluated on your skills in recognizing high-level elements of data warehousing, architectures, and techniques. Learners who score high on this benchmark demonstrate that they have a solid understanding of intermediate-level data warehousing techniques.

20m | 12 questions

FREE ACCESS

