Using Hive & Pig with Hadoop
Apache Hadoop 2.0
7 Videos | 35m 57s
- Includes Assessment
- Earns a Badge
There are components other than MapReduce that let you write code to process large data sets stored in Hadoop. Let's see how to work with two such components: Hive and Pig.
WHAT YOU WILL LEARN
- Understand the basics of Apache Hive and HiveQL, describe how HiveQL is similar to ANSI SQL and how it can be used to select data, and understand how HiveQL is implicitly transformed into MapReduce jobs.
- Understand the use of the four file formats supported in Hive: TEXTFILE, SEQUENCEFILE, ORC, and RCFILE. Demonstrate and be able to describe each of the four.
- Understand how to use custom Hive data types such as Arrays and Maps to write custom Hive jobs, and learn Hive DDL commands.
- Understand Pig and how it is used, demonstrate how to use Pig Latin like SQL to obtain data, and understand how to use Pig as a component to build complex and large MapReduce applications.
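To give a flavor of the Hive topics above, here is a minimal HiveQL sketch; the table and column names are hypothetical, chosen only for illustration:

```sql
-- Create a table stored as ORC (one of Hive's four supported file formats),
-- using the Array and Map collection types covered in this course
CREATE TABLE IF NOT EXISTS page_views (
  user_id   STRING,
  url       STRING,
  tags      ARRAY<STRING>,          -- custom Hive data type: Array
  props     MAP<STRING, STRING>     -- custom Hive data type: Map
)
STORED AS ORC;

-- An ANSI-SQL-style SELECT; Hive implicitly transforms this
-- into one or more MapReduce jobs behind the scenes
SELECT url, COUNT(*) AS hits
FROM page_views
GROUP BY url
ORDER BY hits DESC
LIMIT 10;
```

The point to notice is that nothing in the query mentions MapReduce: the translation into map and reduce stages happens inside Hive's query planner.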
- Learn how to write Pig scripts, and understand the Pig modes: Local, MapReduce, and Batch.
- Learn Pig commands such as LOAD, LIMIT, DUMP, and STORE for data read/write operations in Pig Latin, and understand Grunt commands used for DDL.
- Compare and contrast the internals and performance of MapReduce, Hive, and Pig, and understand the strengths and weaknesses of the three.
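The Pig Latin operators named above can be sketched as follows; the HDFS paths and field names are hypothetical:

```pig
-- LOAD: read tab-delimited records from HDFS with a declared schema
views = LOAD '/data/page_views.tsv'
        USING PigStorage('\t')
        AS (user_id:chararray, url:chararray, hits:int);

-- LIMIT: keep only the first 100 records
sample = LIMIT views 100;

-- DUMP: print the relation to the console (handy in the Grunt shell)
DUMP sample;

-- STORE: write the relation back to HDFS
STORE sample INTO '/data/page_views_sample' USING PigStorage('\t');
```

Run interactively, each statement can be typed at the Grunt prompt (Local or MapReduce mode); saved to a file, the same statements form a Pig script that can be executed in Batch mode.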
IN THIS COURSE
1. Using HiveQL to Write Queries | 3m 51s | UP NEXT
2. Understanding Hive File Formats | 8m 32s
3. Working with Custom Hive Data Types | 3m 47s
4. Using Pig Latin to Communicate with Hadoop | 3m 32s
5. Writing Pig Scripts | 6m 31s
6. Loading and Storing Data in Pig | 3m 13s
7. Comparing Performance: MapReduce, Hive, and Pig | 3m 31s
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of this course, which can be shared on any social network or business platform. Digital badges are yours to keep, forever.