Advanced Operations Using Hadoop MapReduce
Apache Hadoop 2.9 | Intermediate
- 9 videos | 48m 16s
- Includes Assessment
- Earns a Badge
In this Skillsoft Aspire course, explore how MapReduce can be used to extract the five most expensive vehicles in a dataset and then to build an inverted index for the words appearing in a set of text files. Begin by defining a vehicle type that represents automobiles and can be stored in a Java PriorityQueue, then configure a Mapper to use a PriorityQueue to hold the five most expensive automobiles it has processed from the dataset. Learn how to use a PriorityQueue in the Reducer to receive the five most expensive automobiles from each Mapper and write the top five overall to the output, then execute the application to verify the results. Next, explore how to use the MapReduce framework to generate an inverted index: define the Mapper, configure the Reducer and Driver, then run the application and examine the inverted index on HDFS (Hadoop Distributed File System). The course concludes with an exercise on advanced operations using MapReduce.
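The top-five pattern described above can be sketched in plain Java, without the Hadoop API. This is a minimal illustration under stated assumptions, not the course's code: the `Vehicle` class, its `name` and `price` fields, and the `TopNSketch` helper methods are all hypothetical. Each "mapper" keeps a bounded min-heap for its own split, and the "reducer" applies the same step to the combined candidates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopNSketch {
    // Hypothetical vehicle type; the field names are assumptions, not taken from the course.
    static final class Vehicle implements Comparable<Vehicle> {
        final String name;
        final double price;

        Vehicle(String name, double price) {
            this.name = name;
            this.price = price;
        }

        // Natural ordering by price, so a PriorityQueue keeps the cheapest vehicle at its head.
        @Override
        public int compareTo(Vehicle other) {
            return Double.compare(price, other.price);
        }
    }

    // Keeps the n most expensive vehicles: whenever the heap grows past n,
    // the cheapest retained vehicle is evicted.
    static List<Vehicle> topN(Iterable<Vehicle> vehicles, int n) {
        PriorityQueue<Vehicle> heap = new PriorityQueue<>();
        for (Vehicle v : vehicles) {
            heap.add(v);
            if (heap.size() > n) {
                heap.poll();
            }
        }
        List<Vehicle> result = new ArrayList<>(heap);
        result.sort(Comparator.reverseOrder()); // most expensive first
        return result;
    }

    // Stand-in for the two phases: each split yields a local top n ("mapper"),
    // then the same step runs over all local results ("reducer").
    static List<Vehicle> topNAcrossSplits(List<List<Vehicle>> splits, int n) {
        List<Vehicle> candidates = new ArrayList<>();
        for (List<Vehicle> split : splits) {
            candidates.addAll(topN(split, n));
        }
        return topN(candidates, n);
    }

    public static void main(String[] args) {
        List<List<Vehicle>> splits = Arrays.asList(
            Arrays.asList(new Vehicle("A", 10), new Vehicle("B", 50), new Vehicle("C", 30)),
            Arrays.asList(new Vehicle("D", 70), new Vehicle("E", 20), new Vehicle("F", 60)));
        for (Vehicle v : topNAcrossSplits(splits, 2)) {
            System.out.println(v.name + " " + v.price);
        }
    }
}
```

Because each split forwards at most n records, the final step never sees more than n records per split, which is why the bounded queue suits this kind of job.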
WHAT YOU WILL LEARN
- define a vehicle type that can be used to represent automobiles to be stored in a Java PriorityQueue
- configure a Mapper to use a PriorityQueue to store the five most expensive vehicles it has processed from the dataset
- use a PriorityQueue in the Reducer of the application to receive the five most expensive automobiles from each mapper and write the top five vehicles overall to the output
- execute the application and examine the output on HDFS to confirm that the five most expensive automobiles have been written out
- define the Mapper for a MapReduce application to build an inverted index from a set of text files
- configure the Reducer and the Driver for the inverted index application
- run the application and examine the inverted index on HDFS
- recognize the data structures and configurations involved when extracting the top N values from a data set
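The inverted index objectives above can be illustrated with a small plain-Java sketch that imitates the map and reduce steps in memory. It does not use the Hadoop API, and the class name `InvertedIndexSketch`, its methods, and the tokenization rule (lowercasing and splitting on non-word characters) are assumptions made for illustration rather than the course's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class InvertedIndexSketch {
    // "Map" step: emit one (word, fileName) pair for every word in a file.
    static List<String[]> map(String fileName, String contents) {
        List<String[]> pairs = new ArrayList<>();
        for (String token : contents.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) { // split can yield an empty leading token
                pairs.add(new String[] {token, fileName});
            }
        }
        return pairs;
    }

    // "Reduce" step: group the pairs by word and collect the distinct file names.
    static Map<String, TreeSet<String>> reduce(List<String[]> pairs) {
        Map<String, TreeSet<String>> index = new TreeMap<>();
        for (String[] pair : pairs) {
            index.computeIfAbsent(pair[0], k -> new TreeSet<>()).add(pair[1]);
        }
        return index;
    }

    // Stand-in for the driver: run the map step per file, then reduce everything.
    static Map<String, TreeSet<String>> build(Map<String, String> files) {
        List<String[]> pairs = new ArrayList<>();
        for (Map.Entry<String, String> file : files.entrySet()) {
            pairs.addAll(map(file.getKey(), file.getValue()));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.txt", "Hadoop maps data");
        files.put("b.txt", "Hadoop reduces data!");
        build(files).forEach((word, names) -> System.out.println(word + " -> " + names));
    }
}
```

In an actual MapReduce job the grouping by word is done by the framework's shuffle between the two phases; here `reduce` does the grouping itself to keep the sketch self-contained.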
IN THIS COURSE
- 1. Course Overview (2m 30s)
- 2. Defining a User-Defined Type for a PriorityQueue (6m 45s)
- 3. Implementing a PriorityQueue in a Mapper (5m 31s)
- 4. Using a PriorityQueue in a Reducer (6m 29s)
- 5. Running and Verifying the Results (5m 7s)
- 6. Building an Inverted Index - Map Phase (6m 5s)
- 7. Building an Inverted Index - Reduce Phase (5m 31s)
- 8. Executing the Application and Viewing the Index (5m 1s)
- 9. Exercise: Advanced Operations Using MapReduce (5m 17s)
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. The badge can be shared on any social network or business platform.
Digital badges are yours to keep, forever.