Hadoop & MapReduce Getting Started
Apache Hadoop | Beginner
- 8 videos | 1h 3m 29s
- Includes Assessment
- Earns a Badge
In this course, learners will explore the theory behind big data analysis using Hadoop and see how MapReduce enables parallel processing of large data sets distributed across a cluster of machines. Begin with an introduction to big data and the various sources and characteristics of data available today. Look at the challenges involved in processing big data and the options available to address them. Next, get a brief overview of Hadoop, its role in processing big data, and the functions of its components: the Hadoop Distributed File System (HDFS), MapReduce, and YARN (Yet Another Resource Negotiator). Explore how Hadoop's MapReduce framework processes data in parallel on a cluster of machines. Recall the steps involved in building a MapReduce application and the specifics of the Map phase, which processes each row of the input file's data. Recognize the functions of the Shuffle and Reduce phases in sorting and interpreting the output of the Map phase to produce a meaningful result. To conclude, complete an exercise on the fundamentals of Hadoop and MapReduce.
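The Map, Shuffle, and Reduce phases described above can be illustrated with a minimal sketch. It assumes a word-count job, which is a common teaching example and not necessarily the one used in the course. It runs in a single Python process rather than on a Hadoop cluster, and the function names (map_phase, shuffle_phase, reduce_phase, run_job) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input row."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values by key and return the groups sorted by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run the three phases in sequence (assumption: one process, no cluster)."""
    # On a real cluster, each row could be mapped on a different machine.
    mapped = [pair for line in lines for pair in map_phase(line)]
    return [reduce_phase(key, values) for key, values in shuffle_phase(mapped)]


if __name__ == "__main__":
    rows = ["big data big cluster", "data on a cluster"]
    for word, count in run_job(rows):
        print(word, count)
```

In this sketch each row is mapped independently of the others, which is the property that lets a framework spread the Map phase across machines. The Shuffle step is what guarantees that every value for a given key reaches the same Reduce call.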
WHAT YOU WILL LEARN
- Describe what big data is and list the various sources and characteristics of data available today
- Recognize the challenges involved in processing big data and the options available to address them, such as vertical and horizontal scaling
- Specify the role of Hadoop in processing big data and describe the function of its components, such as HDFS, MapReduce, and YARN
- Identify the purpose and describe the workings of Hadoop's MapReduce framework to process data in parallel on a cluster of machines
- Recall the steps involved in building a MapReduce application and the specific workings of the Map phase in processing each row of data in the input file
- Recognize the functions of the Shuffle and Reduce phases in sorting and interpreting the output of the Map phase to produce a meaningful output
- Recognize the techniques related to scaling data processing tasks, working with clusters, and MapReduce, and identify the Hadoop components and their functions
IN THIS COURSE
- 2m 53s
- 8m 27s | After completing this video, you will be able to describe what big data is and list the various sources and characteristics of data available today.
- 9m 16s | Upon completion of this video, you will be able to recognize the challenges involved in processing big data and the options available to address them, such as vertical and horizontal scaling.
- 9m 28s | Upon completion of this video, you will be able to specify the role of Hadoop in processing big data and describe the function of its components, such as HDFS, MapReduce, and YARN.
- 9m 17s | During this video, you will learn how to identify the purpose and describe the workings of Hadoop's MapReduce framework, which processes data in parallel on a cluster of machines.
- 8m 24s | Upon completion of this video, you will be able to recall the steps involved in building a MapReduce application and the specific workings of the Map phase in processing each row of data in the input file.
- 7m 16s | After completing this video, you will be able to recognize the functions of the Shuffle and Reduce phases in sorting and interpreting the output of the Map phase to produce a meaningful output.
- 8m 28s | After completing this video, you will be able to recognize the techniques related to scaling data processing tasks, working with clusters, and MapReduce, and identify the Hadoop components and their functions.
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.
Digital badges are yours to keep, forever.