Hadoop & MapReduce Getting Started

Apache Hadoop | Beginner

8 videos | 1h 3m 29s
Includes Assessment
Earns a Badge

From Channel:

Apache Hadoop

From Journey:

Data Analyst to Data Scientist

In this course, learners will explore the theory behind big data analysis using Hadoop and how MapReduce enables parallel processing of large data sets distributed across a cluster of machines. Begin with an introduction to big data and the various sources and characteristics of data available today. Look at the challenges involved in processing big data and the options available to address them. Next, get a brief overview of Hadoop, its role in processing big data, and the functions of its components, such as the Hadoop Distributed File System (HDFS), MapReduce, and YARN (Yet Another Resource Negotiator). Explore how Hadoop's MapReduce framework processes data in parallel on a cluster of machines. Recall the steps involved in building a MapReduce application and the specifics of the Map phase in processing each row of the input file's data. Recognize the functions of the Shuffle and Reduce phases in sorting and interpreting the output of the Map phase to produce a meaningful output. To conclude, complete an exercise on the fundamentals of Hadoop and MapReduce.
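
The Map, Shuffle, and Reduce phases mentioned in the description can be pictured with a small in-memory sketch. This is plain Python rather than Hadoop's own API: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count`, along with the sample input lines, are assumptions made for illustration. A real MapReduce job would run these steps across many machines instead of in one process.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: turn one input row into (key, value) pairs, here (word, 1)."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values by key and return the groups sorted by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of rows."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    shuffled = shuffle_phase(mapped)
    return [reduce_phase(key, values) for key, values in shuffled]


if __name__ == "__main__":
    # Hypothetical sample rows standing in for lines of an input file.
    sample = ["big data on a cluster", "a cluster of machines", "big cluster"]
    for word, count in word_count(sample):
        print(word, count)
```

Each call to `map_phase` sees only one row, which is what would let the Map step run in parallel on different machines. Only the shuffle step needs to bring together values that share a key.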

WHAT YOU WILL LEARN

Describe what big data is and list the various sources and characteristics of data available today

Recognize the challenges involved in processing big data and the options available to address them, such as vertical and horizontal scaling

Specify the role of Hadoop in processing big data and describe the function of its components such as HDFS, MapReduce, and YARN

Identify the purpose and describe the workings of Hadoop's MapReduce framework to process data in parallel on a cluster of machines

Recall the steps involved in building a MapReduce application and the specific workings of the Map phase in processing each row of data in the input file

Recognize the functions of the Shuffle and Reduce phases in sorting and interpreting the output of the Map phase to produce a meaningful output

Recognize the techniques related to scaling data processing tasks, working with clusters, and MapReduce, and identify the Hadoop components and their functions

IN THIS COURSE

2m 53s

FREE ACCESS
8m 27s

After completing this video, you will be able to describe what big data is and list the various sources and characteristics of data available today. FREE ACCESS
3. Building Systems to Scale with Data

9m 16s

Upon completion of this video, you will be able to recognize the challenges involved in processing big data and the options available to address them, such as vertical and horizontal scaling. FREE ACCESS
4. A Quick Overview of Hadoop

9m 28s

Upon completion of this video, you will be able to specify the role of Hadoop in processing big data and describe the function of its components, such as HDFS, MapReduce, and YARN. FREE ACCESS
5. MapReduce Overview

9m 17s

During this video, you will learn to identify the purpose and describe the workings of Hadoop's MapReduce framework, which processes data in parallel on a cluster of machines. FREE ACCESS
6. The Map Phase of a MapReduce

8m 24s

Upon completion of this video, you will be able to recall the steps involved in building a MapReduce application and the specific workings of the Map phase in processing each row of data in the input file. FREE ACCESS
7. The Shuffle and Reduce Phases

7m 16s

After completing this video, you will be able to recognize the functions of the Shuffle and Reduce phases in sorting and interpreting the output of the Map phase to produce a meaningful output. FREE ACCESS
8. Exercise: Fundamentals of Hadoop and MapReduce

8m 28s

After completing this video, you will be able to recognize the techniques related to scaling data processing tasks, working with clusters, and MapReduce, and identify the Hadoop components and their functions. FREE ACCESS

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
