Data Refinery with MapReduce

Apache Hadoop 2.0    |    Intermediate
  • 13 videos | 54m 55s
  • Earns a Badge
MapReduce is a set of classes that abstract away the complexity of parallel processing. Learn how MapReduce can take a single compute job and run it on our supercomputing platform.

WHAT YOU WILL LEARN

  • define the principal concepts of key-value pairs and list the rules for key-value pairs
  • describe how MapReduce transforms key-value pairs
  • load a large textbook and then run WordCount to count the number of words in it
  • label all of the functions for MapReduce on a diagram
  • match the phases of MapReduce to their definitions
  • set up the classpath and test WordCount
  • build a JAR file and run WordCount
  • describe the base Mapper class of the MapReduce Java API and describe how to override its methods
  • describe the base Reducer class of the MapReduce Java API and describe how to override its methods
  • describe the function of the MapReduceDriver Java class
  • set up the classpath and test a MapReduce job
  • identify the concept of streaming for MapReduce
  • stream a Python job
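
As a rough preview of the key-value transformation these objectives refer to, the sketch below imitates the map, shuffle, and reduce phases of a word count in a single Python process. It is an illustration only, not Hadoop code: the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical, and a real job would run the phases in parallel across a cluster rather than in one loop.

```python
from collections import defaultdict


def map_words(_offset, line):
    """Map phase: emit one (word, 1) pair for every word in the input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group together all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: collapse the list of values for one key into a total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence (hypothetical single-process driver)."""
    mapped = (
        pair
        for offset, line in enumerate(lines)
        for pair in map_words(offset, line)
    )
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog the end"]))
```

Each input line enters the map phase as a (line offset, text) pair and leaves as a series of (word, 1) pairs; the reduce phase then sees one key with all of its values and produces a single (word, total) pair.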


EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses. The badge can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
