Course details

Hadoop HDFS: Introduction to the Shell

Overview/Description

HDFS, the Hadoop Distributed File System, is a file system widely used in data science that enables parallel processing of big data across a distributed cluster. In this Skillsoft Aspire course, you will discover how to set up a Hadoop cluster on the cloud and explore the bundled web apps: the YARN Cluster Manager app and the HDFS NameNode UI. You will then use the hadoop fs and hdfs dfs shells to browse the Hadoop file system.
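
As a rough illustration of how closely the two shells mirror each other and familiar Linux commands, the following Python sketch builds the equivalent hadoop fs and hdfs dfs command lines for a few common operations. It only prints the commands and does not run them, so no cluster is needed. The helper name hdfs_commands, the mapping table, and the example paths are assumptions made for this sketch and are not taken from the course material.

```python
import shlex

# Linux command -> HDFS shell flag (a small, illustrative subset).
LINUX_TO_HDFS = {
    "ls": "-ls",
    "mkdir": "-mkdir",
    "cat": "-cat",
    "rm": "-rm",
}


def hdfs_commands(linux_cmd, *paths):
    """Return the `hadoop fs` and `hdfs dfs` argv lists for a Linux-style command.

    Hypothetical helper for illustration: it builds the command lines
    but never executes them.
    """
    flag = LINUX_TO_HDFS[linux_cmd]
    return (
        ["hadoop", "fs", flag, *paths],
        ["hdfs", "dfs", flag, *paths],
    )


if __name__ == "__main__":
    # The paths below are made-up examples.
    examples = [
        ("ls", "/"),
        ("mkdir", "/user/demo"),
        ("cat", "/user/demo/notes.txt"),
    ]
    for cmd, path in examples:
        for argv in hdfs_commands(cmd, path):
            print(shlex.join(argv))
```

Both forms accept the same flags. hadoop fs is the generic file system shell, while hdfs dfs targets HDFS specifically, so on a cluster whose default file system is HDFS the two commands in each pair should behave the same way.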



Expected Duration (hours)
0.9

Lesson Objectives

  • Course Overview
  • provision a Hadoop cluster on the cloud using the Google Cloud Platform's Dataproc service
  • identify the various GCP services used by Dataproc when provisioning a cluster
  • list the metrics available on the YARN Cluster Manager app and recognize how it can be useful to monitor job executions
  • recall the details and metrics of HDFS available on the NameNode web app and how it can be used to browse the file system
  • identify the tools of the Hadoop ecosystem which are packaged with Hadoop and recall how they can be accessed
  • configure HDFS using the hdfs-site.xml file and identify the properties which can be set from it
  • compare the hadoop fs and hdfs dfs shells and recognize their similarities to Linux shells
  • explore apps for Hadoop, configure HDFS, and work with HDFS shells

Course Number
it_dshdfsdj_02_enus

Expertise Level
Beginner