Apache Hadoop: Apache Hadoop 2.7 intermediate
Apache Hadoop is an open source framework for the storage and processing of big data. Come explore the ins and outs of Hadoop.
GETTING STARTED
Managing Big Data Using HDInsight Hadoop
- 6m 37s
- 5m 25s
COURSES INCLUDED
Fundamentals & Installation
Apache Hadoop is a framework for distributed storage and distributed processing of very large data sets. Get started with Hadoop by learning about big data, and how to install and use Hadoop.
12 videos |
40m
Assessment
Badge
Storage & MapReduce
MapReduce is a framework for writing applications to process huge amounts of data. Let's look at Hadoop storage, MapReduce, and how to use MapReduce with associated development tools.
11 videos |
46m
Assessment
Badge
Programming with MapReduce
You must have a good understanding of MapReduce to be able to program with it. Here we look at MapReduce in detail, and demonstrate the basics of programming in MapReduce.
16 videos |
1h 6m
Assessment
Badge
Using Hive & Pig with Hadoop
There are components other than MapReduce that let you write code to process large data sets stored in Hadoop. Let's see how to work with two such components - Hive and Pig.
7 videos |
32m
Assessment
Badge
Introduction to Hadoop
Hadoop is an open-source, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. Explore Hadoop, its key tools, and applications.
10 videos |
41m
Assessment
Badge
Ecosystem & MapReduce
Hadoop is a framework providing for distributed storage and processing of large data sets. Explore the Hadoop ecosystem and Java MapReduce.
10 videos |
38m
Assessment
Badge
Introduction to Data Modeling
Discover various data genres and management tools, explore why new big data platforms keep emerging from the perspective of big data management systems, and survey the available analytical tools.
15 videos |
1h 1m
Assessment
Badge
COURSES INCLUDED
Hadoop HDFS Getting Started
Explore the concepts of analyzing large data sets in this 12-video Skillsoft Aspire course, which deals with Hadoop and the Hadoop Distributed File System (HDFS), the storage layer that enables big data to be processed efficiently and in parallel across a distributed cluster. The course assumes a conceptual understanding of Hadoop and its components; purely theoretical, it contains no labs, providing just enough information to understand how Hadoop and HDFS allow big data to be processed in parallel. The course opens by explaining the ideas of vertical and horizontal scaling, then discusses the functions Hadoop serves to horizontally scale data processing tasks. Learners explore the functions of YARN, MapReduce, and HDFS, covering how HDFS keeps track of where all the pieces of large files are distributed, how data is replicated, and how HDFS is used with ZooKeeper: a tool maintained by the Apache Software Foundation that provides coordination and synchronization in distributed systems, along with other services related to distributed computing, such as a naming service and configuration management. Finally, learn about Spark, a data analytics engine for distributed data processing. An illustrative Java sketch of how HDFS reports block locations and replication follows this entry.
12 videos |
1h 14m
Assessment
Badge
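As a companion to the HDFS concepts above, here is a minimal Java sketch (not part of the course) showing how a client can ask HDFS for a file's replication factor and for the DataNodes holding each of its blocks through Hadoop's FileSystem API. The file path is hypothetical, and the program assumes a Hadoop client configured to point at the cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockReport {
    public static void main(String[] args) throws Exception {
        // Connects to the default file system, which is HDFS when core-site.xml points at the cluster.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path; substitute any large file already stored on HDFS.
        Path file = new Path("/data/logs/large-file.log");
        FileStatus status = fs.getFileStatus(file);
        System.out.println("Replication factor: " + status.getReplication());

        // HDFS tracks where every block of the file lives; each block is replicated across DataNodes.
        for (BlockLocation block : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println("Block at offset " + block.getOffset()
                    + " stored on: " + String.join(", ", block.getHosts()));
        }
        fs.close();
    }
}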
Introduction to the Shell for Hadoop HDFS
In this Skillsoft Aspire course, learners discover how to set up a Hadoop cluster on the cloud and explore the bundled web apps: the YARN Cluster Manager app and the HDFS (Hadoop Distributed File System) NameNode UI. This 9-video course assumes a good understanding of what Hadoop is and how HDFS enables processing of big data in parallel by distributing large data sets across a cluster; learners should also be familiar with running commands from the Linux shell, with some fluency in basic Linux file system commands. The course opens by exploring two web applications packaged with Hadoop: the UI for the YARN cluster manager and the NameNode UI for HDFS. Learners then explore two shells which can be used to work with HDFS, the Hadoop FS shell and the Hadoop DFS shell. Next, you will explore basic commands which can be used to navigate HDFS, discuss their similarities with Linux file system commands, and discuss distributed computing. In a closing exercise, practice identifying the web applications used to explore and monitor Hadoop. A short Java sketch that issues a couple of these shell commands programmatically follows this entry.
9 videos |
52m
Assessment
Badge
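The course works with the HDFS shells interactively; purely as an illustration, the sketch below runs the equivalent of two of those shell commands from Java through Hadoop's FsShell utility. The /user/demo paths are hypothetical, and a configured Hadoop client is assumed.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.util.ToolRunner;

public class ShellCommands {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Equivalent of "hdfs dfs -mkdir -p /user/demo/input" on the command line.
        ToolRunner.run(new FsShell(conf), new String[] {"-mkdir", "-p", "/user/demo/input"});

        // Equivalent of "hdfs dfs -ls /user/demo": list the directory just created.
        ToolRunner.run(new FsShell(conf), new String[] {"-ls", "/user/demo"});
    }
}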
Working with Files in Hadoop HDFS
In this Skillsoft Aspire course, learners will encounter basic Hadoop file system operations such as viewing the contents of directories and creating new ones. This 8-video course assumes a good understanding of what Hadoop is and how HDFS enables processing of big data in parallel by distributing large data sets across a cluster; learners should also be familiar with running commands from the Linux shell, with some fluency in basic Linux file system commands. Begin by working with files in various ways, including transferring files between a local file system and HDFS (Hadoop Distributed File System), and explore ways to create and delete files on HDFS. Then examine different ways to modify files on HDFS. After exploring the distributed computing concept, prepare to begin working with HDFS in a production setting. In the closing exercise, write a command to create a directory /data/products/files on HDFS, where /data/products may not already exist, and list the commands for two copy operations: one from the local file system to HDFS, and another for the reverse transfer, from HDFS to the local host. A Java sketch of these same directory and copy operations follows this entry.
8 videos |
47m
Assessment
Badge
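A minimal Java sketch of the closing exercise described above, using Hadoop's FileSystem API rather than the shell: it creates the /data/products/files directory (parents included) and performs one copy in each direction. The local file names are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileTransfer {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Creates the directory and any missing parents, comparable to "hdfs dfs -mkdir -p /data/products/files".
        fs.mkdirs(new Path("/data/products/files"));

        // Local-to-HDFS transfer, comparable to "hdfs dfs -copyFromLocal".
        fs.copyFromLocalFile(new Path("/tmp/products.csv"), new Path("/data/products/files/products.csv"));

        // HDFS-to-local transfer in the reverse direction, comparable to "hdfs dfs -copyToLocal".
        fs.copyToLocalFile(new Path("/data/products/files/products.csv"), new Path("/tmp/products-copy.csv"));

        fs.close();
    }
}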
Hadoop & MapReduce Getting Started
In this course, learners will explore the theory behind big data analysis using Hadoop, and how MapReduce enables parallel processing of large data sets distributed on a cluster of machines. Begin with an introduction to big data and the various sources and characteristics of data available today. Look at the challenges involved in processing big data and the options available to address them. Next, get a brief overview of Hadoop, its role in processing big data, and the functions of its components such as the Hadoop Distributed File System (HDFS), MapReduce, and YARN (Yet Another Resource Negotiator). Explore the working of Hadoop's MapReduce framework to process data in parallel on a cluster of machines. Recall the steps involved in building a MapReduce application and the specifics of the Map phase in processing each row of the input file's data. Recognize the functions of the Shuffle and Reduce phases in sorting and interpreting the output of the Map phase to produce a meaningful output. To conclude, complete an exercise on the fundamentals of Hadoop and MapReduce. An illustrative word-count Mapper and Reducer sketch follows this entry.
8 videos |
1h 3m
Assessment
Badge
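To make the Map, Shuffle, and Reduce phases described above concrete, here is a minimal word-count sketch in Java using Hadoop's standard MapReduce API; the class names are illustrative, not taken from the course.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map phase: each input line is split into words, and each word is emitted with a count of 1.
class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

// Reduce phase: after the shuffle groups and sorts the pairs by word, sum the counts for each word.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : values) {
            sum += count.get();
        }
        context.write(key, new IntWritable(sum));
    }
}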
Developing a Basic MapReduce Hadoop Application
In this Skillsoft Aspire course, discover how to use Hadoop's MapReduce; provision a Hadoop cluster on the cloud; and build an application with MapReduce to calculate word frequencies in a text document. To start, create a Hadoop cluster on the Google Cloud Platform using its Cloud Dataproc service; then work with the YARN Cluster Manager and HDFS (Hadoop Distributed File System) NameNode web applications that come packaged with Hadoop. Use Maven to create a new Java project for the MapReduce application, and develop a Mapper for the word frequency application. Create a Reducer for the application that will collect the Mapper output and calculate word frequencies in the input text files, and identify the configurations of MapReduce applications in the Driver program and the project's pom.xml file. Next, build the MapReduce word frequency application with Maven to produce a jar file and prepare for execution from the master node of the Hadoop cluster. Finally, run the application and examine the outputs generated to get word frequencies in the input text document. The exercise involves developing a basic MapReduce application. An illustrative Driver sketch follows this entry.
10 videos |
1h 13m
Assessment
Badge
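The course builds its own Driver and pom.xml in the lab; as a rough sketch of what such a Driver looks like, the example below wires a Mapper and Reducer into a Job and submits it. It reuses the hypothetical WordCountMapper and WordCountReducer classes from the earlier sketch, and the jar name and paths in the comment are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Driver: wires the Mapper and Reducer together and submits the job to YARN.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word frequency");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input and output paths are passed on the command line, e.g.
        // "yarn jar wordcount.jar WordCountDriver /input/books /output/wordcount".
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}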
Filtering Data Using Hadoop MapReduce
Extracting meaningful information from a very large dataset can be painstaking. In this Skillsoft Aspire course, learners examine how Hadoop's MapReduce can be used to speed up this operation. In a new project, code the Mapper for an application to count the number of passengers in each Titanic class in the input data set. Then develop a Reducer and Driver to generate the final passenger counts in each Titanic class. Build the project using Maven and run it on the Hadoop master node to check that the output correctly shows the passenger class numbers. Apply MapReduce to filter only surviving Titanic passengers from the input data set. Execute the application and verify that the filtering has worked correctly; examine the job and output files with the YARN cluster manager and HDFS (Hadoop Distributed File System) NameNode web user interfaces. Then, using a restaurant app's data set, apply MapReduce to obtain the distinct set of cuisines offered. Build and run the application and confirm the output with HDFS from both the command line and the web application. The exercise involves filtering data by using MapReduce. A brief map-only filter sketch follows this entry.
9 videos |
58m
Assessment
Badge
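Filtering, unlike counting, needs no aggregation, so it can be done as a map-only job. The sketch below illustrates the idea for the surviving-passengers scenario; the CSV column positions are assumptions, not the course's actual data layout.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only filter: a row is emitted unchanged only if it matches the predicate.
// Assumes a hypothetical CSV layout with the "Survived" flag (1 = survived) in column 1.
class SurvivorFilterMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        if (fields.length > 1 && "1".equals(fields[1].trim())) {
            // No aggregation is needed, so the Driver would also call
            // job.setNumReduceTasks(0) to skip the Reduce phase entirely.
            context.write(value, NullWritable.get());
        }
    }
}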
COURSES INCLUDED
Ecosystem for Hadoop
Hadoop is a framework providing for distributed storage and processing of large data sets. Introduce yourself to a big data model and the Hadoop ecosystem.
8 videos |
32m
Assessment
Badge
Hadoop Design Principles
Hadoop's HDFS is a highly fault-tolerant distributed file system suitable for applications that have large data sets. Explore the principles of supercomputing and Hadoop's open source software components.
11 videos |
42m
Assessment
Badge
Selecting & Creating an Environment
Learn how to prepare your environment for a Hadoop installation. Here we review the minimum system requirements, create a development environment, install Java, and set up SSH for Hadoop.
4 videos |
26m
Badge
Installation & Configuration
Once your environment is set up, you are ready to install Hadoop. Follow the step-by-step instructions for installing Hadoop in pseudo-distributed mode, and learn more about the Hadoop architecture.
9 videos |
1h 1m
Assessment
Badge
Configuration & Troubleshooting
After installation, there are tasks you need to perform before using Hadoop. Learn how to take your first steps with HDFS, WordCount, & the web UIs, perform configuration changes, and troubleshoot installation errors.
8 videos |
35m
Assessment
Badge
Data Repository with HDFS & HBase
It is vital you understand the Hadoop Distributed File System (HDFS). Explore the server architecture, and learn about the command line interface and common HDFS administration issues facing all end users.
13 videos |
1h
Assessment
Badge
HBase & ZooKeeper
Hadoop is all about big data. Explore the theory of HBase as another data repository built alongside or on top of HDFS. Also, learn how to install and configure HBase and ZooKeeper, and use the HBase command line. An illustrative Java client sketch follows this entry.
7 videos |
50m
Assessment
Badge
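The course uses the HBase command line; purely as an illustration of the same ideas from code, here is a small sketch using HBase's Java client instead. It assumes a table named products with a column family info already exists (created beforehand in the HBase shell, for instance), and that hbase-site.xml points the client at the cluster's ZooKeeper quorum.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseRoundTrip {
    public static void main(String[] args) throws Exception {
        // The client locates the cluster through the ZooKeeper settings in hbase-site.xml.
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("products"))) {

            // Write one cell: row key "row1", column family "info", qualifier "name".
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("widget"));
            table.put(put);

            // Read the same cell back by row key.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));
        }
    }
}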
Data Repository with Flume
Flume is a tool for dealing with the extraction and loading of unstructured data. Learn about the theory of Flume, its functional parts, and how to install Flume for use.
12 videos |
47m
Assessment
Badge
Timestamps, Sources, & Troubleshooting
Flume is a tool for dealing with the extraction and loading of unstructured data. Learn how to work with Flume sinks, sources, & agents, and how to troubleshoot Flume agents & failures.
12 videos |
47m
Assessment
Badge
Data Repository with Sqoop
Sqoop is a tool for transferring structured data between Hadoop and an RDBMS. Explore the architecture and installation of Sqoop, how to perform imports and exports, Hive SQL statements, and more.
16 videos |
1h 11m
Badge
Data Refinery with YARN
YARN is a parallel processing framework that provides the resources for data computations. Explore the theory of parallel processing and the architecture of the YARN framework.
7 videos |
23m
Assessment
Badge
Data Refinery with MapReduce
MapReduce is a set of classes that abstracts away the complexity of parallel processing. Learn how MapReduce can take a single compute job and run it on the supercomputing platform.
13 videos |
54m
Badge
Hive Joining, Partitioning, & Troubleshooting
Hive is a SQL-like tool for interfacing with Hadoop. Learn how to use Hive joins and views, partition Hive data, create Hive buckets, and troubleshoot errors.
10 videos |
40m
Badge
Data Factory with Pig
Pig is a data flow language for interfacing with Hadoop to extract, transform, and load data. Learn how to install & configure Pig, and use the command line to write and execute Pig scripts.
12 videos |
47m
Badge
Pig Functions & Troubleshooting
Pig is a data flow language for interfacing with Hadoop to extract, transform, and load data. Learn how to work with Pig joins, groups, & user-defined functions, and troubleshoot & debug with Pig.
8 videos |
47m
Badge
HiveServer2 & HCatalog
Oozie is a workflow tool for coordinating other components in Hadoop. To use Oozie, a number of other components must be installed first. Learn the purpose of and how to install and configure the Hive metastore, HiveServer2, and HCatalog.
6 videos |
52m
Badge
Data Factory with Oozie
Oozie is a workflow tool for coordinating other components of the Hadoop ecosystem. Learn how to install, configure, & use Oozie to create and run workflows.
10 videos |
55m
Badge
Data Factory with Hue
Hue is an easy-to-use web UI for interfacing with HDFS, MapReduce, Hive, Pig, & Oozie. Learn how to install, configure, & use Hue to work with Hadoop components.
6 videos |
31m
Assessment
Badge
Data Flow for the Hadoop Ecosystem
Data must move into and through Hadoop for it to function. Here we look at Hadoop data life cycle management, and use Sqoop and Hive to move data through the system.
12 videos |
59m
Badge
COURSES INCLUDED
Designing Clusters
Hadoop is a framework providing fast and reliable analysis of large data sets. Introduce yourself to supercomputing, and explore the design principles of using Hadoop as a supercomputing platform.
6 videos |
32m
Assessment
Badge
Hadoop Cluster Architecture
Learn how to design a Hadoop cluster by taking an in-depth look at the hardware, network concepts, and the architecture that make up the cluster.
11 videos |
52m
Assessment
Badge
Hadoop in the Cloud
Amazon Web Services (AWS) is a secure cloud-computing platform offered by Amazon.com. Explore the key services offered by AWS and learn how to set up a Hadoop cluster.
16 videos |
1h 31m
Assessment
Badge
Data Migration & EMR
Discover how to use the AWS command line interface, examine AWS Elastic MapReduce (EMR), learn how to set up an EMR cluster, and explore the various ways to run EMR jobs.
10 videos |
1h 11m
Badge
Cluster Deployment Tools & Images
To deploy a Hadoop Cluster, you must ensure networks, disks, and hosts are configured correctly. Examine the configuration management tools, learn how to create configuration items, and set up a CM environment.
6 videos |
47m
Badge
Cluster Architecture Configuration
To deploy a Hadoop Cluster, you must ensure networks, disks, and hosts are configured correctly. Explore the Hadoop cluster architecture, learn how to start, stop, & configure Hadoop clusters, and configure logging & MySQL databases.
8 videos |
57m
Assessment
Badge
Cluster Deployment
To deploy a Hadoop Cluster, you must ensure networks, disks, and hosts are configured correctly. Learn how to set up some of the common open-source software used to create and deploy a Hadoop ecosystem.
8 videos |
1h 4m
Badge
Cluster Availability
Nothing is more important than having your Hadoop cluster available for use. Discover how Hadoop leverages fault tolerance, and explore a number of the reliability features that have been designed into Hadoop.
10 videos |
1h 6m
Assessment
Badge
Availability Configuration
To be useful, your Hadoop cluster must be available. Here we discuss and demonstrate high availability for HDFS NameNode and how to recover from failures.
6 videos |
51m
Badge
Securing Clusters
Hadoop brings big data technologies within reach of companies, but as adoption grows, so do the security concerns. Examine the risks and learn how to implement security groups and work with Kerberos.
8 videos |
1h 4m
Assessment
Badge
Securing with Kerberos
Hadoop brings big data technologies within reach of companies, but as adoption grows, so do the security concerns. Examine the risks and learn how to secure HDFS, YARN, Hive, and other components with Kerberos.
10 videos |
1h 11m
Assessment
Badge
Managing Security
Hadoop brings big data technologies within reach of companies, but as adoption grows, so do the security concerns. Examine the risks and learn how to manage user security, access control lists, and other features.
9 videos |
1h 2m
Assessment
Badge
Operating Hadoop Clusters
Hadoop is a framework for running applications on large clusters of commodity hardware. Discover service levels, Hadoop releases, change management, and rack awareness.
5 videos |
36m
Assessment
Badge
Cluster Administration
Hadoop is a framework for running applications on large clusters of commodity hardware. Discover HDFS administration, quotas, DataNodes, HDFS scaling, and more.
10 videos |
1h 7m
Badge
Stabilizing Clusters
Tuning Hadoop clusters is vital to improve cluster performance. Explore the importance of incident management and working with Nagios.
8 videos |
1h 24m
Badge
Monitoring & Troubleshooting
Tuning Hadoop clusters is vital to improve cluster performance. Explore log management, problem management, and best practices for root cause analysis.
10 videos |
1h 8m
Assessment
Badge
Capacity Management Strategies
Apache Hadoop is an open-source software framework for storage and large-scale processing of datasets on clusters of commodity hardware. Explore capacity management of Hadoop clusters, including strategies and schedulers.
4 videos |
27m
Assessment
Badge
Capacity Management
Apache Hadoop is an open-source software framework for storage and large-scale processing of datasets on clusters of commodity hardware. Explore resource management through scheduling, the Fair Scheduler tool, and how to plan for scaling.
16 videos |
1h 38m
Assessment
Badge
Performance Tuning Best Practices
Hadoop can scale up from single servers to thousands of machines, each offering local computation and storage. Discover performance tuning concepts, including compression, tune up options, and memory optimization.
11 videos |
1h 14m
Assessment
Badge
Cluster Performance Tuning
Hadoop can scale up from single servers to thousands of machines, each offering local computation and storage. Examine tune-up options and best practices for performance tuning across HDFS, YARN, and MapReduce.
13 videos |
1h 17m
Assessment
Badge
Cloudera Manager & Hadoop Clusters
Cloudera Manager is a simple automated customizable management tool for Hadoop clusters. Explore web consoles for Cloudera Manager, cluster management tools, and cluster deployment.
6 videos |
55m
Assessment
Badge
Cloudera Manager Administration
Cloudera Manager is a simple automated customizable management tool for Hadoop clusters. Discover Cloudera Manager administration, including cluster management, services, and resource management.
7 videos |
1h 4m
Badge
Cloudera Manager Tools & Configuration
Cloudera Manager is a simple automated customizable management tool for Hadoop clusters. Discover Cloudera Manager tools and configuration, including performance tweaking, Impala, Sentry, Hive, Hue with MySQL, and Oozie workflows.
12 videos |
1h 52m
Badge
COURSES INCLUDED
Managing Big Data Using HDInsight Hadoop
Explore the fundamentals of Azure HDInsight and the essential architectural components.
12 videos |
1h 5m
Assessment
Badge
Microsoft Analytics Platform System & Hive
Explore the Microsoft Analytics Platform System and using Hive to manage data from a data warehouse perspective.
17 videos |
1h 27m
Assessment
Badge
HDInsight & Retail Sales Implementation Using Hive
This course covers the implementation of data warehousing in retail sales. Learners will design and implement data warehousing solutions using Hive and Power BI on HDInsight.
11 videos |
45m
Assessment
Badge
Working with Spark Using HDInsight & Cluster Management
Discover how to work with Spark and its in-memory data management capabilities. Managing and troubleshooting HDInsight clusters using Ambari and the Azure CLI tool is also covered.
12 videos |
55m
Assessment
Badge
COURSES INCLUDED
Hadoop HDFS File Permissions
Explore why, when managing a data warehouse, not all users should have free rein over all data sets. In this 9-video Skillsoft Aspire course, learners explore how file permissions can be viewed and configured in HDFS (Hadoop Distributed File System) and how the NameNode UI is used to monitor and explore HDFS. For this course, you need a good understanding of Hadoop and HDFS, along with familiarity with the HDFS shells, and confidence in working with and manipulating files on HDFS and exploring it from the command line. The course focuses on the different ways permissions linked to files and directories can be viewed and modified. Learners explore how many HDFS tasks can be automated simply by scripting them, and how to use the HDFS NameNode UI to monitor the distributed file system and explore its contents. Review distributed computing and big data. The closing exercise involves writing a command to be used on the HDFS dfs shell to count the number of files within a directory on HDFS, and performing related tasks. A short Java sketch of inspecting and changing permissions follows this entry.
9 videos |
48m
Assessment
Badge
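A brief Java sketch of the permission operations discussed above, using the FileSystem API as an alternative to the -ls and -chmod shell commands; the /warehouse/reports path is hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class PermissionsDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path reports = new Path("/warehouse/reports");   // hypothetical directory

        // Inspect the current owner, group, and permission bits, comparable to "hdfs dfs -ls".
        FileStatus status = fs.getFileStatus(reports);
        System.out.println(status.getOwner() + " " + status.getGroup() + " " + status.getPermission());

        // Restrict the directory to owner rwx and group r-x with no access for others,
        // comparable to "hdfs dfs -chmod 750 /warehouse/reports".
        fs.setPermission(reports, new FsPermission((short) 0750));
        fs.close();
    }
}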
Hadoop MapReduce Applications With Combiners
In this Skillsoft Aspire course, explore the use of Combiners to make MapReduce applications more efficient by minimizing data transfers. Start by learning about the need for Combiners to optimize the execution of a MapReduce application by minimizing data transfers within a cluster. Recall the steps to process data in a MapReduce application, and look at using a Combiner to perform a partial reduction of the data output from the Mapper. Then create a new project, using Maven, for a MapReduce application that calculates average automobile prices. Next, develop the Mapper and Reducer to calculate the average price for automobile makes in the input data set. Create a Driver program for the MapReduce application, run it, and check the output to get the average price per automobile. Learn how to code up a Combiner for a MapReduce application, fix the bug in the application so it can be used to correctly calculate the average price, then run the fixed application to verify that the prices are being calculated correctly. The concluding exercise concerns optimizing MapReduce with Combiners. A short sketch of how a Combiner is wired into a job, and of the averaging pitfall it raises, follows this entry.
13 videos |
1h 23m
Assessment
Badge
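As a rough sketch of how a Combiner is wired into a job (building on the hypothetical WordCount classes sketched earlier, not the course's own code): for a straight sum the Reducer can double as the Combiner, while the average-price scenario above needs the extra care noted in the comments.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class CombinerWiring {
    public static void configure(Job job) {
        job.setMapperClass(WordCountMapper.class);

        // The Combiner runs on each mapper's local output before the shuffle,
        // so far less data crosses the network to the reducers. Summing counts
        // gives the same result whether it happens once or twice, which is why
        // the Reducer class can double as the Combiner for a word-count style job.
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);

        // Caution: an average-price Reducer cannot be reused this way, because an
        // average of partial averages is not the overall average. The usual fix is
        // to have the Combiner emit partial (sum, count) pairs and let the Reducer
        // perform the final division.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
    }
}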
Advanced Operations Using Hadoop MapReduce
In this Skillsoft Aspire course, explore how MapReduce can be used to extract the five most expensive vehicles in a data set, then build an inverted index for the words appearing in a set of text files. Begin by defining a vehicle type that can be used to represent automobiles to be stored in a Java PriorityQueue, then configure a Mapper to use a PriorityQueue to store the five most expensive automobiles it has processed from the dataset. Learn how to use a PriorityQueue in the Reducer of the application to receive the five most expensive automobiles from each Mapper and write the top five automobiles overall to the output, then execute the application to verify the results. Next, explore how to use the MapReduce framework to generate an inverted index, and configure the Reducer and Driver for the inverted index application. Then run the application and examine the inverted index on HDFS (Hadoop Distributed File System). The concluding exercise involves advanced operations using MapReduce. A short top-N Mapper sketch follows this entry.
9 videos |
48m
Assessment
Badge
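A minimal sketch of the PriorityQueue technique described above: each Mapper keeps only its local top five records in a min-heap and emits them in cleanup(), so only a handful of rows per mapper reach the single Reducer, which repeats the trick to pick the overall top five. The CSV column holding the price is an assumption.

import java.io.IOException;
import java.util.PriorityQueue;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

class TopPriceMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    private static final int TOP_N = 5;
    // Min-heap ordered by price (assumed to be in column 3), so the cheapest of the
    // retained records is always at the head and easy to evict.
    private final PriorityQueue<String[]> queue = new PriorityQueue<>(
            (a, b) -> Double.compare(Double.parseDouble(a[3]), Double.parseDouble(b[3])));

    @Override
    protected void map(LongWritable key, Text value, Context context) {
        String[] fields = value.toString().split(",");
        if (fields.length <= 3) {
            return;
        }
        try {
            Double.parseDouble(fields[3]);   // skip header or malformed rows
        } catch (NumberFormatException e) {
            return;
        }
        queue.offer(fields);
        if (queue.size() > TOP_N) {
            queue.poll();   // drop the cheapest record once more than five are held
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Only the mapper's local top five ever reach the shuffle.
        for (String[] fields : queue) {
            context.write(NullWritable.get(), new Text(String.join(",", fields)));
        }
    }
}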
COURSES INCLUDED
Hadoop Distributed File System
Discover the HDFS architecture and its main building blocks. In addition, explore data replication, communication protocols, and accessibility.
11 videos |
32m
Assessment
Badge
Clusters
Clusters are used to store and analyze large volumes of data in a distributed computer environment. Explore the best practices to follow when implementing clusters in Hadoop.
8 videos |
48m
Assessment
Badge
Hadoop on Amazon EMR
Hadoop can be used with Amazon EMR to process vast amounts of data. Explore how to use Hadoop with Amazon EMR.
10 videos |
47m
Assessment
Badge
Hadoop Ranger
Apache Ranger is used to provide data security across a Hadoop implementation. Explore the installation of Ranger and Ranger authentication considerations, as well as customizing services to run Ranger alongside Hadoop.
9 videos |
51m
Assessment
Badge
Maintenance & Distributions
Distributions provide performance and functionality enhancements over the base open source code Apache provides. Explore the various distributions available and common maintenance tasks in a Hadoop environment.
10 videos |
37m
Assessment
Badge
EARN A DIGITAL BADGE WHEN YOU COMPLETE THESE COURSES
Skillsoft is providing you with the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.
BOOKS INCLUDED
Book
Professional Hadoop
Serving as the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings, this guide details every aspect of Hadoop technology to enable optimal processing of large data sets, and gets you acquainted with the framework's processes and capabilities right away.
3h 47m
By Benoy Antony, et al.
Book
Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools
From setting up the environment to running sample applications, this step-by-step resource is a practical tutorial on using the Apache Hadoop ecosystem project.
4h 56m
By Deepak Vohra
Book
Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset
Approaching the problem of managing massive data sets from a systems perspective, this book explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage - and then explains, in an easily understood manner and through numerous examples, how to use each tool.
5h 27m
By Michael Frampton
Book
Pro Hadoop
Written from the perspective of a principal engineer with down-in-the-trenches knowledge of what to do wrong with Hadoop, this book shows how to avoid the common, expensive first errors that everyone makes with creating their own Hadoop system.
7h
By Jason Venner
Book
Hadoop Architecture and SQL: The Best HiveQL Book in the Universe
Including hundreds of pages of SQL examples and explanations, this book is perfect for anyone who wants to query Hadoop with SQL and educates readers on how to create tables, how the data is distributed, and how the system processes the data.
1h 32m
By Jason Nolander, Tom Coffing
Book
Hadoop for Dummies
Showing you how to harness the power of your data and rein in the information overload, this detailed guide will help you understand the value of big data, make a business case for using Hadoop, navigate the Hadoop ecosystem, and build and manage Hadoop applications and clusters.
6h 39m
By Dirk deRoos, et al.
Book
Processing Big Data with Azure HDInsight: Building Real-World Big Data Systems on Azure HDInsight Using the Hadoop Ecosystem
As most Hadoop and Big Data projects are written in either Java, Scala, or Python, this book minimizes the effort to learn another language and is written from the perspective of a .NET developer.
3h 4m
By Vinit Yadav
BOOKS INCLUDED
Book
Big Data and Hadoop: Learn by Example
Containing the latest trends in big data and Hadoop, this learn-by-doing resource explains how big Big Data is and why everybody is trying to implement it into their IT projects.
4h 17m
By Mayank Bhushan
Book
Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools
From setting up the environment to running sample applications, this step-by-step resource is a practical tutorial on using the Apache Hadoop ecosystem project.
4h 56m
By Deepak Vohra
Book
Professional Hadoop
Serving as the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings, this guide details every aspect of Hadoop technology to enable optimal processing of large data sets, and gets you acquainted with the framework's processes and capabilities right away.
3h 47m
By Benoy Antony, et al.
Book
Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset
Approaching the problem of managing massive data sets from a systems perspective, this book explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage - and then explains, in an easily understood manner and through numerous examples, how to use each tool.
5h 27m
By Michael Frampton
BOOKS INCLUDED
Book
Professional Hadoop
Serving as the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings, this guide details every aspect of Hadoop technology to enable optimal processing of large data sets, and gets you acquainted with the framework's processes and capabilities right away.
3h 47m
By Benoy Antony, et al.
Book
Big Data and Hadoop: Learn by Example
Containing the latest trends in big data and Hadoop, this learn-by-doing resource explains how big Big Data is and why everybody is trying to implement it into their IT projects.
4h 17m
By Mayank Bhushan
Book
Pro Apache Hadoop, Second Edition
Taking you quickly to the seasoned pro level on the hottest cloud-computing framework, this book covers everything you need to build your first Hadoop cluster and begin analyzing and deriving value from your business and scientific data.
7h 26m
By Jason Venner, Madhu Siddalingaiah, Sameer Wadkar
Book
Hadoop for Dummies
Showing you how to harness the power of your data and rein in the information overload, this detailed guide will help you understand the value of big data, make a business case for using Hadoop, navigate the Hadoop ecosystem, and build and manage Hadoop applications and clusters.
6h 39m
By Dirk deRoos, et al.
Book
Pro Hadoop Data Analytics: Designing and Building Big Data Systems using the Hadoop Ecosystem
Emphasizing best practices to ensure coherent, efficient development, this book provides the right combination of architecture, design, and implementation information to create analytical systems that go beyond the basics of classification, clustering, and recommendation.
3h 4m
By Kerry Koitzsch
Book
Professional Hadoop Solutions
With in-depth code examples in Java and XML and the latest on recent additions to the Hadoop ecosystem, this complete resource also covers the use of APIs, exposing their inner workings and allowing architects and developers to better leverage and customize them.
8h 2m
By Alexey Yakubovich, Boris Lublinsky, Kevin T. Smith
Book
Big Data Processing Beyond Hadoop and MapReduce
Authored by EMC Proven Professionals, Knowledge Sharing articles present ideas, expertise, unique deployments, and best practices. This article provides an overview of various new and upcoming alternatives to Hadoop MR.
23m
By Ravi Sharda
Book
Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset
Approaching the problem of managing massive data sets from a systems perspective, this book explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage - and then explains, in an easily understood manner and through numerous examples, how to use each tool.
5h 27m
By Michael Frampton
BOOKS INCLUDED
Book
Professional Hadoop
Serving as the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings, this guide details every aspect of Hadoop technology to enable optimal processing of large data sets, and gets you acquainted with the framework's processes and capabilities right away.
3h 47m
By Benoy Antony, et al.
Book
Big Data and Hadoop: Learn by Example
Containing the latest trends in big data and Hadoop, this learn-by-doing resource explains how big Big Data is and why everybody is trying to implement it into their IT projects.
4h 17m
By Mayank Bhushan
Book
Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset
Approaching the problem of managing massive data sets from a systems perspective, this book explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage - and then explains, in an easily understood manner and through numerous examples, how to use each tool.
5h 27m
By Michael Frampton
Book
Practical Hadoop Security
For administrators planning a production Hadoop deployment who want to secure their Hadoop clusters, this resource takes you through a comprehensive study of how to implement defined security within a Hadoop cluster in a hands-on way.
3h 40m
By Bhushan Lakhe
Book
Professional Hadoop Solutions
With in-depth code examples in Java and XML and the latest on recent additions to the Hadoop ecosystem, this complete resource also covers the use of APIs, exposing their inner workings and allowing architects and developers to better leverage and customize them.
8h 2m
By Alexey Yakubovich, Boris Lublinsky, Kevin T. Smith
Book
Pro Hadoop Data Analytics: Designing and Building Big Data Systems using the Hadoop Ecosystem
Emphasizing best practices to ensure coherent, efficient development, this book provides the right combination of architecture, design, and implementation information to create analytical systems that go beyond the basics of classification, clustering, and recommendation.
3h 4m
By Kerry Koitzsch
Book
Hadoop for Dummies
Showing you how to harness the power of your data and rein in the information overload, this detailed guide will help you understand the value of big data, make a business case for using Hadoop, navigate the Hadoop ecosystem, and build and manage Hadoop applications and clusters.
6h 39m
By Dirk deRoos, et al.
Book
Pro Apache Hadoop, Second Edition
Taking you quickly to the seasoned pro level on the hottest cloud-computing framework, this book covers everything you need to build your first Hadoop cluster and begin analyzing and deriving value from your business and scientific data.
7h 26m
By Jason Venner, Madhu Siddalingaiah, Sameer Wadkar
BOOKS INCLUDED
Book
Processing Big Data with Azure HDInsight: Building Real-World Big Data Systems on Azure HDInsight Using the Hadoop Ecosystem
As most Hadoop and Big Data projects are written in either Java, Scala, or Python, this book minimizes the effort to learn another language and is written from the perspective of a .NET developer.
3h 4m
By Vinit Yadav
BOOKS INCLUDED
Book
Big Data and Hadoop: Learn by Example
Containing the latest trends in big data and Hadoop, this learn-by-doing resource explains how big Big Data is and why everybody is trying to implement it into their IT projects.
4h 17m
By Mayank Bhushan
Book
Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools
From setting up the environment to running sample applications, this step-by-step resource is a practical tutorial on using the Apache Hadoop ecosystem project.
4h 56m
By Deepak Vohra
Book
Professional Hadoop
Serving as the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings, this guide details every aspect of Hadoop technology to enable optimal processing of large data sets, and gets you acquainted with the framework's processes and capabilities right away.
3h 47m
By Benoy Antony, et al.
Book
Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset
Approaching the problem of managing massive data sets from a systems perspective, this book explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage - and then explains, in an easily understood manner and through numerous examples, how to use each tool.
5h 27m
By Michael Frampton
BOOKS INCLUDED
Book
Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools
From setting up the environment to running sample applications, this step-by-step resource is a practical tutorial on using the Apache Hadoop ecosystem project.
4h 56m
By Deepak Vohra
Book
Pro Hadoop Data Analytics: Designing and Building Big Data Systems using the Hadoop Ecosystem
Emphasizing best practices to ensure coherent, efficient development, this book provides the right combination of architecture, design, and implementation information to create analytical systems that go beyond the basics of classification, clustering, and recommendation.
3h 4m
By Kerry Koitzsch
Book
Professional Hadoop
Serving as the complete reference and resource for experienced developers looking to employ Apache Hadoop in real-world settings, this guide details every aspect of Hadoop technology to enable optimal processing of large data sets, and gets you acquainted with the framework's processes and capabilities right away.
3h 47m
By Benoy Antony, et al.
Book
Practical Hive: A Guide to Hadoop's Data Warehouse System
From deploying Hive on your hardware or virtual machine and setting up its initial configuration to learning how Hive interacts with Hadoop, MapReduce, Tez and other big data technologies, this go-to resource gives you a detailed treatment of the software.
3h 57m
By Andreas François Vermeulen, Ankur Gupta, David Kjerrumgaard, Scott Shaw
Book
Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset
Approaching the problem of managing massive data sets from a systems perspective, this book explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage - and then explains, in an easily understood manner and through numerous examples, how to use each tool.
5h 27m
By Michael Frampton
Book
Hadoop Architecture and SQL: The Best HiveQL Book in the Universe
Including hundreds of pages of SQL examples and explanations, this book is perfect for anyone who wants to query Hadoop with SQL and educates readers on how to create tables, how the data is distributed, and how the system processes the data.
1h 32m
By Jason Nolander, Tom Coffing
Book
Pro Apache Hadoop, Second Edition
Taking you quickly to the seasoned pro level on the hottest cloud-computing framework, this book covers everything you need to build your first Hadoop cluster and begin analyzing and deriving value from your business and scientific data.
7h 26m
By Jason Venner, Madhu Siddalingaiah, Sameer Wadkar
Book
Hadoop for Dummies
Showing you how to harness the power of your data and rein in the information overload, this detailed guide will help you understand the value of big data, make a business case for using Hadoop, navigate the Hadoop ecosystem, and build and manage Hadoop applications and clusters.
6h 39m
By Dirk deRoos, et al.