Big Data Using Hadoop and Hive

  • 2h 14m
  • Nitin Kumar
  • Mercury Learning
  • 2021

This book is a basic guide for developers, architects, engineers, and anyone who wants to start leveraging the open-source software Hadoop and Hive to build distributed, scalable, concurrent big data applications. Hive is used for reading, writing, and managing large data set files. The book is a concise guide to getting started with Apache Hadoop and Hive, giving an overall understanding of how they work together to speed up development with minimal effort. It relies on simple concepts and examples, as they are likely to be the best teaching aids. It explains the logic, code, and configurations needed to build a successful, distributed, concurrent application, as well as the reasons behind those decisions.


  • Shows how to leverage the open-source software Hadoop and Hive to build distributed, scalable, concurrent big data applications
  • Includes material on Hive architecture with various storage types and the Hive query language
  • Features a chapter on big data and how Hadoop can be used to solve its challenges
  • Explains the basic Hadoop setup, configuration, and optimization

Brief TOC

1: Big Data. 2: What Is Apache Hadoop? 3: The Hadoop Distribution File System. 4: Getting Started with Hadoop. 5: Interfaces to Access HDFS Files. 6: Yet Another Resource Negotiator. 7: MapReduce. 8: Hive. 9: Getting Started with Hive. 10: File Format. 11: Data Compression. Index.

In this Book

  • Big Data
  • What is Apache Hadoop?
  • The Hadoop Distribution Filesystem
  • Getting Started with Hadoop
  • Interfaces to Access HDFS Files
  • Yet Another Resource Negotiator
  • MapReduce
  • Hive
  • Getting Started with Hive
  • File Format
  • Data Compression

