Data Lake Analytics on Microsoft Azure: A Practitioner's Guide to Big Data Engineering

  • 2h 22m
  • Harsh Chawla, Pankaj Khattar
  • Apress
  • 2020

Get a 360-degree view of how data analytics solutions have evolved from monolithic data stores and enterprise data warehouses to data lakes and modern data warehouses.

This book includes comprehensive coverage of how:

  • To architect data lake analytics solutions by choosing suitable technologies available on Microsoft Azure
  • The advent of microservices applications, spanning ecommerce and modern solutions built on IoT, together with real-time streaming data, has completely disrupted this ecosystem
  • These data analytics solutions have been transformed from solely understanding trends in historical data to building predictions by infusing machine learning technologies into the solutions

Data platform professionals who have been working on relational data stores, non-relational data stores, and big data technologies will find the content in this book useful. The book can also help you start your journey into data engineering, as it provides an overview of advanced data analytics and touches on data science concepts and the various artificial intelligence and machine learning technologies available on Microsoft Azure.

What You Will Learn

You will understand the:

  • Concepts of data lake analytics, the modern data warehouse, and advanced data analytics
  • Architecture patterns of the modern data warehouse and advanced data analytics solutions
  • Phases—such as Data Ingestion, Store, Prep and Train, and Model and Serve—of data analytics solutions and technology choices available on Azure under each phase
  • Architecture of real-time and batch mode data analytics solutions, covered in depth
  • Various managed services available on Azure, such as Azure Synapse Analytics, Event Hubs, Stream Analytics, and Cosmos DB, and managed Hadoop and Spark services such as HDInsight and Databricks

Who This Book Is For

Data platform professionals, database architects, engineers, and solution architects

About the Authors

Harsh Chawla has been working on data platform technologies for the last 14 years. He has held various roles in the Microsoft world for the last 12 years, going from CSS to services to technology strategy. He currently works as an Azure specialist with data and AI technologies and helps large IT enterprises build modern data warehouses, advanced analytics, and AI solutions on Microsoft Azure. He has been a community speaker and blogger on data platform technologies.

Pankaj Khattar is a seasoned Software Architect with over 14 years of experience in the design and development of Big Data, Machine Learning, and AI-based products. He currently works with Microsoft on the Azure platform as a Sr. Cloud Solution Architect for Data & AI technologies. He also has extensive industry experience in building scalable multi-tier distributed applications and client/server-based development.

You can connect with him on LinkedIn at

In this Book

  • Data Lake Analytics Concepts
  • Building Blocks of Data Analytics
  • Data Analytics on Public Cloud
  • Data Ingestion
  • Data Storage
  • Data Preparation and Training Part I
  • Data Preparation and Training Part II
  • Model and Serve
  • Summary

