Mastering Databricks Lakehouse Platform: Perform Data Warehousing, Data Engineering, Machine Learning, DevOps, and BI into a Single Platform
- 2h 48m
- Anjani Kumar, Sagar Lad
- BPB Publications
Enable data and AI workloads with absolute security and scalability
- Detailed, step-by-step instructions for every data professional starting a career in data engineering.
- Access to DevOps, Machine Learning, and Analytics within a single unified platform.
- Includes design considerations and security best practices for efficient utilization of the Databricks platform.
Starting with the fundamentals of the Databricks Lakehouse Platform, the book teaches readers to administer various data operations, including Machine Learning, DevOps, Data Warehousing, and BI, on a single platform.
The subsequent chapters discuss working with data pipelines on the Databricks Lakehouse Platform, using a data processing and audit quality framework. The book teaches how to leverage the Databricks Lakehouse Platform to develop Delta Live Tables, streamline ETL/ELT operations, and administer data sharing and orchestration. It explores how to schedule and manage jobs through the Databricks notebook UI and the Jobs API, and discusses how to implement DevOps methods on the platform for data and AI workloads. It also helps readers prepare and process data and standardize the entire ML lifecycle, from experimentation to production.
The book doesn't stop there; it also teaches how to query the data lake directly with your favourite BI tools, such as Power BI, Tableau, or Qlik. Some of the best industry practices for building data engineering solutions are demonstrated towards the end of the book.
WHAT YOU WILL LEARN
- Acquire the capabilities to administer the Databricks Lakehouse Platform end to end.
- Utilize MLflow to deploy and monitor machine learning solutions.
- Gain practical experience with SQL Analytics and connect Tableau, Power BI, and Qlik.
- Configure clusters and automate CI/CD deployment.
- Learn how to use Airflow, Data Factory, Delta Live Tables, Databricks notebook UI, and the Jobs API.
WHO THIS BOOK IS FOR
This book is for every data professional, including data engineers, ETL developers, database administrators, data scientists, SQL developers, and BI specialists. You don't need any prior expertise with this platform because the book covers all the basics.
About the Authors
Sagar Lad is a Technical Solution Architect with a leading multinational software company and has deep expertise in implementing Data & Analytics solutions for large enterprises using Cloud and Artificial Intelligence. He is an experienced Azure Platform evangelist with 8+ years of IT experience and a strong focus on driving cloud adoption for enterprise organizations using Microsoft Cloud Solutions & Offerings. He is an active blogger on Medium, LinkedIn, and the C# Corner developer community. He was awarded the C# Corner MVP in September 2021 for his contributions to the developer community.
Anjani Kumar is the MD and Founder of MultiCloud4u, which is one of the fastest digital transformation startups extensively using data-driven solutions.
As a technologist, Anjani is a multifaceted Enterprise and Data Solution Architect and has consulted for 100 of the top Fortune 500 clients on setting up large data warehouses and digital business transformations correctly, while working with Publicis.Sapient, Royal Bank of Scotland, Microsoft, AMEX, and multiple other clients such as Unilever, Citi Bank, Brown Advisory, Nissan, Sprint, Bata, Philips, Jera Americas, and many more. He runs a global knowledge sharing platform called 5thir.
In this Book
Getting Started with Databricks Platform
Management of Databricks Platform
Spark, Databricks, and Building a Data Quality Framework
Data Sharing and Orchestration with Databricks
Simplified ETL with Delta Live Tables
SCD Type 2 Implementation with Delta Lake
Machine Learning Model Management with Databricks
Continuous Integration and Delivery with Databricks
Visualization with Databricks
Best Security and Compliance Practices of Databricks