Mapping Data Flows in Azure Data Factory: Building Scalable ETL Projects in the Microsoft Cloud

  • 2h 7m
  • Mark Kromer
  • Apress
  • 2022

Build scalable ETL data pipelines in the cloud using Azure Data Factory’s Mapping Data Flows. Each chapter of this book addresses different aspects of an end-to-end data pipeline that includes repeatable design patterns based on best practices using ADF’s code-free data transformation design tools. The book shows data engineers how to take raw business data at cloud scale and turn that data into business value by organizing and transforming the data for use in data science projects and analytics systems.

The book begins with an introduction to Azure Data Factory followed by an introduction to its Mapping Data Flows feature set. Subsequent chapters show how to build your first pipeline and corresponding data flow, implement common design patterns, and operationalize your result. By the end of the book, you will be able to apply what you’ve learned to your complex data integration and ETL projects in Azure. These projects will enable cloud-scale big data analytics and apply best practices for loading and transforming data in data warehouses.

What You Will Learn

  • Build scalable ETL jobs in Azure without writing code
  • Transform big data for data quality and data modeling requirements
  • Understand the different aspects of Azure Data Factory ETL pipelines from datasets and Linked Services to Mapping Data Flows
  • Apply best practices for designing and managing complex ETL data pipelines in Azure Data Factory
  • Add cloud-based ETL patterns to your set of data engineering skills
  • Build repeatable code-free ETL design patterns

About the Author

Mark Kromer has been in the data analytics product space for over 20 years and is currently a Principal Program Manager for Microsoft’s Azure data integration products. Mark often writes and speaks on big data and data analytics, and was an engineering architect and product manager at Oracle, Pentaho, AT&T, and Databricks before joining Microsoft Azure.

In this Book

  • Introduction
  • ETL for the Cloud Data Engineer
  • Introduction to Azure Data Factory
  • Introduction to Mapping Data Flows
  • Build Your First ETL Pipeline in ADF
  • Common ETL Pipeline Practices in ADF with Mapping Data Flows
  • Slowly Changing Dimensions
  • Data Deduplication
  • Mapping Data Flow Advanced Topics
  • Basics of CI/CD and Pipeline Scheduling
  • Monitor, Manage, and Optimize