Data Pipeline: Using Frameworks for Advanced Data Management




Overview/Description

Discover how to implement data pipelines using Python Luigi, integrate Spark and Tableau to manage data pipelines, use Dask arrays, and build data pipeline visualizations with Python.



Expected Duration (hours)
0.6

Lesson Objectives

  • recognize the features of Celery and Luigi that can be used to set up data pipelines
  • implement Python Luigi in order to set up data pipelines
  • list Dask task scheduling and big data collection features
  • implement Dask arrays to work with NumPy-style array APIs
  • list frameworks that can be used to implement data exploration and visualization in data pipelines
  • integrate Spark and Tableau to manage data pipelines
  • use Python to build visualizations for streaming data
  • recognize the data pipeline building capabilities provided by Kafka, Spark, and PySpark
  • set up Luigi to implement data pipelines, integrate Spark and Tableau for data pipeline management, and build visualizations for data pipelines using Python

Course Number
it_dsdptbdj_02_enus

Expertise Level
Intermediate