Data Pipeline: Using Frameworks for Advanced Data Management

Data Pipeline    |    Intermediate
  • 10 Videos | 36m 1s
  • Includes Assessment
  • Earns a Badge
In this 10-video course, discover how to implement data pipelines with Python Luigi, integrate Spark and Tableau to manage data pipelines, use Dask arrays, and build data pipeline visualizations with Python. Begin by learning about the features of Celery and Luigi that can be used to set up data pipelines, then implement a data pipeline with Python Luigi. Next, turn to the Dask library: review the essential features it provides for task scheduling and big data collections, then learn how to implement Dask arrays to work with NumPy application programming interfaces (APIs). Explore frameworks that can be used for data exploration and visualization in data pipelines, and integrate Spark and Tableau to manage data pipelines. Move on to using Python to build visualizations for streaming data. Then learn about the data pipeline building capabilities provided by Kafka, Spark, and PySpark. The concluding exercise involves setting up Luigi to implement data pipelines, integrating Spark and Tableau for data pipeline management, and building visualizations for data pipelines with Python.

WHAT YOU WILL LEARN

  • recognize the features of Celery and Luigi that can be used to set up data pipelines
  • implement Python Luigi in order to set up data pipelines
  • list Dask task scheduling and big data collection features
  • implement Dask arrays in order to manage NumPy APIs
  • list frameworks that can be used to implement data exploration and visualization in data pipelines
  • integrate Spark and Tableau to manage data pipelines
  • use Python to build visualizations for streaming data
  • recognize the data pipeline building capabilities provided by Kafka, Spark, and PySpark
  • set up Luigi to implement data pipelines, integrate Spark and Tableau for data pipeline management, and build visualizations for data pipelines using Python

IN THIS COURSE

  1. Course Overview (1m 34s)
  2. Celery and Luigi (3m 45s)
  3. Data Pipeline with Python Luigi (3m 38s)
  4. Working with Dask Library (3m 11s)
  5. Dask Arrays (3m 59s)
  6. Data Exploration and Visualization Frameworks (3m 46s)
  7. Spark and Tableau (2m 26s)
  8. Streaming Data Visualization with Python (2m 51s)
  9. Data Pipeline Open Source Tools (3m 45s)
  10. Exercise: Implement Data Pipelines with Luigi (3m 7s)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Upon successful completion of this course, you can earn a digital badge from Skillsoft that can be shared on any social network or business platform.

Digital badges are yours to keep, forever.
