Graph Modeling on Apache Spark: Working with Apache Spark GraphFrames

Apache Spark 3.2    |    Intermediate
  • 13 Videos | 1h 51m 32s
  • Includes Assessment
  • Earns a Badge
Apache Spark, a widely used analytics engine, also supports powerful graph analytics. GraphFrames, a Spark package, aids this work by providing implementations of various graph algorithms. Use this course to learn about GraphFrames and how to apply graph algorithms to data to extract insights. Explore how GraphFrames complements the Apache Hadoop ecosystem in processing graph data. Getting hands-on, construct and visualize a GraphFrame. Practice querying the nodes and relationships in a graph and finding motifs in it. Moving along, work with the breadth-first search and shortestPaths functions to find paths between graph nodes. Finally, apply the PageRank algorithm to identify the most relevant nodes in a network. Upon completion, you'll be able to use GraphFrames to analyze graph data and generate insights from it.

WHAT YOU WILL LEARN

  • discover the key concepts covered in this course
  • outline Apache Hadoop and its ecosystem, describe GraphFrames and their capabilities, and recognize where GraphFrames fit into the Apache Hadoop ecosystem
  • download and install Apache Spark and set up your IDE with GraphFrames
  • construct a GraphFrame starting with the definition of its nodes and edges
  • define functions to present a directed as well as an undirected graph
  • identify the most-connected and the least-connected nodes in a graph
  • apply filters on the nodes in a graph at the DataFrame and the GraphFrame levels
  • apply filters on the edges of a graph and apply aggregation operations on them
  • search for patterns of relationships between the nodes in a Spark GraphFrame
  • illustrate how to find chains of connections as well as cycles in a GraphFrame
  • use the breadth-first search and the shortestPaths functions to find the shortest paths between nodes in a graph
  • apply the triangle count algorithm to identify triangles of connections in a graph, and the PageRank algorithm to calculate the page rank for a graph of connected web pages
  • summarize the key concepts covered in this course
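
Several of the objectives above (node degrees, breadth-first search for shortest paths, and PageRank) can be illustrated with a small sketch. The block below is plain Python and does not use the GraphFrames API; the toy edge list and the helper names (build_adjacency, degrees, bfs_path, pagerank) are hypothetical and chosen only for illustration, and the damping factor of 0.85 is a common default rather than a value taken from the course.

```python
from collections import deque

# Hypothetical toy graph: directed edges as (source, destination) pairs.
EDGES = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]


def build_adjacency(edges):
    """Map every node to the list of nodes it points to."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
        adj.setdefault(dst, [])  # make sure sink nodes appear as keys
    return adj


def degrees(edges):
    """Total degree (in-degree plus out-degree) of every node."""
    deg = {}
    for src, dst in edges:
        deg[src] = deg.get(src, 0) + 1
        deg[dst] = deg.get(dst, 0) + 1
    return deg


def bfs_path(adj, start, goal):
    """Shortest path by hop count from start to goal, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start node.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def pagerank(adj, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling nodes spread their rank evenly."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in adj}
        for v, outs in adj.items():
            if outs:
                share = damping * rank[v] / len(outs)
                for w in outs:
                    new_rank[w] += share
            else:
                for w in adj:
                    new_rank[w] += damping * rank[v] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    print(degrees(EDGES))
    print(bfs_path(adj, "b", "d"))
    print(pagerank(adj))
```

In GraphFrames itself, the corresponding operations are exposed on a GraphFrame object (for example its degrees DataFrame and its bfs, shortestPaths, and pageRank methods), and they run on distributed DataFrames rather than on in-memory dictionaries as in this sketch.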

IN THIS COURSE

  1. Course Overview (2m 23s)
  2. An Overview of GraphFrames (11m 51s)
  3. Setting up PySpark and GraphFrames (7m 26s)
  4. Constructing a GraphFrame (11m 46s)
  5. Visualizing a GraphFrame (10m 19s)
  6. Calculating the Degrees of Nodes in a Graph (6m 17s)
  7. Filtering the Nodes in a GraphFrame (9m 58s)
  8. Filtering the Edges in a GraphFrame (7m 54s)
  9. Finding Simple Motifs in GraphFrames (11m 45s)
  10. Searching for Complex Patterns in Graphs (10m 24s)
  11. Finding the Shortest Paths between Nodes in a Graph (12m 5s)
  12. Applying the PageRank Algorithm (7m 30s)
  13. Course Summary (1m 54s)

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of this course. You can share the badge on any social network or business platform.

Digital badges are yours to keep, forever.