Scalable Graph Partitioning for Distributed Graph Processing

I just realized that the link to my doctoral thesis doesn’t work, so I decided to host it on the Finxter blog as a backup. Find the thesis here:

🔗 PDF Download link: https://blog.finxter.com/wp-content/uploads/2023/09/dissertation_christian_mayer_distributed_graph_processing_DIS-2019-03.pdf

Here’s the abstract:

💡 Abstract:

Distributed graph processing systems such as Pregel, PowerGraph, or GraphX have gained popularity due to their superior performance of data analytics on graph-structured data such as social networks, web document graphs, and biological networks. These systems scale out graph processing by dividing the graph into k partitions that are processed in parallel by k worker machines. The graph partitioning problem is NP-hard. Yet, finding good solutions for massive graphs is of paramount importance for distributed graph processing systems because it reduces communication overhead and latency of distributed graph processing. A multitude of graph partitioning heuristics emerged in recent years, fueled by the challenge of partitioning large graphs quickly.

The goal of this thesis is to tailor graph partitioning to the specifics of distributed graph processing and show that this leads to reduced graph processing latency and communication overhead compared to state-of-the-art partitioning.

In particular, we address the following four research questions.

(I) Recent partitioning algorithms unrealistically assume a uniform and constant amount of data exchanged between graph vertices (i.e., uniform vertex traffic) and homogeneous network costs between workers hosting the graph partitions. The first research question is: how to consider dynamically changing and heterogeneous graph workload for graph partitioning?

(II) Existing graph partitioning algorithms focus on minimal partitioning latency at the cost of reduced partitioning quality. However, we argue that the mere minimization of partitioning latency is not the optimal design choice in terms of minimizing total latency, i.e., the sum of partitioning and graph processing latency. The second research question is: how much latency should we invest into graph partitioning when considering that we often have to pay higher partitioning latency in order to achieve better partitioning quality (and therefore reduced graph processing latency)?

(III) Popular user-centric graph applications such as route planning and personalized social network analysis have initiated a shift of paradigms in modern graph processing systems towards multi-query analysis, i.e., processing multiple graph queries in parallel on a shared data graph. These applications generate a dynamic number of localized queries around query hotspots such as popular urban areas. However, the employed methods for graph partitioning and synchronization management disregard query locality and dynamism which leads to high query latency. The third question is: how to dynamically adapt the graph partitioning when multiple localized graph queries run in parallel on a shared graph structure?

(IV) Graphs are special cases of hypergraphs where each edge does not necessarily connect exactly two but an arbitrary number of vertices. Like graphs, they need to be partitioned as a pre-processing step for distributed hypergraph processing systems. Real-world hypergraphs have billions of vertices and a skewed degree distribution. However, no existing hypergraph partitioner tailors partitioning to the important subset of hypergraphs that are very large-scale and have a skewed degree distribution. Regarding this, the fourth research question is: how to partition these large-scale, skewed hypergraphs in an efficient way such that neighboring vertices tend to reside on the same partition?

We answer these research questions by providing the following four contributions.

(I) We developed the graph processing system GrapH that considers both, diverse vertex traffic and heterogeneous network costs. The main idea is to avoid frequent communication over expensive network links using an adaptive edge migration strategy.

(II) We developed a static partitioning algorithm ADWISE that allows to control the trade-off between partitioning latency and graph processing latency. Besides providing evidence for efficiency and effectiveness of our approach, we also show that state-of-the-art partitioning approaches invest too little latency into graph partitioning. By investing more latency into partitioning using ADWISE, total latency of partitioning and processing reduces significantly.

(III) We developed a distributed graph system QGraph for multi-query graph analysis that allows multiple localized graph queries to run in parallel on a shared graph structure. Our novel query-centric dynamic partitioning approach yields significant speedup as it repartitions the graph such that queries can be executed in a localized manner. This avoids expensive communication overhead while still providing good workload balancing.

(IV) We developed a novel hypergraph partitioning algorithm, called HYPE, that partitions the hypergraph by using the idea of neighborhood expansion. HYPE grows k partitions separately—expanding one vertex at a time over the neighborhood relation of the hypergraph. We show that HYPE leads to fast and effective partitioning performance compared to state-of-the-art hypergraph partitioning tools and partitions billion-scale hypergraphs on a single thread.

The algorithms and approaches presented in this thesis tailor graph partitioning towards the specifics of distributed graph processing with respect to (I) dynamic and heterogeneous traffic patterns and network costs, (II) the integrated latency of partitioning plus graph processing, and (III) the graph query workload for partitioning and synchronization. On top of that, (IV) we propose an efficient hypergraph partitioner which is specifically tailored to real-world hypergraphs with skewed degree distributions.