19/10/2020

PandaSQL: Parallel randomized triangle enumeration with SQL queries

Abir Farouzi, Ladjel Bellatreche, Carlos Ordonez, Gopal Pandurangan, Mimoun Malki

Keywords: triangle enumeration, DBMS, graphs, query language, parallelism

Abstract: Triangles are an important pattern in large-scale graph analysis because of their practical use in many real-life applications. However, as networks grow, maintaining a balanced computational load becomes challenging, especially for problems like triangle computation, because of skewed vertices. At the same time, a huge amount of data stored in database management systems (DBMSs) can be modeled and analyzed as graphs. With these motivations in mind, we developed PandaSQL, a novel approach that uses SQL queries to enumerate all the triangles in a given graph, based on the Randomized Triangle Enumeration Algorithm. Our approach is elegant, abstract, and short compared to implementations in traditional languages like C++ or Python. Moreover, our partitioning queries ensure perfect load balancing. Thus, the triangle enumeration is independent, local, and parallel.
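To make the idea of enumerating triangles with SQL concrete, the following is a minimal sketch and not the PandaSQL queries themselves. It assumes a hypothetical table `edge(u, v)` that stores each undirected edge once with `u < v`, and it uses a standard three-way self-join run through Python's built-in `sqlite3` module. The randomized partitioning and load balancing described in the abstract are not shown; the function name `enumerate_triangles` is an illustrative choice.

```python
import sqlite3


def enumerate_triangles(edges):
    """Return each triangle once as a tuple (a, b, c) with a < b < c.

    Illustrative only: a plain self-join on a hypothetical edge(u, v) table,
    without the partitioning step the abstract describes.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (u INTEGER, v INTEGER, PRIMARY KEY (u, v))")

    # Store every undirected edge once, oriented so that u < v.
    # Self-loops are dropped and duplicate edges collapse in the set.
    normalized = {(min(a, b), max(a, b)) for a, b in edges if a != b}
    conn.executemany("INSERT INTO edge (u, v) VALUES (?, ?)", sorted(normalized))

    # e1 = (a, b), e2 = (b, c), e3 = (a, c). Because every stored edge has
    # u < v, each triangle matches exactly once, with a < b < c.
    rows = conn.execute(
        """
        SELECT e1.u, e1.v, e2.v
        FROM edge AS e1
        JOIN edge AS e2 ON e2.u = e1.v
        JOIN edge AS e3 ON e3.u = e1.u AND e3.v = e2.v
        ORDER BY e1.u, e1.v, e2.v
        """
    ).fetchall()
    conn.close()
    return rows
```

For example, calling `enumerate_triangles([(1, 2), (2, 3), (1, 3), (3, 4), (4, 1)])` finds the triangles on vertices 1, 2, 3 and 1, 3, 4. Orienting edges so that `u < v` is one common way to avoid reporting the same triangle several times; other orderings, such as ordering by degree, are also used in practice.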

The video of this talk cannot be embedded. You can watch it here:
https://dl.acm.org/doi/10.1145/3340531.3417429#sec-supp
The talk and the respective paper are published at the CIKM 2020 virtual conference.
