23/08/2020

Efficient algorithm for the b-matching graph

Yasuhiro Fujiwara, Atsutoshi Kumagai, Sekitoshi Kanai, Yasutoshi Ida, Naonori Ueda

Keywords: efficient, algorithm, b-matching graph

Abstract: The b-matching graph is a useful approach to computing a graph from high-dimensional data. Unlike the k-NN graph that greedily connects each data point to its k nearest neighbors and typically has more than k edges, each data point in the b-matching graph uniformly has b edges; the idea is reduce edges between cross-clusters that have different semantics. In addition, edge weights are obtained from regression results of each data pointand restricted to be non-negative to improve the robustness for data noise. The b-matching graph can more effectively model high-dimensional data than the traditional k-NN graph. However, the construction cost of the b-matching graph is impractical for large-scale data sets. This is because, to determine edges in the graph, it needs to iteratively update messages between all pairs of data points until convergence, and it computes non-negative edge weights of each data point by applying a solver intended for quadratic programming problems. Our proposal, b-dash, can efficiently construct a b-matching graph because of its two key techniques: (1) it prunes unnecessary update messages in determining edges and (2) it incrementally computes edge weights by exploiting the Sherman-Morrison formula. Experiments show that our approach is up to 58.6 times faster than the previous approaches while guaranteeing result optimality.

The video of this talk cannot be embedded. You can watch it here:
https://dl.acm.org/doi/10.1145/3394486.3403061#sec-supp
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at KDD 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers