19/10/2020

Learning formatting style transfer and structure extraction for spreadsheet tables with a hybrid neural network architecture

Haoyu Dong, Jiong Yang, Shi Han, Dongmei Zhang

Keywords: deep learning, automatic formatting, document intelligence

Abstract: Spreadsheet users routinely format tables to better exhibit table structures and data relationships. However, formatting tables quickly and effectively is a challenge: many manual operations are needed, especially for complex tables. In this paper, we propose techniques for table formatting style transfer, i.e., automatically formatting a target table according to the style of a reference table. Considering the latent many-to-many mappings between table structures and formats, we propose CellNet, a novel end-to-end, multi-task model that leverages conditional Generative Adversarial Networks (cGANs) and has three key components, which respectively (1) model and recognize table structures; (2) encode formatting styles; and (3) learn and apply the latent mapping based on the recognized table structure and the encoded style. Moreover, we build a spreadsheet table corpus containing 5,226 tables with high-quality formats and 784 tables with human-labeled structures. Our evaluation shows that CellNet is highly effective on both quantitative metrics and human perception studies when compared with heuristic-based and other learning-based methods.

The video of this talk cannot be embedded. You can watch it here:
https://dl.acm.org/doi/10.1145/3340531.3412718#sec-supp
The talk and the respective paper are published at the CIKM 2020 virtual conference.
