19/10/2020

Intent-driven similarity in e-commerce listings

Gilad Fuchs, Yoni Acriche, Idan Hasson, Pavel Petrov

Keywords: machine learning, e-commerce, sentence similarity

Abstract: Discovering similarities between online listings is a common backend task that supports many downstream experiences at eBay. Our baseline method for unstructured listing similarity measures the semantic textual similarity between the embedding vectors of listing titles. However, we found that even with the latest contextualized embedding methods, this similarity fails to give proper weight to the key tokens in the title. As a result, it often surfaces listing matches that are not similar enough, which hurts the downstream experiences. In this paper we present a method we call "Listing2Query", or "L2Q", which uses a sequence labeling approach to learn token importance from our users' search queries and on-site behaviour. We used pairs of listing titles and their matching search queries, together with a contextualized character language model, to train L2Q as a bidirectional recurrent neural network that produces token importance weights. We demonstrate that plugging these weights into relatively straightforward listing similarity methods is a simple way to significantly improve the similarity results, to the extent that they consistently outperform results from popular representations such as BERT. Notably, this approach is not limited to large online marketplaces; it can be generalized to other settings that include a search-driven experience and a recall set of short documents.
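
To illustrate what "plugging token importance weights into a straightforward similarity method" might look like, here is a minimal sketch using weighted Jaccard similarity over title tokens. The choice of weighted Jaccard, the function name `weighted_jaccard`, and all weight values are assumptions made for illustration: the weights are hand-picked, not produced by L2Q, and the abstract does not say which similarity methods the authors used.

```python
from typing import Dict


def weighted_jaccard(weights_a: Dict[str, float], weights_b: Dict[str, float]) -> float:
    """Weighted Jaccard: sum of per-token minimum weights over sum of maximum weights.

    A token missing from a title is treated as having weight 0.
    """
    tokens = set(weights_a) | set(weights_b)
    numerator = sum(min(weights_a.get(t, 0.0), weights_b.get(t, 0.0)) for t in tokens)
    denominator = sum(max(weights_a.get(t, 0.0), weights_b.get(t, 0.0)) for t in tokens)
    return numerator / denominator if denominator > 0 else 0.0


def uniform(title: str) -> Dict[str, float]:
    """Baseline: every token in the title gets the same weight."""
    return {token: 1.0 for token in title.split()}


# Two hypothetical listing titles: a phone and a phone case.
phone_title = "apple iphone 11 64gb black unlocked"
case_title = "apple iphone 11 case black"

# Hypothetical, hand-picked importance weights (NOT model output).
# The idea: for the case listing, "case" is the token a buyer would search for.
phone_weights = {"apple": 0.9, "iphone": 1.0, "11": 0.8,
                 "64gb": 0.5, "black": 0.2, "unlocked": 0.3}
case_weights = {"apple": 0.2, "iphone": 0.3, "11": 0.3,
                "case": 1.0, "black": 0.2}

uniform_score = weighted_jaccard(uniform(phone_title), uniform(case_title))
weighted_score = weighted_jaccard(phone_weights, case_weights)

print(f"uniform weights:    {uniform_score:.3f}")
print(f"importance weights: {weighted_score:.3f}")
```

With uniform weights the two titles look fairly similar because they share most tokens. With the illustrative importance weights the score drops, since the tokens that carry each listing's intent barely overlap.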

The video of this talk cannot be embedded. You can watch it here:
https://dl.acm.org/doi/10.1145/3340531.3412715#sec-supp
The talk and the corresponding paper were published at the CIKM 2020 virtual conference.
