Abstract:
Representation learning on graphs, as an alternative to traditional feature engineering, has been applied in many domains, ranging from e-commerce to computational biology. However, generating satisfactory video embeddings and putting them into practical use to improve recommendation performance remains a challenge. In this paper, we present Equuleus, a video embedding approach that learns video embeddings from user interaction behaviors. Equuleus incorporates user behavior characteristics into both the construction of the video graph and the generation of node sequences. To quantify the contributions of different attributes to the embeddings, we propose an attributed encoder network, which employs an attention mechanism to aggregate the attributes so that each one's contribution can be distinguished. We also use user feedback as a guide to correct the generation of embeddings. Video embeddings generated by Equuleus have been used for relevant video recommendation in MX Player. Based on real data from MX Player, we conduct extensive offline experiments and an online A/B test. Both the offline experimental results and the online click-through rates (CTRs) show that Equuleus generates high-quality video embeddings and works effectively in a real-world production environment.
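
As an illustration only, attention-based aggregation of attribute vectors could be sketched as follows. The function names, the dot-product scoring against a query vector, and the toy values are assumptions made for this example; they are not taken from Equuleus, whose actual encoder is not specified in this abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_attributes(attribute_vectors, query):
    """Combine per-attribute vectors into one embedding with attention weights.

    Assumption: each attribute is scored by its dot product with a
    (hypothetical) learned query vector; the abstract does not say how
    the scores are actually computed.
    """
    scores = [sum(a * q for a, q in zip(vec, query)) for vec in attribute_vectors]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the attribute vectors, one coordinate at a time.
    embedding = [
        sum(w * vec[i] for w, vec in zip(weights, attribute_vectors))
        for i in range(dim)
    ]
    return embedding, weights


# Toy example: two 2-dimensional attribute vectors (e.g. genre and language).
attribute_vectors = [[1.0, 0.0], [0.0, 1.0]]
query = [1.0, 0.0]
embedding, weights = aggregate_attributes(attribute_vectors, query)
```

The returned weights sum to one, so they can be read as the relative contribution of each attribute to the final embedding.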