12/07/2020

Online Learning for Active Cache Synchronization

Andrey Kolobov, Sebastien Bubeck, Julian Zimmert

Keywords: Online Learning, Active Learning, and Bandits

Abstract: Existing multi-armed bandit (MAB) models make two implicit assumptions: an arm generates a payoff only when it is played, and the agent observes every payoff that is generated. This paper introduces synchronization bandits, a MAB variant where all arms generate costs at all times, but the agent observes an arm's instantaneous cost only when the arm is played. Synchronization MABs are inspired by online caching scenarios such as Web crawling, where an arm corresponds to a cached item and playing the arm means downloading its fresh copy from a server. While not refreshed, each cached item grows progressively stale with time, continuously generating stochastic costs due to degraded cache performance, but the cache does not know how much cost has accumulated until it refreshes the item and computes the difference between the item's fresh version and the old one. We present MirrorSync, an online learning algorithm for synchronization bandits, establish an adversarial regret bound of $O(T^{2/3})$ for it, and show how to make it efficient in practice.
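The feedback model described in the abstract can be illustrated with a small simulator. This is only an illustrative sketch, not the paper's formal model or the MirrorSync algorithm: the class name, the per-arm `change_rates` parameter, and the exponential cost increments are all hypothetical choices made for the example. What it captures is the key twist: every arm accrues cost at every step, but the agent observes (and resets) an arm's accumulated cost only by playing it.

```python
import random

class SyncBanditEnv:
    """Toy synchronization-bandit environment (illustrative sketch only).

    Every arm accrues stochastic staleness cost at every step; the agent
    sees an arm's accumulated cost only when it plays (synchronizes) that
    arm, which also resets that arm's cost to zero.
    """

    def __init__(self, change_rates, seed=0):
        self.rng = random.Random(seed)
        # Hypothetical parameter: mean cost each arm accrues per step.
        self.change_rates = change_rates
        # Cost accrued since each arm's last sync; hidden from the agent.
        self.pending = [0.0] * len(change_rates)

    def step(self, arm):
        # All arms accrue cost this step, whether observed or not.
        for i, rate in enumerate(self.change_rates):
            if rate > 0:
                self.pending[i] += self.rng.expovariate(1.0 / rate)
        # Playing `arm` reveals, and resets, only that arm's accumulated cost.
        observed = self.pending[arm]
        self.pending[arm] = 0.0
        return observed

# A naive round-robin syncing policy over three items with different
# staleness rates; a learning algorithm would instead adapt its schedule.
env = SyncBanditEnv([0.1, 1.0, 2.0])
observed_costs = [env.step(t % 3) for t in range(300)]
```

Because costs keep accruing between plays, a good policy must trade off syncing fast-changing items often against not starving slow-changing ones, which is what distinguishes this setting from a standard bandit.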

The talk and the respective paper are published at the ICML 2020 virtual conference.

