12/07/2020

Consistent Structured Prediction with Max-Min Margin Markov Networks

Alex Nowak, Francis Bach, Alessandro Rudi

Keywords: Sequential, Network, and Time-Series Modeling

Abstract: Max-margin methods for binary classification, such as the support vector machine (SVM), have been extended to the structured prediction setting under the name of max-margin Markov networks ($M^3N$), or more generally structural SVMs. These methods can model interactions between output parts and incorporate a cost between labels. Unfortunately, they are inconsistent when the relationship between inputs and labels is far from deterministic. To overcome this limitation, we go beyond max-margin and define the learning problem in terms of a "max-min" margin formulation. The resulting method, which we name max-min margin Markov networks ($M^4N$), provides a correction of the $M^3N$ loss that is key to achieving consistency in the general case. We prove consistency and finite-sample generalization bounds for $M^4N$ and provide an explicit algorithm to compute the estimator. The algorithm has strong statistical and computational guarantees: in the worst case, it achieves a generalization error of $O(1/\sqrt{n})$ for a total cost of $O(n\sqrt{n})$ marginalization-oracle calls, each of which has essentially the same cost as the max-oracle from $M^3N$. Experiments on multi-class classification and handwritten character recognition demonstrate the effectiveness of the proposed method over $M^3N$.
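For orientation, the max-margin ($M^3N$-style) loss that the abstract takes as its starting point can be sketched in the simplest multi-class case. This is only an illustrative sketch of the standard structured hinge loss, not the authors' code: the function names, the example scores, and the 0-1 cost matrix are assumptions made for this example, and the max-min correction introduced by $M^4N$ is not shown.

```python
def zero_one_cost(num_labels):
    """Hypothetical helper: cost[y][y'] is 0 if y == y', else 1."""
    return [[0.0 if i == j else 1.0 for j in range(num_labels)]
            for i in range(num_labels)]


def max_margin_loss(scores, y, cost):
    """Structured hinge loss: max over y' of (cost[y][y'] + scores[y']) minus scores[y].

    In the multi-class case the max-oracle is a plain loop over labels;
    for structured outputs it would be a combinatorial search instead.
    """
    augmented = [cost[y][k] + s for k, s in enumerate(scores)]
    return max(augmented) - scores[y]


# Illustrative scores for three labels (assumed values).
scores = [2.0, 0.5, -1.0]
cost = zero_one_cost(3)

loss_correct = max_margin_loss(scores, 0, cost)  # true label has the top score
loss_wrong = max_margin_loss(scores, 1, cost)    # true label is outscored
```

With these assumed scores, the loss is zero when the true label wins by a margin of at least the cost, and positive when another label scores higher.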

The talk and the corresponding paper were published at the ICML 2020 virtual conference.


Similar Papers