Abstract:
Recently, gradient-based methods have been developed for Bi-Level Programs (BLPs) in the learning and vision fields. The success of these methods relies heavily on the simplifying assumption that, for each fixed upper-level variable, the lower-level solution is a singleton (i.e., the Lower-Level Singleton, or LLS, condition). However, LLS is usually too restrictive to hold in real-world complex scenarios. This paper first presents a counter-example illustrating that existing gradient-based bi-level schemes fail in the absence of the LLS condition. To address this critical issue, we propose a new method, named Bi-level Descent Aggregation (BDA), which aims to broaden the application horizon of first-order schemes for BLPs. In particular, by investigating BLPs from the viewpoint of optimistic bi-level optimization, BDA establishes a generic algorithmic framework in which the aggregation of hierarchical objective information yields flexible bi-level iteration schemes. Our theoretical investigation proves the strict convergence of BDA for general BLPs without the LLS condition. We also show that BDA is compatible with a variety of particular first-order computation modules. Additionally, as an interesting byproduct, we improve conventional first-order bi-level schemes under the LLS simplification, establishing their convergence under weaker assumptions. Extensive experiments verify our theoretical results and demonstrate the superiority of the proposed BDA on different tasks, including hyper-parameter optimization and meta-learning.
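To make the "aggregation of hierarchical objective information" concrete, below is a minimal sketch of what a BDA-style inner loop could look like. It is our reading of the abstract, not the authors' exact scheme: each lower-level step descends a convex combination of the gradients (with respect to the lower-level variable) of the lower-level objective f and the upper-level objective F. The step sizes `s_l`, `s_u` and the decaying aggregation weight `alpha` are hypothetical names introduced here for illustration.

```python
def bda_inner_loop(x, y0, grad_f_y, grad_F_y, K=50, s_l=0.1, s_u=0.1):
    """Sketch of a BDA-style inner loop (assumed form, not the paper's exact update).

    grad_f_y(x, y): gradient of the lower-level objective f w.r.t. y
    grad_F_y(x, y): gradient of the upper-level objective F w.r.t. y
    """
    y = y0
    for k in range(K):
        alpha = 1.0 / (k + 2)  # hypothetical decaying aggregation weight in (0, 1)
        # Aggregated descent direction: mixing upper- and lower-level information
        # lets the iterates discriminate among lower-level solutions, so the
        # scheme does not rely on the LLS (singleton) assumption.
        d = alpha * s_u * grad_F_y(x, y) + (1 - alpha) * s_l * grad_f_y(x, y)
        y = y - d
    return y

# Toy usage with f(x, y) = 0.5 * (y - x)**2 and F(x, y) = 0.5 * y**2:
y_final = bda_inner_loop(
    x=1.0, y0=0.0,
    grad_f_y=lambda x, y: y - x,
    grad_F_y=lambda x, y: y,
)
```

In this sketch the upper-level gradient term vanishes as `alpha` decays, so late iterations are dominated by lower-level descent while early iterations bias the trajectory toward upper-level-favorable solutions; this is one plausible way to realize the "optimistic bi-level" viewpoint the abstract describes.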