Abstract:
We introduce two methods that enable recurrent neural networks (RNNs) to trade off accuracy against computational cost while a sequence is being analyzed. This makes it possible to adapt RNNs in real time to changing computational constraints, for example when running on hardware shared with other processes or on mobile edge computing nodes. The first approach makes minimal changes to the model and therefore avoids loading new parameters from slow memory. In the second approach, different models can replace one another during the analysis of a sequence. The second approach works on more data sets than the first. We evaluate both approaches on permuted MNIST, the adding task, and a human activity recognition task. We demonstrate that changing the computational cost of an RNN with our approaches leads to sensible results: the resulting accuracy and computational cost are typically a weighted average of the corresponding metrics of the models used, and the weight of each model increases with the number of time steps for which that model is used.
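To make the weighted-average relation above concrete, the following minimal sketch estimates the accuracy and cost of a run in which two models are each used for part of a sequence. It assumes that each model's weight is simply its share of the time steps. The function name `blended_metric` and all numbers are hypothetical illustrations, not results from our experiments. Since the relation holds only typically, the output should be read as a first-order estimate.

```python
def blended_metric(values, steps):
    """Time-step-weighted average of a per-model metric.

    values: metric of each model when used alone (e.g. accuracy or cost).
    steps:  number of time steps each model is used within the sequence.
    Assumption: a model's weight is its fraction of the total time steps.
    """
    if len(values) != len(steps):
        raise ValueError("values and steps must have the same length")
    total = sum(steps)
    if total <= 0:
        raise ValueError("total number of time steps must be positive")
    return sum(v * s for v, s in zip(values, steps)) / total


# Hypothetical stand-alone metrics for a small and a large model.
accuracies = [0.90, 0.96]   # fraction of correct predictions
costs = [1.0, 4.0]          # relative cost per time step

# The small model handles 300 time steps, the large one the remaining 100.
steps = [300, 100]

est_accuracy = blended_metric(accuracies, steps)
est_cost = blended_metric(costs, steps)
print(f"estimated accuracy: {est_accuracy:.3f}, estimated cost: {est_cost:.2f}")
```

Using the larger model for more time steps moves both estimates toward that model's stand-alone values, which is the trade-off the two approaches expose at run time.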