Abstract:
The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, we raise the question - do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a time series perspective. Since the term "long memory" is still not well-defined for a network, we propose a new definition for long memory network. To verify our theory, we make minimal modifications to RNN and LSTM and convert them to long memory networks, and illustrate their superiority in modeling long-term dependence of various datasets.