[Submitted on 19 Apr 2023]
Abstract: This technical report presents the application of a recurrent memory to
extend the context length of BERT, one of the most effective Transformer-based
models in natural language processing. By leveraging the Recurrent Memory
Transformer architecture, we have successfully increased the model's effective context length to an unprecedented two million tokens, while maintaining high memory retrieval accuracy.
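
The mechanism behind this kind of context extension in the Recurrent Memory Transformer is segment-level recurrence: the long input is split into segments, a small set of memory tokens is prepended to each segment, and the memory states produced while processing one segment are fed as input to the next. The PyTorch sketch below is a minimal illustration of that idea under assumed names (SegmentRecurrentEncoder, num_mem, mem_init); it is not the authors' implementation.

```python
import torch
import torch.nn as nn

class SegmentRecurrentEncoder(nn.Module):
    """Minimal RMT-style encoder sketch: memory tokens are prepended to
    each segment, and their updated states are carried to the next segment."""

    def __init__(self, d_model=768, nhead=12, num_layers=2, num_mem=10):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        # Learnable initial memory, shared across all inputs.
        self.mem_init = nn.Parameter(torch.randn(num_mem, d_model))
        self.num_mem = num_mem

    def forward(self, segments):
        # segments: list of (batch, seg_len, d_model) embedding tensors.
        batch = segments[0].size(0)
        memory = self.mem_init.unsqueeze(0).expand(batch, -1, -1)
        outputs = []
        for seg in segments:
            # Memory tokens read from and write to the segment via self-attention.
            x = torch.cat([memory, seg], dim=1)
            x = self.encoder(x)
            # Updated memory states become the recurrent input to the next segment.
            memory = x[:, :self.num_mem]
            outputs.append(x[:, self.num_mem:])
        return torch.cat(outputs, dim=1), memory


# Example: process a 2048-token input as four 512-token segments.
model = SegmentRecurrentEncoder()
segments = [torch.randn(1, 512, 768) for _ in range(4)]
out, mem = model(segments)
print(out.shape, mem.shape)  # torch.Size([1, 2048, 768]) torch.Size([1, 10, 768])
```

Because the memory is a fixed number of tokens, per-segment compute stays constant regardless of total input length, while information can in principle propagate across arbitrarily many segments through the recurrence.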