Abstract
In this paper, we propose a convolutional layer inspired by optical flow algorithms to learn motion representations. Our representation flow layer is a fully-differentiable layer designed to capture the ‘flow’ of any representation channel within a convolutional neural network for action recognition. Its parameters for iterative flow optimization are learned in an end-to-end fashion together with the other CNN model parameters, maximizing the action recognition performance. Furthermore, we newly introduce the concept of learning ‘flow of flow’ representations by stacking multiple representation flow layers. We conducted extensive experimental evaluations, confirming its advantages over previous recognition models using traditional optical flows in both computational speed and performance. The code is publicly available.¹