RGB differences to achieve state-of-the-art performance. Unlike prior works, our proposed model with representation flow layers relies only on RGB input, learning far fewer parameters while correctly representing motion with the iterative optimization. It is significantly faster than the video CNNs requiring optical flow input, while still performing as well as or even better than the two-stream models. It clearly outperforms existing motion representation methods, including TVNet [5] and OFF [21], in both speed and accuracy, which we experimentally confirm.