worse than our baseline of not doing so. However, we can add a convolutional layer between the first and second flow layers, flow-conv-flow (FcF) (Fig. 6), allowing the model to better learn longer-term flow representations. We find this performs best, as the intermediate layer is able to smooth the flow and produce a better input for the representation flow layer. However, we find that adding a third flow layer reduces performance, as the motion representation becomes unreliable due to the large spatial receptive field size. In Fig. 7, we visualize the learned flow-of-flow, which is a smoother, acceleration-like feature with abstract motion patterns.
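
To give some intuition for why the flow-of-flow reads as acceleration-like, the flow-conv-flow ordering can be sketched on a toy signal. This is only an illustration under strong assumptions, not the model itself: `stand_in_flow` is a hypothetical placeholder that takes plain differences between consecutive frames rather than running the learned, iterative representation flow layer, and `smooth_conv` uses a fixed 1D smoothing kernel where the model would use a learned convolution. All function names and the kernel are illustrative choices.

```python
# Toy sketch of the flow-conv-flow (FcF) ordering.
# ASSUMPTION: stand_in_flow is a plain temporal difference, NOT the
# representation flow layer; smooth_conv is a fixed filter, NOT a learned one.

def stand_in_flow(frames):
    """Placeholder flow: per-position difference between consecutive frames.

    Takes T frames and returns T - 1 "flow" frames.
    """
    return [
        [nxt_v - prev_v for prev_v, nxt_v in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]


def smooth_conv(frames, kernel=(0.25, 0.5, 0.25)):
    """Stand-in for the intermediate conv layer: 1D spatial smoothing.

    Borders are handled by replicating the edge values, so the output
    has the same width as the input.
    """
    half = len(kernel) // 2
    smoothed = []
    for row in frames:
        padded = [row[0]] * half + list(row) + [row[-1]] * half
        smoothed.append(
            [
                sum(k * padded[i + j] for j, k in enumerate(kernel))
                for i in range(len(row))
            ]
        )
    return smoothed


def flow_conv_flow(frames):
    """Apply flow, then the smoothing conv, then flow again."""
    first_flow = stand_in_flow(frames)      # T - 1 frames, velocity-like
    smoothed = smooth_conv(first_flow)      # same shape, spatially smoothed
    return stand_in_flow(smoothed)          # T - 2 frames, acceleration-like


# Five frames, four positions each; every position holds t**2 at time t,
# i.e. a signal with constant second difference.
frames = [[float(t * t)] * 4 for t in range(5)]
flow_of_flow = flow_conv_flow(frames)
print(flow_of_flow)
```

With this toy input the first stand-in flow is 1, 3, 5, 7 over time, and the second pass returns a constant value of 2 at every position, which is the discrete second difference of t². In this simplified setting, stacking two flow steps therefore yields a second-order, acceleration-like quantity, and each flow step consumes one frame, so five input frames yield three output frames.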