Our training method, illustrated in Fig. 3, builds upon the work of Thys et al. [10]. In each iteration, a batch of images containing airplanes is used for patch training. The current adversarial patch (initialized randomly) is placed on the airplanes annotated in the ground truth. Before placement, the patches are scaled, rotated, corrupted with noise and contrast-stretched, so that they resemble real-life recording conditions. In our scenario, the patches are randomly rotated over the full 360 degrees, since objects in aerial images have no fixed orientation. The effect of this can be observed in the final trained patches (Fig. 4), which contain mostly circularly symmetric patterns. The YOLOv2 network is used during training, but its weights are kept fixed; in each iteration, back-propagation updates only the patch. The following loss formula is used for optimization:
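As an aside before the loss definition, the randomized patch transformations described above (rotation over the full 360 degrees, additive noise, contrast stretching) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the noise level, contrast range, and nearest-neighbour resampling are illustrative assumptions.

```python
import numpy as np

def random_patch_transform(patch, rng):
    """Sketch of the randomized augmentations applied to the patch before
    placement: full 360-degree rotation, additive noise, contrast jitter.
    `patch` is an (H, W, 3) float array with values in [0, 1]."""
    h, w, _ = patch.shape
    # Random rotation angle drawn uniformly over the full circle.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse-map each output pixel to its source coordinate.
    src_y = np.cos(theta) * (ys - cy) - np.sin(theta) * (xs - cx) + cy
    src_x = np.sin(theta) * (ys - cy) + np.cos(theta) * (xs - cx) + cx
    # Nearest-neighbour resampling (an assumption; any resampling works here).
    sy = np.clip(np.round(src_y).astype(int), 0, h - 1)
    sx = np.clip(np.round(src_x).astype(int), 0, w - 1)
    out = patch[sy, sx]
    # Additive noise and a random contrast/brightness jitter
    # (ranges are illustrative, not the paper's exact values).
    out = out + rng.normal(0.0, 0.02, out.shape)
    out = out * rng.uniform(0.8, 1.2) + rng.uniform(-0.1, 0.1)
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 1.0, (64, 64, 3))
transformed = random_patch_transform(patch, rng)
```

In the full training pipeline these transformations must be implemented differentiably, since gradients flow through the placed patch back to the patch pixels; the loss driving that optimization is given next.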