A key problem in applying deep convolutional networks to computer vision is finding a large, consistent dataset suited to the specific task. Our experiments require a database of image pairs drawn from landmark-type datasets. Collecting such a set of image pairs is non-trivial and often involves testing many candidate pairs by matching SIFT features and performing geometric verification. For our experiments, we use five crowd-sourced image collections downloaded from Flickr, each corresponding to a popular landmark [2]: London Eye (LE), 6856 images; San Marco (SM), 7580 images; Tate Modern (TM), 4583 images; Times Square (TS), 6361 images; and Trafalgar (T), 6802 images. The original datasets contained both color and grayscale images.
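To make the pair-collection step concrete, the following is a minimal sketch of testing one candidate pair by descriptor matching followed by a geometric check. It is not the pipeline used to build the datasets. It assumes keypoint locations and descriptors (e.g. SIFT) have already been extracted, and it swaps in a pure-translation consistency check for the fundamental-matrix or homography RANSAC that a real pipeline would typically use. The function names and the thresholds (`ratio`, `tol`, `min_inliers`) are hypothetical and chosen only for illustration.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Match descriptors from image A to image B using a nearest-neighbor ratio test.

    A match (i, j) is kept only if the nearest descriptor in B is clearly
    closer than the second nearest. The value of `ratio` is an assumption.
    """
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


def count_translation_inliers(pts_a, pts_b, matches, tol=3.0):
    """Toy geometric verification under a pure-translation model (a simplification).

    Each match proposes a translation; the score is the largest number of
    matches that agree with a single proposed translation within `tol` pixels.
    """
    best = 0
    for i, j in matches:
        dx = pts_b[j][0] - pts_a[i][0]
        dy = pts_b[j][1] - pts_a[i][1]
        inliers = sum(
            1
            for a, b in matches
            if math.hypot(pts_b[b][0] - pts_a[a][0] - dx,
                          pts_b[b][1] - pts_a[a][1] - dy) <= tol
        )
        best = max(best, inliers)
    return best


def is_valid_pair(pts_a, desc_a, pts_b, desc_b, min_inliers=20):
    """Accept a candidate image pair if enough matches are geometrically consistent."""
    matches = ratio_test_matches(desc_a, desc_b)
    return count_translation_inliers(pts_a, pts_b, matches) >= min_inliers
```

Running such a check over many candidate pairs within each landmark collection is what makes the collection step expensive: the number of candidate pairs grows quadratically with the number of images.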