The randomization experiment shows that optimization continues to work well even when generalization performance is no better than random guessing, i.e., 10% accuracy in the case of the CIFAR-10 benchmark, which has 10 classes. The optimization method is, moreover, insensitive to properties of the data, since it works even on random labels. A consequence of this simple experiment is that a proof of convergence for the optimization method may not reveal any insights into the nature of generalization.
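
The phenomenon can be illustrated with a small sketch. It is not the CIFAR-10 experiment itself: as a stand-in, it assumes synthetic Gaussian inputs, uniformly random labels over 10 classes, and an overparameterized linear model trained by plain gradient descent on a squared loss. All sizes, the step size, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, dim, n_classes = 200, 2000, 1000, 10  # assumed sizes, dim > n_train

# Inputs and labels are drawn independently, so the labels carry no signal.
X_train = rng.standard_normal((n_train, dim))
y_train = rng.integers(0, n_classes, size=n_train)
X_test = rng.standard_normal((n_test, dim))
y_test = rng.integers(0, n_classes, size=n_test)

Y_onehot = np.eye(n_classes)[y_train]

# Gradient descent on 0.5 * ||X W - Y||^2, with the step size set to the
# inverse of the largest eigenvalue of X^T X so that the iteration is stable.
W = np.zeros((dim, n_classes))
step = 1.0 / np.linalg.norm(X_train, 2) ** 2
for _ in range(500):
    W -= step * X_train.T @ (X_train @ W - Y_onehot)

train_acc = np.mean(np.argmax(X_train @ W, axis=1) == y_train)
test_acc = np.mean(np.argmax(X_test @ W, axis=1) == y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Under these assumptions, gradient descent drives the training error to zero on the random labels, while accuracy on fresh data stays near the 10% chance level. Convergence of the optimizer therefore says nothing by itself about generalization.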