In fact, stochastic gradient descent is an extension of the gradient descent algorithm. In deep learning, the cost function can be decomposed into a sum of per-sample cost functions, so computing a single full gradient step becomes increasingly expensive as the training set grows. The core idea of stochastic gradient descent is that the gradient is an expectation, and an expectation can be approximated using a small sample (a minibatch) drawn from the training set.
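The idea above can be sketched as follows. This is a minimal, illustrative minibatch SGD loop for least-squares linear regression; the function name, learning rate, and batch size are assumptions, not part of the original text.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.1, batch_size=8, epochs=200, seed=0):
    """Minibatch SGD sketch: each minibatch gradient is an unbiased
    estimate of the full-dataset gradient (the expectation)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        idx = rng.permutation(n)          # shuffle samples each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient of the mean squared error on the minibatch only,
            # instead of summing over the entire training set.
            grad = 2.0 / len(batch) * Xb.T @ (Xb @ w - yb)
            w -= lr * grad
    return w

# Usage: recover known weights from noiseless synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w
w = sgd_linear_regression(X, y)
```

Because each minibatch gradient costs O(batch_size) rather than O(n), the per-update cost is independent of the training-set size, which is exactly why the approximation pays off on large datasets.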