Hi!

I have a GAN-generator setup where I want to compute the loss of a generated image w.r.t. the true image and then backpropagate that loss to update the input vector. The relevant code would look something like this:

```
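import torch

for epoch in range(100):
    for true_img, latent in dataset:  # latent: leaf tensor with requires_grad=True
        gen_img = generator(latent)
        loss = loss_func(true_img, gen_img)
        # Clear the latent's gradient from the previous step (it is None before the first backward)
        if latent.grad is not None:
            latent.grad.zero_()
        loss.backward()
        # Gradient-descent step on the latent, done in place and outside autograd
        with torch.no_grad():
            latent.add_(latent.grad, alpha=-learning_rate)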
```

My question is: does it matter whether I also zero out the gradients of the generator's parameters? My guess is that, since gradients accumulate in each parameter's `.grad` across `backward()` calls, these accumulated values are used in the chain rule when computing the derivative w.r.t. `latent`, so the latent would get updated incorrectly.
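One way to test this guess would be a small comparison like the one below. It is only a sketch: the tiny `generator`, the `MSELoss`, the tensor shapes and the helper `latent_grad` are hypothetical stand-ins, not my real setup. It runs two backward passes with the same latent, once zeroing the generator's gradients in between and once letting them accumulate, and compares the resulting `latent.grad`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real generator, loss and data
generator = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 16))
loss_func = nn.MSELoss()
true_img = torch.randn(1, 16)
latent = torch.randn(1, 4, requires_grad=True)

def latent_grad(zero_generator_grads):
    """Run two backward passes and return latent.grad from the second one."""
    generator.zero_grad()   # start both experiments from a clean state
    latent.grad = None
    for _ in range(2):
        if zero_generator_grads:
            generator.zero_grad()
        latent.grad = None  # the latent's own gradient is always reset
        loss = loss_func(true_img, generator(latent))
        loss.backward()
    return latent.grad.clone()

grad_zeroed = latent_grad(zero_generator_grads=True)
grad_accumulated = latent_grad(zero_generator_grads=False)
same = torch.allclose(grad_zeroed, grad_accumulated)
print("latent.grad identical in both cases:", same)
```

If the two gradients match, the stale values in the generator's `.grad` fields are not feeding into the gradient of the latent, and zeroing them would only matter if the generator itself were being updated by an optimizer.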

Please correct me if this guess is incorrect.

Thanks!