Last Updated: 2022-12-01
I ran out of GPU memory in a Jupyter notebook on a server after hitting some exceptions. The garbage collector could not free that memory. I had to rerun the whole notebook, which took ages because it was ML work.
It turns out the issue was that Python was keeping the last exception around, and that exception held a reference to a massive PyTorch dataset. Instead of restarting, I could have gotten rid of it by raising a new exception, one with a negligible memory footprint.
Exceptions can pin huge amounts of memory, because the traceback keeps every frame's local variables alive. The easiest way to kill that reference is to raise a new exception.
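
Here is a minimal sketch of the mechanism in plain Python rather than a notebook. The names `BigDataset`, `failing_step`, `small_failure`, `capture` and `last_error` are made up for illustration. `last_error` stands in for the slot where the interactive interpreter keeps the last exception (`sys.last_value`, with its traceback in `sys.last_traceback`), and I am assuming the notebook kernel holds on to it in the same way.

```python
import gc
import weakref


class BigDataset:
    """Hypothetical stand-in for a large object such as a PyTorch dataset."""

    def __init__(self):
        self.payload = bytearray(10_000_000)  # roughly 10 MB


probe = {}  # holds only a weak reference, so it never keeps the dataset alive


def failing_step():
    dataset = BigDataset()               # local variable of this frame
    probe["ref"] = weakref.ref(dataset)  # lets us check later if it still exists
    raise RuntimeError("simulated failure")


def small_failure():
    raise ValueError("tiny exception with nothing big in its frames")


def capture(fn):
    """Run fn and return the exception it raises, traceback included."""
    try:
        fn()
    except Exception as exc:
        return exc


# Simulates the interpreter remembering the last exception.
last_error = capture(failing_step)
gc.collect()
# The traceback references failing_step's frame, whose locals hold the dataset.
alive_before = probe["ref"]() is not None

# Simulates raising a new exception: the remembered one is replaced.
last_error = capture(small_failure)
gc.collect()
alive_after = probe["ref"]() is not None

print(alive_before, alive_after)
```

In a notebook, the equivalent would be running a cell that does nothing but `raise Exception("clear")`. This only helps if nothing else, such as a variable of your own, still points at the old exception or the dataset.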