
I'm running a CNN with keras-gpu and tensorflow-gpu with a NVIDIA GeForce RTX 2080 Ti on Windows 10. My computer has an Intel Xeon E5-2683 v4 CPU (2.1 GHz). I'm running my code through Jupyter (most recent Anaconda distribution).

The output in the command terminal shows that the GPU is being utilized; however, the script I'm running takes longer than I expect to train/test on the data, and when I open the task manager the GPU utilization looks very low. Note that the CPU isn't being utilized either, and nothing else in the task manager suggests anything is being fully utilized.

The model I'm running is a fully convolutional neural network (basically a U-Net architecture) with 566,290 trainable parameters. I'm training on a lot of data (~128 GB), which is all loaded into RAM (512 GB). I don't have an ethernet connection and am connected to WiFi (I don't think this affects anything, but I'm not sure with Jupyter since it runs through the web browser).
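
For reference, the overall shape of the network is roughly what the minimal Keras sketch below shows. This is illustrative only: the layer widths, depth, and input size are placeholders, not my exact 566,290-parameter model.

```python
from tensorflow import keras
from tensorflow.keras import layers

def tiny_unet(input_shape=(128, 128, 1)):
    # Minimal U-Net-style fully convolutional model (placeholder sizes,
    # not my actual architecture).
    inputs = keras.Input(shape=input_shape)

    # Encoder: conv blocks + downsampling
    c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
    p1 = layers.MaxPooling2D()(c1)
    c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)
    p2 = layers.MaxPooling2D()(c2)

    # Bottleneck
    b = layers.Conv2D(64, 3, padding="same", activation="relu")(p2)

    # Decoder: upsampling + skip connections back to the encoder
    u2 = layers.UpSampling2D()(b)
    u2 = layers.Concatenate()([u2, c2])
    c3 = layers.Conv2D(32, 3, padding="same", activation="relu")(u2)
    u1 = layers.UpSampling2D()(c3)
    u1 = layers.Concatenate()([u1, c1])
    c4 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)

    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    return keras.Model(inputs, outputs)

model = tiny_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()  # prints the trainable-parameter count
```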

Things I've tried:

1. Increasing batch size from 20 to 10,000 (increases GPU usage from ~3-4% to ~6-7%, greatly decreases training time as expected).
2. Setting use_multiprocessing to True and increasing number of workers in model.fit (no effect).
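
For context, the fit call I'm describing is along these lines (a simplified sketch assuming the tf.keras 2.x fit signature; the array names, shapes, and epoch count are placeholders, and `model` is the kind of compiled Keras model sketched above):

```python
import numpy as np

# Placeholder arrays standing in for the real data already loaded in RAM.
x_train = np.random.rand(1000, 128, 128, 1).astype("float32")
y_train = np.random.rand(1000, 128, 128, 1).astype("float32")

# `model` is the compiled Keras model from the sketch above.
model.fit(
    x_train,
    y_train,
    batch_size=10000,          # raised from 20; GPU usage went from ~3-4% to ~6-7%
    epochs=10,
    workers=8,                 # no visible effect
    use_multiprocessing=True,  # no visible effect
)
```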

I followed the installation steps on this website. Note that this installation specifically DOESN'T install CuDNN or CUDA. I've had trouble in the past with getting tensorflow-gpu running with CUDA (although I haven't tried in over 2 years, so maybe it's easier with the latest versions), which is why I used this installation method.
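
In case it matters, this is roughly how one can check from inside the notebook whether the installed TensorFlow build was compiled with CUDA at all and whether it sees the 2080 Ti (a quick diagnostic sketch assuming TF 2.x APIs):

```python
import tensorflow as tf

# Was this TensorFlow build compiled against CUDA at all?
print("Built with CUDA:", tf.test.is_built_with_cuda())

# Does TensorFlow actually see the RTX 2080 Ti? (TF 2.x API)
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))

# CUDA/cuDNN versions the wheel was built with, if available (TF 2.3+).
try:
    print(tf.sysconfig.get_build_info())
except AttributeError:
    pass
```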

Is this most likely the reason why the GPU isn't being fully utilized (no CuDNN/CUDA)? Does it have something to do with the dedicated GPU memory usage being a bottleneck? Or maybe something to do with the network architecture I'm using (number of parameters, etc.)?
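
For what it's worth, a minimal way to confirm whether ops are actually being placed on the GPU (again a sketch assuming the TF 2.x API) would be:

```python
import tensorflow as tf

# Log which device (CPU vs GPU) each op runs on.
tf.debugging.set_log_device_placement(True)

# A trivial matmul; the log should show /device:GPU:0 if the GPU is used.
a = tf.random.uniform((1024, 1024))
b = tf.random.uniform((1024, 1024))
c = tf.matmul(a, b)
print(c.device)
```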
