Utilization of current GPUs is often limited by the ability to get the data onto and off the device quickly. More precisely, this means taking data from the host RAM, transferring it over the PCI-e bus to the GPU RAM is the bottleneck of many deep learning use cases.Continue reading Transferring data FASTER to the GPU With Compression
One of the perks of working at Case Western Reserve is that we often qualify for access to cutting edge resource and special projects. In this case, since our digital histology deep learning work requires a large number of GPUs to analyze thousands of patients, we were granted access to the OSC Ruby cluster, which has 20 NVIDIA Tesla K40 GPUs. Since the cluster has only recently been setup, there was some leg work required on our end to get Caffe fully up and running, without root access, which we’ll document here.