Tag Archives: HPC

Ray: An Open-Source API For Easy, Scalable Distributed Computing In Python – Part 2 Distributed Scaling

Through a series of four blog posts, we’ll discuss and provide working examples of how one can use the open-source library Ray to (a) scale computing locally (single machine), (b) distribute scaling remotely (multiple machines), and (c) serve deep learning models across a cluster (basic/advanced). Please note that the blog posts in this series increase in difficulty!

This is the second blog post in the series (the first one is here). In it, we will go into greater detail about how Ray Cluster creation works, the associated terminology, and the requirements for successful execution, and we will extend our previous local-only example to a distributed environment.


Installing Caffe on the Ohio Super Computing (OSC) Ruby Cluster

One of the perks of working at Case Western Reserve is that we often qualify for access to cutting-edge resources and special projects. In this case, since our digital histology deep learning work requires a large number of GPUs to analyze thousands of patients, we were granted access to the OSC Ruby cluster, which has 20 NVIDIA Tesla K40 GPUs. Since the cluster has only recently been set up, there was some legwork required on our end to get Caffe fully up and running without root access, which we’ll document here.
