Deep Learning

Diagnosing cancer [1], recognising voice commands [2], enabling cars to recognise pedestrians and obstacles [3], fighting spam [4] [5] and beating Go champions [6]: these are just a few applications that highlight the massive potential of deep learning. Deep multi-layered neural networks that learn from a large set of training data can tackle problems which are extremely hard or even impossible for humans to program manually. While the additional layers of nodes in the network topology mean the system can represent more abstract concepts, the size of the network makes the training process very computationally expensive, and training can take days or weeks on non-optimised hardware.

The field of deep learning is undergoing a revolution as a result of the availability of big data, which can be used for training, and of cheap, highly parallel computational hardware such as Graphics Processing Units (GPUs), which accelerate that training. These factors have made it possible for researchers to experiment with and create deep learning systems that can perform a huge variety of useful tasks.

The Department of Computer Science at the University of Sheffield has invested in the NVIDIA DGX-1 system, the world’s first supercomputer purpose-built for deep learning. The machine is equipped with 8 Tesla P100 GPUs connected together with NVLink technology to provide super-fast inter-GPU communication. Capable of performing 170 teraflops of computation, it can provide up to a 75 times speed-up when training deep neural networks compared with the latest Intel Xeon CPU. Deep learning models that took days to train can now be trained in hours, giving researchers more flexibility to experiment with different configurations, process and classify increasing amounts of data, and achieve better results. Sheffield is one of very few universities in the UK that have made an early investment in this technology.
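To put the quoted figures in perspective, the short Python sketch below works through the arithmetic. It is a back-of-the-envelope illustration only: the function names and the three-day example job are hypothetical, and it assumes the full "up to 75 times" speed-up is achieved, which is a best case and not a typical result.

```python
def best_case_hours(cpu_hours: float, speedup: float = 75.0) -> float:
    """Training time in hours if the full quoted speed-up were achieved.

    The default of 75 is the "up to" figure quoted above, so the result
    is a lower bound on training time, not a prediction.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return cpu_hours / speedup


def per_gpu_teraflops(total_teraflops: float = 170.0, gpus: int = 8) -> float:
    """Share of the quoted aggregate throughput attributable to each GPU."""
    if gpus <= 0:
        raise ValueError("gpus must be positive")
    return total_teraflops / gpus


# Hypothetical example: a job that takes three days on the CPU baseline.
three_days_in_hours = 3 * 24
print(f"Best-case training time: {best_case_hours(three_days_in_hours):.2f} hours")
print(f"Throughput per GPU: {per_gpu_teraflops():.2f} teraflops")
```

Under these assumptions a three-day job drops to just under an hour, and each of the 8 GPUs accounts for a little over 21 teraflops of the 170 teraflops total. Real speed-ups depend on the model, the data pipeline and how well the workload scales across GPUs.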

The University of Sheffield, an existing NVIDIA GPU research and teaching centre, has specialist Research Software Engineering (RSE) staff (including EPSRC RSE Fellows Dr Paul Richmond and Dr Mike Croucher) within the RSE Sheffield group. The RSE group can assist with the transition of legacy and HPC codes to this exciting new GPU architecture, as well as with applying deep learning to large data problems on it. The group undertakes both research and industrial consultancy and is looking to expand its collaborations with new groups and industrial partners wishing to use the new DGX-1 hardware.

For more information, contact


[1] D. Wang, A. Khosla, R. Gargeya, H. Irshad, and A. H. Beck, “Deep Learning for Identifying Metastatic Breast Cancer,” arXiv [q-bio.QM], 18-Jun-2016.

[2] A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger, S. Satheesh, S. Sengupta, A. Coates, and A. Y. Ng, “Deep Speech: Scaling up end-to-end speech recognition,” arXiv [cs.CL], 17-Dec-2014.

[3] D. Tomè, F. Monti, L. Baroffio, L. Bondi, M. Tagliasacchi, and S. Tubaro, “Deep convolutional neural networks for pedestrian detection,” arXiv [cs.CV], 13-Oct-2015.

[4] G. Tzortzis and A. Likas, “Deep Belief Networks for Spam Filtering,” in 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), 2007, vol. 2, pp. 306–309.

[5] “Google Says Its AI Catches 99.9 Percent of Gmail Spam,” Wired, 09-Jul-2015.

[6] S. Byford, “Google’s DeepMind defeats legendary Go player Lee Se-dol,” The Verge, 09-Mar-2016. [Online].