11 March 2019
8:30 am - 6:00 pm
Room 300, Level 3, Suntec Singapore Convention & Exhibition Centre

Attendee Set Up Requirements
To maximize your time during your DLI training, please follow the instructions below before attending your first training session:
1. You must bring your own laptop to run the training; please also bring its charger.
2. A current browser is needed. For optimal performance, Chrome, Firefox, or Safari (on Mac) are recommended. Internet Explorer works but does not provide the best performance.
3. Create an account at http://courses.nvidia.com/join. Click the “Create account” link to create a new account. If you are told your account already exists, please try logging in instead. If you are asked to link your “NVIDIA Account” with your “Developer Account”, just follow the on-screen directions.
4. Ensure your laptop will run the training smoothly by going to http://websocketstest.com/. Under “Environment”, make sure WebSockets are supported, and under “WebSockets”, make sure “Data Receive”, “Data Send”, and “Echo Test” all show Yes. If there are issues with WebSockets, try updating your browser.

If you have any questions, please contact [email protected].

Learn how to use multiple GPUs to train neural networks and effectively parallelize training of deep neural networks using TensorFlow.

The computational requirements of deep neural networks used to enable AI applications like self-driving cars are enormous. A single training cycle can take weeks on a single GPU, or even years for larger datasets like those used in self-driving car research. Using multiple GPUs for deep learning can significantly shorten the time required to train on large amounts of data, making it feasible to solve complex problems with deep learning.

This course will teach you how to use multiple GPUs to train neural networks. You’ll learn:
• Approaches to multi-GPU training
• Algorithmic and engineering challenges to large-scale training
• Key techniques used to overcome the challenges mentioned above

Upon completion, you’ll be able to effectively parallelize training of deep neural networks using TensorFlow.

  • 08:30 Registration
  • 09:00 Theory of Data Parallelism
  • 09:45 Introduction to Multi-GPU Training
  • 10:30 Morning Break
  • 11:00 Introduction to Multi-GPU Training (Continued)
  • 12:30 Lunch
  • 13:30 Algorithmic Challenges to Multi-GPU Training
  • 15:30 Afternoon Break
  • 15:45 Engineering Challenges to Multi-GPU Training
  • 17:45 Closing Comments and Questions (15 mins)

*Agenda is subject to change
Content Level: Beginner; Introduction to Deep Learning


Define a simple neural network and a cost function, then iteratively calculate the gradient of the cost function with respect to the model parameters and update them using the SGD optimization algorithm.
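As a minimal sketch of this step, the toy example below defines a linear model and a mean-squared-error cost in TensorFlow 1.x graph style and updates the parameters with SGD; the synthetic data, model, and hyperparameters are illustrative assumptions, not the lab's actual code.

```python
import tensorflow as tf

# Synthetic regression data: y = 3x + 2 plus noise (re-sampled each step).
x = tf.random_normal([256, 1])
y = 3.0 * x + 2.0 + tf.random_normal([256, 1], stddev=0.1)

# Parameters of a very simple "network": one linear layer.
w = tf.Variable(tf.random_normal([1, 1]))
b = tf.Variable(tf.zeros([1]))

y_pred = tf.matmul(x, w) + b                    # forward pass
cost = tf.reduce_mean(tf.square(y_pred - y))    # mean squared error cost function

# Gradient of the cost with respect to the model parameters, and the SGD update.
learning_rate = 0.1
grad_w, grad_b = tf.gradients(cost, [w, b])
train_op = tf.group(w.assign_sub(learning_rate * grad_w),
                    b.assign_sub(learning_rate * grad_b))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(100):
        _, c = sess.run([train_op, cost])
        if step % 20 == 0:
            print(step, c)
```

The same loop structure carries over to deeper networks: only the forward pass and the list of parameters handed to tf.gradients change.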


Learn to transform a single-GPU implementation into a Horovod multi-GPU implementation, reducing the complexity of writing efficient distributed software. Understand the data loading, augmentation, and training logic using the AlexNet model.
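The Horovod changes to a single-GPU TensorFlow script typically come down to a handful of lines, sketched below; the tiny linear model and random data are placeholders standing in for the lab's AlexNet and its input pipeline, which are not reproduced here.

```python
import tensorflow as tf
import horovod.tensorflow as hvd

hvd.init()                                       # 1. initialize Horovod (one process per GPU)

# 2. Pin each process to a single GPU.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Placeholder model and data standing in for AlexNet and the real input pipeline.
features = tf.random_normal([32, 10])
labels = tf.random_normal([32, 1])
predictions = tf.layers.dense(features, 1)
loss = tf.losses.mean_squared_error(labels, predictions)

# 3. Scale the learning rate by the number of workers and wrap the optimizer,
#    so gradients are averaged across all GPUs every step.
opt = tf.train.GradientDescentOptimizer(0.01 * hvd.size())
opt = hvd.DistributedOptimizer(opt)
train_op = opt.minimize(loss, global_step=tf.train.get_or_create_global_step())

# 4. Broadcast the initial variables from rank 0 so every worker starts identically.
hooks = [hvd.BroadcastGlobalVariablesHook(0),
         tf.train.StopAtStepHook(last_step=200)]

with tf.train.MonitoredTrainingSession(config=config, hooks=hooks) as sess:
    while not sess.should_stop():
        sess.run(train_op)
```

Such a script is launched with one process per GPU, for example with mpirun -np 4 python train.py or Horovod's horovodrun wrapper.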


Understand the data input pipeline, communication, and reference architecture aspects of multi-GPU training, and take a deeper dive into the concepts of job scheduling.
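For the data input pipeline aspect, here is a minimal sketch of a tf.data pipeline that shards records across workers, parallelizes decoding and augmentation, and prefetches batches so the GPUs are not starved for data. The function and its parse_fn argument are illustrative assumptions, not the course's reference code.

```python
import tensorflow as tf

def make_dataset(filenames, parse_fn, batch_size, num_workers, worker_rank):
    """Build an input pipeline for one worker.

    parse_fn is a hypothetical user-supplied function mapping a serialized
    record to a (features, label) pair, including any augmentation.
    """
    ds = tf.data.TFRecordDataset(filenames)
    ds = ds.shard(num_workers, worker_rank)       # each worker reads a disjoint slice
    ds = ds.shuffle(buffer_size=10000)
    ds = ds.map(parse_fn, num_parallel_calls=8)   # decode/augment on CPU threads
    ds = ds.batch(batch_size)
    ds = ds.prefetch(buffer_size=2)               # overlap input with GPU compute
    return ds
```

With Horovod, num_workers and worker_rank would typically be hvd.size() and hvd.rank().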

Prerequisites: Experience with stochastic gradient descent, network architectures, and parallel computing.