This past summer, I shared a tutorial on how to load large datasets into TensorFlow 1.x from object storage using the S3 API. This new tutorial shows how TensorFlow 2.0 can leverage data from an object store.

For those unfamiliar, object storage is the de facto standard for storing unstructured data in environments requiring massive scale, geographical distribution, high durability, high throughput/concurrency, or some combination of those. Traditional storage suffers from several major bottlenecks that limit scalability (filesystem and file metadata handling, for example), particularly when you start using multi-petabyte datasets and require the tremendous throughput and concurrency we see in modern GPU-driven ML workloads. Object storage platforms, on the other hand, scale out, so aggregate throughput tends to increase as they grow, and they can handle even the most demanding workloads. This is why object storage is quickly becoming ubiquitous in large-scale Deep Learning use cases.


TensorFlow 2.0: Keras and More

TensorFlow 2.0 represents a huge step forward from TensorFlow 1.x. The largest and most noticeable change is that the high-level API has been replaced with Keras. The importance of this change is hard to overstate. Keras, the Python deep learning library, is incredibly easy to use (as you’ll see in this exercise), but it previously lagged behind TensorFlow in performance and distributed processing capabilities. With TensorFlow 2.0 you can write your models in Keras and still get the performance and distributed processing power of TensorFlow.
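To give a feel for how compact a Keras model is in TensorFlow 2.x, here is a minimal sketch of a small convolutional network for 28x28 grayscale digits. The layer sizes, dropout rate, and the `build_model` name are my own illustrative assumptions, not the exact model used in the Colab notebook.

```python
import tensorflow as tf


def build_model():
    """Build a small CNN for 28x28x1 images (illustrative layer sizes)."""
    model = tf.keras.Sequential([
        # Convolution + max pooling extract local image features.
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        # Dropout randomly zeroes activations during training to reduce overfitting.
        tf.keras.layers.Dropout(0.25),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        # One output per digit class (0-9).
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # labels are integer class ids
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```

Training would then be a single `model.fit(...)` call on the image and label arrays, which is much of what makes the Keras API pleasant to work with.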

Other changes in TensorFlow 2.0 include eager execution by default (which makes building and debugging TensorFlow code much easier), general API cleanup, and standardizing on tf.data for input pipelines.
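A quick sketch of what two of these changes look like in practice, assuming the input-pipeline API referred to here is `tf.data`. With eager execution, operations return concrete values immediately instead of building a graph to run later in a session, and a `tf.data.Dataset` can be iterated like any Python iterable. The toy tensors are placeholders of my own.

```python
import tensorflow as tf

# Eager execution: the result is available right away, no session required.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x)
print(y.numpy())

# tf.data: build an input pipeline from in-memory data and batch it.
dataset = tf.data.Dataset.from_tensor_slices(tf.range(10)).batch(4)

# Datasets are plain Python iterables under eager execution.
batch_sizes = [int(batch.shape[0]) for batch in dataset]
print(batch_sizes)
```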


Tutorial: Training a Deep Neural Network to Recognize Hand-Written Digits

Neural Network Training Data



For this exercise, we’ll be using the MNIST dataset: a collection of hand-written digits (0-9) that represents the “hello world” of Deep Learning.  We will train a Deep Neural Network (DNN) to identify hand-written digits with >99% accuracy in just a few minutes!

As with the TensorFlow 1.x tutorial, we leverage an object storage backend. That is overkill for a dataset as small as this one, but it becomes critical when working with large-scale deep learning datasets because of the concurrency and throughput limitations of traditional storage solutions. I based the training workflow on the “TensorFlow 2.0 Quickstart for Experts” tutorial and added a few bells and whistles. The goals of this exercise are:

  1. Do some training with the new native Keras functionality in TensorFlow 2.0.
  2. Leverage S3-compatible object storage as the data source.
  3. Visualize the results in a useful manner.
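For the second goal, pulling data from an S3-compatible object store with Boto3 might look roughly like the sketch below. The helper names (`make_s3_client`, `fetch_objects`) are hypothetical, and the endpoint, credentials, bucket, and object keys are whatever your own object store uses; none of them come from the notebook itself.

```python
import os


def make_s3_client(endpoint_url, access_key, secret_key):
    """Create a Boto3 S3 client that talks to an S3-compatible endpoint."""
    import boto3  # imported lazily so fetch_objects can be used with any client

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,  # point at the object store instead of AWS
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def fetch_objects(client, bucket, keys, dest_dir):
    """Download each object key from the bucket into dest_dir.

    Returns the list of local file paths, in the same order as keys.
    """
    os.makedirs(dest_dir, exist_ok=True)
    local_paths = []
    for key in keys:
        local_path = os.path.join(dest_dir, os.path.basename(key))
        # download_file(Bucket, Key, Filename) streams the object to disk.
        client.download_file(bucket, key, local_path)
        local_paths.append(local_path)
    return local_paths
```

Keeping the client separate from the download helper means the same code can target any S3-compatible endpoint simply by changing the `endpoint_url` passed to `make_s3_client`.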

This tutorial is a little more involved than the last one. In a nutshell, here is the workflow:

  1. Install the required modules (including TensorFlow 2.0 via pip, because Colab still uses 1.x by default).
  2. Verify we have the correct version of TensorFlow, and that GPU support is enabled (training will be quite slow without it).
  3. Acquire the MNIST data using Boto3.
  4. Load the data for training.
  5. Create a model – we just stick with a fairly standard model leveraging Convolutional Neural Networks, Max Pooling, and Dropout.
  6. Train the model – 10 epochs should get us to over 99% test accuracy.
  7. Visualize the outputs (via Confusion Matrix and by looking at the incorrect predictions made by the model on the test dataset).
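The last step is worth a small illustration. A confusion matrix counts, for each true digit, how often the model predicted each possible digit, and the misclassified test images are simply the positions where prediction and label disagree. Here is a minimal NumPy sketch; the function names and the toy labels are made up for the example and are not taken from the notebook.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes=10):
    """Return a (num_classes x num_classes) count matrix.

    Rows are true labels, columns are predicted labels.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when (row, col) pairs repeat.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def misclassified_indices(y_true, y_pred):
    """Return the positions where the prediction differs from the label."""
    return np.flatnonzero(np.asarray(y_true) != np.asarray(y_pred))


# Toy labels standing in for real test-set labels and model predictions.
y_true = np.array([0, 1, 2, 2, 9, 9])
y_pred = np.array([0, 1, 2, 7, 9, 4])

cm = confusion_matrix(y_true, y_pred)
wrong = misclassified_indices(y_true, y_pred)
print(cm)
print(wrong)
```

With real predictions, the diagonal of the matrix holds the correct classifications, and the indices in `wrong` can be used to pull up the test images the model got wrong.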

I hope that you find this tutorial on Google Colab useful! Once again, if anything is unclear, if you have any questions or feedback on any of this (including questions related to Object Storage for AI/ML/DL workloads), or if you just want to say hello, you can reach out to us at any time or contact me on Twitter at @j0nkelly.

About Author

Jon Kelly

Jon Kelly is Director of Machine Learning Solutions at SwiftStack. He has worked in emerging areas of the IT industry for N years, where N is a number larger than he cares to admit. His hobbies include obstinately pursuing demanding physical activities beyond the point at which any rational being would have stopped, and spending time with his family in a variety of activities such as attending tea parties, engaging in hijinks in the world of Minecraft, and empirically demonstrating just how much more energy a child whose age is in the single digits has as compared to an adult.