A few weeks ago, I put together a tutorial on how to get S3 data into TensorFlow, using pico SwiftStack as the example data source. This article goes into a little more detail about what pico SwiftStack is and how I used it to quickly test the S3 API.

First off, pico SwiftStack is a Docker image that lets you quickly run the access layer of SwiftStack so you can test for S3 or Swift API compatibility. It can also be used when integrating with a CI/CD system. Essentially, if your application works with pico SwiftStack, it will work with the full SwiftStack storage platform. pico SwiftStack is freely available, and if you need to test the S3 or Swift API in a private environment, I recommend giving it a try.


Step 1: Install Docker and AWS CLI


Docker install (Ubuntu):

sudo apt install docker.io

(On CentOS, use: sudo yum install docker)

AWS CLI install: both Ubuntu and CentOS

pip install awscli


Step 2: Run the picoswiftstack container and get the credentials

sudo docker run -d --rm -p 8080:8080 --hostname="picoswiftstack" --name="picoswiftstack" swiftstack/picoswiftstack

sudo docker exec picoswiftstack get_auth

[note down the output, e.g.]


======================= CLUSTER AUTHENTICATION =======================


Swift API Auth: http://<your VM IP>:<your exposed port>/auth/v1.0

SwiftStack Auth Username: test

SwiftStack Auth Password: test

S3 API URL: http://<your VM IP>:<your exposed port>

S3 API Region: us-east-1

S3 Access Key: test

S3 Secret Key: 78cffadc0bea806e405e9615ce8dbb0e
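If you are wiring pico SwiftStack into a CI/CD system, you may want to grab these values programmatically rather than by hand. A minimal sketch (the `parse_get_auth` helper is my own, not something shipped with pico SwiftStack):

```python
# Hypothetical helper: turn the "CLUSTER AUTHENTICATION" block printed by
# `docker exec picoswiftstack get_auth` into a dict of settings.
def parse_get_auth(output):
    creds = {}
    for line in output.splitlines():
        if ":" in line and not line.startswith("="):
            # Split on the first colon only, so URL values keep "http://..." intact
            key, _, value = line.partition(":")
            creds[key.strip()] = value.strip()
    return creds

sample = """\
S3 API Region: us-east-1
S3 Access Key: test
S3 Secret Key: 78cffadc0bea806e405e9615ce8dbb0e"""
creds = parse_get_auth(sample)
print(creds["S3 Secret Key"])  # 78cffadc0bea806e405e9615ce8dbb0e
```

In a pipeline you would feed it the real `docker exec` output instead of the sample string above.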




Step 3: Create credentials files for the AWS CLI client

mkdir -p ~/.aws


cat <<EOF | tee ~/.aws/config
[default]
region=us-east-1
EOF

cat <<EOF | tee ~/.aws/credentials
[default]
aws_access_key_id=test
aws_secret_access_key=[PUT YOUR 'S3 Secret Key' here]
EOF
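If you would rather script this step (again, handy in CI), the same two files can be written from Python. A sketch, assuming you pass the secret key in yourself; it writes to a scratch directory here so it does not clobber a real ~/.aws:

```python
import configparser
import tempfile
from pathlib import Path

def write_aws_files(aws_dir, secret_key, access_key="test", region="us-east-1"):
    """Write minimal AWS CLI config and credentials files into aws_dir."""
    aws_dir = Path(aws_dir)
    aws_dir.mkdir(parents=True, exist_ok=True)

    config = configparser.ConfigParser()
    config["default"] = {"region": region}
    with open(aws_dir / "config", "w") as f:
        config.write(f)

    creds = configparser.ConfigParser()
    creds["default"] = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    with open(aws_dir / "credentials", "w") as f:
        creds.write(f)

# Demo: write into a temporary directory (point it at Path.home() / ".aws" for real use)
tmp = tempfile.mkdtemp()
write_aws_files(tmp, secret_key="78cffadc0bea806e405e9615ce8dbb0e")
print((Path(tmp) / "credentials").read_text())
```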



Step 4: Verify functionality

Assuming that all worked as anticipated, you should be able to run:


aws --endpoint-url http://<your VM IP>:8080 s3 ls


and get no errors (and no content either, since there are no buckets yet). If you run into errors, it is probably because one of the values in the ~/.aws files is not set properly. You can also specify them manually when calling the AWS client for troubleshooting. Otherwise, verify that the Docker container is running, that you specified the correct IP and port with the AWS client, and that the endpoint is accessible.


Step 5: Create a bucket

In the Colab example, we use 'mnist' as the bucket name, but you can call it whatever you want as long as you set the variables accordingly in the Colab code.


aws --endpoint-url http://<your VM IP>:8080 s3 mb s3://mnist


Output should look like:


make_bucket: mnist


Step 6: Fetch and unzip data for upload

mkdir mnist

cd mnist/

wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
gunzip *

rm -f *.gz


Step 7: Upload the MNIST data

for i in *; do aws --endpoint-url http://<your VM IP>:8080 s3 cp "$i" s3://mnist; done


The output should look like:

upload: ./t10k-images-idx3-ubyte to s3://mnist/t10k-images-idx3-ubyte

upload: ./t10k-labels-idx1-ubyte to s3://mnist/t10k-labels-idx1-ubyte

upload: ./train-images-idx3-ubyte to s3://mnist/train-images-idx3-ubyte

upload: ./train-labels-idx1-ubyte to s3://mnist/train-labels-idx1-ubyte


Verify everything was uploaded properly:

aws --endpoint-url http://<your VM IP>:8080 s3 ls s3://mnist


The output should look like:

2019-07-08 22:15:25    7840016 t10k-images-idx3-ubyte

2019-07-08 22:15:26      10008 t10k-labels-idx1-ubyte

2019-07-08 22:15:27   47040016 train-images-idx3-ubyte

2019-07-08 22:15:28      60008 train-labels-idx1-ubyte
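The sizes above can be sanity-checked against the IDX file format MNIST uses: image files carry a 16-byte header plus 28×28 bytes per image, and label files carry an 8-byte header plus one byte per label. A quick check:

```python
# Expected on-disk sizes of the unzipped MNIST files, from the IDX format:
#   image files: 16-byte header + rows*cols bytes per image
#   label files: 8-byte header + 1 byte per label
def idx_image_bytes(n_images, rows=28, cols=28):
    return 16 + n_images * rows * cols

def idx_label_bytes(n_labels):
    return 8 + n_labels

print(idx_image_bytes(10000))  # t10k-images-idx3-ubyte:  7840016
print(idx_label_bytes(10000))  # t10k-labels-idx1-ubyte:  10008
print(idx_image_bytes(60000))  # train-images-idx3-ubyte: 47040016
print(idx_label_bytes(60000))  # train-labels-idx1-ubyte: 60008
```

If your listing shows different byte counts, the download or gunzip step likely went wrong.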


As always, if there’s any way we can help, please feel free to reach out.

About Author

Jon Kelly

Jon Kelly is Director of Machine Learning Solutions at SwiftStack. He has worked in emerging areas of the IT industry for N years, where N is a number larger than he cares to admit. His hobbies include obstinately pursuing demanding physical activities beyond the point at which any rational being would have stopped, and spending time with his family in a variety of activities such as attending tea parties, engaging in hijinks in the world of Minecraft, and empirically demonstrating just how much more energy a child whose age is in the single digits has as compared to an adult.