*******************************************
*TensorFlow Serving* with *Docker* on *AWS*
*******************************************
.. raw:: html
.. admonition:: In this project/tutorial, we will
:class: spellbook-admonition-orange
- Deploy a **TensorFlow** model on **AWS** and serve it with
**TensorFlow Serving** in a **Docker container**
The source code file for this tutorial are located in
``examples/4-tensorflow-serving-docker-aws/``.
Deploying on *Amazon EC2*
=========================
In this tutorial, we will take the containerised example *TensorFlow* model
from the previous tutorial :doc:`/_examples/3-tensorflow-serving-docker/code`
and deploy it on *Amazon EC2*. To achieve this, we will
- create an *AWS Identity & Access Management* (:term:`IAM`)
user with the required permission policies
- install the *AWS Command Line Interface* (CLI)
- upload our *Docker* image to Amazon *Elastic Container Registry*
(:term:`ECR`)
- create a virtual server based on one of the *Deep Learning AMIs* on
Amazon *Elastic Cloud Compute* (:term:`EC2`)
- pull the *Docker* images from *ECR* to the server instance and run it there
- expose the server instance to the internet for answering HTTP
prediction requests via *TensorFlow Serving*'s REST API
The *Deep Learning AMIs* are container images with a range of machine learning
frameworks and tools installed, including *TensorFlow*, *PyTorch* and
*Apache MXNet*.
In this tutorial, we will issue commands both from our host machine
terminal as well as the terminal of virtual servers running on *EC2*
in the Amazon cloud. To mark the difference, the commands for the local
machine are prepended with ``$``, while the commands to be run on the
virtual server are prepended with ``(ec2) $``.
.. rubric:: Links & Resources
- Amazon ECR: `Using Amazon ECR with the AWS CLI
`_
- AWS Deep Learning Containers
- `What are AWS Deep Learning Containers?
`_
- `Amazon EC2 Tutorials
`_
- `Deep Learning Containers Images
`_
- GitHub: `Available Deep Learning Containers Images
`_
- `Release Notes for Deep Learning Containers
`_
Preparations
------------
Storage Encryption by Default
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- `Amazon EBS encryption
`_
In the *EC2* management console, go to
*EC2 Dashboard > Account attributes > EBS encryption*
and set *Always encrypt new EBS volumes* to *Enabled*.
If the alias ``aws/ebs`` does not work, go to the *AWS Key Management Service*
(:term:`KMS`) management console and copy over the proper ARN.
Creating an *IAM* User
^^^^^^^^^^^^^^^^^^^^^^
Either extend an existing user's permissions or create a new *IAM* user
with the following permission policies attached
- ``AmazonEC2ContainerRegistryFullAccess``
- ``AmazonECS_FullAccess``
If it doesn't yet exist, create an access key and note down the
*access key ID* and the *secret access key*.
In the following, I will assume a user with the name ``tf-fmnist-ec2``.
Installing the *AWS CLI*
^^^^^^^^^^^^^^^^^^^^^^^^
The `installation instructions
`_
are given in the user guide in the `AWS CLI documentation
`_.
Download the ``*.zip``-file, uncompress it, install it with
.. code:: bash
$ ./aws/install --install-dir /path/to/aws-installdir --bin-dir /path/to/aws-bin-dir
and add ``/path/to/aws-bin-dir`` to the ``PATH`` environment variable
in ``.bashrc``:
.. code:: bash
$ export PATH=$PATH:/path/to/aws-bin-dir
In case the *AWS CLI* is already installed, check that it is up to date
and if not, install the latest version as described `here
`_.
Run ``aws configure`` and add the access key created earlier as described
`here `_.
To add the user as the default profile, run
.. code:: bash
$ aws configure
Here, however, we will add the user ``tf-fmnist-ec2`` under a specific profile
with the same name:
.. code:: bash
$ aws configure --profile tf-fmnist-ec2
After adding the access key ID and the secret access key, you will be asked
for a region (````) and a default output format - in my case
``eu-central-1`` and ``json``.
A list of the regional endpoints and their names can be found
`here `_.
Profiles with a specific name, i.e. the non-default profiles, can be
used in ``aws`` commands like this:
.. code:: bash
$ aws s3 ls --profile
The profile information is stored in ``~/.aws/credentials`` and
``~/.aws/config``. More information on named profiles is given `here
`_.
Uploading the *Docker* Image to *ECR*
-------------------------------------
- `Using Amazon ECR with the AWS CLI
`_
On :term:`ECR`, create a private or public repository named
``tf-serving-fmnist`` and enable *Scan on push*.
The commands needed to push images to this repository
can be displayed in *ECR* by selecting the repository and clicking on
*View push commands*.
Next, we have to authenticate *Docker* on our local machine to this repository:
.. code:: bash
$ aws ecr get-login-password --profile tf-fmnist-ec2 --region | docker login --username AWS --password-stdin .dkr.ecr..amazonaws.com
Tag the image and push it to the ``tf-serving-fmnist`` repository on *ECR*:
- list the images
.. code:: bash
$ docker images
.. code-output::
REPOSITORY TAG IMAGE ID CREATED SIZE
tf-serving-fmnist latest f34fefc2ee4c 27 hours ago 411MB
tensorflow/serving latest e874bf5e4700 5 weeks ago 406MB
- tag the image
.. code:: bash
$ docker tag tf-serving-fmnist:latest .dkr.ecr..amazonaws.com/tf-serving-fmnist:latest
- push the image
.. code:: bash
$ docker push .dkr.ecr..amazonaws.com/tf-serving-fmnist:latest
.. code-output::
The push refers to repository [.dkr.ecr..amazonaws.com/tf-serving-fmnist]
75d124fb0170: Pushed
bb4423850a27: Pushed
b60ba33781cd: Pushed
547f89523b17: Pushed
bd91f28d5f3c: Pushed
8cafc6d2db45: Pushed
a5d4bacb0351: Pushed
5153e1acaabc: Pushed
latest: digest: sha256:8807f835d9beadfa22679630bcd75f9555695272245b11201bc39f9c8c55d6e0 size: 1991
The latest image from this :term:`ECR` repository can be pulled with
.. code:: bash
$ docker pull .dkr.ecr..amazonaws.com/tf-serving-fmnist:latest
as described `here
`_.
Serving the Model on *EC2*
--------------------------
Now, we are going to create an :term:`EC2` instance based on the
`Deep Learning Base AMI
`_.
In the *EC2* management console, go to *Network & Security > Key Pairs*,
create a key pair called ``tf-fmnist-ec2`` and download it as a ``*.pem``-file.
Then, go to *Instances* and click on *Launch instances*.
In this tutorial, we are using the
`AWS Deep Learning AMI (Ubuntu 18.04)
`_
from the AWS Marketplace.
Next, we have to choose an instance type. I have been running on a
``t2.medium`` which comes with 2 vCPUs, 4 GiB of memory and 'low to
moderate' network performance, which is more than enough for testing.
Also, it has storage on *EBS*, which can be configured to persist
after an *EC2* instance is terminated. To achieve this, when launching
an *EC2* instance, we have to deselect *Delete on Termination* in
*Step 4: Add Storage*.
In *Step 5: Add Tags*, you can add a tag with
- **key**: Name
- **value**: tf-fmnist-ec2
In *Step 6: Configure Security Group*, create a new security group with
the name ``tf-fmnist-ec2`` and description *SSH and TCP:8501 HTTP REST access*
and add the following rule:
- **type**: custom TCP
- **protocol**: TCP
- **port range**: 8501
- **source**: anywhere
- **description**: tensorflow-serving predict REST API
As the description suggests, opening port 8501 is necessary to allow
internet traffic requesting a prediction from the model via
*TensorFlow Serving*'s REST API to reach the *EC2* instance.
After confirming the configuration in *Step 7: Review Instance Launch* by
clicking on *Launch*, a dialog window will ask you to either select an existing
key pair or create a new one. Here we can choose the existing key pair
``tf-fmnist-ec2``. This concludes the instance configuration and will spin
up the requested server.
However, as long as no internet gateway is configured in *VPC*, the instance
cannot be reached via SSH or otherwise. Therefore, in the *VPC* management
console go to *Virtual Private Cloud > Internet Gateways*, click on
*Create internet gateway* and set the name tag to ``tf-fmnist-ec2``.
Once it is created, attach it to the existing VPC that is used by the
*EC2* instance: *Actions > Attach to VPC*. Next, in the *VPC* management
console, go to *Virtual Private Cloud > Route Tables*, select the appropriate
one and check that in the *Routes* tab, there is a route with the newly
created internet gateway as a target. If this is not the case, click on
*Edit routes* and add a route with the following configuration:
- **destination**: 0.0.0.0/0
- **target**: the newly created internet gateway
Now, if the instance has started and is marked as *running*, we can connect
to it from the terminal via *SSH* as described `here
`_:
.. code:: bash
$ ssh -i /path/to/tf-fmnist-ec2.pem ubuntu@
The command with the adequate IP address can be displayed in the
*EC2 management console* by selecting the instance and clicking on
*Actions > Connect > SSH client*.
.. note::
In case of connectivity issues, be it reaching the instance via SSH
or querying the deployed model via *TensorFlow Serving*, the
*VPC Reachability Analyzer* is a very helpful tool. In the *VPC*
management console, go to *Reachability > Reachability Analyzer*
and click on *Create and analyze path* to configure an analysis.
If the connection fails, the error message will give hints as to
what should be fixed or set up differently.
When connected to the *EC2* instance, update the *AWS CLI* as described
`here
`_
and then run ``aws configure``
.. code:: bash
(ec2) $ aws configure --profile tf-fmnist-ec2
and specify the credentials of the ``tf-fmnist-ec2`` *IAM* user.
We can then pull the *Docker* image from *ECR* as described `here
`_.
But before, we have to authenticate to the *ECR* repository - otherwise
we will get the ``no basic auth credentials`` error from *Docker*.
Since we are not on a tty console, the ``aws ecr ... | docker login ...``
command from before will not work. Instead, we can do
.. code:: bash
(ec2) $ docker login -u AWS -p $(aws ecr get-login-password --profile tf-fmnist-ec2 --region ) .dkr.ecr..amazonaws.com
as suggested `here `_.
Now we can pull the image with
.. code:: bash
(ec2) $ docker pull .dkr.ecr..amazonaws.com/tf-serving-fmnist:latest
.. code-output::
latest: Pulling from tf-serving-fmnist
01bf7da0a88c: Pull complete
f3b4a5f15c7a: Pull complete
57ffbe87baa1: Pull complete
e72e6208e893: Pull complete
6ea3f464ef73: Pull complete
01e9bf86544b: Pull complete
68f6bba3dc50: Pull complete
dafe84328936: Pull complete
Digest: sha256:8807f835d9beadfa22679630bcd75f9555695272245b11201bc39f9c8c55d6e0
Status: Downloaded newer image for 009984474629.dkr.ecr.eu-central-1.amazonaws.com/tf-serving-fmnist:latest
009984474629.dkr.ecr.eu-central-1.amazonaws.com/tf-serving-fmnist:latest
show the list of all images with
.. code:: bash
(ec2) $ docker images
.. code-output::
REPOSITORY TAG IMAGE ID CREATED SIZE
.dkr.ecr..amazonaws.com/tf-serving-fmnist latest f34fefc2ee4c 2 days ago 411MB
and run it with the same command already used in the previous tutorial
:doc:`/_examples/3-tensorflow-serving-docker/code`:
.. code:: bash
(ec2) $ docker run -p 8501:8501 -e MODEL_NAME=fmnist-model -t --name tf-serving-fmnist .dkr.ecr..amazonaws.com/tf-serving-fmnist
.. code-output::
[...]
2021-06-26 09:12:09.808270: I tensorflow_serving/model_servers/server.cc:414] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 245] NET_LOG: Entering the event loop ...
Now we can query the deployed model with an HTTP request to the REST API
like in the last tutorial :doc:`/examples/3-tensorflow-serving-docker/code`.
For this, the script ``1-request`` is provided in
``examples/4-tf-serving-docker-aws/``. In it, set the ``IPv4`` variable to
the appropriate address for your *EC2* instance.
Download a few example images of different items of clothing from the
internet and include their filenames in the script.
.. margin:: from **1-request.py**
in ``examples/4-tf-serving-docker-aws/``
.. code:: python
def load_images(images: Union[str, List[str]]):
if isinstance(images, str): images = [images]
array = np.empty(shape=(len(images), 28, 28, 1))
for i, image in enumerate(images):
img = tf.keras.preprocessing.image.load_img(
path = image,
color_mode = 'grayscale',
target_size = (28, 28))
array[i] = tf.keras.preprocessing.image.img_to_array(img=img)
return (255 - array) / 255
tshirts = [
'images/test/tshirt-1.jpg', 'images/test/tshirt-2.png',
'images/test/tshirt-3.jpg', 'images/test/tshirt-4.png'
]
sandals = [f'images/test/sandal-{i}.jpg' for i in range(1, 5)]
sneakers = [f'images/test/sneaker-{i}.jpg' for i in range(1, 5)]
# test_images = load_images(tshirts)
test_images = load_images(sandals)
# test_images = load_images(sneakers)
IPv4 = 'ec2-54-93-96-215.eu-central-1.compute.amazonaws.com'
# IPv4 = '54.93.96.215'
data = json.dumps({
'signature_name': 'serving_default',
'instances': test_images.tolist() # *either* 'instances'
# 'inputs': test_images.tolist() # *or* 'inputs'
})
headers = {'content-type': 'application/json'}
json_response = requests.post(
f'http://{IPv4}:8501/v1/models/fmnist-model:predict',
headers=headers,
data=data
)
predictions = json.loads(json_response.text)['predictions'] # for 'instances'
# predictions = json.loads(json_response.text)['outputs'] # for 'inputs'
for i, prediction in enumerate(predictions):
print('prediction {}: {} -> predicted class: {}'.format(
i, prediction, np.argmax(prediction)))
Runing the script on a few example pictures of sandals yields
.. code:: bash
$ python 1-request.py
.. code-output::
prediction 0: [6.30343275e-05, 1.85037115e-05, 7.5565149e-06, 2.83105837e-05, 1.74145697e-07, 0.998879, 3.21875377e-05, 2.88182633e-07, 2.93241665e-05, 0.000941563747] -> predicted class: 5
prediction 1: [0.00130418374, 2.53213402e-05, 3.95829147e-06, 7.95430169e-05, 1.08047243e-06, 0.975131869, 0.000146661841, 0.0195651725, 2.74548784e-05, 0.00371479546] -> predicted class: 5
prediction 2: [0.000241088521, 3.73763069e-05, 1.56952246e-05, 0.000210506681, 5.54936341e-05, 0.992369056, 9.06738e-05, 0.00564925186, 0.00128137472, 4.94648739e-05] -> predicted class: 5
prediction 3: [9.74363502e-06, 7.06914216e-06, 6.15942408e-05, 0.00042412043, 0.000195012341, 0.910350859, 1.11019081e-05, 0.0710703135, 0.0176780988, 0.000191997562] -> predicted class: 5
so all example images were correctly classified as sandals.