Description

This CK-powered container is our attempt to provide a common API to customize, build and run AI and ML applications with different models, frameworks, libraries, datasets, compilers, formats, backends and platforms. Our ongoing project aims to make the onboarding process as simple as possible via this platform. Please check the CK white paper, and don't hesitate to contact us if you have suggestions or feedback!

ReadMe

Object Detection TensorFlow (Python) CK-powered Docker images

This collection of CK-powered adaptive containers is based on the TensorRT images from NVIDIA (which are in turn based on Ubuntu 18.04):

`Dockerfile`	Base image	CUDA	TensorRT	TensorFlow
`Dockerfile` (`Dockerfile_20.03-py3_tf-2.1.0`)	20.03-py3	10.2.89	7.0.0	2.1.0
`Dockerfile_20.03-py3_tf-2.0.1`	20.03-py3	10.2.89	7.0.0	2.0.1
`Dockerfile_19.10-py3_tf-2.0.1`	19.10-py3	10.1.243	6.0.1	2.0.1
`Dockerfile_19.10-py3_tf-1.15.2`	19.10-py3	10.1.243	6.0.1	1.15.2
`Dockerfile_19.07-py3_tf-1.14.0`	19.07-py3	10.1.168	5.1.5	1.14.0

NB: 19.10-py3 was the last base image to support CUDA 10.1. TensorFlow 1.15, 2.0 and 2.1 all require patching (done by CK) to work with CUDA 10.2.

The images include about a dozen TensorFlow models for object detection, the COCO 2017 validation dataset, and two TensorFlow variants:
- TensorFlow prebuilt for the CPU (installed via pip).
- TensorFlow built from sources for the GPU, with TensorRT support enabled.

NB: The latter variant can be forced to run on the CPU. We used to have two separate Docker images based on Ubuntu 18.04 to measure the performance of prebuilt TensorFlow vs TensorFlow built from sources on the CPU, but it is easier to manage a single image.

Setup
- Set up NVIDIA Docker
- Set up Collective Knowledge
- Download and/or Build images
Usage
- Run once
  - Models
  - Flags
- Benchmark
  - Docker parameters
  - CK parameters
- Explore
- Analyze

Setup

Set up NVIDIA Docker

As our GPU image is based on nvidia-docker, please follow the nvidia-docker installation instructions to set up your system.

Note that you may need to run the commands below with sudo, unless you manage Docker as a non-root user.
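One way to check whether sudo is needed is to see whether `docker info` succeeds for the current user. The helper below is a minimal sketch; the function name `docker_prefix` is our own illustration and is not part of Docker or CK.

```shell
# Hypothetical helper: prints the command prefix to use for Docker on this host.
docker_prefix() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker is not installed" >&2
        return 1
    fi
    # `docker info` talks to the daemon, so it fails if the current user
    # cannot access it (or if the daemon is not running).
    if docker info >/dev/null 2>&1; then
        echo "docker"
    else
        echo "sudo docker"
    fi
}
```

If it prints `sudo docker`, prefix the Docker commands in this README with sudo.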

Set up Collective Knowledge

You will need to install Collective Knowledge to build images and save benchmarking results. Please follow the CK installation instructions and then pull our object detection repository:

$ ck pull repo:ck-object-detection

NB: Refresh all CK repositories after any updates (e.g. bug fixes):

$ ck pull all

(This only updates CK repositories on the host system. To update the Docker image, rebuild it using the --no-cache flag.)

Download from Docker Hub

To download a prebuilt image from Docker Hub, run:

$ docker pull ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04

NB: As the prebuilt TensorFlow variant does not support AVX2 instructions, we advise using the TensorFlow variant built from sources on compatible hardware. In fact, as the prebuilt image was built on an HP Z640 workstation with an Intel(R) Xeon(R) CPU E5-2650 v3 (launched in Q3'14), we advise rebuilding the image on your system.
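To see whether your CPU reports AVX2 support, you can inspect its flags. The sketch below assumes a Linux host that exposes `/proc/cpuinfo`; the function name `suggest_tf_variant` is hypothetical, and the suggestion is only a heuristic.

```shell
# Hypothetical helper: suggests a TensorFlow variant tag based on CPU flags.
# Assumes a Linux-style cpuinfo file; the path can be overridden for testing.
suggest_tf_variant() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw avx2 "$cpuinfo" 2>/dev/null; then
        echo "vsrc"       # AVX2 reported: prefer TensorFlow built from sources
    else
        echo "vprebuilt"  # no AVX2 detected: fall back to the prebuilt variant
    fi
}

echo "Suggested: --dep_add_tags.lib-tensorflow=$(suggest_tf_variant)"
```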

Build

Latest

To build the latest image on your system (from Dockerfile), run:

$ ck build docker:object-detection-tf-py.tensorrt.ubuntu-18.04

NB: This CK command is equivalent to:

$ cd `ck find docker:object-detection-tf-py.tensorrt.ubuntu-18.04`
$ docker build --no-cache -f Dockerfile -t ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04:latest .

Snapshot

To build a snapshot (e.g. from Dockerfile_20.03-py3_tf-2.0.1), run:

$ docker build --no-cache -t ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04:20.03-py3_tf-2.0.1 -f Dockerfile_20.03-py3_tf-2.0.1 .
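The image tag in the commands above follows the Dockerfile name: `Dockerfile` maps to `latest`, and `Dockerfile_<suffix>` maps to `<suffix>`. As a sketch of that convention (the helper name `image_tag_for` is our own), the loop below prints the build commands without running them:

```shell
# Hypothetical helper: maps a Dockerfile name to the image tag used in this README.
image_tag_for() {
    case "$1" in
        Dockerfile)   echo "latest" ;;
        Dockerfile_*) echo "${1#Dockerfile_}" ;;   # strip the "Dockerfile_" prefix
        *)            echo "unexpected file name: $1" >&2; return 1 ;;
    esac
}

IMAGE=ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04
for f in Dockerfile Dockerfile_19.10-py3_tf-1.15.2; do
    # Dry run: print the build command instead of executing it.
    echo "docker build --no-cache -t ${IMAGE}:$(image_tag_for "$f") -f $f ."
done
```

Run the printed commands from the folder returned by `ck find docker:object-detection-tf-py.tensorrt.ubuntu-18.04`.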

Usage

Run inference once

Once you have downloaded or built an image, you can run inference on the CPU e.g. as follows:

$ docker run --rm ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04 \
    "ck run program:object-detection-tf-py \
        --dep_add_tags.lib-tensorflow=vprebuilt \
        --dep_add_tags.weights=ssd-mobilenet,quantized \
        --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50 \
    "

Here, we run inference on 50 images on the CPU using the quantized SSD-MobileNet model.

NB: This is equivalent to the default run command:

$ ck run docker:object-detection-tf-py.tensorrt.ubuntu-18.04

To run inference on the GPU, add the --runtime=nvidia flag:

$ docker run --runtime=nvidia --rm ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04 \
    "ck run program:object-detection-tf-py \
        --dep_add_tags.lib-tensorflow=vsrc \
        --dep_add_tags.weights=ssd-mobilenet,quantized \
        --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50 \
        --env.CK_ENABLE_TENSORRT=1 \
        --env.CK_TENSORRT_DYNAMIC=1 \
    "

Here, we additionally enable TensorRT in the dynamic mode.

We describe all supported models and flags below.

Models

Our TensorFlow-Python application supports the following TensorFlow models trained on the COCO 2017 dataset. With the exception of a TensorFlow reimplementation of YOLO v3, all the models come from the TensorFlow Object Detection model zoo. Note that we report the reference accuracy (mAP in %) on the COCO 2017 validation dataset (5,000 images).

Model	Unique CK Tags (`<tags>`)	Is Custom?	mAP in %
`faster_rcnn_nas_lowproposals_coco`	`rcnn,nas,lowproposals,vcoco`	0	44.340195
`faster_rcnn_resnet50_lowproposals_coco`	`rcnn,resnet50,lowproposals`	0	24.241037
`faster_rcnn_resnet101_lowproposals_coco`	`rcnn,resnet101,lowproposals`	0	32.594327
`faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco`	`rcnn,inception-resnet-v2,lowproposals`	0	36.520117
`faster_rcnn_inception_v2_coco`	`rcnn,inception-v2`	0	28.309626
`ssd_inception_v2_coco`	`ssd,inception-v2`	0	27.765988
`ssd_mobilenet_v1_coco`	`ssd,mobilenet-v1,non-quantized,mlperf,tf`	0	23.111170
`ssd_mobilenet_v1_quantized_coco`	`ssd,mobilenet-v1,quantized,mlperf,tf`	0	23.591693
`ssd_mobilenet_v1_fpn_coco`	`ssd,mobilenet-v1,fpn`	0	35.353170
`ssd_resnet_50_fpn_coco`	`ssd,resnet50,fpn`	0	38.341120
`ssdlite_mobilenet_v2_coco`	`ssdlite,mobilenet-v2,vcoco`	0	24.281540
`yolo_v3_coco`	`yolo-v3`	1	28.532508

Each model can be selected by adding the --dep_add_tags.weights=<tags> flag when running a customized command for the container. For example, to run inference on the quantized SSD-MobileNet model, add --dep_add_tags.weights=ssd-mobilenet,quantized; to run inference on the YOLO model, add --dep_add_tags.weights=yolo; and so on.
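To try several models in a row, the run command can be wrapped in a small helper that only varies the weights tags. This is a sketch: the helper name `print_run_cmd` is hypothetical, the tags are taken from the table above, and the commands are printed rather than executed.

```shell
IMAGE=ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04

# Hypothetical helper: prints (does not run) a CPU inference command
# for the given comma-separated model tags.
print_run_cmd() {
    printf 'docker run --rm %s "ck run program:object-detection-tf-py --dep_add_tags.lib-tensorflow=vprebuilt --dep_add_tags.weights=%s --env.CK_BATCH_SIZE=1 --env.CK_BATCH_COUNT=50"\n' \
        "$IMAGE" "$1"
}

for tags in ssd-mobilenet,quantized ssd,inception-v2 rcnn,inception-v2; do
    print_run_cmd "$tags"
done
```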

Flags

Env Flag	Possible Values	Default Value	Description
`--env.CK_CUSTOM_MODEL`	0,1	0	Specifies whether the model comes from the TensorFlow zoo (`0`) or from another source (`1`). (Models from other sources have to implement their own preprocess, postprocess and get tensor functions, as explained in the application documentation.)
`--env.CK_ENABLE_BATCH`	0,1	0	Specifies whether batching should be enabled (must be used for `--env.CK_BATCH_SIZE` to take effect).
`--env.CK_BATCH_SIZE`	positive integer	1	Specifies the number of images to process in a single batch (if not `1`, must be used with `--env.CK_ENABLE_BATCH=1`).
`--env.CK_BATCH_COUNT`	positive integer	1	Specifies the number of batches to be processed.
`--env.CK_ENV_IMAGE_WIDTH`, `--env.CK_ENV_IMAGE_HEIGHT`	positive integer	Model-specific (set by CK)	Resizes the input images at runtime to a size other than the model's default. (This usually decreases the accuracy.)
`--env.CK_ENABLE_TENSORRT`	0,1	0	Enables the TensorRT backend (only to be used with TensorFlow built from sources).
`--env.CK_TENSORRT_DYNAMIC`	0,1	0	Enables the TensorRT dynamic mode (must be used with `--env.CK_ENABLE_TENSORRT=1`).
`--env.CUDA_VISIBLE_DEVICES`	list of integers	N/A	Specifies which GPUs should be used by TensorFlow; `-1` forces TensorFlow to use the CPU (even with TensorFlow built from sources).
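Several flags in the table depend on one another. The following pre-flight check is a minimal sketch of those dependencies; `check_flags` and its argument order are our own illustration and are not part of CK.

```shell
# Hypothetical pre-flight check for the flag dependencies listed in the table.
# Arguments: <lib variant> <CK_ENABLE_BATCH> <CK_BATCH_SIZE> <CK_ENABLE_TENSORRT> <CK_TENSORRT_DYNAMIC>
check_flags() {
    lib="$1"; enable_batch="$2"; batch_size="$3"; enable_trt="$4"; trt_dynamic="$5"
    if [ "$batch_size" -gt 1 ] && [ "$enable_batch" != "1" ]; then
        echo "CK_BATCH_SIZE=$batch_size requires CK_ENABLE_BATCH=1" >&2; return 1
    fi
    if [ "$trt_dynamic" = "1" ] && [ "$enable_trt" != "1" ]; then
        echo "CK_TENSORRT_DYNAMIC=1 requires CK_ENABLE_TENSORRT=1" >&2; return 1
    fi
    if [ "$enable_trt" = "1" ] && [ "$lib" != "vsrc" ]; then
        echo "CK_ENABLE_TENSORRT=1 requires TensorFlow built from sources (vsrc)" >&2; return 1
    fi
    return 0
}

# Example: batch size 4 with TensorRT in the dynamic mode on the vsrc variant.
check_flags vsrc 1 4 1 1 && echo "flags are consistent"
```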

Benchmark models individually

When you run inference using ck run, the result gets printed but not saved. You can use ck benchmark to save the result on the host system as CK experiment entries (JSON files) e.g. as follows:

$ docker run --runtime=nvidia \
    --env-file `ck find docker:object-detection-tf-py.tensorrt.ubuntu-18.04`/env.list \
    --volume=<folder_for_results>:/home/dvdt/CK_REPOS/local/experiment \
    --user=$(id -u):1500 \
    --rm ctuning/object-detection-tf-py.tensorrt.ubuntu-18.04 \
    "ck benchmark program:object-detection-tf-py \
        --dep_add_tags.lib-tensorflow=vsrc \
        --dep_add_tags.weights=ssd-mobilenet,quantized \
        --env.CK_BATCH_COUNT=50 \
        --repetitions=1 \
        --record \
        --record_repo=local \
        --record_uoa=object-detection-tf-py-ssd-mobilenet-quantized-accuracy \
        --tags=object-detection,tf-py,ssd-mobilenet,quantized,accuracy \
    "

Docker parameters

- --env-file: the path to the env.list file, which is usually located in the same folder as the Dockerfile. (Currently, the env.list files are identical for all the images.)
- --volume: a folder with read/write permissions for the user that serves as shared space ("volume") between the host and the container.
- --user: your user id on the host system and a fixed group id (1500) needed to access files in the container.

Gory details

We ask you to launch a container with --user=$(id -u):1500, where $(id -u) gives your user id on the host system and 1500 is the fixed group id of the dvdtg group in the image. We also ask you to mount a folder with read/write permissions with --volume=<folder_for_results>. This folder gets mapped to the /home/dvdt/CK_REPOS/local/experiment folder in the image. While the experiment folder belongs to the dvdt user, it is made accessible to the dvdtg group. Therefore, you can retrieve the results of a container run under your user id from this folder.
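The folder and user options can be prepared before launching the container. The sketch below creates the results folder if needed, checks that it is writable, and prints the two Docker options; the helper name `results_docker_opts` is hypothetical.

```shell
# Hypothetical helper: creates the results folder if needed and prints the
# --volume and --user options described above.
results_docker_opts() {
    dir="$1"
    mkdir -p "$dir" || return 1
    if [ ! -w "$dir" ]; then
        echo "no write permission for $dir" >&2; return 1
    fi
    # A bind mount is most reliably given as an absolute host path.
    abs_dir=$(cd "$dir" && pwd)
    echo "--volume=${abs_dir}:/home/dvdt/CK_REPOS/local/experiment --user=$(id -u):1500"
}
```

For example, `results_docker_opts "$HOME/ck-experiments"` prints options that can be pasted into the `docker run` command (the folder name is only an illustration; quote the path if it contains spaces).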

CK parameters

- --dep_add_tags.lib-tensorflow: specify vsrc to use TensorFlow built from sources (its execution is then controlled via flags), or vprebuilt to use prebuilt TensorFlow on the CPU.
- --dep_add_tags.weights: specify the tags for a particular model.
- --env.CK_BATCH_COUNT: the number of batches to be processed; assuming --env.CK_BATCH_SIZE=1, we typically use --env.CK_BATCH_COUNT=5000 for experiments that measure accuracy over the COCO 2017 validation set and e.g. --env.CK_BATCH_COUNT=2 for experiments that measure performance. (With TensorFlow, the first batch is usually slower than all subsequent batches. Therefore, its execution time has to be discarded. The execution times of subsequent batches will be averaged.)
- --repetitions: the number of times to run an experiment (3 by default); we typically use --repetitions=1 for experiments that measure accuracy and e.g. --repetitions=10 for experiments that measure performance.
- --record, --record_repo=local: must be present to have the results saved in the mounted volume.
- --record_uoa: a unique name for each CK experiment entry; here, object-detection-tf-py (the name of the program) is the same for all experiments, ssd-mobilenet-quantized is unique for each model, and accuracy indicates the accuracy mode.
- --tags: specify the tags for each CK experiment entry; we typically make them similar to the experiment entry name.
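The naming convention for --record_uoa and --tags can be derived from the model tags and the mode. The helpers below are a sketch of that convention as used in the benchmark command above; their names are hypothetical.

```shell
PROGRAM=object-detection-tf-py

# Hypothetical helpers: derive the experiment entry name and tags
# from the model tags (e.g. "ssd-mobilenet,quantized") and the mode (e.g. "accuracy").
record_uoa_for() {
    echo "${PROGRAM}-$(echo "$1" | tr ',' '-')-$2"
}
experiment_tags_for() {
    echo "object-detection,tf-py,$1,$2"
}

record_uoa_for "ssd-mobilenet,quantized" accuracy
experiment_tags_for "ssd-mobilenet,quantized" accuracy
```

For the quantized SSD-MobileNet model in the accuracy mode, this reproduces the values passed to --record_uoa and --tags in the example command.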

Explore design space

Putting this all together, we provide two shell scripts that launch full design space exploration in the accuracy mode (--repetitions=1) and the performance mode (--repetitions=10) with the corresponding experiment names and tags. The exploration runs:
- over five configurations: prebuilt CPU (no AVX2 FMA), CPU, CUDA, TensorRT with the dynamic mode disabled, TensorRT with the dynamic mode enabled;
- over all the object detection models;
- in the performance mode, over several batch sizes (1, 2, 4, 8, 16).

The scripts can be found under:

$ ck find script:<script_name>

where <script_name> is either 'dse-acc' or 'dse-perf'.

To use a script, you have to modify its first lines to adapt the paths to your host system. You can also modify other parameters, such as the list of models to test or the batch sizes and counts.
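To illustrate the shape of such an exploration, here is a simplified sketch of the performance-mode loops. It is not the content of the provided dse-perf script: the configuration names, the model subset, the experiment name suffix and the mapping of each configuration to flags are our assumptions based on the flags table above. The commands are only printed.

```shell
MODELS="ssd-mobilenet,quantized ssd,inception-v2"   # illustrative subset
BATCH_SIZES="1 2 4 8 16"

# Each configuration: <name>:<lib variant>:<extra flags>
# ('+' separates extra flags, since whitespace splits the list below).
CONFIGS="
cpu-prebuilt:vprebuilt:
cpu:vsrc:--env.CUDA_VISIBLE_DEVICES=-1
cuda:vsrc:
tensorrt:vsrc:--env.CK_ENABLE_TENSORRT=1
tensorrt-dynamic:vsrc:--env.CK_ENABLE_TENSORRT=1+--env.CK_TENSORRT_DYNAMIC=1
"

dse_perf_commands() {
    for config in $CONFIGS; do
        name=${config%%:*}; rest=${config#*:}
        lib=${rest%%:*};    extra=$(echo "${rest#*:}" | tr '+' ' ')
        for model in $MODELS; do
            for bs in $BATCH_SIZES; do
                # Dry run: print one benchmark command per combination.
                echo "ck benchmark program:object-detection-tf-py" \
                     "--dep_add_tags.lib-tensorflow=$lib --dep_add_tags.weights=$model" \
                     "--env.CK_ENABLE_BATCH=1 --env.CK_BATCH_SIZE=$bs --env.CK_BATCH_COUNT=2" \
                     "--repetitions=10 $extra --record --record_repo=local" \
                     "--record_uoa=object-detection-tf-py-$(echo "$model" | tr ',' '-')-performance-$name-bs$bs"
            done
        done
    done
}

# 5 configurations x 2 models x 5 batch sizes = 50 commands.
dse_perf_commands | wc -l
```

Each printed command would still need to be wrapped in the `docker run` invocation shown in the benchmarking section.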

Analyze the results

Copy the results to a machine for analysis

Once you have accumulated some experiment entries in <folder_for_results>, you can zip them:

$ cd <folder_for_results>
$ zip -rv <file_with_results>.zip {.cm,*}

Then copy <file_with_results>.zip to the machine where you would like to analyze the results, and create a new repository there with a placeholder for experiment entries:

$ ck add repo:object-detection-tf-py-experiments --quiet
$ ck add object-detection-tf-py-experiments:experiment:dummy --common_func
$ ck rm  object-detection-tf-py-experiments:experiment:dummy --force

or:

$ ck add repo:object-detection-tf-py-experiments --quiet
$ ck create_entry --data_uoa=experiment --data_uid=bc0409fb61f0aa82 \
--path=`ck find repo:object-detection-tf-py-experiments`

and, finally, extract the results:

$ unzip <file_with_results>.zip -d `ck find repo:object-detection-tf-py-experiments`/experiment
$ ck list object-detection-tf-py-experiments:experiment:*
...

Visualize the results via Jupyter

View online

Versions

2.5.0 (2020-09-04)

Files

.cm/desc.json (3 bytes)
.cm/info.json (339 bytes)
.cm/meta.json (241 bytes)
Dockerfile (9920 bytes)
Dockerfile_19.07-py3_tf-1.14.0 (8049 bytes)
Dockerfile_19.10-py3_tf-1.15.2 (8505 bytes)
Dockerfile_19.10-py3_tf-2.0.1 (9882 bytes)
Dockerfile_20.03-py3_tf-2.0.1 (9757 bytes)
README.md (17710 bytes)
env.list (235 bytes)
