ONNX-TensorRT: TensorRT backend for ONNX


Parses ONNX models for execution with TensorRT.

See also the TensorRT documentation.

Supported TensorRT Versions

Development on the Master branch is for the latest version of TensorRT 6.0 with full-dimensions and dynamic shape support.

For version 6.0 without full-dimensions support, clone and build from the 6.0 branch.

For version 5.1, clone and build from the 5.1 branch.

For versions < 5.1, clone and build from the 5.0 branch.

Full Dimensions + Dynamic Shapes

Building INetwork objects in full dimensions mode with dynamic shape support requires calling the following API:


C++

const auto explicitBatch = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
builder->createNetworkV2(explicitBatch)


Python

import tensorrt
explicit_batch = 1 << int(tensorrt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
builder.create_network(explicit_batch)

For examples of usage of these APIs, see:

* sampleONNXMNIST
* sampleDynamicReshape
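The value passed to create_network is a bitmask: the EXPLICIT_BATCH enum value names a bit position, and shifting 1 by that position sets the corresponding bit. The following is a minimal sketch of that computation, not the project's own code. The helper names `explicit_batch_flag` and `create_explicit_batch_network` are hypothetical, and the `tensorrt` module is passed in as an argument (an assumption made here so the helpers can also be exercised with a stand-in object on a machine without TensorRT).

```python
def explicit_batch_flag(trt):
    """Return the bitmask that selects explicit-batch (full dimensions) mode.

    `trt` is the imported `tensorrt` module, or any object exposing
    `NetworkDefinitionCreationFlag.EXPLICIT_BATCH` (assumption, for testing).
    """
    # The enum value is a bit position, so shift 1 by it to build the mask.
    return 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)


def create_explicit_batch_network(trt, builder):
    """Create a network definition in explicit-batch mode on `builder`."""
    return builder.create_network(explicit_batch_flag(trt))
```

With TensorRT installed, this would typically be called as `create_explicit_batch_network(tensorrt, builder)`, where `builder` is an existing TensorRT builder object.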

Supported Operators

The currently supported ONNX operators are listed in the operator support matrix.




Building

To build on master, we recommend following the build instructions on the master branch of TensorRT, because new dependencies were introduced to support these features.

To build on older branches refer to their respective READMEs.

Executable usage

ONNX models can be converted to serialized TensorRT engines using the onnx2trt executable:

onnx2trt my_model.onnx -o my_engine.trt

ONNX models can also be converted to human-readable text:

onnx2trt my_model.onnx -t my_model.onnx.txt

See more usage information by running:

onnx2trt -h
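When conversions are scripted, the command lines above can be assembled programmatically. The sketch below is one way to do that, assuming `onnx2trt` is on the PATH. The function names `onnx2trt_command` and `convert` are hypothetical, and only the `-o` and `-t` flags shown above are used. Whether both flags may be combined in a single invocation is not stated here, so pass one at a time if unsure.

```python
import subprocess


def onnx2trt_command(model_path, engine_path=None, text_path=None):
    """Build the argument list for an onnx2trt invocation."""
    cmd = ["onnx2trt", model_path]
    if engine_path is not None:
        cmd += ["-o", engine_path]  # write a serialized TensorRT engine
    if text_path is not None:
        cmd += ["-t", text_path]    # write a human-readable text dump
    return cmd


def convert(model_path, engine_path):
    """Run onnx2trt; raises CalledProcessError if it exits non-zero."""
    subprocess.run(onnx2trt_command(model_path, engine_path=engine_path), check=True)
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems with paths that contain spaces.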

Python modules

Python bindings for the ONNX-TensorRT parser are packaged in the shipped .whl files. Install them with:

pip install /python/tensorrt-

TensorRT 6.0 supports ONNX release 1.5.0. Install it with:

pip install onnx==1.5.0
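Because the supported ONNX release is pinned, it can be useful to confirm which version is actually installed before running anything else. The following is a small sketch of such a check using only the standard library (Python 3.8 or later). The names `installed_version` and `onnx_version_ok` are hypothetical, and the exact-match comparison is an assumption about how strictly the pin should be enforced.

```python
from importlib import metadata  # standard library, Python 3.8+

EXPECTED_ONNX_VERSION = "1.5.0"  # the release named above for TensorRT 6.0


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def onnx_version_ok(found, expected=EXPECTED_ONNX_VERSION):
    """True only when the found version string matches the expected one exactly."""
    return found == expected


if __name__ == "__main__":
    found = installed_version("onnx")
    status = "OK" if onnx_version_ok(found) else "mismatch"
    print(f"onnx: found {found}, expected {EXPECTED_ONNX_VERSION} ({status})")
```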

ONNX Python backend usage

The TensorRT backend for ONNX can be used in Python as follows:

```python
import onnx
import onnx_tensorrt.backend as backend
import numpy as np

model = onnx.load("/path/to/model.onnx")
engine = backend.prepare(model, device='CUDA:1')
input_data = np.random.random(size=(32, 3, 224, 224)).astype(np.float32)
output_data = engine.run(input_data)[0]
print(output_data)
print(output_data.shape)
```
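Once an engine has been prepared, the same object can be reused for many inputs. The sketch below loops over several batches and keeps the first output of each. It assumes the object returned by `backend.prepare` exposes `run(inputs)` returning a sequence of outputs, as ONNX backend representations do. The name `run_batches` is hypothetical.

```python
def run_batches(engine, batches):
    """Run each batch through a prepared backend engine.

    Returns a list holding the first output of every run, in input order.
    """
    results = []
    for batch in batches:
        outputs = engine.run(batch)  # sequence of outputs for this batch
        results.append(outputs[0])   # keep only the first output
    return results
```

Preparing the engine once and reusing it this way avoids rebuilding it for every batch.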

C++ library usage

The model parser library has its C++ API declared in this header:


Important typedefs required for parsing ONNX models are declared in this header:


Docker image

Tar-Based TensorRT

Build the onnx_tensorrt Docker image using tar-based TensorRT by running:

git clone --recurse-submodules onnx-tensorrt
cp /path/to/TensorRT-6.0.*.tar.gz .
docker build -f docker/onnx-tensorrt-tar.Dockerfile --tag=onnx-tensorrt:6.0.6 .

Deb-Based TensorRT

Build the onnx_tensorrt Docker image using deb-based TensorRT by running:

git clone --recurse-submodules onnx-tensorrt
cp /path/to/nv-tensorrt-repo-ubuntu1x04-cudax.x-trt6.x.x.x-ga-yyyymmdd_1-1_amd64.deb .
docker build -f docker/onnx-tensorrt-deb.Dockerfile --tag=onnx-tensorrt:6.0.6 .


After installation (or inside the Docker container), ONNX backend tests can be run as follows:

Real model tests only:

python OnnxBackendRealModelTest

All tests:


You can use the -v flag to make the output more verbose.

Pre-trained models

Pre-trained models in ONNX format can be found at the ONNX Model Zoo.

To restore the repository, download the bundle


and run:

 git clone onnx-onnx-tensorrt_-_2019-11-18_21-18-37.bundle 

Uploader: onnx
Upload date: 2019-11-18