How to use the TPU (0.2 TOPS @ INT8) on the Duo

As you can see from the spec of the CV1800B, its 0.2 TOPS of INT8 computing power is a nice bonus for trying out some AI/ML applications.

Some docs you may need: :smirk_cat:

  1. TPU-MLIR Quick Start:
  2. TPU-MLIR Technical Reference Manual:
  3. Supported Models:

Since the Duo has less than 64 MB of RAM, most AI/ML models won't fit on this device (they are simply too large :pensive:). Lightweight models are the only option. Fortunately, the Duo is very cheap. Besides the models tested in the figure, other models with lightweight designs also run on the Duo, such as Yolov5s, which has not yet been added to the figure. (Sorry!)

About TPU-MLIR

TPU-MLIR is the TPU compiler project for AI chips. The project provides a complete toolchain that can convert pre-trained neural networks from different frameworks into bmodel binary files that run efficiently on the TPU. The code is open-sourced on GitHub: https://github.com/sophgo/tpu-mlir

The frameworks currently supported directly are PyTorch, ONNX, TFLite, and Caffe. Models from other frameworks need to be converted to ONNX first. Methods for converting models from other frameworks to ONNX can be found in the ONNX tutorials repository: https://github.com/onnx/tutorials
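As one concrete example (not from the original post), a TensorFlow SavedModel can be exported to ONNX with the tf2onnx converter. The `my_saved_model` directory and `model.onnx` output name below are placeholders:

```bash
# Example only: converting a TensorFlow SavedModel to ONNX with tf2onnx.
# "my_saved_model" and "model.onnx" are placeholder paths.
pip install tf2onnx

python -m tf2onnx.convert \
    --saved-model my_saved_model \
    --output model.onnx \
    --opset 13
```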

To convert a model, you first need to run the toolchain inside the specified Docker image. With the required environment set up, the conversion takes two steps: convert the original model to an MLIR file with model_transform.py, then convert the MLIR file to a bmodel/cvimodel with model_deploy.py. To obtain an INT8 model, you also need to call run_calibration.py to generate a quantization (calibration) table and pass it to model_deploy.py. This article mainly walks through this model conversion process. A sketch of the whole flow follows.
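The sketch below follows the TPU-MLIR quick start, using Yolov5s as the example model. The file names (yolov5s.onnx, image.jpg), the calibration dataset path, and the preprocessing values (input shape, mean/scale) are placeholders you must adapt to your own model, and flag names may differ between TPU-MLIR versions:

```bash
# 0. Enter the TPU-MLIR development Docker image and set up the environment.
docker pull sophgo/tpuc_dev:latest
docker run --privileged --name tpu-mlir -v $PWD:/workspace -it sophgo/tpuc_dev:latest

# (inside the container)
source ./tpu-mlir/envsetup.sh

# 1. Original model -> MLIR file.
#    --test_input/--test_result let the tool verify the converted model;
#    it also saves the preprocessed input as yolov5s_in_f32.npz for step 3.
model_transform.py \
    --model_name yolov5s \
    --model_def yolov5s.onnx \
    --input_shapes [[1,3,640,640]] \
    --mean 0.0,0.0,0.0 \
    --scale 0.0039216,0.0039216,0.0039216 \
    --keep_aspect_ratio \
    --pixel_format rgb \
    --test_input image.jpg \
    --test_result yolov5s_top_outputs.npz \
    --mlir yolov5s.mlir

# 2. (INT8 only) Generate a calibration table from ~100 sample images.
run_calibration.py yolov5s.mlir \
    --dataset ./COCO2017 \
    --input_num 100 \
    -o yolov5s_cali_table

# 3. MLIR -> INT8 cvimodel for the Duo's CV1800B (cv180x target).
model_deploy.py \
    --mlir yolov5s.mlir \
    --quantize INT8 \
    --calibration_table yolov5s_cali_table \
    --chip cv180x \
    --test_input yolov5s_in_f32.npz \
    --test_reference yolov5s_top_outputs.npz \
    --tolerance 0.85,0.45 \
    --model yolov5s_cv180x_int8.cvimodel
```

The resulting .cvimodel is the file you copy over to the Duo and run with the on-device TPU runtime.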

Tutorials: :kissing_cat:

  1. Yolov5s on Duo

Other Materials You May Need:
