Research Article

Accelerating AI performance with the incorporation of TVM and MediaTek NeuroPilot

Article: 2272586 | Received 01 Jan 2023, Accepted 13 Oct 2023, Published online: 30 Oct 2023
 

Abstract

The growing prominence of machine learning has increased the focus on improving inference performance on edge devices to reduce latency and improve efficiency. Two widely adopted strategies for accelerating inference are quantisation and the use of AI hardware accelerators. Each accelerator or inference engine offers distinct advantages, with accelerators designed primarily to optimise neural network operations. In this paper, we present a method for integrating TVM's quantisation flow with the MediaTek NeuroPilot AI accelerator. We outline the process of converting a model expressed in the quantised neural network (QNN) dialect of TVM's Relay intermediate representation to a tensor-oriented quantisation format, with the aim of harnessing the full potential of both TVM and MediaTek NeuroPilot. This integration enables more efficient neural network inference while preserving the accuracy of the results. We assessed the effectiveness of the proposed integration through a series of experiments comparing its performance with that of TVM equipped with an autotuning mechanism. The findings indicate that our approach substantially outperforms TVM in both floating-point and quantised model inference, with inference speedups of up to 11× and up to 70×, respectively. These results underscore the potential of our approach to accelerate AI performance across a diverse range of applications and edge devices. A further key contribution of this work is a practical method that other hardware companies can follow to integrate TVM with their own accelerators and achieve performance gains.
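As an illustration of what a tensor-oriented quantisation format can look like, the following is a minimal sketch in plain Python and not the conversion described in the paper. It assumes standard affine int8 quantisation, and it assumes that the source representation keeps the scale and zero point as operator attributes while the target format stores them on the tensor itself. The names `QuantTensor`, `attach_params`, and the attribute keys `input_scale` and `input_zero_point` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QuantTensor:
    """Hypothetical tensor-oriented container: the tensor owns its parameters."""
    values: List[int]   # int8 payload
    scale: float        # real-valued step size
    zero_point: int     # integer that represents real 0.0


def quantize(real: List[float], scale: float, zero_point: int) -> QuantTensor:
    """Affine quantisation: q = clamp(round(x / scale) + zero_point, -128, 127)."""
    q = [max(-128, min(127, round(x / scale) + zero_point)) for x in real]
    return QuantTensor(q, scale, zero_point)


def dequantize(t: QuantTensor) -> List[float]:
    """Inverse mapping: x = (q - zero_point) * scale."""
    return [(q - t.zero_point) * t.scale for q in t.values]


def attach_params(op_attrs: Dict[str, float], payload: List[int]) -> QuantTensor:
    """Assumed conversion step: move the parameters that an operator carries
    (hypothetical keys) onto the tensor that the operator consumes."""
    return QuantTensor(payload,
                       float(op_attrs["input_scale"]),
                       int(op_attrs["input_zero_point"]))


if __name__ == "__main__":
    t = quantize([0.0, 0.5, 1.0], scale=0.5, zero_point=0)
    print(t.values, dequantize(t))
```

Keeping the parameters on the tensor means that any consumer of the tensor can interpret its integer payload without inspecting the operator that produced it, which is one plausible reason an accelerator runtime might prefer this layout.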

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes