Reactively Smart To Proactively Intelligent

2019-07-12 By: InnoVEX Team

One of the speakers at the AI Forum session was Mr. Alex Chang, Director of MediaTek Computing and AI Group. In his keynote speech, Mr. Chang explained why edge computing is a necessity for the development of AI.

Benefits of Edge over Cloud

While many AI applications still rely on the cloud, there is a trend of moving them to edge devices such as cellphones, home appliances, and cars. This is especially necessary for functions where a real-time response is required or expected. Mr. Chang contrasted a pair of applications commonly found in modern smartphones: face unlock and voice assistant.

Face unlock requires an almost instantaneous response from the phone, while users are more forgiving of delays from voice assistants. Compare the average response time of face unlock, which is 0.2 seconds, with that of voice assistants, which is commonly 0.5 to 1.5 seconds. The difference looks negligible on paper, but in practice users clearly feel those fractions of a second. The gap is due to 2 main factors: the complexity of the tasks and where the computations are done.

Face unlock achieves its split-second response because the computation is done on the device and its task is primarily image recognition. A voice assistant, on the other hand, combines voice recognition and natural language processing (NLP); it runs in the cloud because of its high complexity, and it is further slowed down by having to move information across communication networks.

Edge AI is beneficial because it provides better availability, quicker response time, and better privacy. Desirable as these benefits are, edge AI is not the only way forward.

The 2 Phases of AI Application Development

Mr. Chang pointed out the 2 phases of AI application development: Deep Neural Network Training and Deep Neural Network Inference. Training requires a massive amount of computation, which is why it is done in the cloud. Inference, on the other hand, demands more privacy and availability, so there is a trend of moving it onto the device rather than keeping it in the cloud. Running inference on the device also provides other benefits, such as lower cost and better power efficiency. Cloud and edge therefore each have their own advantages, and the most popular approach to edge AI today is edge inference combined with cloud training.

Edge AI capabilities evolve through both algorithms and hardware. One example is imaging, where vision AI algorithms have gone through 3 stages: image perception, image construction, and image quality.

Image perception consists of image classification and object detection. Even as the most basic task, it took 100G MACs (multiply-accumulate operations per second) in 2017. In 2018, AI algorithms went further and focused on constructing the environment through image construction. This process consists of image segmentation, depth estimation, motion estimation, and pose estimation, which the AI can use to understand the context of the image. For comparison, it needs 500G MACs, 5 times the computation power.

After the environment is constructed, the next step is to resolve pixel-level challenges. The image quality stage needs super resolution and image enhancement through noise reduction, image deblurring, etc. This process takes up to 10T MACs, a 100x jump in computation over image perception.
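The scaling across the three vision stages can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the talk (100G, 500G, and 10T MACs); the variable and function names are illustrative, not from the talk.

```python
# Compute requirements quoted in the talk, expressed in MACs.
GIGA = 10**9
TERA = 10**12

stages = {
    "image perception": 100 * GIGA,
    "image construction": 500 * GIGA,
    "image quality": 10 * TERA,
}

baseline = stages["image perception"]


def relative_to_perception(name):
    """Return how many times more compute a stage needs than image perception."""
    return stages[name] / baseline


for name, macs in stages.items():
    ratio = relative_to_perception(name)
    print(f"{name}: {macs / GIGA:,.0f}G MACs ({ratio:.0f}x image perception)")
```

Note that the 100x figure is relative to image perception; measured against image construction, the image quality stage is a 20x step.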

Mr. Chang pointed out that this progression shows the challenges ahead for AI: it will need more computation power and more memory access, and it will need to work well with the other SoC components. The best user experience always comes from a well-coordinated combination of different components.

Edge AI for the Future

Mr. Chang stated that there are 2 key success factors for AI: improving Deep Neural Network execution efficiency while keeping accuracy, and achieving a fast time to market with minimum investment and risk. These success factors are deeply connected to 4 technologies for AI: neural network algorithm evolution, specialized hardware and system design, software solutions to reduce network size, and an integrated and flexible SoC platform.

The evolution of neural network architectures is driven by purpose: the goal has been shifting from simply improving accuracy to meet human needs toward reducing the computation requirement while improving accuracy. The history of this development can be divided into 3 periods: 2012 to 2016, 2016 to 2018, and 2018 onwards. From 2012 to 2016, the only goal was to improve accuracy at any cost, and industry accuracy improved by 24% at 10x the computation. From 2016 to 2018, AI accuracy surpassed human accuracy, and the goal changed to reducing the necessary computation power. By the end of that period, accuracy remained the same as in 2016, but the computation needed was only 10% of what it had been at the start of the period. From 2018 onwards, the goal is to automate Deep Neural Network design with platform-aware information.

Specialized hardware acceleration needs to be seen from a computing system perspective, not just an AI perspective. Focusing on the flexibility and efficiency of the computing system provides a better understanding of the landscape. The CPU, as a general-purpose processor, is good at control and serial computing; however, Deep Neural Network computing contains a lot of repetitive operations that require data-parallel processing. GPUs are a popular alternative for AI computing, but they also contain other components that make them less energy efficient. Hence the need for a dedicated AI Processing Unit (APU), which is 55 times more power efficient than a CPU and 20 times faster for Deep Neural Network workloads.

Mr. Chang suggested a heterogeneous computing subsystem that assigns the right tasks to the right processors and scales the computations across different product segments. This benefits not only efficiency but also accuracy.
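The idea of assigning the right task to the right processor can be illustrated with a toy dispatcher. This is a minimal sketch under stated assumptions, not MediaTek's scheduler: the task kinds, the preference table, and the `assign` function are all hypothetical and chosen only to show how one policy can scale across devices with different processors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    kind: str  # hypothetical categories: "control", "graphics", or "dnn"


# Hypothetical preference order per task kind (most suitable processor first).
PREFERRED = {
    "control": ["CPU"],
    "graphics": ["GPU", "CPU"],
    "dnn": ["APU", "GPU", "CPU"],
}


def assign(task, available):
    """Pick the most suitable processor that this device actually has."""
    for processor in PREFERRED.get(task.kind, []):
        if processor in available:
            return processor
    raise ValueError(f"no processor available for task kind {task.kind!r}")


face_unlock = Task("face unlock", "dnn")
print(assign(face_unlock, {"CPU", "GPU", "APU"}))  # device with a dedicated APU
print(assign(face_unlock, {"CPU", "GPU"}))         # device without an APU
```

The same preference table serves both devices: the workload falls back to the next-best processor when the dedicated one is missing, which is one simple way to scale across product segments.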

Neural network models are generally represented in floating point. Software solutions can quantize them into integers while maintaining accuracy. At the same time, neural network models may contain many unnecessary neurons and weights, which can be optimized further. Combining these techniques in software makes it possible to reduce the network size by 10 times, further reducing computation and memory access.
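To make the two software techniques concrete, here is a minimal sketch in plain Python. The talk does not say which schemes are used, so symmetric per-tensor 8-bit quantization and magnitude-based pruning are assumptions picked for illustration; the function names and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> integers in [-127, 127] plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, keep_ratio):
    """Zero out the smallest-magnitude weights, keeping roughly keep_ratio of them."""
    keep = max(1, int(len(weights) * keep_ratio))
    threshold = sorted((abs(w) for w in weights), reverse=True)[keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]


# Hypothetical sample weights, for illustration only.
weights = [0.5, -1.0, 0.02, 0.25, -0.01, 0.75]

pruned = prune_by_magnitude(weights, keep_ratio=0.5)
quantized, scale = quantize_int8(pruned)
restored = dequantize(quantized, scale)

print(pruned)
print(quantized)
print(max(abs(a - b) for a, b in zip(pruned, restored)))  # worst-case rounding error
```

Storing each weight as an 8-bit integer instead of a 32-bit float cuts storage by 4x on its own, and zeroed weights can be skipped or compressed on top of that. The sketch does not by itself reproduce the 10x figure from the talk.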

Mr. Chang stated that a highly integrated platform is needed to help bring ideas to market quickly. Looking to 2020, combining all these technology breakthroughs will make it possible to improve algorithm accuracy and efficiency, add AI computing power, and reduce network model size significantly. This means it will be possible to transform or redefine smart devices from reactively smart to proactively intelligent.

To watch the full forum session, visit our YouTube channel here.