Codemotion Magazine

Understanding AI: Inference

An overview of which IoT devices can support inference in AI/machine learning applications, and why some devices are better suited than others

Last update November 5, 2019 by Mark Patrick, Mouser Electronics

AI Inference

Inferencing is the second phase of machine learning, following on from the initial training phase. During the training phase, the algorithm generates a new model or repurposes a pre-trained model for a specific application and helps the model learn its parameters. During the inferencing phase, predictions and decisions on new data are made – based on the learned parameters.
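The two phases can be illustrated with a deliberately tiny model. The sketch below is only an illustration, not a method taken from any product mentioned in this article: it assumes a one-variable linear model fitted by ordinary least squares, and the names `train` and `infer` are hypothetical.

```python
def train(xs, ys):
    """Training phase: learn the parameters (slope, intercept) from labelled data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares fit for a single input variable.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def infer(params, x):
    """Inferencing phase: apply the learned parameters to new, unseen data."""
    slope, intercept = params
    return slope * x + intercept


# Training runs once over the whole dataset (toy data following y = 2x + 1).
params = train([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Inference is then a single cheap evaluation per new input.
prediction = infer(params, 10.0)
print(prediction)
```

Training touches every sample, while inference is one multiply and one add. That asymmetry, at a far larger scale, is why inference is the phase usually pushed out to constrained devices.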

Training requires a significant amount of time, computing power and electricity. In contrast, the inferencing phase requires less processing and draws less power. However, the traditional approach of computing in a central cloud may be too resource-intensive for IoT devices. Each IoT node at the edge collects large datasets, making edge-to-cloud (and cloud-to-edge) data transfer expensive and slow. Instead of relying on cloud-based servers to do all the processing, “computing at the edge” performs most calculations on the device itself and transfers only relevant information to the cloud (and vice versa) when strictly necessary. While computing at the edge reduces data transfer costs and time, this model also has drawbacks. For example, the need for IoT devices to be power-efficient runs contrary to the hefty processing power that training and inferencing demand. This is a problem that accelerators for AI edge computing can potentially address.
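As a rough sketch of what transferring only relevant information can look like, the snippet below reduces a batch of sensor readings on the device and forwards a small summary instead of the raw stream. The function name `summarize_at_edge`, the payload fields and the threshold value are all assumptions made for illustration. A real node would also need to handle empty batches and transmission failures.

```python
def summarize_at_edge(readings, threshold):
    """Process raw readings locally; return only what the cloud needs to see."""
    # Keep the index and value of every reading above the alert threshold.
    anomalies = [(i, r) for i, r in enumerate(readings) if r > threshold]
    return {
        "count": len(readings),
        "mean": sum(readings) / len(readings),
        "anomalies": anomalies,
    }


# Hypothetical temperature samples collected by one IoT node.
readings = [20.0, 21.0, 19.0, 35.0, 20.0]

# Only this small dictionary would be sent upstream, not the raw samples.
payload = summarize_at_edge(readings, threshold=30.0)
print(payload)
```

The saving grows with the sampling rate: the payload stays roughly the same size whether the node collected five samples or five million, as long as anomalies are rare.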

AI Accelerators

Both hardware- and software-based AI accelerators expedite machine learning. Hardware acceleration can target training, inferencing or both. In some instances, the hardware reduces the power requirement. In other cases, it increases the processing capacity. Several main types of chips, or processing units, exist for hardware acceleration. These include central processing units (CPUs), graphics processing units (GPUs), field-programmable gate arrays (FPGAs), systems-on-chip (SoCs), application-specific integrated circuits (ASICs), vision processing units (VPUs) and neuromorphic ICs. In addition to hardware acceleration, solutions on the market also include software-based approaches, such as machine learning frameworks for improving AI software development and optimizing system performance.

CPUs and GPUs

AI workloads have traditionally run on CPUs. While CPUs are designed to be all-purpose, they are often inadequate for the massive calculations used in model generation and inferencing. In response, companies including ARM (with its DynamIQ product offering) and Samsung (with its Exynos 9 series) have started making AI-specific chips. While ARM and Samsung have chosen to stick with AI-specific CPUs, others are shifting toward GPUs.

Originating in the video gaming industry and built for processing massive datasets, GPUs are a good match for machine learning. Because GPUs have more processing units per chip and higher throughput, plus more parallel processing capability than CPUs, they cut down computation time significantly. In addition, a GPU’s single processing unit weighs less than the multiple units CPUs use, making GPUs a better fit for constrained IoT devices, which require small and nimble components. The companies that are making AI-specific GPUs include AMD (Radeon Instinct), NEC (SX-Aurora), NVIDIA (DGX) and Qualcomm (Adreno).

FPGAs

While CPUs and GPUs have considerable processing power at their disposal and are effective for accelerating learning and inferencing, they spend a lot of time and energy moving data between memory and processing. Since CPUs and GPUs are densely packed with circuits, they can often overheat and cause system failures. For remotely located IoT devices, the combination of high energy consumption and potential system failures is far from ideal. It makes sense to find a way to offload some tasks to more energy-efficient hardware.

Based on programmable logic, FPGAs are a type of IC that can be reconfigured by customers or designers in the field after production. While generally not as powerful as CPUs or GPUs, FPGAs offer fast processing for some calculations (such as multiplication, addition, integration, differentiation and exponentials) by computing inside the chip instead of transferring data. Although an FPGA offers more flexibility, it tends to be quite bulky, so miniaturization for IoT devices is a challenge for this type of chip. The major companies that offer AI-targeted inference chips include NVIDIA (TensorRT) and Xilinx. Also, Microsoft is using FPGA chips to accelerate inference, and Intel is currently expanding its FPGA portfolio.

SoCs

SoCs can contain a combination of electronic components (microprocessors, microcontrollers, digital signal processors, on-chip memory, hardware accelerators, etc.). Due to the integration of these components onto a single semiconductor substrate, an SoC is more powerful than a microcontroller chip. In a smartphone, the SoC might integrate video, audio and image processing capabilities. ARM has developed its Machine Learning Processor and its Object Detection Processor – and these will be incorporated into SoCs in the future. HiSilicon, a Huawei-backed company, has licensed the IP from ARM to make SoCs that are seeing preliminary utilization in phone handsets and tablets. Also, HiSilicon is making the Ascend chips for Huawei. Another big player in the SoC space is Arteris, which is developing a network-on-chip interconnect fabric technology (FlexNoC) that many mobile and wireless companies are using. Because Arteris holds a dominant position in IP, it has a bird’s-eye view of the space. Other companies likely to soon be making a play in the AI SoC market include Intel (via its Movidius subsidiary), NXP, Renesas, Toshiba, Texas Instruments and STMicroelectronics.

ASICs, VPUs and Neuromorphic Chips

In the AI context, ASICs are built specifically to accelerate deep learning workloads (training, inference or both), with examples including Google’s Edge TPU and Intel’s Nervana. A vision processing unit (VPU) is designed to accelerate machine vision tasks and run machine vision algorithms, such as convolutional neural networks (CNNs) – so VPU video processing capabilities differ from those of a GPU, which does not offer the same type of task-specific processing. Examples of VPUs include Intel’s Movidius Myriad chips, Google’s Pixel Visual Core, the Holographic Processing Unit in Microsoft’s HoloLens, Inuitive’s NU series and Mobileye’s EyeQ.
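To make the workload concrete, the operation at the heart of a CNN is a small sliding-window multiply-accumulate over the image. The sketch below is a plain, unoptimized reference version written for illustration only. The function name `conv2d` and the toy image and kernel are assumptions, and it does not reflect how any of the chips above implement the operation internally.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    compute one multiply-accumulate sum per output position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Toy 3x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A 1x2 kernel that responds where brightness increases from left to right.
kernel = [[-1, 1]]

feature_map = conv2d(image, kernel)
print(feature_map)
```

Every output value is independent of the others, so the four nested loops are an obvious target for hardware that can run many multiply-accumulate units in parallel.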

Digital chips and analog chips have their respective deficiencies: digital circuitry is precise but gobbles energy, while analog circuitry keeps both latency and energy consumption low but lacks precision. Therefore, researchers are looking for ways to combine the technical advantages of digital and analog chips while sidestepping the weaknesses. Inspired by the human brain, neuromorphic chips are designed to adhere to what is essentially a digital architecture, but use analog circuitry for mixed-signal processing. IBM’s TrueNorth is a neuromorphic processor targeting sensor data pattern recognition and intelligence tasks. Also, Columbia University, Stanford University’s ‘Brains in Silicon’ project, and the DARPA-backed University of Michigan IC Lab are all working on various aspects of neuromorphic system implementation.
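Brain-inspired designs are commonly described in terms of spiking neurons, which accumulate input over time and emit a pulse only when a threshold is crossed. The sketch below is a simplified leaky integrate-and-fire model in discrete time, offered as an illustration of the general idea rather than of any particular chip. The function name `simulate_lif` and the `leak` and `threshold` values are assumptions.

```python
def simulate_lif(inputs, leak=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron.

    At each step the stored potential decays by the leak factor, the new
    input is added, and the neuron spikes (then resets) if the potential
    reaches the threshold.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after firing
        else:
            spikes.append(0)
    return spikes


# Hypothetical input currents over five time steps.
spikes = simulate_lif([0.5, 0.5, 0.0, 0.5, 0.5])
print(spikes)
```

Because the neuron produces output only when it fires, most time steps involve no communication at all, which is one reason such designs are attractive for low-power pattern recognition.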

Machine Learning Frameworks

AI accelerators also include software. For example, machine learning frameworks, which can be interfaces, libraries or tools, help reduce the complexity associated with machine learning so that developers can build models and optimize performance more quickly and easily. Such frameworks are built for specific languages, like Python or Java. Some of the most popular open-source machine learning frameworks and platforms include Caffe2, Keras, Theano and Google’s TensorFlow, along with offerings from Amazon (AWS), Apache and Microsoft (Azure). Also, some companies offer in-house platforms. For example, Intel’s OpenVINO toolkit is a software and hardware accelerator that optimizes inference with CNN models. In addition, Qualcomm’s Snapdragon is a mobile platform and a software accelerator, IBM has its Watson machine learning accelerator platform, and Huawei has recently launched its MindSpore AI framework.

First Steps with AI

Mouser now offers various items of hardware that can form the initial building blocks for AI implementation. Intel’s plug-and-play Neural Compute Stick 2 can aid engineers with early prototyping of deep neural networks. It relies on the company’s Movidius Myriad X VPU to deliver a compelling mix of power efficiency and performance – attaining 4 TOPS. Targeted at industrial computing, the highly compact AAEON UP AI Core processing module is based on the mini-PCI Express format. It also features an Intel Movidius VPU (this time the Myriad 2 2450 – with 512 MB of DDR memory, plus 12 VLIW programmable SHAVE cores and dedicated vision accelerators all built in). The Gumstix Aerocore 2 board draws on the CUDA cores of NVIDIA Jetson TX1 and TX2 modules to give it strong parallel processing capabilities, along with an ARM Cortex-M4 microcontroller and numerous peripherals. It is particularly well suited to object recognition, production line inspection and various other kinds of machine vision.
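A TOPS figure (tera-operations per second) can be turned into a rough lower bound on inference time. The arithmetic below is a back-of-the-envelope sketch: the function name `min_latency_ms` and the workload of 8 billion operations per inference are hypothetical values chosen for illustration, and real devices rarely sustain their peak rating because of memory bandwidth and utilization limits.

```python
def min_latency_ms(ops_per_inference, peak_tops):
    """Best-case time for one inference, assuming the device sustains
    its peak rating (an optimistic assumption)."""
    ops_per_second = peak_tops * 1e12  # 1 TOPS = 10^12 operations per second
    return ops_per_inference / ops_per_second * 1e3  # seconds to milliseconds


# Hypothetical network needing 8 billion operations per inference,
# running on a device rated at 4 TOPS.
latency = min_latency_ms(ops_per_inference=8e9, peak_tops=4)
print(f"{latency:.1f} ms lower bound per inference")
```

Treat the result as a floor rather than a prediction: measured latency on real hardware will typically be higher.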

Looking to the Future

With NVIDIA remaining dominant in industrial AI applications, most newcomers are focusing on the IoT AI space. GreenWave and Reduced Energy Microsystems are in the low-power chip arena, while Mythic and Syntiant are developing processors for battery-powered devices. Similarly, Wiliot is making a Bluetooth chip that can be powered by ambient radio frequencies. In the massively parallel data processing space, there are Vathys, Graphcore, Cerebras and Wave Computing. Meanwhile, Hailo Technologies and Horizon Robotics are working on specialized chips for autonomous vehicles. In the deep learning space, BrainChip has made the first spiking neural processor, Thinci has rolled out a streaming graph processor, and Gyrfalcon is developing a deep learning processor with proprietary AI processing in memory (APiM) technology. Lastly, at Groq, the ex-Googlers who designed Google’s TPU are developing a chip with ultra-low latency. Even as machine learning makes rapid progress, many technical challenges remain for IoT edge computing, with hardware and software developers continuing to reach for a better balance between processing performance and energy efficiency.


Mouser Electronics is a leading worldwide authorised distributor of semiconductors and electronic components for over 800 industry-leading manufacturers. It specialises in the rapid introduction of new products and technologies for design engineers and buyers. Mouser Electronics’ extensive product offering includes semiconductors, interconnects, passives, and electromechanical components.

About the author

Mark Patrick joined Mouser Electronics in July 2014 having previously held senior marketing roles at RS Components. Prior to RS, Mark spent 8 years at Texas Instruments in Applications Support and Technical Sales roles and holds a first class Honours Degree in Electronic Engineering from Coventry University.

© Copyright Codemotion srl Via Marsala, 29/H, 00185 Roma P.IVA 12392791005 | Privacy policy | Terms and conditions
