Codemotion Magazine

Adam Taylor, June 1, 2021

Embedded Processing in Programmable Logic

In the first article in this series co-produced with Mouser Electronics, we explored the range of FPGA devices produced by Xilinx and discussed the benefits of adopting such a system for developers, engineers, and end-users alike. Now, let’s dig a little deeper and discover what makes an FPGA tick.

The programmable logic found in FPGAs is an excellent solution for implementing parallel processing structures. Although programmable logic (PL) is ideal for workloads such as finite impulse response filters, image processing pipelines, and motor control algorithms, sometimes serial processing is necessary.

Situations in which this is the case include implementing communication protocols, graphical user interfaces, and the control, configuration, and status reporting of IP blocks. Serial processing is also essential if we want to work with advanced open-source frameworks and languages such as TensorFlow, OpenCV and Python.

There are several options open to us if we want to implement embedded processors in programmable logic devices. Taking a broad view, these fall into two groups:

  • Heterogeneous System-on-Chip: These devices combine programmable logic with a processing system that is hardened in the device’s silicon. This solution consequently offers outstanding performance but only limited configuration flexibility, because the processing system itself is fixed.
  • Soft-Core Embedded Processors: Programmable logic resources such as flip-flops (FFs), look-up tables (LUTs) and Block RAMs (BRAMs) are used to implement soft-core processors. Consequently, these processors offer more configuration possibilities, but their performance is usually lower than that of a hardened processor.

As we will discover, both solutions – heterogeneous SoC and soft-core embedded – offer a variety of use cases across several exciting applications.

It is also possible to implement additional soft-core processors in the programmable logic of heterogeneous SoCs. This is not unusual and can be used to create a big.LITTLE system in which time-critical tasks are off-loaded from the main processor.

Table Of Contents
  1. Embedded Processors in Xilinx
  2. Soft-Core Processors – spoilt for choice!
  3. The big.LITTLE Approach
  4. Software Development 
  5. Conclusion 

Embedded Processors in Xilinx

The Zynq-7000 SoC and Zynq MPSoC product families in the Xilinx range both offer embedded processors. Devices from these lines offer genuinely heterogeneous processing systems on the same silicon. In architectural terms, the processor system initially boots in the manner of a traditional processor before subsequently configuring the programmable logic. 

The Zynq-7000 SoC, the first of these families to be introduced, combines programmable logic with single- or dual-core 32-bit Arm Cortex-A9 processors.

The processing system also provides interfaces to both volatile and non-volatile memory, as well as a number of peripherals such as Ethernet, UART and CAN.

The Cortex-A9 cores also include a floating-point unit and a NEON engine (also called the Media Processing Engine, or MPE) to support high-performance applications. Thanks to the NEON engine, large data sets can be processed in parallel using single instruction, multiple data (SIMD) operations, in which one instruction acts on several data elements at once.

Image and audio processing benefit from this in particular, as do similar applications in which data sets need to be processed using simple instructions (e.g. multiply and add) repetitively, with little control code. In such applications, performance can be noticeably improved by leveraging the SIMD unit.
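
The benefit of SIMD for repetitive multiply-and-add work can be sketched in a few lines of Python. This is only a conceptual model, not NEON code: the function names are hypothetical, and the lane count of four assumes 32-bit elements packed into a 128-bit register. It simply counts how many "instruction steps" each approach needs to produce the same result.

```python
from typing import Sequence, Tuple

LANES = 4  # assumption: four 32-bit elements per 128-bit SIMD register


def scalar_mac(a: Sequence[int], b: Sequence[int]) -> Tuple[int, int]:
    """Multiply-accumulate one element per step; returns (result, steps)."""
    acc = 0
    steps = 0
    for x, y in zip(a, b):
        acc += x * y
        steps += 1
    return acc, steps


def simd_mac(a: Sequence[int], b: Sequence[int], lanes: int = LANES) -> Tuple[int, int]:
    """Model a SIMD multiply-accumulate; returns (result, vector steps).

    Each vector step handles up to `lanes` element pairs at once, keeping
    one partial sum per lane. The partial sums are combined at the end.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    partial = [0] * lanes
    steps = 0
    for start in range(0, len(a), lanes):
        pairs = zip(a[start:start + lanes], b[start:start + lanes])
        for lane, (x, y) in enumerate(pairs):
            partial[lane] += x * y
        steps += 1
    return sum(partial), steps


if __name__ == "__main__":
    samples = list(range(8))
    coeffs = [2] * 8
    print(scalar_mac(samples, coeffs))
    print(simd_mac(samples, coeffs))
```

Both functions return the same sum, but the SIMD model needs a quarter of the steps for this input, which is the effect the article describes for data-heavy code with little control flow.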

Advanced eXtensible Interfaces (AXI) are used to effect data transfer between the processing system and the programmable logic. This allows either the processor system or the programmable logic to initiate the transaction. Data can thus be easily transferred to and from the processor system’s DDR memory. 

Because it combines the processing system and programmable logic in this way, the Zynq-7000 series is an exceptional choice for applications such as image processing, robotics, and augmented reality that require both serial and parallel processing. 

Both sides of this pairing can be adapted to improve connectivity and make use of the support offered to a broad range of frameworks and applications. Central elements or algorithms can be accelerated using programmable logic, while the processing system benefits from Embedded Linux solutions.  

Combining PS (processing system) and PL provides for a more responsive and deterministic solution. The table that follows provides a simple illustration based on implementing AES encryption.

Operating System | Processor System Clocks | PS Clocks with Programmable Logic | Reduction in Processing Time
Baremetal | 28574 | 7104 | 75%
FreeRTOS | 28368 | 7104 | 75%
Linux | 36662 | 15644 | 54.8%

Processing capabilities underwent a major increase as the Zynq-7000 SoC evolved to become the next-generation Zynq MPSoC, and the latest logic fabric was added. Heterogeneous processors were included for the first time, giving developers the opportunity to deal with multiple challenges within the same device. 

The Zynq MPSoC processing system incorporates: 

  • Application Processing Unit – quad or dual 64-bit Arm Cortex-A53 processors
  • Real Time Processing Unit – dual lockstep 32-bit Arm Cortex-R5 processors
  • Platform Management Unit – Triple Modular Redundant 32-bit MicroBlaze processor, implemented in silicon
  • Graphics Processor Unit – Arm Mali-400 MP GPU 
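
The two redundancy schemes named in this list, lockstep and triple modular redundancy (TMR), can be illustrated with a short conceptual sketch. It models only the general idea and is not a description of how the silicon implements it; the function names are hypothetical.

```python
def lockstep_match(primary: int, checker: int) -> bool:
    """Lockstep: two cores run the same code and their outputs are compared.

    A mismatch signals a fault, but cannot tell which core is wrong.
    """
    return primary == checker


def tmr_vote(a: int, b: int, c: int) -> int:
    """Triple modular redundancy: bitwise majority vote over three copies.

    Each output bit takes the value held by at least two of the inputs,
    so a fault in any single copy is masked.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b1010
    faulty = 0b0110  # two bits flipped in one copy
    print(lockstep_match(good, faulty))       # fault detected
    print(bin(tmr_vote(good, good, faulty)))  # fault masked
```

The difference in behaviour is the point: lockstep detects an error, while TMR can also carry on with the correct value.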

In addition to these four processing groups, which are available to the developer for programming, the MPSoC processing system offers a configuration security unit that allows engineers to implement safety and security processing and security event responses.

Having such a broad array of processing solutions allows for single-chip solutions to be created for many applications. In the automotive field, for example, complex algorithms and user interfaces can be implemented using the APU and GPU, while real-time control and vehicle control interfacing can utilise the RPU, designed and certified for ISO 26262 or IEC 61508 applications.

AXI interfaces are again used for communication between the PS and PL, although this time 128-bit interfaces replace the 32-bit interfaces, increasing the throughput between PS and PL to a significant degree.

High-performance vision-based machine learning applications, such as those often used in automotive or other edge-based solutions, can be implemented as a result of this high bandwidth capability. The Zynq-7000 SoC and Zynq MPSoC devices consequently offer some of the highest-performance processor systems available alongside programmable logic.

To see this at work, consider the image processing application example in Figure 1. Image data is transferred between the processor system and the programmable logic to implement the desired algorithm.
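
The effect of a wider interface on throughput is simple arithmetic: the theoretical peak is the bus width in bytes multiplied by the clock frequency, assuming one data beat per clock. The sketch below works this through for the two widths mentioned above. The 100 MHz clock is purely an assumption for illustration and is not a figure from this article; real throughput also depends on burst length, interconnect, and memory behaviour.

```python
ASSUMED_CLOCK_HZ = 100_000_000  # assumption: 100 MHz interface clock, illustration only


def peak_throughput_bytes_per_s(width_bits: int, clock_hz: int) -> int:
    """Theoretical peak throughput of a bus moving one beat per clock cycle."""
    if width_bits <= 0 or width_bits % 8 != 0:
        raise ValueError("width_bits must be a positive multiple of 8")
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return (width_bits // 8) * clock_hz


if __name__ == "__main__":
    for width in (32, 128):
        rate = peak_throughput_bytes_per_s(width, ASSUMED_CLOCK_HZ)
        print(f"{width}-bit interface: {rate / 1e6:.0f} MB/s peak")
```

At the same clock, the 128-bit interface moves four times as much data per cycle as the 32-bit one.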

Figure 1: Zynq MPSoC Processing System Interfacing with PL Image Processing Chain.

Soft-Core Processors – spoilt for choice!

The Xilinx ecosystem offers a very wide choice of soft-core processors. The FFs, LUTs (look-up tables), and RAMs of Xilinx FPGAs can be used to implement any processor described in RTL.

The most popular choices include:

  • MicroBlaze – a 32-bit processor available in a range of configurations, from a simple microcontroller to a core with full MMU (Memory Management Unit) support capable of running embedded Linux;
  • Arm Cortex-M1 – with a small logic footprint and great code density, courtesy of the Thumb instruction set, this is a 32-bit FPGA implementation of the popular Cortex-M0;
  • Arm Cortex-M3 – another 32-bit implementation, this time of the Cortex-M3 processor. It offers an optional memory protection unit (MPU) rather than an MMU, support for real-time operating systems, and good code density derived from Thumb/Thumb-2 instruction set support. A popular choice for Internet of Things applications;
  • RISC-V – an open-source instruction set with 32-, 64- and 128-bit variants. RISC-V compliant implementations are available from a number of IP vendors for use in Xilinx FPGAs. Highly customizable, like MicroBlaze, RISC-V can also run embedded operating systems including Linux.

Although a soft-core processor inevitably provides less performance than a hard silicon instantiation, the greater configuration possibilities and adaptability of the soft-core option mean that a much more highly customized solution can be implemented.

A soft-core processor can also be portable, covering the needs of several devices or even vendors (depending upon the precise selection of processor).

As with hard silicon processors, AXI is often the interface of choice for connecting peripherals to soft-core processors. In this context, “peripherals” includes DDR memory interfaces, UARTs, and popular serial interfaces such as I2C and SPI. Figure 2 provides an illustration: a MicroBlaze processor configures and controls a high-speed image processing pipeline.

Figure 2: MicroBlaze Image Processing Application.

So, how does an engineer make the choice between the implementation of a hard or soft-core processor? 

Performance is always a big factor, but an engineer might also consider application-specific needs, flexibility, security, resource availability, portability, and licensing.

Each of these factors carries a different weight in every application, but together they are what designers and developers should consider when determining the best choice for their particular situation.

To usefully compare processor capabilities, inherent processing power first has to be compared. The Dhrystone benchmark, reported in DMIPS (Dhrystone Millions of Instructions Per Second) and normalised per MHz of clock frequency, is used to make this comparison. Lining up the hard and soft cores being considered in a table, it becomes clear that the hard embedded processors generally deliver more performance per MHz.
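
One simple way to make this weighting explicit is a weighted decision matrix. The sketch below is only an illustration of the idea: the factor names come from the list above, but every weight and score in it is a made-up placeholder that a design team would replace with its own judgment.

```python
from typing import Dict


def weighted_score(weights: Dict[str, float], scores: Dict[str, float]) -> float:
    """Return the weighted average of per-factor scores.

    `weights` says how much each factor matters for this application;
    `scores` rates one candidate processor against each factor.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in weights) / total_weight


if __name__ == "__main__":
    # Hypothetical weights for an imaginary application (higher = more important).
    weights = {"performance": 3, "flexibility": 2, "security": 2,
               "resources": 1, "portability": 1, "licensing": 1}
    # Hypothetical 1-5 ratings; not measurements of any real device.
    hard_core = {"performance": 5, "flexibility": 2, "security": 5,
                 "resources": 4, "portability": 2, "licensing": 3}
    soft_core = {"performance": 2, "flexibility": 5, "security": 3,
                 "resources": 2, "portability": 4, "licensing": 3}
    print("hard core:", round(weighted_score(weights, hard_core), 2))
    print("soft core:", round(weighted_score(weights, soft_core), 2))
```

Changing the weights to match a different application can reverse the ranking, which is exactly why no single answer suits every design.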

Processor | DMIPS/MHz | Comment
Cortex-A53 | 2.3 | Quad or dual processors
Cortex-A9 | 2.3 | Dual or single
Cortex-R5 | 1.67 | Dual or lockstep
MicroBlaze | 1.04 – 1.31 |
Cortex-M1 | 0.8 |
Cortex-M3 | 1.25 |
RISC-V | 1.7 | Depends on implementation
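
Because the table is normalised per MHz, an estimate of absolute Dhrystone performance also needs the clock frequency: DMIPS is approximately DMIPS/MHz multiplied by the clock in MHz. The sketch below applies that formula to the table's figures. The clock frequencies in the example are assumptions chosen for illustration only, and the upper MicroBlaze figure is used for simplicity.

```python
# DMIPS/MHz figures taken from the table above.
# MicroBlaze uses the upper end of its 1.04 - 1.31 range.
DMIPS_PER_MHZ = {
    "Cortex-A53": 2.3,
    "Cortex-A9": 2.3,
    "Cortex-R5": 1.67,
    "MicroBlaze": 1.31,
    "Cortex-M1": 0.8,
    "Cortex-M3": 1.25,
    "RISC-V": 1.7,
}


def estimate_dmips(processor: str, clock_mhz: float) -> float:
    """Estimate single-core Dhrystone MIPS at a given clock frequency."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return DMIPS_PER_MHZ[processor] * clock_mhz


if __name__ == "__main__":
    # Assumed clocks, for illustration only: a hard core typically runs
    # much faster than a soft core built from programmable logic.
    print("Cortex-A53 @ 1000 MHz:", estimate_dmips("Cortex-A53", 1000))
    print("MicroBlaze @ 100 MHz:", estimate_dmips("MicroBlaze", 100))
```

The per-MHz gap between hard and soft cores is modest, so most of the real-world difference comes from the clock frequency each can reach.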

The needs of the application are equally important; for example, if the processor core only needs to configure IP within the processing system or implement serial communication protocols, then a soft-core-based processor may well be the preferred choice.

On the other hand, for high-performance algorithms that demand powerful processing capabilities, hard-core processors undeniably have a performance advantage. 

Another major factor affecting choice may be security. This is particularly important for edge applications. Hard processing solutions like the Zynq MPSoC incorporate security measures such as a Configuration Security Unit, Secure Boot and Arm TrustZone. Soft-core processors often require that separate security protections are added to the programmable logic.

Configuration is one of the biggest points of difference between hard- and soft-core processors: the processor dominates in a hard-core system, booting first and configuring the programmable logic to the desired specifications.

This allows the implementation of several power-saving modes, for example, powering down a processor core, peripherals, or even the entirety of the programmable logic. 

By contrast, an FPGA must first be configured to instantiate a soft-core processor. Once that has successfully occurred, the processor can begin operation. The ability of the soft-core processor to implement power down/power saving schemes is therefore limited, although a lot can still be achieved using lower clock frequencies.

The big.LITTLE Approach

Nonetheless, applications that simultaneously use both hard and soft processors in the same solution are seen with increasing frequency.  The image below illustrates this approach.

A big.LITTLE approach like this focuses the high-performance application processor on the high-level application and delegates real-time applications such as sensor interfacing and motor control to the soft-core processor in the programmable logic. 

It also offers a more responsive solution than an application processor in isolation.

Figure 3: big.LITTLE Approach with the Zynq MPSoC and Arm Cortex-M3.

A correctly architected big.LITTLE system also allows the main application to be updated as needed without changing the code executing on the little processor. Likewise, sensor changes and updates can be addressed easily by updating only the code running in the soft-core processor.

Software Development 

The Vivado Design Suite is used in the development of both hard and soft-core processor solutions, either to configure the hard-core processor or to implement the soft-core solution. Once configuration is complete, Vitis, the Xilinx Unified Software Platform, can make use of the design description.

Vitis already supports application software development for the processors in Zynq-7000 SoC and Zynq MPSoC devices and for MicroBlaze. Development aimed at third-party processors such as Arm Cortex-M1, Cortex-M3, and RISC-V will make use of toolchains provided by the processor core’s creator – Arm Keil, for example.

JTAG or Serial Wire interfaces allow users to debug these solutions at the software development stage. Breakpoints can be set, registers watched, and memory locations monitored, so that users can more easily identify the root cause of any problem.

Conclusion 

Thanks to the range of embedded processors that can be incorporated into programmable logic devices, there is a suitable processor available for each and every use case. 

Making a start with these solutions is simple: embedded processor solutions, whether hard- or soft-core, can be created easily – no need to write even a single line of HDL! Vivado’s IP Integrator allows developers to focus on their application.  

Join us for the next instalment in this series as we explore the design tools Xilinx has created to make integrating programmable logic a stress-free process. Or, if you want more technical information about the hardware you can use for your project, visit the Mouser website.

Adam Taylor
Adam Taylor is an expert in the design and development of embedded systems and FPGAs for several end applications. He is the author of numerous articles and papers on electronic design and FPGA design, a Chartered Engineer, Fellow of the Institute of Engineering and Technology, Visiting Professor of Embedded Systems at the University of Lincoln and Arm Innovator. He is also the owner of the engineering and consultancy company Adiuvo Engineering and Training.