Top 15 Edge AI Chip Makers with Use Cases

Cem Dilmegani
updated on Jun 4, 2026

The demand for low-latency processing has driven innovation in edge AI chips. These processors are designed to perform AI computations locally on devices rather than relying on cloud-based solutions.

Based on our experience analyzing AI chip makers, we identified the leading solutions for robotics, industrial IoT, and embedded systems.

| Solution | Performance (TOPS)* | Power Consumption | Primary Applications |
|---|---|---|---|
| NVIDIA Jetson AGX Orin | 275 | 15 – 60W | Robotics, Autonomous Systems |
| Axelera Metis AI Platform | Up to 214 | ~4 – 15W | High-Throughput Vision |
| Renesas RZ/V2H | Up to 80 (sparse) / 8 (INT8) | ~10W (10 TOPS/W) | Robotics, Real-Time Vision Control |
| NVIDIA Jetson Orin Nano Super | 67 (sparse) / 33 (dense) | 7 – 25W | GenAI, Robotics, Vision |
| EdgeCortix SAKURA-II | 60 | <10W (~8W typ.) | Vision AI, Edge Servers |
| SiMa.ai MLSoC | 50+ | <5W | Embedded Vision, Edge Inference |
| Hailo-10H | 40 (INT4) / 20 (INT8) | 2.5W typ. | On-Device GenAI, LLMs/VLMs, Automotive |
| Hailo-8 | 26 | 2.5 – 3W | Smart Cameras, Automotive |
| Ambarella CV5 | Not disclosed | <2W (8K30) / <5W (8K60) | AI Cameras, Automotive |
| Qualcomm Robotics RB5 | 15 | 5 – 15W | 5G Robots, Edge AI Devices |

*TOPS = Tera Operations Per Second. These are maximum quoted values by vendors.
**Kria K26 performance varies depending on the FPGA configuration.

Analysis of Edge AI chips

1. NVIDIA Jetson AGX Orin

NVIDIA Jetson AGX Orin delivers 275 TOPS, the highest quoted figure among the modules compared here. The module is built on NVIDIA’s Ampere architecture and is designed for robotics and autonomous systems that require significant on-device processing capabilities.

Key specifications:

  • Power consumption: 15-60W (configurable based on workload)
  • Memory: Up to 64GB LPDDR5
  • Software: Full CUDA support, compatibility with NVIDIA’s datacenter AI stack

The power consumption range of 15-60W provides flexibility for different deployment scenarios. Lower power modes can extend battery life in mobile robotics applications, while maximum performance mode supports multiple concurrent AI workloads.
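
The effect of the power mode on battery life can be estimated with simple arithmetic. The sketch below is an idealized illustration only: the 100 Wh battery is a hypothetical value, the 15 W and 60 W endpoints are the figures from the comparison table, and it ignores motors, sensors, and conversion losses that a real robot would also draw.

```python
def runtime_hours(battery_wh: float, module_watts: float) -> float:
    """Idealized runtime: battery energy divided by the module's power draw."""
    if module_watts <= 0:
        raise ValueError("module_watts must be positive")
    return battery_wh / module_watts


# Hypothetical 100 Wh battery pack; 15 W and 60 W are the ends of the quoted range.
for watts in (15, 60):
    print(f"{watts} W -> {runtime_hours(100, watts):.1f} h")
# 15 W -> 6.7 h
# 60 W -> 1.7 h
```

Under these assumptions, the lowest power mode runs about four times longer than the highest one, at the cost of reduced throughput.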

NVIDIA’s software ecosystem represents a significant advantage. Models developed for NVIDIA datacenter GPUs can be deployed on Jetson with minimal modification. This compatibility reduces development time for teams already working within the NVIDIA ecosystem.

2. Axelera Metis AI Platform

Axelera’s Metis AI platform delivers up to 214 TOPS for high-throughput vision inference workloads. The platform uses Digital In-Memory Computing (D-IMC) architecture to improve throughput and efficiency.

Key specifications:

  • Performance: Up to 214 TOPS
  • Power consumption: ~4-15W
  • Architecture: Digital In-Memory Computing (D-IMC)
  • Target: Computer vision inference

The D-IMC architecture performs computations directly within memory arrays, reducing data movement between memory and processing units. This approach addresses the memory bandwidth bottleneck that limits performance in traditional architectures.

Axelera targets applications that require the simultaneous processing of multiple video streams. The high throughput enables real-time analysis of dozens of camera feeds from a single device.
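
A rough way to size a multi-stream deployment is to divide the aggregate frame rate a device sustains for a given model by the frame rate each camera needs. The sketch below shows that arithmetic; the 900 frames-per-second figure is a hypothetical placeholder, not an Axelera benchmark, and should be replaced with a measured value for the actual model.

```python
import math


def max_streams(device_fps: float, per_stream_fps: float) -> int:
    """Number of camera streams one device can serve at a given per-stream frame rate."""
    if device_fps < 0 or per_stream_fps <= 0:
        raise ValueError("frame rates must be positive")
    return math.floor(device_fps / per_stream_fps)


# Hypothetical: a detection model measured at 900 frames/s in aggregate, 30 fps cameras.
print(max_streams(900, 30))  # 30
```

Lowering the per-stream frame rate (for example, analyzing every other frame) is a common way to fit more cameras on the same device.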

Use cases:

  • Multi-camera surveillance systems
  • Smart city infrastructure
  • Retail analytics with dense camera deployments
  • Industrial quality inspection systems

Axelera received €61.6 million in funding from the EuroHPC Joint Undertaking in March 2025, supporting the development of their Titania chiplet for deployment by 2028.

3. Renesas RZ/V2H

Renesas RZ/V2H delivers up to 80 TOPS using sparse computation, or 8 TOPS at INT8, targeting robotics and factory automation that combine vision AI with real-time control on a single chip. The MPU integrates the third-generation DRP-AI3 accelerator alongside both application and real-time processor cores.

Key specifications:

  • Performance: Up to 80 TOPS (sparse) or 8 TOPS (INT8)
  • Power efficiency: 10 TOPS/W
  • AI accelerator: DRP-AI3 (Dynamically Reconfigurable Processor for AI)
  • CPU: Quad-core Cortex-A55 at 1.8GHz, dual-core Cortex-R8 at 800MHz, and a Cortex-M33
  • Interfaces: PCIe Gen3, USB 3.2, Gigabit Ethernet

The DRP-AI3 uses INT8 quantization and hardware-supported unstructured pruning to achieve its rated performance, enabling the chip to run object recognition without a heat sink. A separate dynamically reconfigurable processor handles image processing such as OpenCV operations, offloading that work from the CPU cores.
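
One way to read the gap between the sparse and INT8 figures: if an accelerator skips multiplications by zero weights, pruning reduces the work per inference and raises the nominal throughput it can quote. The sketch below is a minimal pure-Python illustration of unstructured magnitude pruning, not Renesas's implementation. The weight values and the 90% sparsity level are hypothetical, chosen so that one in ten weights survives.

```python
def prune_by_magnitude(weights: list[float], sparsity: float) -> list[float]:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def nonzero_count(weights: list[float]) -> int:
    """Multiplications that remain if zero weights are skipped."""
    return sum(1 for w in weights if w != 0.0)


# Hypothetical layer weights.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, 0.3, -0.03, 0.6, 0.04]
pruned = prune_by_magnitude(weights, sparsity=0.9)
print(nonzero_count(weights), "->", nonzero_count(pruned))  # 10 -> 1
```

Whether a pruned model keeps acceptable accuracy depends on the model and on retraining, which this sketch does not cover.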

The combination of application cores and real-time cores allows the RZ/V2H to process image recognition results and apply them to mechanical control on the same device. This suits applications where vision inference and motion control run together.

Use cases:

  • Service robots requiring real-time control
  • Autonomous mobile robots
  • Industrial drones
  • Factory automation and machine vision

4. NVIDIA Jetson Orin Nano Super

NVIDIA Jetson Orin Nano Super delivers 67 TOPS with sparse operations or 33 TOPS with dense operations, positioning it below the Jetson AGX Orin as a lower-cost entry point to the same software stack.

Key specifications:

  • Performance: 67 TOPS (sparse) or 33 TOPS (dense INT8)
  • Power consumption: 7-25W (configurable based on workload)
  • GPU: Ampere architecture with 1,024 CUDA cores and 32 Tensor cores
  • CPU: 6-core Arm Cortex-A78AE at 1.7GHz
  • Memory: 8GB LPDDR5 with 102 GB/s bandwidth
  • Software: Full CUDA support, NVIDIA JetPack SDK

The 67 TOPS figure represents a 1.7 times increase over the previous Jetson Orin Nano, achieved through a power mode that raises the GPU, CPU, and memory clocks. Existing Orin Nano owners can apply the increase through a software update rather than buying new hardware.

The memory bandwidth and software compatibility allow the module to run small language models, vision transformers, and vision-language models on the device. Models developed for other Jetson and NVIDIA platforms transfer with minimal modifications, reducing development time for teams already using the NVIDIA ecosystem.
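
Memory bandwidth matters for language models because, as a common rule of thumb, generating each token requires reading roughly all model weights once. The sketch below turns that rule into an upper bound on decode speed. The 102 GB/s value is the module's quoted bandwidth; the 3-billion-parameter model with 4-bit weights is a hypothetical example, and the bound ignores the KV cache, activations, and compute limits.

```python
def max_tokens_per_second(
    bandwidth_gb_s: float, params_billion: float, bytes_per_param: float
) -> float:
    """Bandwidth-bound ceiling on decode speed: one full weight read per token."""
    model_gb = params_billion * bytes_per_param
    if model_gb <= 0:
        raise ValueError("model size must be positive")
    return bandwidth_gb_s / model_gb


# 102 GB/s from the spec list; hypothetical 3B-parameter model at 0.5 bytes per weight.
print(round(max_tokens_per_second(102, 3, 0.5)))  # 68
```

Measured throughput will be lower than this ceiling, so it is best used to rule out configurations rather than to predict performance.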

Use cases:

  • Robotics navigation and perception
  • Vision AI prototypes
  • On-device language and vision-language models
  • Education and research

5. EdgeCortix SAKURA-II

EdgeCortix SAKURA-II delivers 60 TOPS with power consumption under 10W, targeting edge AI servers and high-performance vision applications. The platform features a reconfigurable architecture that adapts to different AI workloads.

Key specifications:

  • Performance: 60 TOPS
  • Power consumption: <10W
  • Architecture: Dynamic Neural Accelerator (DNA)
  • Software: MERA compiler supporting TensorFlow, PyTorch, ONNX

The SAKURA-II platform’s reconfigurable architecture allows optimization for different neural network topologies without hardware changes. This flexibility enables the deployment of emerging model architectures without requiring chip replacements.

Use cases:

  • Edge data centers
  • Distributed AI inference systems
  • Multi-model deployment scenarios
  • Vision AI workloads requiring flexibility

6. SiMa.ai MLSoC

SiMa.ai’s MLSoC (Machine Learning System-on-Chip) delivers over 50 TOPS while maintaining power consumption below 5W. The chip targets embedded vision applications requiring high performance in power-constrained environments.

Key specifications:

  • Performance: 50+ TOPS
  • Power consumption: <5W
  • Software: SiMa Platform SDK
  • Architecture: Optimized for vision transformers and CNNs

SiMa.ai designed the MLSoC specifically for computer vision workloads. The sub-5W power envelope enables deployment in battery-powered devices that require sustained high-performance inference.

Use cases:

  • Autonomous mobile robots
  • Drone-based inspection systems
  • Smart cameras for surveillance and analytics
  • Augmented reality devices

7. Hailo-10H

Hailo-10H delivers 40 TOPS at INT4, or 20 TOPS at INT8, while consuming 2.5W typical, extending Hailo’s vision-focused hardware to generative AI workloads. The accelerator uses the company’s second-generation neural core architecture in an M.2 module form factor.

Key specifications:

  • Performance: 40 TOPS (INT4) or 20 TOPS (INT8)
  • Power consumption: 2.5W typical
  • Form factor: M.2 module
  • Memory: On-module LPDDR4/4X
  • Software: Supports TensorFlow, PyTorch, ONNX, and Keras

A direct DDR interface, which the Hailo-8 lacks, enables the Hailo-10H to load larger models for language and vision-language inference. Hailo reports that the chip runs 2-billion-parameter models at around 10 tokens per second and generates images with Stable Diffusion 2.1 in under five seconds. For conventional vision tasks, it runs object detection on a 4K video stream. These figures are vendor-reported rather than independently measured. [1]
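
The link between external memory and model size can be made concrete by estimating the weight footprint. The sketch below is a back-of-the-envelope calculation, not a Hailo tool: it counts weights only and ignores activations, the KV cache, and runtime overhead.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate storage for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# A 2-billion-parameter model at the two precisions quoted for the Hailo-10H.
for bits in (4, 8):
    print(f"INT{bits}: {weight_footprint_gb(2, bits):.1f} GB")
# INT4: 1.0 GB
# INT8: 2.0 GB
```

Footprints of this size are far larger than typical on-chip memory, which is why an accelerator needs attached DRAM to hold such models.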

The Hailo-10H is qualified to AEC-Q100 Grade 2 for automotive use, with vehicle production targeted for 2026. It became available to order in mid-2025 as standalone M.2 modules.

Use cases:

  • Hybrid pipelines combining vision and generative AI
  • On-device large language and vision-language models
  • Smart cameras with scene understanding
  • In-vehicle infotainment and driver monitoring

8. Hailo-8 AI Accelerator

Hailo-8 delivers 26 TOPS while consuming only 2.5-3W, representing one of the highest performance-per-watt ratios among edge AI chips.

Key specifications:

  • Performance: 26 TOPS
  • Power consumption: 2.5-3W
  • Form factors: M.2 module, PCIe card
  • Software: Hailo SDK with model zoo

The chip supports standard neural network layers and can run models developed in TensorFlow, PyTorch, and ONNX. Hailo’s compiler converts these models to run on the accelerator.

9. Ambarella CV5

Ambarella’s CV5 system-on-chip delivers over 20 TOPS specifically optimized for computer vision in automotive and camera applications. The chip combines AI processing with advanced image signal processing (ISP) capabilities.

Key specifications:

  • Performance: 20+ TOPS
  • Power consumption: 2.5-5W
  • Architecture: CVflow AI engine
  • Integrated: 4K/8K video encoding, advanced ISP

The CV5’s integrated ISP handles complex image preprocessing, reducing the computational burden on the AI engine. This integration improves overall system efficiency for vision-based applications.

Use cases:

  • ADAS and autonomous driving cameras
  • Professional surveillance systems
  • AI-powered dashcams
  • Drone imaging systems

10. Qualcomm Robotics RB5 Platform

Qualcomm’s Robotics RB5 platform integrates 5G connectivity with edge AI processing, delivering approximately 15 TOPS through its Qualcomm AI Engine. The platform targets autonomous robots and drones that require both high-bandwidth connectivity and on-device AI processing.

Key specifications:

  • AI performance: 15 TOPS
  • Connectivity: 5G, Wi-Fi 6, Bluetooth 5.1
  • Processing: Qualcomm Kryo 585 CPU, Adreno 650 GPU, Hexagon 698 DSP
  • Power consumption: 5-15W

The integration of 5G offers high-bandwidth, low-latency connectivity for applications that require real-time cloud communication.

The RB5 platform supports up to 7 concurrent camera inputs. This multi-camera capability supports 360-degree perception systems for autonomous mobile robots.

Use cases:

  • Autonomous delivery robots
  • Industrial inspection drones
  • Warehouse automation systems
  • Connected vehicles

11. Kneron KL730

Kneron’s KL730 AI SoC delivers 7 TOPS with ultra-low power consumption, targeting IoT and smart home applications. The chip emphasizes edge processing for privacy-sensitive applications.

Key specifications:

  • Performance: 7 TOPS
  • Power consumption: 0.5-2W
  • Architecture: Kneron NPU with quad-core Arm Cortex-A55
  • Software: Kneron PLUS SDK

The KL730’s low power consumption enables always-on AI processing in battery-powered devices. The chip supports face recognition, object detection, and gesture recognition with minimal energy draw.

Use cases:

  • Smart doorbells and security cameras
  • Smart home hubs
  • Wearable devices
  • IoT sensors with AI capabilities

12. Rockchip RK3588 SoC

The RK3588 is an 8-core SoC featuring a 6 TOPS neural processing unit. The chip targets single-board computers and edge devices requiring moderate AI performance alongside general-purpose computing capabilities.

Key specifications:

  • CPU: Quad-core Cortex-A76 + Quad-core Cortex-A55
  • NPU: 6 TOPS
  • GPU: Mali-G610 MP4
  • Power consumption: 8-15W
  • Memory: Support for up to 32GB LPDDR4/5

The 6 TOPS NPU handles neural network inference for computer vision, natural language processing, and audio processing tasks.

Use cases:

  • Digital signage with content recognition
  • Edge gateways with AI preprocessing
  • Smart home hubs
  • Industrial HMI panels

The RK3588’s general-purpose computing capabilities make it suitable for applications where AI inference is one component of a larger system. Organizations building edge devices that combine AI with web servers, databases, or other software services have adopted this SoC.

13. NXP i.MX 8M Plus

NXP’s i.MX 8M Plus features a 2.3 TOPS neural processing unit, designed specifically for industrial IoT applications. The processor prioritizes reliability, security, and long-term availability over maximum performance.

Key specifications:

  • NPU: 2.3 TOPS
  • CPU: Quad-core Cortex-A53, Cortex-M7 real-time core
  • Power consumption: 3-8W
  • Security: EdgeLock secure enclave

The inclusion of a Cortex-M7 real-time core enables deterministic processing for time-critical control loops. This architecture supports applications that combine AI-based decision-making with real-time control, such as industrial robots and automated manufacturing equipment.

NXP’s EdgeLock security features provide hardware-based secure boot, encrypted storage, and secure key management.

Use cases:

  • Industrial automation
  • Medical devices
  • Building automation
  • Smart agriculture

14. Renesas RZ/V2L

Renesas RZ/V2L delivers 1.0 TOPS optimized for industrial vision applications with extremely low power consumption. The chip targets factory automation and quality inspection systems.

Key specifications:

  • Performance: 1.0 TOPS
  • Power consumption: 1.5-3W
  • Architecture: DRP-AI (Dynamically Reconfigurable Processor for AI)
  • CPU: Dual-core Cortex-A55

The DRP-AI architecture provides flexibility for different vision algorithms while maintaining low power consumption. This design suits industrial environments requiring long-term reliability and deterministic performance.

Use cases:

  • Factory quality inspection
  • Industrial cameras
  • Process monitoring systems
  • Automated sorting systems

15. AMD Xilinx Kria K26 SOM

The Kria K26 System-on-Module combines a Zynq UltraScale+ MPSoC with FPGA fabric, enabling adaptive edge AI solutions. The FPGA architecture enables customization of the processing pipeline for specific computer vision and sensor-fusion workloads.

Key specifications:

  • Processing: Quad-core Arm Cortex-A53, dual-core Arm Cortex-R5F
  • FPGA: UltraScale+ programmable logic
  • Power consumption: 5-15W
  • Memory: 4GB DDR4

AMD provides pre-built vision AI applications through the Kria KV260 Vision AI Starter Kit. These applications include smart camera implementations with capabilities for object detection, classification, and tracking.

Advantages:

  • Customizable processing pipeline
  • Low-latency sensor interfaces
  • Adaptable to new AI model architectures

Limitations:

  • Requires FPGA development expertise for custom implementations
  • Performance depends on FPGA configuration
  • Higher development complexity compared to fixed-function accelerators

Performance vs. Power Consumption Analysis

Edge AI chips face a tradeoff between performance and power consumption.

High Performance (>50 TOPS):

  • NVIDIA Jetson AGX Orin (275 TOPS, 15-60W)
  • Axelera Metis (214 TOPS, ~4-15W)
  • EdgeCortix SAKURA-II (60 TOPS, <10W)
  • SiMa.ai MLSoC (50+ TOPS, <5W)

These solutions target applications where AI performance is the primary requirement. Use cases include autonomous vehicles, industrial robotics, and multi-camera video analytics systems.

Balanced Performance (15-30 TOPS):

  • Hailo-8 (26 TOPS, 2.5-3W)
  • Ambarella CV5 (20+ TOPS, 2.5-5W)
  • Qualcomm RB5 (15 TOPS, 5-15W)

Balanced solutions optimize the performance-per-watt ratio. These chips are suitable for applications where both performance and power consumption are constrained, such as battery-powered robots and smart cameras.
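
The performance-per-watt ratio for this tier can be computed directly from the quoted figures. The sketch below uses the upper end of each power range as a conservative assumption and treats "20+" as 20. Because vendors quote peak TOPS at different precisions, the result is a rough ranking rather than a benchmark.

```python
# (quoted TOPS, upper end of quoted power range in watts)
chips = {
    "Hailo-8": (26, 3.0),
    "Ambarella CV5": (20, 5.0),
    "Qualcomm RB5": (15, 15.0),
}


def tops_per_watt(tops: float, watts: float) -> float:
    """Peak quoted throughput divided by power draw."""
    return tops / watts


ranked = sorted(chips.items(), key=lambda kv: tops_per_watt(*kv[1]), reverse=True)
for name, (tops, watts) in ranked:
    print(f"{name}: {tops_per_watt(tops, watts):.1f} TOPS/W")
# Hailo-8: 8.7 TOPS/W
# Ambarella CV5: 4.0 TOPS/W
# Qualcomm RB5: 1.0 TOPS/W
```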

Low Power (<10 TOPS):

  • Kneron KL730 (7 TOPS, 0.5-2W)
  • Rockchip RK3588 (6 TOPS, 8-15W)
  • Intel Movidius Myriad X (4 TOPS, 5W)
  • Google Edge TPU (4 TOPS, 2W)
  • NXP i.MX 8M Plus (2.3 TOPS, 3-8W)
  • Renesas RZ/V2L (1.0 TOPS, 1.5-3W)

Low-power solutions prioritize energy efficiency over raw performance. IoT devices, battery-powered cameras, and embedded systems with limited thermal budgets typically use these chips.

The selection of appropriate hardware depends on:

  1. Required inference throughput (frames per second, inferences per second)
  2. Power budget (battery life requirements, thermal constraints)
  3. Latency requirements (real-time vs. near-real-time processing)
  4. Model complexity (number of parameters, operations per inference)
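
The first two criteria can be applied as a simple filter over the quoted specifications. The sketch below is illustrative: the `Chip` class, `CATALOG`, and `shortlist` are hypothetical names, the figures are the quoted values from this article with the upper end of each power range, and latency and model complexity still need to be checked by measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    tops: float        # quoted peak TOPS
    max_watts: float   # upper end of the quoted power range


CATALOG = [
    Chip("NVIDIA Jetson AGX Orin", 275, 60),
    Chip("SiMa.ai MLSoC", 50, 5),
    Chip("Hailo-8", 26, 3),
    Chip("Qualcomm RB5", 15, 15),
    Chip("Kneron KL730", 7, 2),
]


def shortlist(catalog: list[Chip], min_tops: float, power_budget_watts: float) -> list[str]:
    """Keep chips that meet the throughput floor and fit within the power budget."""
    return [
        chip.name
        for chip in catalog
        if chip.tops >= min_tops and chip.max_watts <= power_budget_watts
    ]


print(shortlist(CATALOG, min_tops=20, power_budget_watts=10))
# ['SiMa.ai MLSoC', 'Hailo-8']
```

Quoted peak figures are rarely sustained on real models, so a shortlist like this is a starting point for benchmarking rather than a final decision.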

Software ecosystem

Software support has a significant impact on the practical performance and development time for edge AI deployments.

NVIDIA Jetson supports the full CUDA ecosystem. Models developed for NVIDIA data center GPUs can be deployed with minimal modification. This compatibility reduces development time for teams already using NVIDIA hardware.

Google Edge TPU requires TensorFlow Lite models with int8 quantization. While this limitation ensures optimal performance on the TPU, it requires model conversion and validation steps. Organizations not using TensorFlow may face additional development work.
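
To show what int8 quantization does to a model's values, the sketch below maps floats to 8-bit integers with a single scale factor and then maps them back. It is a simplified symmetric scheme written in plain Python for illustration, not the TensorFlow Lite converter; production toolchains typically add calibration data, zero points, and per-channel scales.

```python
def quantize_symmetric_int8(values: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]


# Hypothetical weights.
weights = [-1.0, -0.25, 0.0, 0.1, 1.0]
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)
worst_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q[0], q[2], q[-1])            # -127 0 127
print(worst_error <= scale / 2 + 1e-12)  # True
```

The rounding error per value is bounded by half the scale factor, which is why the validation step after conversion focuses on whether accuracy survives that loss of precision.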

Intel Movidius integrates with the OpenVINO toolkit, which supports multiple model frameworks. The toolkit’s optimization capabilities can significantly improve inference performance, but require learning Intel-specific tools.

AMD Xilinx Kria demands FPGA development expertise for custom implementations. While pre-built vision AI stacks reduce this requirement, organizations that seek custom processing pipelines require specialized skills.

Qualcomm, Hailo, and other vendors provide their own SDKs and model compilers. Development teams should evaluate these tools during the selection process to understand the required effort for model deployment and optimization.

Form factor options

Edge AI chips are available in multiple form factors to address different integration requirements:

System-on-Module (SoM):

  • NVIDIA Jetson AGX Orin
  • AMD Xilinx Kria K26
  • Qualcomm RB5

SoM provides a complete computing module that can be integrated into custom carrier boards. This approach reduces hardware design complexity while enabling customization of I/O interfaces.

M.2 and PCIe Cards:

  • Hailo-8
  • Google Coral
  • Intel Movidius (via M.2 adapter)

M.2 and PCIe form factors enable adding AI acceleration to existing systems. This approach is suitable for applications that upgrade existing hardware platforms with AI capabilities.

USB accelerators:

  • Google Coral USB Accelerator
  • Intel Neural Compute Stick 2

USB accelerators provide the simplest integration path. These devices are suitable for prototyping, development, and applications where the host system has available USB ports and sufficient bandwidth.

Integrated SoC:

  • Rockchip RK3588
  • NXP i.MX 8M Plus
  • Ambarella CV5
  • Kneron KL730
  • Renesas RZ/V2L

Integrated SoCs combine CPU, GPU, and NPU in a single chip. This integration reduces board complexity and cost for products designed around the specific SoC.


Application-specific recommendations

Robotics and autonomous systems: NVIDIA Jetson AGX Orin or Qualcomm RB5 provide the performance required for real-time navigation, object detection, and path planning. The choice depends on whether 5G connectivity is a requirement.

Industrial IoT and factory automation: NXP i.MX 8M Plus or AMD Xilinx Kria K26 address the security and real-time processing requirements common in industrial applications. The Kria platform suits applications requiring custom sensor interfaces or deterministic latency.

Smart cameras and video analytics: Hailo-8 or Axelera Metis deliver the performance-per-watt ratio required for always-on video processing. Hailo-8 suits single or few-camera deployments, while Axelera Metis targets multi-camera systems.

Battery-powered IoT devices: Google Edge TPU provides the lowest power consumption for applications where battery life is the primary constraint. The 2W power consumption enables extended operation from small batteries.

Drones and AR devices: Intel Movidius Myriad X or SiMa.ai MLSoC balance performance with power consumption for airborne and wearable devices. The weight and thermal constraints in these applications favor efficient solutions.

Automotive applications: Ambarella CV5 or Qualcomm platforms offer the necessary automotive-grade certifications and performance for ADAS and autonomous driving applications.

Development and prototyping: Intel Neural Compute Stick 2 or Google Coral USB Accelerator enable quick evaluation of edge AI capabilities without hardware modifications. These USB devices are suitable for proof-of-concept projects and algorithm development.

FAQs

Edge AI chips and other AI accelerators are designed to run AI models, including deep neural networks, directly on local devices. Processing data locally reduces dependence on the cloud or a data center, which matters for real-time data processing, analytics, and decision-making in edge AI applications.
By keeping sensitive data on local devices, organizations can improve security while running AI at the edge for use cases such as object detection, anomaly detection, predictive maintenance, face recognition, and smart city applications. Edge AI hardware also offers low power consumption and reduced operational costs, which are important factors for embedded devices used in robotics, industrial IoT, and other edge environments.

Edge AI runs machine learning models, generative AI, and other AI applications directly on specialized hardware such as AI accelerators, sometimes on a single chip (e.g., a single Metis chip). Unlike cloud AI, which depends on remote servers, edge AI performs inference locally, on the device where the data is produced.
This architecture reduces latency and supports time-critical uses such as real-time monitoring and managing safety hazards in business operations. Running AI on edge devices also reduces operational expenses and bandwidth usage, especially in environments where continuous connectivity to a remote data center is not guaranteed.

AI accelerators enable a wide range of applications that rely on AI inference running outside the cloud. These include object detection in smart cameras, anomaly detection in industrial systems, predictive maintenance for equipment, and natural language interfaces on local devices.
Industries such as robotics, autonomous systems, industrial automation, and smart cities benefit from bringing AI closer to sensors for real-time decision-making. With low-power designs and support for different AI workloads, including large language models and vision-based workloads, edge systems become more cost-effective and help organizations reduce operational expenses. Whether they use central processing units with integrated NPUs or AI-specific architectures with minimal reliance on external memory, edge solutions allow AI to run efficiently on a single chip.

Cite this research

Cem Dilmegani (2026) - "Top 15 Edge AI Chip Makers with Use Cases". Published online at AIMultiple.com. Retrieved June 4, 2026, from: https://aimultiple.com/edge-ai-chips [Online Resource]

Cem Dilmegani
Principal Analyst
Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (per Similarweb), including 55% of the Fortune 500, every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes, and the Washington Post; global firms like Deloitte and HPE; NGOs like the World Economic Forum; and supranational organizations like the European Commission.

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised enterprises on their technology decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led technology strategy and procurement of a telco while reporting to the CEO. He also led commercial growth of the deep tech company Hypatos, which reached 7-digit annual recurring revenue and a 9-digit valuation from 0 within 2 years. Cem's work at Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.