Executive Summary
AI can achieve remarkable performance under ideal conditions that are difficult to replicate in many real-world settings. The AI that often captures headlines typically runs under these conditions, in well-maintained data centers with an abundant supply of compute and power. Currently, most top-performing AI models designed for vision and language applications rely on these abundant resources. However, these resources are highly constrained on many systems in the real world, be it drones, satellites, or ground vehicles.
This is the challenge of ‘onboard AI’: running AI directly on a device or system without additional backend compute support. There are times when running models onboard is optimal or necessary, and doing so can bring a range of advantages. However, onboard computing constraints can introduce significant limitations or entirely prevent the use of certain models on some systems. This creates a gap between the highest-performing AI systems and those deployed in the real world, which has implications for the performance and robustness of many sought-after applications.
Onboard AI systems are constrained for several reasons, but the primary factor is processing speed. The highest-performing models execute extremely large numbers of computations for each output they produce. These calculations require high-performance processors, often many of them. However, because of their size and power demands, such processors cannot be used in many systems. Practically, this means chips designed for onboard use perform orders of magnitude fewer calculations per second and cannot run AI models quickly enough for many applications.
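The effect of that gap can be sketched with simple arithmetic: the time to produce one output is at least the number of operations the model needs divided by the operations per second the chip can sustain. The Python sketch below is illustrative only. The function name `inference_latency_s` and every figure in it are hypothetical assumptions chosen to show a three-orders-of-magnitude gap, not measurements of any real model or chip.

```python
def inference_latency_s(ops_per_output: float,
                        peak_ops_per_s: float,
                        utilization: float = 1.0) -> float:
    """Lower-bound time, in seconds, to compute one model output.

    utilization is the fraction of peak throughput actually achieved (0 to 1].
    """
    if ops_per_output < 0 or peak_ops_per_s <= 0 or not 0 < utilization <= 1:
        raise ValueError("inputs must be positive and utilization in (0, 1]")
    return ops_per_output / (peak_ops_per_s * utilization)


# Hypothetical figures, chosen only to illustrate the arithmetic.
OPS_PER_OUTPUT = 1e12        # operations the model performs per output
DATACENTER_OPS_PER_S = 1e15  # assumed data center processor throughput
ONBOARD_OPS_PER_S = 1e12     # assumed onboard chip, 1,000 times slower

datacenter_latency = inference_latency_s(OPS_PER_OUTPUT, DATACENTER_OPS_PER_S)
onboard_latency = inference_latency_s(OPS_PER_OUTPUT, ONBOARD_OPS_PER_S)

print(f"data center: {datacenter_latency * 1e3:.1f} ms per output")
print(f"onboard:     {onboard_latency:.1f} s per output")
```

Under these assumed numbers, the same model that answers in about a millisecond in a data center needs about a second onboard, which may be too slow for an application that must react in real time.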
High-performing AI models also need substantial working memory. Data center chips have the memory to hold large models and store the results of ongoing calculations, and they support fast communication both on the chip and between chips so that calculations can be split across several devices. However, many onboard devices are neither designed for large-scale computations nor equipped with large working memories.
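A rough way to see the memory constraint is to count the bytes needed just to hold a model's parameters and compare that with the memory a device has. The Python sketch below does only that. The helper names and all figures are hypothetical assumptions for illustration, and the estimate is a lower bound because it ignores the memory needed for the results of ongoing calculations.

```python
def weights_memory_bytes(num_parameters: int, bytes_per_parameter: int) -> int:
    """Memory needed to hold the model's parameters alone (a lower bound)."""
    if num_parameters < 0 or bytes_per_parameter <= 0:
        raise ValueError("parameter count must be >= 0 and bytes per parameter > 0")
    return num_parameters * bytes_per_parameter


def fits_in_memory(required_bytes: int, available_bytes: int) -> bool:
    """True if the requirement does not exceed the device's working memory."""
    return required_bytes <= available_bytes


# Hypothetical figures, chosen only to illustrate the arithmetic.
NUM_PARAMETERS = 7_000_000_000    # assumed model size
BYTES_PER_PARAMETER = 2           # assumed 16-bit storage per parameter
ONBOARD_MEMORY_BYTES = 8 * 10**9  # assumed device with 8 GB of working memory

required = weights_memory_bytes(NUM_PARAMETERS, BYTES_PER_PARAMETER)
onboard_fits = fits_in_memory(required, ONBOARD_MEMORY_BYTES)

print(f"parameters alone need {required / 10**9:.0f} GB")
print(f"fits on the assumed onboard device: {onboard_fits}")
```

With these assumed values, the parameters alone exceed the device's memory, so the model could not be loaded onboard without being made smaller or split across devices.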
These constraints are influenced by the size, weight, and power limitations of many systems. Most state-of-the-art chips use far more power than is available on small-footprint devices. Powerful chips require larger, heavier batteries that are infeasible for lightweight systems such as small drones, and the chips’ packaging can increase their weight by a factor of ten.
Finally, real-world applications might have to sacrifice computing capabilities for a host of other reasons, such as radiation hardness and temperature sensitivity. Moreover, chips that operate for many years age and become outdated, which can leave them ill-equipped to run contemporary models.
Stakeholders across government and industry should understand that these constraints cannot always be resolved, given current technologies and platform limitations. Engineers can mitigate some of them, for example by using different algorithms that are less resource-intensive but still perform acceptably. However, in many cases, onboard AI will be inferior to the state-of-the-art models that grab headlines or achieve top performance on benchmarks. In some high-risk contexts, the use of onboard AI may be inappropriate or require additional safeguards.