Between the race to be first and the cost of being late to market, streamlining the engineering workflow has become more important than ever. Inefficiencies from early design through test can significantly delay release dates. Artificial intelligence (AI) shows promise as a tool that can accelerate innovation. Most people are familiar with large language models (LLMs), like ChatGPT, that generate text, but AI extends to many applications beyond LLMs, and the wireless industry is no exception.

In a market searching for growth in new areas while shifting from a growth mindset to a profitability mindset, it is a good time to pause and rethink how we design and test the next generation of wireless technologies, especially with the advent of AI.

Wireless standards are becoming more complex and so are wireless devices. One way that companies are adapting to increasing complexity is by integrating embedded AI into firmware and software. While this is an exciting development for the industry and its engineers, it creates challenges, especially for test.

GUARANTEE TRUSTWORTHINESS

The need to test AI to guarantee its function and trustworthiness is one of the biggest disruptions in the field today. The main change this poses for test is the shift from a classically designed, or programmed, system to one that is trained. Classically designed systems are predictable, which allows for a relatively simple stimulus/response method of test. With rising device complexity, trained AI implementations will prove more efficient for an expanding range of problems. Yet, given the complexities of designing and testing AI solutions, overuse of AI can be counterproductive. Classical designs remain the right choice in many cases, reserving AI for challenges where fixed implementations fall short.

As wireless spectrum is one of the most valuable, finite resources, it should come as no surprise that optimizing spectrum use and expanding into new frequency bands (in a cost-effective manner) are two hot topics. AI is especially good at finding and recognizing patterns. It is also uniquely good at optimizing non-linear, hard-to-model problems. Many, if not most, of the new technologies recently discussed in the 3GPP standards organization fall into one of these two areas. For example:

  • The cognitive radio capabilities that will likely be needed for spectrum sharing in the FR3 bands
  • The imaging-like sensing capabilities needed to implement integrated communications and sensing
  • The beam steering/forming functions in advanced MIMO.
Figure 1: AI Uses in Wireless

At this point, embedding AI into wireless devices becomes a question of when and not if, and a new question arises: how do you embed and test trustworthy AI?

At the core, we need three things to effectively design and test trustworthy, embedded AI:

  1. A sufficient quality and quantity of training and validation data
  2. A reasonably complete method of testing the AI implementation, since traditional stimulus-response test models simply do not work for these devices
  3. The ability to leverage AI to test AI.

Let’s dig into each of these areas.

SUFFICIENT QUALITY DATA

The most important thing needed to successfully embed AI is a sufficient quality and quantity of training and validation data. Without a rich dataset that covers all the primary anticipated uses of the AI, trustworthy systems cannot be developed and tested. This data can come from a variety of synthetic and real-world sources. As it will likely be of significant scope and size, the dataset will also need to be curated, managed, labeled and processed to be ready for building and testing an AI implementation.

Through record and playback hardware and software, we can provide a platform that collects data from both virtual and real-world sources. From there, the question remains: how should that data be stored so it can be broadly consumed?

Under an NSF grant, Emerson Test & Measurement has been collaborating with Northeastern University on a project called the RF Data Factory to create new tools that automatically capture data in a standard format. Emerson has released the RF Data Recording API as open-source code so that all researchers can access it free of charge on GitHub.

The RF Data Recording API requires a single text-based configuration file to configure complex scenarios with multiple RF transmitters and RF receivers over a broad range of parameters. The code runs on Emerson's Ettus USRPs, an AI/ML research hardware and software platform, and enables researchers to capture real-world data. The recorded IQ samples are saved in the Signal Metadata Format (SigMF). SigMF provides comprehensive scenario descriptions through JSON-based metadata files that are human readable and easy to parse, which simplifies the management of dataset libraries and encourages adoption by the wider research community.
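As an illustration, the short Python sketch below writes a minimal SigMF metadata file for a single capture. The field values are invented for this example, and a real recording made with the RF Data Recording API will carry richer metadata, but the core field names come from the published SigMF specification.

    import json

    # Minimal SigMF metadata describing one IQ recording (values illustrative)
    metadata = {
        "global": {
            "core:datatype": "cf32_le",      # complex float32, little-endian
            "core:sample_rate": 30720000.0,  # 30.72 MS/s
            "core:version": "1.0.0",
            "core:description": "Example downlink capture",
        },
        "captures": [
            {"core:sample_start": 0, "core:frequency": 3500000000.0}  # 3.5 GHz
        ],
        "annotations": [
            {"core:sample_start": 0, "core:sample_count": 30720000,
             "core:label": "tx1"}
        ],
    }

    # The IQ samples live in a matching capture.sigmf-data binary file
    with open("capture.sigmf-meta", "w") as f:
        json.dump(metadata, f, indent=2)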

COMPLETE METHOD OF TESTING

Once the problem of data is solved, a reasonably complete method of testing the AI implementation is needed. The traditional stimulus-response models for testing simply do not work for these devices. The automotive industry, which adopted scenario-based testing, is a good example of how complex this testing can be. Given the number of scenarios possible for autonomous vehicles, extensive testing is done in a virtual world using software- or model-in-the-loop methods.

Figure 2: Testing the Design Cycle

The chart above represents the entire development cycle of a vehicle, from concept development and design through production. The colored arrows represent the overall test approaches along a typical design lifecycle. A similar design and test flow could also make sense for wireless AI-based implementations, starting in purely software-based test setups and progressing toward field trials and deployments. The process starts with defining a finite number of functional scenarios that are meant to demonstrate and measure how the AI behaves in achieving specific, measurable performance metrics or outcomes.

Continuing with the autonomous vehicle example, consider a car that needs to safely merge into traffic in downtown Manhattan during a blizzard. The first step is to break this functional scenario down into a larger set of logical scenarios. Each logical scenario introduces the configurable parameters of the system, along with the ranges of values that match the functional scenario from which it is derived. Each logical scenario is then broken down into concrete scenarios, each of which fixes a very specific set of parameter values. As a very long list of discrete parameters, a concrete scenario is easy for a computer to apply to the virtual or physical system, but at this point it would be difficult for a human to understand the behavior of the configured end system.
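As a rough sketch of this breakdown (the class, parameters and values below are hypothetical, not taken from any particular test framework), a logical scenario can be modeled as a functional scenario plus a range of values for every configurable parameter, with concrete scenarios generated by fixing one value per parameter:

    import itertools
    from dataclasses import dataclass

    @dataclass
    class LogicalScenario:
        """A functional scenario plus ranges for each configurable parameter."""
        functional: str
        parameter_ranges: dict  # parameter name -> list of allowed values

        def concrete_scenarios(self, limit=None):
            """Fix one value per parameter to produce concrete scenarios
            that a computer can apply to a virtual or physical system."""
            names = list(self.parameter_ranges)
            combos = itertools.product(*self.parameter_ranges.values())
            for i, values in enumerate(combos):
                if limit is not None and i >= limit:
                    return
                yield dict(zip(names, values))

    merge = LogicalScenario(
        functional="merge into dense urban traffic during a blizzard",
        parameter_ranges={
            "visibility_m": [20, 50, 100],
            "road_friction": [0.15, 0.3],
            "gap_between_cars_m": [5, 10, 20],
            "ego_speed_kmh": [10, 20, 30],
        },
    )
    for scenario in merge.concrete_scenarios(limit=3):
        print(scenario)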

In the same way a vehicle scenario is described by a set of parameters, the wireless industry believes it can break wireless parameters down into four fundamental categories:

  1. Parameters needed at the application level
  2. Information about the wireless environment or channel
  3. The configuration of the terminals that are communicating wirelessly (the UE and base station, for example)
  4. Information about competing traffic.

Combining these two views of the workflow and scenarios, a system is built that takes the concrete scenario’s application, environment, terminal and traffic parameter spaces and applies them, through a test execution engine, to a full emulation of a real-world implementation.
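Continuing the sketch, a concrete wireless scenario can be grouped into the four categories and handed to a test execution engine that configures each emulator in the chain. Everything below, including the stub emulator and the parameter names, is hypothetical and only illustrates the flow:

    class EmulatorStub:
        """Stand-in for a base station, channel or UE emulator."""
        def __init__(self, name):
            self.name, self.settings = name, {}

        def configure(self, *parameter_blocks):
            for block in parameter_blocks:
                self.settings.update(block)

        def run(self):
            # A real emulator would execute the scenario and return measurements
            return {"emulator": self.name, "applied": self.settings}

    # One concrete scenario, expressed in the four fundamental categories
    concrete_scenario = {
        "application": {"service": "eMBB", "target_throughput_mbps": 400},
        "environment": {"channel_model": "CDL-C", "doppler_hz": 55.5, "snr_db": 12},
        "terminal": {"bs_antennas": 64, "ue_antennas": 4, "numerology": 1},
        "traffic": {"interferers": 3, "offered_load_percent": 70},
    }

    def execute(scenario, bs, channel, ue):
        """Apply each parameter block to the matching emulator, then run."""
        bs.configure(scenario["terminal"], scenario["application"])
        channel.configure(scenario["environment"], scenario["traffic"])
        ue.configure(scenario["terminal"])
        return channel.run()

    result = execute(concrete_scenario, EmulatorStub("gNB"),
                     EmulatorStub("channel"), EmulatorStub("UE"))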

Figure 3: Functional, Logical and Concrete Scenarios


Imagine trying to test an AI-based implementation of a base station receiver, optimizing and tuning its performance. The portion of the system inside the yellow dashed line in the image above can be executed virtually, using model-in-the-loop and system-in-the-loop methods in the early stages of the design. As a model gets close to field trials, the base station or UE emulators can be replaced with real devices, the mathematical channel emulator can be replaced with a physical channel emulator and so on.

LEVERAGING AI TO TEST AI

As testing advances, it moves from a completely virtual model to a setup where the algorithm is tested with emulated channels, traffic and applications before going to field trials. Operators have expressed a strong desire for testing methods that can show the performance or business gains of an AI-based implementation compared to their existing, classical implementation. At the moment, no specific tools provide that insight, so Emerson Test & Measurement is building early prototypes to try out the new testing methodologies the industry needs.

In the early prototype, the AI model chosen is an AI-trained receiver based on the DeepRx implementation. The model was trained with synthetic and real-world data, captured and managed using the RF Data Factory open-source tools mentioned earlier. To make the system both affordable and accessible, open-source implementations of the base station and UE IP were used as containers into which a researcher can add their own IP under test.

Below are some early results comparing the bit error rate versus the signal-to-noise ratio per bit (Eb/N0) for a set of traditional and AI-based implementations.

Figure 4: RF Data Factory Testing Example
Figure 5: Bit Error Rate - Traditional Rx vs. DeepRx
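To make concrete what a curve like the one in Figure 5 measures, the sketch below estimates the bit error rate of a generic hard-decision QPSK receiver over an AWGN channel at each Eb/N0 point. This is a textbook baseline for illustration only, not the traditional or DeepRx receivers actually compared in the prototype:

    import numpy as np

    def qpsk_ber_curve(ebn0_dbs, n_bits=200_000, seed=0):
        """Empirical BER vs. Eb/N0 for Gray-mapped QPSK over AWGN."""
        rng = np.random.default_rng(seed)
        bers = []
        for ebn0_db in ebn0_dbs:
            bits = rng.integers(0, 2, n_bits)
            # Map bit pairs to unit-energy QPSK symbols (0 -> +1, 1 -> -1)
            symbols = ((1 - 2 * bits[0::2])
                       + 1j * (1 - 2 * bits[1::2])) / np.sqrt(2)
            # With Es = 1 and 2 bits/symbol, N0 = 1 / (2 * Eb/N0), so each
            # real/imaginary noise dimension has std dev sqrt(N0 / 2)
            sigma = np.sqrt(1.0 / (4.0 * 10 ** (ebn0_db / 10)))
            rx = symbols + sigma * (rng.standard_normal(symbols.size)
                                    + 1j * rng.standard_normal(symbols.size))
            # Hard-decision detection back to bits
            detected = np.empty(n_bits, dtype=int)
            detected[0::2] = rx.real < 0
            detected[1::2] = rx.imag < 0
            bers.append(float(np.mean(detected != bits)))
        return bers

    print(qpsk_ber_curve([0, 2, 4, 6, 8]))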


Beyond testing AI-powered devices, Emerson Test & Measurement predicts there will be an increasing reliance on using AI to test AI. AI has the potential to hyper-automate the exploration and validation of the limits of an AI-based implementation through automatic scenario generation and through techniques designed to find the areas where the AI's output is most sensitive to small changes in its inputs. The question becomes: of the infinite number of scenarios that could be tested, which are the most important? And which can be eliminated?

Figure 6: Smart Test Example


Using our new neural receiver prototype, Emerson Test & Measurement is researching a method that, starting from a set of seed scenarios, uses the gradient function of the underlying neural net implementation to create sets of derivative concrete scenarios. These scenarios should maximize the test coverage of each neuron within the neural net and find areas of the model where outputs are inconsistent for small changes in inputs. Optimizing neuron coverage makes intuitive sense: if, in an attempt to reduce the testing burden, a portion of the implementation is left completely untested, that represents a coverage gap. Investigating derivatives where the model is most sensitive helps find areas of potential instability.

Reconsider the automotive example: perhaps the perception system is most sensitive to white cars at an intersection in a blizzard. Such a sensitivity is unintuitive, but experience with trained models shows that they always harbor sensitivities that defy intuition. Our hypothesis is that using the mathematical gradient function of the DNN can help us find them efficiently.
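As a minimal sketch of this idea, assume a differentiable receiver model in PyTorch. The code below illustrates gradient-guided scenario derivation in general, not Emerson's actual prototype: each seed input is stepped along the sign of its input gradient, pushing toward regions where small input changes move the output the most.

    import torch

    def derive_scenarios(model, seed_iq, epsilon=0.01, steps=4):
        """From one seed scenario, create derivative scenarios by following
        the input gradient of the trained model (FGSM-style steps)."""
        scenarios = []
        x = seed_iq.detach().clone().requires_grad_(True)
        for _ in range(steps):
            model(x).sum().backward()  # fills x.grad with d(output)/d(input)
            with torch.no_grad():
                # Step where the output is most sensitive to the input
                x = x + epsilon * x.grad.sign()
            x.requires_grad_(True)
            scenarios.append(x.detach().clone())
        return scenarios

    # Hypothetical usage: derived = derive_scenarios(neural_rx, seed_iq_batch)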

CONCLUSION

Performance measurements of the products a company invents are key to unlocking greater business success. Some view test, in terms of both cost and time, as a necessary evil. With the adoption of new AI tools, all industries will need to adjust their views on test.

When done strategically and in a well-integrated way, measurements, and the insight they unlock, can speed time-to-market, improve customer satisfaction, reduce operating costs and position companies as market leaders. And in the case of AI, test is critical to ensuring proper functionality and trustworthiness.

To learn more about how Emerson Test & Measurement is researching new methods for test using AI, read the white paper about the neural receiver prototype, which is being demonstrated at the Brooklyn 6G Summit in October.

ABOUT THE AUTHOR

Sarah LaSelva is the chief product marketing manager for NI's RF portfolio. She has over 15 years of experience in test and measurement, concentrating on wireless communications and RF technologies. Throughout her career, she has spent time in marketing, product management, services, test engineering and applications engineering. Sarah has spent most of her career working with software defined radios (SDRs), where she gained a deep knowledge of SDR hardware, software and wireless communications. In addition to her time at NI, she recently spent four years at Keysight Technologies, where she led the marketing strategy for 6G. She has been an active participant in industry groups like the NextG Alliance, where she advocates for improving sustainability and making wireless technology green. Sarah's academic background is in microwave and mmWave technology. She has a B.S. in electrical engineering from Texas Tech University.