Unlock Edge AI on any SoC with ultra-tiny Neuton ML models

If you’re tuned into the latest developments, you might already be aware that Nordic Semiconductor has acquired Neuton.AI, an edge AI company dedicated to making machine learning models more accessible. Neuton creates models 10 times smaller and faster than competing frameworks, allowing AI processing on even the edgiest of edge devices. In this blog post, we will walk through what this means for you as a developer and how using Neuton models can improve your development process and end application.

Why Neuton

As a developer, the two biggest hindrances to using edge AI in your product are:

  1. ML models are too large for the memory of your chosen microcontroller.

  2. Creating custom ML models is an inherently manual process that requires a high level of data science knowledge to do well.

Well, not anymore.

Neuton is a framework for automatically generating ML models at a fraction of the size produced by traditional frameworks like TensorFlow Lite. For developers, this means that all you need to train a highly optimized, fast, and accurate ML model is a dataset. Neuton models can run on any Nordic SoC, like our flagship nRF54L15, and are so efficient that they also fit well within the limits of the most space-constrained ones, like the nRF52805, taking up only a few kilobytes of non-volatile memory (NVM). This makes it possible to add ML capabilities to applications where it was previously deemed impossible. For example, you can now do AI processing in every node of an extensive sensor network, where sensor size and cost are key and memory comes at a premium.

What makes Neuton different

Other frameworks for edge AI are plentiful and have existed for a long time. The key difference that sets Neuton apart from LiteRT (formerly TensorFlow Lite for Microcontrollers) and similar frameworks is that they still rely on the developer to manually define the neural network structure, neurons, and network depth, and then compress and optimize the model after the fact to make it fit on the desired target device. This approach leads to models that are less efficient in code size, execution speed, and power consumption.

Neuton, on the other hand, handles all of this automatically. Instead of statically defining the parameters of your network from the start, Neuton grows the network automatically, and for every new neuron, it checks whether the addition improves the model's performance. Neurons that do not add value are immediately removed to conserve resources. This grow-and-prune approach, sketched in the example after the list below, brings multiple benefits to the developer:

  1. No manual selection of neural network structure, parameters, or architecture

  2. No resource-intensive automatic neural architecture search (NAS)

  3. The smallest code size possible, with no need for compression or optimization

  4. Faster execution, which means lower power consumption
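
To make the grow-and-prune idea concrete, here is a minimal, self-contained sketch. It is not Neuton's actual training code (model building runs on the Neuton platform, not on your device); the evaluate() function merely simulates a validation score with diminishing returns, and the improvement threshold is an illustrative assumption.

    /* Minimal, self-contained sketch of a grow-and-prune loop -- not
     * Neuton's actual training code, which runs on the Neuton platform
     * rather than on your device. Here evaluate() simulates a validation
     * score that saturates as neurons are added. */
    #include <math.h>
    #include <stdio.h>

    #define MAX_NEURONS     256
    #define MIN_IMPROVEMENT 1e-3   /* assumed threshold for "adds value" */

    /* Toy validation metric: improves with each neuron,
     * with diminishing returns. */
    static double evaluate(int neurons)
    {
        return 1.0 - exp(-neurons / 8.0);
    }

    int main(void)
    {
        int    neurons = 0;
        double best    = evaluate(neurons);

        while (neurons < MAX_NEURONS) {
            double candidate = evaluate(neurons + 1); /* try one more neuron */

            if (candidate > best + MIN_IMPROVEMENT) {
                neurons++;          /* the neuron adds value: keep it */
                best = candidate;
            } else {
                break;              /* no measurable gain: stop growing */
            }
        }

        printf("final network: %d neurons, score %.4f\n", neurons, best);
        return 0;
    }

Because growth stops as soon as a neuron fails to earn its keep, the resulting network is only as large as the problem demands, which is where the small code size and fast execution come from.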

Neuton models are downloaded from the platform as pure C code, with no external dependencies or special runtime requirements. They are ready to be integrated into any application running on an Arm Cortex-M series processor, like the application cores of the nRF52, nRF53, nRF54L, and nRF54H Series SoCs or the nRF91 Series SiPs.

What this means for developers

Creating ML models using the Neuton framework is already possible through lab.neuton.ai. If you’re really eager, you can try it out today, but be aware that you will need to integrate the result into an nRF Connect SDK application yourself, as a sample is not yet available. If you’d prefer the more streamlined development process and developer experience that Nordic is known for, we’re planning significant enhancements to the integration with our developer ecosystem. So if you’re not keen on doing the integration work yourself, you can wait until we have published samples for the nRF Connect SDK and the tooling needed to make Neuton work seamlessly with our solutions. Subscribe to upcoming product notifications, and we’ll notify you when we launch updates to the edge AI workflow.

How it works

Even though the integration with the nRF Connect SDK through a dedicated sample firmware is still in the works, generating models using the Neuton framework is already fully automated and as easy as it will ever be.

Generating a complete, dependency-free model for your application requires only three steps:

  1. Gather data and upload the dataset; the platform accepts labeled CSV files

  2. Select which variable in your dataset is the target variable, and choose the evaluation metric

    1. The platform then grows the neural network “automagically” in the background and notifies you once your model is ready.

  3. Download your Neuton model as a complete C library and integrate it into your application using an API of only three simple function calls (see the sketch after this list):

    1. neuton_nn_setup - Sets up the internal components of Neuton; must be called first, and only once

    2. neuton_nn_feed_inputs - Feeds and prepares live input features for model inference

    3. neuton_nn_run_inference - Runs the prepared input features through the Neuton model to calculate an output
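
Put together, a minimal integration could look like the sketch below. The three neuton_nn_* function names are the API described above, but the header name, parameter types, and return value shown here are assumptions on our part; consult the headers that ship with your generated model library for the exact signatures.

    /* Hypothetical integration sketch. The neuton_nn_* names are the API
     * described above; the header name, parameter types, and return value
     * are assumptions -- check the headers shipped with your generated
     * model library for the exact signatures. */
    #include "neuton.h"            /* assumed header name */

    #define NUM_FEATURES 3         /* e.g. one 3-axis accelerometer sample */

    /* Stub standing in for your board-specific sensor driver. */
    static void read_accelerometer(float *xyz)
    {
        xyz[0] = 0.02f; xyz[1] = -0.98f; xyz[2] = 0.11f;
    }

    int main(void)
    {
        /* 1. Set up Neuton's internal components: first, and only once. */
        neuton_nn_setup();

        for (;;) {
            float xyz[NUM_FEATURES];
            read_accelerometer(xyz);

            /* 2. Feed and prepare the live input features. */
            neuton_nn_feed_inputs(xyz, NUM_FEATURES);

            /* 3. Run inference on the prepared inputs; the output is
             *    assumed here to be a predicted class index. */
            int predicted_class = neuton_nn_run_inference();
            (void)predicted_class;   /* act on the prediction here */
        }
    }

Since the generated library is pure C with no external dependencies, the same pattern applies whether your application runs bare-metal or inside an nRF Connect SDK project.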

Use cases this enables

Neuton unlocks use cases that were previously reserved for our flagship hardware, allowing all our SoCs to be utilized for edge AI. Examples include:

  • Predictive maintenance and building automation systems

  • Smart sensor networks with local data analysis on each node

  • Movement and gesture recognition for remote controls and wearable devices

  • Health and activity monitoring for smart health wearables

  • And many, many more…

Benchmarks

The table below presents benchmarking results for a LiteRT model (formerly TensorFlow Lite for Microcontrollers) and a Neuton model running on the same nRF52840 SoC. The comparison uses the well-known "magic wand" motion recognition use case: both models were trained on the same dataset and validated on the same holdout set.

Metric                                                 LiteRT    Neuton    Neuton advantage
-------------------------------------------------------------------------------------------
NVM footprint (KB)
  TinyML framework (model + inference engine + DSP)     79.96      5.42    14 times smaller model,
                                                                           43% reduction of total NVM use
  Device drivers and business logic                     93.47     93.47
RAM footprint (KB)
  TinyML framework (model + inference engine + DSP)     18.2       1.72    10 times smaller model,
                                                                           26% reduction of total RAM use
  Device drivers and business logic                     45.69     45.69
Inference time (µs)                                    55,262     1,640    33 times faster
Holdout validation accuracy                              0.93      0.94    0.7% higher accuracy

The full write-up of the comparison can be found here (external link).

What’s next

Over the coming months, we will work on integrating Neuton into our development ecosystem, adding the tools, examples, and support that make developers’ lives easier and add value to their applications. If you’re eager to try building and integrating a Neuton model yourself, you can test it out today on the legacy online platform at lab.neuton.ai.

Webinar

For a full hands-on tutorial on how to collect data and use the platform, sign up for our webinar on June 25th at webinars.nordicsemi.com.