Skip to content

Epsilon is a library with functions for machine learning and statistics written in plain C. It is intended to run on microcontrollers.

License

Notifications You must be signed in to change notification settings

breuderink/epsilon

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Epsilon

Epsilon is a library with small functions for machine learning and statistics written in plain C. The functions are decoupled and well tested.

tests and examples

Motivation

Most machine learning focuses on training large models on powerful hardware. After training, researchers freeze the models and apply them to new data. These models are too big to run on microcontrollers. One can compress these models to make them fit. The compressed model can make predictions for new data. But the model itself remains static, even if the environment changes.

An alternative approach is to optimise the model on the microcontroller itself. In this case, the model can adapt to new data. This requires particular memory-efficient algorithms. Further, the optimization process should be reliable. Epsilon provides methods to train and apply machine learning methods on microcontrollers.

These algorithms should run on microcontrollers, such as the Raspberry Pi Pico, the BBC micro:bit, the Arduino Nano 33 BLE Sense or even the ATtiny85 8-bit microcontroller.

To allow machine learning to run on microcontrollers, the implementations:

  • do not use dynamic memory allocation if possible,
  • favour online over batch operation,
  • work with fixed-point math when realistic, and
  • are easy to tune.

Building

Epsilon uses CMake for building. Create a build directory, and configure the project. In the repository root, configure and build the project. Then run the unit tests and the examples with CTest:

$ mkdir build && cd build
$ cmake ..
$ cmake --build .
$ ctest

Algorithms

Pseudo-random number generation

  • Xorshift is a fast and simple pseudo-random number generator by George Marsaglia that has good statistical properties. See the xorshift example.

Hashing

Online statistics

Feature extraction

  • Fast Walsh-Hadamard transform (FWHT) implements the Walsh-Hardamard transform in O(n log n) time. The FWHT similar to the fast Fourier transform and the Haar transform. See the FWHT example.
  • Structured random orthogonal features (SORF). An O(d log d) transformation that can be used for a feature map that approximates a specific kernel. Here d is the number of input dimensions. Note that SORF is patented, and that compilation of SORF is disabled by default. Instead one can use a budgeted kernel classification or regression.

Regression

  • Online passive-aggressive (PA) regression solves a regression problem by only updating the model on prediction mistakes. When the target depends non-linearly on the inputs one can use a kernel that projects the input onto a set of support vectors. Kernel methods such as the support vector machine (SVM) work efficiently in high-dimensional feature spaces, but don't easily scale to large datasets. To scale to large datasets one can maintain a budget of support vectors. The example of budgeted kernel passive aggressive (BKPA) regression demonstrates how online, non-linear regression can be performed with a limited memory budget.

Classification

  • Kernel passive-aggressive classification (TODO).

Other solutions for Tiny ML or Edge AI