This project is in its infancy. We are barely able to walk!
As a proof of concept, quantized MNIST is happily running on Mbed-enabled MCUs. Upcoming work will include: reference counting, memory abstraction, a tensorflow-to-mbed exporter, and more ops.
These should be enough for us to run most DL models out there.
We started with the idea of putting AI everywhere and helping people build cooler things.
Inputs and collaborations are welcome.
Neil Tan, Kazami Hsieh, Dboy Liao, Michael Bartling
Hi, one of the biggest issues we have consistently had with TF is serialization. How do you export from training and then take that and deploy it on hardware?
Could you share your training scripts? I would love to look at the piece that does quantization/SavedModel/FreezeGraph.
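(Not the authors' actual script, but a minimal TF 1.x sketch of the freeze step being asked about; the checkpoint and node names below are placeholders, and the project's real exporter and quantization pass may differ.)

    import tensorflow as tf
    from tensorflow.python.framework import graph_util

    # Assumes a trained MNIST checkpoint and an output node named "y_pred";
    # both names are placeholders, not the project's real ones.
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph("mnist_model.meta")
        saver.restore(sess, "mnist_model")

        # Fold variables into constants so the graph is self-contained.
        frozen = graph_util.convert_variables_to_constants(
            sess, sess.graph_def, ["y_pred"])

        # Write the frozen GraphDef; a separate quantization pass (e.g. the
        # graph_transforms quantize_weights transform) would run on this file.
        tf.train.write_graph(frozen, ".", "mnist_frozen.pb", as_text=False)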
The Embedded Learning Library is good, but this project aims to integrate with TensorFlow. If you need to run machine learning on an MCU, either could be a good choice.
Holy cow, an F767. I thought the 'I' postfixes were for somewhere around 512K-1M, though? Anyways, I guess you'd want a nice chip for this sort of thing, but could this also work on a cheaper F303CC?
Edit: Oh, 256 KB RAM. Never mind, although 2Mb RAM modules are pretty cheap and have as little as 32 pins...
You are right that the F767ZI has 512 KB of RAM. However, the MLP code has been tested on boards with 256 KB of RAM. Changes are on the way to pull that number significantly lower.
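For a rough sense of why 256 KB can be enough, here is my own back-of-envelope, assuming a small 784-128-64-10 MLP with 8-bit quantized weights (the actual topology isn't stated here):

    # Hypothetical quantized MNIST MLP; the 784-128-64-10 topology is an
    # assumption, not the project's actual model.
    layers = [784, 128, 64, 10]
    weights = sum(a * b for a, b in zip(layers, layers[1:]))  # 109,184 params
    biases = sum(layers[1:])                                  # 202 params
    kb_8bit = (weights + biases) / 1024                       # 1 byte per param
    print(f"~{kb_8bit:.0f} KB of weights")                    # ~107 KB, under 256 KB

Activations and scratch buffers add to that, which is presumably what the memory-abstraction work mentioned above is meant to address.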
Neat! Have you looked at the upcoming H7 lines yet? I think they were running into production issues or problems with the smaller 40nm process or something, but apparently they can clock up to 400MHz and have a whole MB.
Am I right in understanding that this implementation is aimed at running the network (vs. training one)? It sounds very exciting to be able to train a system and then apply it to a hardware-based problem without needing to do so much porting of the NN code.
Haven't got an Mbed to test with, but if it does work, it's a game changer! There are many applications that could benefit (e.g., medical imaging in remote areas) where speed is not essential.
Thanks man!
You summed it up nicely! This project is about making the trade-off between speed and cost. There is greatness at either end of the spectrum.
And when I took the operating systems lab class, we had block time access to a PDP-11 with half that amount of total memory and a keypad for entering the bootloader -- a big step up from toggle switches. One of the reasons old grey-beards like me can find a happy home doing embedded software is that we don't get that deer-in-headlights reaction to memory counted in KBytes and single-digit MBytes.
It is not specialized for MNIST; it will be a general-purpose framework aimed at inference. We will add more ops so developers can create more models on device, trained with TensorFlow. This project is still ongoing.