.NET Mokka is a minimal inference engine for dense-layer neural networks. Written as a single C# file, it uses AVX2 intrinsics. The code is aimed at competitive AIs: it makes heavy use of unsafe calls and raw memory pointers, has no getters/setters, no bounds checks, everything declared as public, etc. It tries to give good performance at a minimal binary size, without external references. It's not aimed at production-grade systems where stability matters. Most inference engines are bloated with external libraries and complex loaders that inflate the binary size.
These engines are too big to use in AI challenges (e.g. www.codingame.com), where file size is limited to <160KB without external libraries.
The current Test DLL output is about 22KB in size, and the current source file is 21KB (without code golfing). It's feasible to have a compressed binary of 60KB plus 80KB of weights.
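As a flavor of the coding style described above, here is a minimal sketch of an unsafe AVX2 dot product over raw float pointers with no bounds checks. This is only an illustration, not Mokka's actual code, and the Dot helper name is hypothetical:

using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static unsafe float Dot(float* a, float* b, int n)
{
    // Process 8 floats per iteration; raw pointers, no bounds checks.
    var acc = Vector256<float>.Zero;
    int i = 0;
    for (; i + 8 <= n; i += 8)
        acc = Avx.Add(acc, Avx.Multiply(Avx.LoadVector256(a + i),
                                        Avx.LoadVector256(b + i)));
    // Horizontal sum of the 8 partial sums.
    float sum = 0f;
    for (int j = 0; j < 8; j++)
        sum += acc.GetElement(j);
    // Scalar tail for lengths that aren't multiples of 8.
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}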
Trust proven frameworks. There is no point in reinventing the wheel and creating your own framework to train your network. Mokka is just an inference engine; you need to train somewhere else and export the weights. I use Tensorflow 2.0 to train some MNIST neural networks (even with GPU enabled).
Saving the model was done with:
import base64
import numpy as np

def SaveModel(my_model, fileSTR):
    totalbytes = 0
    data = []
    Wmodel = open(fileSTR, "wb")
    for x in my_model.weights:
        nn = x.numpy()
        v = nn.tobytes()  # raw float32 bytes, row-major
        Wmodel.write(bytearray(v))
        totalbytes += len(v)
        print(x.name, len(v), " dims:", nn.ndim, " ", nn.shape)
        data.append(base64.b64encode(v).decode("utf-8"))  # optional base64 copy
    Wmodel.close()
    print("Total bytes:" + str(totalbytes))
You can see how SaveModel works in the Jupyter notebooks.
The model in Tensorflow must match the model created in Mokka. For the sake of binary size there are no validity checks when loading weights.
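Since SaveModel just concatenates each tensor's raw float32 bytes, a loader only has to read them back in the same order (kernel, then bias, for each layer). The sketch below is hypothetical, not Mokka's actual loading code; the tensor sizes assume the 28*28 -> 128 -> 10 MNIST model used in this repo:

using System.IO;

// Hypothetical loader sketch: reads raw float32 values back in the exact
// order Tensorflow exported them. No validity checks, matching Mokka's style.
static float[] ReadFloats(BinaryReader br, int count)
{
    var data = new float[count];
    for (int i = 0; i < count; i++)
        data[i] = br.ReadSingle(); // little-endian float32
    return data;
}

static void LoadDenseWeights(string path)
{
    using var br = new BinaryReader(File.OpenRead(path));
    float[] denseKernel = ReadFloats(br, 28 * 28 * 128); // "Dense" kernel
    float[] denseBias   = ReadFloats(br, 128);           // "Dense" bias
    float[] softKernel  = ReadFloats(br, 128 * 10);      // "Soft" kernel
    float[] softBias    = ReadFloats(br, 10);            // "Soft" bias
}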
- .NET Core SDK 3.1 or later (it needs System.Runtime.Intrinsics.X86 and an AVX2-capable CPU)
- Tensorflow 2.0 for training; it can even run on another PC. I run TF 2.0 on Windows because I can use CUDA for GPU acceleration.
- MNIST datasets. TF2 will download them automatically.
- Tested on both Windows 10 and Ubuntu 18 (WSL)
The API is inspired by Tensorflow: create a layer, then feed that layer to the next one.
static Model CreateModel(out Input input, out Layer policy)
{
    input = new Input(new List<int>() { 28 * 28 });
    Layer x = new Dense("Dense", 128, Activators.RELU).link(input);
    policy = new Dense("Soft", 10, Activators.SOFTMAX).link(x);
    Model model = new Model(input, policy);
    return model;
}
Some structures are flattened to a single W*H array instead of being a {W,H} matrix.
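For example, element (row, col) of a flattened W*H matrix lives at index row*H + col. A minimal sketch (the Get helper is hypothetical; Mokka keeps its data in plain public arrays):

static float Get(float[] flat, int row, int col, int H)
{
    return flat[row * H + col]; // row-major: W*H floats in one array
}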
I've tested the accuracy of the code with two MNIST tests. The code achieves 12 µs/sample, which is good performance, similar to C++. Tensorflow is faster at 9 µs/sample.
They are called MNIST Simple.ipynb and MNIST Simple29.ipynb; both are Jupyter Notebooks. Running both notebooks creates two weight files in the ./Mokka subfolder, called DENSE.weights and DENSE29.weights respectively.
Running the Test (requires .NET Core SDK 3.1):
cd Mokka
dotnet run -c release
When you run the binary you'll get some accuracy percentages; these percentages are the same as in the Jupyter notebooks.
Testing MNIST Simple29.ipynb:
- Accuracy is the same as in the Test
- Similar summary, same number of trainable parameters

Testing MNIST Simple.ipynb:
- Accuracy is the same as in the Test
- Similar summary, same number of trainable parameters
Binary size
22KB DLL, including the MNIST loading code
The tester will also save a DENSE.test file. This is an export of the loaded weights; the file should be exactly the same as DENSE.weights.
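If you want to double-check, here is a quick, hypothetical snippet (not part of the tester) that verifies the two files are byte-identical:

using System;
using System.IO;
using System.Linq;

static void VerifyExport()
{
    // Adjust paths as needed; both files end up in the Mokka folder.
    bool identical = File.ReadAllBytes("DENSE.test")
                         .SequenceEqual(File.ReadAllBytes("DENSE.weights"));
    Console.WriteLine(identical ? "Files match" : "Files differ!");
}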
- Convolutional network layers. Convolutions are expensive to run and hard to implement, even more so in AVX2.
- AlphaZero implementation (policy + value)
- Sleef Tanhf AVX2. Right now tanh is done with Math.Tanh; it can be done with a good approximation, which will increase tanh performance (a sketch of one possible approximation follows this list).
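As an illustration of the kind of approximation that could replace Math.Tanh, here is a clamped Padé-style rational approximation evaluated eight floats at a time with AVX2 intrinsics. This is a generic sketch, not the Sleef algorithm:

using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

static Vector256<float> TanhApprox(Vector256<float> x)
{
    // tanh(x) ≈ x * (27 + x^2) / (27 + 9*x^2), decent for small |x|.
    var x2  = Avx.Multiply(x, x);
    var num = Avx.Multiply(x, Avx.Add(Vector256.Create(27f), x2));
    var den = Avx.Add(Vector256.Create(27f),
                      Avx.Multiply(Vector256.Create(9f), x2));
    var t   = Avx.Divide(num, den);
    // Clamp to [-1, 1] so large inputs saturate like the real tanh.
    return Avx.Max(Avx.Min(t, Vector256.Create(1f)), Vector256.Create(-1f));
}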
NetMokka is a .NET port of https://github.com/marchete/Mokka
Mokka was based on Latte: https://github.com/amrmorsey/Latte