
kamo 🔥

A personal journey into learning about Kolmogorov–Arnold Networks using Mojo.

The following excerpt from the abstract of the paper KAN: Kolmogorov-Arnold Networks provides the essential inspiration:

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes (“neurons”), KANs have learnable activation functions on edges (“weights”). KANs have no linear weights at all – every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability.

This repository explores KANs by porting the KAN Python implementation from ML without tears to Mojo. This very readable Python implementation provides a flexible foundation, enabling instantiation as either a KAN or a classic MLP, which allows for various comparisons and experiments. The main focus is on understanding the core concepts rather than on optimizing performance or implementing all aspects of KANs.

Empowering edges

The fundamental innovation of KANs lies in their learnable activation functions on edges. The paper KAN: Kolmogorov-Arnold Networks suggests using a linear combination of B-Splines and the SiLU function. Subsequent research also recommends Chebyshev polynomials, among other bases. One key feature of all these functions is that their derivatives are well defined and easy to calculate, which is crucial for gradient descent optimization.

Basis functions with well-defined derivatives implemented in this repository:

  • B-Splines & SiLU
  • Chebyshev polynomials
  • Gaussian RBF
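To make the idea concrete, here is a Python sketch (illustrative only, not the repository's Mojo code) of a single KAN edge whose activation is a learnable weighted sum of Gaussian RBF basis functions. The class and method names (`RBFEdge`, `grad_x`, etc.) are hypothetical; the point is that the edge's derivative is available in closed form, which is what makes such bases convenient for gradient descent.

```python
import math

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

class RBFEdge:
    """A single KAN edge: a learnable weighted sum of Gaussian RBFs."""

    def __init__(self, centers, width):
        self.centers = centers                 # grid of basis-function centers
        self.width = width                     # shared bandwidth h
        self.weights = [0.0] * len(centers)    # learnable coefficients

    def basis(self, x):
        # phi_j(x) = exp(-((x - c_j) / h)^2)
        return [math.exp(-((x - c) / self.width) ** 2) for c in self.centers]

    def forward(self, x):
        return sum(w * p for w, p in zip(self.weights, self.basis(x)))

    def grad_x(self, x):
        # d(phi_j)/dx = -2 (x - c_j) / h^2 * phi_j(x), so the derivative of
        # the whole edge is a closed-form weighted sum as well
        return sum(
            w * (-2.0 * (x - c) / self.width ** 2)
            * math.exp(-((x - c) / self.width) ** 2)
            for w, c in zip(self.weights, self.centers)
        )
```

A finite-difference check confirms that `grad_x` matches the numerical derivative of `forward`, which is exactly the property backpropagation relies on.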

Usage

Prerequisite: Ensure you have Mojo 24.4 installed.

The ML without tears implementation offers some basic usage examples to get started. We ported the first two of them to Mojo.

1D regression problem

Refer to train_1d.mojo for a simple 1D regression problem. This example compares the performance of a classical MLP with three KAN networks: one utilizing B-Spline-based edges, another using Chebyshev polynomial-based edges, and the third employing Gaussian RBF-based edges.
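As a rough illustration of what such a 1D regression involves (a plain-Python sketch under our own toy assumptions, not the actual code in train_1d.mojo), full-batch gradient descent can fit a smooth target with a single Gaussian-RBF-based edge; the grid of 9 centers, the bandwidth, and the sin target are all arbitrary choices for the demo:

```python
import math

def rbf(x, c, h):
    return math.exp(-((x - c) / h) ** 2)

# illustrative setup: 9 Gaussian RBF centers spanning [-1, 1]
centers = [-1.0 + j * 0.25 for j in range(9)]
h = 0.25
weights = [0.0] * len(centers)

xs = [-1.0 + i * 0.05 for i in range(41)]        # training inputs
target = lambda x: math.sin(math.pi * x)         # toy regression target

lr = 0.8
for _ in range(1000):                            # full-batch gradient descent
    grads = [0.0] * len(centers)
    for x in xs:
        pred = sum(w * rbf(x, c, h) for w, c in zip(weights, centers))
        err = pred - target(x)
        for j, c in enumerate(centers):
            grads[j] += err * rbf(x, c, h)
    for j in range(len(centers)):
        weights[j] -= lr * grads[j] / len(xs)

mse = sum(
    (sum(w * rbf(x, c, h) for w, c in zip(weights, centers)) - target(x)) ** 2
    for x in xs
) / len(xs)
```

The learned weights are the only trainable parameters; in a full KAN, every edge of the network carries such a function.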

Performance:

2D regression problem

train_2d.mojo implements a 2D regression problem. Again, we compare the performance of a classical MLP with three KAN networks: B-Spline-based, Chebyshev polynomial-based, and Gaussian RBF-based edges.

Performance:

Just Learning

Currently, this project is for our own educational purposes, with no immediate plans to make it suitable for real-world applications. Similar to the original Python code, this KAN implementation prioritizes network transparency over speed. If you're looking for a Mojo deep learning framework with competitive performance features, check out Basalt.

Remarks

  • The current implementation covers only the basic KAN concepts. The paper KAN: Kolmogorov-Arnold Networks suggests various ways to enhance KANs, such as sparsification and grid extension, and has inspired extensive follow-up research. There is plenty of room for improvement in our implementation.
  • For simplicity, we use tanh to normalize the edge inputs to the range of spline grids. This technique is widely used by other performance-optimized KAN implementations (see, for example, FasterKAN).
  • Mojo is evolving quickly but is still quite young and limited in some aspects, such as full support for dynamic polymorphism. Some of the boilerplate in our code is due to these limitations. We look forward to improving our implementation as Mojo continues to mature.
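The tanh normalization mentioned above can be sketched in a few lines of Python (illustrative, not the repository's code): tanh squashes any real input into (-1, 1), which can then be affinely mapped onto an arbitrary spline grid [a, b]. The function name `to_grid` is hypothetical.

```python
import math

def to_grid(x, a=-1.0, b=1.0):
    """Squash an unbounded edge input into the spline grid [a, b]."""
    t = math.tanh(x)                      # t lies in (-1, 1) for any finite x
    return a + (t + 1.0) * (b - a) / 2.0  # affine map (-1, 1) -> (a, b)
```

This keeps every basis-function evaluation inside the grid without clamping, at the cost of compressing large inputs near the grid boundaries.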

Resources

  • pykan: the GitHub repository accompanying the paper referenced above.
  • Awesome KAN: a curated list of libraries, projects, tutorials, papers, and other resources related to Kolmogorov-Arnold Networks (KANs).

Changelog

  • 2024.06.14
    • Added Gaussian Radial Basis Functions (inspired by FastKAN)
  • 2024.06.13
    • Initial commit version 2

License

MIT
