Awesome KAN (Kolmogorov-Arnold Network)

A curated list of awesome libraries, projects, tutorials, papers, and other resources related to Kolmogorov-Arnold Network (KAN). This repository aims to be a comprehensive and organized collection that will help researchers and developers in the world of KAN!

Papers

KAN: Kolmogorov-Arnold Networks : Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability. For accuracy, much smaller KANs can achieve comparable or better accuracy than much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful collaborators helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives for MLPs, opening opportunities for further improving today's deep learning models which rely heavily on MLPs.
KAN 2.0: Kolmogorov-Arnold Networks Meet Science
KAN or MLP: A Fairer Comparison : Under the same number of parameters or FLOPs, we find KAN outperforms MLP only in symbolic formula representing, but remains inferior to MLP on other tasks of machine learning, computer vision, NLP, and audio processing. We also conduct ablation studies on KAN and find that its advantage in symbolic formula representation mainly stems from its B-spline activation function. | code ｜
DropKAN: Regularizing KANs by masking post-activations : DropKAN (Dropout Kolmogorov-Arnold Networks) is a regularization method that prevents co-adaptation of activation function weights in Kolmogorov-Arnold Networks (KANs). DropKAN operates by randomly masking some of the post-activations within the KAN's computation graph, while scaling up the retained post-activations. We show that this simple procedure, which requires minimal coding effort, has a regularizing effect and consistently leads to better generalization of KANs. | code ｜
Rethinking the Function of Neurons in KANs : The neurons of Kolmogorov-Arnold Networks (KANs) perform a simple summation motivated by the Kolmogorov-Arnold representation theorem. Our findings indicate that substituting the sum with the average function in KAN neurons results in significant performance enhancements compared to traditional KANs. Our study demonstrates that this minor modification contributes to the stability of training by confining the input to the spline within the effective range of the activation function. | code ｜
Chebyshev Polynomial-Based Kolmogorov-Arnold Networks
Kolmogorov Arnold Informed neural network: A physics-informed deep learning framework for solving PDEs based on Kolmogorov Arnold Networks | code ｜
Convolutional Kolmogorov-Arnold Networks | code ｜
Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies | code ｜
Smooth Kolmogorov Arnold networks enabling structural knowledge representation
TKAN: Temporal Kolmogorov-Arnold Networks ｜ code ｜
ReLU-KAN: New Kolmogorov-Arnold Networks that Only Need Matrix Addition, Dot Multiplication, and ReLU ｜ code ｜
U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation｜ code ｜
Kolmogorov-Arnold Networks (KANs) for Time Series Analysis
Wav-KAN: Wavelet Kolmogorov-Arnold Networks
A First Look at Kolmogorov-Arnold Networks in Surrogate-assisted Evolutionary Algorithms | code｜
FourierKAN-GCF: Fourier Kolmogorov-Arnold Network--An Effective and Efficient Feature Transformation for Graph Collaborative Filtering ｜ code ｜
A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting ｜ code ｜
fKAN: Fractional Kolmogorov-Arnold Networks with trainable Jacobi basis functions | code |
BSRBF-KAN: A combination of B-splines and Radial Basic Functions in Kolmogorov-Arnold Networks | code |
GraphKAN: Enhancing Feature Extraction with Graph Kolmogorov Arnold Networks | code |
rKAN: Rational Kolmogorov-Arnold Networks | code |
SigKAN: Signature-Weighted Kolmogorov-Arnold Networks for Time Series ｜ code ｜
Demonstrating the Efficacy of Kolmogorov-Arnold Networks in Vision Tasks | code ｜
KANQAS: Kolmogorov-Arnold Network for Quantum Architecture Search | code
DeepOKAN: Deep Operator Network Based on Kolmogorov Arnold Networks for Mechanics Problems
A deep machine learning algorithm for construction of the Kolmogorov–Arnold representation
Inferring turbulent velocity and temperature fields and their statistics from Lagrangian velocity measurements using physics-informed Kolmogorov-Arnold Networks
A Comprehensive Survey on Kolmogorov Arnold Networks (KAN)
Sparks of Quantum Advantage and Rapid Retraining in Machine Learning | code |
Adaptive Training of Grid-Dependent Physics-Informed Kolmogorov-Arnold Networks | code |
Gaussian Process Kolmogorov-Arnold Networks
Kolmogorov--Arnold networks in molecular dynamics
Kolmogorov-Arnold Network for Online Reinforcement Learning | code |
TC-KANRecon: High-Quality and Accelerated MRI Reconstruction via Adaptive KAN Mechanisms and Intelligent Feature Scaling | code |
Kolmogorov-Arnold Networks for Time Series: Bridging Predictive Power and Interpretability
KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting? | code |
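
The abstracts above describe three ideas that are easy to see in code: learnable univariate functions on edges (the original KAN paper), averaging instead of summing at each node ("Rethinking the Function of Neurons in KANs"), and masking post-activations while scaling up the retained ones (DropKAN). The following is a minimal pure-Python sketch of those ideas, not the implementation from any of the papers. It assumes order-1 (piecewise-linear) splines on a uniform grid instead of the higher-order B-splines used in practice, and it assumes a dropout-style rescaling factor of 1 / (1 - drop_rate). The names `hat_basis`, `EdgeFunction`, and `ToyKANLayer` are hypothetical.

```python
import random


def hat_basis(x, grid):
    """Values of the order-1 B-spline ("hat") basis functions at x.

    Assumption: `grid` is sorted and uniformly spaced.
    """
    h = grid[1] - grid[0]
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]


class EdgeFunction:
    """Univariate function phi(x) = sum_k coeffs[k] * B_k(x) on one edge."""

    def __init__(self, grid, coeffs):
        self.grid = grid
        self.coeffs = coeffs  # these would be the trainable parameters

    def __call__(self, x):
        basis = hat_basis(x, self.grid)
        return sum(c * b for c, b in zip(self.coeffs, basis))


class ToyKANLayer:
    """edges[j][i] is the function on the edge from input i to output j."""

    def __init__(self, edges, aggregate="sum", drop_rate=0.0):
        if aggregate not in ("sum", "mean"):
            raise ValueError("aggregate must be 'sum' or 'mean'")
        if not 0.0 <= drop_rate < 1.0:
            raise ValueError("drop_rate must be in [0, 1)")
        self.edges = edges
        self.aggregate = aggregate
        self.drop_rate = drop_rate

    def forward(self, xs, training=False, rng=random):
        outputs = []
        for row in self.edges:
            # Post-activations: one value per incoming edge.
            post = [phi(x) for phi, x in zip(row, xs)]
            if training and self.drop_rate > 0.0:
                # DropKAN-style masking (assumed rescaling by 1 / keep).
                keep = 1.0 - self.drop_rate
                post = [p / keep if rng.random() < keep else 0.0 for p in post]
            total = sum(post)
            # "mean" is the averaging variant; "sum" is the classic node.
            outputs.append(total / len(post) if self.aggregate == "mean" else total)
        return outputs


grid = [-1.0, 0.0, 1.0]
abs_edge = EdgeFunction(grid, [1.0, 0.0, 1.0])   # phi(x) = |x| on [-1, 1]
id_edge = EdgeFunction(grid, [-1.0, 0.0, 1.0])   # phi(x) = x on [-1, 1]
layer = ToyKANLayer([[abs_edge, id_edge]])
print(layer.forward([-0.5, 0.25]))  # [0.75]
```

Note that the layer has no weight matrix: all parameters live in the per-edge coefficient lists, which is the point the original abstract makes about replacing linear weights with univariate functions.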

Theorem

1957-On the representation of continuous functions of several variables by superpositions of continuous functions of a smaller number of variables : The original Kolmogorov Arnold paper
1957-On functions of three variables
2009-On a constructive proof of Kolmogorov’s superposition theorem
2021-The Kolmogorov-Arnold representation theorem revisited
2021-The Kolmogorov Superposition Theorem can Break the Curse of Dimension When Approximating High Dimensional Functions
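
For orientation, the result these references study is commonly stated as follows. The notation is a convention chosen for this note, not taken from any single reference above: every continuous function f on the unit cube [0,1]^n can be written using only continuous univariate functions and addition.

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\qquad
\Phi_q : \mathbb{R} \to \mathbb{R}, \quad
\phi_{q,p} : [0,1] \to \mathbb{R} \ \text{continuous}.
```

Read as a network, the inner functions \phi_{q,p} sit on the edges from the n inputs to 2n+1 hidden nodes, and the outer functions \Phi_q sit on the edges from those nodes to the output.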

Library

pykan : Official implementation for Kolmogorov Arnold Networks ｜
efficient-kan : An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN). ｜
FastKAN : Very Fast Calculation of Kolmogorov-Arnold Networks (KAN) ｜
FasterKAN : FasterKAN = FastKAN + RSWAF basis functions and benchmarking with other KANs. Fastest KAN variation as of 5/13/2024, 2 times slower than MLP in backward speed. ｜
FourierKAN : PyTorch Layer for FourierKAN. It is a layer intended to be a substitution for Linear + non-linear activation |
Vision-KAN : PyTorch Implementation of Vision Transformers with KAN layers, built on top of ViT. 95% accuracy on CIFAR100 (top-5), 80% on ImageNet1000 (training in progress) |
ChebyKAN : Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines. ｜
GraphKAN : Implementation of Graph Neural Network version of Kolmogorov Arnold Networks (GraphKAN) ｜
FCN-KAN : Kolmogorov–Arnold Networks with modified activation (using fully connected network to represent the activation) ｜
X-KANeRF : KAN based NeRF with various basis functions like B-Splines, Fourier, Radial Basis Functions, Polynomials, etc ｜
Large Kolmogorov-Arnold Networks : Variations of Kolmogorov-Arnold Networks (including CUDA-supported KAN convolutions) ｜
xKAN : Kolmogorov-Arnold Networks with various basis functions like B-Splines, Fourier, Chebyshev, Wavelets etc ｜
JacobiKAN : Kolmogorov-Arnold Networks (KAN) using Jacobi polynomials instead of B-splines. ｜
OrthogPolyKAN : Kolmogorov-Arnold Networks (KAN) using orthogonal polynomials instead of B-splines. ｜
kansformers : Kansformers: Transformers using KANs |
Deep-KAN: Better implementation of Kolmogorov Arnold Network |
RBF-KAN: RBF-KAN is a PyTorch module that implements a Radial Basis Function Kolmogorov-Arnold Network |
KolmogorovArnold.jl : Very fast Julia implementation of KANs with RBF and RSWAF basis. Extra speedup is gained by writing custom gradients to share work between forward and backward pass. ｜
Wav-KAN: Wav-KAN: Wavelet Kolmogorov-Arnold Networks |
KANX : Fast Implementation (Approximation) of Kolmogorov-Arnold Network in JAX |
FlashKAN: Grid size-independent computation of Kolmogorov Arnold networks |
BSRBF_KAN: Combine B-Spline (BS) and Radial Basic Function (RBF) in Kolmogorov-Arnold Networks (KANs) |
TaylorKAN: Kolmogorov-Arnold Networks (KAN) using Taylor series instead of Fourier |
fKAN: fKAN: Fractional Kolmogorov-Arnold Networks with trainable Jacobi basis functions |
Initial Investigation of Kolmogorov-Arnold Networks (KANs) as Feature Extractors for IMU Based Human Activity Recognition
rKAN: rKAN: Rational Kolmogorov-Arnold Networks |
TKAN: Temporal Kolmogorov-Arnold Networks Keras3 layer implementations multibackend (Jax, Tensorflow, Torch) |
TKAT: Temporal Kolmogorov-Arnold Transformer Tensorflow 2.x model implementation |
SigKAN: Path Signature-Weighted Kolmogorov-Arnold Networks tensorflow 2.x layer implementations, based on iisignature |
KAN-SGAN: Semi-supervised learning with Generative Adversarial Networks (GANs) using Kolmogorov-Arnold Network Layers (KANLs) |
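
Several entries above replace the B-spline basis with another family of functions (Chebyshev, Jacobi, Fourier, wavelets, RBFs). As a rough illustration of what that swap means for a single edge, here is a minimal sketch of a Chebyshev-parametrized univariate function. It is not the code of any listed repository. The function names are hypothetical, and squashing the input with tanh to keep it inside [-1, 1] is an assumption of this sketch.

```python
import math


def chebyshev_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] via T_{n+1} = 2x*T_n - T_{n-1}."""
    values = [1.0, x]
    for _ in range(2, degree + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]


def chebyshev_edge(x, coeffs):
    """Edge function phi(x) = sum_n coeffs[n] * T_n(tanh(x)).

    `coeffs` plays the role of the trainable parameters. The tanh squashing
    is an assumption that keeps the argument in the polynomials' domain.
    """
    z = math.tanh(x)
    basis = chebyshev_basis(z, len(coeffs) - 1)
    return sum(c * t for c, t in zip(coeffs, basis))


print(chebyshev_basis(0.5, 3))  # [1.0, 0.5, -0.5, -1.0]
```

Any other basis family can be dropped in the same way: only the function that evaluates the basis changes, while the learnable coefficients and the summation over edges stay the same.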

Library-based

TorchKAN : Simplified KAN Model Using Legendre approximations and Monomial basis functions for Image Classification for MNIST. Achieves 99.5% on MNIST using Conv+LegendreKAN. ｜
efficient-kan-jax : JAX port of efficient-kan |
jaxKAN : Adaptation of the original KAN (with full regularization) in JAX + Flax |
cuda-Wavelet-KAN : CUDA implementation of Wavelet KAN. |
keras_efficient_kan: A full Keras implementation of efficient_kan, tested with the TensorFlow, PyTorch, and JAX backends |
Quantum KAN: KANs optimizable through quantum annealing |
KAN: Kolmogorov–Arnold Networks in MLX for Apple silicon : KAN (Kolmogorov–Arnold Networks) in the MLX framework|

ConvKANs

Convolutional-KANs : This project extends the Kolmogorov-Arnold Network (KAN) architecture to convolutional layers, replacing the classic linear transformation of the convolution with non-linear activations applied to each pixel. ｜
Torch Conv KAN : This repository implements Convolutional Kolmogorov-Arnold Layers with various basis functions. The repository includes implementations of 1D, 2D, and 3D convolutions with different kernels, ResNet-like, Unet-like, and DenseNet-like models, training code based on accelerate/PyTorch, and scripts for experiments with CIFAR-10/100, Tiny ImageNet and ImageNet1k. Pretrained weights on ImageNet1k are also available ｜
convkan : Implementation of convolutional layer version of KAN (drop-in replacement of Conv2d) ｜
KA-Conv : Kolmogorov-Arnold Convolutional Networks with Various Basis Functions (Optimization for Efficiency and GPU memory usage) |
KAN-Conv2D : Drop-in Convolutional KAN built on multiple implementations (Original pykan / efficient-kan / FastKAN) to support the original paper hyperparameters. |
CNN-KAN : A modified CNN architecture using Kolmogorov-Arnold Networks |
ConvKAN3D : 3D Convolutional Layer built on top of the efficient-kan implementation (importable Python package from PyPi), drop-in replacement of Conv3d.
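
To make the "non-linear activation per pixel" idea concrete, here is a minimal pure-Python sketch of a single-channel KAN-style 2D convolution. It is a toy under stated assumptions (no padding, stride 1, one input and one output channel) and does not reproduce the API of any repository listed here. The name `kan_conv2d` is hypothetical. Each kernel position holds a univariate function instead of a scalar weight.

```python
def kan_conv2d(image, kernel_fns):
    """Slide a kernel of univariate functions over a 2D list of numbers.

    An ordinary convolution computes sum(w[i][j] * x[r+i][c+j]).
    Here each weight is replaced by a function: sum(phi[i][j](x[r+i][c+j])).
    """
    kh, kw = len(kernel_fns), len(kernel_fns[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            row.append(
                sum(
                    kernel_fns[i][j](image[r + i][c + j])
                    for i in range(kh)
                    for j in range(kw)
                )
            )
        output.append(row)
    return output


# Fixed functions stand in for what would be learnable splines.
kernel = [
    [lambda v: v * v, abs],
    [lambda v: 2 * v, lambda v: v],
]
image = [
    [1, -2, 3],
    [0, 1, -1],
]
print(kan_conv2d(image, kernel))  # [[4, 8]]
```

If every function in the kernel is linear, phi(v) = w * v, the sketch reduces to an ordinary (cross-correlation style) convolution, which is why such layers can be described as drop-in replacements for standard convolutional layers.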

Benchmark

KAN-benchmarking : Benchmark for efficiency in memory and time of different KAN implementations. |
seydi1370/Basis_Functions : This package investigates the performance of 18 different polynomial basis functions, grouped into several categories based on their mathematical properties and areas of application. The study evaluates the effectiveness of these polynomial-based KANs on the MNIST dataset for handwritten digit classification. |

Non-Python

KolmogorovArnold.jl : Very fast Julia implementation of KANs with RBF and RSWAF basis. Extra speedup is gained by writing custom gradients to share work between forward and backward pass. ｜
kan-polar : Kolmogorov-Arnold Networks in MATLAB ｜
kamo : Kolmogorov-Arnold Networks in Mojo ｜
Julia-Wav-KAN : A Julia implementation of Wavelet Kolmogorov-Arnold Networks. ｜
Building a Kolmogorov-Arnold Neural Network in C
C# and C++ implementations, benchmarks, tutorials
FluxKAN.jl : An easy to use Flux implementation of the Kolmogorov Arnold Network. This is a Julia version of TorchKAN.

Alternative

high-order-layers-torch : High order piecewise polynomial neural networks using Chebyshev polynomials at Gauss Lobatto nodes (Lagrange polynomials). Includes convolutional layers as well as HP refinement for non-convolutional layers, linear initialization, and various applications in the linked repos with varying levels of success: Euler equations of fluid dynamics, NLP, implicit representation, and more |
Training based on Kaczmarz, not Broyden method : The training process is independent of the basis functions; the provided link shows an alternative to the Broyden method originally suggested in the MIT paper. It outperforms MLP significantly; benchmarks are provided.

Project

KAN-GPT : The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling ｜
KAN-GPT-2 : Training small GPT-2 style models using Kolmogorov-Arnold networks (despite the KAN model having 25% fewer parameters!). ｜
KANeRF : Kolmogorov-Arnold Network (KAN) based NeRF ｜
Vision-KAN : KAN for Vision Transformer ｜
Simple-KAN-4-Time-Series : A simple feature-based time series classifier using Kolmogorov–Arnold Networks ｜
KANU_Net : U-Net architecture with Kolmogorov-Arnold Convolutions (KA convolutions) ｜
kanrl : Kolmogorov-Arnold Network for Reinforcement Learning, initial experiments ｜
kan-diffusion : Applying KANs to Denoising Diffusion Models, with a two-layer KAN able to restore images almost as well as a 4-layer MLP (with 30% fewer parameters). ｜
KAN4Rec : Implementation of Kolmogorov-Arnold Network (KAN) for Recommendations ｜
CF-KAN : Kolmogorov-Arnold Network (KAN) implementation for collaborative filtering (CF) |
X-KANeRF : X-KANeRF: KAN-based NeRF with Various Basis Functions to explain the NeRF formula ｜
KAN4Graph : Implementation of Kolmogorov-Arnold Network (KAN) for Graph Neural Networks (GNNs) and Tasks on Graphs ｜
ImplicitKAN : Kolmogorov-Arnold Network (KAN) as an implicit function for images and other modalities ｜
ThangKAN : Kolmogorov-Arnold Network (KAN) for text classification over GLUE tasks ｜
JianpanHuang/KAN : This repository contains a demo of regression task (curve fitting) using an efficient Kolmogorov-Arnold Network. ｜
Fraud Detection in Supply Chains Using Kolmogorov Arnold Networks ｜
CL-KAN-ViT : Kolmogorov-Arnold Network (KAN) based vision transformer for class-based continual learning to mitigate catastrophic forgetting |
KAN-Autoencoder : KAE: KAN-based AutoEncoder (AE, VAE, VQ-VAE, RVQ, etc.) |
OpenKAN
KAN-DQN : An experiment where KAN replaces MLP in Deep Q-Network to play Flappy Bird as a Reinforcement Learning agent. |

Discussion

HN-KAN Hacker news discussion
HN- A new type of neural network is more interpretable
HN-Trying Kolmogorov-Arnold Networks in Practice
Can Kolmogorov–Arnold Networks (KAN) beat MLPs?
Twitter thinks they killed MLPs. But what are Kolmogorov-Arnold Networks?
[D] Kolmogorov-Arnold Network is just an MLP
KAN: Kolmogorov–Arnold Networks: A review : This review raises four major criticisms of the paper KAN: Kolmogorov-Arnold Networks: "MLPs have learnable activation functions as well", "The content of the paper does not justify the name, Kolmogorov-Arnold networks (KANs)", "KANs are MLPs with spline-basis as the activation function", and "KANs do not beat the curse of dimensionality", contrary to what the paper claims.

Tutorial

KAN Author's twitter introduction
pg2455/KAN-Tutorial ｜
A Simplified Explanation Of The New Kolmogorov-Arnold Network (KAN) from MIT
The Math Behind KAN — Kolmogorov-Arnold Networks
A from-scratch implementation of Kolmogorov-Arnold Networks (KAN)…and MLP | GitHub Code
team-daniel/KAN : Examples of how to use Kolmogorov-Arnold Networks (KANs) for classification and regression tasks.｜
vincenzodentamaro/keras-FastKAN : Tensorflow Keras implementation of FastKAN Kolmogorov Arnold Network｜
Official Tutorial Notebooks
imodelsX examples with KAN : Scikit-learn wrapper for tabular data for KAN (Kolmogorov Arnold Network)
What is the new Neural Network Architecture?(KAN) Kolmogorov-Arnold Networks Explained
KAN: Kolmogorov–Arnold Networks — A Short Summary
What is the significance of the Kolmogorov axioms for Mathematical Probability?
Andrey Kolmogorov — one of the greatest mathematicians of the XXth century
Unpacking Kolmogorov-Arnold Networks : Edge-Based Activation: Exploring the Mathematical Foundations and Practical Implications of KANs
Why is the (KAN) Kolmogorov-Arnold Networks so promising
Demystifying Kolmogorov-Arnold Networks: A Beginner-Friendly Guide with Code
KANvas : Provide quick & intuitive interaction for people to try KAN
KAN-Tutorial: Understanding Kolmogorov-Arnold Networks: A Tutorial Series on KAN using Toy Examples
KAN-Continual_Learning_tests : Collection of tests performed during the study of the new Kolmogorov-Arnold Neural Networks (KAN) ｜
The Annotated Kolmogorov Network (KAN): An annotated code guide implementation of KAN, like the Annotated Transformer.

YouTube

Contributing

We welcome your contributions! Please follow these steps to contribute:

Fork the repo.
Create a new branch (e.g., feature/new-kan-resource).
Commit your changes to the new branch.
Create a Pull Request, and provide a brief description of the changes/additions.

Please make sure that the resources you add are relevant to the field of Kolmogorov-Arnold Network. Before contributing, take a look at the existing resources to avoid duplicates.

License

This work is licensed under a Creative Commons Attribution 4.0 International License.
