bareml is a Python library of "bare" implementations of machine learning / deep learning algorithms, written from scratch and depending only on NumPy. "bare" means the code aims to be:
- A direct translation of the algorithm / formula into code
- Free of all but minimal error handling and efficiency tricks
To maximise the understandability of the code, the interface of the modules in bareml/machinelearning/
is aligned with scikit-learn, and the interface of the modules in bareml/deeplearning/
is aligned with PyTorch, as shown in the two examples below.
Example 1:
from bareml.machinelearning.utils.model_selection import train_test_split
from bareml.machinelearning.supervised import KernelRidge
# assume the data X, y are defined
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
reg = KernelRidge(alpha=1, kernel='rbf')
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)
print(reg.score(X_test, y_test))
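The model-selection utilities listed later in this README follow the same scikit-learn-style interface. Below is a minimal cross-validation sketch building on Example 1; the KFold import path and split signature are assumptions based on that alignment, not documented API:

from bareml.machinelearning.utils.model_selection import KFold
from bareml.machinelearning.supervised import KernelRidge

# assume the data X, y are defined as numpy arrays
kf = KFold(n_splits=5)  # assumed to mirror sklearn.model_selection.KFold
scores = []
for train_idx, test_idx in kf.split(X):
    reg = KernelRidge(alpha=1, kernel='rbf')
    reg.fit(X[train_idx], y[train_idx])
    scores.append(reg.score(X[test_idx], y[test_idx]))
print(sum(scores) / len(scores))  # mean score across the 5 folds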
Example 2:
from bareml.deeplearning import layers as nn
from bareml.deeplearning import functions as F
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # two convolutional layers, dropout, and two fully connected layers
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1)
        self.dropout1 = nn.Dropout(p=0.25)
        self.dropout2 = nn.Dropout(p=0.5)
        # in_features must match the flattened size of the conv/pool output
        self.fc1 = nn.Linear(in_features=33856, out_features=128)
        self.fc2 = nn.Linear(in_features=128, out_features=10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = x.flatten()  # flatten the feature maps for the fully connected layers
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        return x
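Because the deep learning interface mirrors PyTorch, training the network above looks much like a PyTorch training step. The sketch below is illustrative only: the optimiser import path and every method name not shown in Example 2 (parameters, zero_grad, step, cross_entropy) are assumptions based on the PyTorch alignment, so check the source for the actual names.

from bareml.deeplearning import layers as nn
from bareml.deeplearning import functions as F
from bareml.deeplearning import optimizers as optim  # hypothetical import path

model = Net()
optimizer = optim.SGD(model.parameters(), lr=0.01)  # assumed PyTorch-style signature

# assume x_batch (images) and t_batch (integer class labels) are defined
optimizer.zero_grad()               # reset accumulated gradients (assumed API)
y = model(x_batch)                  # forward pass through the Net defined above
loss = F.cross_entropy(y, t_batch)  # assumed PyTorch-style loss helper
loss.backward()                     # backprop through bareml's autograd
optimizer.step()                    # gradient-descent parameter update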
Installation
$ pip install bareml
or, to install from source:
$ git clone https://github.com/shotahorii/bareml.git
$ cd bareml
$ python setup.py install
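A quick way to check that the install worked is to import the package (no output means success):
$ python -c "import bareml"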
Dependencies
Mandatory:
- numpy
Optional:
- cupy
- PIL
- matplotlib
- graphviz
Implementations (a usage sketch follows these lists)
Supervised learning:
- Bernoulli Naive Bayes
- Decision Trees
- Elastic Net
- Gaussian Naive Bayes
- Generalised Linear Model
- K Nearest Neighbors
- Kernel Ridge Regression
- Lasso Regression
- Linear Regression
- Logistic Regression
- Perceptron
- Poisson Regression
- Ridge Regression
Ensemble methods:
- AdaBoost
- AdaBoost M1
- AdaBoost Samme
- AdaBoost RT
- AdaBoost R2
- Bagging
- Gradient Boosting
- Random Forest
- Stacking
- Voting
- XGBoost
Utilities:
- Preprocessing (Scaler, Encoder etc)
- Metrics
- Kernel functions
- Probability Distributions
- Model Selection (KFold etc)
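The estimators listed above are designed to follow the scikit-learn-aligned interface shown in Example 1. As a final sketch, a hypothetical classification run with the Random Forest implementation; the module path, class name, and parameter name below are guesses based on that alignment, not confirmed API:

from bareml.machinelearning.utils.model_selection import train_test_split
from bareml.machinelearning.ensemble import RandomForestClassifier  # hypothetical path and name

# assume the data X, y are defined
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100)  # assumed parameter name
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(clf.score(X_test, y_test))  # accuracy, if aligned with scikit-learn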
References
- Deep learning programs are based on O'Reilly Japan's book "Deep Learning from Scratch 3" (Koki Saitoh) and its accompanying implementation, DeZero.
- References for the machine learning programs are documented in each source file; they are mostly the original papers, "Pattern Recognition and Machine Learning" (Christopher M. Bishop) and/or "Machine Learning: A Probabilistic Perspective" (Kevin P. Murphy).