Regresa

Regresa is a Python package where I implemented my own versions of the linear and logistic regression algorithms presented by Andrew Ng in his course, Supervised Machine Learning: Regression and Classification.

My motivations were:

  1. to have a reusable implementation of the algorithms for learning purposes,
  2. to write the algorithms avoiding nested loops for readability,
  3. to add tests to the implementations so I could play with refactoring them.

Installation

Regresa is written with Poetry. The following instructions should be sufficient for you to start using it.

git clone https://github.com/elcapo/regresa.git
cd regresa
poetry install

Note that you'll need to install Git, Python and Poetry to get this working.

Usage

Once installed, use Poetry's shell to interact with the package.

poetry shell

Linear

The linear module offers functions to compute a linear regression given a set of examples with one or more features:

  • predict: apply a given set of coefficients to the input to predict an output
  • loss: compute the individual loss for a set of examples
  • cost: compute the total cost for a set of examples
  • cost_gradient: compute the gradient of the cost of a given set of coefficients
  • gradient_descent: compute a gradient descent

These functions can be imported one by one:

from regresa.linear import predict

# ... and then use them directly by their names
predict([[0], [1]], [2], .5) # [0.5, 2.5]

... or all at once:

from regresa import linear

# ... and then use them directly by their names prefixed with linear
linear.predict([[0], [1]], [2], .5) # [0.5, 2.5]

Linear / Predict

from regresa.linear import predict

help(predict)

    Apply a given set of coefficients to the input to predict an output.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression

    Return:
        f_wb (ndarray (m, )): evaluation of the linear regression for each value of x
    
$$f_{\vec{w}, b}(\vec{x}) = \vec{w}·\vec{x} + b$$
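
For a quick check of the formula, here is a small sketch with two features (continuing from the import above); the expected values in the comments follow directly from computing w · x + b for each example:

X = [[1, 2], [3, 4]]  # two examples with two features each
w = [0.5, 1]          # one weight per feature
b = 1                 # bias weight

predict(X, w, b)  # [3.5, 6.5], since 1*0.5 + 2*1 + 1 = 3.5 and 3*0.5 + 4*1 + 1 = 6.5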

Linear / Loss

from regresa.linear import loss

help(loss)

    Compute the loss of a set of examples.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with the target values for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression

    Returns:
        (ndarray (m, )): loss for each of the given examples
    
$$j^{[i]} = (f_{\vec{w},b}(\vec{x}^{[i]}) - y^{[i]})^2$$
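
As a hand-checkable sketch, assuming loss returns the per-example squared errors defined above (continuing from the import above):

X = [[0], [1]]
y = [1, 1]  # target values
w = [2]
b = 0

# the predictions are [0, 2], so the squared errors are (0 - 1)**2 and (2 - 1)**2
loss(X, y, w, b)  # [1.0, 1.0]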

Linear / Cost

from regresa.linear import cost

help(cost)

    Compute the cost for a given set of examples.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with the target values for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression
        lambde (scalar): factor of regularization

    Returns:
        (scalar): total cost for the given set of weights
    
$$J(\vec{w}, b) = \frac{1}{2 m} \sum_{i=1}^{m} j^{[i]} + \frac{\lambda}{2m} \vec{w} \cdot \vec{w}$$
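
Reusing the example from the loss section, and passing lambde explicitly, the cost can be checked by hand against the expression above:

X = [[0], [1]]
y = [1, 1]
w = [2]
b = 0

cost(X, y, w, b, 0)  # 0.5, from (1 + 1) / (2 * 2) with no regularization
cost(X, y, w, b, 1)  # 1.5, adding (1 / (2 * 2)) * (2 * 2) as the regularization term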

Linear / Cost Gradient

from regresa.linear import cost_gradient

help(cost_gradient)

    Compute the gradient of the cost for a given set of examples.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with the target values for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression
        lambde (scalar): factor of regularization

    Returns:
        (ndarray (n, )): gradient of the cost for the given set of weights w
        (scalar): gradient of the cost for the given weight b
    

$$ \begin{alignat*}{4} & \frac{\partial J(\vec{w}, b)}{\partial w_j} = \frac{1}{m} \sum_{i=1}^{m} (f_{\vec{w},b}(\vec{x}^{[i]}) - y^{[i]}) x_j^{[i]} + \frac{\lambda}{m} w_j \\ & \frac{\partial J(\vec{w}, b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (f_{\vec{w},b}(\vec{x}^{[i]}) - y^{[i]}) \end{alignat*} $$
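
Continuing the same example, and assuming cost_gradient returns the pair described in its help text, the expected values follow from evaluating the formula by hand:

X = [[0], [1]]
y = [1, 1]
w = [2]
b = 0

# the errors f_wb - y are [-1, 1], so with lambde = 0:
# dj_dw = ((-1) * 0 + 1 * 1) / 2 = 0.5 and dj_db = (-1 + 1) / 2 = 0.0
dj_dw, dj_db = cost_gradient(X, y, w, b, 0)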

Linear / Gradient Descent

from regresa.linear import gradient_descent

help(gradient_descent)

    Compute a gradient descent.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with the target values for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression
        alpha (scalar): learning rate
        iterations (scalar): number of iterations to run

    Returns:
        (ndarray (n, )): weights for each feature after the iterations
        (scalar): additional scalar weight
    

$$ \begin{alignat*}{4} & w_n = w_{n-1} - \alpha \frac{\partial J(\vec{w}, b)}{\partial w} \\ & b_n = b_{n-1} - \alpha \frac{\partial J(\vec{w}, b)}{\partial b} \end{alignat*} $$

Note that the subscripts in $w_n$ and $b_n$ represent a given iteration, while $w_{n-1}$ and $b_{n-1}$ represent the previous one.
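
As a usage sketch, assuming gradient_descent returns the adjusted pair (w, b), fitting the line y = 2x could look like this (the learning rate and number of iterations are illustrative, not tuned values):

X = [[0], [1], [2]]
y = [0, 2, 4]

w, b = gradient_descent(X, y, [0], 0, 1e-1, 1000)
# w should approach [2] and b should approach 0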

Logistic

The logistic module offers functions to compute a binary classification given a set of examples with one or more features:

  • sigmoid: compute the sigmoid of a vector
  • predict: apply a given set of coefficients to the input to predict an output
  • loss: compute the individual loss for a set of examples
  • cost: compute the total cost for a set of examples
  • cost_gradient: compute the gradient of the cost of a given set of coefficients
  • gradient_descent: compute a gradient descent

These functions can be imported one by one:

from regresa.logistic import sigmoid

# ... and then use them directly by their names
sigmoid(.5) # .6224593312018546

... or all at once:

from regresa import logistic

# ... and then use them directly by their names prefixed with logistic
logistic.sigmoid(.5) # .6224593312018546

Logistic / Sigmoid

from regresa.logistic import sigmoid

help(sigmoid)

    Compute the sigmoid of z. In other words, compute 1 / (1 + e**(-z)).

    Arguments:
        z (ndarray (m, )): one dimensional vector with the input values

    Returns:
        (ndarray (m, )): vector with the dimension of z and the result of the computation
    
$$s(\vec{z}) = \frac{1}{1 + e^{-\vec{z}}}$$

This function accepts scalars as input. If a scalar is given, a scalar is also returned.

sigmoid(0) # 0.5
sigmoid(9**9) # 1.0

The function also accepts lists of numbers and Numpy arrays as input. In those cases, a Numpy array with the same dimension as the input is returned.

sigmoid([0, 9**9]) # array([0.5, 1. ])
Example: Sigmoid plot near zero

In combination with the plot method from the plotter module, you can easily get a glimpse of how the function looks.

from regresa.logistic import sigmoid
from regresa.plotter import plot

x = [x for x in range(-10, 10 + 1)]
y = sigmoid(x)

plot(x, y)

Sigmoid function plotted from x = -10 to x = +10

Logistic / Predict

from regresa.logistic import predict

help(predict)

    Apply a given set of coefficients to the input to predict an output.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression

    Return:
        f_wb (ndarray (m, )): evaluation of the logistic regression for each value of x
    
$$f_{\vec{w}, b}(\vec{x}) = \frac{1}{1 + e^{-(\vec{w}·\vec{x} + b)}}$$

In combination with the plot method from the plotter module, you can check how a logistic regression graph changes with different weights.

from regresa import plotter, logistic

x = [[x/10] for x in range(-100, 110, 1)]
multiple_y = [logistic.predict(x, [d/10], 0) for d in range(0, 12, 2)]
labels = ['w = {}'.format(d/10) for d in range(0, 12, 2)]

plotter.over_plot(x, multiple_y, labels)

Logistic regression for weights increasing in factors of 0.2

Logistic / Loss

from regresa.logistic import loss

help(loss)

    Compute the loss of a set of examples.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with boolean tags for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression

    Returns:
        (ndarray (m, )): loss for each of the given examples
    
$$j^{[i]} = -y^{[i]} \log(f_{\vec{w},b}(\vec{x}^{[i]})) - (1 - y^{[i]}) \log(1 - f_{\vec{w},b}(\vec{x}^{[i]}))$$
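
As a hand-checkable sketch: a single example with w · x + b = 0 gives a prediction of 0.5, so its loss reduces to -log(0.5):

X = [[0]]
y = [1]
w = [1]
b = 0

loss(X, y, w, b)  # [0.6931...], that is, -log(0.5)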

Logistic / Cost

from regresa.logistic import cost

help(cost)

    Compute the cost for a given set of examples.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with boolean tags for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression
        lambde (scalar): factor of regularization

    Returns:
        (scalar): total cost for the given set of weights
    
$$J(\vec{w}, b) = \frac{1}{m} \sum_{i=1}^{m} j^{[i]} + \frac{\lambda}{2m} \vec{w} \cdot \vec{w}$$
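
Extending that example, and passing lambde explicitly, both predictions below are 0.5, so the cost averages two identical losses of -log(0.5):

X = [[0], [0]]
y = [1, 0]
w = [1]
b = 0

cost(X, y, w, b, 0)  # 0.6931..., from (-log(0.5) - log(0.5)) / 2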

Logistic / Cost Gradient

from regresa.logistic import cost_gradient

help(cost_gradient)

    Compute the gradient of the cost for a given set of examples.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with boolean tags for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression
        lambde (scalar): factor of regularization

    Returns:
        (ndarray (n, )): gradient of the cost for the given set of weights w
        (scalar): gradient of the cost for the given weight b
    

$$ \begin{alignat*}{4} & \frac{\partial J(\vec{w}, b)}{\partial w_j} = \frac{1}{m} \sum_{i=1}^{m} (f_{\vec{w},b}(\vec{x}^{[i]}) - y^{[i]}) x_j^{[i]} + \frac{\lambda}{m} w_j \\ & \frac{\partial J(\vec{w}, b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (f_{\vec{w},b}(\vec{x}^{[i]}) - y^{[i]}) \end{alignat*} $$
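
As a sketch, assuming cost_gradient returns the pair described in its help text, the approximate values in the comments come from evaluating the formula with sigmoid(0) = 0.5 and sigmoid(1) ≈ 0.7311:

X = [[0], [1]]
y = [1, 0]
w = [1]
b = 0

# the errors f_wb - y are [-0.5, 0.7311], so with lambde = 0:
# dj_dw ≈ ((-0.5) * 0 + 0.7311 * 1) / 2 ≈ 0.3655 and dj_db ≈ (-0.5 + 0.7311) / 2 ≈ 0.1155
dj_dw, dj_db = cost_gradient(X, y, w, b, 0)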

Logistic / Gradient Descent

from regresa.logistic import gradient_descent

help(gradient_descent)

    Compute a gradient descent.

    Arguments:
        X (ndarray (m, n)): input values where the regression will be computed
        y (ndarray (m, )): vector with boolean tags for each example
        w (ndarray (n, )): weights for each of the features
        b (scalar): biased weight for the regression
        alpha (scalar): learning rate
        iterations (scalar): number of iterations to run

    Returns:
        (ndarray (n, )): weights for each feature after the iterations
        (scalar): additional scalar weight
    

$$ \begin{alignat*}{4} & w_j^i = w_j^{i-1} - \alpha \frac{\partial J(\vec{w}, b)}{\partial w_j} \\ & b^i = b^{i-1} - \alpha \frac{\partial J(\vec{w}, b)}{\partial b} \end{alignat*} $$

Note that the superscript in $w_j^i$ does not represent a power. Instead, it expresses that this is the value of $w_j$ at iteration $i$.
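
As a usage sketch, assuming the same return convention as the linear version, a tiny one-feature classification could look like this (alpha and the iteration count are illustrative):

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

w, b = gradient_descent(X, y, [0], 0, 1e-1, 1000)
# w should grow positive and b negative, so that the predictions cross 0.5 between x = 1 and x = 2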

Tests

To run the tests, use PyTest from your shell.

pytest -v

Example result of running the test suite

Documentation

In order to keep the documentation of each function up to date, this README uses templates to print the help text for each of the functions in the linear and logistic modules.

This means that rather than making changes to this document, changes should be made to docs/README.template instead.

After the template is updated, the main README.md file can be regenerated by running:

python docs/refresh_readme.py
