Skip to content

The Branches algorithm, fast Dynamic Programming and Branch and Bound search for seeking optimal Decision Trees

License

Notifications You must be signed in to change notification settings

Chaoukia/branches

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Branches: A Fast Dynamic Programming and Branch & Bound Algorithm for Optimal Decision Trees

Source code for the Algorithm Branches described in the paper Branches: A Fast Dynamic Programming and Branch & Bound Algorithm for Optimal Decision Trees that is currecntly under review.

Dependencies

We recommend creating a conda virtual environment from our .yml file as follows:

conda env create -f dependencies.yml
conda activate branches

To visualize Decision Trees, we need the svgling package, which is not currenlty supported by conda. Thus we install it with pip:

pip install svgling

Repository Structure

.
├── data                         # Data used for benchmarking
├── src                          # Source files
│   ├── branch_ordinal.py        # Source file for classification problems with ordinally encoded data
│   ├── branch_binary.py         # Source file for binary classification problems with binary data
│   ├── branch_binary_multi.py   # Source file for classification problems with binary data
│   ├── branches.py              # Source file for the Branches algorithm
│   └── tutorial.ipynb           # Tutorial .ipynb notebook
├── trees                        # SVG files of optimal decision trees
├── LICENSE
└── README.md

File src/tutorial.ipynb contains a tutorial on how to use Branches with illustrative examples.

Example of usage

The MONK's Problems are standard datasets for benchmarking Optimal Decision Trees algorithms. We use the first of these problems to illustrate how to use Branches.

from branches import *
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, LabelEncoder

# Reading the data
data = np.genfromtxt('data/monks-1.train', delimiter=' ', dtype=int)
data = data[:, :-1] # Getting rid of the last column, it contains only ids.
data = data[:, ::-1] # Reorder the columns to put the predicted variable Y at the end.

# Ordinal Encoding of the data
encoder = OrdinalEncoder()
encoder.fit(data)
data = encoder.transform(data).astype(int)

# Running Branches
alg = Branches(data)
alg.solve(lambd=0.01)

# Printing the accuracy, number of branches and number of splits
branches, splits = alg.lattice.infer()
print('Number of branches :', len(branches))
print('Number of splits :', splits)
print('Accuracy :', ((alg.predict(data[:, :-1]) == data[:, -1]).sum())/alg.n_total)

Using the nltk and svgling packages, we can plot the optimal Decision Tree via the code below. $\color{red}{\textsf{Please note that if you do not see the figures, it is probably due to a contrast issue and you should set a light theme for Github.}}$

tree = alg.plot_tree()
svgling.draw_tree(tree)

If we are only interested in the tree structure regardless of the predicted classes at the leaves, we can set show_classes=False in the plot_tree method.

tree = alg.plot_tree(show_classes=False)
svgling.draw_tree(tree)

Here are some more examples of optimal Decision Trees we find for different problems.

The tutorial file src/tutorial.ipynb contains more examples on how to use Branches, especially with its micro-optimisation techniques that allow for significant computational gains.

Empirical Evaluation

Branches optimises the regularised accuracy $\mathcal{H}_{\lambda}\left( T\right) = \textrm{Accuracy}\left( T\right) - \lambda \mathcal{S}\left( T\right)$, where $\mathcal{S}\left( T\right)$ is the number of splits (internal nodes) of Decision Tree $T$ and $\lambda \in \left[ 0, 1 \right]$ is a penalty parameter. The table below summarises the empirical comparison between Branches and the state of the art. $\mathcal{T}$ is the execution time in seconds (TO indicates time out after 5 minutes), and $\mathcal{I}$ the number of iterations, ___ indicates the inapplicability of the method to the setting. Branches clearly outperforms the Python implementations OSDT and PyGOSDT. Branches also outperforms the C++ implementation GOSDT in many cases, and even when it is slower, Branches always converges in significantly fewer iterations. A future C++ implementation of Branches will likely lead to a significant improvement of Branches' computational performance, just as we notice when comparing PyGOSDT and GOSDT.

OSDT PyGOSDT GOSDT Branches
Dataset $\mathcal{H}_\lambda$ $\mathcal{T}$ $\mathcal{I}$ $\mathcal{H}_\lambda$ $\mathcal{T}$ $\mathcal{I}$ $\mathcal{H}_\lambda$ $\mathcal{T}$ $\mathcal{I}$ $\mathcal{H}_\lambda$ $\mathcal{T}$ $\mathcal{I}$
monk1-l $0.93$ $71$ $2\mathrm{e}6$ $0.93$ $181$ $3\mathrm{e}6$ $0.93$ $0.71$ $3\mathrm{e}4$ $0.93$ $0.11$ $617$
monk1-f $0.97$ TO $2\mathrm{e}4$ $0.97$ TO $2\mathrm{e}3$ $0.983$ $4.02$ $9\mathrm{e}4$ $0.983$ $1.31$ $1\mathrm{e}4$
monk1-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.9$ $0.02$ $64$
monk2-l $0.95$ TO $7\mathrm{e}4$ $0.95$ TO $400$ $0.97$ $10$ $1\mathrm{e}5$ $0.97$ $2.8$ $3\mathrm{e}4$
monk2-f $0.90$ TO $4e4$ $0.90$ TO $3\mathrm{e}4$ $0.93$ $11.1$ $1\mathrm{e}5$ $0.93$ $5.9$ $7\mathrm{e}4$
monk2-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.95$ $0.14$ $1\mathrm{e}3$
monk3-l $0.979$ TO $593$ $0.979$ TO $123$ $0.981$ $7.38$ $8\mathrm{e}4$ $0.981$ $1.20$ $9\mathrm{e}3$
monk3-f $0.975$ TO $1\mathrm{e}4$ $0.973$ TO $9\mathrm{e}3$ $0.983$ $2.13$ $5\mathrm{e}4$ $0.983$ $1.14$ $9\mathrm{e}3$
monk3-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.987$ $0.04$ $156$
tic-tac-toe $0.765$ TO $40$ $0.808$ TO $37$ $0.850$ $41$ $1.6\mathrm{e}6$ $0.850$ $68$ $2.6\mathrm{e}5$
tic-tac-toe-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.832$ $0.95$ $3479$
car-eval ___ ___ ___ ___ ___ ___ $0.799$ $18$ $9\mathrm{e}5$ $0.799$ $62$ $3\mathrm{e}5$
car-eval-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.812$ $0.11$ $632$
nursery ___ ___ ___ ___ ___ ___ $0.755$ TO $9\mathrm{e}5$ $0.820$ TO $3\mathrm{e}5$
nursery-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.822$ $0.34$ $244$
mushroom $0.945$ TO $4\mathrm{e}6$ $0.945$ TO $2\mathrm{e}6$ $0.925$ TO $1e6$ $0.938$ TO $2\mathrm{e}4$
mushroom-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.975$ $0.17$ $6$
kr-vs-kp $0.900$ TO $6\mathrm{e}4$ $0.900$ TO $2\mathrm{e}4$ $0.815$ TO $4\mathrm{e}5$ $0.900$ TO $8\mathrm{e}4$
kr-vskp-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.900$ TO $8\mathrm{e}4$
zoo ___ ___ ___ ___ ___ ___ $0.992$ $34$ $3e5$ $0.992$ $15$ $3\mathrm{e}4$
zoo-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.993$ $0.94$ $1456$
lymph ___ ___ ___ ___ ___ ___ $0.784$ TO $1\mathrm{e}6$ $0.808$ TO $1\mathrm{e}5$
lymph-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.852$ $18$ $2\mathrm{e}4$
balance $0.693$ TO $1\mathrm{e}5$ $0.693$ TO $3\mathrm{e}4$ $0.693$ $21$ $1\mathrm{e}6$ $0.693$ $62$ $3\mathrm{e}5$
balance-o ___ ___ ___ ___ ___ ___ ___ ___ ___ $0.661$ $0.02$ $130$

About

The Branches algorithm, fast Dynamic Programming and Branch and Bound search for seeking optimal Decision Trees

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published