SparseLinear is a PyTorch package that allows a user to create extremely wide and sparse linear layers efficiently. A sparsely connected network is a network where each node is connected to a fraction of available nodes. This differs from a fully connected network, where each node in one layer is connected to every node in the next layer.
Both the provided layer and the dynamic activation sparsity module are compatible with backpropagation. The sparse linear layer is initialized at a user-specified sparsity level, supports unstructured sparsity, and allows dynamic growth and pruning. We achieve this by building a linear layer on top of PyTorch Sparse, which provides optimized sparse matrix operations with autograd support in PyTorch.
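For intuition, here is a minimal sketch of the underlying idea (an illustration, not the package's actual internals): the stored nonzero values of a sparse COO weight matrix are trainable parameters, and gradients flow only to those entries.

```python
import torch

# Minimal sketch of the idea behind a sparse linear layer:
# only the nnz stored values are parameters; autograd flows through them.
values = torch.randn(4, requires_grad=True)   # trainable nonzeros
indices = torch.tensor([[0, 1, 2, 2],         # row (output) indices
                        [1, 0, 2, 3]])        # col (input) indices
weight = torch.sparse_coo_tensor(indices, values, (3, 4))
x = torch.randn(4, 1)
y = torch.sparse.mm(weight, x)                # sparse @ dense matmul
y.sum().backward()
print(values.grad.shape)                      # gradients only for the 4 nonzeros
```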
The default arguments initialize a sparse linear layer with random connections that applies a linear transformation to the incoming data, y = xA^T + b:
- `in_features` - size of each input sample
- `out_features` - size of each output sample
- `bias` - If set to `False`, the layer will not learn an additive bias. Default: `True`
- `sparsity` - sparsity of the weight matrix. Default: `0.9`
- `connectivity` - user-defined sparsity matrix. Default: `None`
- `small_world` - boolean flag to generate small-world sparsity. Default: `False`
- `dynamic` - boolean flag to dynamically change the network structure. Default: `False`
- `deltaT` - frequency of the growing and pruning update step. Default: `6000`
- `Tend` - stopping time for the growing and pruning algorithm update step. Default: `150000`
- `alpha` - f-decay parameter for cosine updates. Default: `0.1`
- `max_size` - maximum number of entries allowed before chunking occurs for small-world network generation and dynamic connections. Default: `1e8`
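For example, a layer can be constructed at a higher sparsity without a bias term (a hypothetical configuration using the arguments above):

```python
>>> import torch
>>> import sparselinear as sl
>>> m = sl.SparseLinear(1000, 2000, sparsity=0.95, bias=False)
>>> m(torch.randn(8, 1000)).shape
torch.Size([8, 2000])
```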
- Input: `(N, *, H_in)` where `*` means any number of additional dimensions and `H_in = in_features`
- Output: `(N, *, H_out)` where all but the last dimension are the same shape as the input and `H_out = out_features`
- `~SparseLinear.weight` - the learnable weights of the module of shape `(out_features, in_features)`. The values are initialized from 𝒰(−√k, √k), where k = 1/in_features
- `~SparseLinear.bias` - the learnable bias of the module of shape `(out_features)`. If `bias` is `True`, the values are initialized from 𝒰(−√k, √k), where k = 1/in_features
```python
>>> import torch
>>> import sparselinear as sl
>>> m = sl.SparseLinear(20, 30)
>>> input = torch.randn(128, 20)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 30])
```
The following customizations can also be made using the appropriate arguments:
One can choose to add self-defined static sparsity. The `connectivity` flag accepts a `(2, nnz)` LongTensor that represents the rows and columns of nonzero elements in the layer.
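For instance, assuming rows index outputs and columns index inputs (matching the `(out_features, in_features)` weight shape), a hypothetical hand-built pattern looks like:

```python
>>> # Hypothetical pattern: each of 3 outputs connects to two neighboring inputs
>>> rows = torch.tensor([0, 0, 1, 1, 2, 2])
>>> cols = torch.tensor([0, 1, 1, 2, 2, 3])
>>> connections = torch.stack((rows, cols))  # (2, nnz) LongTensor
>>> m = sl.SparseLinear(4, 3, connectivity=connections)
```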
The default static sparsity is random. With the `small_world` flag, one can instead use small-world sparsity, where connections are made distance-dependent to ensure small-world behavior. To enable it, set `small_world` to `True`.
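A minimal usage sketch (the layer sizes are illustrative):

```python
>>> # Small-world sparsity instead of the default random sparsity
>>> m = sl.SparseLinear(1024, 1024, small_world=True)
```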
Starting from a sparse configuration, the user can grow and prune units during training using this feature. The implementation is based on the Rigging the Lottery (RigL) algorithm. Set `dynamic` to `True` to dynamically alter the layer connections while training.
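A usage sketch with the scheduling arguments listed earlier (the values here are illustrative, not recommendations):

```python
>>> # Grow and prune connections every deltaT steps until step Tend
>>> m = sl.SparseLinear(1024, 1024, dynamic=True, deltaT=1000, Tend=50000, alpha=0.1)
```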
In addition, we provide a Dynamic Activation Sparsity module to utilize principled, per-layer activation sparsity. The implementation is based on the K-Winners strategy; a simplified sketch of the idea follows the shape listing below.
- `alpha` - constant used in updating the duty-cycle. Default: `0.1`
- `beta` - boosting factor for neurons not activated in the previous duty cycle. Default: `1.5`
- `act_sparsity` - fraction of the input used in calculating K for the K-Winners strategy. Default: `0.65`
- Input: `(N, *)` where `*` means any number of additional dimensions
- Output: `(N, *)`, same shape as the input
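The following simplified sketch illustrates the K-Winners idea referenced above (not the module's actual implementation): only the k largest activations per sample survive, and neurons with a low recent duty cycle are boosted so they have a chance to win.

```python
import torch

def k_winners_sketch(x, duty_cycle, k, beta=1.5):
    """Keep the k largest activations per sample; boost under-used neurons.

    A simplified illustration of the K-Winners strategy, not the
    package's ActivationSparsity implementation.
    """
    target_duty = k / x.shape[1]                      # desired win frequency
    boost = torch.exp(beta * (target_duty - duty_cycle))
    winners = (x * boost).topk(k, dim=1).indices      # k winners per sample
    mask = torch.zeros_like(x).scatter_(1, winners, 1.0)
    return x * mask                                   # zero out the losers

x = torch.randn(3, 10)
duty = torch.full((10,), 0.4)  # running fraction of steps each unit has won
print(k_winners_sketch(x, duty, k=4))
```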
Basic usage:

```python
>>> import torch
>>> from sparselinear import activationsparsity as asy  # assumed import path
>>> x = asy.ActivationSparsity(10)
>>> input = torch.randn(3,10)
>>> output = x(input)
```
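The constructor also accepts the arguments listed above; a hypothetical variation:

```python
>>> # Stronger boosting and a different activation sparsity fraction
>>> y = asy.ActivationSparsity(10, beta=2.0, act_sparsity=0.8)
```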
- First, install the PyTorch Sparse package by following its installation instructions.
- Then run `pip install sparselinear`
We provide a Jupyter notebook in this repository that demonstrates the basic functionalities of the sparse linear layer. We also show steps to train various models using the additional features of this package.