
Mini-Attention

A Keras Hierarchical Attention Layer for Document Classification in NLP 🤖

This library is an implementation of Hierarchical Attention Networks for Document Classification (Yang et al., 2016). It is compatible with Keras and TensorFlow (Keras version >= 2.0.6). As the paper suggests, it uses a hierarchical attention mechanism, and the capabilities of the Word Encoder (including a bidirectional recurrent unit, GRU), Sentence Attention, and Document Classification are addressed.

Dependencies

Tensorflow

Keras

Usability

The library and the layer are compatible with TensorFlow and Keras. Installation is carried out using pip as follows:

pip install MiniAttention==0.1

To use it inside a Jupyter Notebook or a Python IDE (along with Keras layers):

import MiniAttention.MiniAttention as MA

The layer takes as input a 3D tensor with dimensions (sample_size, steps, features) and produces as output a 2D tensor with dimensions (sample_size, features). The layer can be used after the keras.layers.Embedding layer to provide global attention using the features and embedding weights. Additionally, pretrained embeddings (GloVe, Word2Vec, ELMo) can be used in the Embedding layer. It can also be used before and between LSTM/Bidirectional LSTM/GRU and other recurrent layers. In this context, it caters to both the Sequential and functional (Model) architectures in Keras. The functional (keras.models.Model) version is provided as follows:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Embedding, LSTM, Bidirectional, Dense
import MiniAttention.MiniAttention as MA

max_features = 20000  # vocabulary size (example value)
inp_shape = 100       # input sequence length (example value)

inp = Input(shape=(inp_shape,))
z = Embedding(max_features, 256)(inp)
z = MA.MiniAttentionBlock(keras.initializers.he_uniform, None, None, keras.regularizers.L2(l2=0.02), None, None, None, None, None)(z)
z = Bidirectional(LSTM(128, recurrent_activation="relu", return_sequences=True))(z)
z = Bidirectional(LSTM(64, recurrent_activation="relu", return_sequences=True))(z)
z = MA.MiniAttentionBlock(keras.initializers.he_uniform, None, None, keras.regularizers.L2(l2=0.02), None, None, None, None, None)(z)
z = Dense(64, activation="relu")(z)
z = Dense(64, activation="relu")(z)
z = Dense(1, activation="sigmoid")(z)
model = keras.models.Model(inputs=inp, outputs=z)
model.compile(loss="binary_crossentropy", metrics=["accuracy"], optimizer=keras.optimizers.Adagrad(learning_rate=1e-3))
model.summary()

For a Sequential model (keras.models.Sequential):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
import MiniAttention.MiniAttention as MA

max_features = 20000  # vocabulary size (example value)

model = Sequential()
model.add(Embedding(max_features, 128, input_shape=(100,)))
model.add(MA.MiniAttentionBlock(None, None, None, None, None, None, None, None, None))
model.add(LSTM(128))
model.add(Dense(8, activation='relu'))
model.add(Dense(4, activation='sigmoid'))
model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='Adagrad')
model.summary()

The arguments for the MiniAttentionBlock class (see the annotated instantiation after this list) include:

1. W_init: Weight Initializer - compatible with keras.initializers
2. b_init: Bias Initializer - compatible with keras.initializers
3. u_init: Output Initializer - compatible with keras.initializers
4. W_reg: Weight Regularizer - compatible with keras.regularizers
5. b_reg: Bias Regularizer - compatible with keras.regularizers
6. u_reg: Output Regularizer - compatible with keras.regularizers
7. W_const: Weight Constraint - compatible with keras.constraints
8. b_const: Bias Constraint - compatible with keras.constraints
9. u_const: Output Constraint - compatible with keras.constraints
10. bias: Boolean - True/False - whether to use bias (optional)
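As a quick reference, here is an annotated instantiation, assuming the constructor takes the nine initializer/regularizer/constraint arguments positionally in the order listed above (as the examples in this README suggest):

from tensorflow import keras
import MiniAttention.MiniAttention as MA

block = MA.MiniAttentionBlock(
    keras.initializers.he_uniform,   # W_init: weight initializer
    None,                            # b_init: bias initializer
    None,                            # u_init: output initializer
    keras.regularizers.L2(l2=0.02),  # W_reg: weight regularizer
    None,                            # b_reg: bias regularizer
    None,                            # u_reg: output regularizer
    None,                            # W_const: weight constraint
    None,                            # b_const: bias constraint
    None,                            # u_const: output constraint
)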

There are three main methods inside the MiniAttentionBlock class. The "init" method initializes the weight and bias tensors for computation. The "attention_block" method assigns the variables (tensors) and checks the input tensor size. The "build_nomask" method computes the attention modules; it uses tanh as the internal activation function with exponential normalization. Masking has not been added to the library yet.
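For orientation, here is a minimal sketch of the attention computation described in the paper (a tanh projection, softmax-style exponential normalization, and a weighted sum over the steps axis); this illustrates the technique and is not the library's exact implementation:

import tensorflow as tf

def attention_sketch(h, W, b, u_s):
    # h: hidden states of shape (sample_size, steps, features)
    # u_t = tanh(W h_t + b): hidden representation of each step
    u = tf.tanh(tf.matmul(h, W) + b)
    # alpha_t = softmax(u_t . u_s): normalized importance of each step
    scores = tf.reduce_sum(u * u_s, axis=-1)
    alpha = tf.nn.softmax(scores, axis=-1)
    # the weighted sum collapses (sample_size, steps, features)
    # down to (sample_size, features), matching the shape contract above
    return tf.reduce_sum(h * tf.expand_dims(alpha, -1), axis=1)

# Example: 4 samples, 10 steps, 8 features -> output shape (4, 8)
h = tf.random.normal((4, 10, 8))
W = tf.random.normal((8, 8))
b = tf.zeros((8,))
u_s = tf.random.normal((8,))
out = attention_sketch(h, W, b, u_s)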

Example

For reference on how to use the library, a Jupyter Notebook sample is present in the repository: "MiniAttention_on_IMDB.ipynb".

This sample uses the layer together with the keras.layers.Embedding layer for IMDB binary classification. It uses the default Keras embedding, following the official tutorial in the Keras docs. Alternatively, "Tensorboard-tfds-IMDB.py" contains a TensorBoard demonstration.
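A minimal sketch of feeding the IMDB dataset into either model built above (the hyperparameter values here are illustrative and not taken from the notebook):

from tensorflow import keras
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_features = 20000  # vocabulary size (example value)
maxlen = 100          # pad/truncate reviews to this length

(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=max_features)
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)

# 'model' is either of the compiled models constructed above
model.fit(x_train, y_train, batch_size=32, epochs=3, validation_data=(x_test, y_test))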

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT