gradience

gradience is a tool for analyzing a language model during backpropagation. It uses PyTorch hooks, so it can be applied to any architecture that contains a single embedding layer. It automatically attaches a hook to each module in the architecture, and each hook collects average statistics of the gradient for each word: L1 norm, L2 norm, Fano factor, mean, magnitude, range, and median. Arbitrary batch sizes are supported, as long as both the input and the output have dimensions batch_size x sequence_length.
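
To make the mechanism concrete, here is a minimal sketch of how per-word gradient statistics can be collected with PyTorch hooks. This is an illustration of the general technique, not gradience's actual internals: a forward hook records the token ids seen by the embedding layer and registers a tensor hook on its output, which fires during loss.backward() with the gradient at every token position. The module shapes, the L2-norm statistic, and the Fano factor computation (variance divided by mean) below are assumptions for demonstration.

import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=100, embedding_dim=8)
per_word_norms = {}  # word id -> list of gradient L2 norms observed for that word

def capture_gradients(module, inputs, output):
    token_ids = inputs[0]  # batch_size x sequence_length

    def on_grad(grad):
        # grad: gradient w.r.t. the embedding output,
        # shape batch_size x sequence_length x embedding_dim
        norms = grad.norm(p=2, dim=-1)  # per-token L2 norm
        for word, norm in zip(token_ids.flatten().tolist(),
                              norms.flatten().tolist()):
            per_word_norms.setdefault(word, []).append(norm)

    output.register_hook(on_grad)  # fires during backpropagation

handle = embedding.register_forward_hook(capture_gradients)

tokens = torch.randint(0, 100, (2, 5))  # batch_size x sequence_length
loss = embedding(tokens).sum()          # stand-in for a real training loss
loss.backward()

# Example statistic: the Fano factor (variance / mean) of each word's norms.
for word, values in per_word_norms.items():
    t = torch.tensor(values)
    fano = t.var(unbiased=False) / t.mean()

handle.remove()

gradience wraps this pattern up for every module in the model, so you never register hooks by hand; the Usage section below shows the actual API.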

Usage

from gradient_analyzer import GradientAnalyzer

...

# Wrap the model and choose which gradient statistics to collect.
analyzer = GradientAnalyzer(model, l1=True, l2=True, variance=True)
analyzer.add_hooks_to_model()

for (input, output) in corpus:
    # Tell the analyzer which words occupy each position in this batch.
    analyzer.set_word_sequence(input, output)
    # training code here
    ...
    loss.backward()  # the hooks collect statistics during backpropagation
    ...

# Write per-word statistics to a CSV and reset the accumulators.
analyzer.compute_and_clear(idx2word, "outfile.csv")
analyzer.remove_hooks()
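
For concreteness, here is one way the snippet above might be embedded in a complete training loop. Only the GradientAnalyzer calls mirror the usage shown; the toy model, synthetic corpus, idx2word vocabulary, and loss setup are illustrative assumptions, not part of gradience.

import torch
import torch.nn as nn
from gradient_analyzer import GradientAnalyzer

# Hypothetical toy language model: one embedding layer plus a projection.
vocab_size, emb_dim = 50, 16
model = nn.Sequential(nn.Embedding(vocab_size, emb_dim),
                      nn.Linear(emb_dim, vocab_size))
idx2word = {i: f"word{i}" for i in range(vocab_size)}  # hypothetical vocabulary
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Each corpus item is (input, output), both batch_size x sequence_length.
corpus = [(torch.randint(0, vocab_size, (4, 10)),
           torch.randint(0, vocab_size, (4, 10))) for _ in range(3)]

analyzer = GradientAnalyzer(model, l1=True, l2=True, variance=True)
analyzer.add_hooks_to_model()

for input, output in corpus:
    analyzer.set_word_sequence(input, output)
    optimizer.zero_grad()
    logits = model(input)  # batch_size x sequence_length x vocab_size
    loss = criterion(logits.reshape(-1, vocab_size), output.reshape(-1))
    loss.backward()        # hooks record per-word gradient statistics here
    optimizer.step()

analyzer.compute_and_clear(idx2word, "outfile.csv")
analyzer.remove_hooks()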
