
0.1.6 / 2023-11-15

  • rationalize dependencies
  • update README to include the KenLMModel example (a usage sketch follows this list)
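
For context, here is a minimal usage sketch in the spirit of the README example mentioned above. This is a hedged sketch, assuming the module-level KenLMModel export added in 0.1.4 and a surprise() method analogous to the HuggingFace models; the model_path keyword and the model file path are placeholders, not a confirmed signature.

```python
# Hedged sketch of the KenLMModel usage referenced above.
# The model_path keyword and the .arpa path are placeholders/assumptions.
from surprisal import KenLMModel

sentences = [
    "The cat is on the mat",
    "The cat is on the hat",
]

# Load a pretrained KenLM n-gram language model (.arpa or binarized .bin).
k = KenLMModel(model_path="path/to/ngram_model.arpa")

# surprise() returns one surprisal result per input sentence.
for result in k.surprise(sentences):
    print(result)
```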

0.1.4 / 2023-11-14

  • version bump; release
  • Merge branch 'benlipkin/main' into main
  • add a simple surprisal computation test that passes as long as the full surprise() method executes; no assertions are made
  • implement attention mask for the use_bos_token case
  • Merge branch 'main' of github.com:benlipkin/surprisal into benlipkin/main
  • Merge branch 'main' of github.com:aalok-sathe/surprisal into main
  • Merge pull request #14 from aalok-sathe/feature-support-kenlm
  • OK, we have an MWE! still TODO: figure out the surprisal value: do we want that? maybe add an option to show it but default to disabling it? do we also want BOS?
  • bugfix in ids handling in CustomEncoding
  • bugfixes; bump numpy version for typing
  • make KenLMModel visible at the module level
  • add KenLM and NGramSurprisal implementations
  • move repr() to SurprisalArray rather than HuggingFaceSurprisal; complete the CustomEncoding implementation
  • actually no point subclassing from tokenizers.Encoding
  • flesh out the interface towards supporting CustomEncoding for custom-tokenized text, e.g. whitespace-tokenized input for KenLM
  • Update python-publish.yml: fix poetry install command
  • Update pylint.yml: fix poetry install command
  • add deps to publish action workspace using poetry
  • Update pylint.yml
  • Create python-publish.yml
  • Create pylint.yml
  • start writing the KenLM implementation
  • prefix with hf to indicate it works with HF-based tokenizer outputs
  • explicitly specify attention mask
  • fix concat on device
  • always pad right
  • add arg to trust remote code
  • require torch2+ to take advantage of optimizations
  • add support for fp16 and bf16 precision (see the sketch after this list)
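
To tie several of these items together (the use_bos_token attention-mask handling, the trust-remote-code argument, and fp16/bf16 precision support), here is a hedged sketch using the package's AutoHuggingFaceModel entry point. The precision and trust_remote_code keyword names are assumptions inferred from the commit messages above, not the confirmed API; consult the README for the exact parameters, and adapt if your installed version exposes a different loader.

```python
# Hedged sketch of HuggingFace-side usage touched by 0.1.4.
# precision= and trust_remote_code= are assumed keyword names, inferred
# from the commit messages above rather than taken from the actual API.
from surprisal import AutoHuggingFaceModel

sentences = ["The cat is on the mat"]

m = AutoHuggingFaceModel.from_pretrained(
    "gpt2",
    precision="fp16",        # assumed name for the fp16/bf16 option
    trust_remote_code=True,  # assumed pass-through to transformers
)
m.to("cuda")  # optional: move the model to GPU (the package now requires torch 2+)

# surprise() is the method exercised by the new smoke test; it computes
# surprisal for each sentence in the batch.
for result in m.surprise(sentences):
    print(result)
```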