`float16`. However, there's some precision loss somewhere and generation doesn't work in `float16` mode yet. I'm looking into this and will keep you posted! Or take a look at this issue if you'd like to help: https://github.com/huggingface/swift-transformers/issues/95
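In the meantime, if you want to experiment, one way to sidestep the issue is to pin the Core ML conversion to float32 compute precision. Here's a minimal sketch, assuming the model is converted with coremltools; the toy module, tensor names, and shapes are illustrative, not from the actual conversion script:

```python
import torch
import coremltools as ct

# Toy stand-in for the real model (illustrative only).
class Tiny(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.softmax(x, dim=-1)

traced = torch.jit.trace(Tiny().eval(), torch.rand(1, 8))

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="logits", shape=(1, 8))],
    convert_to="mlprogram",
    # ML Program conversions default to float16 weights and compute;
    # force float32 until the precision loss is tracked down.
    compute_precision=ct.precision.FLOAT32,
)
mlmodel.save("Tiny.mlpackage")
```

Converting at float32 doubles the on-disk weight size compared to float16, so treat it as a stopgap rather than a fix.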