
Slow sparse quantized models #22

Open
clementpoiret opened this issue Apr 29, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@clementpoiret
Owner

Describe the bug

Even on CPUs with AVX-512 VNNI support, the sparse int8-quantized models run slower than expected.

To Reproduce
Steps to reproduce the behavior:

  1. Use bagging_sq or single_sq segmentations for inference
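A first sanity check for this report is to time both model variants directly. The sketch below is a generic timing harness, not HSF's actual API: `fp32_infer` and `int8_infer` are hypothetical stand-ins for the real FP32 and sparse int8 model calls.

```python
import time
import statistics

def benchmark(fn, warmup=3, runs=10):
    """Run a few warmup passes, then return the median latency in seconds."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return statistics.median(times)

# Hypothetical stand-ins for the actual model inference calls.
def fp32_infer():
    sum(i * i for i in range(10_000))

def int8_infer():
    sum(i * i for i in range(10_000))

print(f"fp32 median latency: {benchmark(fp32_infer):.6f}s")
print(f"int8 median latency: {benchmark(int8_infer):.6f}s")
```

With the real models substituted in, the int8 median should come out clearly lower than the FP32 one on a VNNI-capable CPU; in this issue it does not.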

Expected behavior

Inference with the sparse int8-quantized models should be faster than with the FP32 models.

Screenshots

N/A

Environment (please complete the following information):

  • OS: Ubuntu
  • Python: 3.8
  • HSF Version: 1.1.1
  • Relevant settings: segmentation=bagging_sq or segmentation=single_sq

Additional context
N/A

@clementpoiret clementpoiret added the bug Something isn't working label Apr 29, 2022
@clementpoiret
Owner Author

See neuralmagic/sparseml#733
