Sparsity-aware deep learning inference runtime for CPUs
nlp
performance
computer-vision
inference
machinelearning
pruning
object-detection
pretrained-models
quantization
cpus
onnx
sparsification
llm-inference
deepsparse
-
Updated
Jul 19, 2024 - Python