gRPC Vision Transformer worker

A Python gRPC worker serving a Vision Transformer model, vit-base-patch16-224, available on Hugging Face 🤗

What is a Vision Transformer?

The Vision Transformer, or ViT, is an image classification model that applies a Transformer-like architecture to patches of the image. The image is split into fixed-size patches, each patch is linearly embedded, position embeddings are added, and the resulting sequence of vectors is fed to a standard Transformer encoder. To perform classification, an extra learnable "classification token" is added to the sequence, which is the standard approach.
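To make the patch step concrete, here is a minimal sketch in plain Python. It only illustrates splitting an image into flattened patches and counting the resulting tokens; it does not implement the linear embedding, the position embeddings, or the encoder, and it is not the code used by this worker. The helper names `split_into_patches` and `sequence_length` are hypothetical. The sizes come from the model name: 16 x 16 patches on a 224 x 224 input.

```python
# Illustrative sketch only: plain-Python patch splitting, not the model's real code.

def split_into_patches(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened square patches.

    Returns the patches in row-major order; each patch is a flat list of
    patch_size * patch_size * C values.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the pixels of one patch, row by row, channel by channel.
            patch = [
                channel
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for channel in pixel
            ]
            patches.append(patch)
    return patches


def sequence_length(image_size, patch_size):
    """Number of tokens fed to the encoder: one per patch plus the class token."""
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


# A 224 x 224 RGB image of zeros, matching the sizes in "vit-base-patch16-224".
image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
patches = split_into_patches(image, patch_size=16)

print(len(patches))              # 196 patches (14 x 14)
print(len(patches[0]))           # 768 values per patch (16 * 16 * 3)
print(sequence_length(224, 16))  # 197 tokens including the class token
```

In the real model, each flattened patch would then be projected by a learned linear layer and summed with a position embedding before entering the encoder.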

For additional resources, please refer to Papers with Code.

Run the project

git clone git@github.com:rm-rfred/vit-worker.git
cd vit-worker

docker-compose build
docker-compose up -d

Config files

image_classification_pb2.py and image_classification_pb2_grpc.py were generated by running:

bash run_protoc.sh
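The contents of run_protoc.sh are not shown here. As a rough sketch of what such a script typically does, the same stubs can be generated from Python with the `grpcio-tools` package. The proto file name `image_classification.proto` is inferred from the generated module names, and its location in the current directory is an assumption; the helper names `build_protoc_args` and `generate_stubs` are hypothetical.

```python
def build_protoc_args(proto_file, proto_dir=".", out_dir="."):
    """Build the argument list for grpc_tools.protoc (paths are assumptions)."""
    return [
        "grpc_tools.protoc",             # argv[0] placeholder, ignored by protoc
        f"-I{proto_dir}",                # where to look for .proto files
        f"--python_out={out_dir}",       # writes <name>_pb2.py (messages)
        f"--grpc_python_out={out_dir}",  # writes <name>_pb2_grpc.py (service stubs)
        proto_file,
    ]


def generate_stubs(proto_file="image_classification.proto"):
    """Run the protobuf compiler; requires `pip install grpcio-tools`."""
    from grpc_tools import protoc  # imported lazily so the helper above has no dependency

    # protoc.main returns the compiler's exit code (0 on success).
    return protoc.main(build_protoc_args(proto_file))
```

Calling `generate_stubs()` from the directory that contains the proto file should produce the two modules listed above, provided the assumptions about the file name and location hold.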

Why gRPC instead of REST?

  • Higher performance in microservice architectures
  • Better handling of high-load APIs
  • Better suited to real-time and streaming applications

gRPC architecture example

[Figure: gRPC architecture diagram]

Dependencies

  • Docker version 24.0.6, build ed223bc
  • Docker Compose version v2.23.0
