
PyTorch Serve: Custom handler not saving inference results #3031

Open
danial880 opened this issue Mar 19, 2024 · 6 comments
Labels
triaged Issue has been reviewed and triaged

Comments

@danial880

danial880 commented Mar 19, 2024

I asked this question on Stack Overflow but got no answer. The image posted with curl is never received on the local server, and no errors are logged. Here is the code:

Handler.py

import io
import cv2
import torch
import logging
import numpy as np
from PIL import Image
from typing import List
from ts.torch_handler.base_handler import BaseHandler
from GANModel import GAN
from LD import ldDetector
from fb import alc, pfb

# Define global variables for model paths
MODEL_PATH_FAC2 = 'landmarks.dat'
MODEL_PATH_RET = 'model.pth'
logger = logging.getLogger(__name__)



class ImageHandler(BaseHandler):
    def __init__(self):
        super(ImageHandler, self).__init__()
        self.initialized = False
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.fc_h = None
        self.fc_enh = None

    def initialize(self, context):
        logger.info("\n\n\n Initialized Successfully")
        self.fc_h = ldDetector(MODEL_PATH_FAC2)
        self.fc_enh = self.get_enhancement_model(self.device)
        self.initialized = True
        logger.info("Initialized Successfully \n\n\n")
    
    def get_enhancement_model(self, device):
        gan = GAN(device)

        ldnet = torch.load(MODEL_PATH_RET)
        name = 'params_ema' if 'params_ema' in ldnet else 'params'
        gan.load_state_dict(ldnet[name], strict=True)
        gan.eval()
        gan = gan.to(device)
        return gan

    def inference(self, image: Image.Image) -> torch.Tensor:
        logger.info("\n\n\n Inside Inference Function")
        logger.info("Type of received image (PIL Image) = %s", type(image))
        input_tensor = self.preprocess(image)
        output_tensor = self.pi(input_tensor)
        output_image = self.postprocess(output_tensor)
        logger.info("Type of Output (Tensor) = %s", type(output_image))
        return output_image

    def preprocess(self, data) -> Image.Image:
        logger.info("\n\n\n Inside preprocess Function")
        logger.info("Type of received data = %s", type(data))
        if isinstance(data, bytes):
            # If data is bytes, assume it's the raw image content
            image = Image.open(io.BytesIO(data))
        else:
            # If data is not bytes, assume it's a file path
            image = Image.open(data)

        print("Preprocessing complete.")
        return image

    def postprocess(self, output: torch.Tensor) -> Image.Image:
        output_np = output.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
        output_np = np.clip(output_np, 0, 255)
        # Convert NumPy array to PIL image
        output_image = Image.fromarray(output_np.astype(np.uint8))
        output_image.save("processed_output.jpg")
        return output_image

    def pi(self, input_tensor: torch.Tensor) -> torch.Tensor:
        restored_img, _, _ = self.poi(input_tensor)
        return restored_img

    def poi(self, input_tensor: torch.Tensor):
        img_np = input_tensor.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
        face_landmarks, _ = self.fc_h.get_face_landmarks(img_np)
        face_count = len(face_landmarks)
        restored_img = None
        for face_landmark in face_landmarks:
            cropped_face, inverse_affine = alc()
            restored_face = self.fc_enh(torch.from_numpy(cropped_face))
            restored_face = restored_face.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
            restored_img = pfb(img_np, restored_face, inverse_affine=inverse_affine)

        return torch.from_numpy(restored_img.transpose(2, 0, 1)).unsqueeze(0), [], []
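
Note that with the default BaseHandler.handle flow, preprocess does not receive raw bytes or a file path: TorchServe passes in a list of request dicts, with the payload under the "data" or "body" key, so the isinstance(data, bytes) branch above never fires for the list itself. Below is a minimal sketch of a preprocess written for that batched format (the .convert("RGB") call is my own addition, not from the original handler):

import io
from PIL import Image

def preprocess(self, data):
    """Unpack TorchServe's batched request format into PIL images."""
    images = []
    for row in data:
        # curl -T sends the raw file bytes; TorchServe places them
        # under "data" or "body" as a bytearray.
        payload = row.get("data") or row.get("body")
        images.append(Image.open(io.BytesIO(bytes(payload))).convert("RGB"))
    return images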

Command for creating the .mar:

torch-model-archiver --model-name facex --version 1.0 --model-file model.py --serialized-file model.pth --handler handler.py --extra-files landmarks.dat,GANModel.py,LD.py,Utils.py
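
As a quick sanity check, you can list the archive's contents to confirm the handler and extra files were actually packaged (the model_store/facex.mar path is an assumption; adjust it to wherever the archive was written):

import zipfile
# handler.py, GANModel.py, LD.py, Utils.py and landmarks.dat should all
# appear in this listing if the archive was built correctly.
print(zipfile.ZipFile("model_store/facex.mar").namelist())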

Command for running the server:

torchserve --ncs --start --model-store model_store --ts-config config.properties --models facex.mar

Command for inference:

curl -X POST http://127.0.0.1:8080/predictions/facex -T 0294.png
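
Worth noting: postprocess saves processed_output.jpg with a relative path, so the file ends up in whatever the worker process's current working directory happens to be, which is not necessarily where you are looking for it. A common alternative is to return the image bytes in the response and save them client-side; a hedged sketch, assuming the same tensor layout as in the handler above:

import io
import numpy as np
from PIL import Image

def postprocess(self, output):
    """Return the restored image as JPEG bytes; TorchServe expects one
    response entry per request in the batch."""
    output_np = output.squeeze(0).detach().cpu().numpy().transpose(1, 2, 0)
    image = Image.fromarray(np.clip(output_np, 0, 255).astype(np.uint8))
    buf = io.BytesIO()
    image.save(buf, format="JPEG")
    return [buf.getvalue()]

The client can then save the result with curl -X POST http://127.0.0.1:8080/predictions/facex -T 0294.png -o restored.jpg.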

config.properties

grpc_inference_port=7000
grpc_management_port=7001

ts_log.log

2024-03-19T00:28:13,602 [WARN ] main org.pytorch.serve.util.ConfigManager - Your torchserve instance can access any URL to load models. When deploying to production, make sure to limit the set of allowed_urls in config.properties
2024-03-19T00:28:13,605 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2024-03-19T00:28:13,646 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration
2024-03-19T00:28:13,754 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.10.0

Temp directory: /tmp

Number of GPUs: 1
Number of CPUs: 16
Max heap size: 7988 M

Config file: config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082

Initial Models: facex.mar

Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false

CPP log config: N/A
Model config: N/A
System metrics command: default
2024-03-19T00:28:13,767 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2024-03-19T00:28:13,794 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: facex.mar
2024-03-19T00:28:20,040 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model facex
2024-03-19T00:28:20,041 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model facex
2024-03-19T00:28:20,041 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model facex loaded.
2024-03-19T00:28:20,041 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: facex, count: 1
2024-03-19T00:28:20,047 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2024-03-19T00:28:20,047 [DEBUG] W-9000-facex_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline:
2024-03-19T00:28:20,082 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2024-03-19T00:28:20,082 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2024-03-19T00:28:20,082 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2024-03-19T00:28:20,083 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2024-03-19T00:28:20,083 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
@hungtrieu07

Can you try curl http://127.0.0.1:8080/ping? It will return status "Healthy" if your service is running smoothly, otherwise "Unhealthy".
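
For reference, a healthy instance responds with:

{
  "status": "Healthy"
}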

@hungtrieu07

This is my config.properties file; you can refer to it:

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
load_models=all
install_py_dep_per_model=true
model_store=model-store
models={\
  "FaceDetection": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "FaceDetection.mar",\
        "minWorkers": 1,\
        "maxWorkers": 1,\
        "batchSize": 256,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  },\
  "FaceExpression": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "FaceExpression.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 256,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  },\
  "FaceRecognition": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "FaceRecognition.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 256,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  },\
    "HumanPose": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "HumanPose.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 256,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  },\
  "ActionRecognition": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "ActionRecognition.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 256,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  }\
}

@namannandan
Collaborator

namannandan commented Mar 20, 2024

@danial880 Once you've started TorchServe and loaded the model, could you please try the following API to check that the model loaded and its workers started successfully:
curl http://127.0.0.1:8081/models/facex

Also, what happens when you run curl -X POST http://127.0.0.1:8080/predictions/facex -T 0294.png?
Does curl hang or show any error?
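
For reference, when the worker is up, the describe call returns JSON roughly of this shape (the field values here are illustrative, not from this deployment):

[
  {
    "modelName": "facex",
    "modelVersion": "1.0",
    "workers": [
      {
        "id": "9000",
        "status": "READY"
      }
    ]
  }
]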

@namannandan namannandan added the triaged Issue has been reviewed and triaged label Mar 20, 2024
@danial880
Author

danial880 commented Mar 21, 2024

Status

Screenshot from 2024-03-21 11-39-05

Model Info

Screenshot from 2024-03-21 11-42-11

Upon posting the image: the curl command executes with no error
Screenshot from 2024-03-21 11-50-25

@lxning
Collaborator

lxning commented Mar 26, 2024

@danial880 The posted log does not include any information about the inference. Can you please post the full log?

@danial880
Author

@lxning I have attached it as a txt file:
logs.txt
