esp32 picture to base64 to chatGPT_Client.vision_question #9

Open
robomaniac opened this issue Jun 7, 2024 · 3 comments

Comments

@robomaniac

I want to take a picture with an ESP32, convert it to Base64, and send it to ChatGPT.

I am having trouble converting the picture to Base64. Can you share your function?

I came up with a function, which I call like this:

const char* base64Image = Photo2Base64();
Serial.println(base64Image);

Then I pass base64Image to the function:

if (chatGPT_Client.vision_question("gpt-4o", "user", "text", "What’s in this image?", "image_url", base64Image, "auto", 500, true, result)) {
  Serial.print("[ChatGPT] Response: ");
  Serial.println(result);
} else {
  Serial.print("[ChatGPT] Error: ");
  Serial.println(result);
}

If I paste this string into an online Base64 tool, it does show my picture, but the tool also reports the errors below, which could explain why the ChatGPT API rejects the string and returns an error.

• Check the repair tool and convert your value to a valid Base64 string.
• The specified string is a data URI that contains the Base64 value.

This is the function that takes the picture and converts it to Base64:

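// Captures a frame and returns it as a heap-allocated data URI
// ("data:image/jpeg;base64,..."). The caller must free() the result.
const char* Photo2Base64() {
    camera_fb_t *fb = esp_camera_fb_get();
    if (!fb) {
      Serial.println("Camera capture failed");
      return NULL;
    }

    String base64String = "data:image/jpeg;base64,";
    char *input = (char *)fb->buf;
    char output[base64_enc_len(3) + 1];  // +1 for the null terminator
    for (size_t i = 0; i < fb->len; i += 3) {
      // The last chunk may hold fewer than 3 bytes; never read past fb->len
      size_t chunk = (fb->len - i < 3) ? (fb->len - i) : 3;
      base64_encode(output, input, chunk);
      output[base64_enc_len(chunk)] = '\0';  // terminate before appending
      input += chunk;
      base64String += output;
    }

    esp_camera_fb_return(fb);

    // Allocate memory for the C-style string
    char *cString = (char *)malloc(base64String.length() + 1);
    if (cString == NULL) {
        Serial.println("Failed to allocate memory");
        return NULL;
    }

    // Copy the contents of the String object to the C-style string
    strcpy(cString, base64String.c_str());

    return cString;
}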

My next step is to try other Base64 libraries until that online tool no longer reports an error.
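For comparing the output of different Base64 libraries, a small reference encoder can help. The following is a minimal sketch in portable C++ (no Arduino dependencies), not the code used by any particular library. The names `appendBase64` and `makeJpegDataUri` are hypothetical. It shows the two details that matter here: the final group of 1 or 2 bytes is padded with `=` rather than read past the end of the buffer, and the `data:image/jpeg;base64,` prefix is plain text placed in front of the encoded payload.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Standard Base64 alphabet (RFC 4648).
static const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encodes `len` bytes from `data` and appends the result to `out`.
void appendBase64(const uint8_t *data, size_t len, std::string &out) {
    size_t i = 0;
    // Each full 3-byte group becomes 4 characters.
    for (; i + 3 <= len; i += 3) {
        uint32_t v = ((uint32_t)data[i] << 16) |
                     ((uint32_t)data[i + 1] << 8) |
                     (uint32_t)data[i + 2];
        out += kAlphabet[(v >> 18) & 0x3F];
        out += kAlphabet[(v >> 12) & 0x3F];
        out += kAlphabet[(v >> 6) & 0x3F];
        out += kAlphabet[v & 0x3F];
    }
    // Tail of 1 or 2 bytes: pad with '=' instead of reading past the buffer.
    size_t rest = len - i;
    if (rest == 1) {
        uint32_t v = (uint32_t)data[i] << 16;
        out += kAlphabet[(v >> 18) & 0x3F];
        out += kAlphabet[(v >> 12) & 0x3F];
        out += "==";
    } else if (rest == 2) {
        uint32_t v = ((uint32_t)data[i] << 16) | ((uint32_t)data[i + 1] << 8);
        out += kAlphabet[(v >> 18) & 0x3F];
        out += kAlphabet[(v >> 12) & 0x3F];
        out += kAlphabet[(v >> 6) & 0x3F];
        out += '=';
    }
}

// Hypothetical helper: builds "data:image/jpeg;base64,<payload>".
std::string makeJpegDataUri(const uint8_t *data, size_t len) {
    std::string uri = "data:image/jpeg;base64,";
    uri.reserve(uri.size() + 4 * ((len + 2) / 3));
    appendBase64(data, len, uri);
    return uri;
}
```

Note that an online Base64 validator will likely flag any string that starts with `data:image/jpeg;base64,`, because the prefix itself is not Base64. Only the part after the comma should be pasted into such a tool.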

@0015
Owner

0015 commented Jun 10, 2024

@robomaniac Memory consumption peaks during the HTTP POST, because the captured image buffer (JPEG), the Base64-encoded string, and the HTTP POST body are all in memory at the same time.
This is why I used QCIF (176x144) as the FrameSize when I created a demo application with ESP32CAM. Try setting FrameSize to the minimum.
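To get a feel for the numbers, the three buffers listed above can be added up. This is a rough sketch in portable C++, not a measurement of the library. It assumes the POST body holds its own copy of the encoded string, and `jsonOverhead` is a made-up allowance for the JSON text around the image. The function names are hypothetical.

```cpp
#include <cstddef>

// Base64 turns every 3 input bytes into 4 output characters (with padding).
constexpr size_t base64Length(size_t rawBytes) {
    return 4 * ((rawBytes + 2) / 3);
}

// Rough peak heap estimate when the JPEG buffer, the encoded string, and
// a request body holding a copy of that string are all alive at once.
// `jsonOverhead` is an assumed allowance for the JSON around the image.
constexpr size_t estimatePeakBytes(size_t jpegBytes, size_t jsonOverhead) {
    return jpegBytes                                // captured frame buffer
         + base64Length(jpegBytes) + 1              // encoded string + terminator
         + base64Length(jpegBytes) + jsonOverhead;  // HTTP POST body
}
```

Under these assumptions, a 30,000-byte JPEG with 300 bytes of JSON overhead comes to 110,301 bytes, which is why a smaller frame size helps so much.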

@0015 0015 closed this as completed Jun 17, 2024
@robomaniac
Author

I did try making smaller images, but the issue is how I assemble the Base64 URL: how do you attach data:image/jpeg;base64, to the captured image and merge the two? strcpy does not cut it.

This is the JSON section in Python that works. I need to reproduce this JSON format in C.

payload = {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe the objects or elements that are in the foreground of this image?"
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/jpeg;base64,{base64_image}"
                        }
                    }
                ]
            }
        ],
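One way to reproduce the visible part of that payload without a JSON library is plain string concatenation. This is a minimal sketch in portable C++ and not how the ChatGPT client library builds its request. `jsonEscape` and `buildVisionPayload` are hypothetical names, and only the "model" and "messages" fields shown above are emitted. Any fields that follow in the original Python payload are left out.

```cpp
#include <string>

// Escapes the most common characters that must not appear raw in a JSON string.
// Other control characters are not handled in this sketch.
std::string jsonEscape(const std::string &s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;
        }
    }
    return out;
}

// Hypothetical helper: mirrors the "model" and "messages" fields above.
// `base64Image` is the bare Base64 text, without the data URI prefix.
std::string buildVisionPayload(const std::string &model,
                               const std::string &prompt,
                               const std::string &base64Image) {
    std::string body = "{\"model\":\"" + jsonEscape(model) + "\",";
    body += "\"messages\":[{\"role\":\"user\",\"content\":[";
    body += "{\"type\":\"text\",\"text\":\"" + jsonEscape(prompt) + "\"},";
    body += "{\"type\":\"image_url\",\"image_url\":{\"url\":\"data:image/jpeg;base64,";
    body += base64Image;  // the Base64 alphabet needs no JSON escaping
    body += "\"}}]}]}";
    return body;
}
```

The prefix and the encoded image end up in one JSON string value, which is the same thing the Python f-string does.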

@0015
Owner

0015 commented Jun 19, 2024

It worked like this.

const char *base64Prefix = "data:image/jpeg;base64,";

auto inputLength = fb->len;  // Get the length of the framebuffer data
size_t prefixLength = strlen(base64Prefix);

// Calculate the total length for the buffer
size_t totalLength = prefixLength + base64::encodeLength(inputLength);

// Allocate buffer to hold the concatenated string
char *output = new char[totalLength + 1];  // +1 for null terminator

// Copy the prefix into the output buffer
strcpy(output, base64Prefix);

// Encode the framebuffer data into the output buffer after the prefix
base64::encode((const uint8_t *)fb->buf, inputLength, output + prefixLength);

// Ensure the output buffer is null-terminated
output[totalLength] = '\0';

Serial.println(prompt);
if (chatGPT_Client.vision_question("gpt-4o", "user", "text", prompt, "image_url", output, "auto", 200, true, chatGPTresult)) {

  Serial.print("[ChatGPT] Response: ");
  Serial.println(chatGPTresult);
} else {
  Serial.print("[ChatGPT] Error: ");
  Serial.println(chatGPTresult);
}

delete[] output;

@0015 0015 reopened this Jun 19, 2024