
[BUG] Published API Token limit #439

Open
DTheunis opened this issue Jul 11, 2024 · 7 comments
Labels
bug: Something isn't working
needs investigation: Bug but cause is not identified.

Comments

@DTheunis
Contributor

Describe the bug

When doing a POST to a published API with a large message, we run into this error:
[ERROR] ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: #/texts/0: expected maxLength: 2048, actual: 8575, please reformat your input and try again.

To Reproduce

POST to a published API with a message longer than 2048 characters.

Additional context

Is there a solution or workaround for this? For example, I could split the message into chunks; however, I would like to send it all as one message so that I get one response to the full text.

@statefb
Contributor

statefb commented Jul 12, 2024

Related: #434

The SQS message size limit is 256 KiB. As you mentioned, we need to split the message into chunks or put it on external storage temporarily.
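
For reference, a minimal sketch of the splitting approach (the 2048-character chunk size comes from the validation error above; the helper name is hypothetical):

def split_into_chunks(text: str, max_chars: int = 2048) -> list[str]:
    # Naive fixed-width split; a real implementation might prefer to
    # break on sentence or paragraph boundaries.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]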

@statefb added the enhancement (New feature or request) and duplicate (This issue or pull request already exists) labels on Jul 12, 2024
@DTheunis
Contributor Author

No, sorry, it's a different issue.
The one you linked is indeed an SQS size issue; the issue I'm having is a character limit when invoking Bedrock.

I can send the exact same message through the application's chat interface, yet it fails when sent through the published API. Surely the published API and the chat interface should be able to use the same logic to post a chat message?

@statefb removed the duplicate (This issue or pull request already exists) label on Jul 16, 2024
@statefb
Contributor

statefb commented Jul 16, 2024

Sorry, I misunderstood. You're right that they should use the same logic, but there is an implementation difference: the published API uses the chat method, while the chat interface uses process_chat_input. If the same message that triggers the error were posted through the chat interface, it should produce the same error.
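
A hypothetical sketch of how two entry points can drift apart like this (only the names chat and process_chat_input come from this thread; the bodies and the validate_message helper are illustrative assumptions, not the project's actual code):

def process_chat_input(user_id: str, chat_input: dict):
    # Chat-interface path: the input may be validated or trimmed here
    # before the shared chat logic runs (illustrative assumption).
    validate_message(chat_input)  # hypothetical helper
    return chat(user_id=user_id, chat_input=chat_input)

def chat(user_id: str, chat_input: dict):
    # Published-API path (via the SQS consumer) enters here directly,
    # so anything done only in process_chat_input is skipped.
    ...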

@DTheunis
Contributor Author

DTheunis commented Jul 16, 2024

It does not generate the same error when you post the same message to the API and to the chat interface; see below.
When posting to the API, the SQS consumer tries to InvokeModel but hits the character limit error:

START RequestId: eed351e9-5e10-5791-a0c2-0ef9682d92ae Version: $LATEST
--
[INFO]	2024-07-16T08:49:18.020Z	eed351e9-5e10-5791-a0c2-0ef9682d92ae	Finding conversation: 01J2XBPNMK2MEXBJ723KRN3SFB
[INFO]	2024-07-16T08:49:18.674Z	eed351e9-5e10-5791-a0c2-0ef9682d92ae	No conversation found with id: 01J2XBPNMK2MEXBJ723KRN3SFB. Creating new conversation.
[INFO]	2024-07-16T08:49:18.674Z	eed351e9-5e10-5791-a0c2-0ef9682d92ae	Bot id is provided. Fetching bot.
LAMBDA_WARNING: Unhandled exception. The most likely cause is an issue in the function code. However, in rare cases, a Lambda runtime update can cause unexpected function behavior. For functions using managed runtimes, runtime updates can be triggered by a function change, or can be applied automatically. To determine if the runtime has been updated, check the runtime version in the INIT_START log entry. If this error correlates with a change in the runtime version, you may be able to mitigate this error by temporarily rolling back to the previous runtime version. For more information, see https://docs.aws.amazon.com/lambda/latest/dg/runtimes-update.html
[ERROR] ValidationException: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: #/texts/0: expected maxLength: 2048, actual: 4429, please reformat your input and try again.
Traceback (most recent call last):
  File "/var/task/app/sqs_consumer.py", line 27, in handler
    chat_result = chat(user_id=user_id, chat_input=chat_input)
  File "/var/task/app/usecases/chat.py", line 325, in chat
    search_results = search_related_docs(
  File "/var/task/app/vector_search.py", line 74, in search_related_docs
    query_embedding = calculate_query_embedding(query)
  File "/var/task/app/bedrock.py", line 206, in calculate_query_embedding
    response = client.invoke_model(
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
END RequestId: eed351e9-5e10-5791-a0c2-0ef9682d92ae
REPORT RequestId: eed351e9-5e10-5791-a0c2-0ef9682d92ae	
Duration: 3152.59 ms	Billed Duration: 3153 ms	Memory Size: 1024 MB	Max Memory Used: 200 MB
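
Reading the traceback, the failure is not in the chat model call itself: chat() runs search_related_docs(), which embeds the full user message via calculate_query_embedding(), and the embedding model rejects any single text longer than 2048 characters. A minimal sketch of a guard at that point, assuming the Bedrock Cohere Embed request format that the #/texts/0 error suggests (the truncation and the model ID are assumptions, not the project's code):

import json

import boto3

EMBED_MAX_CHARS = 2048  # per-text limit from the ValidationException above

client = boto3.client("bedrock-runtime")

def calculate_query_embedding(query: str) -> list[float]:
    # Truncating (or chunking) the query before invoke_model avoids the
    # maxLength validation error; losing the tail of the query is the trade-off.
    body = json.dumps({
        "texts": [query[:EMBED_MAX_CHARS]],
        "input_type": "search_query",
    })
    response = client.invoke_model(
        modelId="cohere.embed-multilingual-v3",  # assumed embedding model
        body=body,
    )
    return json.loads(response["body"].read())["embeddings"][0]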


When posting to the chat interface, it works; see the attached screenshot (claude_chat).

I've attached the text I used to test this (I had Claude generate a long text): dummy.txt

This is the code I use to post to the API:

import json
from typing import Optional

import requests

def send_to_api(chunk: str, conversation_id: Optional[str] = None):
    base_url = "https://XXXX.execute-api.eu-central-1.amazonaws.com/api"
    api_key = "XXXX"
    headers = {
        "x-api-key": api_key,
        "Content-Type": "application/json"
    }

    payload = {
        "message": {
            "content": [
                {
                    "contentType": "text",
                    "body": chunk
                }
            ],
            "model": "claude-v3.5-sonnet"
        },
        "continue_generate": False
    }

    if conversation_id:
        payload["conversation_id"] = conversation_id

    response = requests.post(f"{base_url}/conversation", headers=headers, json=payload)

    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"API request failed with status code {response.status_code}: {response.text}")

# Read the transcription from the file
with open('transcription.txt', 'r', encoding='utf-8') as file:
    transcription = file.read()

# Send the full transcription as a single message

conversation_id = None

result = send_to_api(transcription, conversation_id)
conversation_id = result.get("conversationId")
print("API Response:")
print(json.dumps(result, indent=2))

Please keep in mind that if you test this yourself, the API call will return a success, but in the CloudWatch logs for the SQS consumer you will see that there was an error.

It might also be something in the way I post to the API? I would love to hear a way to make this work, since we would like to use published APIs with longer texts.
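
For completeness, a chunked variant of the script above, reusing its send_to_api and transcription (the chunk size and threading every chunk through one conversation_id are assumptions; as noted in the opening post, this gives one response per chunk rather than one response to the whole text):

# Send the transcription in chunks, threading them through one conversation
chunks = [transcription[i:i + 2048] for i in range(0, len(transcription), 2048)]

conversation_id = None
for chunk in chunks:
    result = send_to_api(chunk, conversation_id)
    conversation_id = result.get("conversationId")

print(json.dumps(result, indent=2))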

@statefb
Contributor

statefb commented Jul 17, 2024

Thank you for the details. The log shows that the bot appears to be using Knowledge (i.e. RAG), but your chat screenshot does not appear to use Knowledge. Could you describe the reproduction procedure as concretely and precisely as possible?

@statefb removed the enhancement (New feature or request) label on Jul 17, 2024
@DTheunis
Contributor Author

DTheunis commented Jul 17, 2024

I pasted my code and procedure in my previous post; that is how I call the API of a published bot.
Run that code with a file 'transcription.txt' containing 3000+ characters and you will get this error.

Also, for clarification: both ways I showed use a bot with no Knowledge, only a custom instruction.

(editing in your own published bot API URL and API key)

So again:

2048+ characters text message -> Chat interface -> Bot with no knowledge (Custom instructions only) -> Response
2048+ characters text message -> Published API Post -> Bot with no knowledge (Custom instructions only) -> Error

@statefb added the bug (Something isn't working) and needs investigation (Bug but cause is not identified) labels on Jul 17, 2024
@DTheunis
Contributor Author

DTheunis commented Jul 25, 2024

Actually, I have an update:

Using a bot with Knowledge generates the same error as POSTing to the API of a bot without Knowledge.

Operation failed: An error occurred (ValidationException) when calling the InvokeModel operation: Malformed input request: #/texts/0: expected maxLength: 2048, actual: 294210, please reformat your input and try again.

So posting a long text to a bot without Knowledge works, but posting it to a bot with Knowledge does not.

Is the cause of this known? Is there a workaround, or is a fix on the roadmap?
