Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Inference API] Add Google Vertex AI as provider for text_embedding task type #110090

Conversation

timgrein
Copy link
Contributor

This PR adds support for Google Vertex AI embeddings (single embedding and batch embeddings + chunked inference support) through the Google Vertex AI Get Text Embeddings API.

Creating a text_embedding inference endpoint using googlevertexai as service provider:

PUT {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding

{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": "<service_account_json>",
        "model_id": "<model_id>",
        "location": "<location>",
        "project_id": "<project_id>"
    }
}

Creating a single embedding using a single string as input:

POST {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding

{
    "input": "Embed this text"
}

Creating a single embedding using a list with a single string as input:

POST {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding
{
    "input": ["Embed this text"]
}

Creating multiple embeddings with a list of multiple strings as input:

POST {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding

{
    "input": [
        "Embed this text",
        "Embed this text, too",
        "This text should also be embedded"
    ]
}

@timgrein timgrein added >non-issue :ml Machine learning Team:ML Meta label for the ML team v8.15.0 labels Jun 24, 2024
@timgrein timgrein requested a review from a team as a code owner June 24, 2024 12:14
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@@ -38,6 +38,169 @@ dependencies {
clusterPlugins project(':x-pack:plugin:inference:qa:test-service-plugin')

api "com.ibm.icu:icu4j:${versions.icu4j}"

runtimeOnly 'com.google.guava:guava:32.0.1-jre'
api 'com.google.code.gson:gson:2.10'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do those dependencies all need to be api? It seems implementation would be more appropriate? I wrote about this here https://groups.google.com/a/elastic.co/g/es/c/Do0hGs7ooJI/m/6nHB9r5NAwAJ a while ago

Copy link
Contributor Author

@timgrein timgrein Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@breskeby Thank you for sharing! Adjusted it and seems to work fine with Use implementation instead of api command for 3rd party libraries :)

Copy link
Contributor

@jonathan-buttner jonathan-buttner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, don't forget to add the new service, task, and secret settings to the registry.

List<String> AUTH_SCOPE = Collections.singletonList("https://www.googleapis.com/auth/cloud-platform");

static void decorateWithBearerToken(HttpPost httpPost, GoogleVertexAiSecretSettings secretSettings) {
AccessController.doPrivileged((PrivilegedAction<Void>) () -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi: I've adjusted the change to use AccessController again and add an explicit SpecialPermission.check() with Use AccessController and add special permission check.

For some reason I get a permission denied error again using SocketAccess, which is confusing as now the code is basically the same and works? Looking through the codebase I see a lot of replications of the SocketAccess class, maybe it has something to do with some module magic I'm not aware of? Appreciating any hints!

I suggest we keep it like this for now, if we don't find a solution soonish. I'll create an issue and revisit it afterwards.

Copy link
Contributor Author

@timgrein timgrein Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be the case, because the x-pack core plugin misses permission java.lang.RuntimePermission "accessDeclaredMembers", which GoogleCredentials needs and the security manager uses the intersection of the two permission sets of the core and inference plugin?

Update: I think that's the actually the case. So we have two options:

  • Keep it like this for now
  • Create our own SocketAccess class, which will implicitly also grant access for GoogleCredentials through our security policy (which feels weird as it has nothing to do with SocketAccess, but reflection?)

I would keep it like this for now, otherwise we need a duplicated class, which also kinda hides the reflection detail. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, leaving it like you have is fine 👍

* or null if the response does not contain the `error.message` field
*/
@SuppressWarnings("unchecked")
public static GoogleVertexAiErrorResponseEntity fromResponse(HttpResult response) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@timgrein timgrein Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought about this (specifically code duplication vs resiliency trade-off). It could be a little bit brittle IMHO to rely on the APIs from very different providers never changing and with separate classes I guess we're in a better spot to adapt more easily. WDYT? No super strong opinion here, just a gut feeling

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah leaving it in is fine 👍


public static final String NAME = "google_vertex_ai_embeddings_service_settings";

public static final String DIMENSIONS_SET_BY_USER = "dimensions_set_by_user";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this field? This would be needed if the user can set the dimensions in the request to google. I could have missed it but I didn't see that in the google rest docs here: https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings

Copy link
Contributor Author

@timgrein timgrein Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newer models, which are currently pre-GA will allow it, which you actually don't find in the english docs, but in the german docs 🤦‍♂️ . I've an additional issue to add the outputDimensionality to the request, but wanted to keep it separate first to check, whether we already want to support this for pre-GA models.

It's also announced here https://cloud.google.com/blog/products/ai-machine-learning/google-cloud-announces-new-text-embedding-models.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidkyle what do you think? I suppose we could just not document it as a supported option until we need to pass it through the request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that GoogleVertexAiService::updateModelWithEmbeddingDetails will error if the user set dimensions does not match the embedding size I think we should allow this knowing that new models which support the option are coming.

validationException
);
SimilarityMeasure similarityMeasure = extractSimilarity(map, ModelConfigurations.SERVICE_SETTINGS, validationException);
Integer dims = extractOptionalPositiveInteger(map, DIMENSIONS, ModelConfigurations.SERVICE_SETTINGS, validationException);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not consistent about how we do this but moving forward we should only allow parsing the dimensions field from the persistent parse context. The reason is that we shouldn't allow users to set the value (unless they can actually pass it through to the request like OpenAI). Instead we'll determine the dimensions by doing a request when the inference endpoint is first created.

So if this is the request context let's just default the dimensions to null.

Here's an example of how we did it recently for eland: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elasticsearch/CustomElandInternalTextEmbeddingServiceSettings.java#L51-L74

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's see what Dave thinks 👍

ModelConfigurations.SERVICE_SETTINGS,
validationException
);
SimilarityMeasure similarityMeasure = extractSimilarity(map, ModelConfigurations.SERVICE_SETTINGS, validationException);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're going to need to default this, it probably makes sense to do it in this file (I know previous we've done that in service file after we do an embeddings request to determine the dimensions). It's probably more straightforward to do it here because we'll always have a similarity and element type defined and we can avoid the readOptional* writeOptional* calls.

Here's an example from eland: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elasticsearch/CustomElandInternalTextEmbeddingServiceSettings.java#L100-L101

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it from GoogleVertexAiService and added it here: Set default similarity measure in service settings

webServer.close();
}

// Successful case tested via end-to-end notebook tests in AppEx repo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@timgrein timgrein requested a review from breskeby June 25, 2024 08:58
@timgrein
Copy link
Contributor Author

Looking good, don't forget to add the new service, task, and secret settings to the registry.

Thanks! Added via Add Google Vertex AI named writeables to InferenceNamedWriteablesProv…

Copy link
Contributor

@jonathan-buttner jonathan-buttner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

return null;
});
} catch (Exception e) {
ValidationException validationException = new ValidationException(e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm should this be validation error or an ElasticsearchException? Or maybe an status exception 🤔 with a permissions denied.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@breskeby breskeby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. thanks!

@timgrein
Copy link
Contributor Author

@elasticmachine update branch

@timgrein timgrein merged commit dd3e73e into elastic:main Jun 26, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:ml Machine learning >non-issue Team:ML Meta label for the ML team v8.15.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants