[Inference API] Add Google Vertex AI as provider for text_embedding task type #110090

timgrein · 2024-06-24T12:14:42Z

This PR adds support for Google Vertex AI embeddings (single embedding and batch embeddings + chunked inference support) through the Google Vertex AI Get Text Embeddings API.

Creating a text_embedding inference endpoint using googlevertexai as the service provider:

PUT {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding

{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": "<service_account_json>",
        "model_id": "<model_id>",
        "location": "<location>",
        "project_id": "<project_id>"
    }
}
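
For readers who want to issue the request above from code, here is a minimal sketch that builds (but does not send) the PUT request with the JDK's java.net.http API. The class name, helper names, host, and sample values are assumptions made for illustration and are not part of this PR. The escaping helper is deliberately minimal and only covers backslashes, double quotes, and newlines.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the PR: builds the create-endpoint request.
class CreateVertexAiEndpointSketch {

    // Minimal escaping so a value (e.g. the service account JSON) can sit inside a JSON string.
    static String jsonString(String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
        return "\"" + escaped + "\"";
    }

    // Builds the body with the four service_settings fields shown in the PR description.
    static String body(String serviceAccountJson, String modelId, String location, String projectId) {
        return "{\"service\":\"googlevertexai\",\"service_settings\":{"
            + "\"service_account_json\":" + jsonString(serviceAccountJson) + ","
            + "\"model_id\":" + jsonString(modelId) + ","
            + "\"location\":" + jsonString(location) + ","
            + "\"project_id\":" + jsonString(projectId) + "}}";
    }

    // Builds the PUT request; sending it is left to the caller's HttpClient.
    static HttpRequest createEndpoint(String esHost, String endpointId, String body) {
        return HttpRequest.newBuilder(URI.create(esHost + "/_inference/text_embedding/" + endpointId))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        // All values below are placeholders.
        String body = body("{\"type\":\"service_account\"}", "my-model", "my-location", "my-project");
        HttpRequest request = createEndpoint("http://localhost:9200", "google_vertex_ai_embedding", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```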

Creating a single embedding using a single string as input:

POST {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding

{
    "input": "Embed this text"
}

Creating a single embedding using a list with a single string as input:

POST {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding

{
    "input": ["Embed this text"]
}

Creating multiple embeddings with a list of multiple strings as input:

POST {{ES_HOST}}/_inference/text_embedding/google_vertex_ai_embedding

{
    "input": [
        "Embed this text",
        "Embed this text, too",
        "This text should also be embedded"
    ]
}

elasticsearchmachine · 2024-06-24T12:15:06Z

Pinging @elastic/ml-core (Team:ML)

breskeby · 2024-06-24T13:59:39Z

x-pack/plugin/inference/build.gradle

@@ -38,6 +38,169 @@ dependencies {
 clusterPlugins project(':x-pack:plugin:inference:qa:test-service-plugin')

 api "com.ibm.icu:icu4j:${versions.icu4j}"
+
+ runtimeOnly 'com.google.guava:guava:32.0.1-jre'
+ api 'com.google.code.gson:gson:2.10'


Do those dependencies all need to be api? It seems implementation would be more appropriate. I wrote about this a while ago here: https://groups.google.com/a/elastic.co/g/es/c/Do0hGs7ooJI/m/6nHB9r5NAwAJ

@breskeby Thank you for sharing! Adjusted it in commit "Use implementation instead of api command for 3rd party libraries", and it seems to work fine :)

jonathan-buttner

Looking good, don't forget to add the new service, task, and secret settings to the registry.

jonathan-buttner · 2024-06-24T13:31:33Z

...org/elasticsearch/xpack/inference/external/request/googlevertexai/GoogleVertexAiRequest.java

+ List<String> AUTH_SCOPE = Collections.singletonList("https://www.googleapis.com/auth/cloud-platform");
+
+ static void decorateWithBearerToken(HttpPost httpPost, GoogleVertexAiSecretSettings secretSettings) {
+ AccessController.doPrivileged((PrivilegedAction<Void>) () -> {


Do you think we could use the SocketAccess wrapper here? https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/http/HttpClient.java#L109

Yes, good point! Adjusted in commit "Use SocketAccess instead of AccessController directly".

FYI: I've changed this back to use AccessController and added an explicit SpecialPermission.check() in commit "Use AccessController and add special permission check".

For some reason I get a permission denied error again when using SocketAccess, which is confusing because the code is now basically the same and works. Looking through the codebase I see a lot of copies of the SocketAccess class, so maybe it has something to do with some module magic I'm not aware of? Any hints are appreciated!

I suggest we keep it like this for now if we don't find a solution soon. I'll create an issue and revisit it afterwards.

Could it be that the x-pack core plugin lacks the permission java.lang.RuntimePermission "accessDeclaredMembers", which GoogleCredentials needs, and that the security manager uses the intersection of the permission sets of the core and inference plugins?

Update: I think that's actually the case. So we have two options:

1. Keep it like this for now.

2. Create our own SocketAccess class, which will implicitly also grant access for GoogleCredentials through our security policy (which feels weird, as it has nothing to do with socket access, but with reflection).

I would keep it like this for now. Otherwise we need a duplicated class, which also kind of hides the reflection detail. WDYT?

Ah ok, leaving it like you have is fine 👍
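
The intersection hypothesis discussed above can be illustrated with a simplified model: a permission check passes only if every code source that participates in the call holds the permission. The sketch below is not how the Java security manager is implemented. It only models permission sets as plain sets of names, and the class name and the "connect" permission are assumptions chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of permission checks across several code sources.
class PermissionIntersectionSketch {

    // The effective set is the intersection of the permission sets of all
    // code sources on the (modeled) call stack.
    static Set<String> effectivePermissions(List<Set<String>> domainsOnStack) {
        if (domainsOnStack.isEmpty()) {
            return Set.of();
        }
        Set<String> effective = new LinkedHashSet<>(domainsOnStack.get(0));
        for (Set<String> domain : domainsOnStack.subList(1, domainsOnStack.size())) {
            effective.retainAll(domain);
        }
        return effective;
    }

    public static void main(String[] args) {
        // Illustrative permission sets; the real policies are larger.
        Set<String> core = Set.of("connect");
        Set<String> inference = Set.of("connect", "accessDeclaredMembers");

        // Privileged block lives in core (shared SocketAccess): reflection permission is lost.
        System.out.println(effectivePermissions(List.of(core, inference)));
        // Privileged block lives in the inference plugin only: both permissions remain.
        System.out.println(effectivePermissions(List.of(inference)));
    }
}
```

Under this model, routing the call through a wrapper owned by the core plugin drops "accessDeclaredMembers", which matches the behaviour described in the comments above.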

jonathan-buttner · 2024-06-24T13:35:40Z

...arch/xpack/inference/external/response/googlevertexai/GoogleVertexAiErrorResponseEntity.java

+ * or null if the response does not contain the `error.message` field
+ */
+ @SuppressWarnings("unchecked")
+ public static GoogleVertexAiErrorResponseEntity fromResponse(HttpResult response) {


Since we're only using the message field, how do you feel about using the implementation here: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/response/AzureMistralOpenAiErrorResponseEntity.java#L54

I'm planning on renaming it in my PR: https://github.com/elastic/elasticsearch/pull/109893/files#diff-669128ea579af4136c4ff6995ee69accdc1c7963c1540f1b3efa8185bbf36d1a

I also thought about this (specifically the trade-off between code duplication and resiliency). IMHO it could be a little brittle to rely on the APIs of very different providers never changing, and with separate classes I guess we're in a better spot to adapt more easily. WDYT? No super strong opinion here, just a gut feeling.

Yeah leaving it in is fine 👍
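
As a rough illustration of what reading only the `error.message` field involves, here is a minimal sketch that works on an already-parsed response body. The class and method names are hypothetical, and the real GoogleVertexAiErrorResponseEntity.fromResponse takes an HttpResult and returns an entity rather than a String.

```java
import java.util.Map;

// Hypothetical sketch: extracts error.message from a parsed JSON body.
class VertexAiErrorMessageSketch {

    // Returns the value of error.message, or null if the field is missing
    // or does not have the expected shape.
    static String errorMessage(Map<String, Object> responseBody) {
        Object error = responseBody.get("error");
        if (error instanceof Map) {
            Object message = ((Map<?, ?>) error).get("message");
            if (message instanceof String) {
                return (String) message;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> withMessage = Map.of("error", Map.of("message", "quota exceeded"));
        Map<String, Object> withoutMessage = Map.of("error", Map.of("code", 429));
        System.out.println(errorMessage(withMessage));
        System.out.println(errorMessage(withoutMessage));
    }
}
```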

.../org/elasticsearch/xpack/inference/services/googlevertexai/GoogleVertexAiSecretSettings.java

...a/org/elasticsearch/xpack/inference/services/googlevertexai/GoogleVertexAiServiceFields.java

jonathan-buttner · 2024-06-24T14:02:02Z

...ck/inference/services/googlevertexai/embeddings/GoogleVertexAiEmbeddingsServiceSettings.java

+
+ public static final String NAME = "google_vertex_ai_embeddings_service_settings";
+
+ public static final String DIMENSIONS_SET_BY_USER = "dimensions_set_by_user";


Do we need this field? It would be needed if the user could set the dimensions in the request to Google. I could have missed it, but I didn't see that in the Google REST docs here: https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings

Newer models, which are currently pre-GA, will allow it. You won't find that in the English docs, only in the German docs 🤦‍♂️. I have a separate issue to add outputDimensionality to the request, but wanted to keep it separate for now to check whether we already want to support this for pre-GA models.

It's also announced here https://cloud.google.com/blog/products/ai-machine-learning/google-cloud-announces-new-text-embedding-models.

@davidkyle what do you think? I suppose we could just not document it as a supported option until we need to pass it through the request.

Given that GoogleVertexAiService::updateModelWithEmbeddingDetails will error if the user-set dimensions do not match the embedding size, I think we should allow this, knowing that new models which support the option are coming.
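
The safeguard mentioned above can be sketched as a simple comparison between the user-set dimensions and the size of an embedding returned by the service. This is not the code of updateModelWithEmbeddingDetails. The class name, method name, and exception type are assumptions for illustration only.

```java
// Hypothetical sketch of a dimensions check at endpoint creation time.
class DimensionsCheckSketch {

    // Returns the dimensions to store: the actual embedding size, after
    // verifying that any user-set value agrees with it.
    static int resolveDimensions(Integer userSetDimensions, int embeddingSize) {
        if (userSetDimensions != null && userSetDimensions != embeddingSize) {
            throw new IllegalArgumentException(
                "User-set dimensions [" + userSetDimensions + "] do not match embedding size [" + embeddingSize + "]"
            );
        }
        return embeddingSize;
    }

    public static void main(String[] args) {
        System.out.println(resolveDimensions(null, 768)); // no user value: use the embedding size
        System.out.println(resolveDimensions(768, 768));  // matching user value is accepted
        try {
            resolveDimensions(256, 768);                  // mismatch is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```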

jonathan-buttner · 2024-06-24T14:05:47Z

...ck/inference/services/googlevertexai/embeddings/GoogleVertexAiEmbeddingsServiceSettings.java

+ validationException
+ );
+ SimilarityMeasure similarityMeasure = extractSimilarity(map, ModelConfigurations.SERVICE_SETTINGS, validationException);
+ Integer dims = extractOptionalPositiveInteger(map, DIMENSIONS, ModelConfigurations.SERVICE_SETTINGS, validationException);


We're not consistent about how we do this, but moving forward we should only allow parsing the dimensions field from the persistent parse context. The reason is that we shouldn't allow users to set the value (unless they can actually pass it through to the request, like OpenAI). Instead, we'll determine the dimensions by doing a request when the inference endpoint is first created.

So if this is the request context let's just default the dimensions to null.

Here's an example of how we did it recently for eland: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elasticsearch/CustomElandInternalTextEmbeddingServiceSettings.java#L51-L74
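
The suggestion above (read dimensions only from persisted configuration and default to null for user requests) could look roughly like the following. This is a sketch of the reviewer's proposal, not the code that was merged. The enum, class, and method names are hypothetical stand-ins, and the settings map is assumed to be already parsed.

```java
import java.util.Map;

// Hypothetical sketch: dimensions are only honoured when parsing persisted settings.
class DimensionsParseSketch {

    // Stand-in for a parse context that distinguishes user requests from stored configs.
    enum ParseContext { REQUEST, PERSISTENT }

    static Integer parseDimensions(Map<String, Object> serviceSettings, ParseContext context) {
        if (context == ParseContext.REQUEST) {
            // Ignore any user-supplied value; it is determined later from a real embedding.
            return null;
        }
        Object dims = serviceSettings.get("dimensions");
        return dims instanceof Integer ? (Integer) dims : null;
    }

    public static void main(String[] args) {
        Map<String, Object> settings = Map.of("dimensions", 768);
        System.out.println(parseDimensions(settings, ParseContext.REQUEST));
        System.out.println(parseDimensions(settings, ParseContext.PERSISTENT));
    }
}
```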

Considering https://github.com/elastic/elasticsearch/pull/110090/files/ef907ae635f3ba0d47ee82f2f6605a58dc49b23d#r1651260492 we can keep it, right?

Yeah let's see what Dave thinks 👍

jonathan-buttner · 2024-06-24T14:13:43Z

...ck/inference/services/googlevertexai/embeddings/GoogleVertexAiEmbeddingsServiceSettings.java

+ ModelConfigurations.SERVICE_SETTINGS,
+ validationException
+ );
+ SimilarityMeasure similarityMeasure = extractSimilarity(map, ModelConfigurations.SERVICE_SETTINGS, validationException);


Since we're going to need to default this, it probably makes sense to do it in this file (I know we've previously done that in the service file, after we do an embeddings request to determine the dimensions). It's probably more straightforward to do it here because we'll always have a similarity and element type defined, and we can avoid the readOptional* and writeOptional* calls.

Here's an example from eland: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/elasticsearch/CustomElandInternalTextEmbeddingServiceSettings.java#L100-L101

Removed it from GoogleVertexAiService and added it here in commit "Set default similarity measure in service settings".
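
Defaulting at parse time, as discussed above, means the settings object never holds a null similarity. A minimal sketch follows. The enum, class name, and the particular default value are placeholders chosen for illustration and are not taken from the PR.

```java
// Hypothetical sketch: apply a default similarity while parsing service settings.
class SimilarityDefaultSketch {

    // Stand-in for the similarity measure enum.
    enum Similarity { COSINE, DOT_PRODUCT, L2_NORM }

    // Placeholder default; the value used in the PR is not shown in this thread.
    static final Similarity DEFAULT_SIMILARITY = Similarity.DOT_PRODUCT;

    // Called right after extracting the optional value from the settings map.
    static Similarity similarityOrDefault(Similarity parsed) {
        return parsed == null ? DEFAULT_SIMILARITY : parsed;
    }

    public static void main(String[] args) {
        System.out.println(similarityOrDefault(null));
        System.out.println(similarityOrDefault(Similarity.COSINE));
    }
}
```

Because the field is always set, serialization can write it unconditionally instead of using optional read and write calls.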

...in/java/org/elasticsearch/xpack/inference/services/googlevertexai/GoogleVertexAiService.java

jonathan-buttner · 2024-06-24T14:19:47Z

...arch/xpack/inference/external/action/googlevertexai/GoogleVertexAiEmbeddingsActionTests.java

+ webServer.close();
+ }
+
+ // Successful case tested via end-to-end notebook tests in AppEx repo


timgrein · 2024-06-25T09:04:45Z

> Looking good, don't forget to add the new service, task, and secret settings to the registry.

Thanks! Added in commit "Add Google Vertex AI named writeables to InferenceNamedWriteablesProv…".

jonathan-buttner

Looks good!

jonathan-buttner · 2024-06-26T12:54:06Z

...org/elasticsearch/xpack/inference/external/request/googlevertexai/GoogleVertexAiRequest.java

+ return null;
+ });
+ } catch (Exception e) {
+ ValidationException validationException = new ValidationException(e);


Hmm, should this be a validation error or an ElasticsearchException? Or maybe a status exception 🤔 with permission denied.

Adjusted in commit "Throw ElasticsearchStatusException instead of ValidationException in …" 👍
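
The idea behind the change above is that a failure while obtaining credentials is not a problem with the user's input, so it is better reported as an error that carries an HTTP-style status. The sketch below uses a hypothetical StatusException class in place of ElasticsearchStatusException, and the 403 status is an assumption based on the "permission denied" suggestion, not a value confirmed by the PR.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: wrap failures in an exception that carries a status code.
class StatusExceptionSketch {

    // Stand-in for a status-carrying exception type.
    static class StatusException extends RuntimeException {
        final int status;

        StatusException(String message, int status, Throwable cause) {
            super(message, cause);
            this.status = status;
        }
    }

    // Runs the action and converts any checked or unchecked failure into a StatusException.
    static <T> T withStatusOnFailure(Callable<T> action) {
        try {
            return action.call();
        } catch (Exception e) {
            // 403 is an assumed status for "permission denied".
            throw new StatusException("Failed to decorate request with bearer token", 403, e);
        }
    }

    public static void main(String[] args) {
        try {
            StatusExceptionSketch.<Object>withStatusOnFailure(() -> {
                throw new java.io.IOException("credentials unavailable");
            });
        } catch (StatusException e) {
            System.out.println(e.status + ": " + e.getMessage());
        }
    }
}
```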

breskeby

lgtm. thanks!

timgrein · 2024-06-26T14:13:00Z

@elasticmachine update branch

Add Google Vertex AI as provider for text_embedding task type

450e6c9

timgrein added the >non-issue, :ml, Team:ML, and v8.15.0 labels Jun 24, 2024

timgrein requested a review from a team as a code owner June 24, 2024 12:14

timgrein and others added 3 commits June 24, 2024 14:15

Merge branch 'main' into timgrein/inference-api-google-vertex-ai-embe…

6b57ee2

…ddings

Add licenses

26345b3

Fix thirdPartyAudit

ef907ae

breskeby requested changes Jun 24, 2024

jonathan-buttner reviewed Jun 24, 2024

timgrein added 5 commits June 24, 2024 17:38

Use SocketAccess instead of AccessController directly

e3a8791

Move embedding batch size comment to correct constant

3603206

Use AccessController and add special permission check

adf84b7

Use implementation instead of api command for 3rd party libraries

3d38a4a

Add Google Vertex AI named writeables to InferenceNamedWriteablesProv…

680c41a

…ider

timgrein requested a review from breskeby June 25, 2024 08:58

Set default similarity measure in service settings

d3b378f

timgrein requested a review from jonathan-buttner June 25, 2024 09:04

jonathan-buttner approved these changes Jun 26, 2024

breskeby approved these changes Jun 26, 2024

timgrein and others added 3 commits June 26, 2024 15:32

Throw ElasticsearchStatusException instead of ValidationException in …

d3d3893

…GoogleVertexAiRequest

Merge branch 'main' into timgrein/inference-api-google-vertex-ai-embe…

197164a

…ddings

Merge branch 'main' into timgrein/inference-api-google-vertex-ai-embe…

09c55fc

…ddings

Merge branch 'main' into timgrein/inference-api-google-vertex-ai-embe…

8405ef4

…ddings

timgrein merged commit dd3e73e into elastic:main Jun 26, 2024
15 checks passed
