
The current Jackson configuration allows serializing cluster states that cannot be read back #96976

Open
original-brownbear opened this issue Jun 21, 2023 · 7 comments
Labels
>bug :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team

Comments

@original-brownbear
Member

The current Jackson configuration uses a default maximum string length of ~5M characters. Deserializing any string longer than that throws an exception.

This was uncovered in #96918, where a 5M+ string resulted in a completely broken cluster after restart. Since such a string cannot be deserialized by the current code, we can neither read the state from disk on node restart nor restore a snapshot containing this string.

I think the easiest fix here would be to increase the limit to Integer.MAX_VALUE for the SMILE factory that we use to read the cluster state.
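
For illustration, a minimal sketch of that suggestion using Jackson's StreamReadConstraints builder (available since Jackson 2.15); the class and field names are placeholders, not the actual Elasticsearch change:

```java
import com.fasterxml.jackson.core.StreamReadConstraints;
import com.fasterxml.jackson.dataformat.smile.SmileFactory;

class ClusterStateSmile {
    // Relax the default ~5M-character string limit only for the factory that
    // reads persisted cluster state, so an over-long string can still be parsed.
    static final SmileFactory SMILE_FACTORY = SmileFactory.builder()
        .streamReadConstraints(
            StreamReadConstraints.builder()
                .maxStringLength(Integer.MAX_VALUE)
                .build())
        .build();
}
```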

@original-brownbear original-brownbear added >bug :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. labels Jun 21, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team label Jun 21, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner DaveCTurner added :Core/Infra/Core Core issues without another label and removed :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed Meta label for distributed team labels May 22, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label May 22, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@DaveCTurner
Contributor

I'm asking @elastic/es-core-infra for help here because this seems like a fundamental problem in the XContent subsystem. In terms of cluster state persistence it's ok if we fail to even write a state because of an over-long string (although clearly better if we succeed and then read it back). It's only a problem if we write something we can't read.

@rjernst
Member

rjernst commented Jun 21, 2024

I agree this is a problem with xcontent. However, I do not think we should raise this limit to max int. That would be a 2GB string. Why did we have a 5MB string in the first place? I'm more inclined to block writing the xcontent with a string length over the limit in the first place.
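
As a rough sketch of that write-side check (purely illustrative — the wrapper class and constant below are hypothetical, not existing Elasticsearch code):

```java
import java.io.IOException;

import com.fasterxml.jackson.core.JsonGenerator;

final class LengthCheckingGenerator {
    // Mirror the read-side default so that anything we manage to write can
    // always be read back; the exact value is an assumption for illustration.
    private static final int MAX_STRING_LENGTH = 5_000_000;

    private final JsonGenerator delegate;

    LengthCheckingGenerator(JsonGenerator delegate) {
        this.delegate = delegate;
    }

    void writeString(String value) throws IOException {
        if (value.length() > MAX_STRING_LENGTH) {
            // Fail the write up front: the offending cluster-state update is
            // rejected, but the state already on disk stays readable.
            throw new IllegalArgumentException(
                "string of length " + value.length()
                    + " exceeds the maximum allowed length of " + MAX_STRING_LENGTH);
        }
        delegate.writeString(value);
    }
}
```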

@original-brownbear
Member Author

I agree, the problem here was initially on the write side, and we should definitely not allow a string that long.
On the read side, though, maybe there shouldn't be a limit for the cluster state specifically. Assuming the state is not corrupted, a limit means that if the state contains a long string for whatever reason, the cluster breaks on a full cluster restart, since you are unable to read the cluster state back from disk.

I wonder if maybe we should have the ability to customize an individual XContentParser instance here? I can see how you'd want that kind of limit on a string when ingesting documents or deserializing REST requests in general; it's less clear to me that you want it when working with cluster state or other internal on-disk metadata, where any failure to read can mean big trouble.
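
To make the idea concrete, a purely hypothetical sketch of a per-parser override — `withMaxStringLength` and the helper method below do not exist in the XContent API today and just stand in for whatever knob it might grow:

```java
import java.io.IOException;
import java.io.InputStream;

import org.elasticsearch.xcontent.XContentParser;
import org.elasticsearch.xcontent.XContentParserConfiguration;
import org.elasticsearch.xcontent.smile.SmileXContent;

class InternalStateReader {
    void readPersistedState(InputStream stateInput) throws IOException {
        // Internal metadata readers would opt out of the string-length limit,
        // while REST request and document parsing keeps the strict default.
        XContentParserConfiguration relaxed =
            XContentParserConfiguration.EMPTY.withMaxStringLength(Integer.MAX_VALUE); // hypothetical method
        try (XContentParser parser = SmileXContent.smileXContent.createParser(relaxed, stateInput)) {
            parseClusterState(parser); // hypothetical helper for the actual state parsing
        }
    }

    private void parseClusterState(XContentParser parser) throws IOException {
        // placeholder for the real cluster-state parsing logic
    }
}
```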

@DaveCTurner
Contributor

> I'm more inclined to block writing the xcontent with a string length over the limit in the first place.

At least if we do this then the state on disk does not become unreadable, which would be much better than how things are today. But just to be clear, if we fail at write time then the master will not be able to commit the state update, which causes it to stand down and start a new election cycle. There's a good chance that the troublesome request came from a TransportMasterNodeAction, which will retry on the newly-elected master, triggering another master failover, and so on. A full cluster restart may be needed to clear things out, and hopefully whatever condition led to the bad state update does not persist across the restart.

@DaveCTurner
Contributor

> retry on the newly-elected master

I expect we can find a way to make this situation un-retryable, once we put in place some mechanism to make the whole write fail.
