Handle malformed date when objects are supplied #110049

Open
wants to merge 4 commits into
base: main

salvatore-campagna
Contributor

When an object is supplied as the value of a field of type date or date_nanos, we need
to handle that value through our ignore_malformed strategy.
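
To make the intended behavior concrete, here is a minimal sketch, not the Elasticsearch implementation. `DateFieldSketch`, its fields, and the second error message are hypothetical names chosen for illustration; only the "Unable to parse object as a ... field" message is taken from the diff in this PR. It assumes numeric values are read as epoch millis and strings as ISO-8601 instants.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a date field mapper (illustration only).
final class DateFieldSketch {
    private final String name;
    private final boolean ignoreMalformed;
    final List<Instant> indexed = new ArrayList<>();      // values that parsed
    final List<String> ignoredFields = new ArrayList<>(); // fields recorded as ignored

    DateFieldSketch(String name, boolean ignoreMalformed) {
        this.name = name;
        this.ignoreMalformed = ignoreMalformed;
    }

    void parse(Object value) {
        if (value instanceof Map) {
            // An object can never be a date: treat it as malformed.
            if (ignoreMalformed) {
                ignoredFields.add(name);
                return;
            }
            throw new IllegalArgumentException("Unable to parse object as a " + name + " field");
        }
        try {
            indexed.add(toInstant(value));
        } catch (DateTimeParseException e) {
            // Same strategy for scalar values that fail to parse.
            if (ignoreMalformed) {
                ignoredFields.add(name);
                return;
            }
            throw new IllegalArgumentException("failed to parse date field [" + value + "]", e);
        }
    }

    // Assumption: numbers are epoch millis, everything else is an ISO-8601 instant.
    private static Instant toInstant(Object value) {
        if (value instanceof Number) {
            return Instant.ofEpochMilli(((Number) value).longValue());
        }
        return Instant.parse(value.toString());
    }

    public static void main(String[] args) {
        DateFieldSketch lenient = new DateFieldSketch("date_ignored", true);
        lenient.parse(Map.of("key", "a", "value", 10));
        System.out.println("ignored: " + lenient.ignoredFields);
    }
}
```

With `ignoreMalformed` set to false, the same call throws instead of recording the field as ignored.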

Resolves #109539

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Collaborator

Hi @salvatore-campagna, I've created a changelog YAML for you.

Member

@martijnvg martijnvg left a comment

Looks good. One question.

refresh: true
index: test
id: "2"
body: { "date_ignored": [ 1, 2, 3 ], "date_not_ignored": "2024-02-03T10:12:43.123Z", date_nanos_ignored: [ 10, 20, 30 ], "date_nanos_not_ignored": "2024-02-03T10:12:43.123456789Z" }
Contributor Author

@mvg see this

refresh: true
index: test
id: "3"
body: { "date_ignored": [ { "key": "a", "value": 10 }, { "key": "c", "value": 100 } ], "date_not_ignored": "2024-02-03T10:12:43.123Z", date_nanos_ignored: [ { "key": "b", "value": 20 }, { "key": "d", "value": 40 } ], "date_nanos_not_ignored": "2024-02-03T10:12:43.123456789Z" }
Contributor Author

@mvg and this other one

Member

@martijnvg martijnvg left a comment

👍 from my side

Contributor

@lkts lkts left a comment

I had a question about a potential API change, otherwise LGTM

}
return;
} else {
throw new IllegalArgumentException("Unable to parse object as a " + mappedFieldType.name() + " field");
Contributor

@lkts lkts Jun 21, 2024

Paranoid check: isn't this an API change? Previously we would have returned an error directly from the parser, which differs not only in its text but also in its exception type.
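
One way to notice this kind of change is a check that pins both the exception type and the message. The sketch below is only an illustration: `expectThrows` is a small local helper written here, `parseObjectAsDate` is a hypothetical stand-in for the new code path, and nothing is assumed about what the parser threw before this PR.

```java
// Illustration only: pins the exception type and message of a failing call.
final class ExceptionContractSketch {

    // Local helper: runs the action and returns the thrown exception if it has the expected type.
    static <T extends Throwable> T expectThrows(Class<T> type, Runnable action) {
        try {
            action.run();
        } catch (Throwable t) {
            if (type.isInstance(t)) {
                return type.cast(t);
            }
            throw new AssertionError(
                "expected " + type.getName() + " but got " + t.getClass().getName(), t);
        }
        throw new AssertionError("expected " + type.getName() + " but nothing was thrown");
    }

    // Hypothetical stand-in for the new code path; the message mirrors the diff above.
    static void parseObjectAsDate(String fieldName) {
        throw new IllegalArgumentException("Unable to parse object as a " + fieldName + " field");
    }

    public static void main(String[] args) {
        IllegalArgumentException e = expectThrows(
            IllegalArgumentException.class,
            () -> parseObjectAsDate("date_not_ignored"));
        // A change in either the type or the text would now fail loudly.
        if (!e.getMessage().equals("Unable to parse object as a date_not_ignored field")) {
            throw new AssertionError("unexpected message: " + e.getMessage());
        }
        System.out.println("type and message match");
    }
}
```

Note that `type.isInstance` also accepts subclasses; comparing `t.getClass()` with `type` would pin the exact class instead.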

Contributor Author

I will double-check this. I will probably also add the full-bwc label.

Contributor Author

@salvatore-campagna salvatore-campagna Jun 21, 2024

At least I hope this is tested, but it probably is not.

Contributor

Yeah, I don't know if we explicitly test such things.

index:
refresh: true
index: test
id: "1"
Contributor

Is it intentional that we use the same id as the previous successful request?

Contributor Author

No, this is a mistake.

- match: { hits.hits.0._source.date_nanos_ignored.value: 20 }
- match: { hits.hits.0._source.date_nanos_not_ignored: "2024-02-03T10:12:43.123456789Z" }

# NOTE: this is interesting...numeric values translate to millis since epoch
Contributor

The default format indeed accepts epoch_millis, so these are not malformed.
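
For reference, a small sketch of what reading the numbers `1, 2, 3` as epoch millis means. It uses only `java.time`, not the Elasticsearch date parser, and `EpochMillisSketch` is a hypothetical name used for illustration.

```java
import java.time.Instant;

// Illustration only: numeric values read as milliseconds since 1970-01-01T00:00:00Z.
final class EpochMillisSketch {

    static Instant fromEpochMillis(long millis) {
        return Instant.ofEpochMilli(millis);
    }

    public static void main(String[] args) {
        for (long value : new long[] { 1, 2, 3 }) {
            // 1 -> 1970-01-01T00:00:00.001Z, and so on.
            System.out.println(value + " -> " + fromEpochMillis(value));
        }
    }
}
```

So an array such as `[ 1, 2, 3 ]` is a list of valid dates a few milliseconds after the epoch, which is why it does not exercise the malformed path.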

Contributor Author

I need to test a couple more cases: arrays with valid values, and arrays with a mix of valid and invalid values.

Development

Successfully merging this pull request may close these issues.

date field types fail to index specific malformed data even with ignore_malformed enabled
4 participants