A mixed cluster failure due to leak detected in logs #109476

Closed
pgomulka opened this issue Jun 7, 2024 · 8 comments
Labels
needs:risk - Requires assignment of a risk label (low, medium, blocker)
:Search/Search - Search-related issues that do not fall into other categories
Team:Search Foundations - Meta label for the Search Foundations team in Elasticsearch
Team:Search - Meta label for search team
>test-failure - Triaged test failures from CI

Comments

@pgomulka
Contributor

pgomulka commented Jun 7, 2024

CI Link

https://gradle-enterprise.elastic.co/s/5752xbvcac4na/failure#1

Repro line

n/a

Does it reproduce?

Didn't try

Applicable branches

main, 8.13.4

Failure history

No response

Failure excerpt

A mixed-cluster test failed because a leak was detected in the logs:
https://gradle-enterprise.elastic.co/s/5752xbvcac4na/failure#1

[2024-06-07T12:17:17,259][ERROR][o.e.t.LeakTracker        ] [v8.13.4-2] LEAK: resource was not cleaned up before it was garbage-collected.
»  Recent access records: 
»  #1:
»  	[email protected]/org.elasticsearch.search.SearchHits.deallocate(SearchHits.java:253)
»  	[email protected]/org.elasticsearch.search.SearchHits.decRef(SearchHits.java:244)
»  	[email protected]/org.elasticsearch.search.fetch.FetchSearchResult.deallocate(FetchSearchResult.java:116)
»  	[email protected]/org.elasticsearch.search.fetch.FetchSearchResult.decRef(FetchSearchResult.java:108)
»  	[email protected]/org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:437)
»  	[email protected]/org.elasticsearch.transport.InboundHandler.handleResponse(InboundHandler.java:382)
»  	[email protected]/org.elasticsearch.transport.InboundHandler.executeResponseHandler(InboundHandler.java:147)
»  	[email protected]/org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:122)
»  	[email protected]/org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:96)
»  	[email protected]/org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:821)
»  	[email protected]/org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:124)
»  	[email protected]/org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:96)
»  	[email protected]/org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:61)
»  	[email protected]/org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:48)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
»  	[email protected]/io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
»  	[email protected]/io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
»  	[email protected]/io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
»  	[email protected]/io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
»  	[email protected]/io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
»  	[email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
»  	[email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
»  	[email protected]/io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
»  	[email protected]/io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
»  	and more..
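
For readers unfamiliar with this kind of failure: the log message says a resource was garbage-collected before it was cleaned up, and the frames above go through `decRef` and `deallocate`, which suggests reference counting. Below is a minimal sketch of that pattern, under assumptions. `CountedResource` and its methods are hypothetical names chosen for illustration; this is not the Elasticsearch `SearchHits` or `LeakTracker` code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a ref-counted response object.
// This is NOT Elasticsearch's SearchHits or LeakTracker code.
final class CountedResource {
    // The creator holds the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean deallocated = false;

    // Take an additional reference; fails if the resource was already released.
    void incRef() {
        int current;
        do {
            current = refs.get();
            if (current <= 0) {
                throw new IllegalStateException("resource already released");
            }
        } while (!refs.compareAndSet(current, current + 1));
    }

    // Drop one reference; returns true if this call released the last one.
    boolean decRef() {
        int current;
        do {
            current = refs.get();
            if (current <= 0) {
                throw new IllegalStateException("resource released too many times");
            }
        } while (!refs.compareAndSet(current, current - 1));
        if (current == 1) {
            deallocate();
            return true;
        }
        return false;
    }

    // Cleanup runs exactly once, when the count reaches zero.
    private void deallocate() {
        deallocated = true;
    }

    int refCount() {
        return refs.get();
    }

    boolean isDeallocated() {
        return deallocated;
    }
}
```

In this sketch, if one holder calls `incRef()` and only one of the two holders later calls `decRef()`, the count stays at 1 and `deallocate()` never runs. An object dropped in that state is the kind of thing a leak tracker would presumably report once it is garbage-collected.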
@pgomulka pgomulka added >test-failure Triaged test failures from CI needs:triage Requires assignment of a team area label labels Jun 7, 2024
@elasticsearchmachine elasticsearchmachine added the needs:risk Requires assignment of a risk label (low, medium, blocker) label Jun 7, 2024
@pgomulka pgomulka added :Core/Infra/Core Core issues without another label and removed needs:triage Requires assignment of a team area label needs:risk Requires assignment of a risk label (low, medium, blocker) labels Jun 7, 2024
@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team needs:risk Requires assignment of a risk label (low, medium, blocker) labels Jun 7, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@rjernst
Member

rjernst commented Jun 7, 2024

This looks related to search hits, so passing to the search team.

@rjernst rjernst added :Search/Search Search-related issues that do not fall into other categories and removed :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team labels Jun 7, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Jun 7, 2024
@benwtrent
Member

I think this is the bug that @original-brownbear fixed in 8.14 but that wasn't backported to 8.13, so our integration tests hit it.

#108562

@benwtrent benwtrent added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jun 7, 2024
@rjernst
Member

rjernst commented Jun 11, 2024

Another example of this:
https://gradle-enterprise.elastic.co/s/333wraumu24co/console-log?page=3

Excerpt:

[2024-06-11T13:22:55,974][ERROR][o.e.t.LeakTracker        ] [v8.13.4-3] LEAK: resource was not cleaned up before it was garbage-collected.
»  Recent access records:
»  #1:
»  	[email protected]/org.elasticsearch.action.search.SearchResponse.decRef(SearchResponse.java:231)
»  	[email protected]/org.elasticsearch.action.search.MultiSearchResponse.deallocate(MultiSearchResponse.java:166)
»  	[email protected]/org.elasticsearch.action.search.MultiSearchResponse.decRef(MultiSearchResponse.java:155) 
»  	[email protected]/org.elasticsearch.action.ActionListener.respondAndRelease(ActionListener.java:291) 
»  	[email protected]/org.elasticsearch.action.search.TransportMultiSearchAction$1.finish(TransportMultiSearchAction.java:190) 
»  	[email protected]/org.elasticsearch.action.search.TransportMultiSearchAction$1.handleResponse(TransportMultiSearchAction.java:176) 
»  	[email protected]/org.elasticsearch.action.search.TransportMultiSearchAction$1.onResponse(TransportMultiSearchAction.java:161) 
»  	[email protected]/org.elasticsearch.action.search.TransportMultiSearchAction$1.onResponse(TransportMultiSearchAction.java:157) 
»  	[email protected]/org.elasticsearch.action.ActionListener$3.onResponse(ActionListener.java:314)

@benwtrent @original-brownbear The above failure seems to occur several times per day in main and PRs. Is backporting the mentioned PR to 8.13 doable?

@benwtrent
Member

@rjernst #108562 is backported, but since there was never an 8.13.5 release, the released 8.13.4 bugfix version doesn't have the commit.

I am not sure what to do here.

I don't know which particular test is failing. If I did, we could just mute that test for this particular version.

@benwtrent
Member

@rjernst I will open a PR to mute all collapse YAML tests when run against 8.13.x.

elasticsearchmachine pushed a commit that referenced this issue Jun 12, 2024
benwtrent added a commit to benwtrent/elasticsearch that referenced this issue Jun 12, 2024
elasticsearchmachine pushed a commit that referenced this issue Jun 12, 2024
* Mute all collapse tests for 8.13 (#109594)

related to: #109476

(cherry picked from commit d846223)

* adjusting skip logic
@benwtrent
Member

I have muted the affected tests for versions before 8.14. Hopefully these failures disappear. Closing issue.
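
For illustration only, the version gate described above could be sketched roughly as follows. This is not the actual change from #109594 and not the Elasticsearch test framework API: `CollapseTestGate`, `FIRST_FIXED`, and `shouldSkip` are hypothetical names, and the sketch assumes plain `major.minor.patch` version strings with no qualifiers such as `-SNAPSHOT`.

```java
// Hypothetical version gate: skip a test when the oldest node in the
// mixed cluster predates the release that contains the fix (assumed 8.14.0).
final class CollapseTestGate {
    // First version assumed to contain the fix, as {major, minor, patch}.
    static final int[] FIRST_FIXED = {8, 14, 0};

    // Parse "major.minor.patch" into three integers.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch, got: " + version);
        }
        int[] parsed = new int[3];
        for (int i = 0; i < 3; i++) {
            parsed[i] = Integer.parseInt(parts[i]);
        }
        return parsed;
    }

    // True when the given version is strictly older than FIRST_FIXED.
    static boolean shouldSkip(String oldestNodeVersion) {
        int[] v = parse(oldestNodeVersion);
        for (int i = 0; i < 3; i++) {
            if (v[i] != FIRST_FIXED[i]) {
                return v[i] < FIRST_FIXED[i];
            }
        }
        return false; // equal to FIRST_FIXED, so the fix is present
    }
}
```

With this sketch, `CollapseTestGate.shouldSkip("8.13.4")` is true and `CollapseTestGate.shouldSkip("8.14.0")` is false, which matches the "before 8.14" condition described in this comment.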

@benwtrent benwtrent self-assigned this Jun 24, 2024
4 participants