[PIP-143] Support split bundle by specified boundaries #13761
Labels
type/enhancement
type/PIP
Motivation
As we all know, a namespace bundle may contain lots of topic partitions belonging to different topics.
The throughput of these topics may vary greatly. Some topics may have a very high rate/throughput while other topics have a very low rate/throughput.
These partitions with high rate/throughput can cause broker overload and bundle unloading.
At this point, if we split the bundle manually with the `range_equally_divide` or `topic_count_equally_divide` split algorithm, many splits may be needed before these high rate/throughput partitions are assigned to different bundles.
For convenience, we call these high-throughput topics `outstanding topic` and their partitions `outstanding partition` in this PIP.
Goal
Our goal is to make it easier to split `outstanding partition`s into new bundles, so this PIP introduces a more flexible algorithm for splitting a namespace bundle.
The main idea is to first get the hash position of every topic in the bundle. With these hash positions, it is much easier to decide where to split the bundle. We can split the bundle into either two or multiple bundles with roughly equal throughput.
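As a rough illustration of how known hash positions make the split decision easier, the sketch below computes the range of boundaries that divides the topics evenly by count, using the values from the four-topic example in this PIP. The class and method names are hypothetical, and the sketch assumes a bundle covers the half-open range [lower, upper), which this PIP does not state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pulsar.
class SplitRangeSketch {

    /**
     * Returns {low, high}. Any boundary b with low <= b <= high leaves the
     * lower half of the topics (by hash position) below the boundary and the
     * rest at or above it. Assumes bundles are half-open: [lower, upper).
     */
    static long[] countBalancedSplitRange(Map<String, Long> positions) {
        if (positions.size() < 2) {
            throw new IllegalArgumentException("need at least two topics");
        }
        List<Long> sorted = new ArrayList<>(positions.values());
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        long below = sorted.get(mid - 1); // last topic of the lower half
        long above = sorted.get(mid);     // first topic of the upper half
        if (below == above) {
            throw new IllegalArgumentException("topics share a hash position");
        }
        return new long[] {below + 1, above};
    }

    public static void main(String[] args) {
        Map<String, Long> positions = new LinkedHashMap<>();
        positions.put("t1", 10L);
        positions.put("t2", 20L);
        positions.put("t3", 80L);
        positions.put("t4", 90L);
        long[] range = countBalancedSplitRange(positions);
        System.out.println(range[0] + " ~ " + range[1]); // prints "21 ~ 80"
    }
}
```

A throughput-balanced split could follow the same shape by walking the sorted positions and accumulating per-topic throughput instead of counting topics.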
For example, there is a bundle with boundaries `0x00000000` to `0x00000200`, and four topics: `t1`, `t2`, `t3`, `t4`.
Step one. Get the hash position of these topics
`t1` with hashcode 10
`t2` with hashcode 20
`t3` with hashcode 80
`t4` with hashcode 90
Step two. Split the bundle
Here we have multiple choices. For example:
To split the topics evenly by count, as `topic_count_equally_divide` does, we can split at any position between 21 and 80.
API Changes
We need two API changes for this PIP.
Implementation
New API for getting topic hash positions
Add a new admin command `GetTopicHashPositions` for `CmdNamespaces`.
Add a new GET method `getTopicHashPositions` for `Namespaces`.
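To make the idea of a topic hash position concrete, here is a minimal sketch of how such positions might be computed and filtered to one bundle. This PIP does not say which hash function is used, so CRC32 over the UTF-8 topic name is an assumption here, as are the class name, the method names, and the half-open bundle range [lower, upper).

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical helper, not the actual implementation behind getTopicHashPositions.
class TopicHashPositionsSketch {

    /** Assumed hash: CRC32 of the UTF-8 topic name, as an unsigned 32-bit value. */
    static long hashPosition(String topicName) {
        CRC32 crc = new CRC32();
        crc.update(topicName.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Returns topic -> hash position for topics that fall in [lower, upper). */
    static Map<String, Long> positionsInBundle(List<String> topics, long lower, long upper) {
        Map<String, Long> result = new TreeMap<>();
        for (String topic : topics) {
            long position = hashPosition(topic);
            if (position >= lower && position < upper) {
                result.put(topic, position);
            }
        }
        return result;
    }
}
```

Calling `positionsInBundle` with the bundle's boundaries would give an operator the same kind of topic-to-position listing that the example in this PIP starts from.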
Add support for splitting a bundle by specified hash positions
Change the admin API to support splitting a bundle by specified hash positions (split boundaries) in `CmdNamespaces`.
Change the method of `Namespaces`, adding a parameter for split boundaries.
For code consistency, encapsulate all the parameters for a bundle split into a new class, `BundleSplitOption`.
Then add a new `NamespaceBundleSplitAlgorithm` named `SpecifiedPositionsBundleSplitAlgorithm`, which validates the split boundaries and returns the final split boundaries.
Also, add the new bundle split algorithm to conf/broker.conf.
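The validation step could look roughly like the sketch below. It is not the actual `SpecifiedPositionsBundleSplitAlgorithm`: the class and method names are hypothetical, and the rules it applies (sort, drop duplicates, reject positions that are not strictly inside the bundle) are assumptions about what "valid" means here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of boundary validation, not the Pulsar class.
class SpecifiedPositionsSketch {

    /**
     * Returns the final boundaries: the bundle's lower bound, the requested
     * positions in ascending order without duplicates, and the upper bound.
     * Assumed rule: every position must lie strictly between lower and upper.
     */
    static List<Long> finalBoundaries(long lower, long upper, List<Long> requested) {
        if (requested == null || requested.isEmpty()) {
            throw new IllegalArgumentException("no split positions given");
        }
        TreeSet<Long> unique = new TreeSet<>(requested); // sorts and de-duplicates
        for (long position : unique) {
            if (position <= lower || position >= upper) {
                throw new IllegalArgumentException(
                        "position " + position + " is outside the bundle");
            }
        }
        List<Long> boundaries = new ArrayList<>();
        boundaries.add(lower);
        boundaries.addAll(unique);
        boundaries.add(upper);
        return boundaries;
    }
}
```

For the example bundle `0x00000000` to `0x00000200`, requesting positions 80 and 21 would yield the boundaries 0, 21, 80, 512, that is, three new bundles. Rejecting out-of-range positions is one possible choice; silently dropping them would be another.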
Rejected Alternatives
Splitting the bundle by `outstanding topic`, which splits the bundle into two new bundles at a time, each containing an equal number of `outstanding partition`s. This algorithm has a disadvantage: it can only deal with one `outstanding topic`.