CN110569302A

CN110569302A - method and device for physical isolation of distributed cluster based on lucene

Info

Publication number: CN110569302A
Application number: CN201910762813.7A
Authority: CN
Inventors: 魏枫; 赵云; 孙迁
Original assignee: Suning Cloud Computing Co Ltd
Current assignee: Suning Cloud Computing Co Ltd
Priority date: 2019-08-16
Filing date: 2019-08-16
Publication date: 2019-12-13

Abstract

The invention discloses a method and a device for physical isolation of a Lucene-based distributed cluster. The method comprises an index creation flow with the following steps: querying a current cluster resource list and a current index relationship list stored in a distributed memory according to an index creation request of a service, and acquiring the usage range of servers pre-allocated to the service in the cluster; acquiring a list of servers that the service can use according to the queried server usage range; and allocating the shards (fragments) of the index to be created to the servers according to the current running state of each server in the server list, wherein the shards comprise primary shards and/or replica shards. By pre-allocating a server range in the cluster to each service (that is, dividing cluster resources among services), different services are physically isolated within one cluster and the response speed of all services is significantly improved (cluster response times are on the order of seconds).

Description

Method and device for physical isolation of distributed cluster based on lucene

Technical Field

The invention relates to the technical field of cluster management, and in particular to a method and a device for physical isolation of a Lucene-based distributed cluster.

Background

At present, when multiple services need to use a Lucene-based distributed cluster (such as an Elasticsearch cluster), there are two common approaches. The first is to build the indexes of the different services directly on the same Lucene-based distributed cluster, which makes full use of cluster resources. The second is to set up multiple clusters so that each service has its own independent cluster, which avoids interference between services.

Although the first approach makes full use of cluster resources, the services are tightly coupled, which can cause interference between them and lead to significant performance problems. For example, suppose service A has a relatively large data volume and service B a relatively small one. When service A frequently sends search requests to the cluster, the search requests that service B sends may not be answered in time, which interferes heavily with service B. That is, requests from some services cannot be answered promptly, with response delays at the minute level or even the hour level, and other performance problems may follow. More seriously, when one service in the cluster is operated improperly and causes conditions such as insufficient cluster memory or nodes that appear dead, the requests of all other services are affected.

The second approach effectively isolates services and resources by giving each service its own cluster. However, when large-data-volume and small-data-volume services each have their own cluster, the number of clusters keeps growing and cluster operation and maintenance becomes more complex.

Disclosure of Invention

To solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for physical isolation of a Lucene-based distributed cluster. They address two problems: when multiple services share one distributed cluster, tight coupling causes the services to interfere with each other; and when each service sets up an independent distributed cluster, the number of clusters keeps growing, which increases the complexity of cluster operation and maintenance.

To solve one or more of the above technical problems, the invention adopts the following technical solution:

In one aspect, a method for physical isolation of a Lucene-based distributed cluster is provided. The method includes an index creation flow, which includes the following steps:

querying a current cluster resource list and a current index relationship list stored in a distributed memory according to an index creation request of a service, and acquiring the usage range of servers pre-allocated to the service in the cluster;

acquiring a list of servers that the service can use according to the queried server usage range;

and allocating the shards of the index to be created to the servers according to the current running state of each server in the server list, wherein the shards comprise primary shards and/or replica shards.

Further, the method also includes a flow for generating the current cluster resource list and the current index relationship list stored in the distributed memory:

allocating a usage range of servers in the cluster to the service according to the performance indicators of the service and the running state of the servers in the cluster;

and updating the pre-created cluster resource list and index relationship list according to the server usage range, generating the current cluster resource list and the current index relationship list, and storing them in the distributed memory.

Further, querying the current cluster resource list and the current index relationship list stored in the distributed memory according to the index creation request of the service includes:

determining whether the service has a corresponding physical isolation relationship and, if so, querying the current cluster resource list and the current index relationship list stored in the distributed memory according to the index creation request of the service.

Further, the method also includes a write flow, which includes the following steps:

querying the current cluster resource list and the current index relationship list stored in the distributed memory according to the index name in a write request of a service, and assembling the server routing information for writing to the cluster;

constructing an index id for the service according to the routing information, wherein the index id falls within the id range covered by the routed servers;

merging the write requests for the individual pieces of data of the service and forwarding the merged requests to the primary shards in parallel;

and synchronously writing the data written to the primary shards to the replica shards.

Further, the method also includes a search flow, which includes the following steps:

querying the current cluster resource list and the current index relationship list stored in the distributed memory according to the index name in a search request of a service, and assembling the server routing information for searching the cluster;

constructing a destination shard list according to the server routing information;

and traversing the shard list and sending the search request to each shard in the list in parallel, so that each shard responds to the search request and executes the corresponding search.

In another aspect, an apparatus for physical isolation of a Lucene-based distributed cluster is provided. The apparatus comprises an index creation module, which comprises:

an information query unit, configured to query a current cluster resource list and a current index relationship list stored in the distributed memory according to the index creation request of the service, and to acquire the usage range of servers pre-allocated to the service in the cluster;

a list generation unit, configured to acquire a list of servers that the service can use according to the queried server usage range;

and a shard allocation unit, configured to allocate the shards of the index to be created to the servers according to the current running state of each server in the server list, wherein the shards comprise primary shards and/or replica shards.

Further, the apparatus also includes a relationship creation module, configured to generate the current cluster resource list and the current index relationship list stored in the distributed memory, which comprises:

a resource division unit, configured to allocate a usage range of servers in the cluster to the service according to the performance indicators of the service and the running state of the servers in the cluster;

and a list update unit, configured to update the pre-created cluster resource list and index relationship list according to the server usage range, generate the current cluster resource list and the current index relationship list, and store them in the distributed memory.

Further, the apparatus also comprises:

a relationship determination module, configured to determine whether the service has a corresponding physical isolation relationship.

Further, the apparatus also comprises a data writing module, which comprises:

a first assembly unit, configured to query the current cluster resource list and the current index relationship list stored in the distributed memory according to the index name in a write request of a service, and to assemble the server routing information for writing to the cluster;

an index id construction unit, configured to construct an index id for the service according to the routing information, wherein the index id falls within the id range covered by the routed servers;

a data writing unit, configured to merge the write requests for the individual pieces of data of the service and forward the merged requests to the primary shards in parallel;

and a data synchronization unit, configured to synchronously write the data written to the primary shards to the replica shards.

Further, the apparatus also comprises a data search module, which comprises:

a second assembly unit, configured to query the current cluster resource list and the current index relationship list stored in the distributed memory according to the index name in a search request of a service, and to assemble the server routing information for searching the cluster;

a list construction unit, configured to construct a destination shard list according to the server routing information;

and a request sending unit, configured to traverse the shard list and send the search request to each shard in the list in parallel, so that each shard responds to the search request and executes the corresponding search.

The technical solution provided by the embodiments of the invention has the following beneficial effects:

1. By pre-allocating a server range in the cluster to each service (that is, dividing cluster resources among services), the method and device for physical isolation of a Lucene-based distributed cluster provided by the embodiments of the invention physically isolate different services within one cluster and significantly improve the response speed of all services (cluster response times are on the order of seconds);

2. The method and device manage shard allocation using a physical allocator and a physical-aware decider, balance the shards within an index, and guarantee high availability and load balancing for the physical isolation of services in the cluster;

3. The method and device unify the cluster management entry point and improve the efficiency of cluster operation and maintenance.

Drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow diagram illustrating a create index flow in a method for physical isolation of lucene-based distributed clusters, according to an example embodiment;

FIG. 2 is a flow diagram illustrating a process of generating a current list of cluster resources stored in a distributed memory and a current list of indexing relationships in a method for physical isolation of lucene-based distributed clusters in accordance with an illustrative embodiment;

FIG. 3 is a flow diagram illustrating write flow in a method of physical isolation of a lucene-based distributed cluster, according to an example embodiment;

FIG. 4 is a flow diagram illustrating a search flow in a method of physical isolation of a lucene-based distributed cluster, according to an example embodiment;

Fig. 5 is a block diagram illustrating an apparatus for physical isolation of a lucene-based distributed cluster, according to an example embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The method and device for physical isolation of a Lucene-based distributed cluster uniformly manage services and their server resource ranges in the cluster, provide access to this information for all nodes in the cluster, and physically isolate different services within one cluster. Shard allocation is managed by a physical allocator and a physical-aware decider, which balance the shards within an index and guarantee high availability and load balancing for the physical isolation of services in the cluster. This solves the mutual interference caused by tight coupling when multiple services share one cluster: services in one cluster are physically isolated from each other so that the performance requirements of each service are met, and setting up an independent cluster for each service can be avoided as far as possible. By dividing the server resources of one cluster among multiple services, each service is given its own set of physical nodes within the cluster, which makes full use of cluster resources and reduces the complexity of cluster management, operation and maintenance.

It should be noted that the technical solution provided by the embodiments of the present invention is applicable to any Lucene-based distributed cluster, such as an Elasticsearch cluster. The solution is described below using an Elasticsearch cluster as an example, but it is not limited to Elasticsearch clusters; other Lucene-based distributed clusters are equally suitable.

Fig. 1 is a flowchart illustrating an index creation process in a method for physical isolation of a lucene-based distributed cluster according to an exemplary embodiment, and referring to fig. 1, the index creation process includes the following steps:

S101: Query the current cluster resource list and the current index relationship list stored in the distributed memory according to the index creation request of the service, and acquire the usage range of servers pre-allocated to the service in the cluster.

Specifically, in the embodiments of the present invention, a usage range of servers in the cluster is first pre-allocated to each service. In a specific implementation, each service may send a request for cluster resources to the administrator of the cluster (e.g., an Elasticsearch cluster). After receiving the request, the administrator assigns a server range and/or node range in the cluster according to performance indicators of the service such as data volume and query response requirements, and stores that range in a cluster resource list (which may be named BusinessClusterRange) for later flows to query. Nodes include, but are not limited to, physical machines. Second, the correspondence between each service and its index names needs to be agreed in advance and stored in an index relationship list (which may be named BusinessIndicesMapper) for later use. The cluster resource list and the index relationship list are maintained and updated in real time or periodically.

After an index creation request of a service is received, the current cluster resource list and the current index relationship list stored in the distributed memory are queried according to the request, and the usage range of servers pre-allocated to the service in the cluster is acquired. In a specific implementation, the service name corresponding to the index name in the request is looked up in the index relationship list, and the server usage range of that service is then looked up in the cluster resource list by service name, in preparation for shard allocation.

S102: Acquire a list of servers that the service can use according to the queried server usage range.

Specifically, after the server usage range has been queried in the above step, all servers that the current service can use (i.e., those pre-allocated to it) are obtained from that range and a server list is generated.

S103: Allocate the shards of the index to be created to the servers according to the current running state of each server in the server list, wherein the shards comprise primary shards and/or replica shards.

Specifically, the current running state (such as load) of each server in the server list is considered as a whole, and the shards of the index to be created that corresponds to the index creation request are allocated to the servers; the shards include primary shards and/or replica shards. In a specific implementation, a physical allocator (which may be named PhysicalAllocator) and a physical-aware decider (which may be named PhysicalAwareAllocationDecider) may be added, where the physical-aware decider includes a custom decider, a load-balancing decider, and the like. The physical allocator restricts the server addresses to which the primary and replica shards may be allocated to the corresponding server list, and the physical-aware decider then completes the final shard allocation within that set of server addresses according to the policies of the custom decider and the load-balancing decider. Note that after index creation is complete, the information about the newly created index needs to be written to the index relationship list.

As an example, the decision policy of the physical-aware decider may take the following into account:

1. Server load: if the load on the current server is relatively high, the shard is allocated to a server with lower load;

2. Number of shards: if the current server holds more shards than other servers, the shard is preferentially allocated to another server;

3. Disk space: if the disk utilization of the current server is higher than that of other servers, the shard is preferentially allocated to a server with more free disk space;

4. A primary shard and its replica shard are never allocated to the same server;

5. A primary shard and its replica shards are preferentially allocated to servers in different racks and machine rooms.
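The five criteria above can be sketched as a simple ranking function. This is only an illustrative sketch, not the patented implementation: the `Server` fields, the function names, and in particular the priority order used in the sort key (different rack, then fewer shards, then lower load, then lower disk utilization) are assumptions made for the example. The text does not specify how the criteria are weighted against each other.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    rack: str
    load: float        # e.g. a load average; lower is better
    disk_used: float   # fraction of disk in use, 0.0 to 1.0
    shards: list = field(default_factory=list)


def pick_server(servers, exclude_names=(), avoid_racks=()):
    """Rank the eligible servers and return the best one (assumed ordering)."""
    candidates = [s for s in servers if s.name not in exclude_names]
    if not candidates:
        raise ValueError("no eligible server for this shard")
    # Tuple ordering: prefer a rack not in avoid_racks, then fewer shards,
    # then lower load, then lower disk utilization.
    return min(
        candidates,
        key=lambda s: (s.rack in avoid_racks, len(s.shards), s.load, s.disk_used),
    )


def allocate(index, num_primaries, servers):
    """Place each primary shard and one replica on the given servers only."""
    placement = {}
    for shard_id in range(num_primaries):
        primary = pick_server(servers)
        primary.shards.append((index, shard_id, "primary"))
        # Criterion 4: the replica never shares a server with its primary.
        # Criterion 5: a rack other than the primary's is preferred.
        replica = pick_server(servers, exclude_names={primary.name},
                              avoid_racks={primary.rack})
        replica.shards.append((index, shard_id, "replica"))
        placement[shard_id] = (primary.name, replica.name)
    return placement


# Ten hypothetical servers of one service, alternating between two racks.
servers = [Server(f"server{i:02d}", rack=f"rack{i % 2}", load=0.1 * i, disk_used=0.5)
           for i in range(10)]
placement = allocate("business_A_index_computer_01", 5, servers)
```

Because only the servers of one service are passed to `allocate`, every shard stays inside that service's pre-allocated range, which is the isolation property the text describes.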

Fig. 2 is a flowchart illustrating the flow of generating the current cluster resource list and the current index relationship list stored in the distributed memory in a method for physical isolation of a Lucene-based distributed cluster according to an exemplary embodiment. Referring to Fig. 2, as a preferred implementation, the method in an embodiment of the present invention further includes this generation flow, with the following specific steps:

S201: Allocate a usage range of servers in the cluster to the service according to the performance indicators of the service and the running state of the servers in the cluster.

Specifically, the performance indicators of the service include its data volume, query response requirements, and the like, and the running state of a server includes its load, CPU, memory, and the like. These factors are considered together when allocating a usage range of servers in the cluster to the service. For example, a service with a large data volume is allocated a large server usage range. The allocation policy for server usage ranges is not restricted here; users can set it according to actual needs, and in the embodiments of the present invention the policy may be configured to support dynamic adjustment.
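As one possible allocation policy, the sketch below sizes each service's server range in proportion to its data volume. It is a hypothetical example: the text leaves the policy open, and the function name `divide_servers`, the choice of data volume as the only input, the guarantee of at least one server per service, and the largest-remainder rounding are all assumptions of this sketch.

```python
def divide_servers(servers, volumes):
    """Split `servers` into contiguous ranges sized in proportion to data volume.

    Each service gets at least one server; servers left over after rounding
    go to the services with the largest fractional remainders.
    """
    if len(servers) < len(volumes):
        raise ValueError("fewer servers than services")
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("total data volume must be positive")
    spare = len(servers) - len(volumes)  # servers left after one per service
    exact = {name: spare * v / total for name, v in volumes.items()}
    counts = {name: 1 + int(x) for name, x in exact.items()}
    leftover = len(servers) - sum(counts.values())
    by_remainder = sorted(volumes, key=lambda n: exact[n] - int(exact[n]), reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    ranges, start = {}, 0
    for name in volumes:
        ranges[name] = servers[start:start + counts[name]]
        start += counts[name]
    return ranges


servers = [f"server{i:02d}" for i in range(10)]
# Hypothetical data volumes (arbitrary units) for three services.
ranges = divide_servers(servers, {"business_A": 600, "business_B": 300, "business_C": 100})
```

With equal volumes the function reduces to an even split, for example 100 servers across 10 services gives 10 servers each.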

S202: Update the pre-created cluster resource list and index relationship list according to the server usage range, generate the current cluster resource list and the current index relationship list, and store them in the distributed memory.

Specifically, in the embodiment of the present invention, a cluster resource list and an index relationship list are created in advance. The cluster resource list stores the correspondence between services and cluster server resources, and the index relationship list stores the correspondence between services and their agreed index names. After a server usage range in the cluster has been allocated to a service, this information needs to be written to the cluster resource list and the index relationship list for later use.

In the embodiment of the invention, the updated cluster resource list and index relationship list are persisted to a distributed memory and made available to the cluster. Storing the physical isolation metadata in distributed storage ensures that it remains accessible after a cluster failure (e.g., a node going down), which makes the physical isolation metadata highly available.

As a preferred implementation, in the embodiment of the present invention, querying the current cluster resource list and the current index relationship list stored in the distributed memory according to the index creation request of the service includes:

Specifically, after an index creation request of a service is received, it may first be determined whether the service has a corresponding physical isolation relationship (i.e., whether a server usage range in the cluster has been pre-allocated to the service). If so, the pre-created cluster resource list and index relationship list are queried according to the request, in preparation for shard allocation. Where index creation and shard allocation are performed directly, the user can weigh factors such as the performance indicators of the service and the running state of the servers in the cluster to achieve load balancing.

It should be noted that, in the embodiment of the present invention, a physical partition management module (which may be named PhysicalPartition) may be provided to manage the service ranges and the usage ranges of the cluster servers. Whether a service has a corresponding physical isolation relationship can be determined by checking whether the physical partition parameter is empty: if it is empty, the service has no corresponding physical isolation relationship; if it is not empty, the service has one.

Fig. 3 is a flow chart illustrating a write flow in a method for physical isolation of a lucene-based distributed cluster according to an exemplary embodiment, and referring to fig. 3, as a preferred implementation, in an embodiment of the present invention, the method further includes a write flow, where the write flow includes the following steps:

S301: Query the current cluster resource list and the current index relationship list stored in the distributed memory according to the index name in the write request of the service, and assemble the server routing information for writing to the cluster.

Specifically, a write request of a service is received; the current index relationship list stored in the distributed memory is queried by the index name in the write request to obtain the corresponding service name; the current cluster resource list stored in the distributed memory is then queried by service name to obtain the usage range of servers pre-allocated to the service in the cluster; and the information of all servers corresponding to the service is looked up and assembled into the server routing information for writing to the cluster.

S302: Construct an index id for the service according to the routing information, wherein the index id falls within the id range covered by the routed servers.

Specifically, the index id uniquely identifies a document. It is constructed from the server address list and the number of index shards, so that the server to which a document is written can be determined unambiguously from its id.
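The text does not give the id construction scheme, so the following is only one way it could work, under stated assumptions. It assumes documents are routed by hashing the id modulo the number of primary shards, uses `zlib.crc32` as a stand-in hash (not the hash function Elasticsearch actually uses), and searches for a suffix that makes the id land on a chosen shard. The names `shard_for`, `build_doc_id`, and `primary_server_by_shard` are hypothetical.

```python
import itertools
import zlib


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Stand-in routing: hash the id and take it modulo the shard count."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def build_doc_id(record_key: str, target_shard: int, num_primary_shards: int) -> str:
    """Append a numeric suffix to record_key until the id routes to target_shard."""
    if not 0 <= target_shard < num_primary_shards:
        raise ValueError("target_shard out of range")
    for suffix in itertools.count():
        candidate = f"{record_key}#{suffix}"
        if shard_for(candidate, num_primary_shards) == target_shard:
            return candidate


# Hypothetical placement of the 5 primary shards of one index.
primary_server_by_shard = {0: "server00", 1: "server02", 2: "server04",
                           3: "server06", 4: "server08"}

doc_id = build_doc_id("order-1001", target_shard=3, num_primary_shards=5)
server = primary_server_by_shard[shard_for(doc_id, 5)]
```

Given such an id, the shard and therefore the server that receives the document follow deterministically from the id and the shard count, which matches the property the text asks for.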

S303: Merge the write requests for the individual pieces of data of the service and forward the merged requests to the primary shards in parallel.

Specifically, the write requests for the individual pieces of data of the service are merged and then forwarded in parallel to the corresponding primary shards, where the data write operations are performed.

S304: Synchronously write the data written to the primary shards to the replica shards.
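Steps S303 and S304 can be sketched with in-memory dictionaries standing in for shards. This is an illustrative sketch only: the stores, the `crc32`-based routing, and the function names are assumptions, and a real cluster would forward the merged batches over the network rather than update local dictionaries.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_PRIMARY_SHARDS = 5
# In-memory stand-ins for the primary shards and their replicas.
primary_store = {s: {} for s in range(NUM_PRIMARY_SHARDS)}
replica_store = {s: {} for s in range(NUM_PRIMARY_SHARDS)}


def shard_for(doc_id):
    """Stand-in routing function (assumed, not the cluster's real hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_PRIMARY_SHARDS


def write_batch(shard_id, docs):
    """Write one merged batch to a primary, then synchronously to its replica."""
    primary_store[shard_id].update(docs)
    replica_store[shard_id].update(docs)
    return len(docs)


def bulk_write(requests):
    """Merge per-document write requests by target shard and send them in parallel."""
    merged = defaultdict(dict)
    for doc_id, body in requests:
        merged[shard_for(doc_id)][doc_id] = body
    with ThreadPoolExecutor(max_workers=NUM_PRIMARY_SHARDS) as pool:
        futures = [pool.submit(write_batch, s, docs) for s, docs in merged.items()]
        return sum(f.result() for f in futures)


written = bulk_write([(f"doc-{i}", {"n": i}) for i in range(20)])
```

Merging first means each primary shard receives one batch per call, and each batch is handled by a single worker, so no two threads update the same shard at once.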

Fig. 4 is a flowchart illustrating a search flow in a method for physical isolation of a Lucene-based distributed cluster according to an exemplary embodiment. Referring to Fig. 4, as a preferred implementation, the method in an embodiment of the present invention further includes a search flow, which includes the following steps:

S401: Query the current cluster resource list and the current index relationship list stored in the distributed memory according to the index name in the search request of the service, and assemble the server routing information for searching the cluster.

Specifically, a search (or query) request of a service is received; the current index relationship list stored in the distributed memory is queried by the index name in the search request to obtain the corresponding service name; the current cluster resource list stored in the distributed memory is then queried by service name to obtain the usage range of servers pre-allocated to the service in the cluster; and the information of all servers corresponding to the service is looked up and assembled into the server routing information for searching the cluster.

S402: Construct a destination shard list according to the server routing information.

Specifically, all shards corresponding to the service are looked up according to the server routing information, and a destination shard list is constructed.

S403: Traverse the shard list and send the search request to each shard in the list in parallel, so that each shard responds to the request and executes the corresponding search.

Specifically, the shard list obtained in the above step is traversed and the search request is sent to each shard in the list in parallel; after receiving the search request, each shard responds to it and executes the corresponding search operation.
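Steps S402 and S403 amount to a filtered fan-out. The sketch below assumes a hypothetical map from shard to hosting server and uses substring matching in place of a real Lucene query; all names and data are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shard placement; shard 2 sits on another service's server.
shard_locations = {0: "server00", 1: "server01", 2: "server10"}
shard_docs = {
    0: {"a1": "red laptop"},
    1: {"a2": "blue laptop", "a3": "red phone"},
    2: {"b1": "red laptop"},
}


def search_shard(shard_id, term):
    """Per-shard search; substring matching stands in for a real query."""
    return [doc_id for doc_id, text in shard_docs[shard_id].items() if term in text]


def search(service_servers, term):
    """Build the destination shard list, then query every shard in it in parallel."""
    destination = [s for s, server in shard_locations.items()
                   if server in service_servers]
    with ThreadPoolExecutor(max_workers=max(1, len(destination))) as pool:
        partial = pool.map(lambda s: search_shard(s, term), destination)
        return sorted(hit for hits in partial for hit in hits)


hits = search({"server00", "server01"}, "laptop")
```

Because the destination list is built from the service's own servers, a heavy search by one service never touches shards hosted on another service's servers.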

The index creation process is illustrated below:

Suppose an Elasticsearch cluster has 100 servers (100 nodes), denoted server00 … server99. Ten business parties need to use this cluster and, to avoid interference between them, must be physically isolated within it. Assuming the ten business parties use the cluster roughly equally, the 100 servers can be divided into 10 groups of 10 servers each. This equal division is only an example and does not limit the solution of the present invention. Suppose the business names uniformly start with business_, so that the ten business parties are named business_A, business_B, ..., business_J.

After the cluster server resources are divided among the services, the list of services and cluster resources, denoted BusinessClusterRange, is obtained. BusinessClusterRange then contains:

(business_A, (server00, server01, ..., server09))

(business_B, (server10, server11, ..., server19))

…

(business_J, (server90, server91, ..., server99))

Each service has an agreed correspondence with its index names. Assume that the index names of business_A uniformly carry the prefix business_A_index, those of business_B the prefix business_B_index, and so on.

The correspondence between indexes and services can then be constructed and recorded in the index relationship list (denoted BusinessIndicesMapper), which contains:

(business_A_index_*, business_A)

(business_B_index_*, business_B)

…

(business_J_index_*, business_J)

When service A needs to create an index named business_A_index_computer_01 (assuming 5 primary shards, each with 1 replica shard), the index name is first looked up in the index relationship list to find the service name business_A, and the list of servers owned by the service, (server00, server01, ..., server09), is then found in the cluster resource list by the service name business_A.
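The two-step lookup in this example can be sketched as follows. The dictionary layout and the use of shell-style wildcard matching from Python's `fnmatch` module are assumptions made for illustration; the text only specifies that index names map to a service and the service maps to its servers.

```python
from fnmatch import fnmatchcase

# Cluster resource list: service name -> servers pre-allocated to it.
business_cluster_range = {
    f"business_{letter}": [f"server{block * 10 + i:02d}" for i in range(10)]
    for block, letter in enumerate("ABCDEFGHIJ")
}

# Index relationship list: index-name pattern -> service name.
business_indices_mapper = {
    f"business_{letter}_index_*": f"business_{letter}" for letter in "ABCDEFGHIJ"
}


def servers_for_index(index_name):
    """Return (service, servers) for an index name, or (None, None) if unmapped."""
    for pattern, business in business_indices_mapper.items():
        if fnmatchcase(index_name, pattern):
            return business, business_cluster_range[business]
    return None, None


business, servers = servers_for_index("business_A_index_computer_01")
```

An unmapped index name returns `(None, None)`, which a caller could treat as "no physical isolation relationship" for that index.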

When the index shards are allocated, a physical allocator (which may be called PhysicalAllocator) determines that the list of servers to which the shards may be allocated is (server00, server01, ..., server09).
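The two-step lookup described above (index name to service name, then service name to server list) might look like the following sketch. The function name servers_for_index is hypothetical, only two of the ten services are listed for brevity, and matching the wildcard patterns with Python's fnmatch module is an assumption made for the example rather than the mechanism used by the invention.

```python
from fnmatch import fnmatchcase

# Excerpts of the two lists from the example (only two services shown).
BUSINESS_INDICES_MAPPER = [
    ("business_A_index_*", "business_A"),
    ("business_B_index_*", "business_B"),
]
BUSINESS_CLUSTER_RANGE = {
    "business_A": [f"server{i:02d}" for i in range(0, 10)],
    "business_B": [f"server{i:02d}" for i in range(10, 20)],
}


def servers_for_index(index_name,
                      mapper=BUSINESS_INDICES_MAPPER,
                      cluster_range=BUSINESS_CLUSTER_RANGE):
    """Return the servers on which shards of `index_name` may be placed."""
    # Step 1: index name -> service name, via the index relationship list.
    business = next(
        (name for pattern, name in mapper if fnmatchcase(index_name, pattern)),
        None,
    )
    if business is None:
        raise LookupError(f"no service registered for index {index_name!r}")
    # Step 2: service name -> server list, via the cluster resource list.
    return list(cluster_range[business])


allowed = servers_for_index("business_A_index_computer_01")
```

For the index business_A_index_computer_01, allowed is the list server00 through server09. An index name that matches no pattern raises LookupError in this sketch; how the real system treats such a request is not stated here.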

The 5 primary shards and their replica shards are then allocated using shard allocators such as a physical-aware decider (which may be called PhysicalAwareDecider). Elasticsearch broadcasts the metadata of the index and its shards and updates the cluster state, and the index creation is complete.
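As a rough illustration of this allocation step, the sketch below places 5 primary shards and 1 replica of each on the 10 permitted servers. It is not the patented decider logic: the function allocate_shards is hypothetical, the "running state" of a server is reduced to an assumed count of shards it already holds, and the rule that a primary and its replica never share a server is borrowed from usual Elasticsearch behaviour.

```python
def allocate_shards(servers, num_primaries, replicas_per_primary, load=None):
    """Place every shard copy on the least-loaded permitted server.

    Returns {(shard_id, copy_no): server}, where copy_no 0 is the primary.
    `load` maps server -> number of shards already held (assumed metric).
    """
    copies = replicas_per_primary + 1
    if copies > len(servers):
        raise ValueError("not enough servers to separate copies of a shard")
    load = {s: (load or {}).get(s, 0) for s in servers}
    placement = {}
    for shard_id in range(num_primaries):
        used = set()  # servers already holding a copy of this shard
        for copy_no in range(copies):
            candidates = [s for s in servers if s not in used]
            # Ties are broken by position in `servers`.
            target = min(candidates, key=lambda s: load[s])
            placement[(shard_id, copy_no)] = target
            used.add(target)
            load[target] += 1
    return placement


allowed = [f"server{i:02d}" for i in range(10)]  # server00 ... server09
placement = allocate_shards(allowed, num_primaries=5, replicas_per_primary=1)
```

With no prior load, the 10 shard copies end up on 10 different servers, and no shard is ever placed outside the list passed in, which is the isolation property the example relies on.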

Fig. 5 is a schematic structural diagram of an apparatus for physical isolation of a lucene-based distributed cluster according to an exemplary embodiment. Referring to Fig. 5, the apparatus includes an index creating module, which includes:

As a preferred implementation manner, in an embodiment of the present invention, the apparatus further includes a relationship creating module, where the relationship creating module is configured to generate a current cluster resource list and a current index relationship list stored in the distributed memory, and the relationship creating module includes:

As a preferred implementation manner, in an embodiment of the present invention, the apparatus further includes:

As a preferred implementation manner, in an embodiment of the present invention, the apparatus further includes a data writing module, where the data writing module includes:

As a preferred implementation manner, in an embodiment of the present invention, the apparatus further includes a data search module, where the data search module includes:

In summary, the technical solution provided by the embodiment of the present invention has the following beneficial effects:

It should be noted that, when the apparatus for physical isolation of a lucene-based distributed cluster provided in the foregoing embodiment triggers an isolation service, the division into the functional modules described above is used only for illustration. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiment and the method embodiment for physical isolation of a lucene-based distributed cluster belong to the same concept, that is, the apparatus is based on the method. Its specific implementation process is detailed in the method embodiment and is not repeated here.

It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.

The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting the invention. Any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included within its scope of protection.

Claims

1. A method for physical isolation of a lucene-based distributed cluster, the method comprising an index creation flow, the index creation flow comprising the steps of:

2. The method for physical isolation of a lucene-based distributed cluster according to claim 1, further comprising a process of generating the current cluster resource list and the current index relationship list stored in the distributed memory:

3. The method according to claim 1 or 2, wherein querying the current cluster resource list and the current index relationship list stored in the distributed memory according to the index creation request of the service comprises:

4. The method for physical isolation of a lucene-based distributed cluster according to claim 1 or 2, further comprising a write flow, the write flow comprising the steps of:

5. The method for physical isolation of a lucene-based distributed cluster according to claim 1 or 2, further comprising a search flow, the search flow comprising the steps of:

6. An apparatus for physical isolation of a lucene-based distributed cluster, the apparatus comprising an index creation module, the index creation module comprising:

7. The apparatus for physical isolation of a lucene-based distributed cluster according to claim 6, further comprising a relationship creation module for generating a current cluster resource list and a current index relationship list stored in the distributed memory, the relationship creation module comprising:

8. The apparatus for physical isolation of a lucene-based distributed cluster according to claim 6 or 7, further comprising:

9. The apparatus for physical isolation of a lucene-based distributed cluster according to claim 6 or 7, further comprising a data write module, the data write module comprising:

10. The apparatus for physical isolation of a lucene-based distributed cluster according to claim 6 or 7, further comprising a data search module, the data search module comprising: