CN113298127B

CN113298127B - Method for training anomaly detection model and electronic equipment

Info

Publication number: CN113298127B
Application number: CN202110517926.8A
Authority: CN
Inventors: 卢冠男; 孙芮; 莫林林; 王雅琪
Original assignee: WeBank Co Ltd
Current assignee: WeBank Co Ltd
Priority date: 2021-05-12
Filing date: 2021-05-12
Publication date: 2024-08-06
Anticipated expiration: 2041-05-12
Also published as: CN113298127A

Abstract

The embodiment of the invention discloses a method and electronic equipment for training an anomaly detection model, wherein the method for training the anomaly detection model comprises the following steps: inputting the feature vector corresponding to at least one set abnormal event and the corresponding first weight into an abnormal detection model to obtain a first probability value corresponding to each set abnormal type in a plurality of set abnormal types corresponding to each set abnormal event in the at least one set abnormal event; calculating a loss value of the anomaly detection model by adopting a set loss function; the loss value is calculated based on a first probability value and a second probability value corresponding to each set abnormal event in the at least one set abnormal event; the second probability value represents a calibration probability value corresponding to a set abnormal type corresponding to a set abnormal event; and updating the weight parameters of the anomaly detection model according to the loss value.

Description

Method for training anomaly detection model and electronic equipment

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method for training an anomaly detection model and an electronic device.

Background

With the development of computer technology, more and more technologies (e.g., big data and artificial intelligence) are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (fintech). Because of the security and real-time requirements of the financial industry, fintech also places higher demands on these technologies. In the fintech field, when a multi-classification model is used to detect the abnormal type of an abnormal event, the prediction probability that the model outputs for each set abnormal type is inaccurate, so the abnormal type determined based on that prediction probability is also inaccurate.

Disclosure of Invention

In view of the above, an embodiment of the present invention provides a method for training an anomaly detection model and an electronic device, so as to solve the technical problem that the anomaly type of the anomaly event determined in the related art is inaccurate.

In order to achieve the above purpose, the technical scheme of the invention is realized as follows:

the embodiment of the invention provides a method for training an anomaly detection model, which comprises the following steps:

inputting the feature vector corresponding to at least one set abnormal event and the corresponding first weight into an abnormal detection model to obtain a first probability value corresponding to each set abnormal type in a plurality of set abnormal types corresponding to each set abnormal event in the at least one set abnormal event;

Calculating a loss value of the anomaly detection model by adopting a set loss function; the loss value is calculated based on a first probability value and a second probability value corresponding to each set abnormal event in the at least one set abnormal event; the second probability value represents a calibration probability value corresponding to a set abnormal type corresponding to a set abnormal event;

and updating the weight parameters of the anomaly detection model according to the loss value.

In the above scheme, the method further includes determining a feature vector corresponding to the set abnormal event by:

Determining at least one characteristic information corresponding to the set abnormal event based on at least one of the history log, the history alarm information and the version release record;

and fusing the feature vectors corresponding to the determined feature information to obtain the feature vectors corresponding to the set abnormal event.

In the above scheme, the fusing the feature vectors corresponding to the determined feature information includes:

converting the characteristic information of each abnormal index corresponding to the set abnormal event into a corresponding first vector;

summing all the first vectors corresponding to the set abnormal event to obtain a second vector;

transversely combining the second vector and the third vector to obtain a feature vector corresponding to the set abnormal event; wherein the third vector characterizes a vector corresponding to the characteristic information except the characteristic information of the abnormality index.

In the above scheme, the method further comprises:

determining a first weight corresponding to each set abnormal event in at least two set abnormal events meeting set conditions based on the set normal distribution curve; wherein,

When the occurrence time of the first set abnormal event is later than that of the second set abnormal event, the first weight corresponding to the first set abnormal event is greater than the second weight corresponding to the second set abnormal event; the setting condition characterizes that at least two setting abnormal events correspond to the same product or have the same characteristic information.

In the above solution, the calculating the loss value of the anomaly detection model by using a set loss function includes:

respectively calculating a first loss value, a second loss value and a third loss value based on the corresponding set loss function;

determining a loss value of the anomaly detection model based on the first loss value, the second loss value, and the third loss value; wherein,

The first loss value characterizes the difference between the first probability value corresponding to each set abnormal event in the at least one set abnormal event and the corresponding second probability value;

The second loss value represents the average value of the maximum probability value corresponding to each set abnormal event in the at least one set abnormal event; the maximum probability value characterizes the maximum value of the product between the first probability value corresponding to each set abnormality type and the second probability value corresponding to each set abnormality type;

the third loss value characterizes a mean value of a sum of all first probability values corresponding to each of the at least one set anomaly event.

In the above solution, the setting a loss function includes: the first set loss function, the second set loss function, and the third set loss function.

In the above scheme, the second loss value is calculated by:

Calculating a first product between a first probability value and a second probability value corresponding to each set abnormality type corresponding to each set abnormality event in the at least one set abnormality event based on the second set loss function; and

And determining a second loss value based on the maximum first product corresponding to each set abnormal event in the at least one set abnormal event.

In the above scheme, the third loss value is calculated by:

and calculating the average value of the sum of all the first probability values corresponding to each set abnormal event in the at least one set abnormal event based on the third set loss function to obtain a third loss value.

In the above scheme, the method further comprises:

Inputting a feature vector corresponding to a first abnormal event into a first model to obtain a prediction probability corresponding to each set abnormal type in a plurality of set abnormal types corresponding to the first abnormal event;

determining a set abnormal type to which the first abnormal event belongs based on the prediction probability; the first model is an abnormality detection model obtained by training by the method for training the abnormality detection model according to any one of the schemes.

The embodiment of the invention also provides electronic equipment, which comprises:

the prediction unit is used for inputting the feature vector corresponding to the at least one set abnormal event and the corresponding first weight into the abnormal detection model to obtain a first probability value corresponding to each set abnormal type in a plurality of set abnormal types corresponding to each set abnormal event in the at least one set abnormal event;

A calculation unit for calculating a loss value of the abnormality detection model using a set loss function; the loss value is calculated based on a first probability value and a second probability value corresponding to each set abnormal event in the at least one set abnormal event; the second probability value represents a calibration probability value corresponding to a set abnormal type corresponding to a set abnormal event;

and the updating unit is used for updating the weight parameters of the anomaly detection model according to the loss value.

The embodiment of the invention also provides electronic equipment, which comprises: a processor and a memory for storing a computer program capable of running on the processor,

Wherein the processor is configured to execute any one of the above-described methods of training an anomaly detection model when the computer program is run.

The embodiment of the invention also provides a storage medium, on which a computer program is stored, the computer program implementing the steps of any one of the methods for training an anomaly detection model described above when being executed by a processor.

According to the embodiment of the invention, the anomaly detection model is trained based on at least one feature vector corresponding to the set anomaly event and the corresponding first weight, a loss value of the anomaly detection model is calculated by adopting a set loss function in the training process, and the weight parameters of the anomaly detection model are updated according to the calculated loss value. The first weight corresponding to the set abnormal event is input into the abnormal detection model for training, so that the abnormal detection model pays attention to the feature vector of the set abnormal event with the first weight preferentially, the accuracy of the first probability value corresponding to the set abnormal event with the first weight can be improved, and the set abnormal type of the set abnormal event can be accurately determined. Therefore, when the electronic equipment performs abnormality detection by using the trained abnormality detection model, the prediction probability that the first abnormal event belongs to each set abnormality type can be accurately predicted, and the accuracy of the set abnormality type of the abnormal event determined based on the prediction probability is further improved.

Drawings

FIG. 1 is a schematic diagram of an implementation flow of a method for training an anomaly detection model according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an implementation flow of an anomaly detection method using an anomaly detection model according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a hardware composition structure of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical scheme of the invention is further elaborated below by referring to the drawings in the specification and the specific embodiments.

Fig. 1 is a schematic implementation flow chart of a method for training an anomaly detection model according to an embodiment of the present invention, where an execution body of the flow is an electronic device such as a terminal, a server, etc. As shown in fig. 1, the method of training the anomaly detection model includes:

Step 101: inputting the feature vector corresponding to at least one set abnormal event and the corresponding first weight into an anomaly detection model to obtain a first probability value corresponding to each set abnormal type in a plurality of set abnormal types corresponding to each set abnormal event in the at least one set abnormal event.

The electronic device determines related data of at least one set abnormal event from the abnormal event sample data set and determines a first weight corresponding to each set abnormal event; extracts feature information of the set abnormal event from the related data and determines the feature vector corresponding to that feature information; and inputs the feature vector and the corresponding first weight of each set abnormal event into the anomaly detection model, which processes them to obtain a first probability value corresponding to each set abnormal type in a plurality of set abnormal types corresponding to each set abnormal event.

The anomaly detection model is a multi-classification model used to predict the probability that a set abnormal event belongs to each of a plurality of set abnormal types. The anomaly detection model is composed of a deep neural network (DNN); the first probability value characterizes a prediction probability. In actual application, the electronic device acquires a first array output by the anomaly detection model; the first array characterizes the first probability value corresponding to each of the plurality of set abnormal types for each of the at least one set abnormal event. The number of rows of the first array represents the total number of set abnormal events in the batch currently input into the anomaly detection model; the number of columns of the first array represents the total number of set abnormal types. The first probability value in the mth row and the nth column of the first array represents the probability (prediction probability) that the mth set abnormal event belongs to the nth set abnormal type.

The abnormal event sample data set includes related data corresponding to a plurality of set abnormal events. A set abnormal event is an abnormal event monitored while the software system is running; each set abnormal event corresponds to at least one software product, each software product has at least one set function, and each software product corresponds to at least one set performance index.
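The layout of the first array can be illustrated with a small sketch. The probability values below are made up, and choosing the column with the largest probability is only an assumption for illustration; the text states only that the set abnormal type is determined based on the prediction probability.

```python
def most_likely_types(first_array):
    """For each row (one set abnormal event), return the index of the
    column (set abnormal type) holding the largest first probability value.
    Taking the maximum is an illustrative assumption."""
    return [max(range(len(row)), key=row.__getitem__) for row in first_array]


# Hypothetical first array: 2 set abnormal events (rows) x 3 set abnormal types (columns).
first_array = [
    [0.10, 0.70, 0.20],
    [0.60, 0.30, 0.10],
]

num_events = len(first_array)     # number of rows
num_types = len(first_array[0])   # number of columns
predicted = most_likely_types(first_array)
```

Here `first_array[m][n]` plays the role of the first probability value in the mth row and nth column, using zero-based indices.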

In actual application, the related data of the set abnormal event comprises at least one of a history log, history alarm information and version release record. The feature information of the set abnormal event includes at least one of: characteristic information of an abnormal index, characteristic information of an interrupt event, characteristic information of a change operation, characteristic information of an alarm event, characteristic information of an abnormal system and the like.

In actual application, when each of the at least two determined set abnormal events is derived from different software products or the characteristic information corresponding to each of the determined set abnormal events is not identical, determining the first weight corresponding to each of the determined set abnormal events as 1.

When the determined at least two set abnormal events correspond to the same product, or have the same feature information, the first weight corresponding to each set abnormal event is determined based on the occurrence time of that set abnormal event. Because a set abnormal event that occurred later is a more meaningful reference for root cause localization, a set abnormal event with a later occurrence time has a larger first weight.

In order to accurately determine the first weight corresponding to the set abnormal event, in some embodiments, the method further includes:

Determining a first weight corresponding to each set abnormal event in at least two set abnormal events meeting a set condition based on a set normal distribution curve; when the occurrence time of a first set abnormal event is later than that of a second set abnormal event, the first weight corresponding to the first set abnormal event is greater than the first weight corresponding to the second set abnormal event; the set condition characterizes that the at least two set abnormal events correspond to the same product or have the same feature information.

Here, the curve values of the set normal distribution curves are all smaller than 1 and larger than 0. In practical application, aiming at k set abnormal events of the same product or k set abnormal events with the same characteristic information, the electronic equipment divides a right side curve of a center line (y axis) of a set normal distribution curve into k equal parts to obtain k curve values; and determining a first weight value of each set abnormal event from the k curve values based on the occurrence time of the k set abnormal events.
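The weight assignment described above can be sketched as follows. This is a minimal sketch under stated assumptions: the curve is taken to be the standard normal density (whose values all lie between 0 and 1), the right half is cut off at a hypothetical `x_max`, the k curve values are sampled at the left edge of each of the k equal segments, and the function and parameter names are illustrative rather than taken from the patent.

```python
import math


def normal_pdf(x, sigma=1.0):
    """Zero-mean normal density; with sigma=1.0 every value is in (0, 1)."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def first_weights(occurrence_times, x_max=3.0, sigma=1.0):
    """Assign one curve value to each of k events; later events get larger weights.

    x_max (where the right half of the curve is cut off) and the sampling
    points are assumptions made for this sketch.
    """
    k = len(occurrence_times)
    # Divide [0, x_max] into k equal parts and sample the left edge of each part.
    xs = [x_max * i / k for i in range(k)]
    curve_values = [normal_pdf(x, sigma) for x in xs]  # strictly decreasing
    # Rank events from latest to earliest occurrence time.
    order = sorted(range(k), key=lambda i: occurrence_times[i], reverse=True)
    weights = [0.0] * k
    for rank, event_index in enumerate(order):
        weights[event_index] = curve_values[rank]
    return weights


# Three events of the same product with hypothetical timestamps.
weights = first_weights([100, 300, 200])
```

With these inputs the event at time 300 receives the largest weight and the event at time 100 the smallest.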

When at least one feature vector corresponding to the set abnormal event and the corresponding first weight are input to the abnormality detection model, the abnormality detection model preferentially focuses on the feature vector of the set abnormal event with the first weight, thereby ensuring the accuracy of the first probability value corresponding to the set abnormal event with the first weight.

In some embodiments, the method further includes determining a feature vector corresponding to the set exception event by:

The electronic device determines at least one feature information corresponding to the set abnormal event based on at least one of a history log, history alarm information and version release record corresponding to the set abnormal event, determines a feature vector corresponding to each feature information corresponding to the set abnormal event, and fuses the determined feature vectors corresponding to each feature information to obtain the feature vector corresponding to the set abnormal event. In practical application, the feature vectors are fused, which means that the feature vectors are combined.

In this embodiment, feature information of an external factor corresponding to a set abnormal event may be determined based on the version release record; based on at least one of the history log and the history alarm information, the characteristic information of the internal factors corresponding to the set abnormal event can be determined; therefore, feature information corresponding to the set abnormal event is enriched, and the accuracy of the first probability value obtained by prediction can be improved; and fusing the feature vectors corresponding to different types of feature information corresponding to the set abnormal event to obtain the feature vector corresponding to the set abnormal event, so that the feature vector can be subjected to dimension reduction, and the data processing efficiency of the abnormal detection model is improved.

Considering that the number of the set abnormal events is relatively small, the excessive dimensionality of the feature vector corresponding to the set abnormal event affects the classification efficiency of the abnormal detection model.

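In some embodiments, the fusing the feature vectors corresponding to the determined feature information includes:

converting the feature information of each abnormal index corresponding to the set abnormal event into a corresponding first vector;

summing all the first vectors corresponding to the set abnormal event to obtain a second vector;

transversely combining the second vector and a third vector to obtain the feature vector corresponding to the set abnormal event; wherein the third vector characterizes a vector corresponding to the feature information other than the feature information of the abnormal index.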

Here, the abnormal index refers to a setting index for triggering an alarm; the setting index for triggering the alarm is determined based on the history log or the alarm information. In practical application, the setting index comprises at least one of the following: traffic volume, traffic success rate, time delay, etc.

When there are at least two abnormal indexes, the electronic device converts the feature information of each abnormal index corresponding to the set abnormal event into a corresponding first vector according to the set hierarchical structure, and sums all the first vectors corresponding to the set abnormal event to obtain a second vector, that is, the feature vector corresponding to all the abnormal indexes. When the set abnormal event corresponds to only one abnormal index, the second vector is equal to the first vector.

The electronic equipment determines a third vector corresponding to each set abnormal event based on at least one of the characteristic information of the interrupt event, the characteristic information of the change operation, the characteristic information of the alarm event and the characteristic information of the abnormal system; and transversely combining the second vector and the third vector corresponding to each set abnormal event to obtain the feature vector corresponding to the set abnormal event.

In actual application, the feature information of the abnormal index extracted by the electronic device includes a product identifier, a scene identifier, an index type identifier and an exception type corresponding to the abnormal index. The set hierarchical structure may be [product] [scene] [set index type] [exception type], where a scene, also called a function, is for example transfer, repayment, deposit or loan; the set index types include business transaction amount, business success rate and time delay; and the exception types include sudden increase and sudden decrease.

In actual application, the electronic device determines the number of bits of the first vector based on the determined first number of product types corresponding to at least one set abnormal event, the second number of scenes included in each product type, the third number of set index types corresponding to each scene and the fourth number of abnormal types. Wherein the number of bits of the first vector = first number + first number x second number + third number x fourth number.

For example, the determined setup anomalies are from product A and product B, product A including scenario a and scenario aa; product B includes scene B and scene bb; scene a, scene aa, scene b, and scene bb, respectively comprising 4 set index types: current success rate, system success rate, transaction amount and time delay; then, the number of bits of the first vector corresponding to each anomaly index is: 2+2×2+4×2=14.
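The arithmetic in this example can be checked with a short sketch. The function and parameter names are illustrative; the formula is the one stated above (first number + first number x second number + third number x fourth number).

```python
def first_vector_length(num_products, scenes_per_product,
                        index_types_per_scene, num_exception_types):
    """Number of bits of the first vector, following the stated formula."""
    return (num_products
            + num_products * scenes_per_product
            + index_types_per_scene * num_exception_types)


# Products A and B, two scenes each, four set index types per scene,
# two exception types (sudden increase and sudden decrease).
length = first_vector_length(2, 2, 4, 2)
```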

For example, when the feature information of the set abnormal event indicates that the system success rate corresponding to scene a of product A triggered an alarm, the feature information of the system success rate corresponding to product A is one-hot encoded according to the set hierarchical structure to obtain the first vector corresponding to the system success rate of product A, which is [1,0,1,0,0,0,1,0,0,0,0,0,0,0] or [1,0,1,0,0,0,0,1,0,0,0,0,0,0]. The first two bits of the first vector characterize product A; bits 3 to 6 of the first vector characterize scene a; in the last 8 bits of the first vector, "1,0,0,0,0,0,0,0" represents a sudden increase in the system success rate, and "0,1,0,0,0,0,0,0" represents a sudden decrease in the system success rate.

It should be noted that, under the condition that all the first vectors corresponding to each set abnormal event are determined, the first vectors are summed by bits to obtain corresponding second vectors.

Wherein, the feature information of the interrupt event characterizes whether a message has been lost; the third vector corresponding to the feature information of the interrupt event is [0] or [1]. A message loss indicates that an interrupt event has occurred and that the internal function calls have no problem.

The feature information of the abnormal system characterizes whether there is a subsystem with the highest time consumption, or whether there is a subsystem that is called deepest and corresponds to a failure log; the third vector corresponding to the feature information of the abnormal system is [0] or [1]. Locating the abnormal subsystem plays a critical role in the final root cause determination.

The feature information of the change operation characterizes whether a change operation record targets the determined abnormal subsystem; the third vector corresponding to the feature information of the change operation is [0] or [1]. When the third vector corresponding to the feature information of the change operation indicates that the change operation record targets the determined abnormal subsystem, the abnormal subsystem may be the true root cause of the set abnormal event.

The alarm events include middleware alarm events and network alarm events. And the third vector corresponding to the characteristic information of the alarm event is [0] or [1]. The feature information of the middleware alarm event represents whether the middleware alarm event of a set level related to the abnormal subsystem exists or not; the characteristic information of the network alarm event characterizes whether the network alarm event of a set level related to the abnormal subsystem exists.

It should be noted that, when there is a middleware alarm event of a set level related to an abnormal subsystem, a delay increase or a success rate decrease may be caused; when there is a network alarm event at a set level associated with an abnormal subsystem, an abnormality may be caused to occur in a plurality of set indicators.
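The fusion described above (bitwise summation of the first vectors, followed by transverse combination with the third vector) can be sketched as follows. The second first vector and the five flags of the third vector, including their order, are hypothetical values chosen for illustration.

```python
def sum_bitwise(first_vectors):
    """Sum the first vectors position by position to obtain the second vector."""
    return [sum(bits) for bits in zip(*first_vectors)]


def build_feature_vector(first_vectors, third_vector):
    """Concatenate the second vector and the third vector side by side."""
    second_vector = sum_bitwise(first_vectors)
    return second_vector + list(third_vector)


# Two abnormal indexes of the same set abnormal event, each a 14-bit first vector.
first_vectors = [
    [1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],  # hypothetical second index
]
# Assumed flag order: interrupt event, abnormal system, change operation,
# middleware alarm event, network alarm event.
third_vector = [1, 0, 1, 0, 1]

feature_vector = build_feature_vector(first_vectors, third_vector)
```

However many abnormal indexes an event has, the resulting feature vector keeps a fixed length (here 14 + 5 = 19).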

Step 102: calculating a loss value of the anomaly detection model by adopting a set loss function; the loss value is calculated based on a first probability value and a second probability value corresponding to each set abnormal event in the at least one set abnormal event; the second probability value characterizes a calibration probability value corresponding to a set anomaly type corresponding to a set anomaly event.

The electronic equipment calculates a loss value of the abnormality detection model by adopting a set loss function based on a first probability value and a second probability value corresponding to each set abnormality type in a plurality of set abnormality types corresponding to each set abnormality event in at least one set abnormality event.

In actual application, the electronic device may use a second array to represent the second probability value corresponding to each of the plurality of set abnormal types corresponding to each of the at least one set abnormal event. The number of rows of the second array is the same as the number of rows of the first array, and the number of columns of the second array is the same as the number of columns of the first array. The second probability value in the mth row and the nth column of the second array represents the probability (true probability) that the mth set abnormal event belongs to the nth set abnormal type.

It should be noted that, the first probability value and the second probability value are both greater than 0 and less than or equal to 1.

Aiming at the set abnormal events with the same characteristics and different calibration probabilities, in order to avoid that the first probability value corresponding to the set abnormal type output by the abnormal detection model is smaller than 0.5, and thus the set abnormal type to which the set abnormal event belongs cannot be determined through the first probability value, the loss function is improved, and the first probability value larger than 0.5 can be output by the abnormal detection model. In some embodiments, the calculating the loss value of the anomaly detection model using a set loss function includes:

respectively calculating a first loss value, a second loss value and a third loss value based on the corresponding set loss function;

determining a loss value of the anomaly detection model based on the first loss value, the second loss value and the third loss value; wherein,

the first loss value characterizes the difference between the first probability value corresponding to each set abnormal event in the at least one set abnormal event and the corresponding second probability value;

the second loss value characterizes the mean of the maximum probability value corresponding to each set abnormal event in the at least one set abnormal event; the maximum probability value characterizes the maximum of the products between the first probability value and the second probability value corresponding to each set abnormal type;

the third loss value characterizes the mean of the sum of all first probability values corresponding to each set abnormal event in the at least one set abnormal event.

In some embodiments, the set loss function includes a first loss function, a second loss function and a third loss function.

Using the first loss function, the electronic device calculates, based on the first probability value and the second probability value corresponding to each set abnormal type of each set abnormal event in the at least one set abnormal event, the difference between the first probability value and the second probability value corresponding to each set abnormal event, to obtain the first loss value; using the second loss function, it calculates, based on the first probability value and the second probability value corresponding to each set abnormal type of each set abnormal event, the mean of the maximum probability value corresponding to each set abnormal event, to obtain the second loss value; and using the third loss function, it calculates, based on the first probability value corresponding to each set abnormal type of each set abnormal event, the mean of the sum of all first probability values corresponding to each set abnormal event, to obtain the third loss value.

When the first loss value, the second loss value, and the third loss value are calculated, a sum of the first loss value, the second loss value, and the third loss value is determined as a loss value of the abnormality detection model.
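A minimal sketch of this combination step follows. It assumes the third loss value is simply the mean, over events, of the sum of each event's first probability values, as the description states; any additional terms in the actual third loss function are not covered. The function names and sample numbers are illustrative.

```python
def third_loss(first_array):
    """Mean over events of the sum of all first probability values per event."""
    row_sums = [sum(row) for row in first_array]
    return sum(row_sums) / len(row_sums)


def total_loss(loss1, loss2, loss3):
    """Loss value of the anomaly detection model: the sum of the three loss values."""
    return loss1 + loss2 + loss3


# Hypothetical first array: 2 events x 2 set abnormal types.
first_array = [[0.2, 0.3], [0.5, 0.5]]
loss3 = third_loss(first_array)          # (0.5 + 1.0) / 2
combined = total_loss(0.4, 0.3, loss3)   # 0.4 and 0.3 are placeholder loss values
```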

In actual application, the first loss function is used to calculate the difference between the first probability value and the second probability value corresponding to each set abnormal event in the at least one set abnormal event; the expression of the first loss function is:

$$Loss1 = \frac{1}{N}\sum_{j=1}^{N} BCE(x_j)$$

Wherein,

$$BCE(x_j) = \frac{1}{C}\sum_{i=1}^{C} BCE(x_j)_i$$

$$BCE(x)_i = -\left[y_i \log f_i(x) + (1 - y_i)\log\left(1 - f_i(x)\right)\right]$$

Wherein N represents the total number of set abnormal events, BCE(x_j) represents the loss value corresponding to the jth set abnormal event, C represents the total number of set abnormal types, BCE(x)_i represents the cross entropy of the ith set abnormal type, f_i(x) represents the first probability value corresponding to the ith set abnormal type, and y_i represents the second probability value corresponding to the ith set abnormal type.

In actual application, the electronic device calculates, based on the expression for BCE(x)_i, the cross entropy of each set abnormal type in the plurality of set abnormal types corresponding to each set abnormal event; calculates, based on the expression for BCE(x_j), the mean of the cross entropies of the set abnormal types corresponding to each set abnormal event to obtain the loss value of that set abnormal event; and finally calculates, based on the expression for Loss1, the mean of the loss values of all set abnormal events in the at least one set abnormal event to obtain the first loss value.
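The two-stage averaging described above can be sketched in plain Python. This is a sketch rather than the patented implementation: the clipping constant `eps` is an added assumption to keep the logarithms finite, and the function names are illustrative.

```python
import math


def bce_per_type(y, f):
    """Cross entropy of one set abnormal type: -[y log f + (1 - y) log(1 - f)]."""
    return -(y * math.log(f) + (1.0 - y) * math.log(1.0 - f))


def first_loss(y_pred, y_true, eps=1e-7):
    """Mean over events of the mean per-type cross entropy.

    y_pred: first array (predicted probabilities), one row per event.
    y_true: second array (calibration probabilities), same shape.
    eps: clipping bound, an assumption added to avoid log(0).
    """
    event_losses = []
    for pred_row, true_row in zip(y_pred, y_true):
        terms = []
        for f, y in zip(pred_row, true_row):
            f = min(max(f, eps), 1.0 - eps)
            terms.append(bce_per_type(y, f))
        event_losses.append(sum(terms) / len(terms))   # mean over C types
    return sum(event_losses) / len(event_losses)       # mean over N events


# One event, two set abnormal types, maximally uncertain prediction.
loss1 = first_loss([[0.5, 0.5]], [[1.0, 0.0]])
```

For this input each per-type cross entropy equals log 2, so the first loss value is also log 2.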

In some embodiments, the second loss value is calculated by:

In practical application, the second loss function is used to calculate the average value of the maximum probability value corresponding to each set abnormal event in the at least one set abnormal event. The expression of the second loss function is as follows:

Loss2 = 1 - tf.reduce_mean(tf.reduce_max(tf.multiply(y_pred, y_true), axis=1))

Wherein y_pred represents a first array output by the anomaly detection model; y_true represents a second array corresponding to the first array, and the second array holds the second probability value corresponding to each set abnormal type in the plurality of set abnormal types corresponding to the set abnormal event. tf.multiply(y_pred, y_true) characterizes the first product between the first probability value and the second probability value corresponding to each set abnormal type corresponding to the set abnormal event. tf.reduce_max(tf.multiply(y_pred, y_true), axis=1) characterizes determining, from all the first products corresponding to each set abnormal event, the largest first product corresponding to that set abnormal event, that is, finding the maximum first product in each row. The largest first product here corresponds to the maximum probability value mentioned above.

tf.reduce_mean(tf.reduce_max(tf.multiply(y_pred, y_true), axis=1)) characterizes calculating the mean of all the largest first products determined.

The electronic equipment calculates a first product between a first probability value and a second probability value corresponding to each set abnormal type corresponding to each set abnormal event in at least one set abnormal event based on the second loss function; determining the maximum first product corresponding to each set abnormal event from all the calculated first products corresponding to each set abnormal event; and calculating an average value of the maximum first product corresponding to each set abnormal event in at least one set abnormal event to obtain a first average value, and determining the difference value between 1 and the first average value as a second loss value.
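The procedure above can be mirrored without TensorFlow. The following is a minimal sketch in plain Python, assuming y_pred and y_true are lists of rows (one row per set abnormal event); the function name second_loss and the sample values are hypothetical.

```python
def second_loss(y_pred, y_true):
    """1 minus the mean, over events, of the largest per-type first product."""
    largest_products = []
    for pred_row, true_row in zip(y_pred, y_true):
        # First product: first probability value * second probability value.
        products = [p * t for p, t in zip(pred_row, true_row)]
        # Largest first product in this row (the "maximum probability value").
        largest_products.append(max(products))
    first_average = sum(largest_products) / len(largest_products)
    return 1.0 - first_average


# Hypothetical batch: 2 set abnormal events, 3 set abnormal types.
y_pred = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.6]]
y_true = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
loss2 = second_loss(y_pred, y_true)
print(f"second loss value: {loss2:.4f}")
```

Here the largest first products are 0.9 and 0.8, so the first average value is 0.85 and the second loss value is its difference from 1.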

In some embodiments, the third loss value is calculated by:

In practical application, the third loss function is used to calculate the average value of the sum of all the first probability values corresponding to each set abnormal event in the at least one set abnormal event. The expression of the third loss function is as follows:

Loss3 = tf.reduce_mean(tf.reduce_sum(input_tensor=y_pred, axis=1))

Wherein tf.reduce_sum(input_tensor=y_pred, axis=1) characterizes calculating the sum of all the first probability values corresponding to each set abnormal event.

tf.reduce_mean(tf.reduce_sum(input_tensor=y_pred, axis=1)) characterizes calculating the mean of the sums of all the first probability values corresponding to each set abnormal event.

The electronic equipment calculates the sum of all first probability values corresponding to each set abnormal event in at least one set abnormal event based on the third loss function; and carrying out average value calculation on the sum of all the first probability values corresponding to each set abnormal event to obtain a third loss value.
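As a rough illustration of the same calculation outside TensorFlow, the sketch below uses plain Python and assumes y_pred is a list of rows, one per set abnormal event; the function name third_loss and the sample values are hypothetical.

```python
def third_loss(y_pred):
    """Mean, over events, of the sum of all first probability values."""
    row_sums = [sum(pred_row) for pred_row in y_pred]
    return sum(row_sums) / len(row_sums)


# Hypothetical batch: 2 set abnormal events, 3 set abnormal types.
y_pred = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.6]]
loss3 = third_loss(y_pred)
print(f"third loss value: {loss3:.4f}")
```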

It should be noted that, in Loss2, the smaller the maximum first product between the first probability value and the second probability value, the larger the value of Loss2. After the anomaly detection model has been trained many times, Loss2 therefore pushes the model to raise the prediction probabilities it outputs, that is, Loss2 raises the first probability values output by the anomaly detection model. However, once the first probability values are raised, the set abnormal type to which a set abnormal event belongs, as determined from the first probability values, may become inaccurate. Loss3 is therefore added: Loss3 characterizes a penalty term on the sum of the first probability values, which ensures the accuracy of the set abnormal type determined from the first probability values.
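A small worked example may make this interplay concrete. The numbers are hypothetical and chosen only for illustration: a single set abnormal event with three set abnormal types and calibration values y_true = (1, 0, 0).

$$
\begin{aligned}
y_{pred} = (1, 1, 1):&\quad \mathrm{Loss2} = 1 - \max(1\cdot 1,\; 1\cdot 0,\; 1\cdot 0) = 0, \quad \mathrm{Loss3} = 1 + 1 + 1 = 3 \\
y_{pred} = (1, 0, 0):&\quad \mathrm{Loss2} = 1 - \max(1\cdot 1,\; 0\cdot 0,\; 0\cdot 0) = 0, \quad \mathrm{Loss3} = 1 + 0 + 0 = 1
\end{aligned}
$$

Loss2 alone cannot distinguish the two outputs, because it only looks at the largest first product. Loss3 is smaller for the second output, so the sum of the losses favors the output that does not assign a high probability to every set abnormal type.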

Step 103: Update the weight parameters of the anomaly detection model according to the loss value.

The electronic device updates the weight parameters of the anomaly detection model according to the loss value of the anomaly detection model, so as to improve the accuracy of the first probability value output by the anomaly detection model. The electronic device back-propagates the loss value through the anomaly detection model. As the loss value is back-propagated to each layer of the anomaly detection model, the electronic device calculates the gradient of the loss function according to the loss value, and updates the weight parameters of the current layer along the descending direction of the gradient.
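The update along the descending direction of the gradient can be illustrated with a deliberately tiny model. The sketch below assumes a single sigmoid unit trained with the cross entropy of one set abnormal type; the model structure, the learning rate lr, and all names and values are assumptions made for the example, since the embodiment does not fix them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    """Forward propagation of a single sigmoid unit (assumed model)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def bce(f, y):
    """Cross entropy between a first probability value f and calibration y."""
    return -(y * math.log(f) + (1.0 - y) * math.log(1.0 - f))


def gradient_step(weights, x, y, lr=0.5):
    """One weight update along the negative gradient of the loss."""
    f = predict(weights, x)
    # For sigmoid + cross entropy, d(loss)/d(w_k) = (f - y) * x_k.
    grads = [(f - y) * xi for xi in x]
    return [w - lr * g for w, g in zip(weights, grads)]


weights = [0.0, 0.0]
x, y = [1.0, 2.0], 1.0  # hypothetical feature vector and calibration value

loss_before = bce(predict(weights, x), y)
weights = gradient_step(weights, x, y)
loss_after = bce(predict(weights, x), y)
print(loss_before, loss_after)
```

In a multi-layer model the same rule is applied layer by layer, with the gradient for each layer obtained by back propagation.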

The electronic equipment uses the weight parameters obtained after updating as weight parameters used in the anomaly detection model when the anomaly detection model is trained next time.

Here, an update stop condition may be set. When the update stop condition is satisfied, the weight parameters obtained by the last update are determined as the weight parameters used by the trained anomaly detection model. The update stop condition may be, for example, a set number of training rounds (epochs), where one training round is a process of training the anomaly detection model once according to the feature vector and the corresponding first weight of each set abnormal event in the at least one set abnormal event. Of course, the update stop condition is not limited to this. For example, the condition may be that the loss value corresponding to each set abnormal event in the at least one set abnormal event is less than or equal to a set loss threshold.
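The two stop conditions mentioned above can be combined in one training loop. The following sketch is only illustrative: train_one_epoch is a hypothetical callback that performs one training round and returns the updated weight parameters together with the loss value of each set abnormal event, and fake_epoch is a stand-in used purely to exercise the loop.

```python
def train_until_stop(train_one_epoch, max_epochs, loss_threshold=None):
    """Run training rounds until an update stop condition is satisfied."""
    weights = None
    epochs_run = 0
    for epoch in range(max_epochs):  # condition 1: set number of epochs
        weights, event_losses = train_one_epoch(epoch)
        epochs_run = epoch + 1
        # Condition 2: every per-event loss value is at or below the threshold.
        if loss_threshold is not None and all(
            loss <= loss_threshold for loss in event_losses
        ):
            break
    # The weights from the last update are the trained model's weights.
    return weights, epochs_run


def fake_epoch(epoch):
    """Stand-in trainer whose per-event losses halve every round."""
    scale = 0.5 ** (epoch + 1)
    return {"epoch": epoch}, [1.0 * scale, 0.5 * scale]


_, stopped_by_threshold = train_until_stop(fake_epoch, max_epochs=50, loss_threshold=0.1)
_, stopped_by_epochs = train_until_stop(fake_epoch, max_epochs=3)
print(stopped_by_threshold, stopped_by_epochs)
```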

Back propagation is relative to forward propagation, which refers to the feed-forward processing of the model, and the direction of back propagation is opposite to that of forward propagation. The back propagation refers to updating the weight parameters of each layer of the model according to the result output by the model. For example, where the model includes a first layer, a second layer, and a third layer, forward propagation refers to processing in the order of the first layer, the second layer, and the third layer, and backward propagation refers to updating the weight parameters of the respective layers in the order of the third layer, the second layer, and the first layer.

It should be noted that, in the process of training the anomaly detection model, the electronic device completes one training by adopting the related data of the set anomaly event in the same batch. That is, the electronic device inputs the feature vector and the corresponding first weight value corresponding to the set abnormal event in the same batch to the abnormality detection model for training. Different training rounds may correspond to different batches of set exception events.

In the solution provided in this embodiment, training is performed on the anomaly detection model based on at least one feature vector corresponding to the set anomaly event and a corresponding first weight, and in the training process, a loss value of the anomaly detection model is calculated by using a set loss function, and a weight parameter of the anomaly detection model is updated according to the calculated loss value. The first weight corresponding to the set abnormal event is input into the abnormal detection model for training, so that the abnormal detection model pays attention to the feature vector of the set abnormal event with the first weight preferentially, the accuracy of the first probability value corresponding to the set abnormal event with the first weight can be improved, and the set abnormal type of the set abnormal event can be accurately determined.

As another embodiment of the present invention, the anomaly detection model may be put into use after it has been trained. For example, in an anomaly detection scenario, the electronic device may perform anomaly detection using the anomaly detection model trained by the foregoing embodiments to determine the abnormal type to which an abnormal event belongs. The electronic device that trains the anomaly detection model may be different from the electronic device that uses the anomaly detection model for root cause positioning in the present embodiment.

As shown in fig. 2, the implementation process of abnormality detection by the electronic device using the trained abnormality detection model is as follows:

Step 201: Input the feature vector corresponding to the first abnormal event into a first model to obtain the prediction probability corresponding to each set abnormal type in a plurality of set abnormal types corresponding to the first abnormal event.

The first model is an anomaly detection model trained by any one of the above methods for training an anomaly detection model.

When the electronic device acquires the related data of the first abnormal event, it determines the feature vector corresponding to the first abnormal event and inputs that feature vector into the first model to obtain the prediction probability corresponding to each set abnormal type in the plurality of set abnormal types corresponding to the first abnormal event. In actual application, the weight value corresponding to the first abnormal event defaults to 1.

The method for determining the feature vector corresponding to the first abnormal event is the same as the method for determining the feature vector corresponding to the set abnormal event in the step 101, and the method for predicting the prediction probability corresponding to the first abnormal event is the same as the method for predicting the first probability value corresponding to the set abnormal event in the step 101, which is not described herein.

Step 202: Determine the set abnormal type to which the first abnormal event belongs based on the prediction probability.

Here, the electronic device may determine each set abnormal type whose prediction probability is greater than a set threshold as a set abnormal type to which the first abnormal event belongs, wherein the set threshold is greater than 0.5.
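The threshold rule can be sketched as follows. This is a minimal illustration, assuming the prediction probabilities are held in a dictionary keyed by set abnormal type; the type names, the default threshold of 0.6, and the function name are hypothetical.

```python
def select_abnormal_types(pred_probs, threshold=0.6):
    """Return the set abnormal types whose prediction probability exceeds the threshold."""
    if threshold <= 0.5:
        # The embodiment states that the set threshold is greater than 0.5.
        raise ValueError("the set threshold must be greater than 0.5")
    return [name for name, prob in pred_probs.items() if prob > threshold]


# Hypothetical prediction probabilities output by the first model.
pred_probs = {"database": 0.82, "network": 0.35, "host": 0.64}
selected = select_abnormal_types(pred_probs)
print(selected)
```

Because each set abnormal type is compared with the threshold independently, more than one type may be selected for the same abnormal event.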

In this embodiment, the anomaly detection is performed based on the anomaly detection model trained by the method, so that the prediction probability corresponding to each set anomaly type in the multiple set anomaly types corresponding to the first anomaly event can be accurately predicted, and the set anomaly type to which the first anomaly event belongs is accurately determined based on the prediction probability.

In the application scenario of root cause positioning, multiple root causes often correspond to one abnormal event, that is, multiple root causes may cause the same abnormal event. To perform root cause positioning accurately, in some embodiments the electronic device may determine a first score corresponding to each candidate root cause in each set abnormal type, based on the prediction probability corresponding to each set abnormal type corresponding to the first abnormal event and on the confidence corresponding to each candidate root cause in at least two candidate root causes corresponding to the first abnormal event; and determine a target root cause corresponding to the first abnormal event based on the determined first scores.

Here, the electronic device determines, based on a set root cause set, the confidence corresponding to each candidate root cause in the at least two candidate root causes corresponding to the first abnormal event. It then determines the first score corresponding to each candidate root cause in each set abnormal type based on the prediction probability corresponding to each set abnormal type corresponding to the first abnormal event and on the determined confidence of the candidate root cause. The set root cause set comprises a first correspondence between set abnormal events and set root causes, and a second correspondence between set root causes and confidences.

After determining the first score corresponding to each candidate root cause in each set abnormal type, the electronic device may determine the candidate root cause with the highest first score as the target root cause corresponding to the first abnormal event, or may determine each candidate root cause whose first score is greater than a set threshold as a target root cause corresponding to the first abnormal event.

In practical application, the electronic device may also sort the first scores corresponding to the determined candidate root causes in each set abnormality type, and determine the target root cause corresponding to the first abnormality event based on the sorted first scores.
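The scoring and sorting steps can be sketched as below. The embodiment does not state how the first score is computed from the prediction probability and the confidence, so the product used here is purely an assumption; the root cause names, type names, and values are likewise hypothetical.

```python
def first_scores(type_probs, root_confidence):
    """First score of each candidate root cause in each set abnormal type.

    Assumption: score = prediction probability * confidence.
    """
    return {
        (root, abnormal_type): confidence * prob
        for root, confidence in root_confidence.items()
        for abnormal_type, prob in type_probs.items()
    }


def rank_root_causes(scores):
    """Sort first scores in descending order and pick the top candidate."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (target_root, _abnormal_type), _score = ranked[0]
    return target_root, ranked


# Hypothetical model output and set root cause confidences.
type_probs = {"database": 0.8, "network": 0.3}
root_confidence = {"slow_query": 0.9, "packet_loss": 0.7}

scores = first_scores(type_probs, root_confidence)
target, ranked = rank_root_causes(scores)
print(target)
```

A threshold-based variant would instead keep every candidate root cause whose first score exceeds the set threshold.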

In this embodiment, a prediction probability corresponding to each set abnormality type of a plurality of set abnormality types corresponding to a first abnormality event is predicted by an abnormality detection model; determining a first score corresponding to each candidate root cause in each set abnormality type based on the prediction probability corresponding to each set abnormality type corresponding to the first abnormality event and the confidence corresponding to each candidate root cause in at least two candidate root causes corresponding to the first abnormality event; determining a target root cause corresponding to a first abnormal event based on the first score corresponding to the determined candidate root cause in each set abnormal type; therefore, the target root cause can be accurately positioned, and the accuracy of the determined target root cause is improved.

In order to implement the method for training the anomaly detection model according to the embodiment of the present invention, the embodiment of the present invention further provides an electronic device, as shown in fig. 3, where the electronic device includes:

a prediction unit 31, configured to input a feature vector and a corresponding first weight corresponding to at least one set exception event to an exception detection model, so as to obtain a first probability value corresponding to each set exception type of a plurality of set exception types corresponding to each set exception event in the at least one set exception event;

A calculation unit 32 for calculating a loss value of the abnormality detection model using a set loss function; the loss value is calculated based on a first probability value and a second probability value corresponding to each set abnormal event in the at least one set abnormal event; the second probability value represents a calibration probability value corresponding to a set abnormal type corresponding to a set abnormal event;

an updating unit 33, configured to update the weight parameters of the anomaly detection model according to the loss value.

In some embodiments, the electronic device further comprises:

a first determining unit, configured to determine a feature vector corresponding to the set abnormal event by:

In some embodiments, the first determining unit is configured to:

In some embodiments, the electronic device further comprises:

the second determining unit is used for determining a first weight corresponding to each set abnormal event in at least two set abnormal events meeting the set conditions based on the set normal distribution curve; wherein,

In some embodiments, the computing unit 32 is to:

In some embodiments, the computing unit 32 is to: the second loss value is calculated by:

In some embodiments, the computing unit 32 is to: the third loss value is calculated by:

In some embodiments, the electronic device further comprises:

the prediction unit is used for inputting the feature vector corresponding to the first abnormal event into the first model to obtain the prediction probability corresponding to each set abnormal type in a plurality of set abnormal types corresponding to the first abnormal event;

the determining unit is used for determining a set abnormal type to which the first abnormal event belongs based on the prediction probability; wherein,

The first model is an abnormality detection model obtained by training by the method for training an abnormality detection model provided by any one of the embodiments.

In practice, the above units may be implemented by a processor in the electronic device, such as a central processing unit (CPU, Central Processing Unit), a digital signal processor (DSP, Digital Signal Processor), a micro control unit (MCU, Microcontroller Unit), or a field-programmable gate array (FPGA, Field-Programmable Gate Array).

It should be noted that: in the electronic device provided in the above embodiment, only the division of each program module is used for illustration when training the anomaly detection model, and in practical application, the processing allocation may be performed by different program modules according to needs, that is, the internal structure of the device is divided into different program modules, so as to complete all or part of the processing described above. In addition, the electronic device provided in the above embodiment and the method embodiment for training the anomaly detection model belong to the same concept, and the specific implementation process is detailed in the method embodiment, which is not repeated here.

Based on the hardware implementation of the program modules, and in order to implement the method of the embodiment of the present invention, the embodiment of the present invention further provides an electronic device. Fig. 4 is a schematic diagram of a hardware composition structure of an electronic device according to an embodiment of the present invention, and as shown in fig. 4, the electronic device 4 includes:

A communication interface 41 capable of information interaction with other devices such as a network device and the like;

a processor 42, connected to the communication interface 41 to implement information interaction with other devices, and configured to execute, when running a computer program, the method provided by one or more of the above technical solutions. The computer program is stored in the memory 43.

Of course, in practice, the various components in the electronic device 4 are coupled together by a bus system 44. It is understood that the bus system 44 is used to enable connected communications between these components. The bus system 44 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration the various buses are labeled as bus system 44 in fig. 4.

The memory 43 in the embodiment of the present invention is used to store various types of data to support the operation of the electronic device 4. Examples of such data include: any computer program for operation on the electronic device 4.

It will be appreciated that the memory 43 may be volatile memory or non-volatile memory, and may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM, Read-Only Memory), a programmable read-only memory (PROM, Programmable Read-Only Memory), an erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), an electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), a ferromagnetic random access memory (FRAM, Ferromagnetic Random Access Memory), a flash memory (Flash Memory), a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); the magnetic surface memory may be a disk memory or a tape memory. The volatile memory may be a random access memory (RAM, Random Access Memory), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM, Static Random Access Memory), synchronous static random access memory (SSRAM, Synchronous Static Random Access Memory), dynamic random access memory (DRAM, Dynamic Random Access Memory), synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory), double data rate synchronous dynamic random access memory (DDRSDRAM, Double Data Rate Synchronous Dynamic Random Access Memory), enhanced synchronous dynamic random access memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), SyncLink dynamic random access memory (SLDRAM, SyncLink Dynamic Random Access Memory), and direct Rambus random access memory (DRRAM, Direct Rambus Random Access Memory). The memory 43 described in embodiments of the present invention is intended to include, without being limited to, these and any other suitable types of memory.

The method disclosed in the above embodiment of the present invention may be applied to the processor 42 or implemented by the processor 42. The processor 42 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuitry in hardware or instructions in software in the processor 42. The processor 42 may be a general purpose processor, DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 42 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in the embodiment of the invention can be directly embodied in the hardware of the decoding processor or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 43 and the processor 42 reads the program in the memory 43 to perform the steps of the method described above in connection with its hardware.

Optionally, when the processor 42 executes the program, a corresponding flow implemented by the terminal in each method of the embodiment of the present invention is implemented, and for brevity, will not be described herein.

In an exemplary embodiment, the present invention also provides a storage medium, i.e. a computer storage medium, in particular a computer readable storage medium, for example comprising a first memory 43 storing a computer program executable by the processor 42 of the terminal for performing the steps of the method described above. The computer readable storage medium may be FRAM, ROM, PROM, EPROM, EEPROM, flash Memory, magnetic surface Memory, optical disk, or CD-ROM.

In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.

The units described as separate units may or may not be physically separate, and units displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated in one processing module, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.

Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by hardware under the control of program instructions. The foregoing program may be stored in a computer-readable storage medium, and, when executed, performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.

The technical solutions described in the embodiments of the present invention may be combined arbitrarily provided that they do not conflict.

The foregoing is merely a specific implementation of the present invention, and the protection scope of the present invention is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method of training an anomaly detection model, comprising:

inputting the feature vector corresponding to at least one set abnormal event and the corresponding first weight into an abnormal detection model to obtain a first probability value corresponding to each set abnormal type in a plurality of set abnormal types corresponding to each set abnormal event in the at least one set abnormal event; when each of the at least two determined set abnormal events is derived from different software products or the characteristic information corresponding to each of the determined set abnormal events is not completely the same, the first weight corresponding to each of the at least one set abnormal event is 1; when the determined at least two set abnormal events correspond to the same product or the determined at least two set abnormal events have the same characteristic information, the first weight is associated with the occurrence time of the set abnormal events;

The third loss value represents the average value of the sum of all first probability values corresponding to each set abnormal event in the at least one set abnormal event; the loss value is calculated based on a first probability value and a second probability value corresponding to each set abnormal event in the at least one set abnormal event; the second probability value represents a calibration probability value corresponding to a set abnormal type corresponding to a set abnormal event;

2. The method of claim 1, further comprising determining a feature vector corresponding to the set exception event by:

3. The method according to claim 2, wherein the fusing the feature vectors corresponding to the determined feature information includes:

4. The method according to claim 1, wherein the method further comprises:

5. The method of claim 1, wherein the set loss function comprises: a first set loss function, a second set loss function, and a third set loss function.

6. The method of claim 5, wherein the second loss value is calculated by:

7. The method of claim 5, wherein the third loss value is calculated by:

8. The method according to any one of claims 1 to 4, further comprising:

determining a set abnormal type to which the first abnormal event belongs based on the prediction probability; wherein,

The first model is an anomaly detection model trained by the method for training an anomaly detection model as described above.

9. An electronic device, comprising:

The prediction unit is used for inputting the feature vector corresponding to the at least one set abnormal event and the corresponding first weight into the abnormal detection model to obtain a first probability value corresponding to each set abnormal type in a plurality of set abnormal types corresponding to each set abnormal event in the at least one set abnormal event; when each of the at least two determined set abnormal events is derived from different software products or the characteristic information corresponding to each of the determined set abnormal events is not completely the same, the first weight corresponding to each of the at least one set abnormal event is 1; when the determined at least two set abnormal events correspond to the same product or the determined at least two set abnormal events have the same characteristic information, the first weight is associated with the occurrence time of the set abnormal events;

A calculation unit for calculating a first loss value, a second loss value, and a third loss value, respectively, based on the corresponding set loss function; determining a loss value of the anomaly detection model based on the first loss value, the second loss value, and the third loss value; wherein the first loss value characterizes a difference between a first probability value corresponding to each of the at least one set anomaly event and a corresponding second probability value; the second loss value represents the average value of the maximum probability value corresponding to each set abnormal event in the at least one set abnormal event; the maximum probability value characterizes the maximum value of the product between the first probability value corresponding to each set abnormality type and the second probability value corresponding to each set abnormality type; the third loss value represents the average value of the sum of all first probability values corresponding to each set abnormal event in the at least one set abnormal event; the loss value is calculated based on a first probability value and a second probability value corresponding to each set abnormal event in the at least one set abnormal event; the second probability value represents a calibration probability value corresponding to a set abnormal type corresponding to a set abnormal event;

10. An electronic device, comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is adapted to perform the steps of the method of training an anomaly detection model according to any one of claims 1 to 8 when the computer program is run.