CN115034278A - Performance index abnormality detection method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN115034278A
Authority
CN
China
Prior art keywords
data
sample
index data
abnormality detection
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110200046.8A
Other languages
Chinese (zh)
Inventor
叶芝高
何林艳
胡远明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Guangdong Co Ltd
Priority to CN202110200046.8A
Publication of CN115034278A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/30 - Monitoring
    • G06F11/34 - Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 - Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment, for performance assessment

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention provides a performance index abnormality detection method, a performance index abnormality detection device, electronic equipment and a storage medium, wherein the method comprises the following steps: determining performance index data to be detected in an IT system; and inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model. The abnormality detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and their abnormal labels, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the abnormal labels are obtained by carrying out time-series-based differential processing on each sample index data in the sample data set. The method, the device, the electronic equipment and the storage medium provided by the invention realize automatic labeling of massive sample index data through unsupervised clustering and time-series-based differential processing, greatly lower the threshold for implementing abnormality detection, and help improve the accuracy and robustness of abnormality detection on performance index data.

Description

Performance index abnormality detection method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a performance index anomaly detection method and apparatus, an electronic device, and a storage medium.
Background
Current abnormality detection schemes based on the performance indexes of an IT system are generally divided into single-index abnormality detection and multi-index abnormality detection. Multi-index abnormality detection determines abnormality by considering a plurality of indexes together.
For the high-dimensional data generated by an IT system, the result of single-index abnormality detection may be one-sided, and single-index abnormality detection is inefficient and impractical; multi-index abnormality detection can overcome these problems. Multi-index abnormality detection is commonly implemented with either an unsupervised learning algorithm or a supervised classification algorithm. Compared with an unsupervised learning algorithm, a supervised classification algorithm is more accurate and more robust, but it is difficult to implement because massive data cannot be labeled.
Disclosure of Invention
The invention provides a performance index abnormality detection method, a performance index abnormality detection device, electronic equipment and a storage medium, which are used for solving the problems of poor reliability and high implementation difficulty of the conventional performance index abnormality detection method.
The invention provides a performance index abnormality detection method, which comprises the following steps:
determining performance index data to be detected in an IT system;
inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model;
the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
According to the performance index abnormality detection method provided by the invention, unsupervised clustering is carried out on all sample index data to obtain a plurality of data clusters;
arranging the index data of each sample in any data cluster according to a time sequence to obtain a time sequence of each index in any data cluster;
calculating mutation indexes of the time sequences of all indexes in any data cluster at all reference time points, and marking the abnormal labels of all sample index data in any data cluster based on the mutation indexes, wherein the reference time points are randomly determined;
and determining a sample data set corresponding to any data cluster based on each sample index data and the abnormal label thereof in any data cluster.
According to the method for detecting the performance index abnormality, the step of calculating the mutation index of the time sequence of each index in any data cluster at each reference time point comprises the following steps:
dividing the time sequence into a front subsequence and a rear subsequence based on a reference time point;
and determining the mutation index of the reference time point based on the mean value and the standard deviation of the front subsequence and the rear subsequence.
According to the performance index abnormality detection method provided by the invention, before unsupervised clustering is carried out on all sample index data to obtain a plurality of data clusters, the method further comprises the following steps:
and performing dimensionality reduction on all sample index data based on a principal component analysis algorithm.
According to the performance index abnormality detection method provided by the invention, performing dimensionality reduction on all sample index data based on a principal component analysis algorithm comprises the following steps:
and carrying out dimensionality reduction on all sample index data by combining a singular value decomposition algorithm and the principal component analysis algorithm.
According to the method for detecting the performance index abnormality provided by the invention, the sample data set corresponding to any data cluster is determined based on each sample index data and the abnormal label thereof in any data cluster, and then the method further comprises the following steps:
and carrying out one-hot encoding and/or label encoding on the time data of each sample index data in the sample data set.
According to the performance index abnormality detection method provided by the invention, the abnormality detection model is constructed on the basis of a scalable large-scale unsupervised outlier detection framework.
The invention provides a performance index abnormality detection device, comprising:
the data acquisition unit is used for determining performance index data to be detected in the IT system;
the abnormality detection unit is used for inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model;
the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
The present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements any of the above steps of the performance index abnormality detection method when executing the computer program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the performance index abnormality detection method according to any one of the above.
According to the performance index abnormality detection method, the performance index abnormality detection device, the electronic equipment and the storage medium, automatic labeling of massive sample index data is achieved through unsupervised clustering and a time sequence-based differential processing mode, the threshold for achieving abnormality detection is greatly reduced, and accuracy and robustness of performance index data abnormality detection are improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a performance index anomaly detection method according to the present invention;
FIG. 2 is a schematic flow chart of a sample data set determination method provided in the present invention;
FIG. 3 is a second schematic flow chart of the performance index abnormality detection method according to the present invention;
FIG. 4 is a second schematic flowchart of a sample data set determination method according to the present invention;
FIG. 5 is a schematic diagram of a performance index anomaly detection device according to the present invention;
FIG. 6 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Anomaly detection, also known as outlier detection, is used to discover the type of anomaly in a data stream and determine detailed information of its occurrence. Common anomaly detection is divided into single-index anomaly detection and multi-index anomaly detection.
Single-index anomaly detection, i.e., time-series anomaly detection, usually relies on statistics-based algorithms. Such algorithms are simple and easy to implement, but they only handle simple scenarios. With the rapid expansion of internet services, operating scenarios have become more and more complex, the data scale of IT system indexes has grown, and the number of monitored indexes has increased. In big-data scenarios, the result of traditional single-index anomaly detection may be one-sided, and its detection efficiency is low, which makes it impractical.
Multi-index anomaly detection judges anomalies by combining multiple indexes, and can be divided into unsupervised learning algorithms and supervised classification algorithms. Unsupervised learning algorithms include Isolation Forest (iForest), Local Outlier Factor (LOF), One-Class SVM, autoencoders and the like. They do not require labeled data, but feature selection is difficult, the accuracy of the detection result is low, and the robustness of the method is poor. Supervised classification algorithms include XGBoost, GBDT, decision trees, support vector machines and the like. They are more accurate, but labeled data is difficult to obtain. The two types of multi-index anomaly detection algorithms thus each have advantages and disadvantages: compared with an unsupervised learning algorithm, a supervised classification algorithm is more accurate and more robust, but it is difficult to implement because massive data cannot be labeled.
In order to solve the above problems, an embodiment of the present invention provides a method for detecting an abnormality in a performance index. Fig. 1 is a schematic flow chart of a performance index abnormality detection method provided by the present invention, as shown in fig. 1, the method includes:
Step 110, determining performance index data to be detected in the IT system.
Here, the IT system is a system that needs to perform performance index abnormality detection, and the performance index data to be detected may include each performance index data generated by each interface of each device in the IT system.
Step 120, inputting the performance index data into an anomaly detection model to obtain an anomaly detection result output by the anomaly detection model;
the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
Specifically, the anomaly detection model is obtained by supervised classification, the training sample includes a plurality of sample data sets, and the sample index data included in each sample data set is labeled with an anomaly label for reflecting whether the corresponding sample index data is abnormal or not.
In consideration of the fact that the sample data required by the multi-index anomaly detection model training is extremely large in scale and manual sample labeling is not practical, the embodiment of the invention realizes automatic labeling of the sample data through unsupervised clustering and a differential processing mode based on a time sequence.
The acquisition of a plurality of sample data sets can be realized by the following steps: firstly, a large amount of sample index data is collected, and unsupervised clustering is carried out on all the sample index data, so that all the sample index data are divided into a plurality of classes, and each class of sample index data corresponds to one sample data set. And respectively carrying out differential processing based on time series on various sample index data, thereby quickly positioning abnormal data in the various sample index data and further labeling the abnormal label of the various sample index data.
After obtaining a plurality of sample data sets, model training can be performed based on the plurality of sample data sets, so as to obtain an anomaly detection model, and the anomaly detection model obtained through supervised classification learns the mapping relationship between the sample index data and the anomaly labels, so that in step 120, the performance index data to be detected can be directly input into the anomaly detection model, the anomaly detection model applies the learned mapping relationship, and the performance index data is mapped to the corresponding anomaly labels, so as to obtain an anomaly detection result and output the anomaly detection result.
The method provided by the embodiment of the invention realizes automatic labeling of massive sample index data through unsupervised clustering and a time sequence-based differential processing mode, greatly reduces the threshold for realizing anomaly detection, and is beneficial to improving the accuracy and robustness of the anomaly detection of the performance index data.
Based on the foregoing embodiment, fig. 2 is a schematic flowchart of a sample data set determining method provided by the present invention, and as shown in fig. 2, the sample data set determining method includes:
Step 210, performing unsupervised clustering on all sample index data to obtain a plurality of data clusters.
Here, the purpose of unsupervised clustering of all sample index data is to classify sample index data of different cycle types, thereby preparing for subsequent time-series-based differential processing.
Step 220, arranging the index data of each sample in any data cluster according to a time sequence to obtain a time sequence of each index in the data cluster;
Step 230, calculating mutation indexes of the time sequences of the indexes in the data cluster at reference time points, and marking the abnormal labels of the index data of the samples in the data cluster based on the mutation indexes, wherein the reference time points are randomly determined;
Step 240, determining a sample data set corresponding to the data cluster based on each sample index data in the data cluster and the abnormal label thereof.
Specifically, after the unsupervised clustering is completed, steps 220 to 240 may be performed on each data cluster, so as to obtain the sample data set corresponding to each data cluster. The description here takes a single data cluster as an example:
Each sample index data in the data cluster carries its generation time, so the sample index data in the data cluster can be sorted by generation time to obtain the time sequence of each index.
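The ordering step can be illustrated with a minimal sketch. The record layout (generation time, index name, value) and the names `cluster_samples` and `build_time_series` are assumptions made only for this illustration; the description does not prescribe a data format.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records of one data cluster: (generation time, index name, value).
cluster_samples = [
    ("2020-10-08T10:32:00", "cpu_usage", 0.61),
    ("2020-10-08T10:30:00", "cpu_usage", 0.55),
    ("2020-10-08T10:31:00", "mem_usage", 0.72),
    ("2020-10-08T10:31:00", "cpu_usage", 0.58),
    ("2020-10-08T10:30:00", "mem_usage", 0.70),
]

def build_time_series(samples):
    """Group samples by index name and sort each group by generation time."""
    series = defaultdict(list)
    for ts, name, value in samples:
        series[name].append((datetime.fromisoformat(ts), value))
    for points in series.values():
        points.sort(key=lambda p: p[0])  # ascending generation time
    return dict(series)

time_series = build_time_series(cluster_samples)
cpu_values = [value for _, value in time_series["cpu_usage"]]
```

Each entry of `time_series` is then the time sequence of one index within the cluster, ready for the mutation index calculation of step 230.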
For a time series, an abrupt change (mutation) marks the corresponding sample index data as abnormal. Therefore, once the time series is obtained, mutation detection can be performed on it. Either a parametric test or a nonparametric test can be used. Because a parametric test imposes requirements on the overall sample distribution, it is applicable only when the distribution of the sequence samples satisfies the assumptions of that test; preferably, the embodiment of the invention adopts a simpler nonparametric test. A nonparametric test does not require the overall distribution of the sequence samples to be assumed in advance, reflects the internal structure and local variation characteristics of the time series, and distinguishes the evolution characteristics of the time series on different scales.
For an IT system, a reference time point can be determined through random seed setting, then the time sequence is cut into a front subsequence and a rear subsequence through the reference time point, a mutation index of the reference time point is calculated based on the front subsequence and the rear subsequence obtained through cutting, whether sample index data at the reference time point is abnormal or not relative to other sample index data in the time sequence is judged through the size of the mutation index, and then abnormal marking of each sample index data in a data cluster is achieved.
After the abnormal marking is finished, a sample data set corresponding to the data cluster can be constructed based on each sample index data carrying the abnormal label in the data cluster.
Based on any of the above embodiments, step 230 includes:
dividing the time sequence into a front subsequence and a rear subsequence based on a reference time point;
and determining the mutation index of the reference time point based on the mean value and the standard deviation of the front subsequence and the rear subsequence.
Specifically, for IT systems, the problem of abrupt change (mutation) can be discussed in terms of information and noise. The time series is cut into a front subsequence and a rear subsequence at the reference time point, and the mean and standard deviation of each subsequence are calculated. The mean represents information and the standard deviation represents noise, so the mutation index $S/N$ of the reference time point can be determined as shown in the following formula:
$$S/N = \frac{|\bar{x}_1 - \bar{x}_2|}{S_1 + S_2}$$
In the formula, $\bar{x}_1$ and $S_1$ respectively represent the mean and standard deviation of the subsequence of the $m_1$ time periods before the reference time point, and $\bar{x}_2$ and $S_2$ respectively represent the mean and standard deviation of the subsequence of the $m_2$ time periods after the reference time point.
By successively setting reference time points within the time period, a time series of the mutation index S/N can be obtained. On this basis, the abnormal label of the sample index data at the corresponding time point can be determined from the magnitude of each mutation index S/N. For example, when 1 < S/N < 2, the abnormal label of the sample index data at the corresponding time point is marked as "1", and when S/N > 2, the abnormal label is marked as "2". Here, abnormal label 1 may represent normal and abnormal label 2 may represent abnormal.
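The labeling procedure can be sketched as follows, assuming the mutation index takes the signal-to-noise form |mean of front subsequence − mean of rear subsequence| / (standard deviation of front + standard deviation of rear). The function names, the use of the sample standard deviation, the choice to start the rear subsequence at the reference point itself, and returning `None` for values the text does not cover (S/N ≤ 1 or S/N = 2) are all assumptions of this sketch.

```python
import random
import statistics

def mutation_index(series, t, m1, m2):
    """Assumed S/N form: |mean_front - mean_rear| / (std_front + std_rear)."""
    front = series[t - m1:t]   # m1 points before the reference time point
    rear = series[t:t + m2]    # m2 points starting at the reference time point
    signal = abs(statistics.mean(front) - statistics.mean(rear))
    noise = statistics.stdev(front) + statistics.stdev(rear)
    if noise == 0:
        return float("inf") if signal > 0 else 0.0
    return signal / noise

def label_from_index(sn):
    """Thresholds from the example in the text; other values are left unlabeled."""
    if sn > 2:
        return 2
    if 1 < sn < 2:
        return 1
    return None

def reference_points(n, m1, m2, count, seed=0):
    """Randomly choose reference points that leave room for both subsequences."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return sorted(rng.sample(range(m1, n - m2 + 1), count))

series = [10, 11, 10, 11, 10, 20, 21, 20, 21, 20]
sn_at_jump = mutation_index(series, t=5, m1=5, m2=5)
tag_at_jump = label_from_index(sn_at_jump)
```

In this toy series the level shifts at position 5, so the index there is large and the point receives label 2, whereas a reference point inside a stable stretch yields a small index.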
In addition, in an actual production environment, the IT system has many devices, each corresponding to different index data, and different index data also appears on different devices and interfaces. For example, in a database, each partition has its own capacity consumption, and database wait events yield differing amounts of index data. The performance index data generated by the IT system is therefore enormous, and dimensionality reduction is currently performed on high-dimensional data before modeling. As the data dimension is reduced, the space required for data storage decreases; low-dimensional data helps reduce computation and training time; and some algorithms tend to perform poorly on high-dimensional data, so dimensionality reduction may improve their usability. In practice, however, few processing methods are available for data dimensionality reduction, and traditional dimensionality reduction methods are time-consuming and computationally expensive.
In response to this problem, before performing step 210, according to any of the above embodiments, the method further includes:
Step 200, performing dimensionality reduction on all sample index data based on a principal component analysis algorithm.
Specifically, the directly acquired sample index data has more dimensions, and the dimension reduction processing can be performed through a Principal Component Analysis (PCA), so that the subsequent calculation amount is reduced, and the subsequent modeling efficiency is improved.
The steps for performing dimensionality reduction by applying PCA are as follows:
1. Collect sample index data; assume that m pieces of n-dimensional sample index data are collected;
2. Arrange the sample index data by column to form a matrix X with n rows and m columns;
3. Zero-mean each row of the matrix X, that is, subtract the mean of the row from each element in it;
4. Compute the covariance matrix $C = \frac{1}{m} X X^T$;
5. Solve for the eigenvalues of the covariance matrix and the corresponding eigenvectors;
6. Arrange the eigenvectors as rows of a matrix, from top to bottom in descending order of the corresponding eigenvalues, and take the first k rows to form a matrix P; Y = PX is then the data reduced to k dimensions.
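The six steps above can be sketched with NumPy as follows. The function name `pca_reduce` and the random test data are illustrative assumptions; the covariance matrix is taken to be (1/m)·X·Xᵀ on the zero-meaned X, with samples stored as columns as in step 2.

```python
import numpy as np

def pca_reduce(X, k):
    """X is n x m (n dimensions, m samples as columns). Returns (P, Y), Y = P X."""
    m = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)   # step 3: zero-mean each row
    C = Xc @ Xc.T / m                        # step 4: covariance matrix (n x n)
    eigvals, eigvecs = np.linalg.eigh(C)     # step 5: eigh suits symmetric C
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    P = eigvecs[:, order].T                  # step 6: eigenvectors as rows (k x n)
    return P, P @ Xc                         # Y is k x m

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 50))                 # 50 samples with 5 dimensions each
P, Y = pca_reduce(X, k=2)
```

The rows of `Y` are uncorrelated and ordered by decreasing variance, which is the property the eigenvalue ordering in step 6 is meant to achieve.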
Based on any of the above embodiments, step 200 includes: carrying out dimensionality reduction on all sample index data by combining a singular value decomposition algorithm and a principal component analysis algorithm.
Specifically, the sample index data obtained from the IT system has a very high dimension, so computing the covariance matrix during PCA dimensionality reduction is too slow. Moreover, the main information retained in the k dimensions is determined only by the training set; it is not necessarily the important information, and information that seems useless but is in fact important may be discarded. That is, applying PCA alone for dimensionality reduction may also exacerbate overfitting. In view of this, the PCA problem can be converted into an SVD (Singular Value Decomposition) problem, thereby avoiding the computation of $XX^T$.
Further, the key to solving PCA is decomposing the covariance matrix
$$C = \frac{1}{m} X X^T$$
while the key to SVD is $A^T A$. Let
$$A = \frac{X^T}{\sqrt{m}}$$
Then:
$$A^T A = \left(\frac{X^T}{\sqrt{m}}\right)^T \frac{X^T}{\sqrt{m}} = \frac{1}{m} X X^T$$
Therefore, SVD and PCA are equivalent here, and solving through SVD improves the efficiency of the iterative eigenvalue solution and reduces the amount of calculation.
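The equivalence can be checked numerically. The sketch below assumes A = Xᵀ/√m with X zero-meaned and stored as n rows by m columns, so that the right singular vectors of A are the eigenvectors of (1/m)·X·Xᵀ and the squared singular values are its eigenvalues. The function name `pca_via_svd` is illustrative only.

```python
import numpy as np

def pca_via_svd(X, k):
    """PCA through SVD of A = X^T / sqrt(m), without forming X X^T explicitly."""
    m = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)            # zero-mean each row
    A = Xc.T / np.sqrt(m)                             # m x n
    _, s, Vt = np.linalg.svd(A, full_matrices=False)  # singular values descending
    P = Vt[:k]                                        # top-k right singular vectors (k x n)
    return P, P @ Xc, s[:k] ** 2                      # squared values = eigenvalues

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 50))
P, Y, top_eigvals = pca_via_svd(X, k=2)

# Reference eigenvalues from the explicit covariance matrix, for comparison.
Xc = X - X.mean(axis=1, keepdims=True)
reference = np.sort(np.linalg.eigvalsh(Xc @ Xc.T / 50))[::-1][:2]
```

Here `top_eigvals` and `reference` agree up to floating-point error, which is the practical content of the identity above.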
Based on any of the above embodiments, step 240 further includes:
and carrying out one-hot coding and/or label coding on the time data of each sample index data in the sample data set.
Specifically, the time data carried in each sample index data, that is, the information used for representing the generation time of the sample index data, may be split.
For example, the time data 2020-10-08 10:30:00.360000+00:00 can be converted into a date type, from which time information such as the year, month, and day is extracted. The hour information can further be converted into discrete data according to the time of day, such as morning and afternoon. The extracted time information of each category can be encoded by one-hot encoding and/or label encoding.
Wherein, the one-hot coding can be completed by writing logic codes. The result of the encoding of a variable with K classes is a binary matrix with K columns, where a value of 1 in the ith column means that the observation belongs to the ith class. The label coding directly converts the category into the number, and the original dimension can be kept by using the label coding.
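A minimal sketch of the time splitting and the two encodings follows. The helper names and the noon boundary between "morning" and "afternoon" are assumptions for illustration; the text only names the two categories as examples.

```python
from datetime import datetime

def split_time(ts):
    """Extract year, month, day and a discrete day period from a timestamp string."""
    dt = datetime.fromisoformat(ts)
    period = "morning" if dt.hour < 12 else "afternoon"  # assumed boundary at noon
    return {"year": dt.year, "month": dt.month, "day": dt.day, "period": period}

def one_hot(values, categories):
    """One row per value, one column per category; 1 marks the matching category."""
    return [[1 if v == c else 0 for c in categories] for v in values]

def label_encode(values, categories):
    """Replace each category by its integer position, keeping a single column."""
    index = {c: i for i, c in enumerate(categories)}
    return [index[v] for v in values]

parts = split_time("2020-10-08 10:30:00.360000+00:00")
periods = ["morning", "afternoon", "morning"]
categories = ["morning", "afternoon"]
encoded_one_hot = one_hot(periods, categories)
encoded_label = label_encode(periods, categories)
```

One-hot encoding widens the data by K columns for a K-class variable, while label encoding keeps the original dimension, matching the trade-off described above.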
Based on any one of the embodiments, the anomaly detection model is constructed based on a scalable large-scale unsupervised outlier detection framework.
Specifically, the scalable large-scale unsupervised outlier detection framework (Scalable Unsupervised Outlier Detection, SUOD) can accelerate training without sacrificing training and prediction performance, which addresses the problem that many anomaly detection models cannot converge when trained on high-dimensional big data. In the training process of the anomaly detection model, a plurality of supervised and unsupervised anomaly detection algorithms can be fitted to each sample data set in the SUOD framework, so that a plurality of anomaly detection models can be obtained efficiently and at low time cost for performing prediction on the performance index data to be detected.
Further, the SUOD framework may include three modules, respectively, random dimensionality reduction, balanced parallel scheduling, and pseudo-supervised model training analysis.
In terms of random dimensionality reduction: when fitting models to massive data, the computational cost grows with the number of dimensions, usually geometrically, so dimensionality reduction is a very reasonable choice. However, simple dimensionality reduction methods often lack performance guarantees. The biggest problem with linear dimensionality reduction such as PCA is that its result is deterministic, that is, repeated runs of PCA yield almost the same result, which is unfavorable for the subsequent training of multiple models, especially ensemble learning models. Another significant problem is that PCA itself can be used as an anomaly detection method, so some anomalous points are lost during dimensionality reduction.
To address this problem, the dimensionality reduction requirement can be met at the data level by the Johnson-Lindenstrauss (JL) projection. The JL projection is very simple to apply: only a random projection matrix needs to be generated (for example, one drawn entirely from a Gaussian distribution).
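A Gaussian JL projection can be sketched as below. The function name `jl_project`, the 1/√k scaling, and storing samples as rows (unlike the column layout of the PCA steps) are assumptions of this illustration.

```python
import numpy as np

def jl_project(X, k, seed=0):
    """Project m x d data (samples as rows) to m x k with a random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    # Scaling by 1/sqrt(k) keeps squared distances unchanged in expectation.
    R = rng.normal(0.0, 1.0, size=(X.shape[1], k)) / np.sqrt(k)
    return X @ R

rng = np.random.default_rng(42)
X = rng.normal(size=(20, 1000))          # 20 samples in 1000 dimensions
Z = jl_project(X, k=200, seed=0)

# Ratio of one pairwise distance after and before projection.
ratio = np.linalg.norm(Z[0] - Z[1]) / np.linalg.norm(X[0] - X[1])
```

Because the projection matrix changes with the seed, each base model can be trained on a different low-dimensional view of the data, which is the diversity that a deterministic method such as PCA does not provide.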
In terms of balanced system scheduling: training multiple heterogeneous models in parallel can result in large differences in computational overhead. For example, training a decision tree may be much faster than running k-nearest neighbors. Suppose 100 models are to be trained, of which the first 50 are decision trees and the last 50 are k-nearest neighbors models. If the first 50 and the last 50 are simply assigned to 2 clusters or processes, the tasks on cluster 1 will complete much faster than the tasks on cluster 2. This scheduling imbalance makes the overall system inefficient.
To resolve the imbalance of the heterogeneous model in the system scheduling, a regression model may be first trained to predict the computation overhead required by each learner. Then, based on the calculation overhead required by each learner, a scheduling method is set to ensure that the different task loads are as close as possible. Through the setting and running of the scheduling method, the number of tasks on each process in the obtained training process can be different, but the total required time is closer, so that the parallel tasks can be completed at almost the same time.
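The scheduling idea can be illustrated with a simple greedy heuristic that always gives the most expensive remaining task to the least loaded worker. This heuristic is a stand-in chosen for the sketch, not necessarily the scheduling method of the framework, and the cost values are hypothetical numbers standing in for the output of the cost-prediction regression model.

```python
import heapq

def balanced_schedule(costs, n_workers):
    """Greedy assignment: most expensive task first, to the least loaded worker."""
    heap = [(0.0, w) for w in range(n_workers)]      # (current load, worker id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    for task in sorted(range(len(costs)), key=lambda i: costs[i], reverse=True):
        load, worker = heapq.heappop(heap)           # least loaded worker so far
        assignment[worker].append(task)
        heapq.heappush(heap, (load + costs[task], worker))
    return assignment

# Hypothetical predicted costs: 50 cheap models followed by 50 expensive ones.
costs = [1.0] * 50 + [10.0] * 50
assignment = balanced_schedule(costs, n_workers=2)
loads = [sum(costs[t] for t in tasks) for tasks in assignment]
naive_loads = [sum(costs[:50]), sum(costs[50:])]     # first half vs. second half
```

With these costs the naive split gives loads of 50 and 500, while the greedy assignment gives both workers the same total load, so the two processes finish at nearly the same time.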
In terms of model training: unsupervised models, especially non-parametric unsupervised models, tend to be very costly at prediction time. For example, prediction with k-nearest neighbors is time-consuming, and most unsupervised anomaly detection models have a large prediction overhead. In view of this problem, parameterized supervised models can be selected to replace the unsupervised models. That is, following the idea of pseudo-supervised approximation (PSA), unsupervised anomaly detection models are trained on the existing data to obtain their scores on the training data, and supervised learning models are then used to approximate their decision boundaries.
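The pseudo-supervised idea can be sketched with NumPy alone. Here a k-nearest-neighbor distance score plays the role of the costly unsupervised detector, and a least-squares regression on quadratic features plays the role of the cheap parametric approximator. Both model choices, the feature map, and all function names are assumptions of this illustration.

```python
import numpy as np

def knn_scores(X, k=5):
    """Unsupervised outlier score: distance to the k-th nearest neighbor within X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return np.sort(d, axis=1)[:, k]      # column 0 is the point itself (distance 0)

def quad_features(X):
    """Feature map [1, x, x^2] used by the parametric approximator."""
    return np.hstack([np.ones((len(X), 1)), X, X ** 2])

def fit_approximator(X, scores):
    """Fit a regressor that uses the unsupervised scores as pseudo-labels."""
    w, *_ = np.linalg.lstsq(quad_features(X), scores, rcond=None)
    return w

def predict(w, X_new):
    """Cheap prediction: one matrix product, no neighbor search."""
    return quad_features(X_new) @ w

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
w = fit_approximator(X_train, knn_scores(X_train))
approx = predict(w, np.array([[0.0, 0.0], [6.0, 6.0]]))
```

Prediction with the fitted weights no longer needs the training set, which is where the saving in prediction overhead comes from; in this example the far-away point receives the higher approximate score.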
Fig. 3 is a second schematic flow chart of the performance index abnormality detection method provided by the present invention, and as shown in fig. 3, the performance index abnormality detection method includes:
S1, processing the sample data set:
Index data of each key device of the IT system is collected over a period of time as sample index data. On this basis, the statistical definitions of the indexes are unified through the device association relationships and the PCA method, and irrelevant index data is compressed and removed. Time series data of different cycle types is then classified by an unsupervised clustering algorithm, new variables are generated by first-order differencing together with one-hot codes of the time and date, and a plurality of processed sample data sets is finally obtained.
S2, training an abnormality detection model:
Unsupervised or pseudo-supervised anomaly detection algorithms are fitted in the SUOD framework to the plurality of sample data sets acquired in step S1, so that an anomaly detection model including a plurality of anomaly detection algorithms is obtained.
S3, abnormality detection:
After the performance index data to be detected is obtained, it may be input to the abnormality detection model obtained in step S2, so as to obtain the abnormality detection result output by the model. On this basis, the abnormality detection result can be displayed on a web page or app, where operation and maintenance personnel can conveniently query and use it, which improves the intelligence and efficiency of abnormality detection for an IT system with a cloud native architecture.
Based on any of the above embodiments, fig. 4 is a second schematic flow chart of the sample data set determining method provided by the present invention, as shown in fig. 4, S1 includes:
S1.1, collecting index data of each key device of the IT system in a period of time as sample index data.
S1.2, performing dimensionality reduction on the sample index data by adopting PCA (principal component analysis) or PCA improved based on SVD (singular value decomposition).
S1.3, carrying out unsupervised clustering on all sample index data subjected to dimensionality reduction to obtain a plurality of data clusters.
S1.4, arranging the sample index data in each data cluster according to a time sequence, and further performing data differential calculation on the basis of the time sequence, thereby realizing the abnormal marking of each sample index data.
S1.5, performing one-hot coding and/or label coding on the time data of each sample index data.
S1.6, combining the abnormal marks and the time codes of the sample index data in the data clusters to construct sample data sets corresponding to the data clusters respectively.
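Steps S1.4 to S1.6 can be illustrated with the following sketch, which builds feature rows from a value, its first-order difference and a one-hot code of the weekday. The helper names, the choice of weekday as the encoded time field and the hourly readings are assumptions for illustration; the embodiment does not fix these details, and the abnormal marks are omitted here.

```python
from datetime import datetime

def first_difference(values):
    """First-order difference; the first element has no predecessor, so use 0.0."""
    return [0.0] + [b - a for a, b in zip(values, values[1:])]

def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def build_rows(timestamps, values):
    """Combine value, first difference and one-hot weekday into feature rows."""
    diffs = first_difference(values)
    rows = []
    for ts, v, d in zip(timestamps, values, diffs):
        rows.append([v, d] + one_hot(ts.weekday(), 7))
    return rows

# Hypothetical hourly readings of one metric in one data cluster.
stamps = [datetime(2021, 2, 22, h) for h in range(4)]  # 2021-02-22 is a Monday
usage = [40.0, 42.0, 41.0, 75.0]
rows = build_rows(stamps, usage)
```

Label coding would replace the seven-element one-hot vector with the single integer returned by `ts.weekday()`; which of the two is preferable depends on the downstream detector.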
The embodiment of the invention provides a new approach to data processing and model framework analysis for high-dimensional multi-index data of an IT system. Dimensionality reduction with the PCA algorithm achieves more efficient and effective data dimension reduction and noise reduction, and unsupervised clustering partitions the data better. Data labeling is carried out through differential processing based on the time sequence, which, compared with simple fusion of the characteristic variables, reflects the changing trend of the data. The sample data sets thus obtained are better prepared for model fitting.
In addition, the embodiment of the invention uses the SUOD model integration framework to establish a plurality of models for the classified data. Because the multi-model abnormality detection scheme divides the data set into different types, it essentially exploits the periodicity and trend of the time series. The SUOD framework not only trains a plurality of models but also increases the model training speed without reducing the detection effect, through data compression, balanced scheduling and other measures. Finally, by inputting the relevant detection data into the model set, the abnormal time points and the corresponding index values in the IT system can be detected. Operation and maintenance personnel can therefore find and verify abnormal points in the IT system more conveniently, and accurately locate the related abnormal points and root cause indexes.
Based on any of the above embodiments, fig. 5 is a schematic structural diagram of a performance index abnormality detection apparatus provided by the present invention, as shown in fig. 5, the apparatus includes:
a data obtaining unit 510, configured to determine performance index data to be detected in an IT system;
an anomaly detection unit 520, configured to input the performance index data into an anomaly detection model, and obtain an anomaly detection result output by the anomaly detection model;
the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
The device provided by the embodiment of the invention realizes automatic labeling of massive sample index data through unsupervised clustering and a time sequence-based differential processing mode, greatly reduces the threshold for realizing anomaly detection, and is beneficial to improving the accuracy and robustness of the anomaly detection of performance index data.
Based on any of the above embodiments, the apparatus further comprises a sample determination unit configured to:
carrying out unsupervised clustering on all sample index data to obtain a plurality of data clusters;
arranging the index data of each sample in any data cluster according to a time sequence to obtain a time sequence of each index in any data cluster;
calculating mutation indexes of the time sequences of all indexes in any data cluster at all reference time points, and marking the abnormal labels of all sample index data in any data cluster based on the mutation indexes, wherein the reference time points are randomly determined;
and determining a sample data set corresponding to any data cluster based on each sample index data and the abnormal label thereof in any data cluster.
Based on any of the above embodiments, the sample determination unit is configured to:
dividing the time sequence into a front subsequence and a rear subsequence based on a reference time point;
and determining the mutation index of the reference time point based on the mean value and the standard deviation of the front subsequence and the rear subsequence.
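The passage above states only that the mutation index is determined from the mean and standard deviation of the two subsequences. The sketch below therefore assumes one plausible form, the absolute shift in mean across the reference time point divided by the sum of the two standard deviations; this formula, the function name and the sample series are assumptions and are not taken from the embodiment.

```python
import statistics

def mutation_index(series, t, eps=1e-9):
    """Assumed form: absolute mean shift across t, scaled by the two spreads."""
    front, rear = series[:t], series[t:]
    if len(front) < 2 or len(rear) < 2:
        raise ValueError("each subsequence needs at least two points")
    shift = abs(statistics.mean(rear) - statistics.mean(front))
    spread = statistics.pstdev(front) + statistics.pstdev(rear)
    # eps avoids division by zero when both subsequences are constant.
    return shift / (spread + eps)

# Hypothetical metric with a level shift at position 5.
series = [10.0, 10.2, 9.8, 10.1, 9.9, 20.0, 20.3, 19.7, 20.1, 19.9]
at_shift = mutation_index(series, 5)
elsewhere = mutation_index(series, 2)
```

Under this assumed definition the index is largest at the reference time point where the level actually changes, so samples near such a point could be given an abnormal label by comparing the index with a threshold.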
Based on any of the above embodiments, the sample determination unit is further configured to:
and performing dimensionality reduction on all sample index data based on a principal component analysis algorithm.
Based on any of the above embodiments, the sample determination unit is further configured to:
and performing dimensionality reduction on all sample index data by combining a singular value decomposition algorithm with the principal component analysis algorithm.
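For readers unfamiliar with principal component analysis, the following sketch extracts the first principal component of a small data set. To stay within the standard library it uses power iteration on the covariance matrix rather than singular value decomposition, so it is a substitute technique and not the SVD-based variant named above; the function name and the two-metric sample data are assumptions.

```python
import math

def top_principal_component(rows, iterations=200):
    """Return (unit direction, projected 1-D values) for the first component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero covariance: keep the current direction
        v = [x / norm for x in w]
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected

# Hypothetical two-metric data lying close to the line y = 2x.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
direction, reduced = top_principal_component(data)
```

Here the two correlated metrics are compressed into a single coordinate along the direction of greatest variance, which is the effect the dimensionality reduction step relies on.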
Based on any of the above embodiments, the sample determination unit is further configured to:
and performing one-hot coding and/or label coding on the time data of each sample index data in the sample data set.
Based on any one of the above embodiments, the anomaly detection model is constructed based on a scalable large-scale unsupervised outlier detection (SUOD) framework.
Fig. 6 illustrates a physical structure diagram of an electronic device, which may include, as shown in fig. 6: a processor 610, a communication interface 620, a memory 630 and a communication bus 640, wherein the processor 610, the communication interface 620 and the memory 630 communicate with each other via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform a performance index abnormality detection method comprising: determining performance index data to be detected in an IT system; inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model; the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
In addition, the logic instructions in the memory 630 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the performance index abnormality detection method provided above, the method comprising: determining performance index data to be detected in an IT system; inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model; the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the performance index abnormality detection method provided above, the method comprising: determining performance index data to be detected in an IT system; inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model; the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A performance index abnormality detection method is characterized by comprising the following steps:
determining performance index data to be detected in an IT system;
inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model;
the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
2. The method according to claim 1, wherein the method for determining the plurality of sample data sets comprises:
carrying out unsupervised clustering on all sample index data to obtain a plurality of data clusters;
arranging the index data of each sample in any data cluster according to a time sequence to obtain a time sequence of each index in any data cluster;
calculating mutation indexes of the time sequences of all indexes in any data cluster at all reference time points, and marking the abnormal labels of all sample index data in any data cluster based on the mutation indexes, wherein the reference time points are randomly determined;
and determining a sample data set corresponding to any data cluster based on each sample index data and the abnormal label thereof in any data cluster.
3. The method according to claim 2, wherein the calculating the mutation index of the time series of each index in any data cluster at each reference time point comprises:
dividing the time sequence into a front subsequence and a rear subsequence based on a reference time point;
and determining the mutation index of the reference time point based on the mean value and the standard deviation of the front subsequence and the rear subsequence.
4. The method of claim 2, wherein, before the unsupervised clustering of all sample index data to obtain a plurality of data clusters, the method further comprises:
and performing dimensionality reduction on all sample index data based on a principal component analysis algorithm.
5. The method according to claim 4, wherein the performing a dimensionality reduction process on all sample index data based on a principal component analysis algorithm comprises:
and performing dimensionality reduction on all sample index data by combining a singular value decomposition algorithm with the principal component analysis algorithm.
6. The method according to claim 2, wherein the determining a sample data set corresponding to the any data cluster based on each sample index data in the any data cluster and its abnormal label further comprises:
and performing one-hot coding and/or label coding on the time data of each sample index data in the sample data set.
7. The method of any of claims 1 to 6, wherein the anomaly detection model is constructed based on a scalable large-scale unsupervised outlier detection framework.
8. A performance index abnormality detection device, characterized by comprising:
the data acquisition unit is used for determining performance index data to be detected in the IT system;
the abnormality detection unit is used for inputting the performance index data into an abnormality detection model to obtain an abnormality detection result output by the abnormality detection model;
the anomaly detection model is obtained by training based on a plurality of sample data sets, each sample data set comprises a plurality of sample index data and an anomaly tag thereof, the sample data sets are obtained by carrying out unsupervised clustering on all sample index data, and the anomaly tag is obtained by carrying out differential processing based on a time sequence on each sample index data in the sample data set.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the performance indicator abnormality detection method according to any one of claims 1 to 7 when executing the program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, performs the steps of the performance indicator abnormality detection method according to any one of claims 1 to 7.
CN202110200046.8A 2021-02-22 2021-02-22 Performance index abnormality detection method and device, electronic equipment and storage medium Pending CN115034278A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110200046.8A CN115034278A (en) 2021-02-22 2021-02-22 Performance index abnormality detection method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110200046.8A CN115034278A (en) 2021-02-22 2021-02-22 Performance index abnormality detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115034278A true CN115034278A (en) 2022-09-09

Family

ID=83118215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110200046.8A Pending CN115034278A (en) 2021-02-22 2021-02-22 Performance index abnormality detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115034278A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115511106A (en) * 2022-11-15 2022-12-23 阿里云计算有限公司 Method, device and readable storage medium for generating training data based on time sequence data
CN117827524A (en) * 2024-03-06 2024-04-05 建信金融科技有限责任公司 System operation and maintenance method and device
CN117932309A (en) * 2024-01-31 2024-04-26 哈尔滨工业大学 KPI data dimension reduction method based on inter-measurement and time dimension selection, electronic equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115511106A (en) * 2022-11-15 2022-12-23 阿里云计算有限公司 Method, device and readable storage medium for generating training data based on time sequence data
CN115511106B (en) * 2022-11-15 2023-04-07 阿里云计算有限公司 Method, device and readable storage medium for generating training data based on time sequence data
CN117932309A (en) * 2024-01-31 2024-04-26 哈尔滨工业大学 KPI data dimension reduction method based on inter-measurement and time dimension selection, electronic equipment and storage medium
CN117827524A (en) * 2024-03-06 2024-04-05 建信金融科技有限责任公司 System operation and maintenance method and device

Similar Documents

Publication Publication Date Title
CN115034278A (en) Performance index abnormality detection method and device, electronic equipment and storage medium
CN112597062B (en) Military software structured quality data extraction method and device and software testing device
CN115564071A (en) Method and system for generating data labels of power Internet of things equipment
CN114861788A (en) Load abnormity detection method and system based on DBSCAN clustering
CN111737099B (en) Data center anomaly detection method and device based on Gaussian distribution
Zhu et al. Monitoring big process data of industrial plants with multiple operating modes based on Hadoop
CN111984442A (en) Method and device for detecting abnormality of computer cluster system, and storage medium
CN111224805A (en) Network fault root cause detection method, system and storage medium
CN116795977A (en) Data processing method, apparatus, device and computer readable storage medium
CN117851490A (en) Data analysis processing system based on big data
CN116451081A (en) Data drift detection method, device, terminal and storage medium
CN111400122B (en) Hard disk health degree assessment method and device
CN118113999A (en) Data analysis method, device, equipment and computer readable storage medium
CN116340845A (en) Label generation method and device, storage medium and electronic equipment
CN109978038B (en) Cluster abnormity judgment method and device
CN115545104A (en) KPI (Key Performance indicator) anomaly detection method, system and medium based on functional data analysis
CN115796704A (en) Goods and materials sampling inspection method and device based on LightGBM index model
CN115686995A (en) Data monitoring processing method and device
CN111221704B (en) Method and system for determining running state of office management application system
CN113705920A (en) Generation method of water data sample set for thermal power plant and terminal equipment
Mulla et al. The Use of Clustering and Classification Methods in Machine Learning and Comparison of Some Algorithms of the Methods
Zhu et al. Research of system fault diagnosis method based on imbalanced data
CN118296311B (en) Interpolation method and device for hydrologic quality of water missing data and electronic equipment
CN118691321B (en) Online city automobile substation data management and control platform based on edge calculation
CN113723835B (en) Water consumption evaluation method and terminal equipment for thermal power plant

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination