CN110730087A - Method and device for processing alarm storm - Google Patents
- Publication number
- CN110730087A CN201810779197.1A
- Authority
- CN
- China
- Prior art keywords
- alarm
- network equipment
- storm
- channel
- state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0604—Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The embodiment of the invention discloses a method and a device for processing an alarm storm. Wherein the method comprises the following steps: receiving alarm information reported by each network device through respective alarm reporting channels; the alarm reporting channels are pre-established and correspond to the network equipment one by one; and if the alarm information reported by the network equipment is judged to meet the alarm storm judgment rule, closing an alarm reporting channel of the network equipment. The device is used for executing the method. The method and the device for processing the alarm storm can reduce the occupation of system resources when the alarm storm occurs.
Description
Technical Field
The embodiment of the invention relates to the technical field of Internet, in particular to a method and a device for processing an alarm storm.
Background
In a network device management system, fault management is an important part, and the main purpose of fault management is to monitor the operating conditions of each network device in the system, collect the status information or fault information of each network device, and perform corresponding processing on the information.
When a software or hardware resource of a system becomes abnormal because of an equipment failure or some uncontrollable cause, a large amount of alarm information may be generated in a short time. The reporting of such a large amount of alarm information in a short time is called an alarm storm. At present, the following strategies are mainly used for handling an alarm storm: (1) the alarm storm is not processed and is reported directly; (2) an alarm blacklist is used to discard, filter, or merge alarms that match the blacklist; (3) filtering by counter: identical alarm information is accumulated in a counter that is reset periodically, and alarm information whose count exceeds a threshold within a given time period is displayed or filtered; (4) filtering by alarm characteristics: the characteristics of the alarm information are used to retain primary alarms and to discard or cache secondary alarms. These strategies have the following problems. For mode (1), system memory usage rises rapidly, messages pile up in the message queue, CPU resources are consumed, and in severe cases the system crashes. For mode (2), alarms must be added to a blacklist database, and the blacklist keeps growing with use, so the comparison time increases rapidly and the efficiency of alarm storm processing drops. For mode (3), the threshold is difficult to set, and different thresholds produce widely different filtering results, so reliability is poor. For mode (4), a complex algorithm is needed to compute the correlated alarms, which occupies a large amount of system resources.
Therefore, how to provide an alarm storm processing method, which can reduce the occupation of system resources when an alarm storm occurs, is an important issue to be solved in the industry.
Disclosure of Invention
Aiming at the defects in the prior art, the embodiment of the invention provides a method and a device for processing an alarm storm.
In one aspect, an embodiment of the present invention provides a method for processing an alarm storm, including:
receiving alarm information reported by each network device through respective alarm reporting channels; the alarm reporting channels are pre-established and correspond to the network equipment one by one;
and if the alarm information reported by the network equipment is judged to meet the alarm storm judgment rule, closing an alarm reporting channel of the network equipment.
In another aspect, an embodiment of the present invention provides an apparatus for processing an alarm storm, including:
the receiving unit is used for receiving the alarm information reported by each network device through the respective alarm reporting channel; the alarm reporting channels are pre-established and correspond to the network equipment one by one;
and the closing unit is used for closing the alarm reporting channel of the network equipment after judging that the alarm information reported by the network equipment meets the alarm storm judgment rule.
In another aspect, an embodiment of the present invention provides an electronic device, including: a processor, a memory, and a communication bus, wherein:
the processor and the memory are communicated with each other through the communication bus;
the memory stores program instructions executable by the processor, and the processor calls the program instructions to perform the method for processing an alarm storm provided by the above embodiments.
In yet another aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method for handling an alarm storm as provided in the above embodiments.
According to the method and the device for processing the alarm storm provided by the embodiment of the invention, the alarm information reported by each network device through the respective alarm reporting channel can be received, and the alarm reporting channel of the network device is closed after the alarm information reported by the network device is judged to meet the alarm storm judgment rule, so that the occupation of system resources can be reduced when the alarm storm occurs.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart illustrating a method for handling an alarm storm according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an alarm management system according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a method for handling an alarm storm according to another embodiment of the present invention;
FIG. 4 is a flowchart illustrating a method for handling an alarm storm according to another embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for handling an alarm storm according to still another embodiment of the present invention;
FIG. 6 is a flowchart illustrating steps for establishing a status query channel according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating steps for establishing an alarm reporting channel according to an embodiment of the present invention;
FIG. 8 is a signaling interaction diagram of the steps of establishing a status query channel and an alarm report channel in the embodiment of the present invention;
FIG. 9 is a signaling interaction diagram of a method for handling an alarm storm according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an alarm storm processing device according to an embodiment of the present invention;
FIG. 11 is a schematic physical structure diagram of an electronic device according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without any creative effort belong to the protection scope of the embodiments of the present invention.
Fig. 1 is a flowchart illustrating a method for processing an alarm storm according to an embodiment of the present invention, and as shown in fig. 1, the method for processing an alarm storm according to the embodiment of the present invention includes:
S101, receiving alarm information reported by each network device through respective alarm reporting channels; the alarm reporting channels are pre-established and correspond to the network equipment one by one;
Specifically, an alarm reporting channel may be established between each network device and the alarm storm processing apparatus (hereinafter, the processing apparatus). Each network device includes at least one monitoring object. If a monitoring object fails, an alarm is generated; the network device where the monitoring object is located generates alarm information according to the alarm and reports it to the processing apparatus through the alarm reporting channel, and the processing apparatus receives the alarm information. The alarm reporting channels are pre-established, correspond to the network devices one to one, and can be implemented through a UDP connection or a TCP connection. It is understood that the monitoring object may be software installed on the network device or a piece of hardware of the network device.
S102, if the alarm information reported by the network equipment is judged to meet the alarm storm judgment rule, closing an alarm reporting channel of the network equipment.
Specifically, after receiving the alarm information reported by the network device, the processing device determines whether the alarm information meets the alarm storm determination rule. If it does, the processing device determines that the alarm information reported by the network device constitutes an alarm storm and closes the alarm reporting channel of that network device, which avoids heavy consumption of system resources and the risk of a system crash. For example, if the alarm reporting channel is a TCP connection, the processing device may close the socket corresponding to the alarm reporting channel by calling a close function, so that it no longer receives alarm information from the network device.
For example, fig. 2 is a schematic structural diagram of an alarm management system according to an embodiment of the present invention. As shown in fig. 2, the alarm management system includes a processing apparatus for an alarm storm and n network devices, and an alarm reporting channel is established between each of the n network devices and the processing apparatus. When a monitoring object a of the network device 2 fails, the network device 2 generates alarm information b and reports it to the processing device through the alarm reporting channel of the network device 2. The processing device receives the alarm information b and judges whether it meets the alarm storm judgment rule; if the alarm information b meets the alarm storm judgment rule, the processing device closes the alarm reporting channel of the network device 2.
According to the method and the device for processing the alarm storm provided by the embodiment of the invention, the alarm information reported by each network device through the respective alarm reporting channel can be received, and the alarm reporting channel of the network device is closed after the alarm information reported by the network device is judged to meet the alarm storm judgment rule, so that the occupation of system resources can be reduced when the alarm storm occurs.
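Steps S101 and S102 can be illustrated with a minimal Python sketch. It assumes one socket per network device and a caller-supplied storm check; the names `AlarmChannels`, `handle_alarm`, and `is_storm` are hypothetical and do not come from the embodiment.

```python
import socket


class AlarmChannels:
    """Tracks one alarm reporting socket per network device (hypothetical helper)."""

    def __init__(self):
        self._channels = {}  # device_id -> socket

    def register(self, device_id, sock):
        self._channels[device_id] = sock

    def is_open(self, device_id):
        return device_id in self._channels

    def close_channel(self, device_id):
        """Close the device's reporting socket so no further alarms are read from it."""
        sock = self._channels.pop(device_id, None)
        if sock is not None:
            sock.close()


def handle_alarm(channels, device_id, alarm, is_storm):
    """S101/S102: accept an alarm, or close the device's channel if the rule is met."""
    if is_storm(device_id, alarm):
        channels.close_channel(device_id)
        return "closed"
    return "accepted"
```

Because each device has its own channel, closing one socket affects only the device that caused the storm; alarms from the other devices continue to arrive on their own channels.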
On the basis of the foregoing embodiments, further, the alarm storm determination rule includes:
and the number of the same alarm information reported by the network equipment in a preset time period is larger than a threshold value.
Specifically, the processing device counts the number of the same alarm information reported by the network device within a preset time period, then compares the number of the alarm information with a threshold, and if the number of the alarm information is greater than the threshold, the alarm information meets the alarm storm determination rule, which indicates that the alarm information reported by the network device forms the alarm storm. The preset time period is set according to actual experience, and the embodiment of the invention is not limited.
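One possible reading of this rule is sketched below in Python. The embodiment does not say whether the preset time period is a fixed or a sliding window; the sketch assumes a sliding window, and the class name `StormDetector` and its parameters are hypothetical.

```python
from collections import defaultdict, deque


class StormDetector:
    """Counts identical alarms per device inside a sliding time window."""

    def __init__(self, window_seconds, threshold):
        self.window = window_seconds
        self.threshold = threshold
        self._seen = defaultdict(deque)  # (device_id, alarm_key) -> timestamps

    def is_storm(self, device_id, alarm_key, now):
        """Record one alarm at time `now`; return True if the rule is met."""
        times = self._seen[(device_id, alarm_key)]
        times.append(now)
        # Drop timestamps that have fallen out of the preset time period.
        while times and now - times[0] >= self.window:
            times.popleft()
        return len(times) > self.threshold
```

With a window of 10 seconds and a threshold of 3, the fourth identical alarm from the same device within 10 seconds satisfies the rule, while a different alarm from the same device is counted separately.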
Fig. 3 is a flowchart illustrating a method for processing an alarm storm according to another embodiment of the present invention, and as shown in fig. 3, the method for processing an alarm storm according to the embodiment of the present invention further includes:
S103, periodically inquiring through a state inquiry channel of the network equipment to obtain the state of the monitoring object corresponding to the alarm information meeting the alarm storm judgment rule; the state query channels are pre-established and correspond to the network equipment one by one;
Specifically, after the alarm reporting channel of the network device is closed, the processing device may periodically send a query message to the network device through the state query channel of the network device, to query the state of the monitored object corresponding to the alarm information that meets the alarm storm determination rule. The network device obtains the state of that monitored object and returns it to the processing device through the state query channel, and the processing device receives the state. The state query channel is pre-established, corresponds to the network devices one to one, can be implemented through a UDP connection or a TCP connection, and is independent of the alarm reporting channel. The query period is set according to actual experience and is not limited by the embodiment of the invention. Understandably, when the monitored object has a fault, an alarm is generated and the state of the monitored object is abnormal; when the monitored object has no fault, its state is normal.
And S104, if the state of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule is judged to meet the preset condition, reestablishing an alarm reporting channel of the network equipment and stopping inquiring the network equipment.
Specifically, the processing device obtains the state of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule, judges whether the state of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule meets a preset condition, and if the state of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule meets the preset condition, reestablishes an alarm reporting channel between the network device and the processing device, and stops the periodic query of the network device.
The method for processing the alarm storm provided by the embodiment of the invention can receive the alarm information reported by each network device through the respective alarm reporting channel, and can close the alarm reporting channel of the network device after judging that the alarm information reported by the network device meets the alarm storm judgment rule, so that the occupation of system resources can be reduced when the alarm storm occurs. Further, by periodically querying the status of the network device causing the alarm storm through the status query channel, the monitoring of the network device can be resumed after the failure of the network device causing the alarm storm is resolved.
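The polling loop of S103 and S104 might look like the following Python sketch. The callables `query_state`, `condition_met`, and `reopen_channel` stand in for the state query channel, the preset condition, and channel re-establishment; these names, the default period, and the `max_polls` safety limit are assumptions made for illustration.

```python
import time


def poll_until_recovered(query_state, condition_met, reopen_channel,
                         period_seconds=5.0, sleep=time.sleep, max_polls=None):
    """Query the faulty object periodically; reopen the alarm channel once it recovers.

    Returns the number of queries made, or None if max_polls was exhausted.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        state = query_state()          # S103: one query over the state query channel
        polls += 1
        if condition_met(state):       # S104: the preset condition is satisfied
            reopen_channel()           # re-establish the alarm reporting channel
            return polls               # returning ends the periodic query
        sleep(period_seconds)
    return None
```

Passing `sleep` as a parameter keeps the loop testable without real delays; a caller would normally leave it at `time.sleep`.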
Fig. 4 is a flowchart illustrating a method for processing an alarm storm according to another embodiment of the present invention, and as shown in fig. 4, the method for processing an alarm storm according to the embodiment of the present invention further includes:
S105, periodically inquiring through a state inquiry channel of the network equipment to obtain the state of the monitoring object corresponding to the alarm information meeting the alarm storm judgment rule; the state query channels are pre-established and correspond to the network equipment one by one;
Specifically, the specific implementation process of this step is similar to step S103, and is not described here again.
S106, if the state of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule is judged to meet the preset condition, inquiring through the state inquiry channel to obtain the states of other monitored objects of the network equipment;
Specifically, after obtaining the state of the monitored object corresponding to the alarm information meeting the alarm storm determination rule, the processing device determines whether that state satisfies a preset condition. If it does, the processing device sends a query message to the network device through the state query channel of the network device to query the states of the other monitoring objects of the network device. The network device obtains the states of the other monitoring objects and returns them to the processing device through the state query channel, and the processing device receives the states of the other monitoring objects. The other monitoring objects are the monitoring objects of the network equipment other than the monitoring object corresponding to the alarm information meeting the alarm storm judgment rule.
And S107, if the states of other monitored objects of the network equipment are judged and known to be normal, reestablishing an alarm reporting channel of the network equipment and stopping the periodic query of the network equipment.
Specifically, after obtaining the states of other monitoring objects of the network device, the processing device determines whether the states of the other monitoring objects are normal or abnormal, and if the states of the other monitoring objects are normal, the processing device reestablishes an alarm reporting channel between the network device and the processing device, and stops the periodic query of the network device.
The method for processing the alarm storm provided by the embodiment of the invention can receive the alarm information reported by each network device through the respective alarm reporting channel, and can close the alarm reporting channel of the network device after judging that the alarm information reported by the network device meets the alarm storm judgment rule, so that the occupation of system resources can be reduced when the alarm storm occurs. Further, the state query channel is used for periodically querying the state of the network device causing the alarm storm, so that the states of other monitoring objects of the network device can be queried and obtained after the fault of the network device causing the alarm storm is relieved, and the alarm information of other monitoring objects is prevented from being lost.
Fig. 5 is a flowchart illustrating a method for processing an alarm storm according to still another embodiment of the present invention, and as shown in fig. 5, the method for processing an alarm storm according to the embodiment of the present invention further includes:
S108, if it is determined that at least one of the states of the other monitoring objects of the network equipment is abnormal, periodically querying through the state query channel of the network equipment to obtain the states of all monitoring objects of the network equipment;
Specifically, after obtaining the states of the other monitoring objects of the network device, the processing device determines whether each of those states is normal or abnormal. If at least one of the states of the other monitoring objects is abnormal, the processing device sends a query message to the network device through the state query channel of the network device to query the states of all monitoring objects of the network device. The network device obtains the states of all monitoring objects and returns them to the processing device through the state query channel, and the processing device receives the states of all monitoring objects.
And S109, if the states of all the monitored objects of the network equipment are judged to be normal, reestablishing an alarm reporting channel of the network equipment and stopping the periodic query of the network equipment.
Specifically, after obtaining the states of all the monitored objects of the network device, the processing device determines whether the state of each monitored object of the network device is normal or abnormal, and if the states of all the monitored objects are normal, the processing device reestablishes an alarm reporting channel between the network device and the processing device, and stops the periodic query of the network device.
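The recovery flow of steps S105 to S109 can be summarized as a small state machine. The Python sketch below is one way to write it; the `Phase` names, the `"normal"` state string, and the `step` function are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Phase(Enum):
    WATCH_FAULTY = 1   # S105: poll the object that caused the storm
    CHECK_OTHERS = 2   # S106: query the remaining objects of the device
    WATCH_ALL = 3      # S108: poll every object of the device
    RECOVERED = 4      # S107/S109: channel re-established, polling stopped


def all_normal(states):
    """True if every monitored object in the reply reports a normal state."""
    return all(state == "normal" for state in states.values())


def step(phase, preset_condition_met=False, states=None):
    """Return the next phase given the result of the latest query."""
    if phase is Phase.WATCH_FAULTY:
        return Phase.CHECK_OTHERS if preset_condition_met else Phase.WATCH_FAULTY
    if phase in (Phase.CHECK_OTHERS, Phase.WATCH_ALL):
        return Phase.RECOVERED if all_normal(states or {}) else Phase.WATCH_ALL
    return Phase.RECOVERED
```

In this sketch the alarm reporting channel would be re-established on entering `RECOVERED`, which is reached either directly from `CHECK_OTHERS` (S107) or after polling all objects in `WATCH_ALL` (S109).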
On the basis of the foregoing embodiments, further, the preset conditions include:
the states of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule, as obtained by a preset number of consecutive queries, are all normal.
Specifically, the processing device queries the state of the monitored object corresponding to the alarm information meeting the alarm storm determination rule a preset number of consecutive times, and determines whether the state obtained by each query is normal or abnormal. If the states obtained by all of the preset number of consecutive queries are normal, the state of the monitored object meets the preset condition. The preset number of times is set according to actual experience, and the embodiment of the invention is not limited.
For example, the preset number of times is three, the processing device queries for three consecutive times to obtain the state of the monitored object corresponding to the alarm information meeting the alarm storm determination rule, and determines that the state of the monitored object corresponding to the alarm information meeting the alarm storm determination rule obtained by the querying for three consecutive times is normal, so that the state of the monitored object corresponding to the alarm information meeting the alarm storm determination rule meets the preset condition.
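The preset condition amounts to a streak counter that resets on any abnormal reply. A minimal Python sketch, assuming states are reported as the strings `"normal"` and `"abnormal"` (the class name `ConsecutiveNormal` is hypothetical):

```python
class ConsecutiveNormal:
    """Tracks whether the last `required` query results were all normal."""

    def __init__(self, required):
        self.required = required
        self.streak = 0

    def observe(self, state):
        """Record one query result; return True once the preset condition holds."""
        if state == "normal":
            self.streak += 1
        else:
            self.streak = 0  # any abnormal result restarts the count
        return self.streak >= self.required
```

Requiring several consecutive normal results, rather than a single one, guards against reopening the channel while the monitored object is still flapping between states.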
Fig. 6 is a schematic flow chart of the step of establishing the status query channel according to the embodiment of the present invention, and as shown in fig. 6, the step of establishing the status query channel includes:
S601, receiving an alarm source registration message of the network equipment, wherein the alarm source registration message comprises an IP address of the network equipment and a port number of a state query channel;
Specifically, the network device sends an alarm source registration message to the processing apparatus, where the alarm source registration message includes an IP address of the network device and a port number of a status query channel, and the processing apparatus receives the alarm source registration message. Wherein the port number of the status query channel is preset.
S602, establishing a state query channel of the network equipment according to the IP address of the network equipment and the port number of the state query channel.
Specifically, after receiving the alarm source registration message, the processing device establishes a network connection with the network device according to the IP address of the network device and the port number of the status query channel, thereby establishing the status query channel of the network device.
Fig. 7 is a schematic flow chart of a step of establishing an alarm reporting channel according to an embodiment of the present invention, and as shown in fig. 7, the step of establishing the alarm reporting channel includes:
S701, receiving an alarm source registration message of the network equipment, wherein the alarm source registration message comprises an IP address of the network equipment and a port number of an alarm reporting channel;
Specifically, the network device sends an alarm source registration message to the processing apparatus, where the alarm source registration message includes an IP address of the network device and a port number of an alarm reporting channel, and the processing apparatus receives the alarm source registration message. Wherein, the port number of the alarm reporting channel is preset. It can be understood that the port number of the alarm reporting channel is different from the port number of the status query channel.
S702, establishing an alarm reporting channel of the network equipment according to the IP address of the network equipment and the port number of the alarm reporting channel.
Specifically, after receiving the alarm source registration message, the processing device establishes a network connection with the network device according to the IP address of the network device and the port number of the alarm reporting channel, thereby establishing the alarm reporting channel of the network device.
On the basis of the foregoing embodiments, further, the alarm information includes an identifier of the network device and an identifier of a monitoring object; correspondingly, the method for processing the alarm storm provided by the embodiment of the invention further comprises the following steps:
and if the alarm information reported by the network equipment meets the alarm storm judgment rule, recording the identifier of the network equipment, the identifier of the monitored object and the alarm storm state of the network equipment, and sending the identifier of the network equipment, the identifier of the monitored object and the corresponding alarm storm state to a network management client.
Specifically, the alarm information includes an identifier of the network device and an identifier of a monitoring object, where the identifier of the monitoring object corresponds to the monitoring object that has a fault. After receiving the alarm information reported by the network device, the processing device determines whether the alarm information meets the alarm storm determination rule. If it does, the processing device records the identifier of the network device, the identifier of the monitored object, and the alarm storm state of the network device; for example, 1 indicates that the network device is in an alarm storm, and 0 indicates that the network device is not in an alarm storm. The processing device can send the identifier of the network equipment, the identifier of the monitored object and the corresponding alarm storm state to a network management client, so that operation and maintenance personnel can see which network equipment caused the alarm storm and, from the identifier of the monitored object, which monitored object has failed, and can then carry out subsequent maintenance.
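The recorded alarm storm information could be represented as follows. This Python sketch assumes that 1 marks a device in an alarm storm and 0 marks one that is not, and that the record is sent to the network management client as JSON; the field names and the JSON encoding are assumptions, since the embodiment does not specify a message format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AlarmStormRecord:
    device_id: str    # identifier of the network device
    object_id: str    # identifier of the faulty monitored object
    storm_state: int  # assumed encoding: 1 = alarm storm, 0 = no alarm storm


def to_client_message(record):
    """Serialize a record for the network management client (format is an assumption)."""
    return json.dumps(asdict(record), sort_keys=True)
```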
Fig. 8 is a signaling interaction diagram of the steps of establishing a status query channel and an alarm report channel in the embodiment of the present invention, and as shown in fig. 8, the steps of establishing a status query channel and an alarm report channel in the embodiment of the present invention are as follows:
(1) the network equipment sends an alarm source registration message carrying an IP address of the network equipment, a port number of a state inquiry channel and a port number of an alarm reporting channel to the processing device; wherein, the port number of the state inquiry channel is different from the port number of the alarm reporting channel;
(2) the processing device obtains the IP address of the network equipment and the port number of the state query channel from the alarm source registration message, and sends state query channel establishment information to the network equipment according to the IP address of the network equipment and the port number of the state query channel;
(3) after receiving the state query channel establishment information, the network equipment returns a state query channel establishment success message, and when the processing device receives the state query channel establishment success message, the state query channel between the processing device and the network equipment is successfully established;
(4) the processing device obtains the IP address of the network equipment and the port number of the alarm reporting channel from the alarm source registration message, and sends alarm reporting channel establishment information to the network equipment according to the IP address of the network equipment and the port number of the alarm reporting channel;
(5) after receiving the alarm reporting channel establishment information, the network equipment returns an alarm reporting channel establishment success message, and when the processing device receives the alarm reporting channel establishment success message, the alarm reporting channel between the processing device and the network equipment is successfully established; it can be understood that the alarm reporting channel and the state query channel may be established in either order.
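The registration and channel establishment steps above can be sketched in Python as follows. The sketch assumes both channels are TCP connections opened by the processing device; `AlarmSourceRegistration`, `tcp_connect`, and `establish_channels` are hypothetical names, and the 5-second timeout is an arbitrary choice.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmSourceRegistration:
    ip: str          # IP address of the network device
    query_port: int  # preset port of the state query channel
    alarm_port: int  # preset port of the alarm reporting channel

    def __post_init__(self):
        # The two channels must use different port numbers.
        if self.query_port == self.alarm_port:
            raise ValueError("state query and alarm reporting ports must differ")


def tcp_connect(ip, port):
    """Open a TCP connection to the device (assumed transport)."""
    return socket.create_connection((ip, port), timeout=5.0)


def establish_channels(reg, connect=tcp_connect):
    """Open both channels for one device; the order of the two calls is arbitrary."""
    query_channel = connect(reg.ip, reg.query_port)
    alarm_channel = connect(reg.ip, reg.alarm_port)
    return query_channel, alarm_channel
```

The `connect` parameter lets a caller substitute a UDP-based or test implementation without changing the registration logic.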
Fig. 9 is a signaling interaction diagram of a method for processing an alarm storm according to an embodiment of the present invention. As shown in Fig. 9, the alarm storm processing flow provided by the embodiment of the present invention is as follows:
(1) when a monitored object of the network device fails, the network device generates alarm information and reports it to the processing device through the alarm reporting channel of the network device; the alarm information may include an identifier of the network device, an identifier of the monitored object, and the alarm content;
(2) after receiving the alarm information, the processing device judges whether the alarm information meets the alarm storm judgment rule;
(3) if the alarm information meets the alarm storm judgment rule, the processing device records alarm storm information, which comprises the identifier of the network device, the identifier of the monitored object, and the alarm storm state of the network device;
(4) the processing device sends the alarm storm information to a network management client, so that operation and maintenance personnel can learn of the alarm storm state of the network device and take corresponding measures;
(5) if the alarm information meets the alarm storm judgment rule, the processing device also closes the alarm reporting channel of the network device and no longer receives alarm information reported by the network device through that channel;
(6) the processing device periodically sends first state query information to the network device through the state query channel, wherein the first state query information comprises the identifier of the monitored object that caused the alarm storm;
(7) after receiving the first state query information, the network device obtains the state of a first query object, namely the monitored object that caused the alarm storm;
(8) the network device returns first state query response information to the processing device through the state query channel, wherein the first state query response information comprises the state of the first query object;
(9) after obtaining the state of the first query object, the processing device judges whether that state meets a preset condition;
(10) if the state of the first query object meets the preset condition, the processing device sends second state query information to the network device through the state query channel, wherein the second state query information is used to query the states of the monitored objects other than the one that caused the alarm storm;
(11) after receiving the second state query information, the network device obtains the state of a second query object, namely those other monitored objects;
(12) the network device returns second state query response information to the processing device through the state query channel, wherein the second state query response information comprises the state of the second query object;
(13) after receiving the state of the second query object, the processing device judges whether it is normal;
(14) if the processing device judges that the state of the second query object is normal, it sends alarm reporting channel re-establishment information to the network device according to the IP address of the network device and the port number of the alarm reporting channel;
(15) after receiving the alarm reporting channel re-establishment information, the network device returns alarm reporting channel re-establishment success information to the processing device. Once the processing device receives this success information, the alarm reporting channel is re-established, and the processing device resumes monitoring of the network device.
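The recovery part of this flow (steps (6) to (14)) can be sketched as a small decision helper. The sketch below is an assumption-laden illustration rather than the patented implementation: the class name `StormRecovery`, its method names, the string value `"normal"`, and the default of three queries are hypothetical. It takes the preset condition to be "a preset number of consecutive queries all return normal", which is the form given in claim 6.

```python
class StormRecovery:
    """Tracks one network device whose alarm reporting channel was closed."""

    def __init__(self, required_normal=3):
        # Assumed preset number of consecutive normal results (claim 6).
        self.required_normal = required_normal
        self.consecutive_normal = 0

    def on_first_response(self, storm_object_normal):
        """Record one periodic query of the object that caused the storm.

        Returns True once the preset condition holds, i.e. the second
        state query (for the other monitored objects) should be sent.
        """
        if storm_object_normal:
            self.consecutive_normal += 1
        else:
            # Any abnormal result restarts the count.
            self.consecutive_normal = 0
        return self.consecutive_normal >= self.required_normal

    @staticmethod
    def on_second_response(other_states):
        """Return True if the alarm reporting channel should be re-established."""
        return all(state == "normal" for state in other_states.values())


recovery = StormRecovery(required_normal=3)
results = [recovery.on_first_response(ok) for ok in [True, False, True, True, True]]
# results == [False, False, False, False, True]
```

In this sketch the abnormal result in the second query resets the counter, so the second state query is only triggered after three normal results in a row. The caller would then pass the returned states of the other monitored objects to `on_second_response` and send the re-establishment information only when it returns True.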
Fig. 10 is a schematic structural diagram of a device for processing an alarm storm according to an embodiment of the present invention. As shown in Fig. 10, the device for processing an alarm storm includes a receiving unit 1001 and a closing unit 1002, where:
the receiving unit 1001 is configured to receive alarm information reported by each network device through its respective alarm reporting channel; the alarm reporting channels are pre-established and correspond one-to-one to the network devices; the closing unit 1002 is configured to close the alarm reporting channel of a network device after it is determined that the alarm information reported by that network device meets an alarm storm judgment rule.
Specifically, an alarm reporting channel may be established between each network device and the device for processing an alarm storm. Each network device includes at least one monitored object. If a monitored object fails, an alarm is generated; the network device where the monitored object is located generates alarm information according to the alarm and reports it through the alarm reporting channel to the receiving unit 1001, which receives it. The alarm reporting channels are pre-established, correspond one-to-one to the network devices, and can be implemented over UDP or TCP. It can be understood that a monitored object may be software installed on the network device or a hardware component of the network device.
After the alarm information reported by a network device is received, the closing unit 1002 judges whether it meets the alarm storm judgment rule; if so, the alarm information reported by that network device is determined to constitute an alarm storm. The closing unit 1002 then closes the alarm reporting channel of the network device, which avoids heavy occupation of system resources and the risk of a system crash. For example, if the alarm reporting channel is a TCP connection, the closing unit 1002 may close the socket connection corresponding to the alarm reporting channel through a close function, so that alarm information from the network device is no longer received.
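One way the judgment performed by the closing unit could look is sketched below, assuming the rule from claim 2 (the number of identical alarm messages reported within a preset time period exceeds a threshold). The class `StormDetector`, its parameters, and the choice to treat two alarms as "the same" when device identifier, monitored object identifier, and alarm content all match are assumptions made for illustration; the source does not define these details.

```python
from collections import defaultdict, deque


class StormDetector:
    """Sliding-window counter for identical alarms (illustrative only)."""

    def __init__(self, window_seconds=10.0, threshold=100):
        self.window_seconds = window_seconds  # assumed preset time period
        self.threshold = threshold            # assumed alarm count threshold
        self._times = defaultdict(deque)      # alarm key -> recent timestamps

    def observe(self, device_id, object_id, content, now):
        """Record one alarm; return True if it meets the storm rule."""
        key = (device_id, object_id, content)
        times = self._times[key]
        times.append(now)
        # Drop timestamps that have fallen out of the window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.threshold


detector = StormDetector(window_seconds=10, threshold=3)
flags = [detector.observe("dev1", "fan", "overheat", t) for t in [0, 1, 2, 3]]
# flags == [False, False, False, True]
```

When `observe` returns True, a TCP-based implementation would close the connection object for that device's alarm reporting channel, for example by calling `close()` on a Python `socket.socket`, while leaving the state query channel open so that recovery can still be detected.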
The device for processing the alarm storm provided by the embodiment of the invention can receive the alarm information reported by each network device through the respective alarm reporting channel, and can close the alarm reporting channel of the network device after judging that the alarm information reported by the network device meets the alarm storm judgment rule, so that the occupation of system resources can be reduced when the alarm storm occurs.
The device embodiment provided by the present invention may be specifically used to execute the processing flows of the above method embodiments; its functions are not described again here, and reference may be made to the detailed description of the above method embodiments.
Fig. 11 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 11, the electronic device includes a processor 1101, a memory 1102, and a communication bus 1103;
the processor 1101 and the memory 1102 communicate with each other through the communication bus 1103;
the processor 1101 is configured to call the program instructions in the memory 1102 to perform the methods provided by the above method embodiments, for example: receiving alarm information reported by each network device through its respective alarm reporting channel, where the alarm reporting channels are pre-established and correspond one-to-one to the network devices; and, if the alarm information reported by a network device is judged to meet the alarm storm judgment rule, closing the alarm reporting channel of that network device.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: receiving alarm information reported by each network device through its respective alarm reporting channel, where the alarm reporting channels are pre-established and correspond one-to-one to the network devices; and, if the alarm information reported by a network device is judged to meet the alarm storm judgment rule, closing the alarm reporting channel of that network device.
The present embodiment provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example: receiving alarm information reported by each network device through its respective alarm reporting channel, where the alarm reporting channels are pre-established and correspond one-to-one to the network devices; and, if the alarm information reported by a network device is judged to meet the alarm storm judgment rule, closing the alarm reporting channel of that network device.
In addition, the logic instructions in the memory may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solutions, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an apparatus, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the embodiments of the present invention, and not to limit the same; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (12)
1. A method for processing an alarm storm, characterized by comprising:
receiving alarm information reported by each network device through respective alarm reporting channels; the alarm reporting channels are pre-established and correspond to the network equipment one by one;
and if the alarm information reported by the network equipment is judged to meet the alarm storm judgment rule, closing an alarm reporting channel of the network equipment.
2. The method of claim 1, wherein the alarm storm decision rule comprises:
the number of identical alarm messages reported by the network equipment within a preset time period is greater than a threshold value.
3. The method of claim 1, further comprising:
periodically inquiring through a state inquiry channel of the network equipment to obtain the state of a monitoring object corresponding to the alarm information meeting the alarm storm judgment rule; the state query channels are pre-established and correspond to the network equipment one by one;
and if the state of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule meets the preset condition, reestablishing an alarm reporting channel of the network equipment and stopping the periodic query of the network equipment.
4. The method of claim 1, further comprising:
periodically inquiring through a state inquiry channel of the network equipment to obtain the state of a monitoring object corresponding to the alarm information meeting the alarm storm judgment rule; the state query channels are pre-established and correspond to the network equipment one by one;
if the state of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule is judged to meet the preset condition, inquiring through the state inquiry channel to obtain the states of other monitored objects of the network equipment;
and if the states of other monitored objects of the network equipment are judged to be normal, reestablishing an alarm reporting channel of the network equipment and stopping the periodic query of the network equipment.
5. The method of claim 4, further comprising:
if judging that at least one of the states of other monitoring objects of the network equipment is abnormal, periodically inquiring through a state inquiry channel of the network equipment to obtain the states of all monitoring objects of the network equipment;
and if the states of all the monitored objects of the network equipment are judged to be normal, reestablishing an alarm reporting channel of the network equipment and stopping the periodic query of the network equipment.
6. The method according to claim 3 or 4, wherein the preset conditions include:
the states obtained in a preset number of consecutive queries of the monitored object corresponding to the alarm information meeting the alarm storm judgment rule are all normal.
7. The method according to claim 3 or 4, wherein the establishing step of the status query channel comprises:
receiving an alarm source registration message of the network device, wherein the alarm source registration message comprises an IP address of the network device and a port number of a status query channel;
and establishing a state query channel of the network equipment according to the IP address of the network equipment and the port number of the state query channel.
8. The method of claim 1, wherein the step of establishing the alarm reporting channel comprises:
receiving an alarm source registration message of the network equipment, wherein the alarm source registration message comprises an IP address of the network equipment and a port number of an alarm reporting channel;
and establishing an alarm reporting channel of the network equipment according to the IP address of the network equipment and the port number of the alarm reporting channel.
9. The method of claim 1, wherein the alarm information includes an identification of the network device and an identification of a monitored object; accordingly, the method further comprises:
and if the alarm information reported by the network equipment meets the alarm storm judgment rule, recording the identifier of the network equipment, the identifier of the monitored object and the alarm storm state of the network equipment, and sending the identifier of the network equipment, the identifier of the monitored object and the corresponding alarm storm state to a network management client.
10. A device for handling an alarm storm, comprising:
the receiving unit is used for receiving the alarm information reported by each network device through the respective alarm reporting channel; the alarm reporting channels are pre-established and correspond to the network equipment one by one;
and the closing unit is used for closing the alarm reporting channel of the network equipment after judging that the alarm information reported by the network equipment meets the alarm storm judgment rule.
11. An electronic device, comprising: a processor, a memory, and a communication bus, wherein:
the processor and the memory are communicated with each other through the communication bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 9.
12. A non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the method of any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810779197.1A CN110730087A (en) | 2018-07-16 | 2018-07-16 | Method and device for processing alarm storm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110730087A (en) | 2020-01-24 |
Family
ID=69217351
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810779197.1A Withdrawn CN110730087A (en) | 2018-07-16 | 2018-07-16 | Method and device for processing alarm storm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110730087A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20200124 |