CN113010882B - Custom position sequence pattern matching method suitable for cache loss attack - Google Patents
- Publication number
- CN113010882B (CN202110292042.7A)
- Authority
- CN
- China
- Prior art keywords
- character
- node
- edge
- mode
- mov
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/52—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/70—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
- G06F21/78—Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to matching algorithms and in particular provides a custom position sequence pattern matching method suitable for cache miss attacks. An automaton is built over a custom position sequence, and the current scan character is matched against the current state node. If the current scan character matches the chr of an edge value or of the failure pointer of the current state node, the automaton jumps to the next node along that edge or failure pointer. If the new node is the tail node of the pattern's custom sequence, i.e., a leaf node of the automaton, the pattern is hit: the pattern recorded in the OUTPUT table is output, the current state of the automaton jumps to the next node recorded in the current node, and scanning and matching continue. The custom position sequence matching algorithm addresses the problem that, under attack, a pattern matching algorithm always scans deep into the automaton, so that a large number of automaton nodes are not in the CPU cache, the cache hit rate is low, and system processing performance drops sharply.
Description
Technical Field
The present invention relates to matching algorithms, and more particularly to a custom position sequence pattern matching method suitable for cache miss attacks.
Background
Pattern matching has long been both a research focus and a hard problem in computer science. In the field of information security, network security systems are seriously threatened by cache miss attacks that target pattern matching algorithms. The attacker exploits the order in which the scan pointer moves in a pattern matching algorithm to forge attack data, so that the algorithm always scans deep along a path and the cache miss rate of the system rises. Multi-pattern exact matching algorithms fall into three categories according to search mode: prefix search, suffix search, and substring search. Among prefix search methods, Aho-Corasick (AC) is the most typical algorithm; it moves the window by computing the longest common prefix between the text and the patterns. Among suffix search methods, the classic algorithm is Wu-Manber (WM), which searches backwards from right to left within a window. Among substring search methods, the representative algorithm is Set Backward Oracle Matching (SBOM).
The AC algorithm constructs a deterministic finite automaton (DFA) that records the pattern set as a Trie. The AC algorithm uses three tables: a GOTO table, a FAIL table, and an OUTPUT table. The GOTO table records the next state for the current state and the next character; the FAIL table determines which state to fall back to when the GOTO table yields no valid next state; and the OUTPUT table stores the patterns matched in a given state. In an AC automaton, each node therefore carries GOTO transition edges and a failure pointer.
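To make the three tables concrete, the following is a minimal Python sketch of a textbook Aho-Corasick automaton. It is an illustration of the prior-art algorithm only, not code from the patent; the names `build_ac` and `ac_search` and the list-based table layout are assumptions chosen for brevity.

```python
from collections import deque

def build_ac(patterns):
    """Build GOTO, FAIL and OUTPUT tables for a textbook Aho-Corasick automaton."""
    goto, fail, output = [{}], [0], [[]]      # state 0 is the root
    for p in patterns:                        # phase 1: insert patterns into a trie
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        output[s].append(p)
    queue = deque(goto[0].values())           # phase 2: breadth-first failure links
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:    # climb failure links until ch fits
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            output[t] = output[t] + output[fail[t]]
    return goto, fail, output

def ac_search(text, patterns):
    """Return (start_index, pattern) for every occurrence; one character per step."""
    goto, fail, output = build_ac(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        hits.extend((i - len(p) + 1, p) for p in output[s])
    return hits
```

Note that the scan position advances exactly one character per state transition, which is the fixed scan order the Background goes on to criticize.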
The WM algorithm includes two steps, preprocessing and scanning. Preprocessing builds 3 tables: the SHIFT table, the HASH table, and the PREFIX table. During scanning, instead of locating single characters in the pattern string, character matching is carried out on fixed-length substring blocks. The scanning step uses a sliding window whose size is the shortest pattern length and which starts at the first character of the text. A hash value is first computed over the window suffix and used to look up the SHIFT table. If the SHIFT value is greater than zero, the window is moved forward in the text character stream by that value. When the SHIFT value is zero, the HASH table is checked and the suffix hash value is used to find a candidate list of possible matches.
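The SHIFT and HASH lookups can be sketched in a few lines of Python. This is a simplified illustration of the prior-art Wu-Manber idea under stated assumptions: blocks of `B = 2` characters are used directly as dictionary keys instead of being hashed, the PREFIX table is omitted, and candidates are verified by direct comparison. The function name `wu_manber` is hypothetical.

```python
from collections import defaultdict

def wu_manber(patterns, text, B=2):
    """Simplified Wu-Manber scan; requires every pattern to have length >= B."""
    m = min(len(p) for p in patterns)          # window size = shortest pattern length
    default = m - B + 1                        # shift for blocks that occur in no pattern
    shift = {}
    candidates = defaultdict(list)             # plays the role of the HASH table
    for p in patterns:
        for j in range(B, m + 1):              # block ending at 1-based position j
            block = p[j - B:j]
            shift[block] = min(shift.get(block, default), m - j)
        candidates[p[m - B:m]].append(p)       # keyed by the window-suffix block
    hits = []
    end = m                                    # exclusive end index of the current window
    while end <= len(text):
        block = text[end - B:end]
        s = shift.get(block, default)
        if s > 0:
            end += s                           # SHIFT > 0: slide the window forward
        else:
            start = end - m                    # SHIFT == 0: verify candidate patterns
            for p in candidates[block]:
                if text.startswith(p, start):
                    hits.append((start, p))
            end += 1
    return hits
```

Each window movement here is at most `m - B + 1` characters, so the scan position never moves further than the shortest pattern length in one step.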
The SBOM algorithm uses a Factor Oracle structure to construct an automaton for the string set; the automaton recognizes a superset of the substrings of the strings in the set. During preprocessing, the prefix of minimum length is taken from every string, a Trie is built over these prefixes in reverse, and the Factor Oracle automaton is then constructed on the Trie. During matching, the text is scanned through a sliding window whose length is that minimum prefix length. Within the window, the longest suffix that can be read from the initial state is scanned from right to left, such a suffix being a factor of some string. If the suffix is hit, the remaining part of the string is scanned; if it is not hit, the window is shifted and the state returns to the initial state.
In the AC and SBOM algorithms, the scan position moves by one character for each state transition of the automaton; in the WM algorithm, the scan position moves within a sliding window of size w, where w is the shortest pattern length. The scan position of these common algorithms therefore moves within [1, w] at each step and cannot cover the longest pattern.
The existing algorithms all scan in a fixed position order, so an attacker can easily use that order to mount a cache attack. The attacker only needs to obtain some patterns in advance and remove or modify the character at the last position of each pattern's scan order. By sending such data repeatedly in large volume, the attacker forces the pattern matching algorithm to always scan deep; because a large number of automaton nodes are then not in the CPU cache, the cache hit rate is low and system processing performance drops sharply.
Disclosure of Invention
To solve the technical problem in the prior art that an attacker can easily mount a cache attack, leading to a low cache hit rate and reduced system processing performance, the invention provides a custom position sequence pattern matching method suitable for cache miss attacks.
A custom position sequence pattern matching method suitable for cache miss attacks comprises the following steps:
S1. Construct an automaton;
S2. Point the scan pointer at the first character of the data to be matched, and set the current state of the automaton to the root node;
S3. Match the current scan character against the current state node. If the current scan character matches the chr of an edge value of the current state node, execute step S4. If it fails to match the chr of every edge value but matches a chr in the failure pointer, the scan pointer moves the corresponding distance according to the mov value, the automaton jumps to the next node recorded in the failure pointer, and scanning resumes;
S4. The automaton jumps to the next node along the edge. If the new node is the tail node of the pattern's custom sequence, i.e., a leaf node of the automaton, execute step S5. Otherwise, the scan pointer moves the corresponding distance according to the mov value of the edge; if the moved position lies beyond the tail character of the data to be matched, scanning terminates, and otherwise scanning resumes;
S5. A pattern is hit: the pattern recorded in the OUTPUT table is output, the current state of the automaton jumps to the next node recorded in the current node, and step S3 is executed.
Preferably, the automaton consists of a GOTO table, a FAIL table and an OUTPUT table.
Preferably, two types of windows are set when constructing the automaton: a fixed-size window and variable-size windows. The fixed-size window spans positions [1, shortest pattern length - 1]. A variable-size window spans positions [2, tail character position of the longest pattern that begins with a given first character]. There are multiple variable-size windows, their number being the size of the set of pattern first characters.
Preferably, the automaton is constructed in the order layer 1, layer 2, layer 3; the custom position sequence of each layer is as follows:
(1) Layer 1: contains 3 characters, namely the window first character, the window tail character, and the longest-window tail character;
(2) Layer 2: the largest substring of the window after its first and last characters are removed, reordered in reverse;
Preferably, the automaton is constructed in step S1 as follows:
S1.1. Construct the automaton according to the custom position sequence; establish a root node numbered 0;
S1.2. Create new nodes according to the first character of layer 1, with incrementing numbers, adding 1 node per pattern. Draw an edge between the root node and each newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the first character of layer 1 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the root node other than the character of the created edge;
S1.3. Create new nodes according to the second character of layer 1, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the second character of layer 1 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
S1.4. Create new nodes according to the third character of layer 1, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the third character of layer 1 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
S1.5. Create new nodes according to the first character of layer 2, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the first character of layer 2 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
S1.6. Move the current node of each pattern to the newly created node, and repeat step S1.5 until nodes have been created for the last character of layer 2;
S1.7. Create new nodes according to the first character of layer 3, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the first character of layer 3 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
S1.8. Move the current node of each pattern to the newly created node, and repeat step S1.7 until nodes have been created for the last character of layer 3.
Preferably, the window sliding method in step S1.2 includes the following steps:
S1.2.1. Align the first character of each pattern with the first character of the window;
S1.2.2. Compare the characters at the already scanned positions in the window with the characters at the corresponding positions of each pattern. If all positions match, execute step S1.2.3; if the match fails, execute step S1.2.4;
S1.2.3. Find the best pattern and return the next position to be scanned in that pattern according to the custom position sequence. The mov value is the distance from the most recently scanned character position to the next position to be scanned, and the algorithm ends;
S1.2.4. Move the window one character to the left. If the tail position of the window is less than the first character position of all patterns, the algorithm ends; otherwise, execute step S1.2.2.
Preferably, the mov value is calculated by the window sliding method in two steps: first, find a pattern that agrees with the scanned positions; then calculate the first unscanned position of the pattern found and derive the mov value from it. The formula for computing the mov value is as follows:
In the formula, the indicator term denotes whether a character has been scanned (1 for scanned, 0 for not scanned); k denotes the current scan position; and the remaining term denotes the position of the first unscanned character of the pattern.
The invention has the following beneficial effects. The custom position sequence pattern matching method takes the form of an automaton and differs from the common AC algorithm in two respects: first, the automaton is constructed by traversing each pattern in the custom position order rather than in byte order; second, the edge values of the automaton are defined differently. Under a cache miss attack, the nodes visited by the pattern matching algorithm with the custom position order are concentrated in layer 1, so the cache miss rate does not increase noticeably and matching performance remains stable. Each node has a failure list, and all failure transition states are in layer 1, so the nodes accessed by attack data are mainly in layer 1, which greatly reduces the cache miss rate and improves CPU performance.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a diagram illustrating an exemplary automaton according to an embodiment of the present invention;
FIG. 2 is a flow chart of automaton scanning according to an embodiment of the present invention;
FIG. 3 is an exemplary diagram of automaton scanning according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a window sliding method according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an example sequence of windows and custom locations according to an embodiment of the present invention.
Detailed Description
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not an exhaustive list. It should be noted that the embodiments and the features of the embodiments in the present application may be combined with each other provided there is no conflict.
Step 1: construct an automaton according to the custom position sequence. The automaton is composed of a GOTO table, a FAIL table, and an OUTPUT table. The GOTO table records the next state and the scan position movement distance for the current state and the current character; the FAIL table determines which state to return to, and the scan position movement distance, when the GOTO table yields no valid next state; and the OUTPUT table stores the pattern hit in a given state together with the next state. Two types of windows are set when constructing the automaton: one fixed-size window and multiple variable-size windows. The fixed-size window spans positions [1, shortest pattern length - 1]. A variable-size window spans positions [2, tail character position of the longest pattern that begins with a given first character]; the number of variable-size windows is the size of the set of pattern first characters. For each pattern, the algorithm preferentially matches the first character, the tail character, and the window tail character, so that characters that cannot be hit are placed in the first few layers of automaton nodes. The algorithm divides every pattern into 3 layers, each with a different custom position sequence. The automaton is constructed in the order layer 1, layer 2, layer 3, and the custom position sequence of each layer is as follows:
(1) Layer 1: contains 3 characters, namely the first character, the tail character, and the longest-window tail character;
(2) Layer 2: the largest substring after the first and last characters are removed, reordered in reverse;
Referring to FIG. 5 for the custom position order: the pattern set {ADCAB, EBPCPCA, ECCAPDC, BDCABPDA} has a shortest length of 5, so the fixed window is [1, 4]. The longest pattern with first character A has length 5, giving the window [2, 5]; the longest with first character E has length 7, giving [2, 7]; and the longest with first character B has length 8, giving [2, 8]. Layer 1 consists of the 1st, 4th, and tail characters of each pattern, and its length is fixed at 3. Layer 2 consists of the 2nd and 3rd characters of each pattern in reverse order, and its length is fixed at 2. Layer 3 consists of the 5th through penultimate characters of each pattern in sequence, so its length varies from pattern to pattern.
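The windows and layer positions in this example can be reproduced with a short Python sketch. It is only an illustration of the worked example above: positions are 1-based, the third position of layer 1 is taken to be the pattern's own tail character (as in the example), and the helper names `windows` and `custom_order` are hypothetical.

```python
def windows(patterns):
    """Return the fixed window and one variable window per pattern first character."""
    shortest = min(len(p) for p in patterns)
    fixed = (1, shortest - 1)
    variable = {}
    for p in patterns:
        longest = max(len(p), variable.get(p[0], (2, 0))[1])
        variable[p[0]] = (2, longest)          # [2, tail of longest pattern with this first char]
    return fixed, variable

def custom_order(pattern, shortest):
    """Return the 1-based scan positions of layers 1, 2 and 3 for one pattern."""
    w = shortest - 1                           # tail position of the fixed window
    layer1 = [1, w, len(pattern)]              # first, fixed-window tail, pattern tail
    layer2 = list(range(w - 1, 1, -1))         # inside of the fixed window, reversed
    layer3 = list(range(w + 1, len(pattern)))  # from position w + 1 to the penultimate
    return layer1, layer2, layer3

patterns = ["ADCAB", "EBPCPCA", "ECCAPDC", "BDCABPDA"]
fixed, variable = windows(patterns)
orders = {p: custom_order(p, 5) for p in patterns}
```

For the shortest pattern ADCAB, layer 3 is empty, which is consistent with the statement that the layer 3 length is not fixed.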
An edge is defined as edge(chr, mov), where chr denotes the current character and mov denotes the distance the scan pointer moves after the current character is matched.
Step 1.1: establish a root node numbered 0.
Step 1.2: create new nodes according to the first character of layer 1, with incrementing numbers, adding 1 node per pattern. Draw an edge between the root node and each newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the first character of layer 1 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the root node other than the character of the created edge;
The window sliding method comprises the following steps:
Step 1.2.1: align the first character of each pattern with the first character of the window;
Step 1.2.2: compare the characters at the already scanned positions in the window with the characters at the corresponding positions of each pattern. If all positions match, execute step 1.2.3; if the match fails, execute step 1.2.4;
Step 1.2.3: find the best pattern and return the next position to be scanned in that pattern according to the custom position sequence. The mov value is the distance from the most recently scanned character position to the next position to be scanned, and the algorithm ends;
Step 1.2.4: move the window one character to the left. If the tail position of the window is less than the first character position of all patterns, the algorithm ends; otherwise, execute step 1.2.2.
The mov value is calculated as follows:
Assume that the current scan position is k, where k is the position relative to the window start position, and that the tail character positions of the windows are known. The invention designs a window sliding method for calculating the mov value, which comprises two steps: first, find a pattern that agrees with the scanned positions; second, calculate the first unscanned position of the pattern found and derive mov from it;
The first step: given the pattern set and the window, together with the characters already scanned in the window and their positions, look up every pattern whose characters agree with the scanned characters at all scanned positions i.
The second step: for each pattern found, take its custom position order and calculate mov. The mov calculation formula is as follows:
In the formula, the indicator term denotes whether a character has been scanned (1 for scanned, 0 for not scanned); k denotes the current scan position; and the remaining term denotes the position of the first unscanned character of the pattern.
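The second step can be sketched in Python under one explicit assumption: that mov is the first unscanned position in the pattern's custom order minus the current scan position k, which is the reading suggested by the description of mov as the distance from the most recently scanned position to the next position to be scanned. The function name `next_mov`, its arguments, and the hard-coded order for ADCAB are hypothetical and serve only as an illustration.

```python
def next_mov(order, scanned, k):
    """Return the signed distance from scan position k to the first unscanned position.

    order   -- 1-based positions of one pattern in custom scan order
    scanned -- set of positions already scanned
    k       -- current scan position
    Returns None when every position in the order has been scanned.
    """
    for pos in order:
        if pos not in scanned:
            return pos - k
    return None

# Custom order of ADCAB with shortest pattern length 5:
# layer 1 = positions 1, 4, 5; layer 2 = positions 3, 2; layer 3 is empty.
order_adcab = [1, 4, 5, 3, 2]
mov_after_first = next_mov(order_adcab, {1}, 1)          # 1st -> 4th character
mov_after_second = next_mov(order_adcab, {1, 4}, 4)      # 4th -> tail character
mov_after_third = next_mov(order_adcab, {1, 4, 5}, 5)    # tail -> 3rd character
```

Under this assumption the first two values agree with the edge values edge(A,3) and edge(A,1) given for the ADCAB branch in the construction example, and the negative third value shows why a scan pointer can move left.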
Step 1.3: create new nodes according to the second character of layer 1, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the second character of layer 1 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
Step 1.4: create new nodes according to the third character of layer 1, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the third character of layer 1 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
Step 1.5: create new nodes according to the first character of layer 2, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the first character of layer 2 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
Step 1.6: move the current node of each pattern to the newly created node, and repeat step 1.5 until nodes have been created for the last character of layer 2;
Step 1.7: create new nodes according to the first character of layer 3, with incrementing numbers, adding 1 node per pattern. For each pattern, draw an edge between its current node and the newly added node. The GOTO table entry is the edge value edge(chr, mov), where chr is the first character of layer 3 and mov is the movement distance calculated by the window sliding method. The FAIL table holds the edge values for all characters at the pattern's current node other than the character of the created edge;
Step 1.8: move the current node of each pattern to the newly created node, and repeat step 1.7 until nodes have been created for the last character of layer 3.
Referring to FIG. 1 for the automaton construction result, the pattern set {ADCAB, EBPCA, ECCADCP, BDCABPDA} is used as an example to show the automaton built by the custom position order algorithm. First, a root node is created and numbered 0. New nodes are then created and numbered according to the first characters {A, B, E} of layer 1 of the 4 patterns. Edges are drawn between the root node and the 3 newly added nodes, with edge values edge(A,3), edge(B,3), and edge(E,3) respectively. The failure list of the root node contains the entry (*,1). New nodes are then created and numbered according to the second characters {A, A, C, A} of layer 1 of the 4 patterns. An edge with edge value edge(A,1) is drawn to one of these nodes, and the failure list of its source node is a set giving the mov for each character: {(B,-1)(C,-2)(D,-1)(E,3)(P,-2)(*,1)}. The edges and failure lists of the other nodes are similar. The remaining nodes are constructed in the same way as in step (3), and the details are not repeated. Each branch of the tree corresponds to one pattern; each character creates a node in the custom position order; and each edge is a pair of values, the first being the character and the second the distance the scan position moves after the scan. Each node has a failure list, and all failure transition states are in layer 1, so the nodes accessed by attack data are mainly in layer 1, which greatly reduces the cache miss rate and improves CPU performance.
Step 2: point the scan pointer at the first character of the data to be matched, and set the current state of the automaton to the root node;
Step 3: match the current scan character against the current state node. If the current scan character matches the chr of an edge value of the current state node, execute step 4. If it fails to match the chr of every edge value but matches a chr in the failure pointer, the scan pointer moves the corresponding distance according to the mov value, the automaton jumps to the next node recorded in the failure pointer, and scanning resumes. This step can be understood with reference to the automaton scanning flow chart in FIG. 2.
Step 4: the automaton jumps to the next node along the edge. If the new node is the tail node of the pattern's custom sequence, i.e., a leaf node of the automaton, execute step 5. Otherwise, the scan pointer moves the corresponding distance according to the mov value of the edge; if the moved position lies beyond the tail character of the data to be matched, scanning terminates, and otherwise scanning resumes;
Step 5: a pattern is hit. The pattern recorded in the OUTPUT table is output, the current state of the automaton jumps to the next node recorded in the current node, and step 3 is executed.
Referring to FIG. 3, the process of the automaton scanning an attack sample is illustrated. Attack data is first constructed, taking the pattern set {ADCAB, EBPCPCA, ECCAPDC, BDCABPDA} as an example: data attacking the AC algorithm {ECCADC, BDCABPD}, data attacking the WM algorithm {EBMCP}, and data attacking the SBOM algorithm {CDCAB}. An attack sample {CDCAB EBMCP ECCADC BDCABPD} is generated from the attack data. The numbers on the edges in the figure indicate the order of the scans, and the node an edge leads to is the next state node of the automaton. Specifically:
First, the state machine enters the root node. The 1st character C is scanned; the state machine fails to match C and finds (*,1) in the failure list; the state remains the root node, the scan pointer moves 1 character to the right, and the current character becomes the 2nd character D;
Next, the 2nd character D is scanned; the state machine fails to match D and finds (*,1) in the failure list; the state remains the root node, the scan pointer moves 1 character to the right, and the current character becomes the 3rd character C;
Next, the 3rd character C is scanned; the state machine fails to match C and finds (*,1) in the failure list; the state remains the root node, the scan pointer moves 1 character to the right, and the current character becomes the 4th character A;
Next, the 4th character A is scanned; the state machine matches A successfully and, according to the edge value edge(A,3), the scan pointer moves 3 characters to the right, the current character becomes the 7th character B, and the automaton jumps along the edge to the next state;
Next, the 7th character B is scanned; the state machine fails to match B and finds (B,-1) in the failure list; the scan pointer moves 1 character to the left, the current character becomes the 6th character E, and the automaton jumps along the failure pointer to the recorded state;
Next, the 6th character E is scanned; the state machine matches E successfully and, according to the edge value edge(E,3), the scan pointer moves 3 characters to the right, the current character becomes the 9th character C, and the automaton jumps along the edge to the next state;
Finally, the subsequent steps are scanned in the same way until the scan pointer passes the last character D of the string to be matched.
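The first six scans of this walkthrough can be replayed with a small Python sketch of the scan loop. It is a partial illustration, not the patented implementation: the node numbers, the dictionary layout, and the function name `scan` are hypothetical; only the root node and the node reached by edge(A,3) are filled in; every failure entry is assumed to return to the root node, which is consistent with the walkthrough but not stated explicitly in the text; and leaf handling and the OUTPUT table are omitted.

```python
# Each table entry maps a character to (next_state, mov). "*" is the default failure entry.
ROOT, NODE_A, NODE_B, NODE_E, NODE_AA = 0, 1, 2, 3, 4   # hypothetical node numbers
automaton = {
    ROOT: {
        "goto": {"A": (NODE_A, 3), "B": (NODE_B, 3), "E": (NODE_E, 3)},
        "fail": {"*": (ROOT, 1)},
    },
    NODE_A: {
        "goto": {"A": (NODE_AA, 1)},
        "fail": {"B": (ROOT, -1), "C": (ROOT, -2), "D": (ROOT, -1),
                 "E": (ROOT, 3), "P": (ROOT, -2), "*": (ROOT, 1)},
    },
}

def scan(automaton, text, max_steps):
    """Replay up to max_steps scans; return (trace, final state, next 1-based position)."""
    state, pos, trace = ROOT, 0, []
    for _ in range(max_steps):
        if not 0 <= pos < len(text):           # pointer left the data: stop scanning
            break
        ch, node = text[pos], automaton[state]
        trace.append((pos + 1, ch, state))     # record 1-based position, char, state
        if ch in node["goto"]:                 # edge match: follow the GOTO edge
            state, mov = node["goto"][ch]
        else:                                  # otherwise consult the failure list
            state, mov = node["fail"].get(ch, node["fail"]["*"])
        pos += mov                             # mov may be negative (move left)
    return trace, state, pos + 1

# First ten characters of the attack sample, limited to the six scans described above.
trace, state, next_pos = scan(automaton, "CDCABEBMCP", max_steps=6)
positions = [p for p, _, _ in trace]
```

In this replay the scanned positions are 1, 2, 3, 4, 7, 6, the next position to scan is the 9th character, and five of the six scans are served from the root node, which is the behavior the cache argument relies on.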
In this example, the data to be scanned contains 23 characters. The automaton performs 16 scans in total (69.6%) and skips 8 characters (34.8%). All 16 nodes in the automaton's node jump sequence are in layer 1, and the root node accounts for 8 of them (50%). Therefore, under a cache miss attack, the nodes visited by the pattern matching algorithm with the custom order are concentrated in layer 1 and cache misses do not occur, so matching performance remains stable.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (5)
1. A custom position sequence pattern matching method suitable for cache loss attack, characterized by comprising the following steps:
S1, constructing an automaton, comprising the following steps:
S1.1, constructing the automaton according to a user-defined position sequence; establishing a root node with the number 0;
S1.2, creating new nodes according to the first character of layer 1, with incrementing numbers, adding 1 node for each pattern; drawing an edge between the root node and each newly added node, wherein the GOTO table holds the edge value edge(chr, mov), chr is the first character of layer 1, mov is the movement distance calculated by the window sliding method, and the FAIL table holds the edge values of the root node for the characters other than those of the created edges;
the window sliding method comprises the following steps:
S1.2.1, aligning the first character of each pattern with the first character of the window;
S1.2.2, comparing the scanned positions in the window with the corresponding positions of each pattern; if all positions match, executing step S1.2.3; if the matching fails, executing step S1.2.4;
S1.2.3, finding the best pattern and returning the next position to be scanned in that pattern according to the user-defined position sequence, the mov value being the distance from the most recently scanned character position to the next position to be scanned; the algorithm ends;
S1.2.4, moving the window one character to the left; if the tail position of the window is less than the positions of the first characters of all patterns, the algorithm ends; otherwise, executing step S1.2.2;
S1.3, creating new nodes according to the second character of layer 1, with incrementing numbers, adding 1 node for each pattern; drawing an edge between the current node and the newly added node of each pattern, wherein the GOTO table holds the edge value edge(chr, mov), chr is the second character of layer 1, mov is the movement distance calculated by the window sliding method, and the FAIL table holds the edge values of the pattern's current node for the characters other than those of the created edges;
S1.4, creating new nodes according to the third character of layer 1, with incrementing numbers, adding 1 node for each pattern; drawing an edge between the current node and the newly added node of each pattern, wherein the GOTO table holds the edge value edge(chr, mov), chr is the third character of layer 1, mov is the movement distance calculated by the window sliding method, and the FAIL table holds the edge values of the pattern's current node for the characters other than those of the created edges;
S1.5, creating new nodes according to the first character of layer 2, with incrementing numbers, adding 1 node for each pattern; drawing an edge between the current node and the newly added node of each pattern, wherein the GOTO table holds the edge value edge(chr, mov), chr is the first character of layer 2, mov is the movement distance calculated by the window sliding method, and the FAIL table holds the edge values of the pattern's current node for the characters other than those of the created edges;
S1.6, moving the current node of each pattern to the newly created node, and repeating step S1.5 until node creation is completed for the last character of layer 2;
S1.7, creating new nodes according to the first character of layer 3, with incrementing numbers, adding 1 node for each pattern; drawing an edge between the current node and the newly added node of each pattern, wherein the GOTO table holds the edge value edge(chr, mov), chr is the first character of layer 3, mov is the movement distance calculated by the window sliding method, and the FAIL table holds the edge values of the pattern's current node for the characters other than those of the created edges;
S1.8, moving the current node of each pattern to the newly created node, and repeating step S1.7 until node creation is completed for the last character of layer 3;
S2, pointing the scanning pointer to the first character of the data to be matched, and setting the current state of the automaton to the root node;
S3, matching the current scanned character against the current state node: the current scanned character is compared with the chr of all edge values and failure pointers of the current state node; if it matches the chr of an edge value, step S4 is executed; if it fails to match the chr of any edge value but matches the chr of a value in the failure pointer, the scanning pointer moves the corresponding distance according to the mov value, the automaton jumps to the next node according to the record in its failure pointer, and scanning resumes;
S4, the automaton jumps to the next node along the edge; if the new node is the tail node of the pattern's custom sequence, namely a leaf node of the automaton, step S5 is executed; otherwise, the scanning pointer moves the corresponding distance according to the mov value of the edge; if the moved position is beyond the tail character of the data to be matched, scanning terminates; otherwise, scanning resumes;
S5, a pattern is hit: the pattern recorded in the OUTPUT table is output, the current state of the automaton jumps to the next node recorded by the current node, and step S3 is executed.
2. The method of claim 1, wherein the automaton comprises a GOTO table, a FAIL table, and an OUTPUT table.
3. The method as claimed in claim 2, wherein two types of windows are set for the automaton, a fixed window and variable windows, namely one fixed-size window and multiple variable-size windows, wherein the size of the fixed window is [1, shortest pattern length - 1], the size of each variable window is [2, tail character position of the longest pattern with the second character as the first character], and there are multiple variable windows, their number being the size of the set of first characters of the patterns.
4. The custom position sequence pattern matching method suitable for cache loss attack as claimed in claim 3, wherein the automaton is constructed in the order layer 1, layer 2, layer 3; the custom position sequence of each layer is as follows:
(1) Layer 1: comprises 3 characters, namely a first character, a tail character, and a longest-window tail character;
(2) Layer 2: the largest substring after removing the first and last 2 characters, reordered in reverse order;
5. The method according to claim 4, wherein the window sliding method calculates the mov value by finding a pattern corresponding to the scanned positions and calculating the mov value for the first unscanned position found; the formula for calculating the mov value is as follows:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110292042.7A CN113010882B (en) | 2021-03-18 | 2021-03-18 | Custom position sequence pattern matching method suitable for cache loss attack |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113010882A (en) | 2021-06-22 |
CN113010882B (en) | 2022-08-30 |
Family
ID=76402487
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110292042.7A Active CN113010882B (en) | 2021-03-18 | 2021-03-18 | Custom position sequence pattern matching method suitable for cache loss attack |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113010882B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103500178A (en) * | 2013-09-09 | 2014-01-08 | 中国科学院计算机网络信息中心 | Quick multi-mode matching method on worst-case scenario of FS algorithm |
CN109977276A (en) * | 2019-03-22 | 2019-07-05 | 华南理工大学 | A kind of single pattern matching method based on Sunday algorithm improvement |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7725510B2 (en) * | 2006-08-01 | 2010-05-25 | Alcatel-Lucent Usa Inc. | Method and system for multi-character multi-pattern pattern matching |
US8504510B2 (en) * | 2010-01-07 | 2013-08-06 | Interdisciplinary Center Herzliya | State machine compression for scalable pattern matching |
WO2013103994A2 (en) * | 2012-01-08 | 2013-07-11 | Oppenheimer Steven Charles | System and method for item self-assessment as being extant or displaced |
CA2892471C (en) * | 2013-01-11 | 2023-02-21 | Db Networks, Inc. | Systems and methods for detecting and mitigating threats to a structured data storage system |
US9996387B2 (en) * | 2013-11-04 | 2018-06-12 | Lewis Rhodes Labs, Inc. | Context switching for computing architecture operating on sequential data |
CN106464598B (en) * | 2014-04-23 | 2019-04-23 | 贝匡特有限公司 | Method and apparatus for the web impact factor based on transmission rate gradient |
CN104796354A (en) * | 2014-11-19 | 2015-07-22 | 中国科学院信息工程研究所 | Out-of-order data packet string matching method and system |
CN105260354B (en) * | 2015-08-20 | 2018-08-21 | 及时标讯网络信息技术(北京)有限公司 | A kind of Chinese AC automatic machines working method based on keyword dictionary tree construction |
CN106067039B (en) * | 2016-05-30 | 2019-01-29 | 桂林电子科技大学 | Method for mode matching based on decision tree beta pruning |
US10678907B2 (en) * | 2017-01-26 | 2020-06-09 | University Of South Florida | Detecting threats in big data platforms based on call trace and memory access patterns |
CN110071871A (en) * | 2019-03-13 | 2019-07-30 | 国家计算机网络与信息安全管理中心 | A kind of large model pool ip address matching process |
CN109918548A (en) * | 2019-04-08 | 2019-06-21 | 上海凡响网络科技有限公司 | A kind of methods and applications of automatic detection document sensitive information |
CN110362669B (en) * | 2019-07-18 | 2022-07-01 | 中科信息安全共性技术国家工程研究中心有限公司 | Method suitable for fast keyword retrieval |
CN111046938B (en) * | 2019-12-06 | 2020-12-01 | 邑客得(上海)信息技术有限公司 | Network traffic classification and identification method and equipment based on character string multi-mode matching |
Also Published As
Publication number | Publication date |
---|---|
CN113010882A (en) | 2021-06-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||