CN117609992A

CN117609992A - Data disclosure detection method, device and storage medium

Info

Publication number: CN117609992A
Application number: CN202311600724.5A
Authority: CN
Inventors: 张佳发; 莫嘉永; 邹洪; 曾子峰; 许伟杰; 金浩; 江家伟; 陈锋
Original assignee: China Southern Power Grid Digital Power Grid Group Information Communication Technology Co ltd
Current assignee: China Southern Power Grid Digital Power Grid Group Information Communication Technology Co ltd
Priority date: 2023-11-27
Filing date: 2023-11-27
Publication date: 2024-02-27

Abstract

Embodiments of the invention disclose a data disclosure detection method, a data disclosure detection device and a storage medium. The method may include: for a service system to be subjected to data disclosure detection, collecting a plurality of hypertext transfer protocol flows of the service system within a target time period and identifying them to obtain a plurality of access objects that access the service system within the target time period; for each access object of the plurality of access objects, determining, among the plurality of hypertext transfer protocol flows, the target flow corresponding to the access object, and identifying the target flow to obtain the target data type of the key data accessed by the access object in the service system within the target time period; and acquiring the historical data type of the key data accessed by the access object in the service system within a first historical time period, and comparing the target data type with the historical data type to detect whether the service system has a data leakage risk. The technical solution of the invention can accurately detect whether the service system is at risk of data leakage.

Description

Data disclosure detection method, device and storage medium

Technical Field

The embodiments of the invention relate to the technical field of data processing, and in particular to a data disclosure detection method, a data disclosure detection device and a storage medium.

Background

With the increasing abundance of data assets, it is important for enterprises to accurately detect whether an internally used service system has a risk of data disclosure so as to improve the overall security of the enterprise network environment.

At present, data disclosure detection is mainly performed based on content identification. However, this implementation cannot effectively distinguish between normal data access and abnormal data leakage, which can lead to a large number of false positives, and improvement is needed.

Disclosure of Invention

The embodiment of the invention provides a data disclosure detection method, a device and a storage medium, which can accurately detect whether a service system has a data disclosure risk.

According to an aspect of the present invention, there is provided a data disclosure detection method, including:

for a service system to be subjected to data disclosure detection, collecting a plurality of hypertext transfer protocol flows of the service system within a target time period, and identifying the plurality of hypertext transfer protocol flows to obtain a plurality of access objects that access the service system within the target time period;

for each access object of the plurality of access objects, determining, among the plurality of hypertext transfer protocol flows, a target flow corresponding to the access object, and identifying the target flow to obtain a target data type of key data accessed by the access object in the service system within the target time period; and

acquiring a historical data type of key data accessed by the access object in the service system within a first historical time period, comparing the target data type with the historical data type, and detecting whether the service system has a data leakage risk.

According to another aspect of the present invention, there is provided a data disclosure detection apparatus including:

an access object obtaining module, configured to, for a service system to be subjected to data disclosure detection, collect a plurality of hypertext transfer protocol flows of the service system within a target time period, and identify the plurality of hypertext transfer protocol flows to obtain a plurality of access objects that access the service system within the target time period;

a target data type obtaining module, configured to, for each access object of the plurality of access objects, determine, among the plurality of hypertext transfer protocol flows, a target flow corresponding to the access object, and identify the target flow to obtain a target data type of key data accessed by the access object in the service system within the target time period; and

a data leakage risk detection module, configured to acquire a historical data type of key data accessed by the access object in the service system within a first historical time period, compare the target data type with the historical data type, and detect whether the service system has a data leakage risk.

According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the data disclosure detection method provided by any embodiment of the present invention.

According to the technical solution of the embodiments of the invention, for a service system to be subjected to data disclosure detection, a plurality of hypertext transfer protocol flows of the service system within a target time period are collected and identified to obtain a plurality of access objects that access the service system within the target time period. The access objects of the data are thereby comprehensively identified, which facilitates accurate tracking of the object responsible for a leak. For each access object, the target flow corresponding to that access object is determined among the hypertext transfer protocol flows and identified to obtain the target data type of the key data accessed by the access object in the service system within the target time period, so that the content accessed by each access object is accurately identified. The historical data type of the key data accessed by the access object in the service system within a first historical time period is then acquired and compared with the target data type to detect whether the service system has a data leakage risk; this comparison quickly identifies access objects whose accessed data types are abnormal, and thus access objects that may be leaking data.

In this solution, the access object is accurately identified from the hypertext transfer protocol flows, and whether the access object is leaking data from the service system is detected by comparing its current target data type with its usual historical data type. Compared with simple content identification, this helps to distinguish normal data access from abnormal data leakage effectively, so that whether the service system has a data leakage risk can be detected accurately and false alarms are avoided.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention, nor is it intended to be used to limit the scope of the invention. Other features of the present invention will become apparent from the description that follows.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a method for detecting data leakage according to an embodiment of the present invention;

FIG. 2 is a flow chart of another method for detecting data leakage according to an embodiment of the present invention;

FIG. 3 is a flow chart of yet another method for detecting data leakage according to an embodiment of the present invention;

FIG. 4 is a flow chart of an alternative example of yet another method for detecting data leakage according to an embodiment of the present invention;

FIG. 5 is a block diagram of a data disclosure detection device according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of an electronic device implementing a data disclosure detection method according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. The cases of "target", "original", etc. are similar and will not be described in detail herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In the technical solution of the invention, the acquisition, collection, updating, analysis, processing, use, transmission and storage of users' personal information comply with the relevant laws and regulations, serve lawful purposes, and do not violate public order and good morals. Necessary measures are taken to protect users' personal information, preventing illegal access to it and safeguarding personal information security, network security and national security.

Fig. 1 is a flowchart of a data disclosure detection method according to an embodiment of the present invention. The embodiment is applicable to the situation of data disclosure detection, and particularly applicable to the situation of data disclosure detection of the dimension of the access object. The method may be performed by the data disclosure detecting device provided by the embodiment of the present invention, where the device may be implemented by software and/or hardware, and the device may be integrated on an electronic device, where the electronic device may be various user terminals or servers, and the electronic device may be a device running a service system described below.

Referring to fig. 1, the method of the embodiment of the present invention specifically includes the following steps:

S110, for a service system to be subjected to data disclosure detection, collecting a plurality of hypertext transfer protocol flows of the service system within a target time period, and identifying the plurality of hypertext transfer protocol flows to obtain a plurality of access objects that access the service system within the target time period.

The service system may be understood as a system built to handle the relevant business.

Hypertext transfer protocol (HyperText Transfer Protocol, HTTP) traffic may be understood as traffic exchanged over a network, such as the traffic generated by uploading, browsing or downloading information.

The target time period may be understood as all or part of the period for which data disclosure detection is to be performed; it may be set according to the actual situation and is not specifically limited herein.

An access object may be understood as an object that accesses data in the service system within the target time period.

A plurality of hypertext transfer protocol flows of the service system within the target time period are collected and identified to obtain all of the access objects that accessed the service system within the target time period, so that none is missed. It should be emphasized that these access objects have authorized the electronic device performing the method to collect their hypertext transfer protocol traffic at the service system.

S120, determining target traffic corresponding to the access object in the hypertext transfer protocol traffic for each access object in the plurality of access objects, and identifying the target traffic to obtain the target data type of the key data accessed by the access object in the service system in the target time period.

For each access object in the plurality of access objects, the target traffic corresponding to the access object can be understood as the hypertext transfer protocol traffic generated by the access of the access object in the service system.

Key data may be understood as data in the service system that must not be leaked, such as sensitive data, confidential data or core data. It should be emphasized that these access objects have authorized the electronic device performing the method to identify the key data that they access in the service system.

The target data type may be understood as the type of the key data, such as sensitive data, confidential data or core data. Taking confidential data as an example, the type may be further subdivided into technical confidential data, strategic confidential data, architecture confidential data, and so on.

For each access object of the plurality of access objects, the target flow corresponding to the access object among the hypertext transfer protocol flows is identified to obtain the target data type of the key data accessed by the access object in the service system within the target time period. Each access object is thereby associated with the target data type of the key data it accessed, which makes it easier to find the access object responsible for an information leak.

S130, acquiring a historical data type of key data accessed by the access object in the service system in a first historical time period, comparing the target data type with the historical data type, and detecting whether the service system has a data leakage risk.

The first historical time period may be understood as a past period during which the access object accessed key data in the service system, for example the past 1 month, the past 2 months or the past half year; it may be set according to the actual situation and is not specifically limited herein.

The historical data type may be understood as all of the data types of the key data accessed by the access object within the first historical time period.

The data types of key data that each access object obtains from the service system are limited. For each access object, the historical data type shows which data types of key data the access object frequently accesses or is able to access. Comparing the target data type of the access object with its historical data type therefore shows whether the access object has accessed key data of other data types, from which it can be detected whether the service system has a data leakage risk.
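The comparison in S130 could be sketched as a set difference. This is only an illustration under stated assumptions: the embodiment does not specify how data types are represented, so the string labels and the function names `unseen_types` and `has_leak_risk` are hypothetical.

```python
from typing import Set


def unseen_types(target_types: Set[str], historical_types: Set[str]) -> Set[str]:
    """Return key-data types accessed in the target period but never in the history."""
    return target_types - historical_types


def has_leak_risk(target_types: Set[str], historical_types: Set[str]) -> bool:
    """Flag a risk when at least one accessed type is absent from the history."""
    return bool(unseen_types(target_types, historical_types))


# Hypothetical labels for one access object.
historical = {"sensitive", "technical_confidential"}
target = {"sensitive", "strategic_confidential"}

if has_leak_risk(target, historical):
    print("unexpected types:", sorted(unseen_types(target, historical)))
```

A real deployment would probably weigh how often each type was accessed rather than treating the history as a plain set; the sketch only shows the basic comparison.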

According to the technical solution of this embodiment of the invention, for a service system to be subjected to data disclosure detection, a plurality of hypertext transfer protocol flows of the service system within a target time period are collected and identified to obtain a plurality of access objects that access the service system within the target time period. Because the access objects are accurately identified from the hypertext transfer protocol flows, the subsequent steps can determine whether a given access to key data is normal data access or abnormal data leakage. For each access object, the target flow corresponding to that access object is determined among the hypertext transfer protocol flows and identified to obtain the target data type of the key data accessed by the access object in the service system within the target time period, so that the content accessed by each access object is accurately identified. The historical data type of the key data accessed by the access object in the service system within a first historical time period is then acquired and compared with the target data type to detect whether the service system has a data leakage risk; this comparison accurately identifies access objects whose access to key data is abnormal.

In this solution, the access object is accurately identified from the hypertext transfer protocol flows, and whether the access object is leaking data from the service system is detected by comparing its current target data type with its usual historical data type. Compared with simple content identification, this helps to distinguish normal data access from abnormal data leakage effectively, so that whether the service system has a data leakage risk can be detected accurately and false alarms are avoided.

In an optional technical solution, before collecting the plurality of hypertext transfer protocol flows of the service system within the target time period and identifying them to obtain the plurality of access objects that access the service system within the target time period, the data disclosure detection method further includes: acquiring a pre-constructed key data identification library, in which preset key data of each level and/or each type is stored. Identifying the target flow to obtain the target data type of the key data accessed by the access object in the service system within the target time period then includes: identifying the target flow to obtain all of the data accessed by the access object in the service system within the target time period; and identifying the key data from all of the data based on the key data identification library.

The key data identification library may be understood as a database that consolidates the key data information of the service system, and in particular as a database storing preset key data of each level and/or each type. The levels may include, for example, a very critical level, a more critical level and a less critical level; the types may include, for example, sensitive data, core data and confidential data.

For any one of the access objects, the target flow corresponding to the access object is identified to obtain all of the data accessed by the access object in the service system within the target time period, and all of that data is compared with the data in the key data identification library to identify the key data among it.

Through the above steps, the key data corresponding to each access object can be accurately obtained, so that data disclosure detection can pinpoint the specific object responsible for a leak.
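As a rough illustration of matching accessed data against a key data identification library, the sketch below models the library as regular-expression patterns grouped by data type. The source does not describe the library's internal format, so `KEY_DATA_LIBRARY`, its two patterns and `identify_key_data_types` are all assumptions made for the example.

```python
import re
from typing import Dict, List, Pattern, Set

# Hypothetical library: data type -> patterns that mark data of that type.
KEY_DATA_LIBRARY: Dict[str, List[Pattern[str]]] = {
    "sensitive": [re.compile(r"\b\d{11}\b")],            # an 11-digit, phone-like number
    "confidential": [re.compile(r"\[CONFIDENTIAL\]")],   # an explicit marking in the payload
}


def identify_key_data_types(all_data: List[str]) -> Set[str]:
    """Return the types of key data found among everything an access object fetched."""
    found: Set[str] = set()
    for item in all_data:
        for data_type, patterns in KEY_DATA_LIBRARY.items():
            if any(p.search(item) for p in patterns):
                found.add(data_type)
    return found


payloads = ["contact: 13800000000", "weekly menu", "[CONFIDENTIAL] grid topology"]
print(sorted(identify_key_data_types(payloads)))
```

Levels (very critical, more critical, less critical) could be carried alongside each pattern in the same structure; they are omitted here to keep the sketch short.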

In another optional technical solution, after detecting whether the service system has a data leakage risk, the method further includes: when it is determined from the detection result that the service system has a data leakage risk, raising an alarm for, and/or tracing the path of, the key data that caused the data leakage risk.

Path tracing may be understood as tracing the path and the source of the data leak identified by data disclosure detection.

Raising an alarm and/or tracing the path for the key data that caused the data leakage risk once the risk is detected can effectively prevent the leak from spreading further and reduce losses.
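One way to picture the alarm and path-tracing step is to keep a per-access record and, once a risky data type is known, replay the records that touched it in time order. The `AccessRecord` fields, `trace_path` and `build_alert` below are hypothetical; the source does not say what a traced path contains.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class AccessRecord:
    access_object: str   # who accessed the data
    data_type: str       # type of key data that was accessed
    url: str             # where in the service system it was accessed
    timestamp: float     # when the access happened


def trace_path(records: List[AccessRecord], risky_types: Set[str]) -> List[AccessRecord]:
    """Return the records that touched risky types, ordered by time."""
    hits = [r for r in records if r.data_type in risky_types]
    return sorted(hits, key=lambda r: r.timestamp)


def build_alert(access_object: str, path: List[AccessRecord]) -> str:
    """Format a one-line alarm message that lists the traced path."""
    steps = " -> ".join(r.url for r in path)
    return f"ALERT: {access_object} accessed unexpected key data via {steps}"


records = [
    AccessRecord("alice", "strategic_confidential", "/export", 20.0),
    AccessRecord("alice", "sensitive", "/list", 5.0),
    AccessRecord("alice", "strategic_confidential", "/docs/7", 10.0),
]
path = trace_path(records, {"strategic_confidential"})
print(build_alert("alice", path))
```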

Fig. 2 is a flowchart of another data disclosure detection method provided in an embodiment of the present invention. This embodiment is an optimization based on the technical solutions above. In this embodiment, optionally, identifying the plurality of hypertext transfer protocol flows to obtain the plurality of access objects that access the service system within the target time period may specifically include: for each hypertext transfer protocol flow of the plurality of hypertext transfer protocol flows, identifying the flow to obtain the login interface through which the access object corresponding to the flow logs in to the service system; obtaining, based on the login interface, the session credential used when the access object accesses the service system, and identifying the access object based on the session credential; and obtaining the plurality of access objects that access the service system within the target time period from the access objects respectively corresponding to the hypertext transfer protocol flows. Terms that are the same as or correspond to those in the embodiments above are not explained again here.

Referring to fig. 2, the method of this embodiment may specifically include the following steps:

S210, for a service system to be subjected to data disclosure detection, collecting a plurality of hypertext transfer protocol flows of the service system within a target time period.

S220, for each hypertext transfer protocol flow of the plurality of hypertext transfer protocol flows, identifying the flow to obtain the login interface through which the access object corresponding to the flow logs in to the service system.

The login interface may be understood as an entry point for logging in to and accessing the service system, such as a web page, an application (APP) or an applet.

S230, obtaining, based on the login interface, the session credential used when the access object accesses the service system, and identifying the access object based on the session credential.

A session credential may be understood as a credential that temporarily authorizes an access object to access data.

After the access object logs in to the service system through the login interface, the service system returns a session credential to the login interface, and the access object performs its subsequent operations based on that session credential. The access object can therefore be identified from the session credential.

S240, according to the access objects respectively corresponding to the hypertext transfer protocol flows, obtaining a plurality of access objects for accessing the service system in the target time period.

S250, determining target traffic corresponding to the access object in the hypertext transfer protocol traffic for each access object in the plurality of access objects, and identifying the target traffic to obtain the target data type of the key data accessed by the access object in the service system in the target time period.

S260, acquiring a historical data type of key data accessed by the access object in the service system in the first historical time period, comparing the target data type with the historical data type, and detecting whether the service system has a data leakage risk.

According to this technical solution, the login interface corresponding to each access object is obtained from the hypertext transfer protocol flow corresponding to that access object, the session credential corresponding to each access object is determined based on its login interface, and the access object is identified from the session credential. The access object corresponding to each hypertext transfer protocol flow is thereby accurately identified.
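The link between a login exchange and a session credential could look like the following sketch. It assumes that the credential is a cookie returned in a `Set-Cookie` response header and that the login request carries a user name; the `HttpFlow` record, the cookie name `SESSIONID` and the function `session_to_object` are hypothetical, since the source does not fix the credential format.

```python
from dataclasses import dataclass
from http.cookies import SimpleCookie
from typing import Dict, List, Optional

SESSION_COOKIE = "SESSIONID"  # hypothetical cookie name


@dataclass
class HttpFlow:
    url: str
    username: Optional[str] = None    # present on login requests (assumed)
    set_cookie: Optional[str] = None  # Set-Cookie response header, if any


def session_to_object(login_flows: List[HttpFlow]) -> Dict[str, str]:
    """Map each session credential issued at login to the access object that logged in."""
    mapping: Dict[str, str] = {}
    for flow in login_flows:
        if not (flow.username and flow.set_cookie):
            continue  # not a completed login exchange
        jar = SimpleCookie()
        jar.load(flow.set_cookie)
        if SESSION_COOKIE in jar:
            mapping[jar[SESSION_COOKIE].value] = flow.username
    return mapping


flows = [
    HttpFlow("/login", username="alice", set_cookie="SESSIONID=abc123; Path=/"),
    HttpFlow("/login", username="bob", set_cookie="SESSIONID=def456; Path=/"),
    HttpFlow("/reports"),
]
print(session_to_object(flows))
```

Token-based credentials (for example a bearer token in a response body) would need a different extraction step, but the resulting credential-to-object mapping would have the same shape.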

In an optional technical solution, identifying the hypertext transfer protocol flow to obtain the login interface through which the access object corresponding to the flow logs in to the service system includes: identifying the hypertext transfer protocol flow to obtain the target login feature of the corresponding access object with respect to the service system; and acquiring a pre-constructed login feature library, determining, among the candidate login features stored in the login feature library, the matching login feature that matches the target login feature, and taking the login interface corresponding to the matching login feature as the login interface through which the access object logs in to the service system.

The target login feature may be understood as a feature involved when the access object logs in to the service system, for example a uniform resource locator (Uniform Resource Locator, URL) feature or an object name feature.

The login feature library may be understood as a database storing candidate login features, each of which corresponds to a login interface.

In practice, the hypertext transfer protocol flow may be parsed to extract its URL feature and object name feature, and these features may then be matched against the pre-constructed login feature library to obtain the login interface corresponding to the flow.

The login interface corresponding to each access object can be accurately obtained through feature matching.
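A minimal sketch of the feature matching follows, assuming that the URL feature is the pair of host and path and that the login feature library is a plain dictionary. The library entries, the example hosts and the function names are hypothetical, and the object name feature is left out for brevity.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical login feature library: (host, path) -> login interface.
LOGIN_FEATURE_LIBRARY: Dict[Tuple[str, str], str] = {
    ("portal.example.com", "/login"): "web portal login",
    ("api.example.com", "/v1/auth/token"): "mobile app login",
}


def extract_login_feature(url: str) -> Tuple[str, str]:
    """Reduce a request URL to a (host, path) feature, ignoring query and trailing slash."""
    parts = urlsplit(url)
    return (parts.hostname or "", parts.path.rstrip("/") or "/")


def match_login_interface(url: str) -> Optional[str]:
    """Return the login interface whose candidate feature matches the URL, if any."""
    return LOGIN_FEATURE_LIBRARY.get(extract_login_feature(url))


print(match_login_interface("https://portal.example.com/login/?next=%2Fhome"))
print(match_login_interface("https://portal.example.com/reports"))
```

Exact lookup keeps the example short; a production library would more likely hold patterns or prefixes so that versioned or parameterized login URLs still match.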

In another optional technical solution, determining the target flow corresponding to the access object among the plurality of hypertext transfer protocol flows includes: acquiring the session credential corresponding to the access object, and determining, among the plurality of hypertext transfer protocol flows, the credential traffic corresponding to the session credential; and taking the credential traffic as the target flow corresponding to the access object.

Credential traffic may be understood as the traffic generated when an access object accesses data using its session credential.

The credential traffic corresponding to the session credential is determined among the plurality of hypertext transfer protocol flows, and this credential traffic is the target flow of the access object that the session credential represents. This ensures that the target flow is determined accurately and efficiently.
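Selecting credential traffic can be illustrated as a filter over captured requests. The sketch assumes that each flow is a dictionary holding the raw `Cookie` request header under the key `cookie` and that the session credential travels in a cookie named `SESSIONID`; both are assumptions for the example, not details given in the source.

```python
from http.cookies import SimpleCookie
from typing import Dict, List


def credential_traffic(
    flows: List[Dict[str, str]],
    session_id: str,
    cookie_name: str = "SESSIONID",  # hypothetical cookie name
) -> List[Dict[str, str]]:
    """Keep only the flows whose Cookie header carries the given session credential."""
    selected = []
    for flow in flows:
        jar = SimpleCookie()
        jar.load(flow.get("cookie", ""))
        if cookie_name in jar and jar[cookie_name].value == session_id:
            selected.append(flow)
    return selected


captured = [
    {"url": "/docs/7", "cookie": "SESSIONID=abc123; theme=dark"},
    {"url": "/docs/9", "cookie": "SESSIONID=zzz999"},
    {"url": "/health"},
]
target_flows = credential_traffic(captured, "abc123")
print([f["url"] for f in target_flows])
```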

Fig. 3 is a flowchart of yet another data disclosure detection method according to an embodiment of the present invention. This embodiment is an optimization based on the technical solutions above. In this embodiment, optionally, the data disclosure detection method further includes acquiring the key data type of the key data contained in the service system within a second historical time period; and comparing the target data type with the historical data type to detect whether the service system has a data leakage risk may specifically include: comparing the target data type with the historical data type and comparing the target data type with the key data type, respectively, to detect whether the service system has a data leakage risk. Terms that are the same as or correspond to those in the embodiments above are not explained again here.

Referring to fig. 3, the method of this embodiment may specifically include the following steps:

S310, for a service system to be subjected to data disclosure detection, collecting a plurality of hypertext transfer protocol flows of the service system within a target time period, and identifying the plurality of hypertext transfer protocol flows to obtain a plurality of access objects that access the service system within the target time period.

S320, determining target traffic corresponding to the access object in the hypertext transfer protocol traffic for each access object in the plurality of access objects, and identifying the target traffic to obtain the target data type of the key data accessed by the access object in the service system in the target time period.

S330, acquiring the historical data types of the key data accessed by the access object in the business system in the first historical time period, and acquiring the key data types of the key data contained in the business system in the second historical time period.

The second historical time period may be understood as a past period over which the key data contained in the service system is considered, for example the past 1 month, the past 2 months or the past half year. The first historical time period and the second historical time period may be the same or different; they may be set according to the actual situation and are not specifically limited herein.

The key data type may be understood as the data type of the key data, such as sensitive data, confidential data or core data. Taking confidential data as an example, it may be further subdivided into technical confidential data, strategic confidential data, architecture confidential data, and so on.

S340, comparing the target data type with the historical data type and comparing the target data type with the key data type respectively, and detecting whether the service system has a data disclosure risk.

If the target data type accessed by the access object differs from the historical data type accessed by the access object within the first historical time period, or falls outside the range of the key data types of the service system within the second historical time period, a data leakage risk is indicated.

According to this technical solution, whether the access object is leaking data is judged by comparing the target data type accessed by the access object with the historical data type it accessed within the first historical time period, and whether the access object has exceeded its authority is judged by checking whether the target data type falls within the range of the key data types of the service system within the second historical time period. The accuracy of data disclosure detection is thereby further improved.
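The two comparisons of S340 can be sketched as two independent set checks whose findings are reported together. The representation of data types as strings and the function name `detect_risk` are assumptions; the source only states that either mismatch indicates a risk.

```python
from typing import List, Set


def detect_risk(
    target: Set[str],
    historical: Set[str],
    system_key_types: Set[str],
) -> List[str]:
    """Return one finding per failed comparison; an empty list means no risk was found."""
    findings: List[str] = []

    # Comparison 1: types the access object has not accessed in the first historical period.
    deviating = target - historical
    if deviating:
        findings.append(f"deviates from history: {sorted(deviating)}")

    # Comparison 2: types outside what the service system held in the second historical period.
    out_of_scope = target - system_key_types
    if out_of_scope:
        findings.append(f"outside system key-data types: {sorted(out_of_scope)}")

    return findings


for finding in detect_risk(
    target={"sensitive", "architecture_confidential"},
    historical={"sensitive"},
    system_key_types={"sensitive", "technical_confidential"},
):
    print(finding)
```

Keeping the findings separate preserves the distinction the embodiment draws between leak-like behavior (first check) and access beyond authority (second check).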

On this basis, optionally, the data disclosure detection method further comprises:

acquiring a target data flow direction of key data accessed by an access object in a service system in a target time period, and acquiring a key data flow direction of the key data contained in the service system in a second historical time period;

the comparing of the target data type with the historical data type and with the key data type respectively, and the detecting of whether the service system has a data disclosure risk, comprise: comparing the target data type with the historical data type, the target data type with the key data type, and the target data flow direction with the key data flow direction respectively, and detecting whether the service system has a data disclosure risk according to the obtained comparison results.

The target data flow direction may be understood as the path along which target data is transferred or copied from a source to a destination.

By comparing the target data flow direction of the access object with the key data flow direction of the service system in the second historical time period, whether the target data flow direction of the access object is abnormal can be judged, and the accuracy of data disclosure detection is further improved.
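
As an illustrative sketch only, the three comparisons can be combined so that each deviation yields a reason for the detection result. Encoding a flow direction as a `(source, destination)` pair and the function name `compare_all` are assumptions made for this example, not part of the embodiment.

```python
from typing import Iterable, List, Set, Tuple

Flow = Tuple[str, str]  # (source region, destination region); a hypothetical encoding


def compare_all(
    target_types: Iterable[str],
    historical_types: Iterable[str],
    key_types: Iterable[str],
    target_flows: Iterable[Flow],
    key_flows: Iterable[Flow],
) -> List[str]:
    """Return the reasons (possibly none) for flagging a data disclosure risk."""
    target: Set[str] = set(target_types)
    reasons: List[str] = []
    if target - set(historical_types):
        reasons.append("type_not_in_object_history")
    if target - set(key_types):
        reasons.append("type_outside_system_key_types")
    if set(target_flows) - set(key_flows):
        reasons.append("flow_direction_not_seen_for_system")
    return reasons
```

An empty list means that none of the three comparisons found a deviation; any non-empty list corresponds to a detected data disclosure risk.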

In order to better understand the above-described respective technical solutions, an exemplary description is made below in connection with specific examples. As shown in fig. 4, exemplary steps are as follows:

Step 1: Hypertext transfer protocol traffic in the service system is collected.

Step 2: The hypertext transfer protocol traffic is parsed to extract URL features and object name features. The extracted URL features and object name features are then matched against a pre-constructed feature library to identify the login interface corresponding to the hypertext transfer protocol traffic, and a session credential is acquired based on the login interface. Based on the session credential, the target traffic corresponding to the access object represented by the session credential is found among all collected hypertext transfer protocol traffic, the key data accessed by the access object is parsed from the target traffic, and the target data type of the key data is determined.
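
The credential-based attribution in Step 2 can be sketched as follows, purely for illustration. Each captured flow is assumed to be a dictionary with a `path`, an optional `issued_session` (the credential returned by a login interface), and an optional `session` (the credential carried by a later request); these field names and the function `target_traffic_by_credential` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def target_traffic_by_credential(
    flows: Iterable[dict], login_paths: Set[str]
) -> Dict[str, List[dict]]:
    """Group captured flows by the session credential issued at a login interface."""
    flows = list(flows)
    # Pass 1: collect the session credentials issued by recognised login interfaces.
    issued = {
        f["issued_session"]
        for f in flows
        if f["path"] in login_paths and f.get("issued_session")
    }
    # Pass 2: every flow carrying an issued credential is target traffic
    # of the access object represented by that credential.
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for f in flows:
        sid = f.get("session")
        if sid in issued:
            grouped[sid].append(f)
    return dict(grouped)
```

Flows that carry no credential, or a credential that was never observed at a login interface, are left unattributed in this sketch.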

Step 3: Based on the key data identification library, the various forms of key data that may appear in the hypertext transfer protocol traffic, such as plaintext, encoded data, compressed files, documents, and pictures, are identified. Specifically, the compressed files, encoded data, documents, pictures, and other data in the hypertext transfer protocol traffic may be parsed by a file parsing module and an optical character recognition (OCR) module built into the service system. The parsed data is then checked against the key data identification library to identify the key data contained in the hypertext transfer protocol traffic.
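
A minimal sketch of the identification in Step 3 is given below, limited to plaintext and Base64-encoded data; file parsing and OCR are omitted. The two regular expressions stand in for a key data identification library and are assumptions chosen only for this example, as is the function name `identify_key_types`.

```python
import base64
import re
from typing import Set

# Hypothetical key data identification library: key data type -> pattern.
KEY_DATA_LIBRARY = {
    "phone": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "id_card": re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),
}

# Candidate Base64 tokens: long runs of Base64 characters with optional padding.
BASE64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def _scan(text: str) -> Set[str]:
    """Return the key data types whose pattern occurs in the text."""
    return {name for name, pattern in KEY_DATA_LIBRARY.items() if pattern.search(text)}


def identify_key_types(payload: str) -> Set[str]:
    """Scan the plaintext and any decodable Base64 tokens in a payload."""
    found = _scan(payload)
    for token in BASE64_TOKEN.findall(payload):
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except ValueError:
            # Not valid Base64, or not UTF-8 text after decoding; skip the token.
            continue
        found |= _scan(decoded)
    return found
```

Encoded payloads are decoded before matching so that key data does not escape detection merely because it was transmitted in encoded form.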

Step 4: Based on the key data identification results, the association between each access object in the service system and the key data, and the association between the service system and the key data, are established.

Association of access objects with key data in the service system: the key data types of the key data acquired by each access object in the service system are limited, so a corresponding association table can be formed after statistics are collected over a period of time.

Association of the service system with key data: the key data types and key data flow directions of the key data contained in the service system are stable when there is no service change, so the key data types and key data flow directions (for example cross-border or cross-province) associated with the service system can be determined after statistics are collected over a period of time.
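
For illustration, the two associations of Step 4 can be accumulated from identified observations as below. The observation tuple `(access_object, data_type, flow_direction)` and the function name `build_associations` are assumptions of this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Observation = Tuple[str, str, str]  # (access object, key data type, flow direction)


def build_associations(
    observations: Iterable[Observation],
) -> Tuple[Dict[str, Set[str]], Set[str], Set[str]]:
    """Build the per-object association table and the system-level associations."""
    object_types: Dict[str, Set[str]] = defaultdict(set)
    system_types: Set[str] = set()
    system_flows: Set[str] = set()
    for access_object, data_type, flow_direction in observations:
        object_types[access_object].add(data_type)  # access object <-> key data
        system_types.add(data_type)                 # service system <-> key data types
        system_flows.add(flow_direction)            # service system <-> flow directions
    return dict(object_types), system_types, system_flows
```

The per-object table corresponds to the historical data types of the first historical time period, while the system-level sets correspond to the key data types and key data flow directions of the second historical time period.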

Step 5: Whether the key data has been leaked is judged according to the policy.

Policy: an alarm is generated when the data transfer behavior of an access object deviates from the historical access behavior of that access object, or from the key data types and key data flow directions associated with the service system.

Step 6: The source path of the key data that caused the data disclosure risk is traced; the policies and rule libraries are optimized to reduce the false alarm rate.

This specific example improves the ability to detect key data leakage risk in the service system by accurately identifying access objects and applying the security policy, thereby strengthening the security of data assets.

Fig. 5 is a block diagram of a data disclosure detection device according to an embodiment of the present invention, where the device is configured to execute the data disclosure detection method according to any of the foregoing embodiments. The device and the data disclosure detection method of each embodiment belong to the same inventive concept; for details not described in the embodiment of the data disclosure detection device, reference may be made to the embodiments of the data disclosure detection method. Referring to fig. 5, the device may specifically include: an access object obtaining module 410, a target data type obtaining module 420, and a data disclosure risk detection module 430.

The access object obtaining module 410 is configured to collect a plurality of hypertext transfer protocol flows of the service system in a target time period for a service system to be subjected to data disclosure detection, and identify the plurality of hypertext transfer protocol flows to obtain a plurality of access objects for accessing the service system in the target time period;

a target data type obtaining module 420, configured to determine, for each access object in the plurality of access objects, a target flow corresponding to the access object in the plurality of hypertext transfer protocol flows, and identify the target flow, to obtain a target data type of key data accessed by the access object in the service system in a target time period;

The data disclosure risk detection module 430 is configured to obtain a historical data type of key data accessed by the access object in the service system in the first historical period, compare the target data type with the historical data type, and detect whether the service system has a data disclosure risk.

Optionally, the access object obtaining module further includes:

the login interface obtaining sub-module is used for identifying, for each hypertext transfer protocol flow in the plurality of hypertext transfer protocol flows, the hypertext transfer protocol flow to obtain the login interface through which the access object corresponding to the hypertext transfer protocol flow logs in to the service system;

the access object identification sub-module is used for obtaining a session credential when the access object accesses the service system based on the login interface and identifying the access object based on the session credential;

and the access object obtaining sub-module is used for obtaining a plurality of access objects for accessing the service system in the target time period according to the access objects respectively corresponding to the hypertext transfer protocol flows.

On this basis, optionally, the login interface obtaining sub-module comprises:

the target login feature obtaining unit is used for identifying the hypertext transfer protocol flow and obtaining the target login feature of the access object corresponding to the hypertext transfer protocol flow for the service system;

the login interface determining unit is used for acquiring a pre-constructed login feature library, determining, from the candidate login features stored in the login feature library, a matched login feature that matches the target login feature, and taking the login interface corresponding to the matched login feature as the login interface through which the access object logs in to the service system.
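
The matching performed by the login interface determining unit can be sketched as follows. This is a non-limiting example in which a login feature is assumed to consist of a URL pattern and a set of required object (field) names; the library contents and the function name `match_login_interface` are hypothetical.

```python
import re
from typing import Iterable, List, Optional

# Hypothetical pre-constructed login feature library.
LOGIN_FEATURE_LIBRARY: List[dict] = [
    {
        "interface": "web_form_login",
        "url": re.compile(r"/(login|signin)$"),
        "fields": {"username", "password"},
    },
    {
        "interface": "token_login",
        "url": re.compile(r"/oauth/token$"),
        "fields": {"client_id", "client_secret"},
    },
]


def match_login_interface(
    url_path: str,
    field_names: Iterable[str],
    library: Optional[List[dict]] = None,
) -> Optional[str]:
    """Return the login interface whose candidate feature matches the target feature."""
    candidates = LOGIN_FEATURE_LIBRARY if library is None else library
    fields = set(field_names)
    for candidate in candidates:
        # A candidate matches when its URL pattern is found in the path and
        # all of its required object names are present in the request.
        if candidate["url"].search(url_path) and candidate["fields"] <= fields:
            return candidate["interface"]
    return None
```

A return value of `None` indicates that the flow does not correspond to any known login interface and is therefore not used to obtain a session credential.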

Optionally, the target data type obtaining module further includes:

the credential traffic determining sub-module is used for acquiring a session credential corresponding to the access object and determining the credential traffic corresponding to the session credential from the plurality of hypertext transfer protocol flows;

and the target flow determination submodule is used for taking the credential flow as the target flow corresponding to the access object.

Optionally, the data disclosure detection device further includes:

the key data type acquisition module is used for acquiring key data types of key data contained in the service system in the second historical time period;

the data disclosure risk detection module further includes:

and the data leakage risk detection sub-module is used for comparing the target data type with the historical data type and comparing the target data type with the key data type respectively and detecting whether the service system has data leakage risk or not.

On this basis, optionally, the data disclosure detection device further includes:

the data flow direction obtaining module is used for obtaining a target data flow direction of the key data accessed by the access object in the service system in the target time period and obtaining a key data flow direction of the key data contained in the service system in the second historical time period;

the data disclosure risk detection sub-module further includes:

the data leakage risk detection unit is used for comparing the target data type with the historical data type, the target data type with the key data type and the target data flow direction with the key data flow direction respectively, and detecting whether the service system has data leakage risk according to the obtained comparison result.

Optionally, the data disclosure detection device further includes:

the key data identification library construction module is used for acquiring a pre-constructed key data identification library, wherein the key data identification library stores preset key data of each level and/or each type;

the target data type obtaining module further includes:

the data obtaining sub-module is used for identifying the target flow to obtain all data accessed by the access object in the service system in the target time period;

and the key data identification sub-module is used for identifying key data from all data based on the key data identification library.

Optionally, the data disclosure detection device further includes:

the tracing path tracking module is used for carrying out alarm prompt and/or tracing the key data causing the data disclosure risk under the condition that the data disclosure risk exists in the service system according to the obtained detection result.

According to the data disclosure detection device provided by the embodiment of the invention, the access object obtaining module collects, for a service system to be subjected to data disclosure detection, a plurality of hypertext transfer protocol flows of the service system in a target time period, and identifies the plurality of hypertext transfer protocol flows to obtain a plurality of access objects accessing the service system in the target time period, so that the access objects of the data are comprehensively identified, which facilitates accurate tracking of the leaking object. The target data type obtaining module determines, for each access object in the plurality of access objects, the target flow corresponding to the access object in the plurality of hypertext transfer protocol flows, and identifies the target flow to obtain the target data type of the key data accessed by the access object in the service system in the target time period, thereby accurately identifying the content accessed by each access object. The data disclosure risk detection module acquires the historical data type of the key data accessed by the access object in the service system in the first historical time period, compares the target data type with the historical data type, and detects whether the service system has a data disclosure risk; through the comparison with the historical data type, access objects whose accessed data types are abnormal can be identified quickly, and access objects leaking data can thus be detected quickly. The data disclosure detection device provided by the embodiment of the invention can effectively distinguish normal data access from abnormal data disclosure, and can therefore accurately detect whether the service system has a data disclosure risk while avoiding false alarms.

It should be noted that, in the embodiment of the data disclosure detecting device, each unit and module included are only divided according to the functional logic, but not limited to the above division, so long as the corresponding function can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.

Fig. 6 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.

As shown in fig. 6, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM12 and the RAM13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.

Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the data disclosure detection method.

In some embodiments, the data disclosure detection method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM12 and/or the communication unit 19. When the computer program is loaded into RAM13 and executed by processor 11, one or more steps of the data leakage detection method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the data disclosure detection method in any other suitable manner (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.

The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, which is not limited herein.

The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims

1. The data disclosure detection method is characterized by comprising the following steps:

aiming at a service system to be subjected to data disclosure detection, collecting a plurality of hypertext transfer protocol flows of the service system in a target time period, and identifying the plurality of hypertext transfer protocol flows to obtain a plurality of access objects for accessing the service system in the target time period;

for each access object in the plurality of access objects, determining a target flow corresponding to the access object in the plurality of hypertext transfer protocol flows, and identifying the target flow to obtain a target data type of key data accessed by the access object in the service system in the target time period;

Acquiring a historical data type of key data accessed by the access object in the service system in a first historical time period, comparing the target data type with the historical data type, and detecting whether the service system has a data leakage risk or not.

2. The method of claim 1, wherein said identifying said plurality of hypertext transfer protocol traffic to obtain a plurality of access objects for accessing said business system within said target time period comprises:

identifying the hypertext transfer protocol traffic for each of the plurality of hypertext transfer protocol traffic to obtain a login interface for logging in the business system by an access object corresponding to the hypertext transfer protocol traffic;

based on the login interface, obtaining a session credential when the access object accesses the service system, and identifying the access object based on the session credential;

and obtaining a plurality of access objects for accessing the service system in the target time period according to the access objects respectively corresponding to the hypertext transfer protocol flows.

3. The method according to claim 2, wherein the identifying the hypertext transfer protocol traffic to obtain a login interface for logging in the service system by an access object corresponding to the hypertext transfer protocol traffic includes:

Identifying the hypertext transfer protocol flow to obtain a target login characteristic of an access object corresponding to the hypertext transfer protocol flow for the service system;

and acquiring a pre-constructed login feature library, determining a matched login feature matched with the target login feature from all candidate login features stored in the login feature library, and taking a login interface corresponding to the matched login feature as the login interface of the access object to log in the service system.

4. The method of claim 2, wherein determining a target traffic of the plurality of hypertext transfer protocol traffic corresponding to the access object comprises:

acquiring a session credential corresponding to the access object, and determining credential traffic corresponding to the session credential from the plurality of hypertext transfer protocol traffic;

and taking the credential traffic as the target traffic corresponding to the access object.

5. The method as recited in claim 1, further comprising:

acquiring key data types of key data contained in the service system in a second historical time period;

the comparing the target data type with the historical data type, and detecting whether the business system has a data disclosure risk includes:

comparing the target data type with the historical data type and the target data type with the key data type respectively, and detecting whether the business system has a data disclosure risk or not.

6. The method as recited in claim 5, further comprising:

acquiring a target data flow direction of key data accessed by the access object in the service system in the target time period, and acquiring a key data flow direction of key data contained in the service system in the second historical time period;

the comparing the target data type with the historical data type and the target data type with the key data type respectively, and detecting whether the service system has a data disclosure risk includes:

and comparing the target data type with the historical data type, the target data type with the key data type and the target data flow direction with the key data flow direction respectively, and detecting whether the service system has a data leakage risk according to the obtained comparison result.

7. The method as recited in claim 1, further comprising:

Acquiring a pre-constructed key data identification library, wherein the key data identification library stores preset key data of each level and/or each type;

the identifying the target flow to obtain the target data type of the key data accessed by the access object in the service system in the target time period comprises the following steps:

identifying the target flow to obtain all data accessed by the access object in the service system in the target time period;

and identifying key data from all the data based on the key data identification library.

8. The method of claim 1, further comprising, after said detecting whether the business system is at risk of data disclosure:

and under the condition that the business system is determined to have the data leakage risk according to the obtained detection result, carrying out alarm prompt and/or tracing path tracking on key data which causes the data leakage risk.

9. A data disclosure detection apparatus, comprising:

the access object obtaining module is used for collecting a plurality of hypertext transfer protocol flows of the service system in a target time period aiming at the service system to be subjected to data disclosure detection, and identifying the plurality of hypertext transfer protocol flows to obtain a plurality of access objects for accessing the service system in the target time period;

A target data type obtaining module, configured to determine, for each access object in the plurality of access objects, a target flow corresponding to the access object in the plurality of hypertext transfer protocol flows, and identify the target flow, to obtain a target data type of key data accessed by the access object in the service system in the target time period;

the data disclosure risk detection module is used for acquiring a historical data type of key data accessed by the access object in the service system in a first historical time period, comparing the target data type with the historical data type, and detecting whether the service system has a data disclosure risk.

10. A computer readable storage medium storing computer instructions which, when executed, cause a processor to perform the data disclosure detection method according to any one of claims 1 to 8.