Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; "plurality" generally means at least two.
It should be understood that although the terms first, second, third, etc. may be used to describe … … in embodiments of the present invention, these … … should not be limited to these terms. These terms are only used to distinguish … …. For example, the first … … may also be referred to as the second … …, and similarly the second … … may also be referred to as the first … …, without departing from the scope of embodiments of the present invention.
It should be understood that the term "and/or" as used herein merely describes an association between the associated objects and covers three relationships; e.g., A and/or B may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the preceding and following associated objects are in an "or" relationship.
The word "if", as used herein, may be interpreted as "at … …" or "at … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrase "if determined" or "if detected (stated condition or event)" may be interpreted as "when determined" or "in response to determination" or "when detected (stated condition or event)" or "in response to detection (stated condition or event)", depending on the context.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such product or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a commodity or device comprising such element.
The dynamic privacy data encryption method provided by the invention addresses two technical problems that are increasingly important in the commercial field: data security and privacy protection. Sensitive data is automatically identified and protected, and data privacy and security are enhanced, by combining automatic identification of sensitive data, data partitioning, association with multi-party computing (MPC) tasks, dynamic incremental encryption, and related steps.
As shown in fig. 1, the present invention discloses a dynamic privacy data encryption method, which comprises:
Step S1: automatically identify and partition the sensitive data.
In the step S1, sensitive data needs to be automatically analyzed and identified, and partitioned according to the sensitivity level. The data of each partition is partitioned into blocks according to its association with a particular MPC (multi-party computing) task, forming a plurality of blocks per packet. The same data block organization structure is applied to each distributed storage node, so that the unified management and storage of the data block structure are facilitated, and data packets with different versions can be stored in each data block.
Step S2: update the data block and associate the data block with the MPC task.
In step S2, when a certain data block is ready to be updated, it is associated with the relevant MPC task. When the data block changes, all participants of the related MPC task are notified, and an incremental encryption mechanism is activated.
Step S3: implement a dynamic incremental encryption mechanism.
In step S3, the relevant MPC participant turns on the incremental encryption function. Incremental encryption is performed on the changed data based on the data block size, the current task processing condition, and the resource availability condition, wherein the incremental encryption is based on a dynamic authority key. After the incremental encryption is completed, the MPC participant distributes the incrementally encrypted data to all storage nodes.
Step S4: incremental decryption and application of the result.
In the step S4, the storage node applies the incrementally encrypted data to the relevant data block, while maintaining the encryption status of the unchanged data.
The invention automatically identifies and partitions the sensitive data, which helps reduce human error and improves the efficiency and accuracy of sensitive data protection. By associating data blocks with MPC tasks and implementing dynamic incremental encryption, the invention enhances data security, since only authorized participants can access and process the data. In addition, incremental encryption processes only the changed data blocks, thereby reducing the amount of calculation and improving the flexibility and efficiency of data processing.
To achieve automatic identification and marking of sensitive data, the data is partitioned appropriately according to its sensitivity level. Data is first collected and preprocessed before automatic identification begins. Preprocessing may include cleansing the data (removing duplicate entries, correcting errors, filling in missing values), format normalization (date format unification), text normalization, etc.
Sensitivity evaluation is then performed on the preprocessed data: keyword features are extracted, and keyword matching is performed with regular expressions to identify sensitive data in the business data.
Regular-expression matching is a rule-based approach that determines the sensitivity level of the preprocessed data based on predefined sensitive data identifiers and business rules.
Sensitive data includes, but is not limited to, personal identification information (e.g., identification card number, cell phone number), financial information (e.g., bank card number), personal privacy (e.g., address, mailbox address), etc. The regular expressions cover the identification card number (18 digits, the last of which may be a digit or the letter X), the mobile phone number (a current mobile phone number in China has 11 digits and starts with 13, 14, 15, 16, 17, 18 or 19), the bank card number (usually 16 to 19 digits), the mailbox address (simple mailbox address matching), and the address (matching Chinese characters and common address components). As rules and standards are updated, the regular expressions need to be updated regularly to accommodate new formats.
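The matching rules listed above can be sketched as follows. This is a minimal illustration, not the patterns of the invention: the dictionary `PATTERNS` and the function `find_sensitive` are hypothetical names, the address rule is omitted, and no checksum validation is performed. An 18-digit identification number consisting only of digits also satisfies the bank card rule, so a practical system would need further disambiguation.

```python
import re

# Illustrative patterns derived from the formats described in the text.
# The lookarounds keep a pattern from matching inside a longer digit run.
PATTERNS = {
    "id_card": re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),    # 18 characters, last may be X
    "mobile": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),      # 11 digits, starts with 13-19
    "bank_card": re.compile(r"(?<!\d)\d{16,19}(?!\d)"),     # 16 to 19 digits
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def find_sensitive(text):
    """Return {category: [matches]} for every pattern that matches the text."""
    hits = {}
    for name, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits
```

Each category found in a record can then be used to assign the record to a sensitivity level.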
The data is partitioned into different categories according to the level of sensitive data. Partitioning is based on the sensitivity level of the data (e.g., levels 1-5), with lower levels of data being less sensitive and higher levels of data being more sensitive.
The sensitivity level is optionally determined according to the following rule:
Sensitivity level 1: e.g., publicly published information.
Sensitivity level 2: e.g., a customer's personal mailbox address.
Sensitivity level 3: e.g., customer address information.
Sensitivity level 4: e.g., a customer's cell phone number or identification card number.
Sensitivity level 5: e.g., a bank card number and the corresponding bank account transaction information.
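The level rule above amounts to a lookup from data category to level followed by grouping. A minimal sketch, assuming hypothetical category names (`"email"`, `"mobile"`, and so on) and a hypothetical function `partition_by_level`:

```python
# Category-to-level table following the optional rule in the text.
# The category names themselves are illustrative assumptions.
SENSITIVITY_LEVEL = {
    "public": 1,
    "email": 2,
    "address": 3,
    "mobile": 4,
    "id_card": 4,
    "bank_card": 5,
    "bank_transaction": 5,
}


def partition_by_level(records):
    """Group (category, value) records into partitions keyed by sensitivity level."""
    partitions = {}
    for category, value in records:
        level = SENSITIVITY_LEVEL[category]
        partitions.setdefault(level, []).append(value)
    return partitions
```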
Once the data partitioning is complete, each data partition is divided into a plurality of data blocks, and each data block needs to be associated with a particular MPC task.
The MPC task performs encryption processing on the data of the data block to ensure proper encryption strength.
The data blocks are associated with corresponding MPC tasks according to their sensitivity level, so that the MPC task uses the encryption protocol and algorithm corresponding to the sensitivity level of each data block. The organization of the data blocks in a partition is determined based on the computational complexity of the encryption algorithm, the network bandwidth, the processor capacity and other factors.
The data blocks are sized with the computational complexity of the encryption algorithm in mind: a smaller data block indicates higher data sensitivity, and an encryption algorithm of higher computational complexity needs to be applied to it. Conversely, a larger data block indicates lower data sensitivity, and an encryption algorithm of low computational complexity may be applied to reduce the total number of encryption operations.
Specifically, the following parameters are initialized in order to calculate the data block size:
C: computational complexity of the encryption algorithm, expressed as the number of CPU cycles required per bit.
P: processing capacity of the processor, expressed in bits per second (bps).
B: network bandwidth, expressed as the number of bits that can be transmitted per second (bps).
T_max: maximum allowable delay of an encryption operation (seconds).
S_sec: security parameter, a data block size coefficient determined according to security requirements.
O: algorithm overhead, expressed as a constant time (seconds) independent of the data, which may be determined from the algorithm initialization time.
Then, the data block size N1 is calculated based on the maximum delay; that is, the size of the data block is determined from the maximum delay time of each encryption operation using the following formula:
N1=(T_max-O)*P/C,
wherein,
(T_max-O) represents the time actually available for data encryption, and P/C represents the amount of data that can be processed per unit time, which bounds the data block size by the processing capacity of the CPU.
Meanwhile, the data block size N2 is calculated based on the network bandwidth; that is, N2 is determined from the amount of data the network can transmit within T_max using the following formula: N2=B*T_max. In this case, N2 should be less than or equal to the maximum amount of data that the network can transmit within T_max.
In addition, the size of the data block is adjusted to different security requirements by the security parameter S_sec. For data requiring higher security, security is increased by reducing the size of the data block, expressed as: N_adj=N1*S_sec.
If S_sec is less than 1, the size of the data block needs to be reduced to improve security; if S_sec is greater than 1, the size of the data block may be increased.
Finally, the size of the data block is determined by jointly considering processing capacity, network bandwidth, security requirements and similar factors. The final data block size N_final is the minimum of the values calculated above: N_final=min(N1, N_adj, B*T_max).
N_final determines the organization of the data blocks in the corresponding partition. Because S_sec differs between partitions, and the CPU processing capacity P allocated to each partition and the complexity of the encryption algorithm used may also differ, the organization of the data blocks differs between partitions.
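The sizing rule can be illustrated with a short sketch. The function name `block_size` is hypothetical, and the sketch assumes that the size adjusted by S_sec is the delay-bounded size N1, which is how the three terms of the minimum are read here.

```python
def block_size(c, p, b, t_max, o, s_sec):
    """Return N_final = min(N1, N_adj, N2) for one partition.

    c: cycles per bit, p: processing capacity, b: network bandwidth,
    t_max: maximum encryption delay (s), o: constant overhead (s),
    s_sec: security parameter.
    """
    if t_max <= o:
        raise ValueError("t_max must exceed the constant overhead o")
    n1 = (t_max - o) * p / c   # bound from the maximum allowed delay
    n2 = b * t_max             # bound from the network bandwidth
    n_adj = n1 * s_sec         # security adjustment (assumed to apply to N1)
    return min(n1, n_adj, n2)
```

With c=2, p=1000, b=300, t_max=1.0, o=0.5 and s_sec=0.5, the three candidates are 250, 300 and 125, so the security-adjusted value determines the block size.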
S_sec may be determined in either of the following two modes:
Mode one uses a mapping from sensitivity levels to security weights.
Specifically, each level is assigned a security weight W_sec that scales with the sensitivity level of the data.
Level 1: W_sec=1.0 (default weight; no additional security measures are required)
Level 2: W_sec=0.8
Level 3: W_sec=0.6
Level 4: W_sec=0.4
Level 5: W_sec=0.2 (highest sensitivity; the smallest data block is required)
Here, a W_sec value less than 1 indicates that, as the sensitivity level increases, the size of the data block needs to be reduced to improve security.
If the data is partitioned based only on data attributes, one partition may contain data contents with several different attributes, whereas, when the data is divided into blocks, each data block contains only data contents with the same attribute. For example, if the level 1 partition holds information such as bank card numbers, transaction records, total transaction amounts and transaction times, but a specific data block stores only bank card numbers, the security weights W_sec of different data blocks in the same data partition may be determined according to mode two, so that the security weights are allocated more flexibly.
The following is the formula for calculating S_sec in mode two:
S_sec=W_sec*F_legal*F_threat*F_env,
wherein,
W_sec: security weight defined according to the sensitivity level (refer to mode one).
F_legal: legal requirement factor, determined by the applicable legal requirements, if any; its value is between a constant b (e.g., 0.5) and 1.
F_threat: threat model factor, determined from high-risk threats; its value is between a constant b (e.g., 0.5) and 1.
F_env: environmental factor, determined from the degree of security of the data processing environment; its value is between a constant b (e.g., 0.5) and 1.
The legal requirement factor (F_legal) reflects the strictness of the data protection and privacy laws that an organization must adhere to. This factor can be determined based on the complexity of the legal requirements, the severity of fines, and compliance costs. The value of F_legal should be between 0 and 1; a smaller value indicates that the legal requirements are more stringent and that more security measures are required to ensure compliance.
The threat model factor (F_threat) takes into account the type and severity of the security threats faced by the organization. The threat models faced by data of different attribute types and by different business models may differ; e.g., bank card information may face greater threats than mailbox information. Optionally, the value of F_threat is between a constant b (e.g., 0.5) and 1, with a smaller value indicating a higher likelihood and impact of potential threats, so that more security measures are required to mitigate the risk.
F_threat may be determined by a score calculation, specifically including:
All potential threats that may affect data security are identified. For example: external attacks (e.g., DDoS attacks, phishing attacks); internal threats (e.g., malicious insiders, mishandling); system vulnerabilities (e.g., software not updated).
The likelihood and impact of each potential threat are then assessed, namely the likelihood of its occurrence and its impact on the organization. The following scores may be used:
0 points: the threat is unlikely to occur or has negligible impact.
1 point: the threat likelihood is low and the impact is small.
2 points: the threat likelihood is moderate and the impact is significant.
3 points: the threat likelihood is high and the impact is serious.
Finally, F_threat is calculated using the following formula:
F_threat=b+(b*(1-(Σ(Score_i)/(N*Max_Score)))),
in this formula:
b: a constant that sets the lower limit of the factor and thereby the maximum degree of influence of the factor, for example 0.5.
Score_i: score of the i-th potential threat.
N: number of potential threats.
Max_Score: the highest score of a single potential threat (3 in this example).
Based on the above formula, when the scores of all potential threats are highest (i.e., least safe), the value of F_threat is b, and when the scores of all potential threats are lowest (i.e., most safe), the value of F_threat is 1 (with b equal to 0.5).
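A minimal sketch of the score calculation, using a hypothetical function name `threat_factor` and the example values b=0.5 and Max_Score=3 from the text:

```python
def threat_factor(scores, b=0.5, max_score=3):
    """Compute F_threat = b + b * (1 - sum(scores) / (N * max_score))."""
    if not scores:
        raise ValueError("at least one potential threat must be scored")
    if any(s < 0 or s > max_score for s in scores):
        raise ValueError("each score must lie between 0 and max_score")
    ratio = sum(scores) / (len(scores) * max_score)
    return b + b * (1 - ratio)
```

Three threats all scored 3 give the lower limit b=0.5, threats all scored 0 give 1.0, and one threat scored 3 together with one scored 0 gives 0.75.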
For the environmental factor F_env, a security level may be determined from the control measures related to the security of the data processing environment, including physical security measures (e.g., intranet, extranet) and the configuration level of network security devices (e.g., firewall, intrusion detection system), and F_env is then set according to that security level.
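Mode two can be summarized in a short sketch that multiplies the level weight by the three factors. The names `W_SEC` and `security_parameter` are hypothetical, and the range check assumes the lower limit b=0.5 given as an example in the text.

```python
# Mode-one weights per sensitivity level, as listed in the text.
W_SEC = {1: 1.0, 2: 0.8, 3: 0.6, 4: 0.4, 5: 0.2}


def security_parameter(level, f_legal=1.0, f_threat=1.0, f_env=1.0, b=0.5):
    """Compute S_sec = W_sec * F_legal * F_threat * F_env."""
    for name, value in (("f_legal", f_legal), ("f_threat", f_threat), ("f_env", f_env)):
        if not b <= value <= 1:
            raise ValueError(f"{name} must lie between {b} and 1")
    return W_SEC[level] * f_legal * f_threat * f_env
```

With all three factors equal to 1 the result reduces to the mode-one weight; any factor below 1 shrinks S_sec and therefore the data block.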
The data block update in step S2 and the association of the data block with the MPC task comprise: the system detects the update through a registered event listener; the MPC tasks associated with the data block are looked up, and the current task is determined to be affected; the system sends a security notification to the participants of the current MPC task; after receiving the notification, the MPC participant starts the incremental encryption algorithm to encrypt and transmit only the modified data portion; and the MPC participants update their local calculations, recalculating or adjusting the parts they are responsible for.
First, there must be a mechanism to monitor when a data block is ready for updating, for example version control, triggers or event monitoring. The data owner or controller should manage the versions of the data blocks and generate update events when the data blocks are to be updated. The trigger condition for a version upgrade may be decided by the amount of accumulated data.
Each data block update event needs to be associated with one or more MPC tasks, including maintaining a mapping which records which MPC tasks depend on which data blocks, and querying this mapping for affected MPC tasks when the data block is updated.
When an update occurs, the system needs to notify the participants of the relevant MPC task. This may be achieved by a secure messaging system that ensures that participants are notified and become aware that the dependent data block has been updated.
Since MPC tasks typically involve sensitive data, delta encryption mechanisms should be used to protect the security of the data as it is updated, during which only the changing parts are re-encrypted and distributed instead of the entire data block.
Upon receiving the update notification, the participant needs to adjust its own computing portion to the new data block content, including re-running certain computing steps or restarting the MPC process altogether.
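The lookup and notification steps described above can be sketched as follows. The class `UpdateRegistry` and its methods are hypothetical, and the `outbox` list merely stands in for the secure messaging system, which is not specified further in the text.

```python
from collections import defaultdict


class UpdateRegistry:
    """Maps data blocks to dependent MPC tasks and queues update notifications."""

    def __init__(self):
        self._tasks_by_block = defaultdict(set)
        self._participants_by_task = defaultdict(set)
        self.outbox = []  # placeholder for a secure messaging system

    def register(self, task_id, block_ids, participants):
        """Record that task_id depends on block_ids and who takes part in it."""
        for block_id in block_ids:
            self._tasks_by_block[block_id].add(task_id)
        self._participants_by_task[task_id].update(participants)

    def on_block_updated(self, block_id, new_version):
        """Event-listener callback: find affected tasks and notify participants."""
        affected = sorted(self._tasks_by_block.get(block_id, ()))
        for task_id in affected:
            for participant in sorted(self._participants_by_task[task_id]):
                self.outbox.append((participant, task_id, block_id, new_version))
        return affected
```

Each queued message tells one participant which task and which block version changed, after which the participant can start incremental encryption for that block.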
For one MPC instance, each MPC participant may be built on a framework that includes: using an open-source MPC protocol library; protecting data transmitted over the network with the TLS (Transport Layer Security) or DTLS (Datagram Transport Layer Security) protocol; synchronizing operations with distributed locks or consensus algorithms; managing data with database encryption, secure storage or a key management system; deploying HSMs to strengthen key security and encryption operations; and creating secure enclaves on SGX-enabled CPUs to perform sensitive computations.
In step S3, implementing the dynamic incremental encryption mechanism comprises the following. The system determines the incremental encryption strategy of the current MPC task according to the size of the data block, the current CPU load, the memory usage and the network bandwidth. The system generates a new authority key based on the current data version number and performs incremental encryption based on that key. Encryption is applied only to the changed data blocks, without re-encrypting the entire data set. The encrypted incremental version of the data block is sent over a secure communication channel to all storage nodes. Each storage node confirms, verifies and synchronizes the received data block and manages the versions of its data, ensuring that all participants use the latest encrypted data block.
The method realizes the fine management of data blocks of different versions and parallel encryption, and improves the encryption speed.
Optionally, the rights key is generated from the current data version number by a key generation function KeyGen, which can use a hashing algorithm or a pseudo-random function to derive the key from the data version number:
Permission_Key=KeyGen(Data_Version),
Here KeyGen is a deterministic function, so the same input (data version number) always produces the same rights key. KeyGen typically derives the key from the data version number with a cryptographically secure hash function or pseudo-random function, for example Permission_Key=SHA256(Data_Version), where Data_Version is the primary input for key generation and SHA256 is a one-way hash function, so that the original input cannot be derived from the output. Because the output length of SHA256 is fixed at 256 bits (32 bytes) regardless of the length of the input data, the length of Permission_Key is always 256 bits when SHA256 is used.
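A minimal sketch of such a deterministic KeyGen, using SHA-256 from the Python standard library. The function name `key_gen` and the choice to encode the version number as UTF-8 text are assumptions. A bare hash of a version number is predictable, so a deployment would normally mix in secret material, which the text does not specify.

```python
import hashlib


def key_gen(data_version):
    """Derive a 256-bit Permission_Key from the data version number.

    The version is encoded as UTF-8 text before hashing (an assumption).
    """
    return hashlib.sha256(str(data_version).encode("utf-8")).digest()
```

The same version number always yields the same 32-byte key, and different version numbers yield different keys.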
Optionally, the data block size, the current CPU load, the memory usage and the network bandwidth determine an incremental encryption policy of the current MPC task, so that parameters of the encryption operation, including selection of an encryption algorithm, a size of a key, and concurrency of the encryption operation, are dynamically adjusted according to performance parameters of the system.
The following are the calculation formulas and parameter definitions for the incremental encryption strategy formulation:
BlockSize_Changed: the N_final, determined in step S1, of the data block containing the data packet to be incrementally encrypted.
CPU_Load: the current CPU load as a percentage, a value between 0 and 100.
Memory_Usage: the amount of memory currently in use.
Memory_Total: total system memory.
Network_Bandwidth: currently available network bandwidth.
Data_Version: version of the current data.
According to the system resource and the data block state, the following incremental encryption strategy is formulated:
(1) Key length Key_Length:
Key_Length=f(CPU_Load,Memory_Usage,Network_Bandwidth)
where the function f selects an appropriate key length based on the system resource parameters. The selection requires a trade-off between security and performance: high CPU load and memory usage may mean selecting a shorter key length to reduce computational overhead.
Optionally, the key length is dynamically adjusted according to CPU load, memory usage, and network bandwidth.
The reference values defining the key length are:
Key_Length_Max: the maximum selectable key length, e.g., 256 bits.
Key_Length_Min: the minimum selectable key length, e.g., 128 bits.
The thresholds for resource usage are:
CPU_Load_Max: the maximum acceptable percentage of CPU load, e.g., 80%.
Memory_Usage_Max: the maximum acceptable percentage of memory usage, e.g., 75%.
Network_Bandwidth_Min: the minimum acceptable value of network bandwidth, e.g., 1 Mbps.
Key_Length=f(CPU_Load,Memory_Usage,Network_Bandwidth) is then determined as follows.
First, the memory usage Memory_Usage is converted into a percentage Memory_Usage_Percent: Memory_Usage_Percent=(Memory_Usage/Memory_Total)*100
Then, the resource pressure index Stress_Index is calculated; its value lies in [0,1] as long as each resource stays within its threshold:
Stress_Index=(CPU_Load/CPU_Load_Max+Memory_Usage_Percent/Memory_Usage_Max+(Network_Bandwidth_Min/Network_Bandwidth))/3,
The key length Key_Length_1 is selected according to the resource pressure index Stress_Index:
Key_Length_1=Key_Length_Max-(Stress_Index*(Key_Length_Max-Key_Length_Min)),
Finally, it is ensured that the key length Key_Length is not less than the minimum value:
Key_Length=max(Key_Length_1,Key_Length_Min).
Here Stress_Index is a calculated index that jointly considers CPU load, memory usage percentage and network bandwidth. An increasing Stress_Index indicates increasing pressure on the system resources. The key length is dynamically adjusted between the maximum value Key_Length_Max and the minimum value Key_Length_Min; as the system pressure increases, a shorter key length is selected to reduce computational overhead. The function f ensures that the key length is not lower than the defined minimum security standard Key_Length_Min.
Key_Length defines the length of the encryption key and controls the length of Permission_Key. The dynamically adjusted Key_Length must therefore be less than or equal to the output length of SHA256, and the SHA256 output is truncated to match the required Key_Length. Specifically, the first Key_Length bits of the SHA256 output Permission_Key are taken as the key of length Key_Length.
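The function f and the truncation step can be sketched as below. The function names are hypothetical, the default thresholds are the example values from the text, and rounding the key length down to a whole number of bits and then to whole bytes is an assumption, since the text does not state how fractional lengths are handled.

```python
def key_length(cpu_load, memory_usage, memory_total, network_bandwidth,
               key_max=256, key_min=128,
               cpu_load_max=80.0, memory_usage_max=75.0,
               network_bandwidth_min=1.0):
    """Return Key_Length in bits from the resource pressure index."""
    memory_usage_percent = memory_usage / memory_total * 100
    stress_index = (cpu_load / cpu_load_max
                    + memory_usage_percent / memory_usage_max
                    + network_bandwidth_min / network_bandwidth) / 3
    key_length_1 = key_max - stress_index * (key_max - key_min)
    # Never fall below the minimum; integer rounding is an assumption.
    return int(max(key_length_1, key_min))


def truncate_key(permission_key, key_length_bits):
    """Keep the first key_length_bits of the key, rounded down to whole bytes."""
    return permission_key[: key_length_bits // 8]
```

With a CPU load of 40%, 3 of 8 memory units in use and 2 Mbps of bandwidth, each term of the index is 0.5, so the key length is halfway between the limits, 192 bits.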
(2) Selection of the encryption algorithm Encryption_Algorithm:
Encryption_Algorithm=g(BlockSize_Changed),
where the function g selects an appropriate encryption algorithm based on the size of the data block containing the data packet to be incrementally encrypted. Since the size of that data block represents the security sensitivity of the data within the data packet, smaller data blocks require a more computationally intensive encryption algorithm to provide greater security.
Optionally, two thresholds are defined for the size of the data block containing the data packet to be incrementally encrypted: BlockSize_Small_Threshold, the maximum threshold for small data blocks, and BlockSize_Medium_Threshold, the maximum threshold for medium data blocks.
The encryption algorithm is determined according to the range in which the size of the data block falls:
When the size of the data block is smaller than BlockSize_Small_Threshold, the Algorithm_Secure mode is entered and the more secure RSA asymmetric encryption algorithm is adopted.
When the size of the data block is between BlockSize_Small_Threshold and BlockSize_Medium_Threshold, the Algorithm_Balanced mode is entered and the AES-CBC algorithm, which balances security and performance, is adopted.
When the size of the data block is larger than BlockSize_Medium_Threshold, the Algorithm_Fast mode is entered and the AES-CTR algorithm, which requires less computation and executes faster, is adopted.
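The threshold rule for g reduces to a three-way comparison. In this sketch the function name `select_algorithm` is hypothetical, and sizes exactly equal to a threshold are assigned to the balanced mode, a boundary choice the text leaves open.

```python
def select_algorithm(block_size, small_threshold, medium_threshold):
    """Return the algorithm name chosen for a block of the given size."""
    if small_threshold > medium_threshold:
        raise ValueError("small_threshold must not exceed medium_threshold")
    if block_size < small_threshold:
        return "RSA"        # secure mode for small, highly sensitive blocks
    if block_size <= medium_threshold:
        return "AES-CBC"    # balanced mode
    return "AES-CTR"        # fast mode for large, less sensitive blocks
```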
(3) Concurrency of the encryption operation Encryption_Concurrency:
Encryption_Concurrency=h(Network_Bandwidth,Memory_Usage)
where the function h determines the number of encryption tasks performed concurrently based on the available network bandwidth and the memory usage. When the network bandwidth is large and the memory usage is not high, the concurrency can be increased to accelerate the encryption process.
The thresholds for resource usage are defined as:
Network_Bandwidth_Max: the maximum value of network bandwidth.
Memory_Usage_Max: the maximum memory that can be allocated to encryption operations out of the total system memory.
The range of the concurrency is defined as:
Concurrency_Min: minimum concurrency, e.g., 1 (at least one encryption operation is running).
Concurrency_Max: maximum concurrency, e.g., 256 (based on the maximum concurrency that the system can handle).
The function h(Network_Bandwidth,Memory_Usage) is evaluated as follows:
First, the current network bandwidth utilization is calculated, on the assumption that a higher value allows more concurrent encryption operations:
Network_Utilization=Network_Bandwidth/Network_Bandwidth_Max,
Then, the current memory utilization is calculated; lower memory utilization allows more concurrent encryption operations:
Memory_Utilization=Memory_Usage/Memory_Usage_Max, Memory_Available=1-Memory_Utilization.
Then, the preliminary concurrency Encryption_Concurrency_1 is calculated from the network factor Network_Utilization and the memory factor Memory_Available:
Encryption_Concurrency_1=int((Network_Utilization+Memory_Available)/2*Concurrency_Max).
Finally, Encryption_Concurrency_1 is limited to the defined range to obtain the final concurrency of the encryption operation:
Encryption_Concurrency=min(max(Encryption_Concurrency_1,Concurrency_Min),Concurrency_Max).
Network_Utilization and Memory_Available are two factors between 0 and 1, which represent the network bandwidth utilization and the proportion of available memory, respectively.
A normalized concurrency factor is calculated by averaging the network bandwidth utilization and the memory availability, and is multiplied by Concurrency_Max to obtain the preliminary concurrency.
The final concurrency is then adjusted so that it is not lower than Concurrency_Min and not higher than Concurrency_Max. The int function ensures that the concurrency is an integer and, for non-negative values, is mathematically equivalent to rounding down.
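The function h can be sketched directly from the formulas above. The function name `encryption_concurrency` is hypothetical; the default limits are the example values 1 and 256, and the two clamping steps are applied in sequence so that both limits hold.

```python
def encryption_concurrency(network_bandwidth, memory_usage,
                           network_bandwidth_max, memory_usage_max,
                           concurrency_min=1, concurrency_max=256):
    """Return the number of concurrent encryption tasks."""
    network_utilization = network_bandwidth / network_bandwidth_max
    memory_available = 1 - memory_usage / memory_usage_max
    # Average the two factors and scale to the maximum concurrency.
    preliminary = int((network_utilization + memory_available) / 2 * concurrency_max)
    # Clamp to [concurrency_min, concurrency_max].
    return min(max(preliminary, concurrency_min), concurrency_max)
```

With half of the maximum bandwidth available and a quarter of the memory budget in use, the factors are 0.5 and 0.75, giving 160 concurrent tasks out of a maximum of 256.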
Finally, encryption is applied based on the determined dynamic delta encryption policy:
Encrypted_Block=Encrypt(Data,Permission_Key,Encryption_Algorithm)
where Encrypt is the encryption function, Data is the incremental data packet to be encrypted, Permission_Key is the permission key used for encryption, and Encryption_Algorithm is the algorithm selected according to the policy.
The system can thus dynamically adjust the incremental encryption mode, so as to make the best use of the current system resources while ensuring the security of the data.
Encryption_Algorithm defines how Data is encrypted with the limited-length key Permission_Key and the selected encryption algorithm under the set concurrency conditions. The configured algorithm may be RSA, AES-CTR or AES-CBC. The set concurrency range is 1 to 256, and the limited key length is 128 to 256 bits.
Finally, the Encrypt function combines Data, Permission_Key and Encryption_Algorithm to generate Encrypted_Block.
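The constraints stated above (the candidate algorithms, the concurrency range of 1 to 256 and the key length of 128 to 256 bits) can be checked before encryption is applied. The following is a minimal sketch under assumptions: the class name EncryptionPolicy and its field names are hypothetical and are not part of the described method, and the actual Encrypt function is not shown.

```python
from dataclasses import dataclass

# Candidate algorithms named in the description.
ALLOWED_ALGORITHMS = ("RSA", "AES-CTR", "AES-CBC")


@dataclass(frozen=True)
class EncryptionPolicy:
    """Hypothetical container for one dynamic incremental encryption policy."""
    algorithm: str
    key_length_bits: int
    concurrency: int

    def __post_init__(self) -> None:
        # Reject any policy that falls outside the stated limits.
        if self.algorithm not in ALLOWED_ALGORITHMS:
            raise ValueError(f"unsupported algorithm: {self.algorithm}")
        if not 128 <= self.key_length_bits <= 256:
            raise ValueError("key length must be between 128 and 256 bits")
        if not 1 <= self.concurrency <= 256:
            raise ValueError("concurrency must be between 1 and 256")
```

A policy such as EncryptionPolicy("AES-CTR", 256, 128) passes the checks, whereas an algorithm outside the list or a concurrency of 0 raises ValueError.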
Optionally, in a multi-party computation (MPC) scenario, participants often need to share information while maintaining the privacy of the data. After the incremental encryption is completed, the MPC participant distributes the encrypted data block (encrypted_block) to all storage nodes.
The MPC participant first determines which data has changed since the last update. The participant then encrypts this incremental data using the permission_key and the selected encryption_algorithm, generating encrypted_block.
After encryption is complete, the MPC participant sends an Encrypted Block to each storage node in the network.
Together with encrypted_block, the MPC participant sends the limited key length key_length to each storage node in the network.
After receiving the encrypted_block, the storage node stores it locally. Since the data is encrypted, the data content remains secure even if the storage node is attacked or corrupted.
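The change detection and distribution steps above can be sketched as follows. This is an illustrative sketch only: the description does not state how changed data is identified, so comparing SHA-256 digests is an assumption, as are the function names, the modelling of each storage node as a dictionary, and the encrypt callable, which stands in for the Encrypt function with the key and algorithm already fixed.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def changed_blocks(previous_digests: Dict[str, str],
                   current: Dict[str, bytes]) -> Dict[str, bytes]:
    """Return the blocks that are new or whose content changed since the last update."""
    # Assumption: a block counts as changed when its SHA-256 digest differs
    # from the digest recorded at the last update (or none was recorded).
    return {
        block_id: data
        for block_id, data in current.items()
        if previous_digests.get(block_id) != hashlib.sha256(data).hexdigest()
    }


def distribute_delta(delta: Dict[str, bytes],
                     encrypt: Callable[[bytes], bytes],
                     key_length: int,
                     nodes: List[Dict[str, Tuple[bytes, int]]]) -> None:
    """Encrypt each changed block once and store it on every storage node."""
    for block_id, data in delta.items():
        encrypted_block = encrypt(data)
        for node in nodes:
            # Each node keeps only the ciphertext and the public key length.
            node[block_id] = (encrypted_block, key_length)
```

Only ciphertext and the key length reach the nodes in this sketch, which matches the point that a compromised storage node does not expose the data content.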
In a distributed storage system, users typically need to access encrypted data stored on nodes in a secure manner. In the present invention, when these data need to be accessed or processed, an authorized user can obtain encrypted_block from the storage node and decrypt it with the corresponding decryption key, thereby obtaining the original incremental data, including:
Before a user initiates a request to a storage node, the user must pass the identity verification of the system, which confirms whether the user has the authority to access the data corresponding to the specific version number. Once authenticated, the user requests the data packet of a particular version number and its corresponding key length key_length. After receiving the request, the storage node verifies whether the user's request is valid, which includes confirming whether the request was sent within a preset time period after the identity verification was passed. Upon confirming that the user has access to the data, the storage node sends the requested data packet (still in an encrypted state) and the required key_length to the user.
The user generates the corresponding key through a key generation mechanism (the KeyGen function) according to the supplied version number and key_length. The version number and the key length are public, while the key itself is private.
The user decrypts the received encrypted data using the generated key. After decryption is successful, the user may access the plaintext version of the data and perform subsequent operations, such as reading, editing, or processing the data.
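Two of the steps above, the time-window check on the request and the key generation, can be sketched as follows. This is a sketch under stated assumptions rather than the described implementation: the internals of KeyGen are not given in the description, so deriving the key with HMAC-SHA256 from a private value is an assumption, and the names request_is_valid, key_gen, secret and window_seconds are hypothetical.

```python
import hashlib
import hmac


def request_is_valid(authenticated_at: float,
                     requested_at: float,
                     window_seconds: float) -> bool:
    """Check that the request was sent within the preset period after authentication."""
    elapsed = requested_at - authenticated_at
    return 0 <= elapsed <= window_seconds


def key_gen(secret: bytes, version: int, key_length: int) -> bytes:
    """Derive a key of key_length bits for one version number.

    Assumption: the version number and key length are public inputs, and
    `secret` is a private value held only by authorized users.
    """
    if key_length % 8 != 0 or not 128 <= key_length <= 256:
        raise ValueError("key_length must be a multiple of 8 between 128 and 256 bits")
    label = f"version={version};key_length={key_length}".encode()
    digest = hmac.new(secret, label, hashlib.sha256).digest()  # 32 bytes
    # Truncate the 256-bit digest to the requested key length.
    return digest[: key_length // 8]
```

Because the derivation is deterministic, a user holding the same private value can regenerate the key for a given version number and key length, while the public inputs alone are not enough to do so.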
With the method and the system, incremental encryption and decryption remove the need to encrypt the whole data set: only the changed part is processed, so that the data processing speed is improved. In addition, the dynamic permission key and the data management mechanism associated with the MPC task allow data access authority to be controlled at fine granularity, which improves the flexibility of authority management. Furthermore, by partitioning sensitive data and implementing incremental encryption, the invention strengthens data privacy protection and reduces the risk of data leakage. By performing the incremental encryption dynamically according to the current task processing condition and the resource availability condition, the invention also helps to optimize the use of resources and reduce operation and maintenance costs.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. In some cases, the names of the units do not constitute a limitation on the units themselves.
The foregoing description of the preferred embodiments of the present invention has been presented for purposes of clarity and understanding, and is not intended to limit the invention to the particular embodiments disclosed, but is intended to cover all modifications, alternatives, and improvements within the spirit and scope of the invention as outlined by the appended claims.