CN110795477A - Data training method, device and system - Google Patents

Data training method, device and system

Info

Publication number
CN110795477A
Authority
CN
China
Prior art keywords
model
training
model parameters
clients
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910894089.3A
Other languages
Chinese (zh)
Inventor
何安珣
王健宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910894089.3A priority Critical patent/CN110795477A/en
Priority to PCT/CN2019/118407 priority patent/WO2021051610A1/en
Publication of CN110795477A publication Critical patent/CN110795477A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/25: Integrating or interfacing systems involving database management systems
    • G06F 16/252: Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H 10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a data training method, device, and system. The method includes: sending an initial training model to a plurality of clients, each of which communicates individually with a server; receiving multiple sets of first model parameters sent by the plurality of clients, where each set of first model parameters is obtained by a client training the initial training model on first medical data in its local database; performing a weighted average over the multiple sets of first model parameters to obtain second model parameters; and sending the second model parameters to the plurality of clients, where the second model parameters are used to construct the same second training model at each of the plurality of clients. The invention solves technical problems in the related art, including the complexity of the algorithm models used to process medical data and the inability to process large-scale medical data that is security-sensitive and inconvenient to transfer.

Description

Data training method, device and system
Technical Field
The invention relates to the field of computers, and in particular to a data training method, device, and system.
Background
In the related art, computer-aided identification of medical images is a mature application of artificial-intelligence image recognition in the medical field. Many organizations at home and abroad have built standardized regional medical-image data-center cloud-platform services around this technology, integrating functions such as auxiliary diagnosis, centralized data storage and management, regional major-disease analysis, and regional population health profiling. However, the currently widespread regional cloud platform is, as its name implies, merely a regional health-information sharing system: in essence it is a private cloud built around a single clinic or hospital, or a small group of hospitals.
Because medical and health data are private, economies of scale cannot be achieved and the data-island problem persists: training of medical models remains constrained by limited data, some medical institutions must pay a high cost to purchase models trained by third-party institutions, the degree of information sharing across the industry is low, economic efficiency is poor, and it is difficult for the medical-health ecosystem to develop further on this basis.
Traditional data aggregation and machine learning first integrate the data and then train on the integrated data set. This approach requires data to be transmitted between the distributed data sets and a central server, and the central server must integrate massive amounts of data, so the computing power required to train a model is high, the computing cost is correspondingly high, and the response time is long. Moreover, for data that are security-sensitive and inconvenient to transfer, such as medical and health data, this approach cannot be used for large-scale model training.
In view of the above problems in the related art, no effective solution has yet been proposed.
Disclosure of Invention
The embodiments of the invention provide a data training method, device, and system, which at least solve technical problems in the related art, including the complexity of the algorithm models used to process medical data and the inability to process large-scale medical data that is security-sensitive and inconvenient to transfer.
According to an embodiment of the present invention, a data training method is provided, including: sending an initial training model to a plurality of clients, where each of the plurality of clients communicates individually with a server; receiving multiple sets of first model parameters sent by the plurality of clients, where each set of first model parameters is obtained by a client training the initial training model on first medical data in its local database; performing a weighted average on the multiple sets of first model parameters to obtain second model parameters; and sending the second model parameters to the plurality of clients, where the second model parameters are used to construct the same second training model at each of the plurality of clients.
Optionally, before the weighted average is performed on the multiple sets of first model parameters to obtain the second model parameters, the method further includes: decrypting the first model parameters with a preset private key, where the private key and the public keys corresponding to the plurality of clients form a key pair, and the public keys are used to encrypt the first model parameters.
Optionally, performing a weighted average on the multiple sets of first model parameters to obtain the second model parameters includes: making C(M, N) selections from the M sets of first model parameters, choosing N sets of first model parameters in each selection, where N is an integer smaller than M, and performing a weighted average on the N sets selected each time to obtain one primary model parameter; and then performing a weighted average on the C(M, N) primary model parameters to obtain the second model parameters.
According to another embodiment of the present invention, a data training method is provided, including: receiving an initial training model sent by a server; training the initial training model on first medical data in a local database to obtain a first training model; sending first model parameters of the first training model to the server, where the server performs a weighted average on multiple sets of first model parameters from a plurality of clients to obtain second model parameters of a second training model and feeds the second model parameters back to the plurality of clients; and constructing the second training model from the second model parameters, and training the second training model on second medical data in the local database.
Optionally, training the initial training model on the first medical data in the local database to obtain the first training model includes: performing batch gradient computation on the initial training model using the first medical data in the local database to obtain a plurality of gradient values; computing the average gradient of the plurality of gradient values; and updating the initial weight values of the initial training model with the average gradient to obtain the first model parameters.
According to an embodiment of the present invention, a data training apparatus is provided, including: a first sending module, configured to send an initial training model to a plurality of clients, where each of the plurality of clients communicates individually with a server; a receiving module, configured to receive multiple sets of first model parameters sent by the plurality of clients, where each set of first model parameters is obtained by a client training the initial training model on first medical data in its local database; a calculation module, configured to perform a weighted average on the multiple sets of first model parameters to obtain second model parameters; and a second sending module, configured to send the second model parameters to the plurality of clients, where the second model parameters are used to construct the same second training model at each of the plurality of clients.
Optionally, the apparatus further includes: a decryption module, configured to decrypt the first model parameters with a preset private key before the weighted average is performed on the multiple sets of first model parameters to obtain the second model parameters, where the private key and the public key sent to the target terminal form a key pair, and the public key is used to encrypt the first model parameters.
Optionally, the calculation module includes: a selection unit, configured to make C(M, N) selections from the M sets of first model parameters, choosing N sets of first model parameters in each selection and performing a weighted average on the N sets selected each time to obtain one primary model parameter, where N is an integer smaller than M; and a computing unit, configured to perform a weighted average on the C(M, N) primary model parameters to obtain the second model parameters.
According to another embodiment of the present invention, a data training apparatus is provided, including: a receiving module, configured to receive an initial training model sent by a server; a first training module, configured to train the initial training model on first medical data in a local database to obtain a first training model; a sending module, configured to send first model parameters of the first training model to the server, where the server performs a weighted average on multiple sets of first model parameters from a plurality of clients to obtain second model parameters of a second training model and feeds the second model parameters back to the plurality of clients; and a second training module, configured to construct the second training model from the second model parameters and train the second training model on second medical data in the local database.
Optionally, the first training module includes: a first calculation unit, configured to perform batch gradient computation on the initial training model using the first medical data in the local database to obtain a plurality of gradient values; a second calculation unit, configured to compute the average gradient of the plurality of gradient values; and a third calculation unit, configured to update the initial weight values of the initial training model with the average gradient to obtain the first model parameters.
According to yet another embodiment of the present invention, a data training system is provided, including a server and a plurality of clients, where the server includes: a first sending module, configured to send an initial training model to the plurality of clients; a receiving module, configured to receive multiple sets of first model parameters sent by the plurality of clients; a calculation module, configured to perform a weighted average on the multiple sets of first model parameters to obtain second model parameters; and a second sending module, configured to send the second model parameters to the plurality of clients. Each of the plurality of clients communicates individually with the server and includes: a receiving module, configured to receive the initial training model; a first training module, configured to train the initial training model on first medical data in a local database to obtain a first training model; a sending module, configured to send first model parameters of the first training model to the server; and a second training module, configured to construct a second training model from the second model parameters and train the second training model on second medical data in the local database.
According to a further embodiment of the present invention, a storage medium is also provided, in which a computer program is stored, where the computer program is arranged to perform, when executed, the steps in any of the above method embodiments.
According to yet another embodiment of the present invention, there is also provided a computer device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
With the above method and system, the server sends the initial training model to the plurality of clients, so that each client trains on its local medical data and obtains an updated first training model; only the first model parameters of the first training model are sent to the server, and the local medical data need not be aggregated at the server, which guarantees the security of the local data and reduces the server's workload and storage requirements. The server then applies weighted processing to the received model parameters and returns the result to the plurality of clients for further training, so that the clients share one and the same training model. This solves technical problems in the related art, including the complexity of the algorithm models used to process medical data and the inability to process large-scale medical data that is security-sensitive and inconvenient to transfer.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a block diagram of a hardware structure of a computer terminal to which a data training method according to an embodiment of the present invention is applied;
FIG. 2 is a flow chart of a method of training data provided in accordance with the present invention;
FIG. 3 is a block diagram of another method for training data according to an embodiment of the invention;
FIG. 4 is a flowchart of medical-data training based on federated learning according to an embodiment of the present invention;
FIG. 5 is a block diagram of an apparatus for training data according to an embodiment of the present invention;
FIG. 6 is a block diagram of an alternative data training apparatus according to an embodiment of the present invention;
FIG. 7 is a block diagram of a system for training data according to an embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Example 1
The method provided in the first embodiment of the present application may be executed on a mobile terminal, a server, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, FIG. 1 is a block diagram of the hardware structure of a computer terminal to which the data training method of an embodiment of the present invention is applied. As shown in FIG. 1, the computer terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and optionally a transmission device 106 for communication functions and an input/output device 108. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only an illustration and does not limit the structure of the computer terminal. For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration from that shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the data training method in the embodiment of the present invention, and the processor 102 executes the computer program stored in the memory 104 to execute various functional applications and data processing, i.e., to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to a computer terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In the embodiment, a data training method is provided, and fig. 2 is a flowchart of a data training method according to the present invention. As shown in fig. 2, the process includes the following steps:
step S202, sending an initial training model to a plurality of clients, wherein the clients are all in independent communication with a server;
step S204, receiving a plurality of sets of first model parameters sent by a plurality of clients, wherein the first model parameters are obtained by the clients through training the initial training model according to first medical data of a local database;
the first medical data of the client local database may include attribute information of the patient, diagnosis and treatment information of the patient, and the like, such as: personal identity information such as age and sex of the patient, medical records such as past medical history and prescription effect.
Step S206, carrying out weighted average on a plurality of sets of first model parameters to obtain second model parameters;
In this embodiment, the server performs a weighted calculation on the model parameters sent by the plurality of clients according to the federated averaging algorithm, where each weight is determined by the training performance of the corresponding client.
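The server-side weighted averaging of step S206 can be sketched as follows. This is an illustrative example only, not the patent's implementation; the per-client weights shown are hypothetical stand-ins for each client's training performance.

```python
# Illustrative sketch: weighted averaging of client model parameters,
# as in federated averaging. The weight values are hypothetical.

def weighted_average(param_sets, weights):
    """Combine several clients' parameter vectors into one set.

    param_sets: list of equal-length lists of floats (one per client).
    weights:    per-client weights, normalized here so they sum to 1.
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    n = len(param_sets[0])
    return [sum(w * params[i] for w, params in zip(norm, param_sets))
            for i in range(n)]

# Three clients, each reporting a set of two first model parameters.
first_params = [[0.2, 1.0], [0.4, 2.0], [0.6, 3.0]]
client_weights = [1.0, 1.0, 2.0]   # hypothetical per-client weights
second_params = weighted_average(first_params, client_weights)
```

The normalization step makes the result independent of the weights' scale, so the server can use raw training-performance scores directly.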
And step S208, sending second model parameters to the plurality of clients, wherein the second model parameters are used for constructing the same second training model on the plurality of clients respectively.
Through the above steps, the server sends the initial training model to the plurality of clients so that each client trains on its local medical data and obtains an updated first training model; only the first model parameters of the first training model need to be returned to the server, and the local medical data need not be aggregated at the server, which guarantees the security of the local data and reduces the server's workload and storage requirements. The server applies weighted processing to the received model parameters and returns the result to the plurality of clients for further training, so that the clients share one and the same training model, thereby solving technical problems in the related art, including the complexity of the algorithm models used to process medical data and the inability to process large-scale medical data that is security-sensitive and inconvenient to transfer.
Optionally, before the weighted average is performed on the multiple sets of first model parameters to obtain the second model parameters, the method further includes: decrypting the first model parameters with a preset private key, where the private key and the public keys corresponding to the plurality of clients form a key pair, and the public keys are used to encrypt the first model parameters.
In this embodiment, to ensure information security between each client and the server, the parameters exchanged between them are encrypted as follows: (1) the server sends each client a public key for encrypting the parameters to be exchanged (i.e., the first model parameters), while the server keeps the corresponding private key, i.e., the public key and the private key form a key pair; (2) after receiving the encrypted parameters, the server decrypts them with the private key.
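To illustrate this key-pair flow, here is a toy textbook-RSA sketch with tiny primes. The patent does not name a specific cryptosystem; any public-key scheme with this encrypt-at-client, decrypt-at-server structure would fit, and this demo is deliberately insecure.

```python
# Toy illustration of the key-pair scheme: each client encrypts with the
# server's public key; only the server can decrypt with the private key.
# Textbook RSA with tiny primes (demo only, NOT secure).

p, q = 61, 53                 # small primes, demo only
n = p * q                     # modulus: 3233
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent, coprime with phi
d = pow(e, -1, phi)           # private exponent (Python 3.8+ modular inverse)

def encrypt(m, pub=(e, n)):   # runs at a client
    return pow(m, pub[0], pub[1])

def decrypt(c, priv=(d, n)):  # runs at the server
    return pow(c, priv[0], priv[1])

# A first model parameter encoded as a small integer (must be < n here):
ciphertext = encrypt(1234)
plaintext = decrypt(ciphertext)
```

In practice the parameters would be serialized and encrypted with a production-grade hybrid scheme rather than raw integer RSA.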
In an optional example, performing a weighted average on the multiple sets of first model parameters to obtain the second model parameters includes: making C(M, N) selections from the M sets of first model parameters, choosing N sets of first model parameters in each selection, where N is an integer smaller than M, and performing a weighted average on the N sets selected each time to obtain one primary model parameter; and then performing a weighted average on the C(M, N) primary model parameters to obtain the second model parameters.
In an alternative embodiment, the server uses the federated averaging algorithm to perform a weighted average on the first model parameters obtained after each client trains on its local database. Taking 3 clients as an example (i.e., client 1, client 2, and client 3), and supposing that, according to the per-round client selection ratio, 2 of the 3 clients are selected in each round, there are C(3, 2) = 3 possible selections (client 1 with client 2, client 1 with client 3, and client 2 with client 3). The first model parameters sent by client 1 and client 2 are weighted to obtain parameter 1 (i.e., a primary model parameter); the first model parameters sent by client 1 and client 3 are weighted to obtain parameter 2; and the first model parameters sent by client 2 and client 3 are weighted to obtain parameter 3. Finally, parameters 1, 2, and 3 are averaged with weights to obtain the second model parameters (equivalent to secondary model parameters).
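The two-stage averaging in this worked example can be sketched with `itertools.combinations`. Plain unweighted means are used for simplicity, whereas the patent allows arbitrary weights at both stages.

```python
# Sketch of the two-stage averaging: from M = 3 clients, every
# N = 2-client subset is averaged into a primary parameter, and the
# C(M, N) primary parameters are then averaged into the second
# model parameter.
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

def two_stage_average(client_params, n):
    subsets = list(combinations(client_params, n))   # C(M, N) subsets
    primary = [mean(s) for s in subsets]             # primary parameters
    return mean(primary)                             # second parameter

# One scalar first model parameter per client, for clients 1, 2, and 3:
params = [1.0, 2.0, 3.0]
result = two_stage_average(params, 2)   # subsets: (1,2), (1,3), (2,3)
```

Each parameter vector would be averaged component-wise in the same way; a single scalar per client keeps the example short.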
In this embodiment, another data training method is provided and applied to a client, and fig. 3 is a block diagram of another data training method according to an embodiment of the present invention. As shown in fig. 3, the process includes the following steps:
step S302, receiving an initial training model sent by a server;
step S304, training the initial training model according to the first medical data of the local database to obtain a first training model;
step S306, sending the first model parameters of the first training model to a server, wherein the server is used for carrying out weighted average on a plurality of sets of first model parameters of a plurality of clients to obtain second model parameters of a second training model, and feeding the second model parameters back to the plurality of clients;
and step S308, constructing a second training model according to the second model parameters, and training second medical data of the local database by using the second training model.
In this embodiment of the invention, the plurality of clients train on their respective local medical data using the initial training model provided by the server to obtain updated first training models; only the first model parameters of each first training model are sent to the server, and the local medical data need not be aggregated at the server, which guarantees the security of the local data. The plurality of clients then continue training on their local medical data using the second model parameters returned by the server, achieving the goal of all clients sharing one and the same training model and solving technical problems in the related art, including the complexity of the algorithm models used to process medical data and the inability to process large-scale medical data that is security-sensitive and inconvenient to transfer.
In an alternative embodiment, training the initial training model on the first medical data in the local database to obtain the first training model includes: performing batch gradient computation on the initial training model using the first medical data in the local database to obtain a plurality of gradient values; computing the average gradient of the plurality of gradient values; and updating the initial weight values of the initial training model with the average gradient to obtain the first model parameters.
In this embodiment, because each client's local medical data is continuously updated, in order to make the federally trained model adapt to each client and minimize the loss on local medical data, batch gradient computation (stochastic gradient descent, SGD) is performed on the initial training model: according to the proportion of client devices participating in each round, the loss gradients over the local medical data of a plurality of clients are computed. This is equivalent to computing, over several parallel data channels, the average of the gradients of the clients in a randomly drawn subset, and updating the weights of the initial training model with that gradient average. Each client then performs a further gradient-descent step on the current model (i.e., the initial training model) using its local medical data, and the server performs a weighted average of the resulting models (i.e., the second model parameters). Averaging the gradients over multiple models across multiple clients minimizes the loss on local medical data and outperforms training a model on each client individually.
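A minimal sketch of a client's local update described above, assuming (hypothetically) a one-parameter linear model y = w * x with squared loss; the actual model and loss function are not specified in the patent.

```python
# Minimal sketch of one client's local update: compute a gradient per
# local sample, average the gradients, and update the initial weight,
# yielding that client's "first model parameters". The linear model
# and squared loss are illustrative assumptions.

def local_update(w0, samples, lr=0.1):
    # gradient of (w*x - y)^2 / 2 with respect to w is (w*x - y) * x
    grads = [(w0 * x - y) * x for x, y in samples]   # batch of gradients
    avg_grad = sum(grads) / len(grads)               # average gradient
    return w0 - lr * avg_grad                        # updated weight

# Hypothetical local medical records reduced to (feature, label) pairs:
data = [(1.0, 2.0), (2.0, 4.0)]
w1 = local_update(0.0, data)
```

Averaging the per-sample gradients before the update is what makes this a batch step rather than a sequence of single-sample updates.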
The following further describes an embodiment of the present invention with reference to a specific embodiment:
FIG. 4 is a flowchart of medical-data training based on federated learning provided by an embodiment of the present invention. As shown in FIG. 4, assume that the distributed data centers comprise 3 clients, i.e., data set No. 1, data set No. 2, and data set No. 3 in FIG. 4, together with a central server. The central server provides an initial model (i.e., the initial training model) to the distributed data centers. Data set No. 1 trains the initial model on its own data recorded in its local database (i.e., the first medical data of the local database), obtaining model update 1 and first model parameters 1; meanwhile, data set No. 2 trains the initial model on its own locally recorded data, obtaining model update 2 and first model parameters 2; similarly, data set No. 3 obtains model update 3 and first model parameters 3.
The three sets of model parameters are sent to the central server; the data sets of the distributed data centers need not be consolidated at the central server, which reduces the central server's workload and increases its processing speed. The central server performs a weighted calculation on the three received parameter sets according to the federated averaging algorithm to obtain the second model parameters, and returns the second model parameters to data sets No. 1, No. 2, and No. 3.
Through the above steps, the distributed data centers do not send their locally stored medical data to the central server; instead, each data center encrypts the model parameters of its trained model and sends them to the central server, ensuring the data security of the data-set side and the personal privacy of users. The central server does not need to consolidate the data sets from all terminal sides; it performs a weighted average over the model parameters from all data sets to obtain the second model parameters, thereby updating the models of all data sets uniformly. This reduces the central server's computing cost and computing time and thus improves the server's processing efficiency, solving technical problems in the related art, including the inability to train, at scale, users' medical data that is security-sensitive and inconvenient to transfer, and the low degree of medical-data sharing.
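As an illustrative sketch only (the patent does not specify a concrete model), one round of the flow in FIG. 4 can be simulated with a hypothetical one-parameter linear model: the central server broadcasts an initial weight, each of the three data sets takes a local gradient step on data that never leaves it, and the server averages the returned first model parameters into the second model parameters. Encryption is omitted for brevity.

```python
# End-to-end sketch of one federated round from FIG. 4. The linear
# model, squared loss, and data values are all hypothetical.

def local_update(w0, samples, lr=0.1):
    # one batch gradient step on y = w * x under squared loss
    grads = [(w0 * x - y) * x for x, y in samples]
    return w0 - lr * sum(grads) / len(grads)

# Three local databases that never leave their clients:
datasets = [
    [(1.0, 2.0)],              # data set No. 1
    [(1.0, 2.0), (2.0, 4.0)],  # data set No. 2
    [(2.0, 4.0)],              # data set No. 3
]

w_initial = 0.0
first_params = [local_update(w_initial, ds) for ds in datasets]  # at clients
w_second = sum(first_params) / len(first_params)                 # at server
```

A full training run would repeat this round, with the server broadcasting `w_second` as the next round's starting weight.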
Example 2
In this embodiment, a data training apparatus is further provided. The apparatus is used to implement the foregoing embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a block diagram of a data training apparatus according to an embodiment of the present invention, as shown in fig. 5, the apparatus including: a first sending module 502, configured to send the initial training model to a plurality of clients, where the plurality of clients are all in individual communication with the server; a receiving module 504, connected to the first sending module 502, configured to receive multiple sets of first model parameters sent by multiple clients, where the first model parameters are obtained by the clients training an initial training model according to first medical data in a local database; a calculating module 506, connected to the receiving module 504, configured to perform weighted average on multiple sets of first model parameters to obtain second model parameters; a second sending module 508, connected to the calculating module 506, configured to send the second model parameters to the multiple clients, where the second model parameters are used to construct the same second training model at the multiple clients, respectively.
Optionally, the apparatus further comprises: and the decryption module is used for decrypting the first model parameters according to a preset private key before carrying out weighted average on the multiple sets of first model parameters to obtain the second model parameters, wherein the private key and public keys corresponding to the multiple clients form a group of key pairs, and the public keys are used for encrypting the first model parameters.
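The key-pair handling of the decryption module above can be sketched as follows. This is an illustrative assumption, not the patent's specified cipher: textbook RSA with tiny fixed primes, used only to show a client encrypting a quantized first model parameter with the server's public key and the server decrypting it with the matching private key. Parameters this small must never be used for real security.

```python
# Toy RSA keypair (assumption for illustration only).
p, q = 61, 53
n = p * q        # modulus shared by both keys (3233)
e = 17           # public exponent  -> public key (e, n), held by the clients
d = 2753         # private exponent -> private key (d, n), held by the server

def encrypt(value, public_key):
    """Client side: encrypt an integer-quantized parameter."""
    exp, mod = public_key
    return pow(value, exp, mod)

def decrypt(cipher, private_key):
    """Server side: recover the parameter before weighted averaging."""
    exp, mod = private_key
    return pow(cipher, exp, mod)

# A client quantizes a parameter (e.g. 0.427 -> 427) and encrypts it:
quantized_param = 427
cipher = encrypt(quantized_param, (e, n))
# The server decrypts with the preset private key:
recovered = decrypt(cipher, (d, n))
```

The quantization step (float parameter to bounded integer) is a stand-in; any encoding that fits the modulus would do.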
Optionally, the calculation module includes: a selection unit, configured to make C(M, N) selections among the M sets of first model parameters, where N sets of first model parameters are selected each time and a weighted average is performed on the N sets selected each time to obtain one primary model parameter, N being an integer smaller than M; and a calculation unit, configured to perform a weighted average on the C(M, N) primary model parameters to obtain the second model parameters.
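A sketch of this two-stage average, reading the elided count of selections as the number of combinations C(M, N) and assuming equal weights at both stages (the patent leaves the weights unspecified):

```python
from itertools import combinations

def two_stage_average(param_sets, n):
    """param_sets: M lists of floats; n: size of each selection (n < M)."""
    primaries = []
    for combo in combinations(param_sets, n):        # C(M, n) selections
        # average the n selected parameter sets into one primary set
        primary = [sum(vals) / n for vals in zip(*combo)]
        primaries.append(primary)
    k = len(primaries)                               # k == C(M, n)
    # average the primary sets into the second model parameters
    return [sum(vals) / k for vals in zip(*primaries)]

# M = 3 one-dimensional parameter sets, selecting N = 2 at a time:
second = two_stage_average([[1.0], [2.0], [3.0]], 2)
```

With equal weights the two stages reduce to the plain mean; unequal per-client weights would make the stages differ.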
Fig. 6 is a block diagram of another data training apparatus according to an embodiment of the present invention, as shown in fig. 6, the apparatus including: a receiving module 602, configured to receive an initial training model sent by a server; a first training module 604, connected to the receiving module 602, configured to train an initial training model according to first medical data in a local database, so as to obtain a first training model; a sending module 606, connected to the first training module 604, configured to send the first model parameters of the first training model to a server, where the server is configured to perform weighted average on multiple sets of first model parameters of multiple clients to obtain second model parameters of a second training model, and feed the second model parameters back to the multiple clients; and a second training module 608, connected to the sending module 606, configured to construct a second training model according to the second model parameters, and train second medical data of the local database using the second training model.
Optionally, the first training module includes: the first calculation unit is used for performing batch gradient calculation on the initial training model by using the first medical data of the local database to obtain a plurality of gradient values; a second calculation unit for calculating an average gradient of the plurality of gradient values; and the third calculating unit is used for updating the initial weight value of the initial training model by using the average gradient to obtain the first model parameter.
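A minimal sketch of the three calculation units above, assuming a mean-squared-error loss on a single feature (the patent does not fix a loss function, so the gradient formula and learning rate are illustrative):

```python
def batch_gradient(weight, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)

def local_update(initial_weight, batches, lr=0.1):
    # first unit: one gradient value per batch of local medical records
    grads = [batch_gradient(initial_weight, b) for b in batches]
    # second unit: average gradient over the batches
    avg_grad = sum(grads) / len(grads)
    # third unit: update the initial weight to get the first model parameter
    return initial_weight - lr * avg_grad

# Hypothetical local data obeying y = 2x, split into two batches:
batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w1 = local_update(0.0, batches)
```

The resulting `w1` is the first model parameter the client would send (after encryption) to the server.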
According to another embodiment of the present invention, there is also provided a data training system, and fig. 7 is a block diagram of a data training system according to an embodiment of the present invention, including: the server and a plurality of clients, wherein, the server includes: the first sending module is used for sending the initial training model to a plurality of clients; the receiving module is used for receiving a plurality of sets of first model parameters sent by a plurality of clients; the calculation module is used for carrying out weighted average on a plurality of sets of first model parameters to obtain second model parameters; the second sending module is used for sending the second model parameters to the plurality of clients; a plurality of clients, each in individual communication with the server, comprising: a receiving module, configured to receive an initial training model; the first training module is used for training the initial training model according to the first medical data of the local database to obtain a first training model; the sending module is used for sending the first model parameters of the first training model to the server; and the second training module is used for constructing a second training model according to the second model parameters and training second medical data of the local database by using the second training model.
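One round of the server-client interaction in this system can be simulated in miniature. The single gradient-style step toward a local target is a stand-in assumption for each client's local training on its first medical data; equal aggregation weights are also assumed.

```python
def client_train(initial_w, local_target, lr=0.5):
    # stand-in for local training: one step toward this client's local optimum
    return initial_w + lr * (local_target - initial_w)

def server_round(initial_w, local_targets):
    # server sends initial_w; each client returns a first model parameter
    first_params = [client_train(initial_w, t) for t in local_targets]
    # server averages them into the second model parameter (equal weights)
    second_param = sum(first_params) / len(first_params)
    return second_param, first_params

# Three clients with hypothetical local optima 1.0, 2.0, 3.0:
second, firsts = server_round(0.0, [1.0, 2.0, 3.0])
```

Every client receives the same `second`, so all clients construct the same second training model, as the system requires.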
It should be noted that the above modules may be implemented by software or by hardware; in the latter case, implementations include, but are not limited to, the following: the modules are all located in the same processor, or the modules are located in different processors in any combination.
Example 3
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, sending the initial training model to a plurality of clients, wherein the clients are all in independent communication with the server;
s2, receiving a plurality of sets of first model parameters sent by the plurality of clients, wherein the first model parameters are obtained by the clients through training the initial training model according to first medical data of a local database;
s3, carrying out weighted average on the multiple sets of first model parameters to obtain second model parameters;
and S4, sending the second model parameters to the plurality of clients, wherein the second model parameters are used for respectively constructing the same second training models at the plurality of clients.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, receiving the initial training model sent by the server;
s2, training the initial training model according to the first medical data of the local database to obtain a first training model;
s3, sending the first model parameters of the first training model to the server, wherein the server is used for carrying out weighted average on a plurality of sets of first model parameters of a plurality of clients to obtain second model parameters of a second training model, and feeding the second model parameters back to the plurality of clients;
and S4, constructing a second training model according to the second model parameters, and training second medical data of the local database by using the second training model.
Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing a computer program, such as a USB flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, sending the initial training model to a plurality of clients, wherein the clients are all in independent communication with the server;
s2, receiving a plurality of sets of first model parameters sent by the plurality of clients, wherein the first model parameters are obtained by the clients through training the initial training model according to first medical data of a local database;
s3, carrying out weighted average on the multiple sets of first model parameters to obtain second model parameters;
and S4, sending the second model parameters to the plurality of clients, wherein the second model parameters are used for respectively constructing the same second training models at the plurality of clients.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, receiving the initial training model sent by the server;
s2, training the initial training model according to the first medical data of the local database to obtain a first training model;
s3, sending the first model parameters of the first training model to the server, wherein the server is used for carrying out weighted average on a plurality of sets of first model parameters of a plurality of clients to obtain second model parameters of a second training model, and feeding the second model parameters back to the plurality of clients;
and S4, constructing a second training model according to the second model parameters, and training second medical data of the local database by using the second training model.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general-purpose computing device. They may be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by a computing device; in some cases, the steps shown or described may be performed in an order different from that described herein. Alternatively, they may be separately fabricated into individual integrated circuit modules, or multiple of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for training data, comprising:
sending an initial training model to a plurality of clients, wherein the plurality of clients are each in individual communication with a server;
receiving a plurality of sets of first model parameters sent by the plurality of clients, wherein the first model parameters are obtained by the clients through training the initial training model according to first medical data of a local database;
carrying out weighted average on the multiple sets of first model parameters to obtain second model parameters;
and sending the second model parameters to the plurality of clients, wherein the second model parameters are used for constructing the same second training model at the plurality of clients respectively.
2. The method of claim 1, wherein prior to performing a weighted average of the plurality of sets of first model parameters to obtain second model parameters, the method further comprises:
and decrypting the first model parameter according to a preset private key, wherein the private key and public keys corresponding to the plurality of clients form a group of key pairs, and the public keys are used for encrypting the first model parameter.
3. The method of claim 1, wherein the weighted averaging of the plurality of sets of first model parameters to obtain second model parameters comprises:
selecting, among M sets of first model parameters, C(M, N) times, wherein N sets of first model parameters are selected each time, and a weighted average is performed on the N sets of first model parameters selected each time to obtain a primary model parameter, wherein N is an integer less than M;
and performing a weighted average on the C(M, N) primary model parameters to obtain the second model parameters.
4. A method for training data, comprising:
receiving an initial training model sent by a server;
training the initial training model according to first medical data of a local database to obtain a first training model;
sending the first model parameters of the first training model to the server, wherein the server is used for carrying out weighted average on a plurality of sets of first model parameters of a plurality of clients to obtain second model parameters of a second training model, and feeding the second model parameters back to the plurality of clients;
and constructing a second training model according to the second model parameters, and training second medical data of the local database by using the second training model.
5. The method of claim 4, wherein training the initial training model based on the first medical data of the local database to obtain a first training model comprises:
performing batch gradient calculation on the initial training model by using the first medical data of the local database to obtain a plurality of gradient values;
calculating an average gradient of the plurality of gradient values;
and updating the initial weight value of the initial training model by using the average gradient to obtain the first model parameter.
6. An apparatus for training data, comprising:
the system comprises a first sending module, a second sending module and a third sending module, wherein the first sending module is used for sending an initial training model to a plurality of clients, and the clients are communicated with a server independently;
the receiving module is used for receiving a plurality of sets of first model parameters sent by the clients, wherein the first model parameters are obtained by the clients through training the initial training model according to first medical data of a local database;
the calculation module is used for carrying out weighted average on the plurality of sets of first model parameters to obtain second model parameters;
and the second sending module is used for sending the second model parameters to the plurality of clients, wherein the second model parameters are used for constructing the same second training model at the plurality of clients respectively.
7. An apparatus for training data, comprising:
the receiving module is used for receiving the initial training model sent by the server;
the first training module is used for training the initial training model according to first medical data of a local database to obtain a first training model;
the sending module is used for sending the first model parameters of the first training model to the server, wherein the server is used for carrying out weighted average on a plurality of sets of first model parameters of a plurality of clients to obtain second model parameters of a second training model, and feeding the second model parameters back to the plurality of clients;
and the second training module is used for constructing a second training model according to the second model parameters and training second medical data of the local database by using the second training model.
8. A system for training data, comprising: a server and a plurality of clients, wherein,
the server, comprising: the first sending module is used for sending the initial training model to a plurality of clients; the receiving module is used for receiving a plurality of sets of first model parameters sent by the plurality of clients; the calculation module is used for carrying out weighted average on the plurality of sets of first model parameters to obtain second model parameters; a second sending module, configured to send the second model parameter to the multiple clients;
the plurality of clients, each in individual communication with the server, include: a receiving module, configured to receive the initial training model; the first training module is used for training the initial training model according to first medical data of a local database to obtain a first training model; a sending module, configured to send a first model parameter of the first training model to the server; and the second training module is used for constructing a second training model according to the second model parameters and training second medical data of the local database by using the second training model.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 5 when executing the computer program.
10. A computer storage medium on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 5.
CN201910894089.3A 2019-09-20 2019-09-20 Data training method, device and system Pending CN110795477A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910894089.3A CN110795477A (en) 2019-09-20 2019-09-20 Data training method, device and system
PCT/CN2019/118407 WO2021051610A1 (en) 2019-09-20 2019-11-14 Data training method, apparatus and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910894089.3A CN110795477A (en) 2019-09-20 2019-09-20 Data training method, device and system

Publications (1)

Publication Number Publication Date
CN110795477A true CN110795477A (en) 2020-02-14

Family

ID=69438611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910894089.3A Pending CN110795477A (en) 2019-09-20 2019-09-20 Data training method, device and system

Country Status (2)

Country Link
CN (1) CN110795477A (en)
WO (1) WO2021051610A1 (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871160A (en) * 2016-09-26 2018-04-03 谷歌公司 Communicate efficient joint study
CN109716346A (en) * 2016-07-18 2019-05-03 河谷生物组学有限责任公司 Distributed machines learning system, device and method
US20190227980A1 (en) * 2018-01-22 2019-07-25 Google Llc Training User-Level Differentially Private Machine-Learned Models


Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377931A (en) * 2020-03-09 2021-09-10 香港理工大学深圳研究院 Language model collaborative learning method, system and terminal of interactive robot
WO2021179196A1 (en) * 2020-03-11 2021-09-16 Oppo广东移动通信有限公司 Federated learning-based model training method, electronic device, and storage medium
CN113470806A (en) * 2020-03-31 2021-10-01 中移(成都)信息通信科技有限公司 Method, device and equipment for determining disease detection model and computer storage medium
CN113470806B (en) * 2020-03-31 2024-05-24 中移(成都)信息通信科技有限公司 Method, device, equipment and computer storage medium for determining disease detection model
CN111477336A (en) * 2020-04-07 2020-07-31 中南大学 Fusion method, system and storage medium for infectious disease diagnosis data
WO2022002068A1 (en) * 2020-06-29 2022-01-06 中兴通讯股份有限公司 Data processing method, system and device and storage medium
CN113988254B (en) * 2020-07-27 2023-07-14 腾讯科技(深圳)有限公司 Method and device for determining neural network model for multiple environments
CN113988254A (en) * 2020-07-27 2022-01-28 腾讯科技(深圳)有限公司 Method and device for determining neural network model for multiple environments
CN111935156B (en) * 2020-08-12 2022-06-14 科技谷(厦门)信息技术有限公司 Data privacy protection method for federated learning
CN111935156A (en) * 2020-08-12 2020-11-13 科技谷(厦门)信息技术有限公司 Data privacy protection method for federated learning
CN112001321B (en) * 2020-08-25 2024-06-14 商汤国际私人有限公司 Network training method, pedestrian re-identification method, device, electronic equipment and storage medium
CN112001321A (en) * 2020-08-25 2020-11-27 商汤国际私人有限公司 Network training method, pedestrian re-identification method, network training device, pedestrian re-identification device, electronic equipment and storage medium
CN112036504A (en) * 2020-09-15 2020-12-04 平安国际智慧城市科技股份有限公司 Temperature measurement model training method, device, equipment and storage medium
CN112259238A (en) * 2020-10-20 2021-01-22 平安科技(深圳)有限公司 Electronic device, disease type detection method, apparatus, and medium
CN112435755A (en) * 2020-11-23 2021-03-02 平安科技(深圳)有限公司 Disease analysis method, disease analysis device, electronic device, and storage medium
CN112446040A (en) * 2020-11-24 2021-03-05 平安科技(深圳)有限公司 Federal modeling method based on selective gradient update and related equipment
CN112434619A (en) * 2020-11-26 2021-03-02 新智数字科技有限公司 Case information extraction method, case information extraction device, case information extraction equipment and computer readable medium
CN112434620A (en) * 2020-11-26 2021-03-02 新智数字科技有限公司 Scene character recognition method, device, equipment and computer readable medium
CN112434619B (en) * 2020-11-26 2024-03-26 新奥新智科技有限公司 Case information extraction method, apparatus, device and computer readable medium
CN112434620B (en) * 2020-11-26 2024-03-01 新奥新智科技有限公司 Scene text recognition method, device, equipment and computer readable medium
WO2022111403A1 (en) * 2020-11-27 2022-06-02 华为技术有限公司 Machine learning method, device, and system
CN112617855A (en) * 2020-12-31 2021-04-09 平安科技(深圳)有限公司 Electrocardiogram analysis method and device based on federal learning and related equipment
CN112711621A (en) * 2021-01-18 2021-04-27 湛江市前程网络有限公司 Universal object interconnection training platform and control method and device
CN113298229A (en) * 2021-04-12 2021-08-24 云从科技集团股份有限公司 Federal learning model training method, client, server and storage medium
CN113449318B (en) * 2021-06-18 2024-03-19 北京明朝万达科技股份有限公司 Data classification model training method and device and data classification method and device
CN113449318A (en) * 2021-06-18 2021-09-28 北京明朝万达科技股份有限公司 Data classification model training method and device, and data classification method and device
CN113268727A (en) * 2021-07-19 2021-08-17 天聚地合(苏州)数据股份有限公司 Joint training model method, device and computer readable storage medium
CN113792759A (en) * 2021-08-19 2021-12-14 北京爱笔科技有限公司 Recognition model training method and device, computer equipment and storage medium
CN113792324B (en) * 2021-11-16 2022-04-05 聊城高新生物技术有限公司 Agricultural product data interaction method and device based on federal learning and electronic equipment
CN113792324A (en) * 2021-11-16 2021-12-14 聊城高新生物技术有限公司 Agricultural product data interaction method and device based on federal learning and electronic equipment
CN114742235A (en) * 2022-04-15 2022-07-12 中国电信股份有限公司 Federal learning method, data analysis model training method and device, and storage medium

Also Published As

Publication number Publication date
WO2021051610A1 (en) 2021-03-25

Similar Documents

Publication Publication Date Title
CN110795477A (en) Data training method, device and system
Rani et al. Federated learning for secure IoMT-applications in smart healthcare systems: A comprehensive review
Lim et al. Federated learning in mobile edge networks: A comprehensive survey
Ramzan et al. Healthcare applications using blockchain technology: Motivations and challenges
CN112712182B (en) Model training method and device based on federal learning and storage medium
Xue et al. A resource-constrained and privacy-preserving edge-computing-enabled clinical decision system: A federated reinforcement learning approach
Song et al. Networking systems of AI: On the convergence of computing and communications
Elmisery et al. A fog based middleware for automated compliance with OECD privacy principles in internet of healthcare things
CN111477290A (en) Federal learning and image classification method, system and terminal for protecting user privacy
EP3016011A1 (en) Method for privacy-preserving medical risk tests
CN110417558A (en) Verification method and device, the storage medium and electronic device of signature
CN112231756B (en) FL-EM-GMM medical user privacy protection method and system
CN113127916A (en) Data set processing method, data processing device and storage medium
US20170286618A1 (en) Support for linked individual care plans across multiple care providers
CN114530245A (en) Cloud edge coordination medical system based on edge calculation and federal learning
CN113947156A (en) Health crowd-sourcing perception system and federal learning method for cost optimization thereof
CN109102890A (en) Distributed medical detection service platform, system and its terminal
Pathak et al. Efficient compression sensing mechanism based WBAN system using blockchain
CN111833078A (en) Block chain based recommendation method, device, medium and electronic equipment
CN110610098A (en) Data set generation method and device
CN111429288A (en) User portrait construction method and device, computer equipment and storage medium
CN111370118A (en) Diagnosis and treatment safety analysis method and device for cross-medical institution and computer equipment
Nguyen et al. Intelligent tunicate swarm-optimization-algorithm-based lightweight security mechanism in internet of health things
CN117113113A (en) Data privacy protection method and system based on clustered federal learning algorithm
Velagapudi et al. FedDHr: Improved Adaptive Learning Strategy Using Federated Learning for Image Processing

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40023086

Country of ref document: HK

SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200214