CN112100459A - Search space generation method and device, electronic equipment and storage medium - Google Patents

Search space generation method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN112100459A
CN112100459A (application CN202011026459.0A)
Authority
CN
China
Prior art keywords
search space
model
model structure
target
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011026459.0A
Other languages
Chinese (zh)
Inventor
希滕
张刚
温圣召
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011026459.0A priority Critical patent/CN112100459A/en
Publication of CN112100459A publication Critical patent/CN112100459A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/907Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/909Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a search space generation method, a probability-distribution-based search method and apparatus, an electronic device, and a storage medium, relating to fields such as artificial intelligence, computer vision, deep learning, and intelligent cloud technology. The generation method is implemented as follows: obtain at least one search space according to at least one search space generation strategy; then iteratively update the at least one search space according to the probability distribution of search space performance, and obtain a target search space from the at least one search space trained from iterative updating to iterative convergence. With this method and apparatus, at least hardware performance such as processing speed and processing accuracy can be improved.

Description

Search space generation method and device, electronic equipment and storage medium
Technical Field
The application relates to the field of artificial intelligence. The application particularly relates to the fields of computer vision, deep learning, intelligent cloud technology and the like.
Background
In the field of information processing, obtaining an optimal processing effect requires hardware with better performance (such as a terminal, a server, or a chip thereof), or a hardware system built by combining multiple pieces of hardware, regardless of the type of information processed: text, multimedia information including audio or video, image information, or image frames extracted during video processing.
However, the related art provides no effective solution for improving hardware performance such as processing speed and processing accuracy.
Disclosure of Invention
The application provides a search space generation method, a probability-distribution-based search method and apparatus, an electronic device, and a storage medium.
According to an aspect of the present application, there is provided a method for generating a search space, including:
obtaining at least one search space according to at least one search space generation strategy;
and for the at least one search space, performing iterative updating according to the probability distribution of the performance of the search space, and obtaining a target search space according to at least one search space obtained after training from iterative updating to iterative convergence.
According to another aspect of the present application, there is provided a search method based on probability distribution, including:
obtaining at least one search space according to at least one search space generation strategy;
training the at least one search space by iterative updating according to the probability distribution of search space performance, and, when the at least one search space has been iteratively updated to iterative convergence according to a first target parameter, ending the training to obtain at least one trained search space;
and responding to the first search operation, and searching the at least one trained search space to obtain a target search space.
According to another aspect of the present application, there is provided a search space generation apparatus, including:
the first processing module is used for obtaining at least one search space according to at least one generation strategy of the search space;
and the second processing module is used for performing iterative update on the at least one search space according to the probability distribution of the performance of the search space, and obtaining a target search space according to at least one search space obtained after training from iterative update to iterative convergence.
According to another aspect of the present application, there is provided a probability distribution-based search apparatus, including:
the first search module is used for obtaining at least one search space according to at least one generation strategy of the search space;
the second search module is used for training the at least one search space by iterative updating according to the probability distribution of search space performance, and, when the at least one search space has been iteratively updated to iterative convergence according to the first target parameter, ending the training to obtain at least one trained search space;
and the third searching module is used for responding to the first searching operation and searching at least one trained searching space to obtain a target searching space.
According to another aspect of the present application, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method as provided by any one of the embodiments of the present application.
According to another aspect of the present application, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method provided by any one of the embodiments of the present application.
With the method and apparatus of the present application, at least one search space can be obtained according to at least one search space generation strategy; the at least one search space is iteratively updated according to the probability distribution of search space performance; the best-performing search space is obtained from the at least one search space trained from iterative updating to iterative convergence; and that space is taken as the target search space. Because the target search space is, after distribution-based iterative update training, the best-performing space among the possibilities of the multiple (i.e. at least one) search spaces, the model structure subsequently found by searching within it also performs best. Applying the target search space and the searched model structure to scenarios such as image processing (e.g. image classification, image recognition, and image detection) can therefore improve hardware performance such as processing speed and processing accuracy in those scenarios. Moreover, as hardware performance improves, the amount of hardware used can be reduced; for example, the same hardware performance as in the prior art can be achieved with less hardware, thereby reducing hardware cost.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
fig. 1 is a schematic flow chart of a method for generating a search space according to an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of a search method based on probability distribution according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a generating apparatus of a search space according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a structure of a search apparatus based on probability distribution according to an embodiment of the present application;
fig. 5 is a block diagram of an electronic device for implementing a search space generation method or a probability distribution-based search method according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details should be considered exemplary only. Those of ordinary skill in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are omitted below for clarity and conciseness.
The term "and/or" herein merely describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C. The terms "first" and "second" are used to refer to and distinguish between similar objects, and do not necessarily imply an order, a sequence, or a limit of two; a "first" item and a "second" item may each be one or more.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present application.
With the development of artificial intelligence and deep learning, a neural network can be trained against an index of hardware performance so that, when deployed on hardware, the trained network meets the expected index. The quality of the neural network structure is crucial: it strongly influences the hardware performance ultimately achieved when a model based on that structure is loaded onto hardware. Manually designing a network topology requires very rich experience and many attempts, and the many parameters involved produce a combinatorial explosion that makes random search practically infeasible; consequently, the recently emerging Neural Architecture Search (NAS) technology has become a research hotspot.
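The combinatorial explosion mentioned above is easy to quantify: independent per-layer choices compound multiplicatively. The layer count and per-layer option count below are illustrative assumptions, not figures from the application:

```python
# Size of a toy search space: independent choices per layer compound multiplicatively.
num_layers = 20
ops_per_layer = 8        # e.g. kernel sizes x expansion ratios (illustrative)
space_size = ops_per_layer ** num_layers
print(space_size)        # on the order of 1e18 candidate structures
```

At this scale, evaluating even a tiny fraction of candidates by random search is impractical, which is what motivates guided search over the space.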
In NAS, the search space is very important. In conventional NAS work, the search space is a manually designed, fixed space, and the model structure is searched within it. Manual design has limitations: once determined, the search space is fixed and cannot be adjusted according to user requirements, and the number of manually designed search spaces is limited, so the search has an upper bound rather than more possibilities. Even the best model structure found within a fixed search space may still perform poorly if, owing to differences in designers' experience and understanding, the space itself is unsuitable; it is then difficult to evaluate whether a structure found in that space is truly optimal.
With the method and apparatus of the present application, at least one search space can be obtained according to at least one search space generation strategy and then trained by iterative updating. Because the search space is not set manually but trained autonomously, search spaces with multiple possibilities can be obtained, and the search space becomes diverse and non-deterministic. Moreover, the search process no longer starts with the model structure: a search space with optimal performance is first found among the at least one iteratively trained search space and taken as the target search space, and the model structure is then searched within that target search space. In this way, not only can the optimal model structure be found in the target search space, but users can also design better model structures on that basis. When the optimal model structure is deployed on the corresponding hardware, the expected optimal hardware performance can be reached, such as optimal processing speed and processing accuracy. Meanwhile, as hardware performance improves, the amount of hardware used can be reduced; for example, the same hardware performance as before can be achieved with less hardware, thereby reducing hardware cost.
The application can be applied in fields such as artificial intelligence (AI), deep learning, cloud computing, and image processing, as well as in products such as PaddleSlim for model compression, PaddleCloud for cloud computing, EasyDL for image recognition, and AI applets. Paddle is the name of a deep learning framework, short for Parallel Distributed Deep Learning; model-structure training suitable for various application scenarios can be deployed on top of it. Besides quantization, PaddleSlim integrates pruning, distillation, model structure search, model-hardware search, and other model-compression functions. PaddleCloud fits the positioning trend of "cloud computing + big data + artificial intelligence" and can deploy a required model in the cloud and share a large amount of computing logic. EasyDL is a customized image recognition platform that provides SDK or API services based on the generated model structure. AI applets implement applications for different scenarios; through them, techniques that simulate and extend human intelligence can be delivered via a desired model, with a focus on user interaction.
According to an embodiment of the present application, a method for generating a search space is provided. Fig. 1 is a flowchart of the method, which may be applied to a search space generation apparatus. The apparatus may be deployed in a terminal, a server, or other processing device, and used in image processing scenarios such as image classification, image recognition, and image detection, as well as video processing scenarios in which frames extracted from a video are classified, recognized, or detected. The terminal may be user equipment (UE), a mobile device, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and so on. In some implementations, the method may also be implemented by a processor calling computer-readable instructions stored in a memory. As shown in Fig. 1, the method includes:
s101, obtaining at least one search space according to at least one search space generation strategy.
In one example, the generation strategy of the search space includes any one or more of the following:
1) generating the search space according to the type of the middle layer of the selected training model and the number of the middle layers of the selected training model;
2) generating the search space according to the type of the layer in the selected training model;
3) generating the search space according to the convolution attribute of the middle layer of the selected training model and the number of the middle layers of the selected training model;
4) generating the search space according to a topology of a layer in a selected training model; wherein the topology comprises a single-branch topology or a multi-branch topology.
In one example, the at least one search space may be obtained by a search space generator and at least one search space generation strategy.
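The four generation strategies above can be sketched as a small configuration schema fed to such a generator. The class and field names below are illustrative assumptions, not terms from the application:

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative schema for the four generation strategies; names are assumptions.
@dataclass
class SearchSpaceSpec:
    layer_type: Optional[str] = None      # strategies 1/2: e.g. "residual_block"
    num_layers: Optional[int] = None      # None = layer count unconstrained
    conv_attribute: Optional[str] = None  # strategy 3: e.g. "depthwise_separable"
    topology: Optional[str] = None        # strategy 4: "single_branch" / "multi_branch"

def generate_search_spaces(specs: List[SearchSpaceSpec]) -> List[dict]:
    """Turn each strategy spec into a (toy) search space description."""
    # One dict per strategy, holding only the constrained fields.
    return [{k: v for k, v in vars(s).items() if v is not None} for s in specs]

# Strategy 1: residual blocks with a resnet50-like depth.
# Strategy 4: a single-branch topology, layer count unconstrained.
specs = [
    SearchSpaceSpec(layer_type="residual_block", num_layers=50),
    SearchSpaceSpec(topology="single_branch"),
]
print(generate_search_spaces(specs))
```

A real generator would expand each description into concrete candidate structures; the sketch only shows how the strategies parameterize the space.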
And S102, performing iterative updating on the at least one search space according to probability distribution of the performance of the search space, and obtaining a target search space according to at least one search space obtained after training from iterative updating to iterative convergence.
In an example of iterative updating, when the iteration count is 0, the at least one search space corresponds to the initial search space. As iterative updating continues — for example, until the iteration count reaches the value 100 preset in the iteration rule — the trained at least one search space, i.e. the search space obtained after iterative convergence, is produced, and the target search space is then found by searching among the trained search spaces.
In one example, when at least one search space is obtained through a search space generator and at least one generation strategy, if the number of updates of the generator has not reached the preset iteration rule, the at least one search space continues to be iteratively updated and trained; once the rule is satisfied, the iteration converges, training ends, and the target search space is obtained by searching. The preset iteration rule may be reaching a preset iteration count, for example 100 or 200: once that count is reached, the iteration converges and training ends. The rule may also be based on the target parameter: if, during iterative training, the corresponding performance has reached the target for a preset number of consecutive iterations, for example 50 or 100, without further improvement, the iteration converges and training ends.
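The two stopping rules just described — a fixed iteration budget, or a run of iterations without performance improvement — can be sketched as a single loop. The function name, patience threshold, and toy `train_step` are assumptions for illustration:

```python
def iterate_until_convergence(train_step, max_iters=100, patience=50, tol=1e-6):
    """Stop either at a fixed iteration count or after `patience`
    consecutive iterations without performance improvement."""
    best, stalled, n_done = float("-inf"), 0, 0
    for i in range(max_iters):                 # fixed-count rule
        perf = train_step(i)                   # one iterative update; returns performance
        n_done += 1
        if perf > best + tol:
            best, stalled = perf, 0
        else:
            stalled += 1
        if stalled >= patience:                # no-improvement rule
            break
    return best, n_done

# Toy train_step whose performance saturates after 10 iterations.
best, n_iters = iterate_until_convergence(lambda i: min(i, 10) / 10.0,
                                          max_iters=100, patience=5)
print(best, n_iters)                           # stops well before max_iters
```

In the patent's setting, `train_step` would stand for one update of the search space generator, and `perf` for the evaluated performance of the spaces it produces.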
In the related art, many model structures exist within a manually designed search space (the space is fixed and offers no further possibilities), and the structure with the best hardware performance, such as processing speed and processing accuracy, is found by searching within it. Since the optimal structure can only be searched within a limited number of such spaces, further possibilities for the search space are foreclosed. Taking MnasNet as an example, the search is limited to the number of channels, expansion coefficients, and the like, yielding what is in effect a mobilenet_v2-like structure. That is, searching within a fixed space limits, on the one hand, the upper bound of the model structures that can be found; on the other hand, in an unsuitable search space, even the best structure that can be found may still perform poorly.
With the method and apparatus of the present application, at least one search space can be obtained according to at least one generation strategy; iterative updating is performed according to the probability distribution of search space performance; the best-performing search space is obtained once iterative updating has converged; and that space is taken as the target search space. Because the search space is generated automatically from the definition of each possibility — that is, it has multiple possibilities and is non-deterministic — training can be repeated continuously during generation, and the iterative training can be driven by the probability distribution. For example, all search spaces produced during iterative training can be conditionally sampled according to the probability distribution of their performance, yielding at least one sampled search space per round of training, and the optimal space is then searched from the sampled spaces. After this iteratively updated training, the target search space is the best-performing space among the possibilities of the multiple (i.e. at least one) search spaces, and the distribution-based iterative training finds it faster and more accurately.
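The conditional sampling step — drawing candidate search spaces with probability tied to their measured performance — can be sketched as follows. A softmax-weighted sampler is one common choice; it is an assumption here, not something the application specifies:

```python
import math
import random

def sample_by_performance(candidates, performances, k, temperature=1.0, rng=None):
    """Sample k candidate search spaces, weighting each candidate by
    softmax(performance / temperature)."""
    rng = rng or random.Random(0)
    m = max(performances)                                  # for numerical stability
    weights = [math.exp((p - m) / temperature) for p in performances]
    return rng.choices(candidates, weights=weights, k=k)

spaces = ["space_A", "space_B", "space_C"]
perfs = [0.60, 0.90, 0.30]          # measured performance of each space
sampled = sample_by_performance(spaces, perfs, k=5)
print(sampled)                       # better-performing spaces dominate the sample
```

Lowering `temperature` concentrates the samples on the best-performing spaces; raising it keeps more exploration across the possibilities.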
Further, since the model structure subsequently found in the target search space also performs best, once the best-performing search space is obtained, a model structure with the best hardware performance, such as processing speed and processing accuracy, can be found by searching within it. Hardware performance in image processing scenarios (such as image classification, image recognition, and image detection) can thus be improved, and as hardware performance improves, the amount of hardware used can be reduced; for example, the same hardware performance as before can be achieved with less hardware, thereby reducing hardware cost.
As for the generation strategy of the search space, it may be a rule definition for the search space; in some examples, the possibilities covered by a space/set within the search space include the following:
1) Taking a layer (block) in the training model as an example: if the selected layer type in the search space is the residual block of a residual network, and the selected number of layers matches resnet50 (a convolutional network structure design), the generated search space can search for any resnet50-like model structure. A block may also be called a module of the training model; the name is not limiting, and any module or layer composing the training model falls within the scope of the present application.
2) Building on 1): if the search space does not limit the number of layers and only fixes the selected layer type as residual block, the generated search space can search for any resnet-like model structure.
3) Taking a layer (block) in the training model as an example: if the selected layer's convolution attribute in the search space is depthwise block (a layer type with the depthwise-separable convolution attribute), and the selected number of layers matches mobilenet_v2 (one design of depthwise-separable convolutional network structures), the generated search space can search for any mobilenet_v2-like model structure.
4) If the selected topology of the layers in the training model is darts, a single-branch topology, the generated search space can search for any darts-like model structure.
It should be noted that the search space may be obtained according to the generation strategies above or set manually; that is, the search space in the present application is not limited to a manually determined space, but also covers the possibilities of the various search spaces obtained by the generation strategies (not limited to those in the examples above). A determined search space and the possibilities obtained from the generation strategies may further be freely combined according to the hardware requirements (such as the algorithm logic of a functional module on a chip) of the user's design and application.
It should also be noted that a "space/set" is itself a search space. Because the search space in the present application is non-deterministic, has multiple possibilities, and can be continuously iteratively updated, there is more than one search space; for convenience of description, a "space/set" within a "search space" may be called a "sub-search space" relative to the broader "search space" itself.
According to an embodiment of the present application, a search method based on probability distribution is provided, and fig. 2 is a schematic flowchart of the search method based on probability distribution according to the embodiment of the present application, as shown in fig. 2, including:
s201, obtaining at least one search space according to at least one search space generation strategy.
In one example, the generation strategy of the search space includes any one or more of the following:
1) generating the search space according to the type of the middle layer of the selected training model and the number of the middle layers of the selected training model;
2) generating the search space according to the type of the layer in the selected training model;
3) generating the search space according to the convolution attribute of the middle layer of the selected training model and the number of the middle layers of the selected training model;
4) generating the search space according to a topology of a layer in a selected training model; wherein the topology comprises a single-branch topology or a multi-branch topology.
In one example, the at least one search space may be obtained by a search space generator and at least one search space generation strategy.
S202, training the at least one search space in an iterative updating mode according to probability distribution of search space performance, and ending the training under the condition that the at least one search space is iteratively updated to iterative convergence according to the first target parameter to obtain the trained at least one search space.
In one example, the first target parameter may include a first hyperparameter used for search space performance evaluation, which measures the quality of search space performance, such as the average performance, performance median, or performance variance.
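The statistics named for this hyperparameter — average, median, and variance of the performance of structures drawn from a space — can be computed with the standard library. Summarizing a space by statistics over sampled structures is an illustrative assumption about how the evaluation is realized:

```python
import statistics

def evaluate_search_space(structure_performances):
    """Summarize a search space by statistics over the performance
    of model structures sampled from it."""
    return {
        "mean": statistics.mean(structure_performances),
        "median": statistics.median(structure_performances),
        "variance": statistics.pvariance(structure_performances),
    }

# Toy performances of five structures sampled from one candidate space.
stats = evaluate_search_space([0.70, 0.75, 0.80, 0.85, 0.90])
print(stats)
```

A high mean with low variance suggests a space whose structures are consistently strong; a high variance may indicate a space worth further iterative refinement.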
S203, responding to the first search operation, and searching in the trained at least one search space to obtain a target search space.
In an example, the target search space may be the optimal one among the trained at least one search space; that is, the space with the best hardware performance is selected from the many possibilities of the trained search spaces.
The method can also comprise the following steps:
s204, training at least one model structure in the target search space in an iterative updating mode according to probability distribution of model structure performance, and finishing the training under the condition that the at least one model structure is iteratively updated to iterative convergence according to a second target parameter to obtain at least one trained model structure.
In one example, the second target parameter includes a second hyperparameter used for model structure performance evaluation, which measures the quality of model structure performance, such as the average performance, performance median, or performance variance.
S205, in response to the second searching operation, searching the trained at least one model structure to obtain a target model structure.
In an example, in addition to obtaining the target search space in S203 through iterative training based on probability distribution, S204-S205 may further be performed. That is, after the optimal search space with the best hardware performance is selected from among the multiple possibilities of the trained at least one search space, a model structure is searched within that optimal search space: at least one model structure in the target search space is trained in an iterative updating manner according to the probability distribution of model structure performance, and the optimal model structure is obtained when the at least one model structure has been iteratively updated according to the second target parameter until the iteration converges; this optimal model structure is the target model structure. Since the optimal search space can be obtained through S203, and the optimal model structure can then be obtained through S204-S205 by iterative training based on probability distribution within that space, searching the optimal search space yields the preferred, optimal model structure.
By adopting the method and the device, at least one search space can be obtained according to at least one search space generation strategy, and the at least one search space is iteratively updated based on probability distribution; when the iterative updating reaches convergence, the search space with the optimal performance is obtained and taken as the target search space. In response to the first search operation, the trained at least one search space is searched to obtain the target search space. Further, at least one model structure in the target search space may also be iteratively updated based on probability distribution, the trained at least one model structure being obtained when the iterative updating reaches convergence; in response to the second search operation, the optimal model structure is obtained by searching among the trained at least one model structure, and this optimal model structure is the target model structure. In the present application, the search space is automatically generated according to the definition of each possibility, that is, the search space has multiple possibilities and is uncertain. Therefore, during the generation of the search space, iterative training may be performed continuously, with the iterative updating of the search space constrained by probability distribution (a conditional form), so as to search for the optimal search space among the continuously trained candidates. Compared with unconditional iterative updating, this is faster and more accurate: after the training of iterative updating, the target search space is the search space with the best performance among the multiple search-space possibilities (i.e., the at least one search space).
Then, a model structure can be searched from the target search space, and the performance of that model structure is also optimal. That is, after the search space with the optimal performance is obtained, the model structure with the best hardware performance (such as processing speed and processing precision) can be searched from the optimal search space, and the model structure can likewise be iteratively updated based on probability distribution (a conditional form constrained by the probability distribution), which is faster and more accurate than unconditional iterative updating. The target search space and the searched model structure can therefore be applied to image processing scenarios (such as image classification, image recognition and image detection), improving hardware performance such as processing speed and processing precision in those scenarios. As hardware performance improves, the amount of hardware used can be reduced; for example, the same hardware performance as before can be achieved with less hardware than before, thereby reducing hardware costs.
In an embodiment, the obtaining at least one search space according to at least one search space generation policy includes: the search space generator is initialized, for example, according to the subspace/subset combination in the search space. And obtaining the at least one search space according to the search space generator and the generation strategy of the at least one search space.
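As a rough illustration of obtaining at least one search space from a generator and generation strategies (the strategy names and the dictionary encoding below are assumptions for the sketch, not the application's actual generator), per-dimension strategies such as layer types, layer counts, and topology can be combined into candidate search spaces:

```python
import itertools

# Illustrative per-dimension strategies; each choice fixes one aspect
# of a candidate search space's definition.
LAYER_TYPE_SETS = [("conv3x3",), ("conv3x3", "conv5x5")]
LAYER_COUNT_SETS = [(4, 8), (8, 16)]
TOPOLOGIES = ("single-branch", "multi-branch")

def generate_search_spaces(type_sets, count_sets, topologies):
    """Enumerate candidate search spaces as the product of the strategies."""
    return [
        {"layer_types": t, "layer_counts": c, "topology": topo}
        for t, c, topo in itertools.product(type_sets, count_sets, topologies)
    ]

spaces = generate_search_spaces(LAYER_TYPE_SETS, LAYER_COUNT_SETS, TOPOLOGIES)
print(len(spaces))  # 2 * 2 * 2 = 8 candidate search spaces
```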
In an embodiment, training the at least one search space in an iterative update manner according to a probability distribution of search space performance, and ending the training when the at least one search space is iteratively updated according to a first target parameter until iteration converges to obtain the trained at least one search space, includes: modeling the at least one search space according to a probability distribution of the performance of the search space to obtain a first probability model (e.g., a probability model obtained by modeling the search space based on the probability distribution); taking a first hyper-parameter for the search space performance evaluation as the first target parameter; iteratively updating the first probability model in accordance with the first hyper-parameter to iteratively update the at least one search space based on the first probability model to an iterative convergence.
In an embodiment, training at least one model structure in the target search space in an iterative update manner according to probability distribution of model structure performance, and ending the training when the at least one model structure is iteratively updated to iterative convergence according to a second target parameter to obtain the trained at least one model structure, includes: modeling the at least one model structure according to the probability distribution of the model structure performance to obtain a second probability model (e.g., a probability model obtained by modeling the model structure based on the probability distribution); taking a second hyper-parameter for the model structure performance evaluation as the second target parameter; iteratively updating the second probabilistic model according to the second hyperparameter to iteratively update the at least one model structure based on the second probabilistic model to an iterative convergence.
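The pattern shared by both embodiments — model performance with a probability model, then iteratively update its hyper-parameter from samples until convergence — can be sketched with a toy Gaussian model (a cross-entropy-method-style loop; the performance function and all constants are assumptions for illustration only):

```python
import random
import statistics

def toy_performance(x):
    # Stand-in for a measured performance; the synthetic optimum is at x = 3.0.
    return -(x - 3.0) ** 2

def iterative_update(iterations=30, samples=32, top_k=8, seed=0):
    """Iterate a Gaussian probability model: sample candidates, keep the
    best-performing ones, and refit the hyper-parameter (mean, std)."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 2.0  # the hyper-parameter of the probability model
    for _ in range(iterations):
        cand = [rng.gauss(mu, sigma) for _ in range(samples)]
        elite = sorted(cand, key=toy_performance, reverse=True)[:top_k]
        mu = statistics.mean(elite)
        sigma = max(statistics.pstdev(elite), 1e-3)  # keep exploration alive
    return mu

print(iterative_update())  # converges near the optimum 3.0
```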
Based on any of the above embodiments, implementation modes and combinations thereof, the method further includes: the image to be processed can be obtained, and the image to be processed is input into a target model structure obtained according to the target search space search for image processing, so that a target image is obtained. Wherein the image processing comprises: and at least one of image classification, image recognition and image detection.
Application example:
the processing flow of the search model structure in the embodiment of the application comprises the following contents:
Firstly, a batch of data of target scenes (such as data of scenes for image classification, image recognition and image detection) is collected. The data can be annotated data obtained from a database, or the required annotated data can be obtained by having annotators label it, with each annotator labeling the data according to his or her own subjective judgment. The plurality of annotators can also be scored, and the data annotation result obtained according to a final annotation score, which may be the average over the plurality of annotators; the required annotated data finally obtained can be used in the following training process of searching the search space and the model structure.
Secondly, a search space is generated and a model structure is searched in the search space based on probability distribution.
1. Model the performance of the search space with a first probability model, aimed at predicting the average performance of any search space by means of the first probability model, wherein the first hyper-parameter in the first probability model may be initialized randomly and updated step by step as search spaces are sampled based on the first probability model.
2. Iteratively update the at least one search space according to the first probability model until the iteration converges, obtaining the trained at least one search space, so as to predict the performance of enough search spaces.
3. Select the optimal search space from the trained at least one search space predicted in step 2 and take it as the target search space.
4. Model the performance of the model structures in the target search space of step 3 with a second probability model, wherein the second hyper-parameter in the second probability model can be randomly initialized and updated step by step as model structures are sampled based on the second probability model.
5. Iteratively update the at least one model structure according to the second probability model until the iteration converges, obtaining the trained at least one model structure, so as to predict the performance of enough model structures.
6. Select the top-k model structures by sampling from the model structures in the search space of step 3, train them, and record their performance, where k is the number of samples used for sampling model structures.
7. Update the second hyper-parameter in the second probability model according to the performance of the top-k model structures sampled in step 6.
8. If the number of updates of the second hyper-parameter has not reached the preset iteration rule, return to step 5.
9. Update the first hyper-parameter in the first probability model based on the integrated performance (e.g., average performance, median performance) of the model structures in the sampled search space.
10. If the number of updates of the first hyper-parameter has not reached the preset iteration rule, return to step 2.
11. Output the optimal search space.
12. Output the optimal model structure in the optimal search space.
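Steps 1-12 above can be sketched end to end as a nested loop: an outer loop over search spaces (first probability model) and an inner loop over model structures within a space (second probability model). Everything below is a toy stand-in, not the application's implementation: a search space is reduced to an interval on the real line, "training" a structure is a synthetic score, and all constants are assumptions.

```python
import random
import statistics

def measure(structure):
    # Step 6 stand-in: "train" a sampled model structure and record its
    # performance (higher is better; the synthetic optimum is at 2.0).
    return -(structure - 2.0) ** 2

def search_in_space(center, rng, iters=15, samples=16, top_k=4):
    """Inner loop (steps 4-8): iterate the second probability model over
    model structures constrained to the space [center - 1, center + 1]."""
    mu, sigma = center, 0.5                       # second hyper-parameter
    for _ in range(iters):
        cand = [min(max(rng.gauss(mu, sigma), center - 1.0), center + 1.0)
                for _ in range(samples)]
        elite = sorted(cand, key=measure, reverse=True)[:top_k]  # step 6
        mu = statistics.mean(elite)                              # step 7
        sigma = max(statistics.pstdev(elite), 1e-3)
    return mu, measure(mu)

def search_spaces(rng, iters=12, samples=12, top_k=4):
    """Outer loop (steps 2-3 and 9-11): iterate the first probability model
    over search spaces, scored by the performance of their structures."""
    mu, sigma = 0.0, 2.0                          # first hyper-parameter
    for _ in range(iters):
        centers = [rng.gauss(mu, sigma) for _ in range(samples)]
        elite = sorted(centers, key=lambda c: search_in_space(c, rng)[1],
                       reverse=True)[:top_k]                     # step 9
        mu = statistics.mean(elite)                              # step 10
        sigma = max(statistics.pstdev(elite), 1e-3)
    best_space = mu                               # step 11
    best_structure, _ = search_in_space(best_space, rng)         # step 12
    return best_space, best_structure

space, structure = search_spaces(random.Random(0))
print(structure)  # the searched structure lands near the synthetic optimum 2.0
```

The design point illustrated is that the outer loop never trains structures directly: it only scores each candidate space by the integrated performance the inner loop achieves inside it, mirroring steps 9-10.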
By adopting this application example, the search process does not search for a model structure within a manually specified search space; instead, it searches among multiple possible search spaces to obtain the optimal search space, and then searches for the optimal model structure within that optimal search space. In this way, the optimal model structure can be found and users can be guided to design model structures better. Hardware performance of the model on specific hardware, such as processing speed and processing precision, is improved, and the hardware cost of the product can be reduced.
According to an embodiment of the present application, there is provided a device for generating a search space, fig. 3 is a schematic structural diagram of a device for generating a search space according to an embodiment of the present application, and as shown in fig. 3, the device includes a first processing module 31, configured to obtain at least one search space according to at least one generation strategy of the search space; the second processing module 32 is configured to perform iterative update on the at least one search space according to the probability distribution of the performance of the search space, and obtain a target search space according to the trained at least one search space obtained after iterative update to iterative convergence.
In one embodiment, the generation strategy of the search space includes any one or more of the following strategies:
generating the search space according to the type of the middle layer of the selected training model and the number of the middle layers of the selected training model;
generating the search space according to the type of the layer in the selected training model;
generating the search space according to the convolution attribute of the middle layer of the selected training model and the number of the middle layers of the selected training model;
generating the search space according to a topology of a layer in a selected training model; wherein the topology comprises a single-branch topology or a multi-branch topology.
According to an embodiment of the present application, a probability distribution-based search apparatus is provided, fig. 4 is a schematic diagram of a composition structure of the probability distribution-based search apparatus according to the embodiment of the present application, as shown in fig. 4, the apparatus includes a first search module 41, configured to obtain at least one search space according to at least one generation strategy of the search space; the second search module 42 is configured to train the at least one search space in an iterative update manner according to the probability distribution of the performance of the search space, and end the training when the at least one search space is iteratively updated to iterative convergence according to the first target parameter, so as to obtain at least one trained search space; and a third searching module 43, configured to search for a target search space in the trained at least one search space in response to the first searching operation.
In one embodiment, the first target parameter includes: a first hyper-parameter for the search space performance evaluation.
In an embodiment, the system further includes a fourth search module, configured to train at least one model structure in the target search space in an iterative update manner according to probability distribution of model structure performance, and end the training to obtain at least one trained model structure under a condition that the at least one model structure is iteratively updated to iterative convergence according to a second target parameter; and responding to a second searching operation, and searching the trained at least one model structure to obtain a target model structure.
In one embodiment, the second target parameter includes: a second hyper-parameter for performance evaluation of the model structure.
In one embodiment, the second search module is configured to model the at least one search space according to a probability distribution of search space performance to obtain a first probability model; taking a first hyper-parameter for the search space performance evaluation as the first target parameter; iteratively updating the first probability model in accordance with the first hyper-parameter to iteratively update the at least one search space based on the first probability model to an iterative convergence.
In one embodiment, the fourth searching module is configured to perform modeling on the at least one model structure according to probability distribution of model structure performance to obtain a second probability model; taking a second hyper-parameter for the model structure performance evaluation as the second target parameter; iteratively updating the second probabilistic model according to the second hyperparameter to iteratively update the at least one model structure based on the second probabilistic model to an iterative convergence.
Based on any of the embodiments, implementation modes and combinations thereof of the present application, the apparatus further includes an image processing module, configured to acquire an image to be processed, and to input the image to be processed into a target model structure obtained by searching according to the target search space for image processing, so as to obtain a target image. Wherein the image processing comprises: at least one of image classification, image recognition and image detection.
The functions of each module in each apparatus in the embodiment of the present application may refer to corresponding descriptions in the above method, and are not described herein again.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device for implementing the method for generating a search space or the method for searching based on probability distribution according to the embodiment of the present application. The electronic device may be the aforementioned deployment device or proxy device. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 5, the electronic apparatus includes: one or more processors 801, a memory 802, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 801 is taken as an example.
The memory 802 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform a method of generating a search space or a method of searching based on probability distribution as provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the generation method of the search space or the probability distribution-based search method provided herein.
The memory 802 may be used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the generation method of the search space or the probability distribution based search method in the embodiment of the present application (for example, modules of the first processing module, the second processing module, and the like in the generation apparatus of the search space shown in fig. 3; modules of the first search module, the second search module, the third search module, and the like in the search apparatus shown in fig. 4). The processor 801 executes various functional applications of the server and data processing, i.e., implements the generation method of the search space or the probability distribution-based search method in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 802.
The memory 802 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 802 may include high speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 802 optionally includes memory located remotely from the processor 801, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the search space generation method or the probability distribution-based search method may further include: an input device 803 and an output device 804. The processor 801, the memory 802, the input device 803, and the output device 804 may be connected by a bus or other means, as exemplified by the bus connection in fig. 5.
The input device 803 may receive input numeric or character information and generate key signal inputs related to user settings and function controls of the electronic device, such as a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output devices 804 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
By adopting the method and the device, at least one search space can be obtained according to at least one search space generation strategy, iterative updating is carried out on the at least one search space according to probability distribution of search space performance, a search space with optimal performance is obtained according to the trained at least one search space obtained from iterative updating to iterative convergence, and the search space with optimal performance is used as a target search space. Because the target search space is a search space with the best performance among the possibilities of a plurality of search spaces (i.e. at least one search space) through iterative update training based on probability distribution, the model structure obtained by searching in the subsequent target search space also has the best performance, so that the target search space and the model structure obtained by searching are applied to scenes such as image processing (such as image classification, image recognition and image detection), the hardware performance such as the processing speed and the processing precision of hardware in scenes such as image processing can be improved, and the use amount of hardware can be reduced along with the improvement of the hardware performance, for example, the same hardware performance as the prior art can be achieved by using less hardware, thereby reducing the hardware cost.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved; this is not limited herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (20)

1. A method of generating a search space, the method comprising:
obtaining at least one search space according to at least one search space generation strategy;
and for the at least one search space, performing iterative updating according to the probability distribution of the performance of the search space, and obtaining a target search space according to at least one search space obtained after training from iterative updating to iterative convergence.
2. The method of claim 1, wherein the generation strategy of the search space comprises any one or more of the following:
generating the search space according to the type of the middle layer of the selected training model and the number of the middle layers of the selected training model;
generating the search space according to the type of the layer in the selected training model;
generating the search space according to the convolution attribute of the middle layer of the selected training model and the number of the middle layers of the selected training model;
generating the search space according to a topology of a layer in a selected training model; wherein the topology comprises a single-branch topology or a multi-branch topology.
3. A probability distribution-based search method, the method comprising:
obtaining at least one search space according to at least one search space generation strategy;
training the at least one search space in an iterative updating mode according to the probability distribution of the performance of the search space, and under the condition that the at least one search space is iteratively updated to iterative convergence according to the first target parameter, finishing the training to obtain at least one trained search space;
and responding to the first search operation, and searching the at least one trained search space to obtain a target search space.
4. The method of claim 3, wherein the first target parameter comprises: a first hyper-parameter for the search space performance evaluation.
5. The method of claim 3, further comprising:
training at least one model structure in the target search space in an iterative updating mode according to probability distribution of model structure performance, and under the condition that the at least one model structure is iteratively updated to iterative convergence according to a second target parameter, finishing the training to obtain at least one trained model structure;
and responding to a second searching operation, and searching the trained at least one model structure to obtain a target model structure.
6. The method of claim 5, wherein the second target parameter comprises: a second hyper-parameter for performance evaluation of the model structure.
7. The method according to claim 3, wherein the training of the at least one search space according to the probability distribution of the performance of the search space in an iterative updating manner, and in a case that the at least one search space is iteratively updated according to the first target parameter until iteration converges, ending the training to obtain the trained at least one search space comprises:
modeling the at least one search space according to probability distribution of the performance of the search space to obtain a first probability model;
taking a first hyper-parameter for the search space performance evaluation as the first target parameter;
iteratively updating the first probability model in accordance with the first hyper-parameter to iteratively update the at least one search space based on the first probability model to an iterative convergence.
8. The method according to claim 5, wherein the training at least one model structure in the target search space in an iterative updating manner according to a probability distribution of model structure performance, and in a case that the at least one model structure is iteratively updated according to a second target parameter until iteration converges, ending the training to obtain the trained at least one model structure comprises:
modeling the at least one model structure according to the probability distribution of the model structure performance to obtain a second probability model;
taking a second hyper-parameter for the model structure performance evaluation as the second target parameter;
iteratively updating the second probabilistic model according to the second hyperparameter to iteratively update the at least one model structure based on the second probabilistic model to an iterative convergence.
9. The method according to any one of claims 3-8, further comprising:
acquiring an image to be processed;
inputting the image to be processed into a target model structure, obtained by searching the target search space, to perform image processing and obtain a target image;
wherein the image processing comprises at least one of image classification, image recognition, and image detection.
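As an illustration of the image-processing pipeline of claim 9, the sketch below feeds an image to a searched model. `target_model` is a hypothetical stand-in (a simple thresholding toy over a grayscale image given as nested lists), not a real searched structure; only the task names mirror the claim.

```python
def target_model(image, task="image classification"):
    # Hypothetical stand-in for a model structure obtained by
    # searching the target search space; `image` holds 0-255
    # grayscale intensities as nested lists.
    pixels = [p for row in image for p in row]
    if task == "image classification":
        # Classify the whole image by mean intensity.
        return "bright" if sum(pixels) / len(pixels) > 127 else "dark"
    if task == "image detection":
        # Return a binary mask marking above-threshold pixels.
        return [[1 if p > 127 else 0 for p in row] for row in image]
    raise ValueError("unsupported task: " + task)

image = [[200, 130], [220, 140]]          # the "image to be processed"
label = target_model(image)               # image classification
mask = target_model(image, task="image detection")
```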
10. An apparatus for generating a search space, the apparatus comprising:
a first processing module, configured to obtain at least one search space according to at least one generation strategy for the search space; and
a second processing module, configured to iteratively update the at least one search space according to the probability distribution of search space performance, and to obtain a target search space from the at least one search space obtained after training from iterative updating to iterative convergence.
11. The apparatus of claim 10, wherein the generation strategy for the search space comprises any one or more of the following:
generating the search space according to the type and the number of intermediate layers of a selected training model;
generating the search space according to the type of the layers in a selected training model;
generating the search space according to the convolution attributes and the number of intermediate layers of a selected training model; and
generating the search space according to the topology of the layers in a selected training model, wherein the topology comprises a single-branch topology or a multi-branch topology.
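The generation strategies enumerated in claim 11 can be sketched as enumerations over layer configurations. The strategy names and candidate values below (layer types, depths, kernel sizes, topologies) are hypothetical illustrations, not drawn from the patent.

```python
from itertools import product

def generate_search_space(strategy):
    layer_types = ["conv", "pool", "fc"]            # types of layers
    depths = [2, 3]                                 # numbers of intermediate layers
    kernel_sizes = [3, 5]                           # a convolution attribute
    topologies = ["single-branch", "multi-branch"]  # layer topology
    if strategy == "by_type_and_depth":
        # Every sequence of layer types at every allowed depth.
        return [list(c) for d in depths for c in product(layer_types, repeat=d)]
    if strategy == "by_conv_attribute_and_depth":
        # Every sequence of kernel sizes at every allowed depth.
        return [list(c) for d in depths for c in product(kernel_sizes, repeat=d)]
    if strategy == "by_topology":
        return [[t] for t in topologies]
    raise ValueError("unknown strategy: " + strategy)

space = generate_search_space("by_type_and_depth")   # 3**2 + 3**3 = 36 candidates
```

Combining several such strategies yields the "at least one search space" that the later probability-model training operates on.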
12. A probability distribution-based search apparatus, the apparatus comprising:
a first search module, configured to obtain at least one search space according to at least one generation strategy for the search space;
a second search module, configured to train the at least one search space in an iterative updating manner according to the probability distribution of search space performance, and to end the training to obtain at least one trained search space when the at least one search space has been iteratively updated to iterative convergence according to the first target parameter; and
a third search module, configured to search, in response to a first search operation, the at least one trained search space to obtain a target search space.
13. The apparatus of claim 12, wherein the first target parameter comprises: a first hyper-parameter for the search space performance evaluation.
14. The apparatus of claim 12, further comprising a fourth search module configured to:
train at least one model structure in the target search space in an iterative updating manner according to the probability distribution of model structure performance, and end the training to obtain at least one trained model structure when the at least one model structure has been iteratively updated to iterative convergence according to a second target parameter; and
search, in response to a second search operation, the at least one trained model structure to obtain a target model structure.
15. The apparatus of claim 14, wherein the second target parameter comprises: a second hyper-parameter for performance evaluation of the model structure.
16. The apparatus of claim 12, wherein the second search module is configured to:
model the at least one search space according to the probability distribution of search space performance to obtain a first probability model;
take a first hyper-parameter for search space performance evaluation as the first target parameter; and
iteratively update the first probability model according to the first hyper-parameter, so as to iteratively update the at least one search space based on the first probability model until the iteration converges.
17. The apparatus of claim 14, wherein the fourth search module is configured to:
model the at least one model structure according to the probability distribution of model structure performance to obtain a second probability model;
take a second hyper-parameter for model structure performance evaluation as the second target parameter; and
iteratively update the second probability model according to the second hyper-parameter, so as to iteratively update the at least one model structure based on the second probability model until the iteration converges.
18. The apparatus of any one of claims 12-17, further comprising an image processing module configured to:
acquire an image to be processed;
input the image to be processed into a target model structure, obtained by searching the target search space, to perform image processing and obtain a target image;
wherein the image processing comprises at least one of image classification, image recognition, and image detection.
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
20. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-9.
CN202011026459.0A 2020-09-25 2020-09-25 Search space generation method and device, electronic equipment and storage medium Withdrawn CN112100459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011026459.0A CN112100459A (en) 2020-09-25 2020-09-25 Search space generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011026459.0A CN112100459A (en) 2020-09-25 2020-09-25 Search space generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112100459A true CN112100459A (en) 2020-12-18

Family

ID=73755491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011026459.0A Withdrawn CN112100459A (en) 2020-09-25 2020-09-25 Search space generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112100459A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294967A (en) * 2022-05-18 2022-11-04 国网浙江省电力有限公司营销服务中心 Full-automatic construction method of learning model search space suitable for speech classification


Similar Documents

Publication Publication Date Title
KR102484617B1 (en) Method and apparatus for generating model for representing heterogeneous graph node, electronic device, storage medium and program
CN111582453B (en) Method and device for generating neural network model
CN110795569B (en) Method, device and equipment for generating vector representation of knowledge graph
CN111667056B (en) Method and apparatus for searching model structures
CN111667057B (en) Method and apparatus for searching model structures
CN112102448B (en) Virtual object image display method, device, electronic equipment and storage medium
CN111582479B (en) Distillation method and device for neural network model
CN112241764A (en) Image recognition method and device, electronic equipment and storage medium
CN111695698B (en) Method, apparatus, electronic device, and readable storage medium for model distillation
CN110717340B (en) Recommendation method, recommendation device, electronic equipment and storage medium
JP7427627B2 (en) Video segment extraction method, video segment extraction apparatus, electronic device, computer readable storage medium and computer program
CN111177339B (en) Dialogue generation method and device, electronic equipment and storage medium
CN110706701B (en) Voice skill recommendation method, device, equipment and storage medium
CN112100466A (en) Method, device and equipment for generating search space and storage medium
CN111967591B (en) Automatic pruning method and device for neural network and electronic equipment
CN111652354B (en) Method, apparatus, device and storage medium for training super network
CN112580723B (en) Multi-model fusion method, device, electronic equipment and storage medium
CN110532404A (en) One provenance multimedia determines method, apparatus, equipment and storage medium
CN112100459A (en) Search space generation method and device, electronic equipment and storage medium
CN111680599B (en) Face recognition model processing method, device, equipment and storage medium
CN111553169B (en) Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111160552B (en) News information recommendation processing method, device, equipment and computer storage medium
CN112699314A (en) Hot event determination method and device, electronic equipment and storage medium
CN112100468A (en) Search space generation method and device, electronic equipment and storage medium
CN111914884A (en) Gradient descent tree generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20201218