CN117034133A - Data processing method, device, equipment and medium


Info

Publication number
CN117034133A
Authority
CN
China
Prior art keywords
data
attribute
dimension
description information
attribute data
Prior art date
Legal status
Pending
Application number
CN202211263524.0A
Other languages
Chinese (zh)
Inventor
王安然
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202211263524.0A
Publication of CN117034133A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06F 16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

Embodiments of the present application disclose a data processing method, apparatus, device, and medium. The method includes: acquiring first attribute data and second attribute data of an application program, where the first attribute data is attribute data carrying an annotation label and the second attribute data is attribute data without an annotation label; invoking a target classification model to identify the attribute data of each modal dimension in the first attribute data to obtain a reference application type of the application program, and determining a first difference between the reference application type and the annotated application type; acquiring enhancement data corresponding to the second attribute data, and invoking the target classification model to identify the second attribute data and the enhancement data separately, yielding two recognition types; and determining a second difference between the two recognition types, and training the target classification model according to the first difference and the second difference to obtain a trained target classification model. In this way, the robustness of the model is enhanced and its type recognition capability is improved.

Description

Data processing method, device, equipment and medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, device, and medium.
Background
With the rapid development of artificial intelligence, models built on neural networks are used to automate a wide range of tasks. In the classification field, automated classification of various objects (e.g., plants, animals, applications) with classification models has matured. Training the classification model is an essential step before it is applied, and the quality of training directly affects classification accuracy. To make a classification model more accurate, a large amount of annotated data can be used for training; however, in some application scenarios annotated sample data takes a long time to accumulate and annotation costs are high. A classification model can instead be trained with a small amount of annotated data, but models trained this way currently suffer from insufficient stability, and their robustness leaves room for improvement.
Disclosure of Invention
Embodiments of the present application provide a data processing method that can effectively enhance the robustness of a classification model and improve its ability to identify application types.
In one aspect, an embodiment of the present application provides a data processing method, including:
acquiring first attribute data and second attribute data of an application program; the first attribute data is attribute data that carries an annotation label and spans one or more modal dimensions, where the annotation label indicates the annotated application type of the application program; the second attribute data is attribute data without an annotation label;
invoking a target classification model to identify the attribute data of each modal dimension in the first attribute data to obtain a reference application type of the application program, and determining a first difference between the reference application type and the annotated application type;
acquiring enhancement data corresponding to the second attribute data, and invoking the target classification model to identify the second attribute data and the enhancement data separately, obtaining a recognition type of the application program based on the second attribute data and a recognition type of the application program based on the enhancement data;
determining a second difference between the recognition type based on the second attribute data and the recognition type based on the enhancement data, and training the target classification model according to the first difference and the second difference to obtain a trained target classification model; the trained target classification model is used to identify application types.
In one aspect, an embodiment of the present application provides a data processing apparatus, including:
an acquisition module, configured to acquire first attribute data and second attribute data of an application program; the first attribute data is attribute data that carries an annotation label and spans one or more modal dimensions, where the annotation label indicates the annotated application type of the application program; the second attribute data is attribute data without an annotation label;
a recognition module, configured to invoke a target classification model to identify the attribute data of each modal dimension in the first attribute data to obtain a reference application type of the application program;
a determining module, configured to determine a first difference between the reference application type and the annotated application type;
the acquisition module is further configured to acquire enhancement data corresponding to the second attribute data;
the recognition module is further configured to invoke the target classification model to identify the second attribute data and the enhancement data separately, obtaining a recognition type of the application program based on the second attribute data and a recognition type of the application program based on the enhancement data;
the determining module is further configured to determine a second difference between the recognition type based on the second attribute data and the recognition type based on the enhancement data;
a training module, configured to train the target classification model according to the first difference and the second difference to obtain a trained target classification model; the trained target classification model is used to identify application types.
Accordingly, an embodiment of the present application provides a data processing device, including: a processor, a memory, and a network interface; the processor is connected to the memory and the network interface, where the network interface provides network communication functions, the memory stores program code, and the processor invokes the program code to perform the data processing method of the embodiments of the present application.
Accordingly, embodiments of the present application provide a computer readable storage medium storing a computer program comprising program instructions that, when executed by a processor, perform a data processing method of embodiments of the present application.
In embodiments of the present application, first attribute data (carrying an annotation label) and second attribute data (without an annotation label) of an application program can be obtained, where the annotation label indicates the annotated application type of the application program, that is, its true class under a given classification taxonomy. The target classification model is invoked to identify the attribute data of each modal dimension in the first attribute data to obtain a reference application type of the application program; a first difference between the reference application type and the annotated application type can then be determined and used for supervised training of the target classification model, and the multi-modal attribute data strengthens the model's learning capacity so that training yields accurate type recognition. In addition, enhancement data obtained by applying data enhancement to the second attribute data can be introduced; the target classification model is invoked to identify the second attribute data and the enhancement data separately, producing two recognition types, and a second difference between them is determined. This second difference measures how much the classification result of the second attribute data changes before and after data enhancement, enabling unsupervised training of the target classification model and capturing how much the model's output jitters under small perturbations of the second attribute data, which strengthens the stability with which the model identifies the application type of the application program. Consequently, when the first difference and the second difference are combined for model training, supervised and unsupervised training are performed together: supervised training ensures the application type recognition capability of the trained target classification model, while unsupervised training effectively improves its anti-interference capability, further enhancing model robustness and making its type recognition of application programs more stable. After training is completed, the trained target classification model can therefore perform stable and accurate application type identification.
Drawings
FIG. 1 is a block diagram of a data processing system according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 3 is a flowchart of another data processing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of feature representation construction based on a pre-trained model according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating an overall method for classifying applications according to an embodiment of the present application;
FIG. 6a is a schematic diagram of a training process of a target classification model according to an embodiment of the present application;
FIG. 6b is a schematic diagram of a training process of another target classification model according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an application classification process according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a data processing device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application provide a data processing scheme in which a data processing device can acquire first attribute data and second attribute data of an application program. The first attribute data is attribute data that carries an annotation label and spans one or more modal dimensions, the second attribute data is attribute data without an annotation label, and the annotation label indicates the annotated application type of the application program, that is, its true type. The data processing device can then invoke a target classification model to identify the attribute data of each modal dimension in the first attribute data, obtain a reference application type of the application program, and determine a first difference between the reference application type and the annotated application type. Training the model with multi-modal attribute data strengthens its classification capability; the first difference measures how accurately the target classification model recognizes the application type and can be used for supervised training, improving the model's learning capability and recognition accuracy. In addition, the data processing device may apply data enhancement to the acquired second attribute data to obtain corresponding enhancement data, invoke the target classification model to identify the enhancement data to obtain a recognition type of the application program based on the enhancement data, and invoke the target classification model to identify the second attribute data to obtain a recognition type of the application program based on the second attribute data. The data processing device may then determine a second difference between these two recognition types, which measures how much the classification result of the second attribute data changes before and after data enhancement. Unsupervised training of the target classification model based on the second difference lets the model learn how its output jitters under slight changes to the input second attribute data and adjust accordingly, thereby strengthening the stability with which the target classification model identifies the application type of the application program.
After the first difference and the second difference are obtained, the data processing device can train the target classification model based on both. The two differences allow supervised and unsupervised training to be performed together: supervised training on the annotated attribute data ensures the recognition accuracy of the trained target classification model, while unsupervised training on the difference between the type recognition results of the second attribute data before and after data enhancement makes the model's type recognition under un-annotated second attribute data more stable, effectively improving the anti-interference capability of the trained target classification model and further enhancing its robustness. Once training finishes, the trained target classification model can be used to identify application types with an improved recognition effect.
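Written compactly, the combined objective described above can be sketched as follows (a non-limiting illustration rather than a formula given in this application), where $f_\theta$ denotes the target classification model, $x_l$ the first attribute data with annotated application type $y$, $x_u$ the second attribute data, $\tilde{x}_u$ its enhancement data, $\mathrm{CE}$ a cross-entropy term measuring the first difference, $D$ a divergence measuring the second difference, and $\lambda$ a balancing weight:

$$\mathcal{L}(\theta) = \mathrm{CE}\big(f_\theta(x_l),\, y\big) + \lambda\, D\big(f_\theta(x_u),\, f_\theta(\tilde{x}_u)\big)$$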
It can be understood that the application type accurately identified by the target classification model, obtained based on the attribute data of different application programs (applications, APPs), and the annotated application type indicated by the annotation label used during training belong to the same classification taxonomy. A taxonomy contains class information describing the application program from a particular dimension; for example, the game genres of a game APP (such as role-playing and shooting) form one taxonomy, while its art styles (such as American-style, cartoon, and fantasy) form another. Annotation labels differ across taxonomies, so different category requirements in different application scenarios can be met.
Artificial intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. Artificial intelligence is a comprehensive discipline involving a wide range of fields, covering both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning. The data processing scheme provided by the embodiments of the present application involves the computer vision (CV), speech technology, natural language processing (NLP), and machine learning/deep learning directions of artificial intelligence.
Computer vision (CV) is the science of how to make machines "see": replacing human eyes with cameras and computers to recognize and measure targets, and further performing graphics processing so that the result becomes an image more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theory and technology in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision techniques typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D techniques, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
The key technologies of speech technology are automatic speech recognition (ASR), text-to-speech synthesis (TTS), and voiceprint recognition. Enabling computers to listen, see, speak, and feel is the future direction of human-computer interaction, and voice is expected to become one of the most convenient interaction modes. Natural language processing (NLP) is an important direction in computer science and artificial intelligence; it studies theories and methods for effective communication between people and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics, so research in this field involves natural language, the language people use every day, and is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, and knowledge graph technologies.
Machine learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and demonstration-based learning. In the present application, image processing technology is involved when the attribute data of the application program includes image data of the image modal dimension, speech recognition technology is involved when the attribute data includes voice data of the voice modal dimension, and text processing technology is involved when the attribute data includes text data of the text modal dimension.
The scheme provided by the embodiments of the present application is applied to application classification scenarios. A small amount of annotated data for applications (APPs) can be used, while un-annotated data, and the enhancement data obtained by augmenting it, are introduced to train the target classification model jointly, yielding a classification model for identifying application types. This avoids requiring annotators to label a large number of APPs by downloading and trying each one. With the present application, automatic labeling can be achieved and labeling costs are greatly reduced: the trained target classification model can automatically assign a correct label to an application program from any of its attribute data (such as a profile, images, and so on). Automatic and accurate labeling of application programs also brings convenience to their operation; for example, when an application program is promoted, the advertisement features needed for promotion can be built automatically from the automatically assigned labels so as to reach the corresponding target audiences.
Based on the data processing scheme described above, an architecture diagram of a data processing system is shown in FIG. 1. The data processing system includes a database 101 and a data processing device 102. The database 101 may be a local or cloud database, and a private or public database. The data processing device 102 may be a server or a terminal device. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data services, but is not limited thereto. Terminal devices include, but are not limited to, smart phones, tablet computers, smart wearable devices, smart voice interaction devices, smart home appliances, computers, and vehicle-mounted terminals; the present application is not limited in this respect.
The database 101 may be configured to store attribute data of application programs, including attribute data with annotation labels and attribute data without annotation labels. An annotation label indicates the annotated application type of an application program, that is, its true class under a predefined taxonomy, and the attribute data represents the application attributes of the application program. The attribute data of an application may include, without limitation, a usage profile of the application, interface pictures of the application, and tags added to the application in an application store. The data processing device 102 may obtain the first attribute data and the second attribute data of an application program from the database 101 and, following the data processing scheme described above, train the target classification model with the first attribute data, the second attribute data, and the enhancement data corresponding to the second attribute data to obtain a trained target classification model. When the amount of first attribute data (annotated data) is too small, the second attribute data (un-annotated data) is introduced; in particular, on top of the usual supervised difference computation, the difference between the classification results of the second attribute data before and after data enhancement is added to the training, which effectively improves the accuracy and stability of the model's application type identification and makes the model more robust.
Referring to fig. 2, fig. 2 is a flow chart of a data processing method according to an embodiment of the application. The method may be performed by the data processing apparatus described above. The data processing method includes the following steps S201 to S204.
S201, acquiring first attribute data and second attribute data of an application program.
An application (APP) is a computer program with specific functions that can run on various computer devices (such as terminal devices and servers). Divided by source, application programs include third-party applications and native applications; divided by installation mode, they include applications that must be installed and installation-free applications; divided by access mode, they include web applications that can be accessed through a browser. An application program can provide an interactive interface for the using object, and data generated during the interaction, such as text data, voice data, and image data, can serve as attribute data of the application program. The attribute data of an application program is data describing its application attributes and can be used to characterize the application intuitively; application attributes are the basic characteristics of an application program, for example its functions and interface design. The present application can obtain attribute data of different application programs, such as the attribute data of application program A and of application program B, and process the attribute data of any application program according to the scheme described herein. The attribute data of an application can be used as sample data in the model training stage, with which the model is trained.
Depending on whether the attribute data carries an annotation label, the attribute data of an application program can be divided into first attribute data and second attribute data. The first attribute data is attribute data that carries an annotation label and spans one or more modal dimensions; the second attribute data is attribute data without an annotation label; the annotation label indicates the annotated application type of the application program. A modal dimension may also be called a modality: since data comes in various sources and forms, each source or form can be regarded as one modality for ease of distinction, for example text data in the text modal dimension, image data in the image modal dimension, and voice data in the voice modal dimension. The annotation label is an application type annotated for the application program under a predefined taxonomy; it may be annotated manually or by machine, and the annotated application type it indicates is the true class of the application program under that taxonomy. The second attribute data of the application program may likewise span one or more modal dimensions, only without the annotation label. Viewed from the modal dimensions, when the first attribute data and the second attribute data each include attribute data of multiple modal dimensions, they are multi-modal data, that is, data composed of different forms or sources. Viewed from the labeling status, the first attribute data is annotated attribute data and the second attribute data is un-annotated attribute data. The data processing device may acquire the first attribute data and the second attribute data of the application program from a database used to store attribute data of application programs.
For example, an APP store usually stores a profile of each APP, pictures of its internal interface design, and tags assigned in the store. These data are the key product information that APP developers most want to show and can serve as the raw information resources of the APP. Because they represent the basic attributes of the APP, such as its functions, design style, and usage experience, these raw information resources can be used as the attribute data of the application program, and the data processing device can obtain them easily, for example through retrieval by a crawler.
S202, invoking the target classification model to identify the attribute data of each modal dimension in the first attribute data to obtain a reference application type of the application program, and determining a first difference between the reference application type and the annotated application type.
After the first attribute data and the second attribute data of the application program are acquired, the data processing device can invoke the target classification model to identify the attribute data of each modal dimension contained in the first attribute data, obtaining a reference application type of the application program. The target classification model is a model with basic recognition capability; based on the processing steps involved in identifying the first attribute data, it may contain processing modules with corresponding functions, which may be pre-trained models, randomly initialized models, or both. A pre-trained model is a model that has been trained in advance on massive data (such as text corpus data) and can be further trained or reused for other purposes on that basis. Pre-trained models are usually trained on generic tasks with large amounts of data, acquiring general knowledge through pre-training on multiple tasks, and can then be fine-tuned with a small amount of annotated data for a target task so that the fine-tuned model handles the target task well. Examples include pre-trained language representation models in the text field, such as the BERT (Bidirectional Encoder Representations from Transformers) model, and pre-trained image models such as BiT (Big Transfer) in the image field. Identifying the attribute data of all modal dimensions contained in the first attribute data enables multi-modal learning by the model and improves its recognition accuracy for the application program.
The reference application type is the type predicted by the target classification model. Because the target classification model is not yet stable at the start of training, its identification of the application type may not be accurate enough, so there is a certain difference between the reference application type obtained here and the annotated application type of the application program; as training proceeds, this difference becomes smaller and smaller, that is, the model's identification of the application type tends toward accuracy. Both the reference application type and the annotated application type can be represented by type distribution information: for example, the reference application type is the probability distribution of the application program over the candidate application types, while the annotated application type can be a 0-1 distribution in which the true annotated application type of the application program is 1 and all other application types are 0. The data processing device may determine a first difference between the reference application type and the annotated application type; optionally, the first difference can be measured by the cross-entropy loss between the two distributions. The first difference reflects the classification accuracy of the target classification model and can be used for supervised training, letting the model see standard sample data and ensuring the accuracy of model training.
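As a minimal illustration of how such a first difference can be computed, the following sketch assumes the target classification model outputs logits over K candidate application types for a batch of annotated applications; all names and values are illustrative and are not taken from this application.

```python
import torch
import torch.nn.functional as F

K = 10                                        # assumed number of application types
logits = torch.randn(4, K)                    # model outputs for 4 annotated applications
annotated_type = torch.tensor([2, 0, 7, 7])   # annotated application type of each application

# Reference application type as a probability distribution over the K types.
reference_distribution = F.softmax(logits, dim=-1)

# First difference: cross-entropy between the predicted distribution and the
# 0-1 (one-hot) distribution implied by the annotation labels.
first_difference = F.cross_entropy(logits, annotated_type)
```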
S203, acquiring enhancement data corresponding to the second attribute data, and invoking the target classification model to identify the second attribute data and the enhancement data separately, obtaining a recognition type of the application program based on the second attribute data and a recognition type of the application program based on the enhancement data.
The data processing device may acquire enhancement data corresponding to the second attribute data; the enhancement data is obtained by applying data enhancement processing to the acquired second attribute data of the application program, that is, it is the enhanced second attribute data. When the amount of first attribute data (a kind of annotated data) is insufficient, introducing the second attribute data (a kind of un-annotated data) together with its enhanced version expands the amount of sample data the model can learn from, so the model sees more samples. The data processing device can then invoke the target classification model to identify the second attribute data, obtaining a recognition type of the application program based on the second attribute data, and to identify the enhancement data, obtaining a recognition type of the application program based on the enhancement data. A recognition type can likewise be represented by a type distribution, that is, the probability distribution of the application program over the candidate application types; representing recognition types as distributions during the training stage makes it easier to evaluate the model's recognition accuracy.
Because the enhancement data is obtained by transforming the second attribute data, the target classification model's identification of the enhancement data may not be accurate enough, so there is a certain difference between the recognition type of the application program based on the enhancement data and the recognition type of the application program based on the second attribute data.
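This passage does not fix particular enhancement operations; the sketch below shows two common, assumed examples (random token dropping for profile text, mirroring for an interface image) purely to illustrate how enhancement data for the second attribute data might be produced.

```python
import random
from PIL import Image, ImageOps

def augment_text(profile: str, drop_prob: float = 0.1) -> str:
    # Illustrative text enhancement: randomly drop a small fraction of tokens.
    tokens = profile.split()
    kept = [t for t in tokens if random.random() > drop_prob]
    return " ".join(kept) if kept else profile

def augment_image(interface: Image.Image) -> Image.Image:
    # Illustrative image enhancement: mirror an interface screenshot.
    return ImageOps.mirror(interface)

second_attribute_text = "a casual match-three puzzle game with daily challenges"
enhancement_text = augment_text(second_attribute_text)
```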
S204, determining a second difference between the recognition type of the application program based on the second attribute data and the recognition type of the application program based on the enhancement data, and performing model training on the target classification model according to the first difference and the second difference to obtain a trained target classification model.
The data processing device can determine the difference between the classification results of the second attribute data before and after data enhancement, that is, the second difference, and train the target classification model according to the first difference and the second difference. In this way, on top of the usual supervised training for the classification task, unsupervised training is introduced into the training process; in particular, the difference between the classification results of un-annotated attribute data before and after data enhancement is incorporated into training, which ensures that the target classification model outputs stable predictions on un-annotated attribute data, effectively improves the stability of the model, and further enhances its robustness.
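A minimal sketch of one training step combining the two differences is given below, assuming `model` maps already-encoded attribute features to logits over the application types and that `x_unlabeled` and `x_enhanced` are the features of the second attribute data before and after enhancement; the KL divergence and the weight `lam` are illustrative choices rather than requirements of this application.

```python
import torch.nn.functional as F

def training_step(model, optimizer, x_labeled, y_labeled, x_unlabeled, x_enhanced, lam=1.0):
    # First difference: supervised cross-entropy on the annotated attribute data.
    first_difference = F.cross_entropy(model(x_labeled), y_labeled)

    # Second difference: divergence between the type distributions predicted for the
    # un-annotated attribute data and for its enhancement data (KL is one option).
    log_p_original = F.log_softmax(model(x_unlabeled), dim=-1)
    p_enhanced = F.softmax(model(x_enhanced), dim=-1)
    second_difference = F.kl_div(log_p_original, p_enhanced, reduction="batchmean")

    # Model training according to the first difference and the second difference.
    loss = first_difference + lam * second_difference
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```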
It should be noted that training the target classification model is an iterative process, that is, the above steps S201 to S204 may be repeated until training of the target classification model is completed. When training reaches a preset number of iterations or the differences satisfy a convergence condition, the trained target classification model is obtained and is used to identify application types. In one implementation, the data processing device may invoke the trained target classification model to identify the attribute data of an object to be classified, thereby obtaining its application type. In the present application, unless otherwise specified, "type" and "category" denote the same concept.
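Continuing the sketch above, the iterative training can be expressed as a simple outer loop that reuses `training_step`; the iteration limit and convergence threshold below are assumptions, not values given by this application.

```python
max_iterations, tolerance, previous_loss = 50, 1e-4, float("inf")
for iteration in range(max_iterations):
    loss = training_step(model, optimizer, x_labeled, y_labeled, x_unlabeled, x_enhanced)
    if abs(previous_loss - loss) < tolerance:   # convergence condition reached
        break
    previous_loss = loss
# `model` is now the trained target classification model used for application type identification.
```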
The data processing method provided by the embodiments of the present application can acquire first attribute data (carrying an annotation label) and second attribute data (without an annotation label) of an application program, where the annotation label indicates the annotated application type of the application program, that is, its true category. Invoking the target classification model to identify the first attribute data yields a reference application type of the application program; a first difference between the reference application type and the annotated application type can then be determined and used for supervised training of the target classification model, giving the training accurate recognition capability. In addition, enhancement data obtained by applying data enhancement to the second attribute data can be introduced, and the target classification model is invoked to identify the second attribute data and the enhancement data separately, obtaining two recognition types; a second difference between the two recognition types is determined, which measures how much the classification result of the second attribute data changes before and after data enhancement. Unsupervised training based on the second difference captures how the target classification model's output jitters under slight changes to the input second attribute data, and the model parameters are adjusted accordingly, strengthening the stability with which the model identifies the application type of the application program. Training the target classification model on both the first difference and the second difference therefore combines supervised and unsupervised training: supervised training guarantees the trained model's application type recognition capability, and unsupervised training effectively improves its anti-interference capability, further improving the robustness of the model and making its type recognition of application programs more stable. After training is completed, the trained target classification model has good application type recognition capability and can be used for stable and accurate identification of application types.
Referring to fig. 3, fig. 3 is a flowchart of a data processing method according to an embodiment of the application. The method may be performed by the data processing apparatus described above. The data processing method includes the following steps S301 to S305.
S301, acquiring first attribute data and second attribute data of an application program.
In one embodiment, obtaining the first attribute data of an application program includes: acquiring the modal dimensions used to describe the application attributes of the application program; acquiring, for each modal dimension, the attribute description information of the application program in that dimension, and acquiring the annotation label of the application program; and associating the annotation label with the corresponding attribute description information, taking the attribute description information associated with the annotation label as the first attribute data.
Specifically, the data processing device may first obtain the modal dimensions that describe the application attributes of the application program, where the modal dimensions include one or more of the following: a text dimension, an image dimension, and a voice dimension. Different modal dimensions describe the application attributes from different angles, and the data modalities differ across dimensions. The data processing device may obtain the attribute description information of the application program in each modal dimension, including but not limited to attribute description information in the text dimension (text data), in the image dimension (image data), and in the voice dimension (voice data). The modal dimensions may further include a social dimension, whose attribute description information is, for example, the social relationship between two social objects. For example, if the application program is a game APP, the text data may be text generated inside the game APP, its profile or usage reviews in an application store, or tags set for it in the application store; the image data may be game interfaces of the game APP; the voice data may be the voice of virtual characters built into the game APP; and the social data may be social relationships between players in the game APP, or social information generated by players' social activities through the game APP, and so on.
Optionally, the attribute description information includes but is not limited to: object description text, object images, and scene description text in the corresponding object scene. The object description text is text data describing the application program, for example brief profile text introducing it; the object image is image data describing the content of the application program, for example a diagram of its interactive interface; and the scene description text is text data describing the scene in which the application program is located, for example its category label in an application store (an application that provides downloads of various third-party applications), where that category label is specific to the store and may differ from the required label. The data processing device can acquire the annotation label of the application program and associate it with the attribute description information of each modal dimension; the attribute description information associated with the annotation label can serve as the first attribute data. When there are at least two modal dimensions, the first attribute data obtained in this way is annotated multi-modal data, that is, attribute data of multiple modal dimensions consisting of attribute description information in different forms.
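As a purely illustrative sketch (the field names and the example taxonomy are assumptions, not defined by this application), first attribute data can be thought of as attribute description information per modal dimension associated with an annotation label:

```python
first_attribute_data = {
    "text": {
        "object_description_text": "A fast-paced shooting game with ranked seasons.",
        "scene_description_text": ["action", "multiplayer"],     # store category labels
    },
    "image": {
        "object_images": ["lobby_screenshot.png", "battle_screenshot.png"],
    },
    "annotation_label": "shooting",   # annotated application type under the chosen taxonomy
}

# Second attribute data shares the structure but carries no annotation label.
second_attribute_data = {k: v for k, v in first_attribute_data.items() if k != "annotation_label"}
```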
It should be noted that the data processing device may acquire the second attribute data of the application program in a similar way to the first attribute data: it may acquire the attribute description information of the application program in the corresponding modal dimensions and use it directly as the second attribute data. When there are at least two modal dimensions, the acquired second attribute data is multi-modal data without an annotation label.
In one embodiment, the target classification model is a pre-trained target classification model that includes a pre-trained encoding module. The pre-trained encoding module may include at least one encoding network, and each pre-trained encoding network can be used to process the attribute description information of one modal dimension; for example, the encoding module for attribute description information in the text dimension may include a BERT model, and the encoding module for attribute description information in the image dimension may include a BiT model. Compared with randomly initialized model parameters, the pre-trained encoding module uses pre-trained parameters, so the pre-trained model only needs to be fine-tuned with a small amount of data from the target task to obtain a classification model that satisfies the requirements; this is advantageous in training convergence speed and accuracy and saves the computing resources required for training. An embodiment of invoking the target classification model to identify the attribute data of each modal dimension in the first attribute data is described below.
S302, invoking the pre-trained encoding module in the target classification model to perform feature encoding on the first attribute data to obtain encoded features of the first attribute data.
The data processing device may first invoke the pre-trained encoding module to perform feature encoding on the first attribute data, constructing an encoded feature that represents the application program based on the first attribute data; the encoded feature may be vector representation information, such as an embedding vector. When the first attribute data includes attribute description information of multiple modal dimensions, the resulting encoded feature is a multi-modal representation feature that characterizes the application program from different angles, thereby strengthening its representation. By combining the multi-modal information (including text, images, voice, and so on) of the multi-source, heterogeneous application program, a feature representation of the application program is constructed jointly, enabling multi-modal learning. Describing the application program from multiple angles through multi-modal learning leads to better performance in downstream tasks (such as classification tasks).
In one embodiment, if the first attribute data includes attribute description information of multiple (i.e., at least two) modal dimensions, the feature encoding of the first attribute data may specifically include the following (1)-(2).
(1) Invoke the pre-trained encoding module in the target classification model to perform feature encoding separately on the attribute description information of the different modal dimensions contained in the first attribute data, obtaining the description features corresponding to the attribute description information of each modal dimension in the first attribute data.
The pre-trained encoding module may include an encoding network matched to each modal dimension, and the network parameters used by each encoding network are the pre-trained parameters of the module. When the data processing device invokes the pre-trained encoding module to process the attribute data of each modal dimension in the first attribute data, it can specifically invoke the encoding network matched to that modal dimension to perform feature encoding on the attribute description information in that dimension, obtaining the description feature for that dimension. The modal dimension may be any one of several modal dimensions, such as the text dimension, the image dimension, or the voice dimension. The attribute description information is attribute data describing the application attributes of the application program and can take various forms; for example, the attribute description information of the text dimension may include the profile and category labels of the application program, and the attribute description information of the image dimension may include functional interface diagrams of the application program. The attribute description information of each modal dimension is processed in a similar way to obtain the corresponding description features, and each description feature can be represented by an embedding vector. For example, if the first attribute data includes attribute description information of the text dimension and of the image dimension, the pre-trained BERT model may be invoked to encode the text-dimension attribute description information to obtain text description features, and the pre-trained BiT model (an image representation model trained on a large-scale image corpus) may be invoked to encode the image-dimension attribute description information to obtain image description features. Through feature encoding, the attribute description information of each modal dimension (such as text and image information) is encoded into description features (such as embedding vectors) that the data processing device can recognize and classify, mapping the raw data into a latent feature space and realizing an abstract representation of the attribute description information.
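A hedged sketch of this per-modality encoding is shown below: a pre-trained BERT encodes text-dimension description information, and a pre-trained image backbone (a torchvision ResNet is used here only as a stand-in for a BiT-style encoder). The checkpoint names, pooling choices, and feature sizes are assumptions rather than details specified by this application.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from torchvision import models, transforms
from PIL import Image

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
text_encoder = AutoModel.from_pretrained("bert-base-chinese")
# Image encoder: ResNet-50 with its classification head removed, standing in for BiT.
image_encoder = torch.nn.Sequential(*list(models.resnet50(weights="IMAGENET1K_V2").children())[:-1])

def encode_text(description: str) -> torch.Tensor:
    inputs = tokenizer(description, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = text_encoder(**inputs)
    return outputs.last_hidden_state[:, 0]            # [CLS] embedding, shape (1, 768)

preprocess = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def encode_image(path: str) -> torch.Tensor:
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        features = image_encoder(img)                  # shape (1, 2048, 1, 1)
    return features.flatten(1)                         # shape (1, 2048)
```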
Further, in one embodiment, there may be multiple (i.e., at least two) pieces of attribute description information in a single modal dimension, and their information types may differ. The information type can be determined in several ways. In one way, the information types of attribute description information in the same modal dimension are determined by information source; for example, the descriptive copy designed for the application program by its developer and the category label assigned after the application is listed in an application store can be treated as two different information types. In another way, the information type can be determined by information length; for example, the profile and the category label of an application in the text modality can be divided into two information types because the category label is short text while the profile is long text. Attribute description information of different information types can then be processed separately to improve encoding accuracy and ease of processing.
When the attribute description information in any mode dimension is a plurality of and the information types of the attribute description information in any mode dimension are different, the feature encoding processing of the attribute description information in any mode dimension is performed to obtain the description feature corresponding to the attribute description information in any mode dimension, and the method may include the following steps: firstly, the data processing equipment can call a pre-trained coding module in the target classification model, and respectively code attribute description information corresponding to different information types in any mode dimension in the first attribute data to obtain description characteristics corresponding to the attribute description information of the corresponding information type in any mode dimension. Then, the data processing equipment can use the description features corresponding to the obtained attribute description information of each information type as the description features corresponding to the attribute description information in any mode dimension; or, the spliced description characteristic obtained based on the description characteristic corresponding to the attribute description information of each information type is used as the description characteristic corresponding to the attribute description information in any mode dimension.
For the attribute description information of different information types in the same modal dimension, an encoding network (a pre-training model) matched with the modal dimension can be adopted to respectively encode the attribute description information of each information type, so as to obtain the description feature of the attribute description information of the corresponding information type. In one implementation manner, the description features corresponding to the attribute description information of each information type can be directly used as the description features corresponding to the attribute description information in the modal dimension, that is, the description features corresponding to the same modal dimension include the description feature corresponding to each information type. In another implementation manner, the description features corresponding to the attribute description information of each information type can be spliced to obtain a spliced description feature, and the spliced description feature is used as the corresponding description feature in the same modal dimension. For example, for text-class information (i.e., attribute description information in the text dimension) such as the APP profile and label, a BERT model (i.e., a pre-trained encoding module) is used for encoding, where the BERT model is a language model based on the Transformer (a basic model) structure trained on a large-scale text corpus; it should be noted that other pre-trained language models are still applicable here, and the application is not limited thereto. By inputting the text into the language model BERT, the embedding of the profile and the embedding of the label can be obtained respectively, denoted here as V1 and V2. V1 and V2 can be directly used as the corresponding description features in the text dimension, or V1 and V2 can be spliced to serve as the corresponding description feature in the text dimension.
It will be appreciated that although the encoding modules used in training are of the same type, the encoding modules obtained after training are different because they are fed attribute description information of different information types, which is embodied in their different model parameters. For example, two identical pre-trained BERT models, one for processing application profiles and the other for processing application labels, are trained into BERT models with different model parameters.
Therefore, in terms of feature representation, performing feature encoding processing through the pre-trained encoding module can make full use of mature pre-training models (for example, pre-training models corresponding to text and picture modalities), so that model learning proceeds on the basis of models already trained on large-scale corpora and yields more accurate learning results.
(2) And splicing the description features corresponding to the attribute description information under different mode dimensions, and taking the spliced description features as the coding features of the first attribute data.
The data processing device can splice the description features corresponding to the attribute description information in each mode dimension to obtain spliced description features, and the spliced description features can be used as coding features of the first attribute data, wherein the coding features are distributed feature representations (feature representations capable of improving feature generalization capability). When the coding features of the first attribute data are represented from a plurality of modal dimensions, the finally obtained coding features can strengthen the representation of the application program compared with the description features of a single modal dimension, and the coding features comprise the description features of the plurality of dimensions and can strengthen the model learning effect. In addition, the attribute description information used in the feature representation is key information, so that excessive redundant information can be avoided, and model training is easier and simpler.
For example, the first attribute data of the application program includes the screenshot information, profile, and tags of the APP in the application store, and a schematic diagram of feature representation construction based on the pre-trained model can be seen in fig. 4. The profile and label of the APP are encoded using a text encoding network (e.g., a BERT model), wherein the text encoding network comprises a text encoding network A and a text encoding network B, which are text encoding networks with the same model parameters prior to training. The text encoding network A is used for processing the profile of the APP, and the text encoding network B is used for processing the label of the APP; the text encoding network A can output the vector representation V1 of the profile, the text encoding network B can output the vector representation V2 of the label, and the screenshot information of the APP can be input into the image encoding network (such as a BiT model) to obtain the vector representation V3 of the screenshot. Then, after V1, V2, and V3 respectively pass through a multi-layer fully connected neural network (Multilayer Perceptron, MLP), the learned distributed feature representation can be mapped to the sample label space through the fully connected layer, and the vector representations processed by the MLPs are subjected to vector concatenation (concat) to form the final APP feature representation (i.e., the coding feature). Classification results are typically obtained by passing the features through softmax (a normalized exponential function).
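A minimal sketch of the fusion step described above and shown in fig. 4, assuming per-modality feature dimensions (768 for text, 2048 for images) and hidden sizes chosen only for illustration:

```python
# Illustrative sketch: V1 (profile), V2 (label), V3 (screenshot) pass through
# separate MLPs, are concatenated into the coding feature, then classified.
import torch
import torch.nn as nn

class MultiModalClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden=256, num_classes=4):
        super().__init__()
        self.mlp_profile = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.mlp_label = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.mlp_image = nn.Sequential(nn.Linear(image_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(3 * hidden, num_classes)

    def forward(self, v1, v2, v3):
        coding_feature = torch.cat(
            [self.mlp_profile(v1), self.mlp_label(v2), self.mlp_image(v3)], dim=-1)
        return torch.softmax(self.classifier(coding_feature), dim=-1)

model = MultiModalClassifier()
probs = model(torch.randn(1, 768), torch.randn(1, 768), torch.randn(1, 2048))
print(probs)  # probability distribution over the application types
```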
It should be noted that, by expanding the modal dimensions of the attribute data, that is, obtaining attribute description information of more modal dimensions, the coding features obtained in the feature encoding processing stage also gain additional representation dimensions of the application program, which further strengthens the coding features and enhances the representation learning capability of the model. The modal dimensions are not limited in the present application. Correspondingly, the second attribute data also comprises attribute description information of a plurality of modal dimensions; for the feature encoding processing of the second attribute data, reference may be made to the feature encoding processing procedure of the first attribute data, and the coding feature obtained by feature encoding the second attribute data is likewise a multi-modal representation feature.
In addition to the pre-trained encoding module, the target classification model contains a recognition module that has not completed training, which can be regarded as a classifier, for example, a softmax regression model. In the identification process of the first attribute data, the recognition module that has not completed training functions as described in S303 below.
S303, an identification module which does not complete training is adopted, the application type of the application program is identified according to the coding features of the first attribute data, the reference application type of the application program is obtained, and the first difference between the reference application type and the labeling application type is determined.
The data processing device may use the recognition module that has not completed training to perform identification processing on the application type of the application program according to the coding feature of the first attribute data output by the pre-trained encoding module, and the recognition module that has not completed training may then output the reference application type of the application program. Both the reference application type and the labeling application type can be represented by type distribution information. For example, if application A belongs to any of categories a1-a4, the resulting reference application type may be represented by a probability distribution over the respective application types, such as [0.1, 0.5, 0.9, 0.2], and the labeling application type of application A may also be represented as a category distribution, which may specifically be a 0-1 distribution, such as [0, 0, 1, 0]. Thus, determining the first difference between the reference application type and the labeling application type is, in particular, determining the difference between the two pieces of distribution information. It will be appreciated that after the recognition module completes training, the application type obtained in the model application stage may be directly represented as the corresponding category; for example, the application type corresponding to the highest probability is selected as the finally identified application type.
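As a hedged illustration of the first difference, the reference application type can be treated as unnormalized scores over categories a1-a4 and the labeling application type as a class index; the numbers below mirror the example above, and the use of cross entropy is one possible difference measure:

```python
# Illustrative sketch: first difference between the reference application type
# and the labeling application type, measured as cross entropy.
import torch
import torch.nn.functional as F

reference_scores = torch.tensor([[0.1, 0.5, 0.9, 0.2]])  # treated as scores for a1-a4
labeled_class = torch.tensor([2])                         # annotated application type (a3)

first_difference = F.cross_entropy(reference_scores, labeled_class)
print(first_difference.item())
```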
S304, acquiring the enhancement data corresponding to the second attribute data, and calling the target classification model to respectively identify the second attribute data and the enhancement data to obtain the identification type of the application program based on the second attribute data and the identification type of the application program based on the enhancement data.
In one embodiment, the second attribute data is data containing attribute description information of a plurality of modality dimensions, and the second attribute data belongs to multi-modality data. The data enhancement processing of the second attribute data can be realized by the following modes: firstly, the data processing equipment can determine the mode dimensions corresponding to each attribute description information in the second attribute data respectively, and acquire a data enhancement algorithm matched with the corresponding mode dimensions; then, the data processing device may perform data enhancement processing on the attribute description information in the corresponding mode dimension by using a matched data enhancement algorithm, and use the attribute description information after enhancement processing as enhancement data of the second attribute data.
Data enhancement is a technology for expanding training data by using an algorithm, and by using the method of data enhancement, training data can be automatically enhanced and expanded by using the algorithm when the data volume is insufficient. For example, a large amount of text labeling data needs to be acquired in the text classification task to improve the precision and generalization capability of the model, but the cost of manual labeling is high, and at this time, the required labeling data can be expanded by a data enhancement method. Therefore, the data enhancement can effectively improve the learning performance and accuracy of the model with lower cost under the data constraint environment, and the data enhancement algorithm matched with the corresponding mode dimension can automatically conduct data enhancement processing on the non-labeling attribute data to obtain enhanced data.
In order to better perform data enhancement processing on each piece of attribute description information in the second attribute data, the modal dimension corresponding to each piece of attribute description information in the second attribute data may be determined based on the correspondence between attribute description information and modal dimensions, so as to acquire the data enhancement algorithm matched with that modal dimension. The data enhancement algorithm is a policy mechanism or rule for data enhancement, under which the data processing device can implement the data enhancement processing based on the corresponding algorithm instructions. The modal dimensions involved in the second attribute data may be one or more: when one modal dimension is involved, the data enhancement algorithm matched with that modal dimension is acquired, and when a plurality of modal dimensions are involved, a data enhancement algorithm matched with each of the plurality of modal dimensions is acquired. For example, if the modal dimensions corresponding to the attribute description information include a text dimension and an image dimension, the data processing apparatus may obtain a data enhancement algorithm matched with the text dimension and a data enhancement algorithm matched with the image dimension. The data enhancement algorithm matched with the corresponding modal dimension can be used for indicating the specific rule for enhancing the attribute description information in that modal dimension, and the matched data enhancement algorithm can then be adopted to enhance the attribute description information in the corresponding modal dimension; for example, the data enhancement algorithm matched with the text dimension can be used for enhancing the attribute description information of the text dimension, and the data enhancement algorithm matched with the image dimension can be used for enhancing the attribute description information of the image dimension. The data processing device may use the attribute description information after enhancement processing in each modal dimension as the enhancement data of the second attribute data; when the second attribute data relates to at least two modal dimensions, the enhancement data also relates to at least two modal dimensions, and in this case the enhancement data also belongs to multi-modal data.
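A possible, purely illustrative way to organize the matching between modal dimensions and data enhancement algorithms is a simple dispatch table; the function names here are hypothetical placeholders for the concrete enhancement manners described below.

```python
# Illustrative sketch: match each modal dimension to its data enhancement algorithm.
def augment_text(text: str) -> str:
    # placeholder: back-translation or random deletion would be applied here
    return text

def augment_image(image):
    # placeholder: noise superposition or random erasing would be applied here
    return image

MATCHED_AUGMENTERS = {"text": augment_text, "image": augment_image}

def build_enhancement_data(second_attribute_data: dict) -> dict:
    """Map each modal dimension to its enhanced attribute description information."""
    return {dim: MATCHED_AUGMENTERS[dim](info)
            for dim, info in second_attribute_data.items()}
```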
In one implementation, when the second attribute data includes attribute description information corresponding to a modality dimension being a text dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the text dimension. The attribute description information of the text dimension can be understood as data of the text mode, and when the second attribute data is subjected to data enhancement processing, specifically, a data enhancement algorithm matched with the text dimension is adopted to perform data enhancement processing on the attribute description information of the text mode dimension, and the detailed method can comprise the following two modes.
Firstly, attribute description information of the text dimension is acquired from the second attribute data, and the acquired attribute description information of the text dimension is translated to obtain a translated text corresponding to the attribute description information of the text dimension; back-translation processing is performed on the translated text to obtain a back-translated text of the translated text; the back-translated text is used as the attribute description information obtained after data enhancement processing is performed on the acquired attribute description information of the text dimension.
The data processing device may obtain attribute description information of a text dimension from the second attribute data, where the attribute description information of the text dimension is text data including one or more languages, and according to a conversion rule between two languages, the data processing device may translate the attribute description information of the text dimension to obtain a translated text, where the translated text is text data of another language. For example, the obtained attribute description information of the text dimension is a text profile in chinese, and based on the translation rule between chinese and english, the attribute description information can be translated into the text profile in english to obtain a translated text, and then the translated text can be subjected to back-translation processing to restore the language of the translated text into text data of the original language. In brief, for example, the APP profile in chinese is translated into english, and then the english is translated back into chinese, thus completing the data enhancement processing of the text mode.
Since the rules followed by translation differ somewhat between translation directions of different languages, the resulting back-translated text differs to a certain extent from the pre-translation text. The back-translated text obtained through translation and back translation therefore retains part of the original text-modal attribute description information while also containing parts that differ from it; in addition, back translation of text-modal data can enrich different expressions of the same semantics, thereby realizing data enhancement processing of the text-modal attribute description information. The data processing device may use the back-translated text as the attribute description information obtained after performing data enhancement on the acquired attribute description information of the text dimension according to the matched data enhancement algorithm.
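A sketch of the back-translation manner, assuming a generic machine translation helper; `translate` below is a hypothetical stand-in for any translation service and is not a real library call.

```python
# Illustrative sketch of back-translation enhancement (Chinese -> English -> Chinese).
def translate(text: str, source: str, target: str) -> str:
    # Hypothetical helper: plug in any machine translation service here.
    raise NotImplementedError

def back_translate(text_info: str, original: str = "zh", pivot: str = "en") -> str:
    """Translate the text-dimension attribute description and translate it back."""
    translated = translate(text_info, source=original, target=pivot)
    return translate(translated, source=pivot, target=original)
```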
Secondly, attribute description information of the text dimension is acquired from the second attribute data, and random information deletion processing is performed on the acquired attribute description information of the text dimension to obtain a deleted text corresponding to the attribute description information of the text dimension; the deleted text is used as the attribute description information obtained after data enhancement processing is performed on the acquired attribute description information of the text dimension.
In this manner, the data processing device may randomly select information to be deleted from the acquired attribute description information of the text dimension and delete it, where the information may be one or more of text segments, sentences, words, and the like; for example, a part of the segments in the text, such as about 10% of the segment content, is randomly deleted, and a corresponding deleted text is obtained through the random information deletion processing. The deleted text still retains part of the information in the originally acquired attribute description information of the text dimension, can avoid over-dependence on a few strongly indicative text segments, reduces data redundancy, and strengthens global information. The data processing device may use the deleted text as the attribute description information of the text dimension obtained after data enhancement according to the matched data enhancement algorithm. Training the target classification model with the deleted text can enhance the model's learning ability for global information and its robustness.
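A minimal sketch of the random information deletion manner, assuming word-level deletion at a ratio of about 10%:

```python
# Illustrative sketch: randomly delete roughly 10% of the words in the text.
import random

def random_delete(text_info: str, drop_ratio: float = 0.1, seed=None) -> str:
    rng = random.Random(seed)
    words = text_info.split()
    kept = [w for w in words if rng.random() >= drop_ratio]
    return " ".join(kept) if kept else text_info  # avoid returning an empty text
```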
It should be noted that the data enhancement processing of the attribute description information of the text dimension includes, but is not limited to, the above two manners, and may also be implemented in other ways, such as synonym substitution, random insertion, random swapping, and the like, which are not limited herein. The acquired attribute description information of the text dimension can also be enhanced by combining several manners, for example, one part of the attribute description information adopts the first manner while the other part adopts the second manner.
In one implementation manner, when the second attribute data includes attribute description information whose corresponding modal dimension is the image dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the image dimension. The attribute description information of the image dimension can be understood as data of the image modality, and when the data of this modality is enhanced, the data enhancement algorithm matched with the image dimension can specifically be adopted to perform data enhancement processing on the attribute description information of the image modal dimension, which may specifically include the following two manners.
Firstly, acquiring attribute description information of an image dimension from second attribute data, and acquiring target noise; superposing the target noise into the attribute description information of the image dimension to obtain a noise image corresponding to the attribute description information of the image dimension; the noise image is used as attribute description information after data enhancement processing is performed on the obtained attribute description information of the image dimension.
The data processing apparatus may acquire a target noise, which may be any one of gaussian noise, impulse noise, white noise, and the like, and then may superimpose the target noise on the attribute description information of the image dimension of the second attribute data, equivalent to superimposing the noise on the basis of the original information of the image, and may obtain a corresponding noise image after the noise superimposition, and the data processing apparatus may perform the data enhancement processing of the attribute description information according to a data enhancement algorithm matched with the image dimension with the noise image as the attribute description information of the image dimension. By adding noise to the image data, the fine change of the images in different image environments can be simulated, and the noise image obtained after noise superposition is adopted to train the target classification model, so that the anti-interference capability of the model can be enhanced.
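A minimal sketch of the noise superposition manner, assuming the image-dimension attribute description information is an H x W x 3 uint8 array and Gaussian noise with an assumed sigma:

```python
# Illustrative sketch: superimpose Gaussian noise on an image and clip to [0, 255].
import numpy as np

def add_gaussian_noise(image: np.ndarray, sigma: float = 10.0) -> np.ndarray:
    noise = np.random.normal(0.0, sigma, size=image.shape)
    noisy = image.astype(np.float32) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

noisy_screenshot = add_gaussian_noise(np.zeros((224, 224, 3), dtype=np.uint8))
```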
Secondly, attribute description information of the image dimension is acquired from the second attribute data, and a target image area is determined from the attribute description information of the image dimension; random replacement processing is performed on the pixel values of the image pixels in the target image area to obtain an occlusion image area corresponding to the target image area; the attribute description information of the image dimension containing the occlusion image area is used as the attribute description information obtained after data enhancement processing is performed on the acquired attribute description information of the image dimension.
The attribute description information of the image dimension obtained from the second attribute data is an original image, the data processing device can randomly select an image area from the original image and take the image area as a target image area, and then the data processing device can replace the pixel value of the image pixel in the target image area with a random value to realize the change of the pixel value in the target image area, which is equivalent to the partial shielding of the original image. Then, the target image area after the pixel value replacement can be regarded as an occlusion image area, so that random erasure of the original image is realized. The image containing the occlusion image area can be used as the attribute description information of the acquired image dimension, and the attribute description information after data enhancement is carried out according to the matched data enhancement algorithm. The data is adopted to train the target classification model, and the learning of the model on global information can be enhanced.
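A minimal sketch of the random replacement (random erasing) manner, assuming a uint8 image array and an erased region spanning roughly 10% of the height and width:

```python
# Illustrative sketch: replace pixel values in a random target image area
# with random values, producing an occluded image.
import numpy as np

def random_erase(image: np.ndarray, area_frac: float = 0.1) -> np.ndarray:
    h, w = image.shape[:2]
    eh, ew = max(1, int(h * area_frac)), max(1, int(w * area_frac))
    top = np.random.randint(0, h - eh + 1)
    left = np.random.randint(0, w - ew + 1)
    erased = image.copy()
    erased[top:top + eh, left:left + ew] = np.random.randint(
        0, 256, size=(eh, ew) + image.shape[2:], dtype=np.uint8)
    return erased
```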
It should be noted that, for the same attribute description information of the image dimension, the above manner may be performed multiple times, so as to obtain multiple attribute description information after data enhancement. For example, for the same image, multiple images identical to the same image may be copied, and then enhancement processing is performed on each image in the same manner, for example, target image areas selected by 5 images are different, and after pixel values of image pixels in each target image area are replaced by random values, 5 different images may be obtained, that is, the image after data enhancement may be obtained. In addition, similar to the attribute description information of the text dimension, the data enhancement processing of the attribute description information of the image dimension can be combined with the two modes, namely, one part of the attribute description information adopts a noise superposition mode to realize data enhancement, and the other part of the attribute description information adopts a pixel value random replacement mode to realize data enhancement, so that different enhancement data are converted to train the target classification model, and the anti-interference capability of the target classification model and the learning of global information are improved. In addition, the data enhancement processing of the attribute description information of the image dimension may further include other manners, such as color transformation (e.g., adding or subtracting some color components), geometric transformation (e.g., various operations such as flipping, clipping, deforming, scaling, etc.), and the like, which is not limited in the present application.
The model training sample data can be further expanded by carrying out data enhancement processing on the non-labeling attribute data, the non-labeling attribute data before and after data enhancement is used for training, so that the model can see more different attribute data, after training is finished, the model is more stable in identifying data generating fine changes, and the model robustness is effectively enhanced, thereby improving the overall learning effect of the model.
After the data enhancement processing is performed on the second attribute data, the data processing device may acquire the enhancement data and may call the target classification model to perform identification processing on the enhancement data and the second attribute data. In one embodiment, the identification processing of the second attribute data and the enhancement data by the called target classification model follows the same logic as the identification processing of the first attribute data: the pre-trained encoding module is first called to perform feature encoding processing on the second attribute data (or the enhancement data) to obtain the coding feature of the second attribute data (or the enhancement data), and then the recognition module that has not completed training is adopted to identify the application type of the application program according to the coding feature of the second attribute data (or the enhancement data), so as to obtain the identification type of the application program based on the second attribute data (or the enhancement data).
S305, determining a second difference between the recognition type of the application program based on the second attribute data and the recognition type of the application program based on the enhancement data, and performing model training on the target classification model according to the first difference and the second difference to obtain a trained target classification model.
The identification type of the application program based on the second attribute data can be used as reference data for the identification type of the application program based on the enhancement data, and the second difference reflects the difference in identification results for the non-labeled attribute data before and after data enhancement. Optionally, the identification type based on the second attribute data and the identification type based on the enhancement data may each be represented by a classification distribution, on the basis of which the second difference is measured by a loss value calculated from the KL (Kullback-Leibler) divergence; a larger KL divergence indicates a larger difference between the two classification distributions, which in turn indicates that the model jitters on subtle changes of the input sample and needs to be adjusted.
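For illustration, the second difference between the two classification distributions can be computed as a KL divergence; the distributions below are random placeholders.

```python
# Illustrative sketch: second difference between the identification type based on
# the second attribute data and the identification type based on the enhancement data.
import torch
import torch.nn.functional as F

p_original = torch.softmax(torch.randn(1, 4), dim=-1)  # from second attribute data
p_enhanced = torch.softmax(torch.randn(1, 4), dim=-1)  # from enhancement data

# F.kl_div takes log-probabilities as its first argument;
# this evaluates D_KL(p_original || p_enhanced).
second_difference = F.kl_div(p_enhanced.log(), p_original, reduction="batchmean")
print(second_difference.item())
```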
When the target classification model contains a pre-trained encoding module and a recognition module that has not completed training, model training of the target classification model based on the first and second differences includes training of the pre-trained encoding module and training of the recognition module that has not completed training. This is because, although the pre-trained encoding module (a pre-training model) can already encode the input attribute description information (e.g., text data and image data) well, training in the context of the specific task is still required in order to obtain a better-quality feature representation; through this training, the pre-training model can be fine-tuned. It will be appreciated that after model training of the target classification model is completed, training of the recognition module is completed, and the trained target classification model includes the trained recognition module and the fine-tuned encoding module.
In one embodiment, model training is performed on the target classification model according to the first difference and the second difference, so as to obtain an implementation manner of the target classification model after training, which may include: acquiring a first training weight set for the first difference and a second training weight set for the second difference; and respectively carrying out weighted summation treatment on the first difference and the second difference by adopting the first training weight and the second training weight to obtain a target difference for training the target classification model, and training the target classification model by adopting the target difference to obtain a target classification model after training.
In the training stage of the target classification model, the training influence of different differences on the target classification model is different, and when the target classification model is trained based on the combination of different differences, the corresponding training weight can be set to control the influence of the differences on the target classification model so as to train the target classification model better. In the application, the first training weight set for the first difference and the second training weight set for the second difference can be the same or different, after the data processing equipment acquires the first training weight and the second training weight, the new first difference can be determined according to the first training weight and the first difference, the new second difference can be determined according to the second training weight and the second difference, and then the new first difference and the new second difference are summed to obtain the target difference for training the target classification model.
The first difference is the Loss of marked portions (Loss) and the second difference is the Loss of unmarked portions, and the final Loss, i.e., the target difference, can be obtained by adding the Loss of marked and unmarked portions. For the calculation of the weighted summation described above, taking the first difference as the cross entropy loss and the second difference as the KL divergence as an example, a specific calculation expression is shown in the following formula 1.
$$\min_{\theta}\;\mathbb{E}_{x_1,\,y\in L}\big[-\log p_{\theta}(y\mid x_1)\big]\;+\;\lambda\,\mathbb{E}_{x_2\in U}\,\mathbb{E}_{\hat{x}_2\sim q(\hat{x}_2\mid x_2)}\big[D_{KL}\big(p_{\theta}(y'\mid x_2)\,\|\,p_{\theta}(y'\mid \hat{x}_2)\big)\big]\qquad\text{(Formula 1)}$$

Wherein x1 represents the first attribute data (i.e., attribute data with a labeling tag), y represents the labeling tag, L represents the set of first attribute data of the application program, θ represents the model parameters of the target classification model, and p_θ(y|x1) represents the probability distribution output by the target classification model when identifying the first attribute data; the expectation over x1, y ∈ L represents the cross entropy loss averaged over all of the first attribute data. x2 represents the second attribute data (i.e., attribute data without a labeling tag), x̂2 represents the enhancement data (i.e., the enhanced second attribute data), q(x̂2|x2) represents the distribution relationship between the second attribute data and the enhancement data, y' represents the identification type predicted by the target classification model for the application program based on the second attribute data, U represents the set of second attribute data of the application program, the expectation over x2 ∈ U and x̂2 ∼ q(x̂2|x2) represents the expectation of the KL divergence between the second attribute data and the enhancement data, D_KL represents the KL divergence calculation, p_θ(y'|x2) represents the probability distribution of the identification type based on the second attribute data, and p_θ(y'|x̂2) represents the probability distribution of the identification type based on the enhancement data. The first training weight is the constant 1, and λ is the second training weight.
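A hedged sketch of combining the two parts into the target difference of Formula 1; the default weight values are assumptions.

```python
# Illustrative sketch: target difference as a weighted sum of the labeled
# cross-entropy loss and the unlabeled KL consistency loss.
import torch.nn.functional as F

def target_difference(labeled_logits, labels, probs_unlabeled, probs_enhanced,
                      first_weight: float = 1.0, second_weight: float = 0.5):
    first_diff = F.cross_entropy(labeled_logits, labels)            # labeled part
    second_diff = F.kl_div(probs_enhanced.log(), probs_unlabeled,   # unlabeled part
                           reduction="batchmean")
    return first_weight * first_diff + second_weight * second_diff
```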
When the first training weight and the second training weight are the same, the proportions of the first difference and the second difference in the target difference are the same and their training effects on the target classification model are identical; when the first training weight and the second training weight are different, the proportions of the first difference and the second difference in the target difference are different, and so are their training effects on the target classification model. When the target classification model is trained with the target difference, the model parameters of the target classification model can be adjusted through back propagation of the target difference; the model training optimization objective is to minimize the target difference, and when the target classification model meets the convergence condition, the trained target classification model can be obtained. The convergence condition may be that the number of iterations reaches a threshold or that the target difference is smaller than a preset difference, where the preset difference is a minimum difference value used to indicate that the recognition accuracy of the target classification model for the application program has reached the desired recognition accuracy and that the stability of the target classification model has reached the desired stability.
From the above, in the training process of the target classification model, on the basis of the first supervised difference (such as cross entropy Loss), the second difference of the classification result of the second attribute data (a kind of label-free data) before and after data enhancement is further added, and the difference combination of the two parts trains the target classification model, so that the trained target classification model has accurate recognition capability and better anti-interference capability, and can stably and accurately recognize the application type.
For the general training procedure described in the embodiments of the present application, reference may be made to the general flowchart of the classification method of the application program shown in fig. 5, which generally includes two steps.
Step 1: multimodal data for an application is obtained. The multi-mode data of the application program is multi-mode attribute data, specifically comprises attribute description information under a plurality of mode dimensions, and can be specifically divided into marked data and unmarked data, wherein the marked data is the attribute data with marked labels, and the unmarked data is the attribute data without marked labels, and the attribute data relates to the plurality of mode dimensions.
Step 2: the method is based on the co-training of marked data and unmarked data of the application program. The target classification model can be trained jointly by the marked data and the unmarked data of the application program, and the supervised training and the unsupervised training are performed on the target classification model at the same time, so that the accuracy and the stability of the model in application type identification are improved. Specifically, this step includes two sub-steps, respectively: construction is represented based on features of the pre-trained model (step 2.1) and loss is calculated based on the constructed feature representation (step 2.2). The method comprises the steps of constructing feature representation through a pre-training model, extracting description features of an application program from a plurality of mode dimensions, and splicing the description features to obtain coding features for abstractly representing the application program, wherein the coding features are reinforced feature representation, so that the application program can be more comprehensively represented, and a target classification model can learn more information based on the feature representation. Then, based on the constructed feature representation (i.e. the coding feature), corresponding losses can be calculated, specifically including losses corresponding to the marked data and losses corresponding to the unmarked data before and after enhancement, the two losses can be used as final target losses, and the model training optimization target is that the target losses are minimized. Through the steps, when the optimization target is reached, a multi-mode classification model based on data enhancement training can be obtained, and under the condition of being applied to an APP classification scene, the APP can be marked with a corresponding predefined label, so that the APP can be popularized and operated conveniently.
Further, for a schematic diagram of the training principle of the target classification model, see fig. 6a and fig. 6b in particular. As shown in fig. 6a, the training includes supervised training on the labeled data and unsupervised training on the unlabeled data through data enhancement. Under supervised training, the labeled data is classified after feature encoding, and the cross entropy Loss with the labeling tag is calculated; under unsupervised training, the unlabeled data is turned into enhancement data through the data enhancement manner of the corresponding modality, where the enhancement data retains part of the information of the unlabeled data while changing to a certain extent. The obtained unlabeled data and the enhancement data are subjected to feature encoding processing and classification to obtain two classification results, which are specifically classification distributions, and the KL divergence between them can be calculated as the Loss of the unlabeled part. Finally, the cross entropy loss and the KL divergence are weighted and summed to obtain the target loss, and the model parameters of the target classification model are then adjusted based on the target loss. Compared with the training principle schematic diagram shown in fig. 6a, fig. 6b is more concise: the target classification model adopts the structure shown in fig. 4, each piece of data is processed by the target classification model, the corresponding type is output for loss calculation, and the model parameters of the target classification model are adjusted based on back propagation of the loss, thereby realizing training of the target classification model. When applied to an APP classification scenario, the unlabeled data and the labeled data can be the original multi-modal data of the APP.
Experiments show that, in an APP classification scenario, if the labeled samples of a predefined classification system are insufficient, the data processing method provided by the embodiment of the application can enhance the APP representation learning capability and the robustness of the model through data enhancement and multi-modal representation, so that the classification accuracy and recall rate can be improved. Among the multi-modal information, both the picture and the tag can increase the overall F value (i.e., the harmonic mean of precision and recall) compared with using only the APP profile. On an evaluation set, adding the picture modality to the profile can improve the F value by 1%, and adding the label modality can improve the F value by 4%. In terms of data enhancement, compared with training using only the labeled cross entropy Loss, adding the KL divergence Loss between the unlabeled data before and after enhancement can increase the F value by 5%. When applied to APP classification, the data processing method provided by the application is an APP classification method based on data enhancement and multi-modal representation learning, which can improve the classification capability of the model and increase the robustness of the model.
In one possible embodiment, in order to obtain a better training effect for the target classification model, at least two training phases are included when model training is performed on the target classification model. By adjusting the corresponding training weights in different training stages, the target classification model can be trained in the corresponding capacity in different training stages, so that the training effect of the target classification model is improved more efficiently. In one embodiment, it is possible to: acquiring a current training stage and a target training stage set for a target classification model; when the current training phase is the training phase before the target training phase, adjusting the value of the second training weight so that the adjusted second training weight is smaller than the first training weight; and when the current training stage is the target training stage, adjusting the value of the second training weight so that the adjusted second training weight is greater than the first training weight.
The data processing apparatus may set a target training stage for the target classification model. The target training stage is the training stage that the target classification model enters after training reaches a preset stability condition, where the preset stability condition is used to indicate the stage at which the target classification model's identification of the application type of the application program starts to become relatively stable. The preset stability condition may be that the target difference is less than a difference threshold, and the difference threshold is greater than the preset difference used when the target classification model converges. The data processing device may determine whether the current training stage is a training stage before the target training stage, specifically by determining whether the target difference obtained in the current training stage reaches the difference threshold. If the current training stage is a training stage before the target training stage, it indicates that the stability of the target classification model in identifying the application type of the application program is still relatively poor; the second training weight can therefore be adjusted to a value smaller than the first training weight, so that the proportion of the first difference in the target difference is larger than that of the second difference. Performing supervised training on the target classification model based on the first training weight and the first difference allows the target classification model to see more attribute data with labeling tags (i.e., real sample data), thereby rapidly improving the classification accuracy of the model. If the current training stage is the target training stage, it indicates that the target difference obtained by the target classification model in the current training stage is relatively stable; the second training weight can then be adjusted to a value larger than the first training weight, so that the proportion of the second difference in the subsequently obtained target difference is larger than that of the first difference, that is, the second difference exerts a greater effect on the target classification model. The target classification model can thus see more attribute data without labeling tags (i.e., unlabeled sample data), and training the target classification model based on this target difference can strengthen the learning of non-labeled attribute data and improve the stability of the model.
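A minimal sketch of this stage-dependent weight adjustment; the concrete factors (0.5 and 2.0) are assumptions, not values prescribed by the method.

```python
# Illustrative sketch: before the target training stage the second weight is kept
# below the first weight; in the target training stage it is raised above it.
def training_weights(in_target_stage: bool, first_weight: float = 1.0):
    if in_target_stage:
        second_weight = 2.0 * first_weight  # emphasise consistency on unlabeled data
    else:
        second_weight = 0.5 * first_weight  # emphasise accuracy on labeled data first
    return first_weight, second_weight
```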
From the above, the two training stages included in the model training focus on the classification accuracy and the classification stability of the training model respectively, and when the model is trained to be relatively stable through the training stage before the target training stage, a more accurate recognition type can be obtained in the recognition processing of the non-labeling attribute data, so that the stability of the model is more effectively improved. It will be appreciated that the first training weight and the second training weight may not be adjusted throughout the training of the target classification model. For example, the first training weight and the second training weight are both constant 1, which is equivalent to training the target classification model by directly adopting the sum of the first difference and the second difference, and the same training effect can be achieved.
In one implementation, after the trained target classification model is obtained through model training, the data processing device may invoke the trained target classification model to perform application type identification processing on the object to be classified, where the identified application type may be used as a classification label of the application program. Taking the scenario of the application of the scheme as an example of the scenario of classifying the application program (i.e. APP), a processing diagram of the application program classification is shown in fig. 7. Through a given predefined classification system and corresponding labeling attribute data, the trained target classification model can comprise a generation model (equivalent to a trained coding module) for generating an APP representation and a classifier (equivalent to a trained recognition module) for classifying the APP according to the APP representation, so that two forms of description exist for the APP, namely, the description is performed on the APP through an APP vector representation learned by the classification model, and under some application scenarios, the APP vector representation can be used for calculating the similarity among the APPs, further selecting the similar APPs, and pushing the APP to be promoted to an object population using the similar APPs. The other kind is explicit depiction, namely, the classification label of the APP can be obtained through a classifier, and the effect of abstracting and understanding the APP content can be achieved by marking the APP with a predefined classification label, so that the object portrait can be constructed in an auxiliary mode according to the classification label or the advertisement feature corresponding to the APP can be constructed.
In one embodiment, the application program has a plurality of labeling tags, which indicates that the application program can be categorized under different classification systems, each classification system corresponding to a predefined labeling tag. The target classification model is trained in a multi-task manner by using the attribute data carrying the plurality of labeling tags; compared with a single training task under one classification system, the trained target classification model thus has the capability of multi-way classification of the application program, that is, identifying the category of the application program under each classification system. For example, if the application program is a game APP, different tags may be labeled for the game APP from the two classification systems of game style and game type. When training the target classification model in a multi-task training scenario, each difference may be the sum of the Losses of the individual training tasks. The first difference may be determined in the following manner: determining, from a plurality of reference application types output by the target classification model, the reference application type associated with each labeling tag of the application program; constructing a sub-difference based on each associated pair of one labeling tag and one reference application type, and using the resulting sub-differences together as the first difference between the reference application type and the labeling application type.
Specifically, the plurality of reference application types output by the target classification model belong to application types under different classification systems; each reference application type is correspondingly associated with a labeling tag under the corresponding classification system, and one labeling tag of the application program can be associated with one reference application type. For the reference application type under one classification system, the data processing device can construct the sub-difference between the labeling tag and the associated reference application type, and each classification system correspondingly yields one sub-difference, so that the sub-differences can together serve as the first difference between the reference application type and the labeling application type; specifically, the sum of the sub-differences can be used as the first difference.
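A hedged sketch of the multi-task first difference as the sum of per-classification-system sub-differences; the shapes and the use of cross entropy as the sub-difference are assumptions.

```python
# Illustrative sketch: sum the sub-difference of each classification system.
import torch
import torch.nn.functional as F

def multitask_first_difference(per_system_logits, per_system_labels):
    """per_system_logits: list of (batch, num_classes_k) tensors, one per system."""
    sub_differences = [F.cross_entropy(logits, labels)
                       for logits, labels in zip(per_system_logits, per_system_labels)]
    return torch.stack(sub_differences).sum()
```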
It should be noted that, in the multi-task classification scenario, the second difference is determined on the same principle, that is, the sub-differences between the identification types obtained under each classification system are calculated, and the sub-differences together serve as the second difference. For example, the Loss under multi-task training can be upgraded to the Loss under the MMoE (Multi-gate Mixture-of-Experts) mode. In the MMoE mode, the underlying network includes a plurality of expert networks, each classification task is treated as one task, each task uses a separate gate network, and the gate network of each task selectively utilizes the expert networks through different final output weights, so that the gate networks of different tasks can learn different modes of combining the expert networks, thereby better capturing the relevance and distinction of the tasks.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing means may be a computer program (comprising program code) running in the data processing device, for example the data processing means is an application software; the data processing device may be used to execute corresponding steps in the data processing method provided by the embodiment of the present application. As shown in fig. 8, the data processing apparatus 800 may include: acquisition module 801, identification module 802, determination module 803, training module 804.
An acquiring module 801, configured to acquire first attribute data and second attribute data of an application program; the first attribute data is attribute data with a labeling tag and one or more modal dimensions, wherein the labeling tag is used for indicating a labeling application type of an application program; the second attribute data is attribute data without label tags;
the identifying module 802 is configured to invoke the target classification model to identify attribute data of each mode dimension under the first attribute data, so as to obtain a reference application type of the application program;
a determining module 803 for determining a first difference between the reference application type and the labeling application type;
The obtaining module 801 is further configured to obtain enhancement data corresponding to the second attribute data, and call the target classification model to respectively identify the second attribute data and the enhancement data, so as to obtain an identification type of the application program based on the second attribute data and an identification type of the application program based on the enhancement data;
a determining module 803, configured to determine a second difference between the identification type of the application program based on the second attribute data and the identification type of the application program based on the enhancement data;
the training module 804 is configured to perform model training on the target classification model according to the first difference and the second difference, so as to obtain a trained target classification model; the trained target classification model is used for identifying the application type.
In one embodiment, the obtaining module 801 is configured to: acquiring a modal dimension describing application attributes of an application program, wherein the modal dimension comprises one or more of the following: text dimension, image dimension, and speech dimension; acquiring attribute description information of an application program in a corresponding mode dimension based on the mode dimension, and acquiring a labeling label of the application program; and associating the label tag with the corresponding attribute description information, and taking the attribute description information of the associated label tag as first attribute data.
In one embodiment, the target classification model is a pre-trained target classification model, wherein the pre-trained target classification model comprises a pre-trained encoding module; the identification module 802 is specifically configured to: invoking a pre-trained coding module in the target classification model to perform feature coding processing on the first attribute data to obtain coding features of the first attribute data; the target classification model also comprises an identification module which does not complete training; and adopting an identification module which does not complete training, and carrying out identification processing on the application type of the application program according to the coding characteristics of the first attribute data to obtain the reference application type of the application program.
In one embodiment, if the first attribute data includes attribute description information of a plurality of modal dimensions; the identification module 802 is specifically configured to: invoking a pre-trained coding module in the target classification model, and respectively carrying out feature coding processing on attribute description information of different mode dimensions contained in the first attribute data to obtain description features corresponding to the attribute description information of the corresponding mode dimensions in the first attribute data; and splicing the description features corresponding to the attribute description information under different mode dimensions, and taking the spliced description features as the coding features of the first attribute data.
In one embodiment, when the attribute description information in any mode dimension is a plurality of attribute description information, and the information types of the plurality of attribute description information in any mode dimension are different; the identification module 802 is specifically configured to: invoking a pre-trained coding module in the target classification model, and respectively coding attribute description information corresponding to different information types in any mode dimension in the first attribute data to obtain description features corresponding to the attribute description information of the corresponding information type in any mode dimension; the description features corresponding to the attribute description information of each information type are taken as the description features corresponding to the attribute description information in any mode dimension; or, the spliced description characteristic obtained based on the description characteristic corresponding to the attribute description information of each information type is used as the description characteristic corresponding to the attribute description information in any mode dimension.
In one embodiment, the second attribute data is data containing attribute description information for a plurality of modality dimensions; the obtaining module 801 is specifically configured to: determining the mode dimensions corresponding to the attribute description information in the second attribute data respectively, and acquiring a data enhancement algorithm matched with the corresponding mode dimensions; and adopting a matched data enhancement algorithm to perform data enhancement processing on the attribute description information under the corresponding mode dimension, and taking the attribute description information after enhancement processing as enhancement data of the second attribute data.
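One possible way to match each modal dimension with its data enhancement algorithm is a simple dispatch table; the sketch below is illustrative only and assumes per-dimension routines such as those sketched in the following embodiments:

```python
from typing import Any, Callable, Dict

def augment_second_attribute_data(sample: Dict[str, Any],
                                  augmenters: Dict[str, Callable[[Any], Any]]) -> Dict[str, Any]:
    # `augmenters` maps each modal dimension to its matched data enhancement algorithm,
    # e.g. {"text": back-translation or random deletion, "image": noise addition or occlusion}
    return {dim: augmenters[dim](info) if dim in augmenters else info
            for dim, info in sample.items()}
```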
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being a text dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the text dimension; the obtaining module 801 is specifically configured to: acquiring attribute description information of a text dimension from the second attribute data, and translating the acquired attribute description information of the text dimension to obtain a translation text corresponding to the attribute description information of the text dimension; performing back translation processing on the translation text to obtain a back translation text of the translation text; the back translation text is used as the attribute description information after the data enhancement processing is carried out on the acquired attribute description information of the text dimension.
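A minimal back-translation sketch, assuming the caller supplies some machine-translation callable; no particular translation service or language pair is implied by the embodiment:

```python
from typing import Callable

def back_translate(text: str,
                   translate: Callable[[str, str, str], str],
                   source: str = "zh", pivot: str = "en") -> str:
    # `translate(text, src, dst)` is any machine-translation callable supplied by the caller
    translated = translate(text, source, pivot)              # translation processing
    back_translated = translate(translated, pivot, source)   # back-translation processing
    # The back-translated text is the enhanced text-dimension attribute description
    return back_translated
```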
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being a text dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the text dimension; the obtaining module 801 is specifically configured to: acquiring attribute description information of a text dimension from the second attribute data, and performing random information deletion processing on the acquired attribute description information of the text dimension to obtain a deleted text corresponding to the attribute description information of the text dimension; the deleted text is used as the attribute description information after the data enhancement processing is carried out on the acquired attribute description information of the text dimension.
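A minimal sketch of random information deletion on whitespace-tokenised text (the drop probability is an arbitrary illustrative value):

```python
import random

def random_delete(text: str, drop_prob: float = 0.1) -> str:
    tokens = text.split()
    if not tokens:
        return text
    kept = [tok for tok in tokens if random.random() > drop_prob]
    # Keep at least one token so the deleted text never becomes empty
    return " ".join(kept) if kept else random.choice(tokens)
```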
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being an image dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the image dimension; the obtaining module 801 is specifically configured to: acquiring attribute description information of the image dimension from the second attribute data, and acquiring target noise; superimposing the target noise onto the attribute description information of the image dimension to obtain a noise image corresponding to the attribute description information of the image dimension; the noise image is used as attribute description information after data enhancement processing is performed on the obtained attribute description information of the image dimension.
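A minimal sketch of superimposing target noise, here assumed to be zero-mean Gaussian noise on an image tensor normalised to [0, 1]; both assumptions are illustrative and not fixed by the embodiment:

```python
import torch

def add_noise(image: torch.Tensor, std: float = 0.05) -> torch.Tensor:
    # Target noise: zero-mean Gaussian noise superimposed on the image-dimension description
    noise = torch.randn_like(image) * std
    return (image + noise).clamp(0.0, 1.0)  # assumes pixel values are normalised to [0, 1]
```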
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being an image dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the image dimension; the obtaining module 801 is specifically configured to: acquiring attribute description information of the image dimension from the second attribute data, and determining a target image area from the attribute description information of the image dimension; carrying out random replacement processing on pixel values of image pixels in the target image area to obtain an occlusion image area corresponding to the target image area; the attribute description information of the image dimension containing the occlusion image area is used as the attribute description information after the data enhancement processing is performed on the acquired attribute description information of the image dimension.
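A minimal sketch of randomly replacing the pixel values in a target image area; the square region size is an arbitrary illustrative choice:

```python
import torch

def occlude_region(image: torch.Tensor, size: int = 32) -> torch.Tensor:
    # image: (C, H, W) tensor; pick a target image area and randomly replace its pixel values
    c, h, w = image.shape
    hh, ww = min(size, h), min(size, w)
    top = torch.randint(0, h - hh + 1, (1,)).item()
    left = torch.randint(0, w - ww + 1, (1,)).item()
    occluded = image.clone()
    occluded[:, top:top + hh, left:left + ww] = torch.rand(c, hh, ww)  # random replacement -> occlusion area
    return occluded
```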
In one embodiment, training module 804 is specifically configured to: acquiring a first training weight set for the first difference and a second training weight set for the second difference; and weighting the first difference and the second difference with the first training weight and the second training weight respectively, summing the weighted differences to obtain a target difference for training the target classification model, and training the target classification model by adopting the target difference to obtain a trained target classification model.
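The weighted combination of the two differences amounts to a single expression; the default weights below are illustrative only:

```python
def combine_losses(first_difference, second_difference,
                   first_weight: float = 1.0, second_weight: float = 0.5):
    # Target difference: weighted sum of the labeled (first) and consistency (second) differences
    return first_weight * first_difference + second_weight * second_difference
```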
In one embodiment, at least two training phases are involved in model training the target classification model; training module 804, further configured to: acquiring a current training stage and a target training stage set for a target classification model; when the current training phase is the training phase before the target training phase, adjusting the value of the second training weight so that the adjusted second training weight is smaller than the first training weight; and when the current training stage is the target training stage, adjusting the value of the second training weight so that the adjusted second training weight is greater than the first training weight.
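An illustrative sketch of adjusting the second training weight by training stage; the 0.5 and 2.0 factors are arbitrary choices made only to satisfy the smaller-than / greater-than relation described above:

```python
def second_training_weight(current_stage: int, target_stage: int, first_weight: float = 1.0) -> float:
    # Before the target training stage the second weight stays below the first weight;
    # from the target training stage onward it is raised above the first weight.
    return 0.5 * first_weight if current_stage < target_stage else 2.0 * first_weight
```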
In one embodiment, if the application program has a plurality of labeling labels, the number of reference application types obtained by invoking the target classification model to identify the first attribute data is also a plurality; the determining module 803 is specifically configured to: determining a reference application type associated with each labeling label of the application program from the plurality of reference application types output by the target classification model; constructing a sub-difference based on each associated pair of a labeling label and a reference application type, and taking the total of the resulting sub-differences as the first difference between the reference application type and the labeling application type.
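An illustrative sketch of accumulating one sub-difference per associated labeling label and reference application type; cross-entropy is used here only as one possible difference measure, which the embodiment does not prescribe:

```python
import torch
import torch.nn.functional as F

def first_difference(type_scores: torch.Tensor, label_ids: list) -> torch.Tensor:
    # type_scores: (num_labels, num_types) — one row of reference-application-type scores
    # per labeling label; each associated (label, reference type) pair yields one sub-difference,
    # and the total of the sub-differences is the first difference.
    sub_differences = [
        F.cross_entropy(type_scores[i].unsqueeze(0), torch.tensor([label_ids[i]]))
        for i in range(len(label_ids))
    ]
    return torch.stack(sub_differences).sum()
```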
It may be understood that the functions of each functional module of the data processing apparatus described in the embodiments of the present application may be specifically implemented according to the method in the embodiments of the method, and the specific implementation process may refer to the relevant description of the embodiments of the method and will not be repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a data processing device according to an embodiment of the present application. The data processing device 900 may be a stand-alone device (e.g., a node, a terminal, or the like) or a component (e.g., a chip, a software module, or a hardware module) inside such a device. The data processing device 900 may comprise at least one processor 901 and a network interface 902; further optionally, the data processing device 900 may comprise at least one memory 903 and a bus 904, wherein the processor 901, the network interface 902 and the memory 903 are coupled by the bus 904.
The processor 901 is a module for performing arithmetic operations and/or logic operations, and may specifically be one or more of a central processing unit (central processing unit, CPU), a graphics processor (graphics processing unit, GPU), a microprocessor (microprocessor unit, MPU), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA), a complex programmable logic device (Complex programmable logic device, CPLD), a coprocessor (which assists the central processing unit in completing corresponding processing and applications), a micro control unit (Microcontroller Unit, MCU), and other processing modules.
The network interface 902 may be used to provide information input or output to the at least one processor, and/or to receive data sent from the outside and/or send data to the outside. It may be a wired link interface, such as an Ethernet cable interface, or a wireless link interface (e.g., Wi-Fi, Bluetooth, universal wireless transmission, vehicle-mounted short-range communication, or other short-range wireless communication technologies).
The memory 903 is used to provide storage space in which data such as an operating system and computer programs can be stored. The memory 903 may be one or more of a random access memory (random access memory, RAM), a read-only memory (ROM), an erasable programmable read-only memory (erasable programmable read only memory, EPROM), or a portable read-only memory (compact disc read-only memory, CD-ROM), etc.
The at least one processor 901 in the data processing device 900 is configured to invoke the computer program stored in the at least one memory 903 to perform the data processing method described in the foregoing embodiments of the present application.
In a possible implementation, the processor 901 in the data processing device 900 is configured to invoke a computer program stored in the at least one memory 903 for performing the following operations: acquiring first attribute data and second attribute data of an application program; the first attribute data is attribute data with a labeling tag and one or more modal dimensions, wherein the labeling tag is used for indicating a labeling application type of an application program; the second attribute data is attribute data without label tags; invoking a target classification model to identify attribute data of each mode dimension under the first attribute data to obtain a reference application type of the application program, and determining a first difference between the reference application type and the labeling application type; acquiring enhancement data corresponding to the second attribute data, and calling a target classification model to respectively identify the second attribute data and the enhancement data to obtain an identification type of the application program based on the second attribute data and an identification type of the application program based on the enhancement data; determining a second difference between the recognition type of the application program based on the second attribute data and the recognition type of the application program based on the enhancement data, and performing model training on the target classification model according to the first difference and the second difference to obtain a trained target classification model; the trained target classification model is used for identifying the application type.
In one embodiment, processor 901 is configured to: acquiring a modal dimension describing application attributes of an application program, wherein the modal dimension comprises one or more of the following: text dimension, image dimension, and speech dimension; acquiring attribute description information of an application program in a corresponding mode dimension based on the mode dimension, and acquiring a labeling label of the application program; and associating the label tag with the corresponding attribute description information, and taking the attribute description information of the associated label tag as first attribute data.
In one embodiment, the target classification model is a pre-trained target classification model, wherein the pre-trained target classification model comprises a pre-trained encoding module; the processor 901 is specifically configured to: invoking a pre-trained coding module in the target classification model to perform feature coding processing on the first attribute data to obtain coding features of the first attribute data; the target classification model also comprises an identification module which does not complete training; and adopting an identification module which does not complete training, and carrying out identification processing on the application type of the application program according to the coding characteristics of the first attribute data to obtain the reference application type of the application program.
In one embodiment, if the first attribute data includes attribute description information of a plurality of modal dimensions; the processor 901 is specifically configured to: invoking a pre-trained coding module in the target classification model, and respectively carrying out feature coding processing on attribute description information of different mode dimensions contained in the first attribute data to obtain description features corresponding to the attribute description information of the corresponding mode dimensions in the first attribute data; and splicing the description features corresponding to the attribute description information under different mode dimensions, and taking the spliced description features as the coding features of the first attribute data.
In one embodiment, when the attribute description information in any mode dimension is a plurality of attribute description information, and the information types of the plurality of attribute description information in any mode dimension are different; the processor 901 is specifically configured to: invoking a pre-trained coding module in the target classification model, and respectively coding attribute description information corresponding to different information types in any mode dimension in the first attribute data to obtain description features corresponding to the attribute description information of the corresponding information type in any mode dimension; the description features corresponding to the attribute description information of each information type are taken as the description features corresponding to the attribute description information in any mode dimension; or, the spliced description characteristic obtained based on the description characteristic corresponding to the attribute description information of each information type is used as the description characteristic corresponding to the attribute description information in any mode dimension.
In one embodiment, the second attribute data is data containing attribute description information for a plurality of modality dimensions; the processor 901 is specifically configured to: determining the mode dimensions corresponding to the attribute description information in the second attribute data respectively, and acquiring a data enhancement algorithm matched with the corresponding mode dimensions; and adopting a matched data enhancement algorithm to perform data enhancement processing on the attribute description information under the corresponding mode dimension, and taking the attribute description information after enhancement processing as enhancement data of the second attribute data.
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being a text dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the text dimension; the processor 901 is specifically configured to: acquiring attribute description information of a text dimension from the second attribute data, and translating the acquired attribute description information of the text dimension to obtain a translation text corresponding to the attribute description information of the text dimension; performing back translation processing on the translation text to obtain a back translation text of the translation text; the back translation text is used as the attribute description information after the data enhancement processing is carried out on the acquired attribute description information of the text dimension.
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being a text dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the text dimension; the processor 901 is specifically configured to: acquiring attribute description information of a text dimension from the second attribute data, and performing random information deletion processing on the acquired attribute description information of the text dimension to obtain a deleted text corresponding to the attribute description information of the text dimension; the deleted text is used as the attribute description information after the data enhancement processing is carried out on the acquired attribute description information of the text dimension.
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being an image dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the image dimension; the processor 901 is specifically configured to: acquiring attribute description information of the image dimension from the second attribute data, and acquiring target noise; superimposing the target noise onto the attribute description information of the image dimension to obtain a noise image corresponding to the attribute description information of the image dimension; the noise image is used as attribute description information after data enhancement processing is performed on the obtained attribute description information of the image dimension.
In one embodiment, when the second attribute data includes attribute description information corresponding to a modality dimension being an image dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched with the image dimension; the processor 901 is specifically configured to: acquiring attribute description information of the image dimension from the second attribute data, and determining a target image area from the attribute description information of the image dimension; carrying out random replacement processing on pixel values of image pixels in the target image area to obtain an occlusion image area corresponding to the target image area; the attribute description information of the image dimension containing the occlusion image area is used as the attribute description information after the data enhancement processing is performed on the acquired attribute description information of the image dimension.
In one embodiment, the processor 901 is specifically configured to: acquiring a first training weight set for the first difference and a second training weight set for the second difference; and weighting the first difference and the second difference with the first training weight and the second training weight respectively, summing the weighted differences to obtain a target difference for training the target classification model, and training the target classification model by adopting the target difference to obtain a trained target classification model.
In one embodiment, at least two training phases are involved in model training the target classification model; processor 901, further configured to: acquiring a current training stage and a target training stage set for a target classification model; when the current training phase is the training phase before the target training phase, adjusting the value of the second training weight so that the adjusted second training weight is smaller than the first training weight; and when the current training stage is the target training stage, adjusting the value of the second training weight so that the adjusted second training weight is greater than the first training weight.
In one embodiment, if the application program has a plurality of labeling labels, the number of reference application types obtained by invoking the target classification model to identify the first attribute data is also a plurality; the processor 901 is specifically configured to: determining a reference application type associated with each labeling label of the application program from the plurality of reference application types output by the target classification model; constructing a sub-difference based on each associated pair of a labeling label and a reference application type, and taking the total of the resulting sub-differences as the first difference between the reference application type and the labeling application type.
It should be understood that the data processing device 900 described in the embodiment of the present application may perform the data processing method described in the foregoing embodiments, and may also implement the functions of the data processing apparatus 800 described in the embodiment corresponding to fig. 8, which are not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
In addition, it should be noted that an exemplary embodiment of the present application further provides a storage medium storing a computer program of the foregoing data processing method; the computer program includes program instructions, and when one or more processors load and execute the program instructions, the data processing method described in the foregoing embodiments can be implemented. The beneficial effects of the same method are not repeated herein. It will be appreciated that the program instructions may be deployed to be executed on one or more data processing devices that are capable of communicating with one another.
The computer readable storage medium may be an internal storage unit of the data processing apparatus or the data processing device provided in any one of the foregoing embodiments, for example, a hard disk or a memory of the data processing device. The computer readable storage medium may also be an external storage device of the data processing device, such as a plug-in hard disk, a smart media card (smart media card, SMC), a secure digital (secure digital, SD) card, or a flash card provided on the data processing device. Further, the computer readable storage medium may also include both an internal storage unit and an external storage device of the data processing device. The computer readable storage medium is used to store the computer program and other programs and data required by the data processing device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
In one aspect of the application, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the data processing apparatus reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the data processing apparatus performs the method provided in an aspect of the embodiment of the present application.
The steps in the method of the embodiment of the application can be sequentially adjusted, combined and deleted according to actual needs.
The modules in the device of the embodiment of the application can be combined, divided and deleted according to actual needs.
The above disclosure is merely a part of the embodiments of the present application and is not intended to limit the scope of the claims of the present application. Those skilled in the art will understand that all or part of the processes for implementing the above embodiments, as well as equivalent changes made according to the claims of the present application, still fall within the scope covered by the present application.

Claims (16)

1. A method of data processing, the method comprising:
acquiring first attribute data and second attribute data of an application program; the first attribute data is attribute data with a labeling label and one or more modal dimensions, and the labeling label is used for indicating a labeling application type of the application program; the second attribute data is attribute data without label tags;
Invoking a target classification model to identify attribute data of each mode dimension under the first attribute data to obtain a reference application type of the application program, and determining a first difference between the reference application type and the labeling application type;
acquiring enhancement data corresponding to the second attribute data, and calling the target classification model to respectively identify the second attribute data and the enhancement data to obtain an identification type of the application program based on the second attribute data and an identification type of the application program based on the enhancement data;
determining a second difference between the recognition type of the application program based on the second attribute data and the recognition type of the application program based on the enhancement data, and performing model training on the target classification model according to the first difference and the second difference to obtain a trained target classification model; the trained target classification model is used for identifying the application type.
2. The method of claim 1, wherein the obtaining the first attribute data of the application program comprises:
a modal dimension describing application attributes of an application program is obtained, wherein the modal dimension comprises one or more of the following: text dimension, image dimension, and speech dimension;
Acquiring attribute description information of the application program in a corresponding mode dimension based on the mode dimension, and acquiring a labeling label of the application program;
and associating the labeling label with the corresponding attribute description information, and taking the attribute description information associated with the labeling label as first attribute data.
3. The method of claim 1, wherein the target classification model is a pre-trained target classification model, wherein the pre-trained target classification model comprises a pre-trained encoding module; the calling target classification model identifies attribute data of each mode dimension under the first attribute data to obtain a reference application type of the application program, and the method comprises the following steps:
invoking a pre-trained coding module in the target classification model to perform feature coding processing on the first attribute data to obtain coding features of the first attribute data; the target classification model also comprises an identification module which does not complete training;
and adopting the recognition module which does not complete training, and recognizing the application type of the application program according to the coding characteristics of the first attribute data to obtain the reference application type of the application program.
4. A method according to claim 3, wherein if the first attribute data comprises attribute description information for a plurality of modality dimensions; the invoking the pre-trained coding module in the target classification model to perform feature coding processing on the first attribute data to obtain coding features of the first attribute data includes:
invoking a pre-trained coding module in the target classification model, and respectively carrying out feature coding processing on attribute description information of different mode dimensions contained in the first attribute data to obtain description features corresponding to the attribute description information of the corresponding mode dimensions in the first attribute data;
and splicing the description features corresponding to the attribute description information under different mode dimensions, and taking the spliced description features as the coding features of the first attribute data.
5. The method of claim 4, wherein when there are a plurality of pieces of attribute description information in any one mode dimension, and the plurality of pieces of attribute description information in the any one mode dimension are of different information types, the invoking the pre-trained coding module in the target classification model to perform feature coding processing on the attribute description information of the any one mode dimension in the first attribute data to obtain the description features corresponding to the attribute description information in the corresponding mode dimension comprises:
Invoking a pre-trained coding module in the target classification model, and respectively coding attribute description information corresponding to different information types in any mode dimension in the first attribute data to obtain description characteristics corresponding to the attribute description information of the corresponding information type in any mode dimension;
the description features corresponding to the attribute description information of each information type are taken as the description features corresponding to the attribute description information in any mode dimension; or, the spliced description characteristic obtained based on the description characteristic corresponding to the attribute description information of each information type is used as the description characteristic corresponding to the attribute description information in any mode dimension.
6. The method of any of claims 1-5, wherein the second attribute data is data comprising attribute description information for a plurality of modality dimensions; the obtaining the enhancement data corresponding to the second attribute data includes:
determining the mode dimension corresponding to each attribute description information in the second attribute data, and acquiring a data enhancement algorithm matched with the corresponding mode dimension;
and adopting a matched data enhancement algorithm to perform data enhancement processing on the attribute description information under the corresponding mode dimension, and taking the attribute description information after enhancement processing as enhancement data of the second attribute data.
7. The method of claim 6, wherein when the second attribute data includes attribute description information corresponding to a modality dimension being a text dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched to the text dimension; the data enhancement processing is carried out on the attribute description information under the corresponding mode dimension by adopting a matched data enhancement algorithm, and the data enhancement processing comprises the following steps:
acquiring attribute description information of a text dimension from the second attribute data, and translating the acquired attribute description information of the text dimension to obtain a translation text corresponding to the attribute description information of the text dimension;
performing back translation processing on the translation text to obtain a back translation text of the translation text; the back translation text is used as the attribute description information after the data enhancement processing is carried out on the acquired attribute description information of the text dimension.
8. The method of claim 6, wherein when the second attribute data includes attribute description information corresponding to a modality dimension being a text dimension, the determined matched data enhancement algorithm is a data enhancement algorithm matched to the text dimension; the data enhancement processing is carried out on the attribute description information under the corresponding mode dimension by adopting a matched data enhancement algorithm, and the data enhancement processing comprises the following steps:
Acquiring attribute description information of a text dimension from the second attribute data, and performing random information deletion processing on the acquired attribute description information of the text dimension to obtain a deleted text corresponding to the attribute description information of the text dimension;
the deleted text is used as the attribute description information after the data enhancement processing is carried out on the acquired attribute description information of the text dimension.
9. The method of claim 6, wherein the determined matched data enhancement algorithm is a data enhancement algorithm matched to the image dimension when the second attribute data includes attribute description information corresponding to the modality dimension as the image dimension; the data enhancement processing is carried out on the attribute description information under the corresponding mode dimension by adopting a matched data enhancement algorithm, and the data enhancement processing comprises the following steps:
acquiring attribute description information of image dimension from the second attribute data, and acquiring target noise;
the target noise is added to the attribute description information of the image dimension to obtain a noise image corresponding to the attribute description information of the image dimension; the noise image is used as attribute description information after data enhancement processing is carried out on the obtained attribute description information of the image dimension.
10. The method of claim 6, wherein the determined matched data enhancement algorithm is a data enhancement algorithm matched to the image dimension when the second attribute data includes attribute description information corresponding to the modality dimension as the image dimension; the data enhancement processing is carried out on the attribute description information under the corresponding mode dimension by adopting a matched data enhancement algorithm, and the data enhancement processing comprises the following steps:
acquiring attribute description information of an image dimension from the second attribute data, and determining a target image area from the attribute description information of the image dimension;
carrying out random replacement processing on pixel values of image pixels in the target image area to obtain an occlusion image area corresponding to the target image area;
the attribute description information of the image dimension containing the occlusion image area is used as the attribute description information after the data enhancement processing of the acquired attribute description information of the image dimension.
11. The method of claim 1, wherein the model training the target classification model based on the first variance and the second variance results in a trained target classification model, comprising:
Acquiring a first training weight set for the first difference and a second training weight set for the second difference;
and weighting the first difference and the second difference with the first training weight and the second training weight respectively, summing the weighted differences to obtain a target difference for training the target classification model, and training the target classification model by adopting the target difference to obtain a trained target classification model.
12. The method of claim 11, comprising at least two training phases when model training the target classification model; further comprises:
acquiring a current training stage and a target training stage set for the target classification model;
when the current training stage is the training stage before the target training stage, adjusting the value of the second training weight so that the adjusted second training weight is smaller than the first training weight;
and when the current training stage is the target training stage, adjusting the value of the second training weight so that the adjusted second training weight is greater than the first training weight.
13. The method of claim 1, wherein if the application program has a plurality of labeling labels, the number of reference application types obtained by invoking the target classification model to identify the first attribute data is also a plurality; the determining the first difference between the reference application type and the labeling application type includes:
Determining a reference application type associated with a labeling label of the application program from a plurality of reference application types output by the target classification model;
constructing a sub-difference based on each associated pair of a labeling label and a reference application type, and taking the total of the obtained sub-differences as the first difference between the reference application type and the labeling application type.
14. A data processing apparatus, comprising:
the acquisition module is used for acquiring the first attribute data and the second attribute data of the application program; the first attribute data is attribute data with a labeling label and one or more modal dimensions, and the labeling label is used for indicating a labeling application type of the application program; the second attribute data is attribute data without label tags;
the identification module is used for calling a target classification model to identify attribute data of each mode dimension under the first attribute data so as to obtain a reference application type of the application program;
the determining module is used for determining a first difference between the reference application type and the labeling application type;
the acquisition module is further used for acquiring enhancement data corresponding to the second attribute data;
The recognition module is further used for calling the target classification model to respectively recognize the second attribute data and the enhancement data to obtain the recognition type of the application program based on the second attribute data and the recognition type of the application program based on the enhancement data;
the determining module is further configured to determine a second difference between the identification type of the application program based on the second attribute data and the identification type of the application program based on the enhancement data;
the training module is used for carrying out model training on the target classification model according to the first difference and the second difference to obtain a trained target classification model; the trained target classification model is used for identifying the application type.
15. A data processing apparatus, comprising: a processor, a memory, and a network interface; the processor is connected to the memory and the network interface, wherein the network interface is configured to provide network communication functions, the memory is configured to store program code, and the processor is configured to invoke the program code to perform the data processing method of any of claims 1-13.
16. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the data processing method of any of claims 1-13.
CN202211263524.0A 2022-10-10 2022-10-10 Data processing method, device, equipment and medium Pending CN117034133A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211263524.0A CN117034133A (en) 2022-10-10 2022-10-10 Data processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211263524.0A CN117034133A (en) 2022-10-10 2022-10-10 Data processing method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN117034133A true CN117034133A (en) 2023-11-10

Family

ID=88630538

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211263524.0A Pending CN117034133A (en) 2022-10-10 2022-10-10 Data processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117034133A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118262181A (en) * 2024-05-29 2024-06-28 山东鲁能控制工程有限公司 Automatic data processing system based on big data


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination