CN114708593A - Heterogeneous multi-model-based waste electronic product brand identification method - Google Patents
- Publication number
- CN114708593A (application CN202111673248.0A)
- Authority
- CN
- China
- Prior art keywords
- model
- character
- output
- image
- electronic product
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/08—Learning methods
Abstract
A heterogeneous multi-model method for identifying the brand of waste electronic products is provided, addressing two problems: the relevant data sets are small, and the accuracy of existing identification methods falls short of actual industrial requirements. The character region on the back of the electronic product is extracted with the CRAFT algorithm, and a VGG19 model pre-trained on ImageNet serves as the image feature embedding model to extract both the character-region features and the overall features of the electronic product to be recycled. An OCR character recognition model is built on the character features to obtain the OCR sub-model result, and a deep forest classification model is built on the character and overall features to obtain the deep forest sub-model result. The OCR recognition result and the deep forest classification vector are linearly combined, a class weight vector is obtained with the softmax nonlinear function, and the label with the highest weight is taken as the brand recognition result. Effectiveness is verified on real mobile phone and tablet images captured by waste electronic product recycling equipment.
Description
Technical Field
The invention belongs to the field of waste electronic product recycling.
Background
With the development of science and technology and the rapid rollout of 5G, smart electronic products are being replaced at an ever faster pace. According to a Strategy Analytics forecast, global shipments of smart electronic products in 2021 rebounded by 6.5% year-on-year, reaching a total of 1.38 billion units. The accelerating replacement of electronic products is the main driver of growing shipments, and it also causes the number of idle personal electronic products to increase year by year. Domestic and foreign markets therefore place higher demands on the efficiency of the electronic product recycling industry. Waste electronic products are a typical urban renewable resource, and recycling them with unmanned, intelligent recycling equipment can save substantial labor cost. An intelligent identification method for waste electronic products is the key to this task.
Image recognition is widely applied in fields such as object detection and face recognition, and how to build a classification model from the relevant data sets to intelligently recognize waste electronic products has become a key research focus of current intelligent recycling equipment. However, constructing a deep neural network model from images depends on massive labeled samples. The data set for this identification problem comes only from pictures actually taken by recycling equipment prototypes: the data volume is small, an effective neural network classifier is hard to build, the images captured in the industrial process have low definition, and non-standard user operation can cause problems such as incomplete product images and partially mirrored regions. How to classify electronic product brands given few, low-quality samples is the main problem to be solved.
Building on this state of the art, the present inventors previously proposed a waste electronic product identification system based on a parallel differential-evolution gradient-feature deep forest, which builds a mobile phone brand classification model from back images of waste phones and reaches a classification accuracy of 80.12%, and a system based on optical character recognition, which builds a character classification model from the back characters of waste electronic products and maps the character classification result to a brand via mapping rules, reaching 86.37% accuracy. However, these methods build the classifier from a single perspective, such as texture features or character features alone, and their accuracy still falls short of actual industrial requirements. The present invention therefore provides a heterogeneous multi-model method for identifying waste electronic products.
First, the character region on the back of the electronic product is extracted with the CRAFT algorithm. Second, a VGG19 model pre-trained on ImageNet extracts features from the back image of the electronic product and from its character region, replacing single-dimensional features with high-dimensional convolutional features. Then an optical character recognition (OCR) model is built on the character features, and a deep forest electronic product classification model is built on the image and character features. Finally, the classification results of the different models are linearly concatenated, and the final classification result is obtained through the softmax function. The effectiveness of the algorithm for identifying waste electronic products is verified on a typical electronic product image data set from the Telecommunication Equipment Certification Center of the Ministry of Industry and Information Technology.
Disclosure of Invention
The heterogeneous multi-model method for identifying waste electronic products consists of 3 parts: an image preprocessing module, a multivariate feature extraction module, and a heterogeneous multi-model recognition module. The overall system structure is shown in FIG. 1.
The meanings of the variables appearing in the present invention are listed in Table 1.
TABLE 1 meaning of variables
The image preprocessing module takes the raw captured image as input; the output of the data enhancement preprocessing is $X_{img}$, and the output of the character preprocessing using the CRAFT character-level target detection algorithm is $X_{digit}$.
The multivariate feature extraction module uses a VGG19 network pre-trained on ImageNet to obtain representations of the character features and of the whole back-image pixel features in a high-dimensional space. Its inputs are $X_{img}$ and $X_{digit}$, and its outputs are $\hat{X}_{img}$ and $\hat{X}_{digit}$ respectively.
The heterogeneous multi-model recognition module comprises 3 parts: an OCR character recognition sub-module, a deep forest electronic product recognition sub-module, and a softmax nonlinear output layer sub-module. The OCR sub-module takes $\hat{X}_{digit}$ as input and outputs $Y_{OCR}$; the deep forest sub-module takes $X_{DF}$ as input and outputs $Y_{DF}$; the softmax nonlinear output layer sub-module maps the concatenated sub-module outputs to a weight vector, and the label with the highest score is the final output $Y$.
2.1 Image preprocessing module
2.1.1 Data enhancement preprocessing
Data enhancement, which randomly alters training samples, reduces the model's dependence on particular attributes and thereby improves its generalization ability. Common data enhancement methods include geometric transformation, color space transformation, kernel filters, image mixing, random erasing, adversarial training, enhancement based on generative adversarial networks, and neural style transfer. During the recycling of waste electronic products, the product images shift in position because users place the devices differently, so the data used by this method are enhanced mainly by geometric transformation, specifically: rotation, flipping, mirroring, translation, addition of Gaussian noise, and the like.
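A minimal sketch of these geometric augmentations in Python with torchvision; the transform list and the `add_gaussian_noise` helper are illustrative choices, not part of the patent:

```python
import numpy as np
import torchvision.transforms as T
from PIL import Image

def add_gaussian_noise(img, sigma=10.0):
    """Add zero-mean Gaussian noise to a PIL image (illustrative helper)."""
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(0.0, sigma, arr.shape)
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

augment = T.Compose([
    T.RandomRotation(degrees=15),                      # rotation
    T.RandomHorizontalFlip(p=0.5),                     # mirror
    T.RandomVerticalFlip(p=0.5),                       # flip
    T.RandomAffine(degrees=0, translate=(0.1, 0.1)),   # translation
    T.Lambda(add_gaussian_noise),                      # Gaussian noise
])
```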
2.1.2 Character enhancement preprocessing
During electronic product recycling, non-standard user operation can lead to incomplete camera captures, mirrored product images, and similar problems, so a model built directly from the raw pictures collected by the recycling equipment predicts brands poorly. The characters on the back of an electronic product are an important basis for identifying its brand, but they wear off or become occluded during use, so a model that relies on back characters alone is limited. The authors therefore select the character features in the back image as one of several classification bases: the CRAFT character-level localization algorithm determines and segments the positions of the product's characters, and the character features are linearly concatenated with the whole picture as the input of the subsequent models. This addresses both the difficulty of building a classifier from the phone image alone and the limitation of a model built only on character features.
Extensive experiments have shown that object detection algorithms such as YOLOv3 and Fast R-CNN are widely used in face detection, license plate detection, and similar fields, but the targets those algorithms detect have relatively fixed aspect ratios and mostly do not suffer deformation or wear. The CRAFT algorithm is trained on a synthetic data set with character-level labels in a weakly supervised manner: given a back picture of an electronic product without character-level annotation, the model detects characters, synthesizes the corresponding character labels, and then recognizes them, predicting the text region from the affinity between adjacent characters. The CRAFT training process is shown in FIG. 2.
For the synthetic data set, which includes a Gaussian heat map for each character in the image, the CRAFT algorithm is trained in a supervised fashion. For the electronic product back-image data set, the text box region is first annotated and rectified by perspective transformation; the watershed algorithm then yields the bounding box of each character, the corresponding Gaussian heat map is generated, and after the inverse transformation it is pasted back to the corresponding position in the label map of the original image. The confidence score of the watershed segmentation result is computed as

$$s_{conf}(w) = \frac{l(w) - \min\big(l(w),\, |l(w) - l_c(w)|\big)}{l(w)} \tag{1}$$

where $l(w)$ is the length (character count) of the annotated text box $w$ in the electronic product image, and $l_c(w)$ is the number of characters produced by the watershed segmentation.
After segmentation, the algorithm is scored by equation (1): if the estimated string length matches the true character length, the confidence $s_{conf}(w)$ equals 1, and a lower score indicates a less reliable segmentation result.
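Equation (1) is the standard CRAFT confidence score; a small sketch of the computation (the function name `confidence_score` is ours):

```python
def confidence_score(l_w: int, lc_w: int) -> float:
    """Equation (1): l_w is the annotated word length, lc_w the character
    count produced by the watershed segmentation."""
    return (l_w - min(l_w, abs(l_w - lc_w))) / l_w

print(confidence_score(5, 5))   # 1.0 -- segmentation matches the true length
print(confidence_score(5, 3))   # 0.6 -- lower score, less reliable split
```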
2.2 Multivariate feature extraction module
The electronic product image collected by the industrial equipment is denoted $X_{img}$, of size 400 × 300. After the preprocessing of Section 2.1, the character image is denoted $X_{digit} = [x_1, x_2, \ldots, x_m]$, where $x_i$ is the i-th character in the character image; a single character measures 50 × 50, so the whole character image measures 50 × (50·m). To visualize the pixel dimensions through the multivariate feature extraction module, FIG. 3 takes a five-character image as the example, with $X_{digit}$ of size 50 × 250. The output dimension of the module is set manually, so the output dimension of an m-character image is consistent with that of the five-character image. The structure of the multivariate feature extraction module is shown in FIG. 3.
The module adopts a VGG19 model pre-trained on ImageNet as the base model. First, the parameters of the convolutional and pooling layers of VGG19 are frozen; then fully connected layers of different sizes are constructed for the different image inputs; finally, the model outputs for the different images are linearly concatenated as the input features of the subsequent classifiers. The feature dimension after multivariate feature extraction is therefore determined by the fully connected layer dimensions. For the phone image $X_{img}$ and the character image $X_{digit}$ of different sizes, the feature extraction process is given by equation (2):

$$\hat{X}_{img} = f_{VGG}(X_{img}), \qquad \hat{X}_{digit} = f_{VGG}(X_{digit}) \tag{2}$$

where $f_{VGG}(\cdot)$ denotes the output of the VGG19 model.
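A hedged sketch of $f_{VGG}$ as a frozen, ImageNet-pretrained VGG19 with a task-specific fully connected head; the class name and the adaptive-pooling choice are our assumptions, while the head sizes (1024/512) follow the dimensions given later in the embodiment:

```python
import torch
import torch.nn as nn
from torchvision import models

class VGGFeatureEmbedder(nn.Module):
    """f_VGG: frozen VGG19 conv/pool stack plus a task-specific FC head."""
    def __init__(self, out_dim: int):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
        self.features = vgg.features            # conv + pooling layers
        for p in self.features.parameters():    # freeze pretrained weights
            p.requires_grad = False
        self.pool = nn.AdaptiveAvgPool2d((7, 7))
        self.fc = nn.Linear(512 * 7 * 7, out_dim)

    def forward(self, x):
        x = self.pool(self.features(x))
        return self.fc(torch.flatten(x, 1))

embed_img = VGGFeatureEmbedder(out_dim=1024)    # for the 400x300 back image
embed_chars = VGGFeatureEmbedder(out_dim=512)   # for the 50x(50*m) character strip
```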
2.3 Heterogeneous multi-model recognition module
It has been verified that an electronic product classifier built from a single perspective, such as texture features or character features alone, still falls short of actual industrial requirements. The invention therefore adopts the stacking ensemble idea: features from different perspectives are linearly combined to build a heterogeneous multi-classifier, and overall accuracy is improved by integrating several different models. For the waste electronic product brand classification problem, an OCR character recognition model and a deep forest recognition model are constructed; the sub-model structures are described below.
2.3.1 OCR character recognition model sub-module
In the OCR back-character recognition flow, only the character features $\hat{X}_{digit}$ from Section 2.2 are used as input. First, a bidirectional LSTM extracts character sequence features that carry full context information; then a CTC network resolves the misalignment between the input features and the output sequence; finally, the Levenshtein distance between the OCR output string and each known label determines the brand classification result $Y_{OCR}$. The structure of the OCR character recognition model is shown in FIG. 4.
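A sketch of the final Levenshtein-based mapping step; the brand strings follow the experimental categories described later, and both function names are ours:

```python
BRANDS = ["huawei", "matepad", "honor", "mi", "zte",
          "oppo", "vivo", "iphone", "ipad", "other"]

def levenshtein(a: str, b: str) -> int:
    """Standard single-row dynamic-programming edit distance."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[len(b)]

def map_to_brand(ocr_string: str) -> str:
    """Assign the decoded string to the nearest known brand label."""
    return min(BRANDS, key=lambda brand: levenshtein(ocr_string.lower(), brand))

print(map_to_brand("huawe1"))   # -> "huawei"
```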
As shown in FIG. 4, the OCR character recognition model builds k LSTM basic units (k > m) from the character features $\hat{X}_{digit}$ produced by the preprocessing stage. The bidirectional LSTM network contains two sub-network structures; equations (3) and (4) describe the forward and backward passes, respectively:

$$\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(\overrightarrow{h_{i-1}}, x_i) \tag{3}$$

$$\overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(\overleftarrow{h_{i+1}}, x_i) \tag{4}$$

where k is the LSTM basic-unit hyperparameter, $\overrightarrow{h_i}$ is the output of the forward LSTM at step i, $\overleftarrow{h_i}$ is the output of the backward LSTM at step i, and $x_i$ is the i-th input. The bidirectional LSTM output at step i is:

$$h_i = [\overrightarrow{h_i}; \overleftarrow{h_i}] \tag{5}$$
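Equations (3)-(5) correspond to a standard bidirectional LSTM; a minimal PyTorch sketch, with illustrative dimensions and k = 128 as in the embodiment below:

```python
import torch
import torch.nn as nn

k, feat_dim, hidden = 128, 512, 256     # k = 128 LSTM units as in the embodiment
bilstm = nn.LSTM(input_size=feat_dim, hidden_size=hidden,
                 bidirectional=True, batch_first=True)

x = torch.randn(1, k, feat_dim)         # character feature sequence [batch, step, feature]
h, _ = bilstm(x)                        # h[:, i] = [h_fwd_i ; h_bwd_i], eq. (5)
print(h.shape)                          # torch.Size([1, 128, 512])
```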
Next, the CTC network de-duplicates the characters that appear repeatedly in the bidirectional LSTM outputs $[h_1, h_2, \ldots, h_k]$, yielding $[y_1, y_2, \ldots, y_n]$. Because the number k of bidirectional LSTM basic units exceeds the number n of characters on the phone, characters are divided repeatedly; for example, "honor" may be segmented as "hhoonorr". The many such sub-strings all map to the correct result "honor", as in equation (6):

$$B(\text{hhoonorr}) = B(\text{honoorr}) = \cdots = \text{honor} \tag{6}$$
The CTC network obtains the final result Y by maximizing the posterior probability $P(Y \mid X)$ given the input X, where

$$P(Y \mid X) = \sum_{\pi \in B(Y)} P(\pi \mid X) \tag{7}$$

where $\pi \in B(Y)$ ranges over the set of all sub-strings that collapse to Y.
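A sketch of the collapsing map B used in equations (6) and (7); greedy decoding with this collapse approximates the Y maximizing $P(Y \mid X)$, and the blank-symbol handling is the standard CTC convention rather than something the patent specifies:

```python
import itertools

BLANK = "-"    # standard CTC blank symbol (convention, not from the patent)

def ctc_collapse(path: str) -> str:
    """B(.): merge repeated characters, then drop blanks (eq. (6))."""
    merged = "".join(ch for ch, _ in itertools.groupby(path))
    return merged.replace(BLANK, "")

assert ctc_collapse("hhoonorr") == "honor"
assert ctc_collapse("ho-onor") == "hoonor"   # a blank preserves a genuine double letter
```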
2.3.2 Deep forest recognition model sub-module
In the deep forest recognition flow for waste electronic product images, the character features $\hat{X}_{digit}$ and the image features $\hat{X}_{img}$ are linearly concatenated to obtain the deep forest input features $X_{DF}$, as in equation (8):

$$X_{DF} = [\hat{X}_{digit}, \hat{X}_{img}] \tag{8}$$

First, $X_{DF}$ is used to construct several different random forests, whose outputs form the class vectors of this layer. These outputs are then linearly concatenated with $X_{DF}$ and passed as input to the next layer, which constructs further random forests; whether another layer is built is decided from the classification accuracy of the current model. Finally, growth stops when the model accuracy no longer improves, and the classification results of the last layer's random forests are weighted to obtain the final classification result $Y_{DF}$. The structure of the deep forest recognition model is shown in FIG. 5.
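A minimal cascade-growth sketch of the deep forest sub-model, assuming scikit-learn base learners: the RF and GBDT with 100 trees follow the embodiment below, while the stopping rule and the L1+L2-regularized GBDT are approximations (scikit-learn's GradientBoostingClassifier does not expose that exact regularization):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

def grow_cascade(X_tr, y_tr, X_val, y_val, max_layers=5):
    """Grow layers until validation accuracy stops improving.
    Labels are assumed to be integers 0..n-1."""
    layers, best_acc = [], 0.0
    aug_tr, aug_val = X_tr, X_val
    for _ in range(max_layers):
        rf = RandomForestClassifier(n_estimators=100).fit(aug_tr, y_tr)
        gb = GradientBoostingClassifier(n_estimators=100).fit(aug_tr, y_tr)
        p_tr = np.hstack([rf.predict_proba(aug_tr), gb.predict_proba(aug_tr)])
        p_val = np.hstack([rf.predict_proba(aug_val), gb.predict_proba(aug_val)])
        n_cls = p_val.shape[1] // 2
        avg = (p_val[:, :n_cls] + p_val[:, n_cls:]) / 2   # layer's averaged probabilities
        acc = (avg.argmax(axis=1) == y_val).mean()
        if acc <= best_acc:               # stop when accuracy no longer improves
            break
        best_acc = acc
        layers.append((rf, gb))
        aug_tr = np.hstack([X_tr, p_tr])   # class vectors concatenated with X_DF
        aug_val = np.hstack([X_val, p_val])
    return layers
```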
2.3.3 Multi-model output weighting sub-module
In the heterogeneous multi-model recognition module, the OCR character recognition model outputs a continuous character string that is mapped, via the distance measure, to a particular electronic product brand, while the deep forest recognition model outputs probabilities over all brands. To reconcile the differing output forms, and possibly inconsistent results, of the heterogeneous models, a multi-model output weighting module is appended to the classification model. It is shown in FIG. 6.
The softmax function, also called the normalized exponential function, is widely used as the classifier in the supervised learning part of deep networks in current deep learning research. It is given by equation (9):

$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{i=1}^{n+1} e^{z_i}}, \quad j = 1, \ldots, n+1 \tag{9}$$

where n + 1 is the dimension of the heterogeneous multi-model output vector and e is the base of the natural logarithm. The classification model of the invention has n waste electronic product labels; the OCR character recognition result $Y_{OCR}$ and the deep forest recognition result $Y_{DF}$ are linearly concatenated into an (n+1)-dimensional result vector that serves as the input of the softmax function, and the label with the highest resulting weight is taken as the final classification result $Y$.
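A sketch of the weighting step of equation (9); how the scalar OCR output is encoded before splicing is not fully specified in the text, so a single mapped score is assumed here:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))           # max-shifted for numerical stability
    return e / e.sum()

n = 10                                   # number of brand labels
y_df = np.random.rand(n)                 # deep forest scores (illustrative)
y_ocr = 1.0                              # OCR output after Levenshtein mapping (assumed encoding)
weights = softmax(np.concatenate([[y_ocr], y_df]))   # (n+1)-dimensional, eq. (9)
label = int(np.argmax(weights))          # highest-weight entry is the final result
```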
1. Features are extracted from waste electronic product images with a VGG19 network pre-trained on ImageNet. As the convolutional layers deepen, the receptive field of each feature grows and its representational power strengthens, outperforming single-perspective feature extraction. Compared with a deep forest classifier built only on texture-sensitive HOG features, the VGG19-based model is markedly more accurate.
2. A heterogeneous multi-model method builds the waste phone classification model: classifiers for different tasks are constructed from the same data set, and the outputs of the several models are finally weighted through a nonlinear function to obtain the classification result. Tests show that, compared with a single OCR recognition model or a single deep forest recognition model, the proposed heterogeneous multi-model is markedly more accurate.
Drawings
FIG. 1 Structure of the heterogeneous multi-model waste electronic product identification method
FIG. 2 CRAFT localization and cropping module
FIG. 3 Multivariate feature extraction module
FIG. 4 OCR character recognition model
FIG. 5 Deep forest recognition model
FIG. 6 Multi-model weighted output structure
FIG. 7 Application scenario of the waste electronic product recycling equipment
FIG. 8 Data enhancement examples
FIG. 9 Image preprocessing results
FIG. 10 OCR recognition model confusion matrix
FIG. 11 Deep forest recognition model confusion matrix
FIG. 12 Confusion matrix of the heterogeneous multi-model waste electronic product classification model
Detailed Description
An application scenario of the waste electronic product recycling equipment is shown in FIG. 7, and the experimental data of the invention come from pictures actually taken by this equipment. The data set comprises 123 images covering 10 brand categories of waste electronic products: Huawei phone (HUAWEI), Huawei tablet (MatePad), Honor, Xiaomi (Mi), ZTE, OPPO, VIVO, Apple phone (iPhone), Apple tablet (iPad), and other brands (other).
Because real captured samples of waste electronic equipment are scarce, the training and test sets are expanded by data enhancement before the classification model is built. Taking the Honor phone back image as an example, each back image is rotated, flipped, noise-injected, and so on, expanding 1 sample to 12 and the total of 400 images to 4800. The sample expansion is illustrated in FIG. 8.
Then the CRAFT character segmentation algorithm segments the images of the waste electronic products to be recycled, producing the corresponding electronic product character data set; the image preprocessing results are shown in FIG. 9.
The multivariate feature extraction part uses the ImageNet data set of 14 million pictures in more than 20,000 classes to pre-train the VGG19 model, denoted $f_{VGG}(\cdot)$. Fully connected layers of different sizes are appended to the VGG model according to the input image: a 1024-dimensional layer for the 400 × 300 waste electronic product image, and a 512-dimensional layer for the 50 × 50 waste electronic character image.
The OCR character recognition module uses the EasyOCR Chinese-English pre-trained model and builds 128 LSTM basic units, i.e. k = 128. The deep forest recognition model uses a random forest (RF) and GBDT as the base classifiers of each layer; both RF and GBDT are built with 100 decision trees, and the GBDT loss function is optimized with L1 + L2 regularization.
The classification confusion matrix of the OCR character recognition model, built from the preprocessed character pictures, is shown in FIG. 10; the classification confusion matrix of the deep forest recognition model, built from the waste electronic product images and character pictures, is shown in FIG. 11.
The results of the 2 models are integrated by the multi-model output weighting module, yielding the heterogeneous multi-model waste electronic product classification confusion matrix shown in FIG. 12; the classification accuracy reaches 90.17%.
To verify the effectiveness of the method, the same waste electronic product data set is used to build 10-class classification models with single features + deep forest, VGG features + deep forest, and VGG features + OCR, respectively. The accuracies of the different brand classification models on the waste electronic product image data set are compared in Table 2.
Table 2 Accuracy comparison of waste electronic product identification models
Claims (1)
1. A heterogeneous multi-model-based method for identifying waste electronic products, characterized by comprising 3 parts: an image preprocessing module, a multivariate feature extraction module, and a heterogeneous multi-model recognition module;
the meanings of appearance variables are shown in table 1;
TABLE 1 meaning of variables
the input of the image preprocessing module is the raw captured image; the output of the data enhancement preprocessing is $X_{img}$, and the output of the character preprocessing using the CRAFT character-level target detection algorithm is $X_{digit}$;
the multivariate feature extraction module uses a VGG19 network pre-trained on ImageNet to obtain representations of the character features and of the whole back-image pixel features in a high-dimensional space; its inputs are $X_{img}$ and $X_{digit}$, and its outputs are $\hat{X}_{img}$ and $\hat{X}_{digit}$ respectively;
the heterogeneous multi-model recognition module comprises 3 parts: an OCR character recognition sub-module, a deep forest electronic product recognition sub-module, and a softmax nonlinear output layer sub-module, wherein: the OCR sub-module takes $\hat{X}_{digit}$ as input and outputs $Y_{OCR}$; the deep forest sub-module takes $X_{DF}$ as input and outputs $Y_{DF}$; and the softmax nonlinear output layer sub-module maps the concatenated sub-module outputs to a weight vector, the label with the highest score being the final output $Y$;
The image preprocessing module comprises data enhancement preprocessing and character enhancement preprocessing;
character features in an image on the back of an electronic product are selected as one of classification bases in character enhancement preprocessing, the character positions of the electronic product are determined and segmented by adopting a CRAFT character-level image positioning algorithm, and the character features and an overall picture are linearly spliced to be input as a subsequent model;
for an artificial data set, the data set comprises a Gaussian heat map of a single character in the map, and the CRAFT algorithm carries out supervised training on the part; for an image data set at the back of an electronic product, firstly marking a text box area in an image of the electronic product, and stretching the text box area to a more positive text box through perspective transformation; then, obtaining a position frame of a single character by using a watershed algorithm, generating a corresponding Gaussian heat map, and pasting the position frame back to the corresponding position of the label map corresponding to the original image after conversion; the formula for calculating the score of the watershed algorithm segmentation result is as follows:
wherein l (w) represents the length of the text box of the image of the electronic product, and lc(w) is the result of dividing the length of the character string by the watershed algorithm;
obtaining the evaluation of the algorithm according to the formula (1) after the character string length is obtained by the watershed algorithm segmentation, and obtaining the confidence coefficient S if the evaluation is consistent with the real character lengthc(p) is 1, the lower the score is, the worse the reliability of the segmentation result is;
the electronic product image collected by the industrial equipment is represented as XimgSize 400 x 300, and the character image after image preprocessing is represented asWhereinThe method comprises the steps of representing the ith character in a character image, wherein the size of a single character is 50 x 50, and the size of the whole character image is 50 x (50 x m); the output dimension of the m character image is consistent with that of the five character image;
the module adopts a VGG19 model pre-trained based on ImageNet as a base model, and firstly, parameters of a convolutional layer and a pooling layer in a VGG19 model are solidified; then, constructing full connection layers with different sizes according to different image characteristics; finally, the models of different images are output to be linearly combined to be used as the input characteristics of a subsequent classification model; the feature dimension after the multi-element feature extraction is determined by the dimension of the full connection layer; aiming at mobile phone images X with different sizesimgAnd character image XdigitThe feature extraction process is shown in formula (2):
wherein f isVGG() represents the VGG19 model output process;
constructing an OCR character recognition model and a deep forest recognition model, wherein the sub-model structure is shown as follows;
a) OCR character recognition model submodule
in the OCR back-character recognition flow, only the character features $\hat{X}_{digit}$ are used as input; first, a bidirectional LSTM extracts character sequence features that carry full context information; then a CTC network resolves the misalignment between the input features and the output sequence; finally, the Levenshtein distance between the OCR output string and each known label determines the brand classification result $Y_{OCR}$;
the OCR character recognition model builds k LSTM basic units (where k > m) from the character features $\hat{X}_{digit}$ obtained in image preprocessing; the bidirectional LSTM network comprises two sub-network structures, and equations (3) and (4) represent the forward and backward passes respectively:

$$\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(\overrightarrow{h_{i-1}}, x_i) \tag{3}$$

$$\overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(\overleftarrow{h_{i+1}}, x_i) \tag{4}$$

where k is the LSTM basic-unit hyperparameter, $\overrightarrow{h_i}$ is the output of the forward LSTM at step i, $\overleftarrow{h_i}$ is the output of the backward LSTM at step i, and $x_i$ is the i-th input; the bidirectional LSTM output at step i is:

$$h_i = [\overrightarrow{h_i}; \overleftarrow{h_i}] \tag{5}$$
next, the CTC network outputs [ h ] to the bidirectional LSTM network1,h2,...,hx]In the repeated recognition, the character is de-duplicated to be [ y1,y2,...,yn](ii) a Since the bidirectional LSTM basic unit is more than the number n of mobile phone characters, the characters are repeatedly divided, for example, "honor" is divided into "hononorr"; multiple substrings of 'hoonorr' can be mapped into correct result 'honor', as shown in formula (6)
the CTC network obtains the final result Y by maximizing the posterior probability $P(Y \mid X)$ given the input X, where

$$P(Y \mid X) = \sum_{\pi \in B(Y)} P(\pi \mid X) \tag{7}$$

where $\pi \in B(Y)$ ranges over the set of all sub-strings that collapse to Y;
b) deep forest recognition model submodule
in the deep forest recognition flow for waste electronic product images, the character features $\hat{X}_{digit}$ and the image features $\hat{X}_{img}$ are linearly concatenated to obtain the deep forest input features $X_{DF}$, as in equation (8):

$$X_{DF} = [\hat{X}_{digit}, \hat{X}_{img}] \tag{8}$$

first, $X_{DF}$ is used to construct several different random forests and obtain their different outputs; the random forest outputs are then linearly concatenated with $X_{DF}$ and passed as input to the next layer to construct further random forests, and whether to build another layer is decided from the classification accuracy of the current model; finally, model growth stops when accuracy no longer improves, and the classification results of the last layer's random forests are weighted to obtain the final classification result $Y_{DF}$;
c) Multi-model output weighting module submodule
in the heterogeneous multi-model recognition module, the OCR character recognition model outputs a continuous character string that is mapped, via the distance measure, to a particular electronic product brand, while the deep forest recognition model outputs probabilities over all brands; to reconcile the differing output forms and possibly inconsistent results of the heterogeneous models, a multi-model output weighting module is appended to the classification model;
the softmax function is also called a normalization index function, and is a classifier widely used in a supervised learning part of a deep network in the current deep learning research; the Softmax function is shown in equation (9):
wherein n +1 represents the dimension of the heterogeneous multi-model output vector, and e represents the natural logarithm; in the classification model, n labels of the waste electronic products are set, and OCR character recognition results are obtainedDepth forest recognition resultObtaining n +1 dimensional result vector after linear splicingAs input to the softmax function, the final result isTaking the label with the highest weight corresponding to the weight as the final classification result of the waste electronic products
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111673248.0A CN114708593B (en) | 2021-12-31 | 2021-12-31 | Heterogeneous multi-model-based brand recognition method for waste electronic products |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111673248.0A CN114708593B (en) | 2021-12-31 | 2021-12-31 | Heterogeneous multi-model-based brand recognition method for waste electronic products |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114708593A true CN114708593A (en) | 2022-07-05 |
CN114708593B CN114708593B (en) | 2024-06-14 |
Family
ID=82167256
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111673248.0A Active CN114708593B (en) | 2021-12-31 | 2021-12-31 | Heterogeneous multi-model-based brand recognition method for waste electronic products |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114708593B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111931953A (en) * | 2020-07-07 | 2020-11-13 | 北京工业大学 | Multi-scale characteristic depth forest identification method for waste mobile phones |
WO2021022970A1 (en) * | 2019-08-05 | 2021-02-11 | 青岛理工大学 | Multi-layer random forest-based part recognition method and system |
- 2021-12-31 CN CN202111673248.0A patent/CN114708593B/en active Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021022970A1 (en) * | 2019-08-05 | 2021-02-11 | 青岛理工大学 | Multi-layer random forest-based part recognition method and system |
CN111931953A (en) * | 2020-07-07 | 2020-11-13 | 北京工业大学 | Multi-scale characteristic depth forest identification method for waste mobile phones |
Non-Patent Citations (2)
Title |
---|
WANG Deqing; WUSHOUR Silamu; XU Miaomiao: "A Survey of Scene Text Recognition Technology", Computer Engineering and Applications, no. 18, 31 December 2020 (2020-12-31), pages 7-21 *
BAN Xiaojuan; SU Yanjing; XIE Jianxin: "Applications and Challenges of Deep Learning in Material Microscopic Image Analysis", Materials Science and Technology, no. 03, 31 December 2020 (2020-12-31), pages 74-81 *
Also Published As
Publication number | Publication date |
---|---|
CN114708593B (en) | 2024-06-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yan et al. | A graph convolutional neural network for classification of building patterns using spatial vector data | |
CN111639544B (en) | Expression recognition method based on multi-branch cross-connection convolutional neural network | |
CN110321967B (en) | Image classification improvement method based on convolutional neural network | |
Huang et al. | Multiple morphological profiles from multicomponent-base images for hyperspectral image classification | |
CN113657450B (en) | Attention mechanism-based land battlefield image-text cross-modal retrieval method and system | |
Kang et al. | Deep learning-based weather image recognition | |
CN112633350A (en) | Multi-scale point cloud classification implementation method based on graph convolution | |
Das et al. | Automated Indian sign language recognition system by fusing deep and handcrafted feature | |
Luo et al. | On the eigenvectors of p-Laplacian | |
CN112163114B (en) | Image retrieval method based on feature fusion | |
Kollapudi et al. | A New Method for Scene Classification from the Remote Sensing Images. | |
Liu et al. | A semi-supervised high-level feature selection framework for road centerline extraction | |
Su et al. | Probabilistic collaborative representation based ensemble learning for classification of wetland hyperspectral imagery | |
CN116935100A (en) | Multi-label image classification method based on feature fusion and self-attention mechanism | |
Zhou et al. | Infrared handprint classification using deep convolution neural network | |
Vijayalakshmi K et al. | Copy-paste forgery detection using deep learning with error level analysis | |
CN114913337A (en) | Camouflage target frame detection method based on ternary cascade perception | |
Ying et al. | License plate detection and localization in complex scenes based on deep learning | |
CN113011506A (en) | Texture image classification method based on depth re-fractal spectrum network | |
CN114708593B (en) | Heterogeneous multi-model-based brand recognition method for waste electronic products | |
Turtinen et al. | Contextual analysis of textured scene images. | |
Eurviriyanukul et al. | Evaluation of recognition of water-meter digits with application programs, APIs, and machine learning algorithms | |
Zhong et al. | Fuzzy neighborhood learning for deep 3-D segmentation of point cloud | |
Sun et al. | The recognition framework of deep kernel learning for enclosed remote sensing objects | |
Anggoro et al. | Classification of Solo Batik patterns using deep learning convolutional neural networks algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||