CN110751191A - Image classification method and system - Google Patents
Image classification method and system
- Publication number: CN110751191A (application CN201910925538.6A)
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/24 — Pattern recognition; Analysing; Classification techniques (G: Physics; G06: Computing; G06F: Electric digital data processing)
- G06F18/23 — Pattern recognition; Analysing; Clustering techniques
Abstract
The application discloses an image classification method and system, comprising the following steps: inputting images with manual classification labels into an autoencoder to train it; inputting test images into the trained autoencoder; determining the hidden layer with the smallest dimension in the autoencoder as the effective feature layer; clustering the effective feature layer with a spectral clustering algorithm, dividing all test images into several clusters according to the image features extracted by the effective feature layer; and, within each cluster, counting the manually labeled images of each category and assigning the unlabeled images in the cluster the category of the most numerous labeled images. Image features are thus extracted by the autoencoder and classified by spectral clustering, so the extracted features are more accurate; since spectral clustering adapts better to the data distribution, the clustering effect is more pronounced and the image classification accuracy is higher.
Description
Technical Field
The present application relates to the field of image processing, and in particular to an image classification method and system.
Background
In image classification, neural networks are usually trained with supervised learning, that is, using only labeled images. After training on labeled images, the network has learned the image features associated with each label, and can extract features from and classify new, unlabeled images accordingly. In practice, however, labeled images are scarce, and labeling unlabeled images manually in advance consumes considerable time and resources, which hinders neural network training.
To reduce the number of labeled images required for training, the prior art uses semi-supervised learning, training the network on a small set of labeled images together with a large set of unlabeled images. However, the feature extraction and classification algorithms employed suffer from inaccurate features and heavy computation, so semi-supervised image classification has low accuracy.
Disclosure of Invention
To solve the above technical problems in the prior art, the application provides an image classification method and system that extract image features with an autoencoder and classify the images with a spectral clustering algorithm, thereby improving the accuracy of semi-supervised image classification.
In a first aspect, an embodiment of the present application provides an image classification method, including:
inputting images with manual classification labels into an autoencoder to train it, and finishing training when the feature similarity between the autoencoder's output and the input original image meets a preset requirement; the autoencoder is a network whose input dimension equals its output dimension and which extracts image features by reducing the dimension of the intermediate hidden layers;
inputting test images into the trained autoencoder; the test images comprise both images with manual classification labels and unlabeled images to be classified;
determining the hidden layer with the smallest dimension in the autoencoder, whose extracted features are most representative of the image categories, as the effective feature layer;
clustering the effective feature layer with a spectral clustering algorithm, dividing all test images into several clusters according to the image features extracted by the effective feature layer;
and, within each cluster, counting the manually labeled images of each category and assigning the unlabeled images in the cluster the category of the most numerous labeled images.
Optionally, inputting the images with manual classification labels into the autoencoder to train it includes: inputting the labeled images with noise added into the autoencoder.
Optionally, the number of clusters is consistent with the number of image categories to be obtained.
Optionally, the images with manual classification labels carry at least the classification labels to be assigned to the unlabeled images.
Optionally, training the autoencoder includes: using the mean square error as the loss function and the Adam optimizer.
Optionally, the method further includes: if the numbers of manually labeled images of different categories in a cluster are the same or similar, re-training the autoencoder and re-classifying the test images.
In a second aspect, an embodiment of the present application provides an image classification system, including:
a training unit, configured to input images with manual classification labels into an autoencoder to train it, finish training when the feature similarity between the autoencoder's output and the input original image meets a preset requirement, and input test images into the trained autoencoder;
a classification unit, configured to determine the hidden layer with the smallest dimension in the autoencoder, whose extracted features are most representative of the image categories, as the effective feature layer, and to cluster the effective feature layer with a spectral clustering algorithm, dividing all test images into several clusters according to the image features extracted by the effective feature layer;
and a labeling unit, configured to count, within each cluster, the manually labeled images of each category and assign the unlabeled images in the cluster the category of the most numerous labeled images.
Optionally, the training unit inputs the labeled images with noise added into the autoencoder model to train it.
Optionally, the number of clusters in the classification unit is consistent with the number of image categories to be obtained.
Optionally, the images with manual classification labels input to the training unit carry at least the classification labels to be assigned to the unlabeled images.
Optionally, the training unit trains the autoencoder using the mean square error as the loss function and the Adam optimizer.
Optionally, the system further includes a resetting unit, configured to re-train the autoencoder and re-classify the test images if the numbers of manually labeled images of different categories in a cluster are the same or similar.
Compared with the prior art, the embodiment of the application has the following advantages:
In the embodiment of the application, images with manual classification labels are used to train the autoencoder so that its performance meets the requirements of image classification. Test images are then input into the autoencoder, which reduces dimensionality through its intermediate hidden layers; the features extracted by the hidden layer with the smallest dimension are the effective features most representative of the image categories. Finally, the effective feature layer is clustered with a spectral clustering algorithm, and in each resulting cluster the unlabeled images are assigned the label of the most numerous manually labeled images. Thus the autoencoder extracts the image features, spectral clustering classifies the test images by those features, and the majority label defines the unlabeled images. Compared with other semi-supervised image classification algorithms, the features extracted by the autoencoder are more accurate, spectral clustering adapts better to the data distribution, the clustering effect is more pronounced, and the resulting classification accuracy is higher.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an image classification method according to an embodiment of the present disclosure;
fig. 2 is a flowchart of another image classification method provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of an autoencoder in an image classification method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an image classification system according to an embodiment of the present application.
Detailed Description
As described above, training a classification model with supervised learning requires a large number of images with manual classification labels; the model learns image features from these images and establishes the relationship between features and labels, so that the trained model can classify unlabeled images. However, since every training image must carry a manual label, and manual labeling takes considerable time and effort, supervised learning is costly and inefficient.
Semi-supervised learning was therefore proposed on this basis: the training images comprise a portion of manually labeled images and a portion of unlabeled images, and the trained classification model is obtained by learning from the labeled images while classifying the unlabeled ones.
The inventor found that existing semi-supervised methods are inaccurate both in extracting image features and in classifying images by those features, so both steps need improvement. The inventor further found in research that an autoencoder extracts the classification-relevant features of an image through its encoding module, which reduces the input dimension so that the hidden layers are smaller than the input and output layers. The features extracted by the hidden layer with the smallest dimension should be the features of the image most relevant to classification, and can therefore serve as effective features for image classification. For the classification step, a spectral clustering algorithm is chosen, as it adapts better to the data distribution and requires less computation. Combining autoencoder feature extraction with spectral clustering therefore improves the accuracy of image classification.
To make the technical solutions of the present application better understood, they are described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments derived from them by a person skilled in the art without creative effort shall fall within the protection scope of the present application.
First embodiment
A first embodiment of the present application provides a method for image classification, which is described in detail below with reference to the accompanying drawings.
Referring to fig. 1, the figure is a flowchart of a method for classifying an image according to an embodiment of the present application. In this embodiment, the method may be implemented, for example, by the following steps S101-S105.
S101, inputting images with manual classification labels into an autoencoder to train it, and finishing training when the feature similarity between the autoencoder's output and the input original image meets a preset requirement; the autoencoder is a network whose input dimension equals its output dimension and which extracts image features by reducing the dimension of the intermediate hidden layers.
It should be noted that an image with a manual classification label in this embodiment is an image that has been classified manually and carries the corresponding label; the specific form of the label is not limited.
It should be noted that the autoencoder in this embodiment may be a neural network whose input and output layers have the same dimension, with encoding and decoding modules making the intermediate hidden layers smaller than the input and output layers.
The specific network model of the autoencoder is not limited and may be built according to the requirements of the classification task; the intermediate hidden layers may be convolutional or fully connected, depending on the complexity of the images.
The specific label types of the manually labeled images are likewise not limited and may be chosen according to the target categories of the classification.
It should be noted that the preset similarity requirement between input and output images may be set according to the actual situation, as may the method for judging whether it is met. In one possible implementation, an image-similarity algorithm compares the two images against a minimum similarity threshold; when the similarity of the input and output images exceeds the threshold, training of the autoencoder is considered complete.
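As an illustration only (not the patent's exact network), a minimal dense autoencoder with a single bottleneck, trained to minimize the mean square reconstruction error, can be sketched in NumPy as follows; the layer sizes, the tanh activation, the learning rate, and the use of plain gradient descent rather than Adam are all assumptions for the sketch:

```python
import numpy as np

def train_autoencoder(X, bottleneck=4, epochs=200, lr=0.01, seed=0):
    """Train a one-bottleneck dense autoencoder on rows of X with MSE loss.

    Returns (encode, losses): encode maps data to bottleneck features,
    losses records the reconstruction error per epoch."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (d, bottleneck))  # encoder weights
    W2 = rng.normal(0.0, 0.1, (bottleneck, d))  # decoder weights
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1)                 # bottleneck activations (effective feature layer)
        Xhat = H @ W2                       # linear reconstruction
        err = Xhat - X
        losses.append(float(np.mean(err ** 2)))
        # backpropagate the reconstruction error
        gW2 = H.T @ err / len(X)
        gH = (err @ W2.T) * (1.0 - H ** 2)  # tanh derivative
        gW1 = X.T @ gH / len(X)
        W1 -= lr * gW1
        W2 -= lr * gW2
    encode = lambda Z: np.tanh(Z @ W1)
    return encode, losses
```

Training ends once the reconstruction is close enough to the input; here the loss history plays the role of the "feature similarity" check.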
S102, inputting test images into the trained autoencoder; the test images comprise images with manual classification labels and unlabeled images to be classified.
It is understood that the manually labeled images may carry labels beyond the categories to be assigned to the unlabeled images; in one possible implementation, their label set coincides with the target categories of the unlabeled images.
It should be noted that the ratio of labeled to unlabeled images in the test set is not limited and may be adjusted according to the classification requirements and the accuracy of the obtained results.
It should be noted that the test images may include the images previously used to train the autoencoder.
The input-output similarity of the autoencoder on the test images is not limited either; in one possible case, that similarity may be checked again to confirm the accuracy of the autoencoder model.
S103, determining the hidden layer with the smallest dimension in the autoencoder, whose extracted features are most representative of the image categories, as the effective feature layer.
The number of hidden layers taken as effective feature layers is not limited; in one possible implementation, several hidden layers of the same smallest dimension may all serve as effective feature layers.
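The selection in S103 amounts to picking the narrowest hidden layer. A trivial sketch (the layer sizes in the comment are illustrative, not prescribed by the patent):

```python
def effective_feature_layer(layer_dims):
    """Return the index of the smallest hidden layer (the bottleneck),
    excluding the input and output layers."""
    hidden = layer_dims[1:-1]
    return 1 + min(range(len(hidden)), key=lambda i: hidden[i])

# For a stack like 784-128-32-128-784, index 2 (the 32-dimensional layer)
# is selected, matching the fig. 3 example where the dimension-32 layer
# serves as the effective feature layer.
```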
S104, clustering the effective feature layer with a spectral clustering algorithm, dividing all test images into several clusters according to the image features extracted by the effective feature layer.
It is understood that the number of clusters should be greater than or equal to the number of image categories.
It should be noted that the number of clusters may be set in the spectral clustering algorithm; the specific number is not limited and may be set according to the number of image categories.
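A compact NumPy sketch of one standard variant, unnormalized spectral clustering (Gaussian affinity, graph Laplacian, k smallest eigenvectors, then k-means on the embedding); the kernel width, the farthest-point seeding, and the iteration count are assumptions, not details from the patent:

```python
import numpy as np

def spectral_cluster(F, k, sigma=1.0, iters=50):
    """Cluster feature vectors F (n, d) into k clusters via unnormalized
    spectral clustering."""
    n = len(F)
    sq = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.exp(-sq / (2.0 * sigma ** 2))                 # Gaussian affinity matrix
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W                            # graph Laplacian
    _, vecs = np.linalg.eigh(L)                          # eigh: ascending eigenvalues
    U = vecs[:, :k]                                      # embedding from k smallest eigenvectors
    # deterministic farthest-point seeding, then plain k-means on rows of U
    idx = [0]
    for _ in range(1, k):
        d2 = ((U[:, None] - U[idx][None]) ** 2).sum(-1).min(1)
        idx.append(int(d2.argmax()))
    centers = U[idx].copy()
    labels = np.zeros(n, dtype=int)
    for _ in range(iters):
        labels = ((U[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = U[labels == j].mean(0)
    return labels
```

The pairwise-distance step is O(n²) in memory, so this sketch is only suitable for small test sets.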
S105, counting, within each cluster, the manually labeled images of each category, and assigning the unlabeled images in the cluster the category of the most numerous labeled images.
It should be noted that the specific way of counting the labeled images of each category is not limited; in one possible implementation, the counts are obtained with a simple statistical procedure.
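Step S105 is a per-cluster majority vote. A sketch, using None to mark unlabeled images (a representation assumed here, not specified by the patent):

```python
from collections import Counter

def propagate_labels(cluster_ids, labels):
    """For each cluster, count the manual labels and copy the most frequent
    one onto that cluster's unlabeled (None) images."""
    out = list(labels)
    for c in set(cluster_ids):
        members = [i for i, cid in enumerate(cluster_ids) if cid == c]
        counts = Counter(labels[i] for i in members if labels[i] is not None)
        if counts:  # skip clusters with no labeled images at all
            majority = counts.most_common(1)[0][0]
            for i in members:
                if out[i] is None:
                    out[i] = majority
    return out
```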
In this embodiment, training the autoencoder ensures the accuracy of the extracted image features. The test images are input into the autoencoder, the hidden layer with the smallest dimension serves as the effective feature layer, and the features it extracts are grouped by spectral clustering into several clusters. Finally, the unlabeled images in each cluster are given the label of the most numerous manually labeled images in that cluster, completing the image classification. Because the autoencoder extracts effective image features and spectral clustering classifies them accurately, semi-supervised image classification with high accuracy is achieved.
Second embodiment
In the first embodiment, unlabeled images are classified by counting the manually labeled images in each cluster and adopting the label of the most numerous ones. By the principle of the algorithm, the images in each cluster produced by spectral clustering should belong to one category with similar features; when feature extraction and clustering are accurate, one label category should dominate the cluster overwhelmingly. In some scenarios, however, the most numerous label in a cluster is not overwhelmingly dominant, which indicates that the classification result can be further optimized.
Referring to fig. 2, the figure is a flowchart of a method for classifying an image according to an embodiment of the present application. In this embodiment, the method may be implemented by, for example, steps S201-S204 as follows.
S201, inputting images with manual classification labels, with noise added, into the autoencoder; training it with the mean square error as the loss function and the Adam optimizer; and finishing training when the feature similarity between the autoencoder's output and the input original image meets a preset requirement.
It is understood that the loss function and optimizer are not limited and may be chosen according to the specific classification task.
It should be noted that the noise is added to enhance the robustness of the autoencoder, and its kind is not limited. In one possible implementation, random white Gaussian noise is added to the labeled images before they are input into the autoencoder.
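For the Gaussian-noise option, a sketch (the noise level and the [0, 1] pixel range are assumptions):

```python
import numpy as np

def add_gaussian_noise(images, std=0.1, seed=0):
    """Corrupt images with white Gaussian noise, denoising-autoencoder style;
    the training target remains the clean image."""
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(0.0, std, images.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep pixel values in a valid range
```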
S202, inputting test images into the trained autoencoder, the test images comprising images with manual classification labels and unlabeled images to be classified; and determining the hidden layer with the smallest dimension, whose extracted features are most representative of the image categories, as the effective feature layer.
S203, clustering the effective feature layer with a spectral clustering algorithm, dividing all test images into several clusters according to the extracted features; and, within each cluster, counting the manually labeled images of each category and assigning the unlabeled images the category of the most numerous labeled images.
S204, if the numbers of manually labeled images of different categories in a cluster are the same or similar, re-training the autoencoder and re-classifying the test images.
The range within which counts are considered similar is not limited and may be set according to the number of input images and the number of clusters.
It is understood that the choice of label categories and the related parameters when re-training and re-classifying are not limited; in one possible implementation, the preset input-output similarity requirement may be changed for the new round of training.
In this embodiment, the training of the autoencoder is optimized: noise is added to the input images, the mean square error serves as the loss function, and the Adam optimizer is used. In addition, the clusters are monitored for categories with the same or similar label counts, and when such a cluster appears, autoencoder training and image classification are performed again.
Third embodiment
The image classification method provided by the embodiments of the present application has been introduced above; it is now illustrated with a specific application scenario.
In this scenario, the MNIST dataset is used to classify unlabeled handwritten digit images. MNIST is a dataset of handwritten digit images with 10 label categories, the digits "0" through "9", comprising 60000 images with manual classification labels and 10000 unlabeled images. Before classification, the autoencoder network is built according to the characteristics of MNIST images; its structure is shown in fig. 3, where the middle layers are fully connected and the numbers denote the dimension of each layer. First, the autoencoder is trained on 1000 manually labeled MNIST images, 100 per handwritten digit. Random noise is added to the training images, which are input into the autoencoder until the output images are similar enough to the inputs to meet the preset requirement, ending the training; the mean square error is used as the loss function, with the Adam optimizer. Second, all images in MNIST, labeled and unlabeled, are input into the trained autoencoder as test images; the per-label counts of the manually labeled images are the same or similar. The hidden layer with the smallest output dimension, i.e. the layer of dimension 32 in fig. 3, is taken as the effective feature layer. The effective feature layer is clustered with a spectral clustering algorithm, and since MNIST has 10 different handwritten digits, the number of clusters is set to 10.
Finally, after clustering, the number of labeled images of each category in each cluster is obtained. For example, in the first cluster the counts of images labeled "0" through "9" are 90, 4, 2, 1, 3, 8, 1, 4, 5, 2. The most numerous images carry the "0" label, so all unlabeled images in this cluster are assigned the category "0" and given a "0" label. The same operation is performed in the other clusters. If, however, the counts obtained in a cluster for the labels "0" through "9" are 40, 7, 6, 8, 2, 5, 4, 35, and 9, then, since the counts for label "0" and label "8" are similar, the autoencoder is trained again and the images are re-classified.
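The "same or similar" criterion of S204 is left unspecified by the patent; one way to formalize it is a runner-up ratio test (the 0.8 threshold here is an assumption, chosen so that the example counts above behave as described):

```python
def counts_ambiguous(counts, ratio=0.8):
    """True when the runner-up label count reaches `ratio` of the top count,
    i.e. no single label dominates the cluster and retraining is advised."""
    top = sorted(counts, reverse=True)
    return len(top) > 1 and top[1] >= ratio * top[0]
```

With the first cluster's counts the top label (90 images) dominates and the test is negative; in the second set, 35 is within 80% of 40, so the cluster is flagged for retraining.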
Fourth embodiment
Referring to fig. 4, the figure is a schematic structural diagram of an image classification system according to an embodiment of the present application.
The system 400 may specifically include, for example:
the training unit 401 may be configured to input images with manual classification labels into the self-encoder to train it, to end the training when the feature approximation degree between an image output by the self-encoder and the input original image meets the preset requirement, and to input the test images into the trained self-encoder;
the classification unit 402 may be configured to determine, as the effective feature layer, the hidden layer of the self-encoder that has the smallest dimensionality and whose extracted image features are meaningful for representing image categories; to cluster the effective feature layer using a spectral clustering algorithm; and to divide all the test images into a plurality of clusters according to the image features extracted by the effective feature layer;
the labeling unit 403 may be configured to obtain, in each cluster, the numbers of manually labeled images of the different categories, and to define the category of the unlabeled images in the current cluster as the category of the most numerous manually labeled images.
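The clustering performed by classification unit 402 can be sketched as classical normalized spectral clustering over the effective-feature-layer outputs. This is an illustrative sketch, not the patent's implementation: the RBF affinity, the `sigma` value, the farthest-point k-means initialization, and the synthetic two-blob features are all assumptions.

```python
import numpy as np

def spectral_cluster(F, k, sigma=1.0, iters=50):
    """Normalized spectral clustering of feature vectors F (one row per image),
    standing in for the 32-dimensional effective-feature-layer outputs."""
    n = len(F)
    # RBF affinity matrix between feature vectors
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # symmetrically normalized Laplacian: L = I - D^(-1/2) W D^(-1/2)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(1), 1e-12))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigenvectors of the k smallest eigenvalues, rows renormalized
    _, vecs = np.linalg.eigh(L)          # eigh returns ascending eigenvalues
    U = vecs[:, :k]
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    # plain k-means on the spectral embedding, farthest-point initialization
    centers = [U[0]]
    for _ in range(1, k):
        dist = ((U[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1).min(1)
        centers.append(U[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        assign = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = U[assign == j].mean(0)
    return assign

# Two well-separated synthetic feature blobs stand in for two digit classes;
# for MNIST the patent would set k=10, one cluster per digit category.
rng = np.random.default_rng(1)
F = np.vstack([rng.normal(0, 0.1, (30, 5)), rng.normal(3, 0.1, (30, 5))])
labels = spectral_cluster(F, k=2)
print(sorted(set(labels.tolist())))      # both clusters are found
```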
Optionally, the training unit inputting the labeled images into the self-encoder model to train the self-encoder includes: inputting the labeled images, with noise added, into the self-encoder model to train the self-encoder.
Optionally, the number of clusters in the classification unit is consistent with the number of image categories to be obtained.
Optionally, the images with manual classification labels input to the training unit satisfy: their classification labels include at least the classification labels to be determined for the unlabeled images.
Optionally, the training unit training the self-encoder includes: training the self-encoder using the mean square error function as the loss function and an Adam optimizer.
In some possible embodiments, the system may further include a resetting unit, which may be configured to perform the training of the self-encoder and the classification of the test images again if the numbers of manually labeled images of different categories in a cluster are the same or similar.
Since the system 400 is a system corresponding to the method provided in the above method embodiment, reference may be made to the description part of the above method embodiment for the description of each unit of the system 400, and details are not repeated here.
Therefore, with the image classification system provided by this embodiment of the application, the training unit trains the self-encoder; the classification unit obtains the image features extracted by the self-encoder and classifies the images according to those features with a spectral clustering algorithm; and finally the labeling unit defines the label of the most numerous manually labeled images in each cluster as the label category of the unlabeled images in that cluster. The use of the self-encoder together with the spectral clustering algorithm thus improves the accuracy both of image feature extraction and of classification according to the image features.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, and the units and modules described as separate components may or may not be physically separate. In addition, some or all of the units and modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is directed to embodiments of the present application and it is noted that numerous modifications and adaptations may be made by those skilled in the art without departing from the principles of the present application and are intended to be within the scope of the present application.
Claims (12)
1. A method of image classification, the method comprising:
inputting images with manual classification labels into a self-encoder to train the self-encoder, and ending the training of the self-encoder when the feature approximation degree between an image output by the self-encoder and the input original image meets a preset requirement, the self-encoder being a neural network structure whose input dimension equals its output dimension and which extracts image features by reducing the dimension of an intermediate hidden layer;
inputting test images into the trained self-encoder, the test images comprising images with manual classification labels and unlabeled images to be classified;
determining, as an effective feature layer, the hidden layer of the self-encoder that has the smallest dimensionality and whose extracted image features are meaningful for representing image categories;
clustering the effective feature layer using a spectral clustering algorithm, and dividing all the test images into a plurality of clusters according to the image features extracted by the effective feature layer; and
in each cluster, obtaining the numbers of manually labeled images of the different categories, and defining the category of the unlabeled images in the current cluster as the category of the most numerous manually labeled images.
2. The method of claim 1, wherein inputting the images with manual classification labels into the self-encoder to train the self-encoder comprises: inputting the images with manual classification labels, with noise added, into the self-encoder to train the self-encoder.
3. The method of claim 1, wherein the number of clusters is consistent with the number of image categories to be obtained.
4. The method of claim 1, wherein the images with manual classification labels satisfy: their classification labels include at least the classification labels to be determined for the unlabeled images.
5. The method of claim 1, wherein training the self-encoder comprises: the self-encoder is trained using the mean square error function as a loss function and an Adam optimizer.
6. The method of claim 1, further comprising: if the numbers of manually labeled images of different categories in a cluster are the same or similar, performing the training of the self-encoder and the classification of the test images again.
7. An image classification system, characterized in that the system comprises:
a training unit, configured to input images with manual classification labels into a self-encoder to train the self-encoder, to end the training of the self-encoder when the feature approximation degree between an image output by the self-encoder and the input original image meets a preset requirement, and to input test images into the trained self-encoder;
a classification unit, configured to determine, as an effective feature layer, the hidden layer of the self-encoder that has the smallest dimensionality and whose extracted image features are meaningful for representing image categories, to cluster the effective feature layer using a spectral clustering algorithm, and to divide all the test images into a plurality of clusters according to the image features extracted by the effective feature layer; and
a labeling unit, configured to obtain, in each cluster, the numbers of manually labeled images of the different categories, and to define the category of the unlabeled images in the current cluster as the category of the most numerous manually labeled images.
8. The system of claim 7, wherein the training unit inputting the labeled images into the self-encoder model to train the self-encoder comprises: inputting the labeled images, with noise added, into the self-encoder model to train the self-encoder.
9. The system of claim 7, wherein the number of clusters in the classification unit is consistent with the number of image categories to be obtained.
10. The system of claim 7, wherein the images with manual classification labels input to the training unit satisfy: their classification labels include at least the classification labels to be determined for the unlabeled images.
11. The system of claim 7, wherein the training unit trains the self-encoder, comprising: the self-encoder is trained using the mean square error function as a loss function and an Adam optimizer.
12. The system of claim 7, further comprising: a resetting unit, configured to perform the training of the self-encoder and the classification of the test images again if the numbers of manually labeled images of different categories in a cluster are the same or similar.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910925538.6A CN110751191A (en) | 2019-09-27 | 2019-09-27 | Image classification method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110751191A true CN110751191A (en) | 2020-02-04 |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112101156A (en) * | 2020-09-02 | 2020-12-18 | 杭州海康威视数字技术股份有限公司 | Target identification method and device and electronic equipment |
CN112560912A (en) * | 2020-12-03 | 2021-03-26 | 北京百度网讯科技有限公司 | Method and device for training classification model, electronic equipment and storage medium |
CN112686289A (en) * | 2020-12-24 | 2021-04-20 | 微梦创科网络科技(中国)有限公司 | Picture classification method and device |
WO2021114633A1 (en) * | 2020-05-20 | 2021-06-17 | 平安科技(深圳)有限公司 | Image confidence determination method, apparatus, electronic device, and storage medium |
CN117909768A (en) * | 2024-01-29 | 2024-04-19 | 上海泰高系统科技有限公司 | Automatic testing method for uninterruptible power supply |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170329892A1 (en) * | 2016-05-10 | 2017-11-16 | Accutar Biotechnology Inc. | Computational method for classifying and predicting protein side chain conformations |
CN108492873A (en) * | 2018-03-13 | 2018-09-04 | 山东大学 | A kind of knowledge migration learning method for auxiliary diagnosis Alzheimer's disease |
CN109242033A (en) * | 2018-09-21 | 2019-01-18 | 长鑫存储技术有限公司 | Wafer defect method for classifying modes and device, storage medium, electronic equipment |
CN109389166A (en) * | 2018-09-29 | 2019-02-26 | 聚时科技(上海)有限公司 | The depth migration insertion cluster machine learning method saved based on partial structurtes |
CN109447098A (en) * | 2018-08-27 | 2019-03-08 | 西北大学 | A kind of image clustering algorithm based on deep semantic insertion |
CN109784636A (en) * | 2018-12-13 | 2019-05-21 | 中国平安财产保险股份有限公司 | Fraudulent user recognition methods, device, computer equipment and storage medium |
CN109815983A (en) * | 2018-12-10 | 2019-05-28 | 清华大学 | High-speed railway track switch intelligent fault forecast method based on interacting depth study |
CN109948662A (en) * | 2019-02-27 | 2019-06-28 | 浙江工业大学 | A kind of facial image depth clustering method based on K-means and MMD |
CN110163278A (en) * | 2019-05-16 | 2019-08-23 | 东南大学 | A kind of flame holding monitoring method based on image recognition |
CN110263873A (en) * | 2019-06-27 | 2019-09-20 | 华北电力大学 | A kind of power distribution network platform area classification method merging sparse noise reduction autoencoder network dimensionality reduction and cluster |
Non-Patent Citations (3)
Title |
---|
XU YANG, CHENG DENG, FENG ZHENG, JUNCHI YAN, WEI LIU: "Deep Spectral Clustering Using Dual Autoencoder Network", arXiv *
JIANG WEIXIANG: "Exploration of Computer Data Processing Technology in the Big Data Era", 31 March 2019 *
XUE ZHE: "Image Semantic Analysis Technology Based on Multi-View Learning" ("13th Five-Year" Science and Technology Monograph Series), 31 August 2019 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200204 |