CN109978074A - Image aesthetic feeling and emotion joint classification method and system based on depth multi-task learning - Google Patents
- Publication number
- CN109978074A (application CN201910272826.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- classification
- aesthetic feeling
- depth
- neural networks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present disclosure provides an image aesthetics and emotion joint classification method and system based on deep multi-task learning. The method includes: annotating images with their corresponding aesthetic categories and emotion categories to form a training dataset; constructing a deep convolutional neural network comprising cross-branch connection layers and two parallel network branches; training the deep convolutional neural network on the training dataset until a predefined loss function reaches its minimum; and using the trained deep convolutional neural network to output the probabilities that a given image belongs to each aesthetic category and each emotion category, selecting the highest-probability aesthetic category and emotion category as the image's predicted aesthetic category and emotion category, respectively.
Description
Technical field
The disclosure belongs to the technical field of computer vision, and in particular relates to an image aesthetics and emotion joint classification method and system based on deep multi-task learning.
Background art
The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.
With the rapid development of computer vision technology, people expect computers not only to analyze image content at the semantic level, but also to simulate the human visual and cognitive system and develop higher-level perceptual abilities. As two representative tasks in perceptual understanding research, image aesthetic classification and emotion classification aim to enable computers to recognize the aesthetic and emotional responses that humans produce when stimulated by visual images. Image aesthetic classification and emotion classification techniques have already been applied to image storage, editing, retrieval, and other areas. For example, among multiple candidate photos of the same object or scene taken by a user, the most aesthetically pleasing work can be selected for storage and display, reasonably reducing the storage overhead; in the creation and editing of image works, the aesthetic quality of candidate schemes can be analyzed and compared to improve the visual appeal of the work; and in image retrieval systems, the emotional tendency of returned images can be taken into account to provide users with semantically accurate and more evocative search results.
Owing to the diversity of image content and the complexity of human perception, automatic aesthetic and emotion classification of images is a challenging task. In recent years, benefiting from the emergence of large-scale image datasets with aesthetic labels and emotion labels, machine-learning-based methods have been widely adopted. The core step of such methods is to extract image visual features with good discriminative ability for the classification task. Early methods relied mainly on hand-crafted features, which required researchers to have a deep understanding of the problem itself. With the rise of deep learning in the field of computer vision, recent methods mainly use convolutional neural networks to automatically extract features for image aesthetic and emotion classification, and have achieved good results.
The inventors have found that the prior art usually treats image aesthetic classification and emotion classification as two mutually independent tasks. Intuitively, however, human aesthetic and emotional impressions do not arise in isolation; on the contrary, at the level of psychological cognition they should be interrelated and mutually influential. For example, if an image gives a person aesthetic pleasure, it is also likely to arouse positive emotions in the observer. Research in the field of neuroscience likewise shows that human aesthetic experience is a cognitive process that continually evolves together with affective state, and vice versa.
Summary of the invention
To solve the above problems, a first aspect of the present disclosure provides an image aesthetics and emotion joint classification method based on deep multi-task learning. Through a unified deep convolutional neural network framework, information can be shared effectively between the two tasks, achieving joint recognition of an image's aesthetic category and emotion category and improving recognition accuracy and efficiency.
To achieve the above object, the present disclosure adopts the following technical solution:
An image aesthetics and emotion joint classification method based on deep multi-task learning, comprising:

annotating images with their corresponding aesthetic categories and emotion categories to form a training dataset;

constructing a deep convolutional neural network comprising cross-branch connection layers and two parallel network branches, wherein the two network branches are respectively responsible for aesthetic classification and emotion classification of the input image, the cross-branch connection layers connect corresponding convolutional layer groups in the two branches so as to associate the aesthetic classification and emotion classification tasks, and the output of the deep convolutional neural network represents the probabilities that the input image belongs to each aesthetic category and each emotion category;

training the deep convolutional neural network on the training dataset until a predefined loss function reaches its minimum; and

using the trained deep convolutional neural network to output the probabilities that a given image belongs to each aesthetic category and each emotion category, and selecting the highest-probability aesthetic category and emotion category as the image's predicted aesthetic category and emotion category, respectively.
To solve the above problems, a second aspect of the present disclosure provides an image aesthetics and emotion joint classification system based on deep multi-task learning. Through a unified deep convolutional neural network framework, information can be shared effectively between the two tasks, achieving joint recognition of an image's aesthetic category and emotion category and improving recognition accuracy and efficiency.
To achieve the above object, the present disclosure adopts the following technical solution:
An image aesthetics and emotion joint classification system based on deep multi-task learning, comprising:

a training dataset forming module, used to annotate images with their corresponding aesthetic categories and emotion categories to form a training dataset;

a deep convolutional neural network construction module, used to construct a deep convolutional neural network comprising cross-branch connection layers and two parallel network branches, wherein the two network branches are respectively responsible for aesthetic classification and emotion classification of the input image, the cross-branch connection layers connect corresponding convolutional layer groups in the two branches so as to associate the aesthetic classification and emotion classification tasks, and the output of the deep convolutional neural network represents the probabilities that the input image belongs to each aesthetic category and each emotion category;

a deep convolutional neural network training module, used to train the deep convolutional neural network on the training dataset until a predefined loss function reaches its minimum; and

a prediction classification module, used to output, with the trained deep convolutional neural network, the probabilities that a given image belongs to each aesthetic category and each emotion category, and to select the highest-probability aesthetic category and emotion category as the image's predicted aesthetic category and emotion category, respectively.
To solve the above problems, a third aspect of the present disclosure provides a computer-readable storage medium. Through a unified deep convolutional neural network framework, information can be shared effectively between the two tasks, achieving joint recognition of an image's aesthetic category and emotion category and improving recognition accuracy and efficiency.
To achieve the above object, the present disclosure adopts the following technical solution:
A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the steps of the image aesthetics and emotion joint classification method based on deep multi-task learning described above.
To solve the above problems, a fourth aspect of the present disclosure provides a computer device. Through a unified deep convolutional neural network framework, information can be shared effectively between the two tasks, achieving joint recognition of an image's aesthetic category and emotion category and improving recognition accuracy and efficiency.
To achieve the above object, the present disclosure adopts the following technical solution:
A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing, when executing the program, the steps of the image aesthetics and emotion joint classification method based on deep multi-task learning described above.
The beneficial effects of the present disclosure are as follows:

The present disclosure applies the idea of multi-task learning to the aesthetic classification and emotion classification of images, making full use of the correlations between the two tasks, and designs a unified deep convolutional neural network framework in which cross-branch connection layers let the network branches share information effectively by exchanging image feature maps and automatically learn, during training, which information each task needs, thereby achieving joint recognition of an image's aesthetic category and emotion category and improving the accuracy of image aesthetic and emotion classification.
Detailed description of the invention
The Figure of description for constituting a part of this disclosure is used to provide further understanding of the disclosure, and the disclosure is shown
Meaning property embodiment and its explanation do not constitute the improper restriction to the disclosure for explaining the disclosure.
Fig. 1 is a flowchart of an image aesthetics and emotion joint classification method based on deep multi-task learning provided by an embodiment of the present disclosure.

Fig. 2 is a schematic diagram of the deep convolutional neural network provided by an embodiment of the present disclosure.

Fig. 3 is a schematic diagram of the cross-branch connection layer provided by an embodiment of the present disclosure.

Fig. 4 is a structural schematic diagram of an image aesthetics and emotion joint classification system based on deep multi-task learning provided by an embodiment of the present disclosure.
Specific embodiment
The disclosure is further described below with reference to the accompanying drawings and embodiments.

It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the disclosure. Unless otherwise indicated, all technical and scientific terms used herein have the same meanings as commonly understood by those of ordinary skill in the technical field to which this disclosure belongs.

It should be noted that the terms used herein are merely for describing specific embodiments and are not intended to limit the exemplary implementations according to this disclosure. As used herein, unless the context clearly indicates otherwise, the singular forms are also intended to include the plural forms; in addition, it should be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The image aesthetics and emotion joint classification method based on deep multi-task learning proposed by the present disclosure is elaborated below with reference to Fig. 1.

As shown in Fig. 1, the image aesthetics and emotion joint classification method based on deep multi-task learning of this embodiment comprises:
S101: annotate images with their corresponding aesthetic categories and emotion categories to form a training dataset.

In a specific implementation, for the image aesthetic classification problem, images are divided into two classes, high aesthetics and low aesthetics; for the image emotion classification problem, images are divided into eight basic emotion categories: pleasure, awe, contentment, excitement, anger, disgust, fear, and sadness.
Since a person's aesthetics and emotions are both highly subjective cognitive attributes, there are obvious individual differences. Therefore, for the annotation of an image's aesthetic category and emotion category, a strategy is adopted in which several people jointly annotate the same image, and the category with the highest degree of consensus is then taken as the image's final category.
It should be understood that in other examples the image aesthetic categories and image emotion categories can also be divided differently; those skilled in the art can set them according to the specific situation, and they are not described in detail here.
S102: construct a deep convolutional neural network comprising cross-branch connection layers and two parallel network branches.

The two network branches are respectively responsible for aesthetic classification and emotion classification of the input image; the cross-branch connection layers connect corresponding convolutional layer groups in the two branches so as to associate the aesthetic classification and emotion classification tasks; and the output of the deep convolutional neural network represents the probabilities that the input image belongs to each aesthetic category and each emotion category.
Specifically, in the deep convolutional neural network, the two branches contain the same number n of convolutional layer groups, and there are n-1 cross-branch connection layers. The i-th cross-branch connection layer takes as input the image feature maps output by the i-th convolutional layer groups of the two branches, stacks these feature maps along the channel dimension, and feeds the stacked features to the (i+1)-th convolutional layer groups of the two branches, where 1 ≤ i ≤ n-1 and n is a positive integer greater than or equal to 2.
The deep convolutional neural network of this embodiment is shown in Fig. 2. The network contains two parallel branches that receive the same input image and are respectively responsible for its aesthetic classification and emotion classification. The two branches have identical structures, both based on the VGG16 architecture (see Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014). Each branch consists of 5 convolutional layer groups, 3 fully connected layers, and 1 Softmax layer. A single convolutional layer group contains several consecutive convolutional layers and 1 max-pooling layer, whose purpose is to extract effective image feature maps. The fully connected layers apply multiple nonlinear transformations to the feature maps output by the last convolutional layer group, mapping them to a column vector whose dimension equals the number of aesthetic or emotion categories; each dimension corresponds to one specific aesthetic or emotion category. The final Softmax layer converts each dimension of this vector into a probability value representing the probability that the input image belongs to the corresponding category. The specific structure and parameter settings of each layer in a branch follow the VGG16 network model.
Cross-branch connection layers are introduced to connect corresponding convolutional layer groups in the two branches; their structure is shown in Fig. 3. A cross-branch connection layer takes the feature maps output by two corresponding convolutional layer groups as input and stacks them along the channel dimension. Assuming each feature map has K channels (K a positive integer) before stacking, the stacked feature map has 2K channels. The stacked feature map is then fed into two separate convolutional layers with 1*1 kernels. Each of these layers contains K kernels, with stride 1 and edge padding 0. The two convolutional layers thus output new feature maps whose spatial size is unchanged and whose channel number is restored to K; each new feature map is finally sent to the subsequent convolutional layer group (or fully connected layer) of one of the branches. Intuitively, the cross-branch connection layer lets the two network branches share information by exchanging image feature maps, and helps the model automatically learn, during training, which information each of the two tasks needs.
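For illustration only, the channel stacking and 1*1 convolutions of the cross-branch connection layer can be sketched in NumPy; the patent prescribes the structure but no implementation, so the function name, weight shapes, and values here are assumptions (in practice the weights are learned):

```python
import numpy as np

def cross_branch_connection(feat_a, feat_e, w_a, w_e):
    """Cross-branch connection layer sketch.

    feat_a, feat_e: feature maps of shape (K, H, W) from the i-th
        convolutional layer groups of the aesthetics and emotion branches.
    w_a, w_e: 1x1 convolution weights of shape (K, 2K), one per branch
        (K kernels each, stride 1, no padding).

    Returns the two new (K, H, W) feature maps fed to the (i+1)-th
    convolutional layer groups of the two branches.
    """
    stacked = np.concatenate([feat_a, feat_e], axis=0)   # (2K, H, W)
    # A 1x1 convolution is a linear map over the channel dimension.
    out_a = np.einsum('ok,khw->ohw', w_a, stacked)       # (K, H, W)
    out_e = np.einsum('ok,khw->ohw', w_e, stacked)       # (K, H, W)
    return out_a, out_e

# Shapes: K = 4 channels, 8x8 spatial resolution.
K, H, W = 4, 8, 8
rng = np.random.default_rng(0)
a = rng.normal(size=(K, H, W))
e = rng.normal(size=(K, H, W))
wa = rng.normal(size=(K, 2 * K))
we = rng.normal(size=(K, 2 * K))
na, ne = cross_branch_connection(a, e, wa, we)
print(na.shape, ne.shape)  # (4, 8, 8) (4, 8, 8)
```

Note that the spatial size is unchanged and the channel number is restored to K, exactly as described above.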
In traditional deep multi-task learning methods, different tasks usually share the lower network layers and keep separate branches only in the higher layers. Before multi-task training, the shared layers must be specified manually in advance, based on experience. This practice lacks theoretical guidance, and an unreasonable choice of shared layers may cause a serious drop in performance. Unlike those methods, this embodiment designs a separate branch for each task at all network layers; the cross-branch connection layers let the branches share information by exchanging image feature maps and automatically learn, during training, which information each task needs, thereby improving classification accuracy.
It should be noted that the order of steps S101 and S102 can be adjusted by those skilled in the art according to the specific situation.
S103: train the deep convolutional neural network on the training dataset until the predefined loss function reaches its minimum.
In a specific implementation, the process of training the deep convolutional neural network on the training dataset includes:

unifying the size of all images in the training dataset;

initializing the weights of each layer of the deep convolutional neural network and predefining the loss function; and

training the deep convolutional neural network with the stochastic gradient descent algorithm to determine the network weights that minimize the loss function, where in each training iteration a fixed-size image block is cropped from a random position of the image and flipped horizontally with a certain probability.
During training, first, all training images are scaled to a uniform size; this embodiment scales images to 256*256 pixels. Then, the per-pixel mean of the training images is computed and subtracted from each image; removing this common component highlights the individual differences among the training images. Finally, in each training iteration, a fixed-size image block is cropped from a random position of the mean-subtracted image and flipped horizontally with a certain probability. In this way, the number of training samples is effectively expanded and their diversity is improved. This embodiment uses image blocks of 224*224 pixels, and the probability of performing a horizontal flip each time is 0.5.
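The preprocessing of this embodiment (mean subtraction, random 224*224 crop from a 256*256 image, horizontal flip with probability 0.5) could be sketched as follows; the function name and argument layout are illustrative, not part of the disclosure:

```python
import numpy as np

def augment(image, mean, crop=224, flip_prob=0.5, rng=None):
    """Training-time preprocessing sketch for this embodiment.

    image: (256, 256, 3) array already scaled to the uniform size.
    mean:  per-pixel mean of the training set, same shape as `image`.
    A fixed-size block is cropped at a random position and flipped
    horizontally with the given probability.
    """
    if rng is None:
        rng = np.random.default_rng()
    x = image - mean                       # remove what all images share
    h, w = x.shape[:2]
    top = rng.integers(0, h - crop + 1)    # random crop position
    left = rng.integers(0, w - crop + 1)
    x = x[top:top + crop, left:left + crop]
    if rng.random() < flip_prob:
        x = x[:, ::-1]                     # horizontal flip
    return x

img = np.ones((256, 256, 3))
mean = np.zeros((256, 256, 3))
patch = augment(img, mean, rng=np.random.default_rng(1))
print(patch.shape)  # (224, 224, 3)
```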
Except for the last fully connected layer and the cross-branch connection layers, the weights of each layer in each branch are initialized with the weights of a VGG16 model pre-trained on the ImageNet dataset; the weights of the last fully connected layer and the cross-branch connection layers are initialized randomly. A cross-entropy loss function is used: the loss on aesthetic classification is denoted La and the loss on emotion classification is denoted Le, i.e.,

La = -ya log(pa) - (1 - ya) log(1 - pa)

Le = -Σe ye log(pe)

where ya denotes the true aesthetic category of the input image, taking the value 1 if the image is actually of high aesthetic quality and 0 otherwise; ye denotes the true emotion category of the input image, taking the value 1 if the image actually belongs to the e-th emotion category and 0 otherwise; pa is the network's output probability that the image belongs to the high-aesthetics category; and pe is the network's output probability that the image belongs to the e-th emotion category.
Further, the total loss function is L = La + λLe, where λ is a hyperparameter balancing the two types of loss. In this embodiment, considering that aesthetic classification is a binary classification problem while emotion classification is a multi-class problem, λ is set to 1/4. The network is trained with the stochastic gradient descent algorithm to determine the network weights that minimize the loss function.
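Assuming the emotion loss Le is the standard categorical cross-entropy over the eight emotion categories (as implied by the definitions of ye and pe), the total loss L = La + λLe with λ = 1/4 can be sketched as:

```python
import numpy as np

def joint_loss(y_a, p_a, y_e, p_e, lam=0.25):
    """Total multi-task loss L = La + lambda * Le of this embodiment.

    y_a: 1 if the image is of high aesthetic quality, else 0.
    p_a: predicted probability of the high-aesthetics category.
    y_e: one-hot vector over the eight emotion categories.
    p_e: predicted emotion probability distribution (sums to 1).
    """
    l_a = -(y_a * np.log(p_a) + (1 - y_a) * np.log(1 - p_a))  # binary CE
    l_e = -np.sum(y_e * np.log(p_e))                          # categorical CE
    return l_a + lam * l_e

# A high-aesthetics image whose true emotion is the third category.
y_e = np.zeros(8)
y_e[2] = 1.0
p_e = np.full(8, 0.05)
p_e[2] = 0.65
print(round(joint_loss(1, 0.9, y_e, p_e), 4))  # 0.2131
```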
S104: use the trained deep convolutional neural network to output the probabilities that a given image belongs to each aesthetic category and each emotion category, and select the highest-probability aesthetic category and emotion category as the image's predicted aesthetic category and emotion category, respectively.
In this embodiment, given an image, it is first scaled to 224*224 pixels and then fed into the trained network to obtain the probabilities that it belongs to each aesthetic category and each emotion category; the highest-probability categories are finally chosen as the image's predicted aesthetic category and emotion category.
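The prediction rule of S104 reduces to an argmax over each task's probability vector. A minimal sketch follows; the English category names are illustrative renderings, not fixed by the disclosure:

```python
import numpy as np

AESTHETIC = ["low aesthetics", "high aesthetics"]
EMOTIONS = ["pleasure", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]

def predict(prob_a, prob_e):
    """Pick the highest-probability category for each task."""
    return (AESTHETIC[int(np.argmax(prob_a))],
            EMOTIONS[int(np.argmax(prob_e))])

# Probabilities as output by the two Softmax layers of the network.
pa = np.array([0.3, 0.7])
pe = np.array([0.4, 0.1, 0.2, 0.1, 0.05, 0.05, 0.05, 0.05])
print(predict(pa, pe))  # ('high aesthetics', 'pleasure')
```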
This embodiment applies the idea of multi-task learning to the aesthetic classification and emotion classification of images, making full use of the correlations between the two tasks, and designs a unified deep convolutional neural network framework in which cross-branch connection layers let the network branches share information effectively by exchanging image feature maps and automatically learn, during training, which information each task needs, thereby achieving joint recognition of an image's aesthetic category and emotion category and improving the accuracy of image aesthetic and emotion classification.
The image aesthetics and emotion joint classification system based on deep multi-task learning proposed by the present disclosure is elaborated below with reference to Fig. 4.

As shown in Fig. 4, the image aesthetics and emotion joint classification system based on deep multi-task learning of this embodiment comprises: a training dataset forming module 11, a deep convolutional neural network construction module 12, a deep convolutional neural network training module 13, and a prediction classification module 14.
Wherein:
The training dataset forming module 11 is used to annotate images with their corresponding aesthetic categories and emotion categories to form a training dataset.
In a specific implementation, for the image aesthetic classification problem, images are divided into two classes, high aesthetics and low aesthetics; for the image emotion classification problem, images are divided into eight basic emotion categories: pleasure, awe, contentment, excitement, anger, disgust, fear, and sadness.
Since a person's aesthetics and emotions are both highly subjective cognitive attributes, there are obvious individual differences. Therefore, for the annotation of an image's aesthetic category and emotion category, a strategy is adopted in which several people jointly annotate the same image, and the category with the highest degree of consensus is then taken as the image's final category.
It should be understood that in other examples the image aesthetic categories and image emotion categories can also be divided differently; those skilled in the art can set them according to the specific situation, and they are not described in detail here.
The deep convolutional neural network construction module 12 is used to construct a deep convolutional neural network comprising cross-branch connection layers and two parallel network branches.
The two network branches are respectively responsible for aesthetic classification and emotion classification of the input image; the cross-branch connection layers connect corresponding convolutional layer groups in the two branches so as to associate the aesthetic classification and emotion classification tasks; and the output of the deep convolutional neural network represents the probabilities that the input image belongs to each aesthetic category and each emotion category.
Specifically, in the deep convolutional neural network, the two branches contain the same number n of convolutional layer groups, and there are n-1 cross-branch connection layers. The i-th cross-branch connection layer takes as input the image feature maps output by the i-th convolutional layer groups of the two branches, stacks these feature maps along the channel dimension, and feeds the stacked features to the (i+1)-th convolutional layer groups of the two branches, where 1 ≤ i ≤ n-1 and n is a positive integer greater than or equal to 2.
The deep convolutional neural network of this embodiment is shown in Fig. 2. The network comprises two parallel branches that receive the same input image and are responsible for aesthetic classification and emotion classification, respectively. The two branches share the same structure, based on the VGG16 architecture (see Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014). Each branch consists of 5 convolutional layer groups, 3 fully connected layers, and 1 Softmax layer. Each convolutional layer group contains several consecutive convolutional layers and one max-pooling layer, whose purpose is to extract effective image feature maps. The fully connected layers apply multiple nonlinear transformations to the feature maps output by the last convolutional layer group, mapping them to a column vector. The dimension of this vector equals the number of aesthetic categories or emotion categories, each component corresponding to one specific aesthetic or emotion category. The final Softmax layer converts each component of the vector into a probability value, representing the probability that the input image belongs to the corresponding category. The detailed structure and parameter settings of each layer in a branch follow the VGG16 network model.
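The Softmax mapping just described, which turns the fully connected output vector into per-category probabilities, can be written in a few lines. This is an illustrative numpy sketch, not part of the patent; the score values are made up:

```python
import numpy as np

def softmax(logits):
    """Convert a score vector into a probability distribution.

    Subtracting the maximum first is a standard numerical-stability
    trick and does not change the result.
    """
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / np.sum(exp)

# Example: 2 aesthetic categories (low/high) scored by one branch's
# last fully connected layer.
scores = np.array([1.0, 3.0])
probs = softmax(scores)
# probs sums to 1; the larger score receives the larger probability.
```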
Cross-branch connection layers are introduced to connect corresponding convolutional layer groups in the two branches; the structure of a cross-branch connection layer is shown in Fig. 3. The layer takes the image feature maps output by two convolutional layer groups as input and stacks them along the channel dimension. Assuming each input feature map has K channels (K a positive integer), the stacked feature map has 2K channels. The stacked feature map is then fed into two separate convolutional layers with 1*1 kernels. Each of these layers contains K kernels, with stride 1 and edge padding 0. Each layer therefore outputs a new feature map whose spatial size is unchanged and whose channel count is restored to K; the two new feature maps are finally sent to the subsequent convolutional layer group (or fully connected layer) of the respective branches. Intuitively, the cross-branch connection layer lets the two branches share information by exchanging image feature maps, helping the model automatically learn during training which information each of the two tasks needs.
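The operation of a cross-branch connection layer (channel-wise stacking followed by two independent 1*1 convolutions with K kernels each) can be sketched as follows. This is an illustrative numpy sketch with random weights; biases, nonlinearities, and the framework actually used are not specified in the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(feat, weight):
    """1*1 convolution with stride 1 and no padding: a per-pixel linear
    map over channels. feat: (C, H, W); weight: (K_out, C)."""
    return np.tensordot(weight, feat, axes=([1], [0]))  # -> (K_out, H, W)

def cross_branch_connect(feat_a, feat_e, w_a, w_e):
    """Cross-branch connection layer: stack the two K-channel feature
    maps along the channel axis (-> 2K channels), then apply two
    independent 1*1 conv layers with K kernels each, restoring K
    channels for each branch."""
    stacked = np.concatenate([feat_a, feat_e], axis=0)   # (2K, H, W)
    return conv1x1(stacked, w_a), conv1x1(stacked, w_e)  # each (K, H, W)

K, H, W = 4, 8, 8
feat_a = rng.standard_normal((K, H, W))   # aesthetic-branch feature map
feat_e = rng.standard_normal((K, H, W))   # emotion-branch feature map
w_a = rng.standard_normal((K, 2 * K))     # K kernels over 2K input channels
w_e = rng.standard_normal((K, 2 * K))
out_a, out_e = cross_branch_connect(feat_a, feat_e, w_a, w_e)
# Spatial size is unchanged and the channel count is restored to K.
```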
In traditional deep multi-task learning methods, different tasks typically share the lower network layers and keep separate branches only in the higher layers. Before multi-task training can begin, the shared network layers must be specified manually in advance, based on experience. This practice lacks theoretical guidance, and an unreasonable choice of shared layers may severely degrade performance. Unlike such methods, this embodiment designs a separate branch for each task across all network layers; the cross-branch connection layers allow the branches to share information by exchanging image feature maps and to automatically learn during training which information each task needs, thereby improving classification accuracy.
The deep convolutional neural network training module 13 is used to train the deep convolutional neural network on the training dataset until the predefined loss function reaches its minimum.
The deep convolutional neural network training module 13 comprises:
a size unification module 131, used to unify the size of all images in the training dataset;
an initialization module 132, used to initialize the weights of each layer of the deep convolutional neural network and the predefined loss function;
an iterative training module 133, used to train the deep convolutional neural network with the stochastic gradient descent algorithm and determine the network weights that minimize the loss function; in each training iteration, an image block of fixed size is cropped from a random position of the image and flipped horizontally with a certain probability.
When training the deep convolutional neural network on the training dataset, all training images are first scaled to a uniform size; in this embodiment, images are scaled to 256*256 pixels. Then, the pixel mean of the training images is computed and subtracted from every image; removing this common component highlights the individual differences among training images. Finally, in each training iteration, a fixed-size image block is cropped from a random position of the mean-subtracted image and flipped horizontally with a certain probability. In this way, the number of training samples is effectively enlarged and their diversity is promoted. This embodiment crops image blocks of 224*224 pixels and performs each horizontal flip with probability 0.5.
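The augmentation pipeline of this embodiment (mean subtraction, a random 224*224 crop of a 256*256 image, and a horizontal flip with probability 0.5) can be sketched as follows. This is an illustrative numpy sketch; the pixel data and mean below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, mean, crop=224, flip_prob=0.5):
    """One training-time augmentation pass as described in this
    embodiment: subtract the dataset pixel mean, crop a fixed-size
    block at a random position, and flip it horizontally with
    probability 0.5."""
    img = image - mean                       # remove the common component
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)      # random crop position
    left = rng.integers(0, w - crop + 1)
    block = img[top:top + crop, left:left + crop]
    if rng.random() < flip_prob:
        block = block[:, ::-1]               # horizontal flip
    return block

# A 256*256 RGB training image (random values stand in for real pixels).
image = rng.random((256, 256, 3))
mean = image.mean(axis=(0, 1))               # per-channel mean, illustrative
block = augment(image, mean)
```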
Except for the last fully connected layer and the cross-branch connection layers, the weights of every layer in each network branch are initialized with the weights of a VGG16 model pre-trained on the ImageNet dataset; the weights of the last fully connected layer and of the cross-branch connection layers are initialized randomly. Cross-entropy loss functions are used: the loss on aesthetic classification is defined as La, and the loss on emotion classification as Le, i.e.

La = -ya log pa - (1 - ya) log(1 - pa)

Le = -Σe ye log pe

where ya denotes the true aesthetic category of the input image, taking the value 1 if the image is actually a high-aesthetic image and 0 otherwise; ye denotes the true emotion category of the input image, taking the value 1 if the image actually belongs to the e-th emotion category and 0 otherwise; pa is the probability output by the network that the image belongs to the high-aesthetic category, and pe is the probability output by the network that the image belongs to the e-th emotion category.

Further, the total loss function is L = La + λLe, where λ is a hyperparameter that balances the two losses of the model. In this embodiment, considering that aesthetic classification is a binary classification problem while emotion classification is a multi-class problem, λ is set to 1/4. The network is trained with the stochastic gradient descent algorithm to determine the network weights that minimize the loss function.
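Under these definitions, the combined loss L = La + λLe can be computed as in the following sketch. This is illustrative numpy code: the multi-class form of Le and the example probabilities are assumptions, since the source text spells out only La explicitly:

```python
import numpy as np

def aesthetic_loss(y_a, p_a):
    """Binary cross-entropy La for the two-class aesthetic task."""
    return -y_a * np.log(p_a) - (1 - y_a) * np.log(1 - p_a)

def emotion_loss(y_e, p_e):
    """Multi-class cross-entropy -sum_e y_e log p_e with y_e one-hot
    (a standard form implied by the definitions of y_e and p_e)."""
    return -np.sum(y_e * np.log(p_e))

def total_loss(y_a, p_a, y_e, p_e, lam=0.25):
    """L = La + lambda * Le, with lambda = 1/4 in this embodiment."""
    return aesthetic_loss(y_a, p_a) + lam * emotion_loss(y_e, p_e)

# A high-aesthetic image (y_a = 1) belonging to emotion category 2 of 8.
y_a, p_a = 1, 0.9
y_e = np.zeros(8); y_e[2] = 1.0
p_e = np.full(8, 0.05); p_e[2] = 0.65     # probabilities sum to 1
loss = total_loss(y_a, p_a, y_e, p_e)
```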
The prediction classification module 14 uses the trained deep convolutional neural network to output the probabilities that a given image belongs to each aesthetic category and each emotion category, and selects the category with the highest probability among the aesthetic categories and among the emotion categories as the predicted aesthetic category and emotion category of the given image, respectively.
In this embodiment, a given image is first scaled to 224*224 pixels and then fed into the trained network to obtain the probabilities that it belongs to each aesthetic category and each emotion category; finally, the categories with the highest probabilities are selected as the predicted aesthetic category and emotion category of the image.
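The final prediction step, choosing the highest-probability category within each task, reduces to an argmax over each branch's output. The probabilities below are made up for illustration:

```python
import numpy as np

def predict(p_aesthetic, p_emotion):
    """Pick the highest-probability aesthetic and emotion category for
    a given image from the two branches' Softmax outputs."""
    return int(np.argmax(p_aesthetic)), int(np.argmax(p_emotion))

# Hypothetical network outputs for one image: 2 aesthetic categories
# (0 = low, 1 = high) and 8 emotion categories.
p_aesthetic = np.array([0.3, 0.7])
p_emotion = np.array([0.05, 0.1, 0.4, 0.1, 0.1, 0.1, 0.1, 0.05])
aesthetic_class, emotion_class = predict(p_aesthetic, p_emotion)
```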
This embodiment applies the idea of multi-task learning to the aesthetic classification and emotion classification of images, making full use of the correlated features between the two tasks. It designs a unified deep convolutional neural network framework in which cross-branch connection layers allow the network branches to effectively share information by exchanging image feature maps and to automatically learn during training which information each task needs, realizing joint recognition of the aesthetic category and emotion category of an image and improving the accuracy of both classifications.
In another embodiment, a computer-readable storage medium is provided on which a computer program is stored; when executed by a processor, the program implements the steps of the image aesthetic and emotion joint classification method based on deep multi-task learning shown in Fig. 1.
In another embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the steps of the image aesthetic and emotion joint classification method based on deep multi-task learning shown in Fig. 1.
Those skilled in the art should understand that embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, the disclosure may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured article including a command device, which realizes the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thereby provide steps for realizing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Those of ordinary skill in the art will appreciate that all or part of the processes in the above method embodiments can be completed by instructing relevant hardware through a computer program; the program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The foregoing are merely preferred embodiments of the present disclosure and are not intended to limit the disclosure; for those skilled in the art, the disclosure may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the disclosure shall be included within the protection scope of the disclosure.
Claims (10)
1. An image aesthetic and emotion joint classification method based on deep multi-task learning, characterized by comprising:
labeling each image with its corresponding aesthetic category and emotion category to form a training dataset;
constructing a deep convolutional neural network comprising cross-branch connection layers and two parallel network branches;
wherein the two network branches are responsible for aesthetic classification and emotion classification of the input image, respectively; the cross-branch connection layers are used to connect corresponding convolutional layer groups in the two branches so as to associate the aesthetic classification and emotion classification tasks; and the output of the deep convolutional neural network represents the probabilities that the input image belongs to each aesthetic category and each emotion category;
training the deep convolutional neural network with the training dataset until a predefined loss function reaches its minimum; and
using the trained deep convolutional neural network to output the probabilities that a given image belongs to each aesthetic category and each emotion category, and selecting the category with the highest probability among the aesthetic categories and among the emotion categories as the predicted aesthetic category and emotion category of the given image, respectively.
2. The image aesthetic and emotion joint classification method based on deep multi-task learning of claim 1, characterized in that, in the deep convolutional neural network, the two network branches contain the same number n of convolutional layer groups and there are n-1 cross-branch connection layers; the i-th cross-branch connection layer takes as input the image feature maps output by the i-th convolutional layer group of each of the two branches, stacks these input feature maps along the channel dimension, and feeds the stacked image features into the (i+1)-th convolutional layer group of each of the two branches, respectively; 1≤i≤n-1; n is a positive integer greater than or equal to 2.
3. The image aesthetic and emotion joint classification method based on deep multi-task learning of claim 2, characterized in that each convolutional layer group comprises one max-pooling layer and at least two consecutive convolutional layers.
4. The image aesthetic and emotion joint classification method based on deep multi-task learning of claim 1, characterized in that the process of training the deep convolutional neural network with the training dataset comprises:
unifying the size of all images in the training dataset;
initializing the weights of each layer of the deep convolutional neural network and the predefined loss function; and
training the deep convolutional neural network with the stochastic gradient descent algorithm to determine the network weights that minimize the loss function, wherein in each training iteration an image block of fixed size is cropped from a random position of the image and flipped horizontally with a certain probability.
5. An image aesthetic and emotion joint classification system based on deep multi-task learning, characterized by comprising:
a training dataset formation module, used to label each image with its corresponding aesthetic category and emotion category to form a training dataset;
a deep convolutional neural network construction module, used to construct a deep convolutional neural network comprising cross-branch connection layers and two parallel network branches;
wherein the two network branches are responsible for aesthetic classification and emotion classification of the input image, respectively; the cross-branch connection layers are used to connect corresponding convolutional layer groups in the two branches so as to associate the aesthetic classification and emotion classification tasks; and the output of the deep convolutional neural network represents the probabilities that the input image belongs to each aesthetic category and each emotion category;
a deep convolutional neural network training module, used to train the deep convolutional neural network with the training dataset until a predefined loss function reaches its minimum; and
a prediction classification module, used to output, with the trained deep convolutional neural network, the probabilities that a given image belongs to each aesthetic category and each emotion category, and to select the category with the highest probability among the aesthetic categories and among the emotion categories as the predicted aesthetic category and emotion category of the given image, respectively.
6. The image aesthetic and emotion joint classification system based on deep multi-task learning of claim 5, characterized in that, in the deep convolutional neural network, the two network branches contain the same number n of convolutional layer groups and there are n-1 cross-branch connection layers; the i-th cross-branch connection layer takes as input the image feature maps output by the i-th convolutional layer group of each of the two branches, stacks these input feature maps along the channel dimension, and feeds the stacked image features into the (i+1)-th convolutional layer group of each of the two branches, respectively; 1≤i≤n-1; n is a positive integer greater than or equal to 2.
7. The image aesthetic and emotion joint classification system based on deep multi-task learning of claim 6, characterized in that each convolutional layer group comprises one max-pooling layer and at least two consecutive convolutional layers.
8. The image aesthetic and emotion joint classification system based on deep multi-task learning of claim 5, characterized in that the deep convolutional neural network training module comprises:
a size unification module, used to unify the size of all images in the training dataset;
an initialization module, used to initialize the weights of each layer of the deep convolutional neural network and the predefined loss function; and
an iterative training module, used to train the deep convolutional neural network with the stochastic gradient descent algorithm and determine the network weights that minimize the loss function, wherein in each training iteration an image block of fixed size is cropped from a random position of the image and flipped horizontally with a certain probability.
9. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the steps of the image aesthetic and emotion joint classification method based on deep multi-task learning of any one of claims 1-4.
10. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when executing the program, the processor implements the steps of the image aesthetic and emotion joint classification method based on deep multi-task learning of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910272826.6A CN109978074A (en) | 2019-04-04 | 2019-04-04 | Image aesthetic feeling and emotion joint classification method and system based on depth multi-task learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910272826.6A CN109978074A (en) | 2019-04-04 | 2019-04-04 | Image aesthetic feeling and emotion joint classification method and system based on depth multi-task learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109978074A true CN109978074A (en) | 2019-07-05 |
Family
ID=67083180
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910272826.6A Pending CN109978074A (en) | 2019-04-04 | 2019-04-04 | Image aesthetic feeling and emotion joint classification method and system based on depth multi-task learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109978074A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111401294A (en) * | 2020-03-27 | 2020-07-10 | 山东财经大学 | Multitask face attribute classification method and system based on self-adaptive feature fusion |
CN111523574A (en) * | 2020-04-13 | 2020-08-11 | 云南大学 | Image emotion recognition method and system based on multi-mode data |
CN112668638A (en) * | 2020-12-25 | 2021-04-16 | 山东大学 | Image aesthetic quality evaluation and semantic recognition combined classification method and system |
CN113065571A (en) * | 2019-12-16 | 2021-07-02 | 北京沃东天骏信息技术有限公司 | Method and device for constructing training data set |
CN117315313A (en) * | 2022-03-30 | 2023-12-29 | 北京百度网讯科技有限公司 | Multitasking recognition method, training device, electronic equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106127780A (en) * | 2016-06-28 | 2016-11-16 | 华南理工大学 | A kind of curved surface defect automatic testing method and device thereof |
CN106354768A (en) * | 2016-08-18 | 2017-01-25 | 向莉妮 | Matching method for users and commodities and commodity matching recommendation method based on color |
CN107103590A (en) * | 2017-03-22 | 2017-08-29 | 华南理工大学 | A kind of image for resisting generation network based on depth convolution reflects minimizing technology |
CN107578436A (en) * | 2017-08-02 | 2018-01-12 | 南京邮电大学 | A kind of monocular image depth estimation method based on full convolutional neural networks FCN |
CN108427920A (en) * | 2018-02-26 | 2018-08-21 | 杭州电子科技大学 | A kind of land and sea border defense object detection method based on deep learning |
CN108898105A (en) * | 2018-06-29 | 2018-11-27 | 成都大学 | It is a kind of based on depth characteristic and it is sparse compression classification face identification method |
CN109120992A (en) * | 2018-09-13 | 2019-01-01 | 北京金山安全软件有限公司 | Video generation method and device, electronic equipment and storage medium |
Non-Patent Citations (3)
Title |
---|
YUAN GAO et al.: "NDDR-CNN: Layer-wise Feature Fusing in Multi-Task CNN by Neural Discriminative Dimensionality Reduction", arXiv:1801.08297v1 * |
YANG WENYA et al.: "Image aesthetic quality assessment method based on semantic perception", Journal of Computer Applications * |
WANG SHANNA: "Research on fabric aesthetic classification and emotion annotation based on convolutional neural networks", China Master's Theses Full-text Database, Engineering Science and Technology I * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109978074A (en) | Image aesthetic and emotion joint classification method and system based on deep multi-task learning | |
CN107610123A (en) | Image aesthetic quality evaluation method based on deep convolutional neural networks | |
CN109359538A (en) | Training method for convolutional neural networks, gesture recognition method, device and equipment | |
CN104933428B (en) | Face recognition method and device based on tensor description | |
CN107016415B (en) | Color-semantic classification method for color images based on fully convolutional networks | |
CN108961245A (en) | Image quality classification method based on a two-channel deep parallel convolutional network | |
CN107742107A (en) | Facial image classification method, device and server | |
CN108875934A (en) | Neural network training method, device, system and storage medium | |
CN107341506A (en) | Image emotion classification method based on multi-aspect deep learning representations | |
CN109325443A (en) | Face attribute recognition method based on multi-instance multi-label deep transfer learning | |
CN105956150B (en) | Method and device for generating user hairstyle and outfit collocation suggestions | |
CN102156885B (en) | Image classification method based on cascaded codebook generation | |
CN105512676A (en) | Food recognition method for intelligent terminals | |
CN109145871A (en) | Psychology and behavior recognition method, device and storage medium | |
CN105469376A (en) | Method and device for determining picture similarity | |
CN109766465A (en) | Image-text fusion book recommendation method based on machine learning | |
CN110689523A (en) | Personalized image information evaluation method based on meta-learning and information data processing terminal | |
CN108596243A (en) | Eye-movement gaze map prediction method based on classified gaze maps and conditional random fields | |
CN109377441A (en) | Tongue image acquisition method and system with privacy protection function | |
CN110059656A (en) | Leukocyte classification method and system based on convolutional generative adversarial networks | |
CN108875693A (en) | Image processing method, device, electronic equipment and storage medium | |
CN110263822A (en) | Image emotion analysis method based on multi-task learning | |
CN109376683A (en) | Video classification method and system based on dense graphs | |
CN109359610A (en) | Method and system for constructing a CNN-GB model, and data feature classification method | |
CN110163145A (en) | Video teaching emotion feedback system based on convolutional neural networks | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||