CN110944166B - Objective evaluation method for stereoscopic image visual satisfaction - Google Patents
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
Abstract
The invention discloses an objective evaluation method for the visual satisfaction of stereoscopic images. During training, the mean subjective score of the 3D visual satisfaction of each stereoscopic image in a training set is obtained; an objective evaluation value of visual comfort, an objective evaluation value of perceived absolute distance and an objective evaluation value of perceived relative distance are calculated for each stereoscopic image; and ε-SVR is used to fit the mean subjective scores against the three sets of objective evaluation values, yielding an objective evaluation model of stereoscopic image visual satisfaction. During testing, the same three objective evaluation values are obtained for the stereoscopic image to be evaluated and input into the model, which outputs a predicted visual satisfaction value. The advantage of the method is that it can predict objective evaluation results consistent with human subjective perception of satisfaction with stereoscopic images.
Description
Technical Field
The invention relates to image quality evaluation technology, and in particular to an objective evaluation method for the visual satisfaction of stereoscopic images.
Background
With the rapid development of stereoscopic video display technology and of high-quality stereoscopic video content acquisition technology, quality of experience (QoE) of stereoscopic images and videos has become an important issue in the design of stereoscopic systems, and Visual Comfort (VC) and Perceived Depth (PD) are important factors affecting it. Current research on the quality of visual experience of stereoscopic images mainly studies the influence of content distortion, and many results have been produced in this respect. Research on visual comfort, by contrast, concerns the physiological discomfort that arises even when the content of the stereoscopic image is undistorted, while perceived depth is closely related to the senses of stereoscopy, immersion and presence that a stereoscopic image gives the viewer. Visual comfort combined with perceived depth is defined here as 3D visual satisfaction. Research on objective evaluation methods for 3D visual satisfaction is very important for improving the viewer's quality of visual experience and for guiding the production and post-processing of 3D content.
The visual comfort of a stereoscopic image is closely related to binocular parallax: binocular parallax produces the stereoscopic sensation, and excessive binocular parallax is a major cause of visual discomfort. Existing objective evaluation methods for the visual comfort of stereoscopic images mainly extract parallax features, such as parallax amplitude, parallax gradient, relative parallax and parallax range, and then build an objective model using statistical methods or machine learning algorithms. When such models are used to guide the production and post-processing of 3D content, binocular parallax may be reduced excessively in order to optimize visual comfort. The result is an insufficient stereoscopic impression and a virtual scene that appears far from the viewer; that is, the perceived depth is weakened and the quality of visual experience drops. An objective evaluation method for stereoscopic image visual satisfaction that combines visual comfort and perceived depth is therefore needed to better guide the production and processing of 3D content.
Disclosure of Invention
The invention aims to provide an objective evaluation method for the stereoscopic image visual satisfaction, which integrates visual comfort evaluation and perception depth evaluation and can predict an objective evaluation result consistent with subjective perception of the stereoscopic image satisfaction of human beings.
The technical solution adopted by the invention to solve the above technical problem is an objective evaluation method for the visual satisfaction of stereoscopic images, characterized by comprising a training stage and a testing stage;
the specific steps of the training stage are as follows:
step 1_1: select N stereoscopic images of width W_image and height H_image to form a training set; wherein N is a positive integer and N ≥ 200;
step 1_2: obtain the mean subjective score of the 3D visual satisfaction of each stereoscopic image in the training set; then denote the set of mean subjective scores of the 3D visual satisfaction of all stereoscopic images in the training set as {VSMOS_n | 1 ≤ n ≤ N}; wherein n is a positive integer, 1 ≤ n ≤ N, and VSMOS_n is the mean subjective score of the 3D visual satisfaction of the nth stereoscopic image in the training set, which reflects the combined effect of visual comfort and perceived depth;
step 1_3: calculate the objective evaluation value of visual comfort of each stereoscopic image in the training set according to an objective evaluation model of visual comfort; then denote the set of objective evaluation values of visual comfort of all stereoscopic images in the training set as {VCA_n | 1 ≤ n ≤ N}; wherein VCA_n is the objective evaluation value of visual comfort of the nth stereoscopic image in the training set;
step 1_4: obtain the right-viewpoint parallax image of each stereoscopic image in the training set; denote the right-viewpoint parallax image of the nth stereoscopic image in the training set as DR_n and the pixel value at coordinate position (x, y) in DR_n as DR_n(x, y), which is also the right-viewpoint parallax value of the pixel at (x, y) in the nth stereoscopic image in the training set; then calculate the physical distance map of each stereoscopic image in the training set; denote the physical distance map of the nth stereoscopic image in the training set as VD_n and the pixel value at (x, y) in VD_n as VD_n(x, y), which is also the physical distance value, in millimeters, of the pixel at (x, y) in the nth stereoscopic image in the training set; wherein 0 ≤ x < W_image, 0 ≤ y < H_image, p denotes the interpupillary distance of the subject, L denotes the physical distance from the subject's eyes to the 3D display, DRS_n(x, y) denotes the screen parallax value of the pixel at (x, y) in DR_n, W_3D and H_3D denote the width and height of the 3D display screen, and p, L, W_3D and H_3D are all in millimeters;
step 1_5: calculate the objective evaluation value of perceived absolute distance of each stereoscopic image in the training set from its physical distance map; then denote the set of objective evaluation values of perceived absolute distance of all stereoscopic images in the training set as {APDA_n | 1 ≤ n ≤ N}; wherein APDA_n is the objective evaluation value of perceived absolute distance of the nth stereoscopic image in the training set;
step 1_6: calculate the objective evaluation value of perceived relative distance of each stereoscopic image in the training set from its physical distance map; then denote the set of objective evaluation values of perceived relative distance of all stereoscopic images in the training set as {RPDA_n | 1 ≤ n ≤ N}; wherein RPDA_n is the objective evaluation value of perceived relative distance of the nth stereoscopic image in the training set;
step 1_7: use ε-SVR as the training model to fit {VSMOS_n | 1 ≤ n ≤ N} against {VCA_n | 1 ≤ n ≤ N}, {APDA_n | 1 ≤ n ≤ N} and {RPDA_n | 1 ≤ n ≤ N}, obtaining the objective evaluation model of stereoscopic image visual satisfaction, described as VCPDA = VCPD(VCA, APDA, RPDA); wherein the kernel function of the ε-SVR is of the histogram intersection type, the width coefficient is γ = 1/256, the penalty coefficient is C = 4 and the insensitive loss coefficient is ε = 0.1; VCPD() is the functional form of the objective evaluation model of stereoscopic image visual satisfaction; VCA, APDA and RPDA are the inputs of the model, representing respectively the objective evaluation value of visual comfort, the objective evaluation value of perceived absolute distance and the objective evaluation value of perceived relative distance of a stereoscopic image; and VCPDA is the output of the model;
the specific steps of the testing stage are as follows:
step 2_1: for a stereoscopic image whose visual satisfaction is to be evaluated, obtain its objective evaluation value of visual comfort, objective evaluation value of perceived absolute distance and objective evaluation value of perceived relative distance in the same way as in steps 1_3 to 1_6, and denote them as VCA_test, APDA_test and RPDA_test respectively; then input VCA_test, APDA_test and RPDA_test into the objective evaluation model of stereoscopic image visual satisfaction to obtain the predicted visual satisfaction value of the stereoscopic image to be evaluated.
In step 1_2, the mean subjective score of the 3D visual satisfaction of each stereoscopic image in the training set is obtained as follows:
step 1_2a: select more than 10 subjects, including both men and women, to take part in the subjective experiment; each subject must have a stereoacuity better than 60 arcsec and must pass a color vision test;
step 1_2b: place each subject at a distance from the 3D display equal to 3 times the screen height of the 3D display, so that each subject views the displayed stereoscopic images at eye level; inform each subject of the scoring standard, which is to give a single overall score that combines the two subjective visual perceptions of visual comfort and perceived depth of the stereoscopic image;
step 1_2c: have the 3D display show each stereoscopic image in the training set in sequence, cycling through the whole set 2 to 3 times in total; each stereoscopic image is displayed for 5 to 10 seconds, the next stereoscopic image is displayed after the subject has rested for 3 to 10 seconds, and the 3D display shows a mid-gray image during the rest; each subject views the displayed stereoscopic images at eye level, and during the last pass each subject uses the rest period to score the 3D visual satisfaction of the stereoscopic image just displayed, according to the absolute category rating in the international standards ITU-T P.910 and ITU-T P.911, where 5 points means "very satisfactory", 4 points means "satisfactory", 3 points means "fair", 2 points means "unsatisfactory" and 1 point means "very unsatisfactory";
step 1_2d: after each subject has scored the 3D visual satisfaction of each stereoscopic image in the training set, remove abnormal scores and keep normal scores according to the screening method disclosed in the international standard ITU-R BT.500-11;
step 1_2e: for each stereoscopic image in the training set, take the average of all its normal scores as the mean subjective score of its 3D visual satisfaction.
In step 1_5, APDA_n is obtained as follows:
step 1_5a: sort the pixel values of all pixels in VD_n in descending or ascending order;
step 1_5b: remove the smallest 1% of the sorted pixel values; then find the minimum among the remaining pixel values and denote it as vd_min;
In step 1_6, RPDA_n is obtained as follows:
step 1_6a: sort the pixel values of all pixels in VD_n in descending or ascending order;
step 1_6b: remove the smallest 1% and the largest 1% of the sorted pixel values; then find the minimum and the maximum among the remaining pixel values and denote them as vd_min and vd_max respectively;
Compared with the prior art, the invention has the following advantage:
The method combines visual comfort evaluation with perceived depth evaluation, where perceived depth evaluation comprises perceived absolute distance evaluation and perceived relative distance evaluation. A machine learning method is then used to establish the mapping between the three objective evaluation values (visual comfort, perceived absolute distance and perceived relative distance) and the mean subjective score of the 3D visual satisfaction of a stereoscopic image. The resulting objective evaluation model of stereoscopic image visual satisfaction predicts the human visual system's preference for a stereoscopic image more accurately and yields objective evaluation results consistent with human subjective perception of satisfaction with stereoscopic images.
Drawings
FIG. 1 is a block diagram of an overall implementation of the method of the present invention;
FIG. 2 is a diagram illustrating the relationship among the physical distance value VD_n(x, y) of the pixel at coordinate position (x, y) in the nth stereoscopic image in the training set, the screen parallax value DRS_n(x, y) of the pixel at (x, y) in the right-viewpoint parallax image of the nth stereoscopic image in the training set, the physical distance L from the subject's eyes to the 3D display, and the interpupillary distance p of the subject.
Detailed Description
The invention is described in further detail below with reference to the accompanying examples.
The overall implementation block diagram of the objective evaluation method for the visual satisfaction degree of the stereo image provided by the invention is shown in fig. 1, and the method comprises a training stage and a testing stage.
The specific steps of the training phase process are as follows:
Step 1_1: select N stereoscopic images of width W_image and height H_image to form a training set, where N is a positive integer and N ≥ 200.
Step 1_2: obtain the mean subjective score of the 3D visual satisfaction of each stereoscopic image in the training set; then denote the set of mean subjective scores of the 3D visual satisfaction of all stereoscopic images in the training set as {VSMOS_n | 1 ≤ n ≤ N}, where n is a positive integer, 1 ≤ n ≤ N, and VSMOS_n is the mean subjective score of the 3D visual satisfaction of the nth stereoscopic image in the training set. This score reflects the combined effect of visual comfort and perceived depth.
In this embodiment, the mean subjective score of the 3D visual satisfaction of each stereoscopic image in the training set is obtained in step 1_2 as follows:
Step 1_2a: select more than 10 subjects, including both men and women, to take part in the subjective experiment; each subject must have a stereoacuity better than 60 arcsec and must pass a color vision test.
Step 1_2b: place each subject at a distance from the 3D display equal to 3 times the screen height of the 3D display, so that each subject views the displayed stereoscopic images at eye level; inform each subject of the scoring standard, which is to give a single overall score that combines the two subjective visual perceptions of visual comfort and perceived depth of the stereoscopic image.
Step 1_2c: have the 3D display show each stereoscopic image in the training set in sequence, cycling through the whole set 2 to 3 times in total; each stereoscopic image is displayed for 5 to 10 seconds, the next stereoscopic image is displayed after the subject has rested for 3 to 10 seconds, and the 3D display shows a mid-gray image during the rest. Each subject views the displayed stereoscopic images at eye level, and during the last pass each subject uses the rest period to score the 3D visual satisfaction of the stereoscopic image just displayed, according to the absolute category rating in the international standards ITU-T P.910 and ITU-T P.911, where 5 points means "very satisfactory", 4 points means "satisfactory", 3 points means "fair", 2 points means "unsatisfactory" and 1 point means "very unsatisfactory".
The N stereoscopic images are cycled through only 2 to 3 times because too many passes tend to cause visual fatigue in the subjects, which would make the scores given in the last pass inaccurate. The display time of each stereoscopic image on the 3D display may be chosen according to the size of the stereoscopic image; larger stereoscopic images are generally displayed longer, for example for 10 seconds.
Step 1_2d: after each subject has scored the 3D visual satisfaction of each stereoscopic image in the training set, remove abnormal scores and keep normal scores according to the screening method disclosed in the international standard ITU-R BT.500-11.
Step 1_2e: for each stereoscopic image in the training set, take the average of all its normal scores as the mean subjective score of its 3D visual satisfaction.
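Steps 1_2d and 1_2e can be sketched in Python as follows. The rejection rule used here (drop scores more than `k` standard deviations from the per-image mean) is only a simplified stand-in and is not the screening procedure of ITU-R BT.500-11; the function name `vsmos`, the threshold `k` and the sample ratings are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def vsmos(scores, k=2.0):
    """Mean of one image's scores after dropping outlying scores.

    scores: 1..5 ratings given by all subjects to one stereoscopic image.
    k: hypothetical rejection threshold in standard deviations (an
       assumption; BT.500-11 defines its own observer screening).
    """
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All subjects agree, so there is nothing to screen.
        return float(mu)
    kept = [s for s in scores if abs(s - mu) <= k * sigma]
    return float(mean(kept))

# Made-up ratings from 11 subjects; the single score of 1 is screened out.
ratings = [4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 1]
print(vsmos(ratings))
```

Applying such a function to every image in the training set would give the set {VSMOS_n | 1 ≤ n ≤ N} used as the regression target in step 1_7.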
Step 1_3: calculate the objective evaluation value of visual comfort of each stereoscopic image in the training set according to an objective evaluation model of visual comfort; then denote the set of objective evaluation values of visual comfort of all stereoscopic images in the training set as {VCA_n | 1 ≤ n ≤ N}, where VCA_n is the objective evaluation value of visual comfort of the nth stereoscopic image in the training set.
Here, the objective evaluation model of visual comfort is taken from the paper: Y. J. Jung, H. Sohn, S. Lee, H. W. Park, and Y. M. Ro, "Predicting Visual Discomfort of Stereoscopic Images Using Human Attention Model", IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, no. 12, pp. 2077-2082, 2013.
Step 1_4: obtain the right-viewpoint parallax image of each stereoscopic image in the training set; denote the right-viewpoint parallax image of the nth stereoscopic image in the training set as DR_n and the pixel value at coordinate position (x, y) in DR_n as DR_n(x, y), which is also the right-viewpoint parallax value of the pixel at (x, y) in the nth stereoscopic image in the training set. Then calculate the physical distance map of each stereoscopic image in the training set; denote the physical distance map of the nth stereoscopic image in the training set as VD_n and the pixel value at (x, y) in VD_n as VD_n(x, y), which is also the physical distance value of the pixel at (x, y) in the nth stereoscopic image in the training set. Here 0 ≤ x < W_image, 0 ≤ y < H_image, p denotes the interpupillary distance of the subject, L denotes the physical distance from the subject's eyes to the 3D display, DRS_n(x, y) denotes the screen parallax value of the pixel at (x, y) in DR_n, W_3D and H_3D denote the width and height of the 3D display screen, and p, L, W_3D and H_3D are all in millimeters.
Here, DR_n is obtained using the prior art; FIG. 2 shows the relationship among the physical distance value VD_n(x, y) of the pixel at (x, y) in the nth stereoscopic image in the training set, the screen parallax value DRS_n(x, y) of the pixel at (x, y) in the right-viewpoint parallax image of the nth stereoscopic image in the training set, the physical distance L from the subject's eyes to the 3D display, and the interpupillary distance p of the subject.
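The equations for DRS_n(x, y) and VD_n(x, y) are not reproduced in this text. The sketch below therefore assumes the usual similar-triangle viewing geometry: pixel parallax is scaled to millimeters by W_3D / W_image, and the distance from the eyes to the perceived point is p·L / (p − DRS), with positive parallax taken to mean a point behind the screen. Both formulas, the sign convention, the function name and the default values of `p` and `view_dist` are assumptions, not taken from the patent.

```python
def physical_distance_map(dr, w_image, w_3d, p=65.0, view_dist=1950.0):
    """Convert a pixel-parallax map into a physical distance map (mm).

    dr: list of rows of parallax values in pixels (DR_n).
    w_image: image width in pixels (W_image).
    w_3d: screen width in millimeters (W_3D).
    p: interpupillary distance in mm (hypothetical typical value).
    view_dist: eye-to-screen distance L in mm (hypothetical value).
    """
    vd = []
    for row in dr:
        out_row = []
        for d in row:
            drs = d * w_3d / w_image  # assumed screen parallax in mm
            if drs >= p:
                # The lines of sight no longer converge in front of the viewer.
                raise ValueError("screen parallax must be smaller than p")
            out_row.append(p * view_dist / (p - drs))
        vd.append(out_row)
    return vd

# With w_3d == w_image, one pixel of parallax equals one millimeter.
vd = physical_distance_map([[0, 13, -65]], w_image=1000, w_3d=1000.0)
print(vd)
```

Under these assumptions, zero parallax places the point on the screen (distance L), positive parallax places it behind the screen and negative parallax places it in front of the screen.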
Step 1_5: calculate the objective evaluation value of perceived absolute distance of each stereoscopic image in the training set from its physical distance map; then denote the set of objective evaluation values of perceived absolute distance of all stereoscopic images in the training set as {APDA_n | 1 ≤ n ≤ N}, where APDA_n is the objective evaluation value of perceived absolute distance of the nth stereoscopic image in the training set.
In this embodiment, APDA_n in step 1_5 is obtained as follows:
Step 1_5a: sort the pixel values of all pixels in VD_n in descending or ascending order.
Step 1_5b: remove the smallest 1% of the sorted pixel values; then find the minimum among the remaining pixel values and denote it as vd_min.
Step 1_6: calculate the objective evaluation value of perceived relative distance of each stereoscopic image in the training set from its physical distance map; then denote the set of objective evaluation values of perceived relative distance of all stereoscopic images in the training set as {RPDA_n | 1 ≤ n ≤ N}, where RPDA_n is the objective evaluation value of perceived relative distance of the nth stereoscopic image in the training set.
In this embodiment, RPDA_n in step 1_6 is obtained as follows:
Step 1_6a: sort the pixel values of all pixels in VD_n in descending or ascending order.
Step 1_6b: remove the smallest 1% and the largest 1% of the sorted pixel values; then find the minimum and the maximum among the remaining pixel values and denote them as vd_min and vd_max respectively.
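The sorting and trimming in steps 1_5a to 1_5b and 1_6a to 1_6b can be sketched as below. How 1% of the pixel count is rounded is not stated in the text, so rounding down is an assumption, as is the function name `trimmed_extremes`. How APDA_n and RPDA_n are then derived from vd_min and vd_max is not reproduced in this text, so the sketch stops at the two extremes.

```python
def trimmed_extremes(vd, trim=0.01):
    """Return (vd_min, vd_max) after discarding the extreme 1% at each end.

    vd: physical distance map as a list of rows (values in mm).
    trim: fraction removed from each end of the sorted values.
    """
    values = sorted(v for row in vd for v in row)
    k = int(len(values) * trim)          # assumed: round the 1% count down
    kept = values[k:len(values) - k]     # drop k smallest and k largest
    return kept[0], kept[-1]

# A made-up 10 x 20 map holding the values 1..200 (so 1% is 2 pixels).
demo = [[r * 20 + c + 1 for c in range(20)] for r in range(10)]
vd_min, vd_max = trimmed_extremes(demo)
print(vd_min, vd_max)
```

Because removing the largest values does not change the smallest remaining value, the vd_min found this way is the same in step 1_5b and step 1_6b. Discarding the extreme 1% makes both values less sensitive to isolated errors in the parallax image.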
Step 1_7: use the classical ε-SVR as the training model to fit {VSMOS_n | 1 ≤ n ≤ N} against {VCA_n | 1 ≤ n ≤ N}, {APDA_n | 1 ≤ n ≤ N} and {RPDA_n | 1 ≤ n ≤ N}, obtaining the objective evaluation model of stereoscopic image visual satisfaction, described as VCPDA = VCPD(VCA, APDA, RPDA). The kernel function of the ε-SVR is of the histogram intersection type, the width coefficient is γ = 1/256, the penalty coefficient is C = 4 and the insensitive loss coefficient is ε = 0.1. VCPD() is the functional form of the objective evaluation model of stereoscopic image visual satisfaction; VCA, APDA and RPDA are the inputs of the model, representing respectively the objective evaluation value of visual comfort, the objective evaluation value of perceived absolute distance and the objective evaluation value of perceived relative distance of a stereoscopic image; and VCPDA is the output of the model.
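As a minimal illustration of the kernel named in step 1_7, the sketch below computes a histogram intersection kernel and the resulting Gram matrix for a few feature vectors (VCA, APDA, RPDA). The feature values are made up, the function names are hypothetical, and the features are assumed to be non-negative, which this kernel normally requires.

```python
def hist_intersection(a, b):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(x, y) for x, y in zip(a, b))

def gram_matrix(rows_a, rows_b):
    """Kernel value between every vector in rows_a and every one in rows_b."""
    return [[hist_intersection(a, b) for b in rows_b] for a in rows_a]

# Made-up (VCA, APDA, RPDA) triples for three training images.
features = [
    [3, 2, 1],
    [4, 1, 2],
    [2, 2, 2],
]
for row in gram_matrix(features, features):
    print(row)
```

A Gram matrix of this kind could be handed to any ε-SVR implementation that accepts precomputed kernels, for example scikit-learn's `SVR(kernel="precomputed", C=4.0, epsilon=0.1)`. The role that γ = 1/256 plays with this kernel is not specified in the text, so it is left out of the sketch.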
The specific steps of the testing stage are as follows:
Step 2_1: for a stereoscopic image whose visual satisfaction is to be evaluated, obtain its objective evaluation value of visual comfort, objective evaluation value of perceived absolute distance and objective evaluation value of perceived relative distance in the same way as in steps 1_3 to 1_6, and denote them as VCA_test, APDA_test and RPDA_test respectively; then input VCA_test, APDA_test and RPDA_test into the objective evaluation model of stereoscopic image visual satisfaction to obtain the predicted visual satisfaction value of the stereoscopic image to be evaluated.
To further illustrate the feasibility and effectiveness of the method of the present invention, experiments were conducted on the method of the present invention.
Here, taking a stereo image database provided by korea institute of science and technology images and video systems laboratory (IVY LAB) as an example, the stereo image database includes 120 stereo images and right viewpoint parallax images of each stereo image, includes indoor and outdoor stereo images of various scene depths, and gives an average subjective evaluation mean of 3D visual satisfaction of each stereo image.
In this experiment, 22 stereo images in which the mean subjective evaluation of 3D visual satisfaction in the stereo image database was lower than 3 points were selected, and the numbers were: 2. 28, 29, 30, 32, 33, 35, 39, 46, 47, 49, 50, 51, 52, 53, 55, 70, 73, 74, 101, 102, 103. The 22 original stereo images are respectively subjected to parallax offset and reduction for 12 times, and then, 12 secondary stereo images with improved visual comfort are redrawn for each stereo image, and the 22 original stereo images and 264 secondary stereo images form an image set. Here, the parallax shifting and zooming out and redrawing methods come from the papers: jung, h.sohn, s.lee, and y.m.ro, "Visual comfort improvement in stereoscopic 3D display using perceptual comfort assessment method of Visual comfort", IEEE Transactions on Consumer Electronics, vol.60, No.1, pp.1-9,2014 (perceptually reliable Visual comfort assessment index applied to Visual comfort improvement for stereoscopic 3D displays, published 2014 in the "Consumer Electronics" journal of the institute of electrical and Electronics engineers, volume 60, phase 1, pages 1-9).
To verify the performance of the stereoscopic image visual satisfaction objective evaluation model of the method, 18 original stereoscopic images are randomly selected from the image set and, together with their 216 corresponding secondary stereoscopic images, form a training set of 234 images; the remaining 4 original stereoscopic images and their 48 corresponding secondary stereoscopic images form a test set of 52 images. Covering every choice of 4 test originals out of 22 requires C(22, 4) = 7315 rounds of cross-validation. Four objective parameters commonly used to assess image quality evaluation methods serve as evaluation indexes, namely the Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SROCC), the mean absolute error (MAE) and the root mean square error (RMSE) under nonlinear regression conditions. PLCC reflects the correlation between the objective evaluation prediction values and the subjective evaluation values; SROCC reflects the monotonicity and consistency between the objective evaluation prediction values and the subjective evaluation values; MAE and RMSE reflect the accuracy of the objective evaluation prediction values. The method is used to obtain the visual satisfaction prediction value of each stereoscopic image in the image set, and a five-parameter logistic function is used for nonlinear fitting of the visual satisfaction prediction values of the 286 stereoscopic images. Higher PLCC and SROCC values and lower MAE and RMSE values indicate better correlation between the objective evaluation results and the average subjective evaluation means. Table 1 gives the evaluation index values for the method of the invention.
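The split counts above can be checked with a short sketch. It assumes, as an interpretation of the text rather than something the patent states explicitly, that each original image is held out together with all 12 of its secondary images and that the 7315 rounds enumerate every choice of 4 test originals out of 22. The names `split_sizes` and `iter_splits` are hypothetical.

```python
from itertools import combinations
from math import comb

# Numbers of the 22 original images selected in the experiment.
IDS = [2, 28, 29, 30, 32, 33, 35, 39, 46, 47, 49, 50,
       51, 52, 53, 55, 70, 73, 74, 101, 102, 103]
VARIANTS_PER_ORIGINAL = 12  # secondary images re-rendered per original


def split_sizes(n_originals, n_test_originals, variants):
    """Return (train, test) image counts when whole groups are held out."""
    per_group = 1 + variants  # the original plus its secondary images
    n_train = (n_originals - n_test_originals) * per_group
    n_test = n_test_originals * per_group
    return n_train, n_test


def iter_splits(ids, n_test):
    """Yield (train_ids, test_ids) for every choice of n_test originals."""
    for test_ids in combinations(ids, n_test):
        held_out = set(test_ids)
        train_ids = [i for i in ids if i not in held_out]
        yield train_ids, list(test_ids)


n_splits = comb(len(IDS), 4)
train_size, test_size = split_sizes(len(IDS), 4, VARIANTS_PER_ORIGINAL)
print(n_splits, train_size, test_size)  # 7315 234 52
```

Holding out whole groups keeps secondary images derived from a test original out of the training set, which is presumably why the split is defined on originals.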
Table 1: Performance evaluation index values
Performance evaluation index | PLCC | SROCC | MAE | RMSE |
---|---|---|---|---|
Index value | 0.9439 | 0.9349 | 0.2643 | 0.3308 |
As can be seen from Table 1, the objective evaluation results obtained by the method of the present invention agree closely with the subjective perception of human eyes, which demonstrates the effectiveness of the method.
Claims (1)
1. An objective evaluation method for stereoscopic image visual satisfaction, characterized by comprising a training stage and a testing stage;
the specific steps of the training phase process are as follows:
step 1_1: select N stereoscopic images of width W_image and height H_image to form a training set; where N is a positive integer and N ≥ 200;
step 1_2: obtain the average subjective evaluation mean of 3D visual satisfaction of each stereoscopic image in the training set; then denote the set formed by the average subjective evaluation means of 3D visual satisfaction of all stereoscopic images in the training set as {VSMOS_n | 1 ≤ n ≤ N}; where n is a positive integer, 1 ≤ n ≤ N, VSMOS_n represents the average subjective evaluation mean of 3D visual satisfaction of the n-th stereoscopic image in the training set, and the average subjective evaluation mean of 3D visual satisfaction reflects the combined effect of visual comfort and perceived depth;
in the step 1_2, the specific process of obtaining the average subjective evaluation mean value of the 3D visual satisfaction of each stereo image in the training set is as follows:
step 1_2a: select more than 10 subjects, including both men and women, to participate in the subjective experiment, where the visual system of each subject has a stereo acuity of less than 60 arcsec and passes a color vision test;
step 1_2b: set the distance between each subject and the 3D display used to show the stereoscopic images to 3 times the screen height of the 3D display, and have each subject view the stereoscopic images on the 3D display at eye level; inform each subject of the scoring criterion, which is to give a single overall score that combines the two subjective visual perceptions of a stereoscopic image, visual comfort and perceived depth;
step 1_2c: have the 3D display show each stereoscopic image in the training set in turn, cycling through the whole set 2 to 3 times in total; each stereoscopic image is displayed for 5 to 10 seconds, the subject then rests for 3 to 10 seconds before the next stereoscopic image is displayed, and the 3D display shows a mid-gray image during the rest; each subject views the displayed stereoscopic images at eye level, and during the last pass through the training set each subject uses the rest period to score the 3D visual satisfaction level of the stereoscopic image just displayed, according to the absolute category rating in the international standards ITU-T P.910 and ITU-T P.911, where 5 points represents "very satisfactory", 4 points "satisfactory", 3 points "normal", 2 points "unsatisfactory" and 1 point "very unsatisfactory";
step 1_2d: after each subject has scored the 3D visual satisfaction level of each stereoscopic image in the training set, remove abnormal scores and keep normal scores according to the screening method given in the international standard ITU-R BT.500-11;
step 1_2e: for each stereoscopic image in the training set, take the mean of all its normal scores as its average subjective evaluation mean of 3D visual satisfaction;
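Steps 1_2d and 1_2e amount to screening the scores of one image and averaging what remains. The sketch below illustrates that flow only: the screening rule (dropping scores more than `k` standard deviations from the image mean) is a simplified stand-in chosen for illustration and is not the ITU-R BT.500-11 procedure; the function name and the parameter `k` are hypothetical.

```python
from statistics import mean, pstdev


def mean_after_screening(scores, k=2.0):
    """Average the scores of one image after a simple outlier screen.

    Assumption: a score is treated as abnormal when it lies more than
    k population standard deviations from the mean of all scores for
    that image. This only stands in for the BT.500-11 screening.
    """
    m = mean(scores)
    s = pstdev(scores)
    if s == 0:
        return m  # all subjects agree, nothing to screen
    kept = [v for v in scores if abs(v - m) <= k * s]
    return mean(kept)


# Ten subjects give 4 points and one gives 1 point; the 1 is screened out.
print(mean_after_screening([4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1]))  # 4
```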
step 1_3: calculate the visual comfort objective evaluation value of each stereoscopic image in the training set according to a visual comfort objective evaluation model; then denote the set of visual comfort objective evaluation values of all stereoscopic images in the training set as {VCA_n | 1 ≤ n ≤ N}; where VCA_n represents the visual comfort objective evaluation value of the n-th stereoscopic image in the training set;
step 1_4: obtain the right viewpoint parallax image of each stereoscopic image in the training set, denote the right viewpoint parallax image of the n-th stereoscopic image in the training set as DR_n, and denote the pixel value of the pixel at coordinate position (x, y) in DR_n as DR_n(x, y); DR_n(x, y) is also the right viewpoint parallax value of the pixel at coordinate position (x, y) in the n-th stereoscopic image in the training set; then calculate the physical distance map of each stereoscopic image in the training set, denote the physical distance map of the n-th stereoscopic image in the training set as VD_n, and denote the pixel value of the pixel at coordinate position (x, y) in VD_n as VD_n(x, y); VD_n(x, y) is also the physical distance value of the pixel at coordinate position (x, y) in the n-th stereoscopic image in the training set, and its unit is millimeters; where 0 ≤ x < W_image, 0 ≤ y < H_image, p denotes the interpupillary distance of the subject, L denotes the physical distance from the subject's eyes to the 3D display, DRS_n(x, y) denotes the screen parallax value of the pixel at coordinate position (x, y) in DR_n, W_3D is the screen width of the 3D display, H_3D is the screen height of the 3D display, and p, L, W_3D and H_3D are all in millimeters;
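One way to picture step 1_4 is the standard stereoscopic viewing geometry, sketched below. The formulas are assumptions introduced for illustration and are not taken from this text: screen parallax is taken as the pixel parallax scaled by W_3D / W_image, and the physical distance follows from similar triangles as p·L / (p − parallax), with positive parallax meaning a point perceived behind the screen. The function names and the sign convention are hypothetical.

```python
def screen_parallax_mm(d_pixels, image_width_px, screen_width_mm):
    """Convert a parallax in pixels to millimeters on the screen (assumed scaling)."""
    return d_pixels * screen_width_mm / image_width_px


def physical_distance_mm(d_pixels, image_width_px, screen_width_mm,
                         p_mm, viewing_distance_mm):
    """Distance from the eyes to the perceived point, by similar triangles.

    Assumed convention: positive parallax places the point behind the
    screen, negative parallax in front of it, zero parallax on the screen.
    """
    s = screen_parallax_mm(d_pixels, image_width_px, screen_width_mm)
    if s >= p_mm:
        raise ValueError("parallax must be smaller than the interpupillary distance")
    return p_mm * viewing_distance_mm / (p_mm - s)


def physical_distance_map(parallax_map, image_width_px, screen_width_mm,
                          p_mm, viewing_distance_mm):
    """Apply physical_distance_mm to every pixel of a 2D parallax map."""
    return [[physical_distance_mm(d, image_width_px, screen_width_mm,
                                  p_mm, viewing_distance_mm) for d in row]
            for row in parallax_map]


# Example values (hypothetical): 1920 px wide image on a 960 mm wide screen,
# 65 mm interpupillary distance, 1500 mm viewing distance.
print(physical_distance_map([[0, -20, 10]], 1920, 960.0, 65.0, 1500.0))
# [[1500.0, 1300.0, 1625.0]]
```

Under these assumptions a pixel with zero parallax sits exactly at the viewing distance L, which gives a quick sanity check on any implementation.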
step 1_5: calculate the perceived absolute distance objective evaluation value of each stereoscopic image in the training set according to the physical distance map of each stereoscopic image in the training set; then denote the set of perceived absolute distance objective evaluation values of all stereoscopic images in the training set as {APDA_n | 1 ≤ n ≤ N}; where APDA_n represents the perceived absolute distance objective evaluation value of the n-th stereoscopic image in the training set;
the acquisition process of APDA_n in step 1_5 is:
step 1_5a: sort the pixel values of all pixels in VD_n in descending or ascending order;
step 1_5b: remove the smallest 1% of the sorted pixel values; then find the minimum among the remaining pixel values and denote it vd_min;
step 1_6: calculate the perceived relative distance objective evaluation value of each stereoscopic image in the training set according to the physical distance map of each stereoscopic image in the training set; then denote the set of perceived relative distance objective evaluation values of all stereoscopic images in the training set as {RPDA_n | 1 ≤ n ≤ N}; where RPDA_n represents the perceived relative distance objective evaluation value of the n-th stereoscopic image in the training set;
the acquisition process of RPDA_n in step 1_6 is:
step 1_6a: sort the pixel values of all pixels in VD_n in descending or ascending order;
step 1_6b: remove the smallest 1% and the largest 1% of the sorted pixel values; then find the minimum and the maximum among the remaining pixel values and denote them vd_min and vd_max respectively;
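The sort-and-trim operation shared by the APDA_n and RPDA_n sub-steps can be sketched as follows. The function name and the rounding rule (the number of removed values is rounded down) are assumptions; how vd_min and vd_max are then turned into APDA_n and RPDA_n is not shown, since it is not given in this text.

```python
def trimmed_extremes(values, low_percent=1, high_percent=0):
    """Return (min, max) after dropping the smallest and largest values.

    values: flat list of VD_n pixel values in millimeters.
    low_percent / high_percent: integer percentages to remove from each
    end; the count removed is rounded down (an assumption).
    """
    ordered = sorted(values)
    n = len(ordered)
    n_low = n * low_percent // 100
    n_high = n * high_percent // 100
    kept = ordered[n_low:n - n_high]
    return kept[0], kept[-1]


# A toy "distance map" with 200 values: 1 mm, 2 mm, ..., 200 mm.
vd = [float(v) for v in range(1, 201)]

# Trim only the smallest 1% (as for vd_min in the APDA_n sub-steps).
vd_min_only, _ = trimmed_extremes(vd, low_percent=1, high_percent=0)

# Trim 1% from both ends (as for vd_min and vd_max in the RPDA_n sub-steps).
vd_min, vd_max = trimmed_extremes(vd, low_percent=1, high_percent=1)

print(vd_min_only, vd_min, vd_max)  # 3.0 3.0 198.0
```

Trimming before taking the extremes makes vd_min and vd_max robust to a handful of erroneous parallax values, which would otherwise dominate a plain minimum or maximum.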
step 1_7: use ε-SVR as the training model to fit between {VSMOS_n | 1 ≤ n ≤ N} and {VCA_n | 1 ≤ n ≤ N}, {APDA_n | 1 ≤ n ≤ N}, {RPDA_n | 1 ≤ n ≤ N}, obtaining the stereoscopic image visual satisfaction objective evaluation model, described as: VCPDA = VCPD(VCA, APDA, RPDA); where the ε-SVR training model uses a histogram intersection kernel function with width coefficient γ = 1/256, penalty coefficient C = 4 and insensitive loss coefficient ε = 0.1, VCPD() is the functional form of the stereoscopic image visual satisfaction objective evaluation model, VCA, APDA and RPDA are the inputs of the model, VCA represents the visual comfort objective evaluation value of a stereoscopic image, APDA represents the perceived absolute distance objective evaluation value of the stereoscopic image, RPDA represents the perceived relative distance objective evaluation value of the stereoscopic image, and VCPDA is the output of the model;
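The kernel named in step 1_7 is read here as the histogram intersection kernel, K(u, v) = Σ min(u_i, v_i). The sketch below computes it and the resulting Gram matrix for feature vectors (VCA, APDA, RPDA); it assumes non-negative features, uses hypothetical function names and example values, and leaves out the coefficient γ because this text does not say how it enters the kernel. Such a Gram matrix is the kind of input that SVR implementations accepting a precomputed kernel consume, together with C = 4 and ε = 0.1.

```python
def histogram_intersection(u, v):
    """Histogram intersection kernel: sum of element-wise minima.

    Assumes both vectors have the same length and non-negative entries.
    """
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(min(a, b) for a, b in zip(u, v))


def gram_matrix(rows_a, rows_b):
    """Kernel values between every vector in rows_a and every vector in rows_b."""
    return [[histogram_intersection(a, b) for b in rows_b] for a in rows_a]


# Hypothetical (VCA, APDA, RPDA) feature vectors for three training images.
features = [
    [3.2, 0.8, 0.4],
    [4.1, 0.9, 0.2],
    [2.5, 0.7, 0.6],
]
K = gram_matrix(features, features)
for row in K:
    print([round(value, 2) for value in row])
```

Because the kernel sums minima, features with larger numeric ranges contribute more to K, so some rescaling of VCA, APDA and RPDA to comparable ranges would likely be needed in practice.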
the test stage process comprises the following specific steps:
step 2_1: for a stereoscopic image whose visual satisfaction is to be evaluated, obtain its visual comfort objective evaluation value, perceived absolute distance objective evaluation value and perceived relative distance objective evaluation value in the same way as in steps 1_3 to 1_6, and denote them VCA_test, APDA_test and RPDA_test respectively; then input VCA_test, APDA_test and RPDA_test into the stereoscopic image visual satisfaction objective evaluation model to obtain the visual satisfaction prediction value of the stereoscopic image to be evaluated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911105745.3A CN110944166B (en) | 2019-11-13 | 2019-11-13 | Objective evaluation method for stereoscopic image visual satisfaction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110944166A | 2020-03-31 |
CN110944166B | 2021-04-16 |
Family
ID=69907659
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911105745.3A (Active, CN110944166B) | 2019-11-13 | 2019-11-13 | Objective evaluation method for stereoscopic image visual satisfaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110944166B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105407349A (en) * | 2015-11-30 | 2016-03-16 | 宁波大学 | No-reference objective three-dimensional image quality evaluation method based on binocular visual perception |
CN105898279A (en) * | 2016-06-01 | 2016-08-24 | 宁波大学 | Stereoscopic image quality objective evaluation method |
KR20180117433A (en) * | 2017-04-19 | 2018-10-29 | 주식회사 넥슨코리아 | Method and system for testing stereo-scopic image |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5851332B2 (en) * | 2012-05-02 | 2016-02-03 | 日本電信電話株式会社 | 3D video quality evaluation apparatus, method and program |
CN109788275A (en) * | 2018-12-28 | 2019-05-21 | 天津大学 | Naturality, structure and binocular asymmetry are without reference stereo image quality evaluation method |
Also Published As
Publication number | Publication date |
---|---|
CN110944166A (en) | 2020-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103152600B (en) | Three-dimensional video quality evaluation method | |
CN103763552B (en) | Stereoscopic image non-reference quality evaluation method based on visual perception characteristics | |
CN103986925B (en) | based on the stereoscopic video visual comfort evaluation method of luminance compensation | |
CN101610425B (en) | Method for evaluating stereo image quality and device | |
CN103780895B (en) | A kind of three-dimensional video quality evaluation method | |
CN102750695A (en) | Machine learning-based stereoscopic image quality objective assessment method | |
CN104581141B (en) | A kind of stereo image vision comfort level evaluation methodology | |
CN105407349A (en) | No-reference objective three-dimensional image quality evaluation method based on binocular visual perception | |
CN108520510B (en) | No-reference stereo image quality evaluation method based on overall and local analysis | |
CN109429051B (en) | Non-reference stereo video quality objective evaluation method based on multi-view feature learning | |
CN104754322A (en) | Stereoscopic video comfort evaluation method and device | |
CN109788275A (en) | Naturality, structure and binocular asymmetry are without reference stereo image quality evaluation method | |
CN114648482A (en) | Quality evaluation method and system for three-dimensional panoramic image | |
CN105654142A (en) | Natural scene statistics-based non-reference stereo image quality evaluation method | |
CN102722888A (en) | Stereoscopic image objective quality evaluation method based on physiological and psychological stereoscopic vision | |
CN104361583A (en) | Objective quality evaluation method of asymmetrically distorted stereo images | |
CN111641822B (en) | Method for evaluating quality of repositioning stereo image | |
CN110944165B (en) | Stereoscopic image visual comfort level improving method combining perceived depth quality | |
CN105488792B (en) | Based on dictionary learning and machine learning without referring to stereo image quality evaluation method | |
CN108848365B (en) | A kind of reorientation stereo image quality evaluation method | |
Kim et al. | Quality assessment of perceptual crosstalk on two-view auto-stereoscopic displays | |
Kim et al. | Visual comfort aware-reinforcement learning for depth adjustment of stereoscopic 3d images | |
CN110944166B (en) | Objective evaluation method for stereoscopic image visual satisfaction | |
Jiang et al. | Visual comfort assessment for stereoscopic images based on sparse coding with multi-scale dictionaries | |
Li et al. | Stereoscopic 3D Image Retargeting Quality Assessment. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20240119. Patentee after: Huzhou Chuangguan Technology Co.,Ltd., Room 337, Building 3, No. 266, Zhenxing Road, Yuyue Town, Deqing County, Huzhou City, Zhejiang Province, 313200. Patentee before: Ningbo University, 315211, Fenghua Road, Jiangbei District, Zhejiang, Ningbo 818. |