WO2022222689A1 - Data generation method, apparatus, and electronic device - Google Patents
Data generation method, apparatus, and electronic device
- Publication number
- WO2022222689A1 (PCT/CN2022/083110)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image data
- information
- target
- data
- target object
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/96—Management of image or video recognition tasks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- the present application relates to the technical field of mixed reality, and more particularly, to a data generation method, an apparatus, and an electronic device.
- MR (Mixed Reality)
- Interacting with virtual objects allows users to engage with key data in the real environment in a more enjoyable way.
- However, the mixed reality data generated by current electronic devices is often coarse: for example, such devices may only recognize large surfaces in the real environment, such as the surfaces of floors, ceilings, and walls, and superimpose virtual content based only on that recognized information.
- One objective of the embodiments of the present application is to provide a new technical solution for generating mixed reality data, so as to make the electronic device more engaging to use.
- A data generation method is provided, comprising:
- acquiring first image data, wherein the first image data is data representing the real environment where the user is located;
- obtaining category information and plane information of a target object, wherein the target object is an object in the first image data, and the plane information includes information on the outer surface of the target object;
- acquiring second image data, wherein the second image data is data including a virtual object;
- mixing the first image data and the second image data according to the category information and the plane information to generate target image data, wherein the target image data is data containing the target object and the virtual object.
- In some embodiments, generating the target image data by mixing the first image data and the second image data according to the plane information and the category information includes: determining, according to the category information, the relative positional relationship between the virtual object in the second image data and the target object in the first image data; and rendering, according to the plane information and the relative positional relationship, the virtual object to a preset position of the target object to obtain the target image data.
- In some embodiments, obtaining the category information and plane information of the target object includes: inputting the first image data into a target image segmentation model to obtain mask information of the target object; and obtaining the category information and the plane information according to the mask information.
- obtaining the category information according to the mask information includes: inputting the mask information into a target category recognition model to obtain the category information.
- In some embodiments, obtaining the plane information according to the mask information includes: obtaining, according to the mask information, a target image block corresponding to the target object in the first image data; acquiring, according to the target image block, target position information of key points of the target object in the world coordinate system, wherein the key points include corner points of the target object; and obtaining the plane information according to the target position information and a preset plane fitting algorithm, wherein the plane information includes the center point coordinates and the plane normal vector corresponding to each plane of the target object.
- In some embodiments, the method is applied to an electronic device, and acquiring, according to the target image block, the target position information of the key points of the target object in the world coordinate system includes: detecting, according to the target image block, first position information of the key points in the first image data; acquiring pose information of the electronic device at a first moment, and second position information of the key points in third image data acquired at a second moment, wherein the first moment includes the current moment and the second moment is earlier than the first moment; and obtaining the target position information according to the first position information, the pose information, and the second position information.
- In some embodiments, the target image segmentation model and the target category recognition model are obtained by training through the following steps: acquiring sample data, wherein the sample data is data including sample objects in preset scenes; and jointly training an initial image segmentation model and an initial category recognition model according to the sample data to obtain the target image segmentation model and the target category recognition model.
- the method further comprises: displaying the target image data.
- According to a second aspect, the present application also provides a data generation apparatus, comprising:
- a first image data acquisition module configured to acquire first image data, wherein the first image data is data representing the real environment where the user is located;
- an information acquisition module configured to acquire category information and plane information of a target object, wherein the target object is an object in the first image data, and the plane information includes information on the outer surface of the target object;
- a second image data acquisition module configured to acquire second image data, wherein the second image data is data including virtual objects;
- a target image data generation module configured to mix the first image data and the second image data according to the category information and the plane information to generate target image data, wherein the target image data contains data of the target object and the virtual object.
- The present application also provides an electronic device, which includes the apparatus according to the second aspect of the present application.
- Alternatively, the electronic device includes: a memory for storing executable instructions; and a processor for controlling, according to the instructions, the electronic device to execute the method described in the first aspect of the present application.
- A beneficial effect of the present application is that the electronic device acquires first image data representing the real environment where the user is located and obtains the plane information and category information of the target object in that data; then, by acquiring second image data including the virtual object, it can mix the first image data and the second image data according to the plane information and the category information to obtain target image data including both the target object and the virtual object.
- Because the method recognizes the outer-surface information and the category information of the target object, the electronic device can, when constructing mixed reality data, accurately combine the target object with the virtual object from the virtual environment based on this information. This improves the fineness of the constructed target image data, thereby improving the user experience and making the electronic device more engaging to use.
- FIG. 1 is a schematic flowchart of a data generation method provided by an embodiment of the present application.
- FIG. 2 is a principle block diagram of a data generating apparatus provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
- FIG. 1 is a schematic flowchart of the data generation method provided by the embodiment of the present application.
- the method can be applied to an electronic device, so that the device can generate mixed reality data with high precision, and display the data for the user to view, so as to improve the user experience.
- The electronic device implementing the method may include a display device, such as a display screen, and at least two image acquisition devices for capturing real environment information.
- Each image acquisition device may be, for example, a monochrome camera with a field of view of about 153°*120°*167° (H*V*D), a resolution of not less than 640*480, and a frame rate of not less than 30 Hz; other cameras can also be used as needed.
- However, the larger the acquisition range, the greater the optical distortion of the camera, which may affect the accuracy of the final data.
- the electronic device may be, for example, a VR device, an AR device, or an MR device.
- the method of this embodiment may include steps S1100-S1400, which will be described in detail below.
- Step S1100: acquire first image data, wherein the first image data is data representing the real environment where the user is located.
- the first image data may be data reflecting the real environment where the user is located, that is, the real physical environment.
- The image data may include various physical objects in the real environment. For example, depending on the scene where the user is located, the image data may include objects such as sofas, dining tables, trees, buildings, cars, and roads.
- The first image data can be generated by the at least two image acquisition devices provided on the electronic device collecting data in the real environment where the user is located. Of course, in specific implementations, the first image data can also be generated by a device other than the electronic device: for example, it can be collected by an image acquisition device separately installed in the user's environment, which establishes a connection with the electronic device and provides the first image data to it. This embodiment does not specifically limit how the first image data is acquired.
- Step S1200: obtain category information and plane information of a target object, wherein the target object is an object in the first image data, and the plane information includes information on the outer surface of the target object.
- The target object may be one or more objects in the first image data that correspond to physical objects in the real environment, for example, objects corresponding to tables, chairs, or sofas.
- the plane information of the target object may be the information of the outer surface of the target object, and specifically may be information used to represent attributes such as the position and size of the outer surface of the target object.
- For example, for a given outer surface of the target object, the center coordinates and the normal vector of that surface can be used together to represent the position and size of the outer surface.
- The category information of the target object may be information indicating the type of object that the target object represents.
- For example, when the target object is a sofa, the category information may be "furniture" or, more specifically, "sofa". In specific implementations, the granularity of the category information can be set as required: it may be the broad category to which the object belongs or a sub-category to which it belongs. The category information can also be represented by a type identifier, for example, "0" for furniture and "1" for sofa, which will not be repeated here.
- In some embodiments, obtaining the category information and plane information of the target object includes: inputting the first image data into a target image segmentation model to obtain mask information of the target object; and obtaining the category information and the plane information according to the mask information.
- obtaining the category information according to the mask information includes: inputting the mask information into a target category recognition model to obtain the category information.
- Mask information can be used to occlude all or part of the image to be processed, so as to control which region of the image is processed.
- The mask may be a two-dimensional matrix array or a multi-valued image used to extract the region of interest from the image to be processed: for example, by multiplying the mask with the image to be processed, the pixel values outside the region of interest are set to zero while the pixel values within the region of interest remain unchanged, as sketched below.
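- As an illustration of this masking operation, the following is a minimal sketch assuming a binary mask and a NumPy image array (the function and variable names are hypothetical, not part of the described method):

```python
import numpy as np

def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep the pixel values inside the region of interest; zero out the rest.

    image: H x W x C array; mask: H x W binary array (1 inside the region).
    """
    return image * mask[..., np.newaxis]

# Example: a 4x4 RGB image with a 2x2 region of interest in the top-left corner.
image = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[:2, :2] = 1
masked = apply_mask(image, mask)  # pixels outside the 2x2 block become 0
```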
- In this embodiment, the mask information of the target object is obtained through a pre-trained target image segmentation model; then, according to the mask information, a pre-trained target category recognition model is used to identify the category information of the target object, and the plane information of the target object is obtained by calculation.
- the following first describes how to train and obtain the target image segmentation model and the target category recognition model.
- The target image segmentation model is a model used to separate an object from its carrier, for example, to separate the target object from the image containing it, so that the target object can be used in the subsequent virtual-real combination processing.
- the target image segmentation model may be a convolutional neural network model, for example, a model based on the Mask R-CNN network structure, which is not particularly limited here.
- The target category recognition model is a model that identifies, based on the input mask information, the category of the object the mask corresponds to. For example, when the target object is a sofa, inputting its mask information into the target category recognition model can yield the category "furniture" and, further, "sofa". In specific implementations, the target category recognition model can also be a convolutional neural network model, whose structure will not be elaborated here; a brief sketch follows.
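- For illustration only, a minimal sketch of obtaining instance masks and labels with an off-the-shelf Mask R-CNN (torchvision's pretrained model is used here as a stand-in; the embodiment does not prescribe a particular implementation, and the file name is hypothetical):

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pretrained Mask R-CNN as a stand-in for the "target image segmentation model".
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = to_tensor(Image.open("frame.png").convert("RGB"))  # hypothetical first image data
with torch.no_grad():
    prediction = model([image])[0]

# Each detected instance comes with a soft mask, a label, and a confidence score.
keep = prediction["scores"] > 0.5
masks = prediction["masks"][keep] > 0.5   # binary masks, shape (N, 1, H, W)
labels = prediction["labels"][keep]       # category indices
```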
- In this embodiment, the target image segmentation model and the target category recognition model can be obtained by training through the following steps: acquiring sample data, wherein the sample data is data including sample objects in preset scenes; and, according to the sample data, jointly training an initial image segmentation model and an initial category recognition model to obtain the target image segmentation model and the target category recognition model.
- environmental image data in different scenarios can be pre-obtained as sample data.
- For example, environmental image data in 128 preset scenarios can be acquired, and the objects in each environmental image can be manually annotated to obtain the sample data used to train the target image segmentation model and the target category recognition model. Then, based on the sample data, the corresponding initial image segmentation model and initial category recognition model can be jointly trained to obtain the target image segmentation model and the target category recognition model.
- In some embodiments, jointly training the initial image segmentation model and the initial category recognition model according to the sample data to obtain the target image segmentation model and the target category recognition model includes: inputting the sample data into the initial image segmentation model to obtain sample mask information of the sample objects; inputting the sample mask information into the initial category recognition model to obtain sample category information of the sample objects; and, during training, adjusting the parameters of the initial image segmentation model and the initial category recognition model to obtain a target image segmentation model and a target category recognition model that satisfy preset convergence conditions.
- That is, the sample mask information of a sample object is obtained by inputting the sample data into the initial image segmentation model; the sample mask information is then processed by the initial category recognition model to obtain the sample category information of the sample object.
- Joint training proceeds by designing the loss functions for the two models and continuously adjusting the parameters of each until the target image segmentation model and the target category recognition model satisfy the preset convergence conditions; the preset convergence condition may be, for example, that the error of the recognition results of the two models does not exceed a preset threshold. Since the details of model training are well described in the prior art, they are not repeated here; a brief sketch of a joint training step is given below.
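- A minimal sketch of one such joint training step follows; the concrete loss functions, model interfaces, and equal loss weighting are assumptions for illustration, not the embodiment's prescribed design (the optimizer is assumed to cover the parameters of both models):

```python
import torch
import torch.nn.functional as F

def joint_training_step(seg_model, cls_model, optimizer, images, gt_masks, gt_labels):
    """One joint optimization step over the segmentation and category models."""
    mask_logits = seg_model(images)                       # (B, 1, H, W)
    seg_loss = F.binary_cross_entropy_with_logits(mask_logits, gt_masks)

    # The category model consumes the predicted masks, so its gradient also
    # flows back into the segmentation model; this is what makes it "joint".
    class_logits = cls_model(torch.sigmoid(mask_logits))  # (B, num_classes)
    cls_loss = F.cross_entropy(class_logits, gt_labels)

    loss = seg_loss + cls_loss                            # assumed equal weighting
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```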
- In this way, the mask information of the target object in the first image data is identified by the target image segmentation model, and the category information is obtained according to the mask information.
- the plane information of the target object may also be acquired according to the mask information. The following describes in detail how to acquire the plane information.
- In some embodiments, obtaining the plane information according to the mask information includes: obtaining, according to the mask information, the target image block corresponding to the target object in the first image data; acquiring, according to the target image block, the target position information of the key points of the target object in the world coordinate system, wherein the key points include the corner points of the target object; and obtaining the plane information according to the target position information and a preset plane fitting algorithm, wherein the plane information includes the center point coordinates and the plane normal vector corresponding to each plane of the target object.
- the target image block is an image block formed by the pixels in the first image data that are used to form the target object.
- According to the target image block, the target position information of each key point constituting the target object, for example each corner point, can be acquired, that is, the three-dimensional coordinates of each key point in the real-world coordinate system; a preset plane fitting algorithm can then be applied to the target position information to fit each outer surface of the object and obtain the plane information.
- the preset plane fitting algorithm may be, for example, a least squares plane fitting algorithm or other algorithms, which are not particularly limited here.
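- For illustration, the following is a least-squares plane fit over a set of 3D key points via SVD (a standard technique and one possible realization, not necessarily the one used by the embodiment), returning the center point and normal vector that make up the plane information:

```python
import numpy as np

def fit_plane(points: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Least-squares plane fit for an (N, 3) array of 3D key points.

    Returns (center, normal): the centroid of the points and the unit normal
    of the best-fit plane (the right singular vector of the smallest
    singular value, i.e. the direction of least variance).
    """
    center = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - center)
    normal = vt[-1]
    return center, normal / np.linalg.norm(normal)

# Example: four corner points of a slightly noisy horizontal table top.
corners = np.array([[0.0, 0.0, 0.75],
                    [1.2, 0.0, 0.76],
                    [1.2, 0.6, 0.75],
                    [0.0, 0.6, 0.74]])
center, normal = fit_plane(corners)  # normal is close to (0, 0, 1)
```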
- In some embodiments, when acquiring the target position information of the key points of the target object in the world coordinate system according to the target image block, the electronic device may: detect, according to the target image block, the first position information of the key points in the first image data; acquire the pose information of the electronic device at the first moment, and the second position information of the key points in the third image data acquired at the second moment, wherein the first moment includes the current moment and the second moment is earlier than the first moment; and obtain the target position information according to the first position information, the pose information, and the second position information.
- The first position information may be the two-dimensional coordinates of the key points of the target object in the first image data; the pose information of the electronic device can be calculated from the system parameters of the image acquisition devices it carries, which is not repeated here.
- The second position information may be the two-dimensional coordinates of the key points in image data collected at a historical moment before the current moment, that is, in a historical image frame.
- The position trajectory of a key point at the first moment can be predicted from its second position information at the second moment, so that the first position information can be corrected according to the trajectory; the target position information of the key point in the world coordinate system, that is, its three-dimensional coordinates, is then obtained from the corrected first position information and the pose information of the electronic device, as sketched below.
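- As a sketch of how two 2D observations of a key point plus the device pose can yield its 3D world coordinates, the following shows standard linear (DLT) triangulation; the concrete method is an assumption, since the embodiment leaves it unspecified:

```python
import numpy as np

def triangulate(p1: np.ndarray, proj1: np.ndarray,
                p2: np.ndarray, proj2: np.ndarray) -> np.ndarray:
    """Linear (DLT) triangulation of a single key point.

    p1, p2: 2D pixel coordinates of the key point in the first and third
    image data; proj1, proj2: 3x4 camera projection matrices built from the
    intrinsics and the device pose at the corresponding moments.
    Returns the 3D point in world coordinates.
    """
    a = np.stack([
        p1[0] * proj1[2] - proj1[0],
        p1[1] * proj1[2] - proj1[1],
        p2[0] * proj2[2] - proj2[0],
        p2[1] * proj2[2] - proj2[1],
    ])
    _, _, vt = np.linalg.svd(a)
    x = vt[-1]
    return x[:3] / x[3]  # dehomogenize
```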
- Step S1300: acquire second image data, wherein the second image data is data including a virtual object.
- The virtual object may be an object that does not exist in the real environment where the user is located, that is, virtual content: for example, animals, plants, or buildings in the virtual world, which is not particularly limited here.
- the first image data including the target object and the second image data including the virtual object may be two-dimensional data or three-dimensional data, which is not particularly limited in this embodiment.
- Step S1400: mix the first image data and the second image data according to the category information and the plane information to generate target image data, wherein the target image data includes data of the target object and the virtual object.
- That is, the plane information and the category information are used to segment the target object from the first image data and mix it with the virtual object in the second image data, obtaining target image data that contains both the target object from the real environment and the virtual object from the virtual environment.
- In some embodiments, generating the target image data by mixing the first image data and the second image data according to the plane information and the category information includes: determining, according to the category information, the relative positional relationship between the virtual object in the second image data and the target object in the first image data; and rendering, according to the plane information and the relative positional relationship, the virtual object to a preset position of the target object to obtain the target image data (see the sketch below).
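- A minimal sketch of this mixing step follows; the placement rule and the alpha compositing are illustrative assumptions, since a full renderer would project the virtual object using the plane's pose:

```python
import numpy as np

def place_on_plane(plane_center: np.ndarray, plane_normal: np.ndarray,
                   offset: float) -> np.ndarray:
    """Compute a preset position for the virtual object, e.g. resting on the
    target object's top surface, offset along the plane normal."""
    return plane_center + offset * plane_normal

def composite(real_rgb: np.ndarray, virtual_rgba: np.ndarray) -> np.ndarray:
    """Alpha-blend the rendered virtual layer over the real image (same H x W)."""
    alpha = virtual_rgba[..., 3:4] / 255.0
    mixed = (1 - alpha) * real_rgb + alpha * virtual_rgba[..., :3]
    return mixed.astype(np.uint8)
```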
- the method further includes displaying the target image data.
- After the target image data is obtained, the electronic device can display it on its display screen. Further, the device can capture content from the user's interaction with the virtual object in the displayed target image data: for example, if the virtual object is a cat, the user can interact with the virtual cat and save a video of the interaction.
- In some embodiments, the electronic device may further include a network module. Once connected to the Internet, the electronic device can also save the interaction data of the user interacting with the virtual object in the target image data, such as image data and/or video data, and provide the interaction data to other users, such as the user's friends, for viewing.
- the detailed processing process will not be repeated here.
- the above is only an example of applying the method provided in this embodiment.
- the method can also be applied to scenes such as wall stickers, social networking, virtual remote office, personal games, and advertisements. It is not repeated here.
- To sum up, the electronic device acquires first image data representing the real environment where the user is located and obtains the plane information and category information of the target object in the first image data; then, by acquiring second image data including the virtual object, it mixes the first image data and the second image data according to the plane information and the category information to obtain target image data including both the target object and the virtual object.
- Because the method recognizes the outer-surface information and the category information of the target object, the electronic device can, when constructing mixed reality data, accurately combine the target object with the virtual object from the virtual environment, improving the fineness of the constructed target image data and thereby the user experience.
- this embodiment also provides a data generating apparatus.
- The apparatus 2000 may be applied to electronic equipment, and may specifically include a first image data acquisition module 2100, an information acquisition module 2200, a second image data acquisition module 2300, and a target image data generation module 2400.
- the first image data acquisition module 2100 is configured to acquire first image data, wherein the first image data is data representing the real environment where the user is located.
- the information acquisition module 2200 is configured to acquire category information and plane information of a target object, wherein the target object is an object in the first image data, and the plane information includes information on the outer surface of the target object.
- the information acquisition module 2200 may be configured to: input the first image data into the target image segmentation model, and obtain the target object's mask information; obtain the category information and the plane information according to the mask information.
- the information acquisition module 2200 may be configured to: input the mask information into a target category recognition model to obtain the category information.
- In some embodiments, the information acquisition module 2200 may be configured to: obtain, according to the mask information, the target image block corresponding to the target object in the first image data; acquire, according to the target image block, the target position information of the key points of the target object in the world coordinate system, wherein the key points include the corner points of the target object; and obtain the plane information according to the target position information and a preset plane fitting algorithm, wherein the plane information includes the center point coordinates and the plane normal vector corresponding to each plane of the target object.
- In some embodiments, when acquiring the target position information of the key points of the target object in the world coordinate system according to the target image block, the information acquisition module 2200 may be configured to: detect, according to the target image block, the first position information of the key points in the first image data; acquire the pose information of the electronic device at the first moment and the second position information of the key points in the third image data acquired at the second moment, wherein the first moment includes the current moment and the second moment is earlier than the first moment; and obtain the target position information according to the first position information, the pose information, and the second position information.
- the second image data acquisition module 2300 is configured to acquire second image data, wherein the second image data is data including virtual objects.
- The target image data generation module 2400 is configured to mix the first image data and the second image data according to the category information and the plane information to generate target image data, wherein the target image data includes data of the target object and the virtual object.
- In some embodiments, when generating the target image data by mixing the first image data and the second image data according to the plane information and the category information, the target image data generation module 2400 may be configured to: determine, according to the category information, the relative positional relationship between the virtual object in the second image data and the target object in the first image data; and render, according to the plane information and the relative positional relationship, the virtual object to the preset position of the target object to obtain the target image data. A hypothetical composition of these modules is sketched below.
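- Purely as an illustration of how the four modules cooperate, the apparatus 2000 might be composed as follows (the module interfaces are assumptions; the embodiment defines the modules functionally, not as a concrete class hierarchy):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class DataGenerationApparatus:
    first_image_module: Any   # acquires first image data (real environment)
    info_module: Any          # mask -> category information and plane information
    second_image_module: Any  # acquires second image data (virtual object)
    generation_module: Any    # mixes the two image data streams

    def generate(self) -> Any:
        first = self.first_image_module.acquire()
        category, planes = self.info_module.extract(first)
        second = self.second_image_module.acquire()
        return self.generation_module.mix(first, second, category, planes)
```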
- the apparatus 2000 further includes a display module, configured to display the target image data after obtaining the target image data.
- an electronic device which may include the data generation apparatus 2000 according to any embodiment of the present application, for implementing the data generation method of any embodiment of the present application.
- The electronic device 3000 may further include a processor 3200 and a memory 3100, where the memory 3100 is used to store executable instructions and the processor 3200 is configured to control, according to the instructions, the electronic device to execute the data generation method of any embodiment.
- Each module of the above apparatus 2000 may be implemented by the processor 3200 running the instructions to execute the method according to any embodiment of the present application.
- The electronic device 3000 may include a display device, such as a display screen, and at least two image acquisition devices for capturing real environment information.
- Each image acquisition device may be, for example, a monochrome camera with a field of view of about 153°*120°*167° (H*V*D), a resolution of not less than 640*480, and a frame rate of not less than 30 Hz; other cameras can also be used as needed. However, the larger the acquisition range, the greater the optical distortion of the camera, which may affect the accuracy of the final data.
- the electronic device may be, for example, a VR device, an AR device, or an MR device.
- the present application may be a system, method and/or computer program product.
- the computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for causing a processor to implement various aspects of the present application.
- a computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device.
- the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
- A non-exhaustive list of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punched cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the above.
- Computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber-optic cables), or electrical signals transmitted through wires.
- the computer readable program instructions described herein may be downloaded to various computing/processing devices from a computer readable storage medium, or to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
- the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
- A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in that computing/processing device.
- Computer program instructions for carrying out the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
- The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
- The remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
- In some embodiments, custom electronic circuits, such as programmable logic circuits, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), can be personalized by utilizing state information of the computer-readable program instructions, and these circuits can execute the computer-readable program instructions to implement various aspects of the present application.
- These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
- These computer-readable program instructions may also be stored in a computer-readable storage medium and cause a computer, a programmable data processing apparatus, and/or other equipment to operate in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
- Computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or other equipment, causing a series of operational steps to be performed thereon to produce a computer-implemented process, so that the instructions executing on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
- Each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions comprising one or more executable instructions for implementing the specified logical function(s).
- the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
- Each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by dedicated hardware-based systems that perform the specified functions or actions, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation in a combination of software and hardware are all equivalent.
Abstract
Description
Claims (10)
- 1. A data generation method, comprising: acquiring first image data, wherein the first image data is data representing the real environment where the user is located; obtaining category information and plane information of a target object, wherein the target object is an object in the first image data, and the plane information includes information on the outer surface of the target object; acquiring second image data, wherein the second image data is data containing a virtual object; and mixing the first image data and the second image data according to the category information and the plane information to generate target image data, wherein the target image data is data containing the target object and the virtual object.
- 2. The method according to claim 1, wherein mixing the first image data and the second image data according to the plane information and the category information to generate the target image data comprises: determining, according to the category information, the relative positional relationship between the virtual object in the second image data and the target object in the first image data; and rendering, according to the plane information and the relative positional relationship, the virtual object to a preset position of the target object to obtain the target image data.
- 3. The method according to claim 1, wherein obtaining the category information and plane information of the target object comprises: inputting the first image data into a target image segmentation model to obtain mask information of the target object; and obtaining the category information and the plane information according to the mask information.
- 4. The method according to claim 3, wherein obtaining the category information according to the mask information comprises: inputting the mask information into a target category recognition model to obtain the category information.
- 5. The method according to claim 3, wherein obtaining the plane information according to the mask information comprises: obtaining, according to the mask information, a target image block corresponding to the target object in the first image data; acquiring, according to the target image block, target position information of key points of the target object in the world coordinate system, wherein the key points include corner points of the target object; and obtaining the plane information according to the target position information and a preset plane fitting algorithm, wherein the plane information includes the center point coordinates and the plane normal vector corresponding to each plane of the target object.
- 6. The method according to claim 5, wherein the method is applied to an electronic device, and acquiring, according to the target image block, the target position information of the key points of the target object in the world coordinate system comprises: detecting, according to the target image block, first position information of the key points in the first image data; acquiring pose information of the electronic device at a first moment, and second position information of the key points in third image data acquired at a second moment, wherein the first moment includes the current moment, and the second moment is earlier than the first moment; and obtaining the target position information according to the first position information, the pose information, and the second position information.
- 7. The method according to claim 4, wherein the target image segmentation model and the target category recognition model are obtained by training through the following steps: acquiring sample data, wherein the sample data is data containing sample objects in preset scenes; and jointly training an initial image segmentation model and an initial category recognition model according to the sample data to obtain the target image segmentation model and the target category recognition model.
- 8. The method according to claim 1, wherein, after the target image data is obtained, the method further comprises: displaying the target image data.
- 9. A data generation apparatus, comprising: a first image data acquisition module configured to acquire first image data, wherein the first image data is data representing the real environment where the user is located; an information acquisition module configured to acquire category information and plane information of a target object, wherein the target object is an object in the first image data, and the plane information includes information on the outer surface of the target object; a second image data acquisition module configured to acquire second image data, wherein the second image data is data containing a virtual object; and a target image data generation module configured to mix the first image data and the second image data according to the category information and the plane information to generate target image data, wherein the target image data is data containing the target object and the virtual object.
- 10. An electronic device, comprising the apparatus according to claim 9; or, the electronic device comprising: a memory for storing executable instructions; and a processor for controlling, according to the instructions, the electronic device to execute the method according to any one of claims 1-8.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023556723A JP2024512447A (ja) | 2021-04-21 | 2022-03-25 | Data generation method, apparatus, and electronic device |
EP22790798.7A EP4290452A4 (en) | 2021-04-21 | 2022-03-25 | DATA GENERATION METHOD AND APPARATUS, AND ELECTRONIC DEVICE |
KR1020237030173A KR20230142769A (ko) | 2021-04-21 | 2022-03-25 | 데이터 생성 방법, 장치 및 전자기기 |
US18/460,095 US11995741B2 (en) | 2021-04-21 | 2023-09-01 | Data generation method and apparatus, and electronic device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110431972.6A CN113269782B (zh) | 2021-04-21 | 2021-04-21 | Data generation method, apparatus, and electronic device |
CN202110431972.6 | 2021-04-21 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/460,095 Continuation US11995741B2 (en) | 2021-04-21 | 2023-09-01 | Data generation method and apparatus, and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022222689A1 true WO2022222689A1 (zh) | 2022-10-27 |
Family
ID=77229241
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/083110 WO2022222689A1 (zh) | 2021-04-21 | 2022-03-25 | Data generation method, apparatus, and electronic device |
Country Status (6)
Country | Link |
---|---|
US (1) | US11995741B2 (zh) |
EP (1) | EP4290452A4 (zh) |
JP (1) | JP2024512447A (zh) |
KR (1) | KR20230142769A (zh) |
CN (1) | CN113269782B (zh) |
WO (1) | WO2022222689A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113269782B (zh) | 2021-04-21 | 2023-01-03 | 青岛小鸟看看科技有限公司 | Data generation method, apparatus, and electronic device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190221041A1 (en) * | 2018-01-12 | 2019-07-18 | Beijing Xiaomi Mobile Software Co., Ltd. | Method and apparatus for synthesizing virtual and real objects |
CN112017300A (zh) * | 2020-07-22 | 2020-12-01 | 青岛小鸟看看科技有限公司 | 混合现实图像的处理方法、装置及设备 |
CN112037314A (zh) * | 2020-08-31 | 2020-12-04 | 北京市商汤科技开发有限公司 | 图像显示方法、装置、显示设备及计算机可读存储介质 |
CN113269782A (zh) * | 2021-04-21 | 2021-08-17 | 青岛小鸟看看科技有限公司 | 数据生成方法、装置及电子设备 |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10019962B2 (en) * | 2011-08-17 | 2018-07-10 | Microsoft Technology Licensing, Llc | Context adaptive user interface for augmented reality display |
CN106249883B (zh) * | 2016-07-26 | 2019-07-30 | 努比亚技术有限公司 | 一种数据处理方法及电子设备 |
US10235771B2 (en) * | 2016-11-11 | 2019-03-19 | Qualcomm Incorporated | Methods and systems of performing object pose estimation |
US10635927B2 (en) * | 2017-03-06 | 2020-04-28 | Honda Motor Co., Ltd. | Systems for performing semantic segmentation and methods thereof |
CN111133365B (zh) * | 2017-05-01 | 2023-03-31 | 奇跃公司 | 内容到空间3d环境的匹配 |
US10475250B1 (en) * | 2018-08-30 | 2019-11-12 | Houzz, Inc. | Virtual item simulation using detected surfaces |
CN110032278B (zh) * | 2019-03-29 | 2020-07-14 | 华中科技大学 | 一种人眼感兴趣物体的位姿识别方法、装置及系统 |
CN111862333B (zh) * | 2019-04-28 | 2024-05-28 | 广东虚拟现实科技有限公司 | 基于增强现实的内容处理方法、装置、终端设备及存储介质 |
CN110610488A (zh) * | 2019-08-29 | 2019-12-24 | 上海杏脉信息科技有限公司 | 分类训练和检测的方法与装置 |
CN111399654B (zh) * | 2020-03-25 | 2022-08-12 | Oppo广东移动通信有限公司 | 信息处理方法、装置、电子设备及存储介质 |
CN111510701A (zh) * | 2020-04-22 | 2020-08-07 | Oppo广东移动通信有限公司 | 虚拟内容的显示方法、装置、电子设备及计算机可读介质 |
CN111652317B (zh) * | 2020-06-04 | 2023-08-25 | 郑州科技学院 | 基于贝叶斯深度学习的超参数图像分割方法 |
CN111666919B (zh) * | 2020-06-24 | 2023-04-07 | 腾讯科技(深圳)有限公司 | 一种对象识别方法、装置、计算机设备和存储介质 |
CN111815786A (zh) * | 2020-06-30 | 2020-10-23 | 北京市商汤科技开发有限公司 | 信息显示方法、装置、设备和存储介质 |
CN115390256A (zh) * | 2020-07-24 | 2022-11-25 | 闪耀现实(无锡)科技有限公司 | 一种增强现实显示装置及其设备 |
CN111931664B (zh) * | 2020-08-12 | 2024-01-12 | 腾讯科技(深圳)有限公司 | 混贴票据图像的处理方法、装置、计算机设备及存储介质 |
CN112348969B (zh) * | 2020-11-06 | 2023-04-25 | 北京市商汤科技开发有限公司 | 增强现实场景下的展示方法、装置、电子设备及存储介质 |
-
2021
- 2021-04-21 CN CN202110431972.6A patent/CN113269782B/zh active Active
-
2022
- 2022-03-25 EP EP22790798.7A patent/EP4290452A4/en active Pending
- 2022-03-25 KR KR1020237030173A patent/KR20230142769A/ko unknown
- 2022-03-25 JP JP2023556723A patent/JP2024512447A/ja active Pending
- 2022-03-25 WO PCT/CN2022/083110 patent/WO2022222689A1/zh active Application Filing
-
2023
- 2023-09-01 US US18/460,095 patent/US11995741B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190221041A1 (en) * | 2018-01-12 | 2019-07-18 | Beijing Xiaomi Mobile Software Co., Ltd. | Method and apparatus for synthesizing virtual and real objects |
CN112017300A (zh) * | 2020-07-22 | 2020-12-01 | 青岛小鸟看看科技有限公司 | 混合现实图像的处理方法、装置及设备 |
CN112037314A (zh) * | 2020-08-31 | 2020-12-04 | 北京市商汤科技开发有限公司 | 图像显示方法、装置、显示设备及计算机可读存储介质 |
CN113269782A (zh) * | 2021-04-21 | 2021-08-17 | 青岛小鸟看看科技有限公司 | 数据生成方法、装置及电子设备 |
Non-Patent Citations (1)
Title |
---|
See also references of EP4290452A4 * |
Also Published As
Publication number | Publication date |
---|---|
JP2024512447A (ja) | 2024-03-19 |
KR20230142769A (ko) | 2023-10-11 |
CN113269782A (zh) | 2021-08-17 |
EP4290452A1 (en) | 2023-12-13 |
US20230410386A1 (en) | 2023-12-21 |
US11995741B2 (en) | 2024-05-28 |
EP4290452A4 (en) | 2024-06-19 |
CN113269782B (zh) | 2023-01-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109242978B (zh) | 三维模型的视角调整方法和装置 | |
Tian et al. | Handling occlusions in augmented reality based on 3D reconstruction method | |
Barranco et al. | A dataset for visual navigation with neuromorphic methods | |
WO2020056903A1 (zh) | 用于生成信息的方法和装置 | |
CN114025219B (zh) | 增强现实特效的渲染方法、装置、介质及设备 | |
CN112017300B (zh) | 混合现实图像的处理方法、装置及设备 | |
WO2023042160A1 (en) | Browser optimized interactive electronic model based determination of attributes of a structure | |
US20210056337A1 (en) | Recognition processing device, recognition processing method, and program | |
US20180115700A1 (en) | Simulating depth of field | |
US20210407125A1 (en) | Object recognition neural network for amodal center prediction | |
CN111192308B (zh) | 图像处理方法及装置、电子设备和计算机存储介质 | |
US10401947B2 (en) | Method for simulating and controlling virtual sphere in a mobile device | |
CN113269781A (zh) | 数据生成方法、装置及电子设备 | |
WO2022222689A1 (zh) | 数据生成方法、装置及电子设备 | |
US11206433B2 (en) | Generating augmented videos | |
US20230290132A1 (en) | Object recognition neural network training using multiple data sources | |
Xuerui | Three-dimensional image art design based on dynamic image detection and genetic algorithm | |
Jin et al. | Volumivive: An authoring system for adding interactivity to volumetric video | |
Yang et al. | View suggestion for interactive segmentation of indoor scenes | |
US10755459B2 (en) | Object painting through use of perspectives or transfers in a digital medium environment | |
Fradet et al. | [poster] mr TV mozaik: A new mixed reality interactive TV experience | |
CN108805951B (zh) | 一种投影图像处理方法、装置、终端和存储介质 | |
CN113191462A (zh) | 信息获取方法、图像处理方法、装置及电子设备 | |
CN115937480B (zh) | 一种基于人工势场的虚拟现实去中心化重定向系统 | |
CN114721562B (zh) | 用于数字对象的处理方法、装置、设备、介质及产品 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22790798 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 20237030173 Country of ref document: KR Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1020237030173 Country of ref document: KR Ref document number: 2022790798 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2023556723 Country of ref document: JP |
|
ENP | Entry into the national phase |
Ref document number: 2022790798 Country of ref document: EP Effective date: 20230905 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |