WO2023246189A1 - Method and device for displaying image information - Google Patents

Method and device for displaying image information

Info

Publication number
WO2023246189A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
image
display
display area
information
Prior art date
Application number
PCT/CN2023/081391
Other languages
English (en)
Chinese (zh)
Inventor
谢独放
李阳
李浩正
王怡丁
焦弟琴
黄晓艺
Original Assignee
如你所视(北京)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 如你所视(北京)科技有限公司
Publication of WO2023246189A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/816 Monomedia components thereof involving special video data, e.g. 3D video

Definitions

  • The present disclosure relates to the field of computer technology, and in particular, to an image information display method and device.
  • VR panoramic technology is an emerging technology. Because it can present three-dimensional space scenes to users at 720 degrees without blind spots, it gives users an immersive visual experience. Users can select rooms through VR viewing, and a simulated decoration plan can simulate house decoration effects in VR scenes. During VR house viewing, users can determine the position information and perspective information of a virtual observation point in the VR model, determine the displayed observation view based on that position and perspective information, and see the items in the view, including flat surfaces such as windows, walls, mirrors, tabletops, and TVs.
  • Embodiments of the present disclosure provide an image information display method and device.
  • According to one aspect of the embodiments of the present disclosure, an image information display method is provided, including: obtaining classification information of pixels in a two-dimensional image, and generating a semantic map corresponding to the two-dimensional image based on the classification information; determining, based on the semantic map, a candidate display area corresponding to a target object in the two-dimensional image; obtaining, according to the depth image corresponding to the two-dimensional image, a three-dimensional display plane corresponding to the candidate display area; obtaining a three-dimensional model corresponding to the two-dimensional image, determining an image observation position corresponding to the three-dimensional model, and selecting a three-dimensional target display area in the three-dimensional display plane corresponding to at least one target object based on the image observation position; obtaining display position information and image information corresponding to the three-dimensional target display area, and determining a two-dimensional screen display area corresponding to the three-dimensional target display area according to the display position information and the image observation position; and displaying the image information in the two-dimensional screen display area.
  • In some embodiments, obtaining the classification information of the pixels in the two-dimensional image and generating the semantic map corresponding to the two-dimensional image based on the classification information includes: using a trained neural network model to classify at least one pixel in the two-dimensional image and obtaining a category label of the at least one pixel; and generating the semantic map based on the position information of the at least one pixel in the two-dimensional image and the corresponding category label.
  • In some embodiments, determining the candidate display area corresponding to the target object in the two-dimensional image based on the semantic map includes: determining, based on the category labels in the semantic map, at least one target area corresponding to the target object in the semantic map; using a preset image connection algorithm to perform image connection processing on multiple target areas corresponding to the target object to generate at least one pixel aggregation cluster; and determining the candidate display area based on the at least one pixel aggregation cluster.
  • In some embodiments, determining the candidate display area according to the at least one pixel aggregation cluster includes: determining whether the number of pixel aggregation clusters is greater than one; if not, setting this pixel aggregation cluster as a candidate cluster; if so, performing scoring processing on the at least one pixel aggregation cluster according to preset aggregation cluster scoring factors, and determining candidate clusters among the multiple pixel aggregation clusters based on their scores, where the aggregation cluster scoring factors include the position distribution and size of the pixel aggregation clusters; setting the candidate clusters as the foreground and the remaining pixels in the semantic map as the background to generate a binary image; and obtaining the first rectangle corresponding to the foreground in the binary image as the candidate display area.
  • In some embodiments, obtaining the three-dimensional display plane corresponding to the candidate display area based on the depth image corresponding to the two-dimensional image includes: converting the two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates based on the depth image; generating a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates; performing plane detection on the three-dimensional point cloud according to a plane detection algorithm; and, if the detection passes, obtaining the three-dimensional display plane corresponding to the three-dimensional point cloud.
  • In some embodiments, selecting the three-dimensional target display area from the three-dimensional display plane corresponding to at least one target object based on the image observation position includes: acquiring, according to the image observation position, display factors corresponding to the three-dimensional display plane of at least one target object; determining the display score of the three-dimensional display plane corresponding to at least one target object based on the display factors; and selecting the three-dimensional target display area according to the display score.
  • In some embodiments, the display position information includes the three-dimensional coordinate information of the vertices of the three-dimensional target display area, and determining the two-dimensional screen display area corresponding to the three-dimensional target display area based on the display position information and the image observation position includes: determining the two-dimensional coordinate information of the vertices of the three-dimensional target display area based on the vertex three-dimensional coordinate information and the image observation position; and determining the two-dimensional screen display area, according to the two-dimensional coordinate information, in the two-dimensional screen display image corresponding to the three-dimensional model and the image observation position.
  • In some embodiments, displaying the image information in the two-dimensional screen display area includes: obtaining background information of the two-dimensional screen display area, and adjusting the display elements of the image information based on the background information; where the display elements of the image information include at least one of the following: pictures, text corresponding to the pictures, and symbols.
  • In some embodiments, displaying the image information in the two-dimensional screen display area includes: obtaining the observation distance between the three-dimensional target display area and the image observation position, and determining the area size of the two-dimensional screen display area; and determining the display mode and size of the image information based on the area size and the observation distance.
  • the two-dimensional image includes: a color two-dimensional image corresponding to the interior of the house;
  • the target object includes: at least one of a window, a wall, a mirror, a desktop, and a television.
  • According to another aspect of the embodiments of the present disclosure, an image information display device is provided, including: an image analysis module, configured to obtain classification information of pixels in a two-dimensional image and generate a semantic map corresponding to the two-dimensional image based on the classification information; a candidate area determination module, configured to determine a candidate display area corresponding to a target object in the two-dimensional image based on the semantic map; a three-dimensional plane acquisition module, configured to obtain a three-dimensional display plane corresponding to the candidate display area according to the depth image corresponding to the two-dimensional image; a target area determination module, configured to obtain a three-dimensional model corresponding to the two-dimensional image, determine an image observation position corresponding to the three-dimensional model, and select a three-dimensional target display area in the three-dimensional display plane corresponding to at least one target object based on the image observation position; a display area determination module, configured to obtain display position information and image information corresponding to the three-dimensional target display area and determine a two-dimensional screen display area corresponding to the three-dimensional target display area according to the display position information and the image observation position; and a display processing module, configured to display the image information in the two-dimensional screen display area.
  • In some embodiments, the image analysis module is specifically configured to use a trained neural network model to classify at least one pixel in the two-dimensional image and obtain a category label of the at least one pixel, and to generate the semantic map based on the position information of the at least one pixel in the two-dimensional image and the corresponding category label.
  • In some embodiments, the candidate area determination module includes: a target area determination unit, configured to determine at least one target area corresponding to the target object in the semantic map based on the category labels in the semantic map; an area connection processing unit, configured to perform image connection processing on multiple target areas corresponding to the target object using a preset image connection algorithm to generate at least one pixel aggregation cluster; and a candidate area selection unit, configured to determine the candidate display area based on the at least one pixel aggregation cluster.
  • In some embodiments, the candidate area selection unit is configured to determine whether the number of pixel aggregation clusters is greater than one; if not, it sets this pixel aggregation cluster as a candidate cluster; if so, it performs scoring processing on the at least one pixel aggregation cluster according to the preset aggregation cluster scoring factors and determines candidate clusters among the multiple pixel aggregation clusters based on their scores, where the aggregation cluster scoring factors include the position distribution and size of the pixel aggregation clusters. The candidate area selection unit sets the candidate clusters as the foreground and the remaining pixels in the semantic map as the background to generate a binary image, and obtains the first rectangle corresponding to the foreground in the binary image as the candidate display area.
  • In some embodiments, the three-dimensional plane acquisition module includes: a coordinate conversion unit, configured to convert two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates based on the depth image; a point cloud generation unit, configured to generate a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates; a plane detection unit, configured to perform plane detection on the three-dimensional point cloud according to a plane detection algorithm; and a plane determination unit, configured to obtain the three-dimensional display plane corresponding to the three-dimensional point cloud if the detection passes.
  • In some embodiments, the target area determination module is specifically configured to obtain display factors corresponding to the three-dimensional display plane of at least one target object according to the image observation position, where the display factors include the orientation of the three-dimensional display plane and the distance between the three-dimensional display plane and the image observation position; determine the display score of the three-dimensional display plane corresponding to at least one target object based on the display factors; and select the three-dimensional target display area according to the display score.
  • In some embodiments, the display position information includes the vertex three-dimensional coordinate information of the three-dimensional target display area; the display area determination module is specifically configured to determine the two-dimensional coordinate information of the vertices of the three-dimensional target display area based on the vertex three-dimensional coordinate information and the image observation position, and to determine, according to the two-dimensional coordinate information, the two-dimensional screen display area in the two-dimensional screen display image corresponding to the three-dimensional model and the image observation position.
  • In some embodiments, the display processing module is configured to obtain the background information of the two-dimensional screen display area and adjust the display elements of the image information based on the background information, where the display elements of the image information include at least one of the following: pictures, text corresponding to the pictures, and symbols.
  • In some embodiments, the display processing module is configured to obtain the observation distance between the three-dimensional target display area and the image observation position, determine the area size of the two-dimensional screen display area, and determine the display mode and size of the image information based on the area size and the observation distance.
  • the two-dimensional image includes: a color two-dimensional image corresponding to the interior of the house;
  • the target object includes: at least one of a window, a wall, a mirror, a desktop, and a television.
  • According to another aspect of the embodiments of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor, where the processor is configured to execute the above method.
  • According to another aspect of the embodiments of the present disclosure, a computer program product is provided, including computer program instructions that implement the above method when executed by a processor.
  • Based on the embodiments of the present disclosure, image information for interacting with the user can be displayed on real space planes and virtual space planes while the user browses, providing mixed reality information display capabilities and scene-based information, offering users a spatial scene interaction experience, improving the user's spatial browsing experience, and effectively improving user engagement.
  • Figure 1 is a flow chart of an embodiment of the image information display method of the present disclosure.
  • Figure 2 is a flow chart for determining candidate display areas in one embodiment of the image information display method of the present disclosure.
  • Figure 3 is a flow chart for determining a three-dimensional display plane in one embodiment of the image information display method of the present disclosure.
  • Figure 4 is a flow chart of selecting a three-dimensional target display area in one embodiment of the image information display method of the present disclosure.
  • Figure 5 is a flow chart for determining a two-dimensional screen display area in one embodiment of the image information display method of the present disclosure.
  • Figure 6A is a schematic diagram of a two-dimensional screen display area.
  • Figure 6B is a schematic diagram of displaying image information in the two-dimensional screen display area.
  • Figure 7 is a schematic structural diagram of an embodiment of the image information display device of the present disclosure.
  • Figure 8 is a schematic structural diagram of a candidate area determination module in one embodiment of the image information display device of the present disclosure.
  • Figure 9 is a schematic structural diagram of a three-dimensional plane acquisition module in one embodiment of the image information display device of the present disclosure.
  • Figure 10 is a structural diagram of an embodiment of the electronic device of the present disclosure.
  • Embodiments of the present disclosure may be applied to computer systems/servers that can operate with numerous other general-purpose or special-purpose computing system environments or configurations.
  • Examples of well-known computing systems, environments, and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
  • Computer systems/servers may be described in the general context of computer system executable instructions, such as program modules, being executed by the computer system.
  • program modules may include routines, programs, object programs, components, logic, data structures, etc., that perform specific tasks or implement specific abstract data types.
  • the computer system/server may be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices linked through a communications network.
  • program modules may be located on local or remote computing system storage media including storage devices.
  • Step numbers in this disclosure, such as "Step 1", "Step 2", "S101", "S102", etc., are only used to distinguish different steps and do not represent an order between the steps; the execution order of steps with different numbers can be adjusted.
  • Figure 1 is a flow chart of an embodiment of the image information display method of the present disclosure. The method shown in Figure 1 includes the following steps:
  • S101 Obtain classification information of pixels in the two-dimensional image, and generate a semantic map corresponding to the two-dimensional image based on the classification information.
  • the two-dimensional image may be a color two-dimensional image corresponding to the interior of a house.
  • the two-dimensional image may be a color image of an object, a bedroom, a gymnasium interior, etc.
  • the target objects in the two-dimensional image can be windows, walls, mirrors, desktops, TVs, etc.
  • The target objects have a planar structure, on which videos can be played, pictures can be set, and so on.
  • The classification information of pixels in the two-dimensional image includes classification information such as windows, walls, mirrors, desktops, TVs, floors, etc.
  • Several methods can be used to generate semantic maps corresponding to two-dimensional images based on classification information. For example, a trained neural network model is used to classify at least one pixel in the two-dimensional image, and a category label of at least one pixel in the two-dimensional image is obtained.
  • Neural network models can be convolutional neural networks, adversarial neural network models, etc., and can be trained using a variety of existing training methods.
  • Through the neural network model, it can be determined whether the pixels in the two-dimensional image belong to windows, walls, mirrors, desktops, TVs, floors, etc., and a category label can be set for at least one pixel in the two-dimensional image. Category labels can include, but are not limited to, labels such as window, wall, mirror, desktop, TV, and floor.
  • a semantic map is generated based on the position information of at least one pixel in the two-dimensional image and the corresponding category label.
  • For example, a category label indicating that the pixel belongs to a window, wall, mirror, desktop, TV, floor, etc. can be set at the position of at least one pixel in the two-dimensional image to generate the semantic map, as in the sketch below.
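  • As an illustration only, the following Python sketch shows how such a semantic map can be assembled from per-pixel classification scores. The classify_pixels stub and the label set are hypothetical stand-ins for the trained neural network model and its categories described above, not the actual model of this disclosure.

        import numpy as np

        # Hypothetical label set; a real model would be trained on categories like these.
        LABELS = ["window", "wall", "mirror", "desktop", "tv", "floor", "other"]

        def classify_pixels(image: np.ndarray) -> np.ndarray:
            """Stub for a trained segmentation network: returns H x W x C
            per-pixel class scores over the C categories in LABELS."""
            h, w, _ = image.shape
            rng = np.random.default_rng(0)  # placeholder scores only
            return rng.random((h, w, len(LABELS)))

        def build_semantic_map(image: np.ndarray) -> np.ndarray:
            """Semantic map: an H x W array that stores, at each pixel
            position, the index of that pixel's category label."""
            return classify_pixels(image).argmax(axis=-1)

        image = np.zeros((480, 640, 3), dtype=np.uint8)  # a color two-dimensional image
        semantic_map = build_semantic_map(image)
        print(LABELS[semantic_map[240, 320]])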
  • S102 Determine a candidate display area corresponding to the target object in the two-dimensional image based on the semantic map.
  • Candidate display areas for target objects such as windows, walls, mirrors, desktops, TVs, and floors in the two-dimensional image can be determined based on the generated semantic map.
  • S103 Obtain a three-dimensional display plane corresponding to the candidate display area based on the depth image corresponding to the two-dimensional image.
  • S104 Obtain a three-dimensional model corresponding to the two-dimensional image, determine the image observation position corresponding to the three-dimensional model, and select a three-dimensional target display area in the three-dimensional display plane corresponding to at least one target object based on the image observation position.
  • The depth image may be a depth image of a living room, bedroom, gymnasium, etc. taken using a depth camera, and the pixels in the depth image have three-dimensional coordinate information.
  • a three-dimensional model of the living room, bedroom, gymnasium, etc. can be established.
  • This three-dimensional model is a VR model that supports the VR scene display function; the three-dimensional space scene can be presented to the user through this three-dimensional model.
  • the three-dimensional model can determine the two-dimensional image that needs to be displayed on the two-dimensional screen based on the image observation position.
  • S105 Obtain the display position information and image information corresponding to the three-dimensional target display area, and determine the two-dimensional screen display area corresponding to the three-dimensional target display area based on the display position information and image observation position.
  • The image information may be mixed reality (MR) information, and the image information may be of various types, such as community introductions, environment descriptions, descriptions of a house's advantages, and other image information that can provide users with scene-based information.
  • Image information can be displayed on planes in real space and planes in virtual space, such as on target objects like windows, walls, mirrors, desktops, TVs, and floors in the two-dimensional screen display area, providing MR display capabilities to give users an interactive experience of spatial scenes and improve their spatial browsing experience.
  • Figure 2 is a flow chart for determining candidate display areas in one embodiment of the image information display method of the present disclosure. The method shown in Figure 2 includes the following steps:
  • S201 Determine at least one target area corresponding to the target object in the semantic map based on the category labels in the semantic map.
  • S202 Use a preset image connection algorithm to perform image connection processing on multiple target areas corresponding to the target object, and generate at least one pixel aggregation cluster.
  • the image connection algorithm may include a variety of algorithms, such as existing image region growing algorithms, etc.
  • S203 Determine a candidate display area based on at least one pixel aggregation cluster.
  • Aggregation cluster scoring factors include the location distribution and size of pixel aggregation clusters.
  • For example, when the target object is a window, multiple target areas corresponding to the window are determined in the semantic map, and an existing image region growing algorithm is used to perform image connection processing on these target areas to generate multiple pixel aggregation clusters.
  • Aggregation cluster scoring factors include the position distribution and size of the pixel aggregation clusters, and scoring standards corresponding to these factors can be set: for example, the larger a pixel aggregation cluster, the higher its score; the farther a pixel aggregation cluster's position lies from the center, the lower its score; and so on. The multiple pixel aggregation clusters are scored according to these criteria, and pixel aggregation clusters with scores greater than a threshold are determined as candidate clusters. The number of candidate clusters can be one or more.
  • Set the candidate cluster as the foreground and the remaining pixels in the semantic map as the background to generate a binary map, and obtain the first rectangle corresponding to the foreground in the binary map as the candidate display area.
  • The first rectangle is a relatively large rectangle; for example, the largest rectangle corresponding to the foreground in the binary image can be obtained as the candidate display area.
  • For example, a trained feedforward neural network model processes the two-dimensional color image, extracts features from the color image, classifies each pixel in the two-dimensional color image based on these features, and determines a semantic category label for each pixel; the category labels can include windows, walls, mirrors, desktops, TVs, etc.
  • a semantic map is formed based on the category labels corresponding to the pixels of the original color image.
  • Based on the semantic map, target objects such as windows, walls, mirrors, desktops, and TVs can be initially screened, the positions of the target objects in the two-dimensional color image can be obtained, and the target areas can be generated. The target areas obtained at this point are two-dimensional and scattered, distributed pixel by pixel, and do not constitute enclosed regions.
  • the image region growing algorithm is used to generate connected parts, and adjacent pixels of the same category are aggregated into pixel aggregation clusters.
  • the results are sorted according to scoring criteria such as location distribution and cluster size, and candidate clusters that meet the criteria are selected.
  • Pixel candidate clusters may be irregular in shape and need to be processed into regular shapes.
  • Set the current pixel candidate cluster as the foreground, and set the positions of the remaining image pixels as the background to obtain a binary image.
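  • The pipeline above can be sketched as follows, using OpenCV connected components as a stand-in for the region growing step and a bounding rectangle as a simplified "first rectangle"; the scoring rule (larger and more central clusters win) is an illustrative assumption rather than the disclosure's exact criterion.

        import cv2
        import numpy as np

        def candidate_display_area(semantic_map: np.ndarray, target_label: int):
            """Cluster target-object pixels, score the clusters by size and
            centrality, and return a rectangle as the candidate display area."""
            mask = (semantic_map == target_label).astype(np.uint8)
            n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
            if n <= 1:
                return None                       # no pixels of this category
            h, w = mask.shape
            center = np.array([w / 2.0, h / 2.0])
            best, best_score = 0, -np.inf
            for i in range(1, n):                 # label 0 is the background
                area = stats[i, cv2.CC_STAT_AREA]
                dist = np.linalg.norm(centroids[i] - center)
                score = area - dist               # assumed scoring rule
                if score > best_score:
                    best, best_score = i, score
            # Binary map: candidate cluster as foreground, the rest as background.
            ys, xs = np.nonzero(labels == best)
            x0, y0 = xs.min(), ys.min()
            return (int(x0), int(y0),             # rectangle used as candidate area
                    int(xs.max() - x0 + 1), int(ys.max() - y0 + 1))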
  • Figure 3 is a flow chart for determining a three-dimensional display plane in one embodiment of the image information display method of the present disclosure. The method shown in Figure 3 includes the following steps:
  • S301 Convert the two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates based on the depth image.
  • S302 Generate a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates.
  • S303 Perform plane detection on the three-dimensional point cloud according to the plane detection algorithm.
  • Plane detection algorithms include, for example, the existing random sample consensus (RANSAC) algorithm.
  • the three-dimensional display plane can be constructed through existing three-dimensional reconstruction technology.
  • the three-dimensional display plane can be multiple areas in space suitable for displaying information, and each area is composed of three-dimensional coordinates of multiple boundary corner points.
  • the candidate display area lacks three-dimensional position information and needs to be processed based on the depth map to obtain the three-dimensional coordinates.
  • the three-dimensional plane corresponding to the object is used as the three-dimensional display plane.
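  • Under assumed pinhole camera intrinsics, these steps can be sketched with NumPy and Open3D as follows; the intrinsics, RANSAC parameters, and the inlier-ratio pass criterion are illustrative assumptions, not values from this disclosure.

        import numpy as np
        import open3d as o3d

        def rect_to_point_cloud(depth, rect, fx, fy, cx, cy):
            """Convert 2D pixel coordinates inside the candidate rectangle into
            3D coordinates using the depth image and pinhole intrinsics."""
            x0, y0, w, h = rect
            us, vs = np.meshgrid(np.arange(x0, x0 + w), np.arange(y0, y0 + h))
            z = depth[y0:y0 + h, x0:x0 + w]
            pts = np.stack([(us - cx) * z / fx, (vs - cy) * z / fy, z], axis=-1)
            pts = pts.reshape(-1, 3)
            return pts[pts[:, 2] > 0]             # keep pixels with valid depth

        def detect_display_plane(points, dist_thresh=0.01, min_inlier_ratio=0.6):
            """RANSAC plane detection on the point cloud; returns the plane model
            (a, b, c, d) and the inliers, or None when detection does not pass."""
            pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(points))
            model, inliers = pcd.segment_plane(distance_threshold=dist_thresh,
                                               ransac_n=3, num_iterations=500)
            if len(inliers) < min_inlier_ratio * len(points):
                return None                       # not planar enough: detection fails
            return model, inliers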
  • Figure 4 is a flow chart of selecting a three-dimensional target display area in one embodiment of the image information display method of the present disclosure. The method shown in Figure 4 includes the following steps:
  • S401 Acquire display factors corresponding to the three-dimensional display plane of at least one target object according to the image observation position. The display factors include factors such as the orientation of the three-dimensional display plane and the distance between the three-dimensional display plane and the image observation position.
  • the displayed position information may be the three-dimensional coordinate information of the vertices of the three-dimensional target display area, etc.
  • S402. Determine the display score of the three-dimensional display plane corresponding to at least one target object based on the display factors.
  • S403 Select the three-dimensional target display area according to the display score.
  • For example, display factors of the three-dimensional display planes corresponding to target objects such as windows, walls, mirrors, desktops, TVs, and floors are obtained; the display factors include the orientation of the three-dimensional display plane, the distance between the three-dimensional display plane and the image observation position, and so on. Scoring standards are set, and based on factors such as the orientation of the three-dimensional display plane and its distance from the image observation position, the display score of the three-dimensional display plane corresponding to at least one target object is determined; three-dimensional display planes with scores greater than a threshold are determined as three-dimensional target display areas, and the number of three-dimensional target display areas can be one or more.
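  • A minimal sketch of this scoring step follows; the weights balancing orientation against distance and the selection threshold are illustrative assumptions, not values from this disclosure.

        import numpy as np

        def display_score(normal, center, view_pos, w_face=1.0, w_dist=0.2):
            """Score a three-dimensional display plane: planes that face the image
            observation position more directly and sit closer to it score higher.
            `normal` is assumed to be a unit normal of the plane."""
            to_viewer = view_pos - center
            dist = np.linalg.norm(to_viewer)
            facing = abs(np.dot(normal, to_viewer / dist))  # 1.0 when viewed head-on
            return w_face * facing - w_dist * dist

        def select_target_areas(planes, view_pos, threshold=0.5):
            """Keep every plane whose display score exceeds the threshold; one or
            more three-dimensional target display areas may result."""
            return [p for p in planes
                    if display_score(p["normal"], p["center"], view_pos) > threshold]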
  • Figure 5 is a flow chart for determining a two-dimensional screen display area in one embodiment of the image information display method of the present disclosure. The method shown in Figure 5 includes the following steps:
  • the display position information includes vertex three-dimensional coordinate information of the three-dimensional target display area.
  • the three-dimensional target display area may be a display area located on a window, as shown in Figure 6A.
  • the three-dimensional target display area is a rectangular area, and the display position information includes three-dimensional coordinate information of four vertices of this rectangular area.
  • the area surrounded by the four vertices is the rectangular area that needs to display the user interface (User Interface, UI).
  • The image information can be pasted onto the rectangular area corresponding to the three-dimensional target display area.
  • the observation distance between the three-dimensional target display area and the image observation position is obtained, and the area size of the two-dimensional screen display area is determined. Based on the area size and the observation distance, the display mode and size of the image information are determined.
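  • As a sketch, the vertex projection can be written as a standard view/projection transform followed by a viewport mapping, and the UI size can be scaled from the area size and observation distance; the matrices are assumed to come from the three-dimensional model and the image observation position, and the scaling rule is purely illustrative.

        import numpy as np

        def project_vertices(vertices, view, proj, screen_w, screen_h):
            """Map the 3D vertices of the target display area to 2D pixel
            coordinates in the screen image seen from the observation position."""
            pts = np.hstack([vertices, np.ones((len(vertices), 1))])
            clip = pts @ view.T @ proj.T
            ndc = clip[:, :3] / clip[:, 3:4]            # perspective divide
            xs = (ndc[:, 0] + 1.0) * 0.5 * screen_w     # NDC -> pixel coordinates
            ys = (1.0 - ndc[:, 1]) * 0.5 * screen_h     # flip y for screen space
            return np.stack([xs, ys], axis=1)           # 2D screen display area corners

        def ui_scale(area_px, distance, base_area=50000.0, base_dist=2.0):
            """Assumed rule of thumb: shrink the UI for smaller display areas
            and larger observation distances."""
            return min(1.0, (area_px / base_area) * (base_dist / max(distance, 1e-6)))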
  • The image information display method of the present disclosure can display image information that interacts with the user on real space planes and virtual space planes while the user browses, provides MR information display capabilities and scene-based information, offers users a spatial scene interaction experience, and improves the user's spatial browsing experience.
  • In one embodiment, the present disclosure provides an image information display device, including an image analysis module 71, a candidate area determination module 72, a three-dimensional plane acquisition module 73, a target area determination module 74, a display area determination module 75, and a display processing module 76.
  • the image analysis module 71 obtains classification information of pixels in the two-dimensional image, and generates a semantic map corresponding to the two-dimensional image based on the classification information.
  • The image analysis module 71 uses a trained neural network model to classify at least one pixel in the two-dimensional image and obtains the category label of the at least one pixel; based on the position information of the at least one pixel in the two-dimensional image and the corresponding category labels, the image analysis module 71 generates the semantic map.
  • the candidate area determination module 72 determines a candidate display area corresponding to the target object in the two-dimensional image based on the semantic map.
  • the three-dimensional plane acquisition module 73 acquires a three-dimensional display plane corresponding to the candidate display area according to the depth image corresponding to the two-dimensional image.
  • the target area determination module 74 obtains a three-dimensional model corresponding to the two-dimensional image, determines an image observation position corresponding to the three-dimensional model, and selects a three-dimensional target display area in a three-dimensional display plane corresponding to at least one target object based on the image observation position.
  • the display area determination module 75 obtains display position information and image information corresponding to the three-dimensional target display area, and determines a two-dimensional screen display area corresponding to the three-dimensional target display area based on the display position information and image observation position.
  • the display processing module 76 performs display processing on the image information in the two-dimensional screen display area.
  • The candidate area selection unit 723 determines whether the number of pixel aggregation clusters is greater than one. If not, the candidate area selection unit 723 sets this pixel aggregation cluster as a candidate cluster; if so, it performs scoring processing on the at least one pixel aggregation cluster according to the preset aggregation cluster scoring factors and determines candidate clusters among the multiple pixel aggregation clusters based on their scores, where the aggregation cluster scoring factors include the position distribution and size of the pixel aggregation clusters.
  • the candidate area selection unit 723 sets the candidate cluster as the foreground and the remaining pixels in the semantic map as the background, generating a binary map.
  • the candidate area selection unit 723 obtains the first rectangle corresponding to the foreground in the binary image as a candidate display area.
  • the three-dimensional plane acquisition module 73 includes a coordinate conversion unit 731 , a point cloud generation unit 732 , a plane detection unit 733 and a plane determination unit 734 .
  • the coordinate conversion unit 731 converts two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates based on the depth image.
  • the point cloud generation unit 732 generates a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates.
  • The plane detection unit 733 performs plane detection on the three-dimensional point cloud according to a plane detection algorithm, where the plane detection algorithm includes the random sample consensus (RANSAC) algorithm. If the detection passes, the plane determination unit 734 acquires the three-dimensional display plane corresponding to the three-dimensional point cloud.
  • the display position information includes the three-dimensional coordinate information of the vertex of the three-dimensional target display area; the display area determination module 75 determines the two-dimensional coordinate information of the vertex of the three-dimensional target display area based on the three-dimensional coordinate information of the vertex and the image observation position. The display area determination module 75 determines the two-dimensional screen display area in the two-dimensional screen display image corresponding to the three-dimensional model and the image observation position according to the two-dimensional coordinate information.
  • The display processing module 76 obtains the background information of the two-dimensional screen display area and adjusts the display elements of the image information based on the background information, where the display elements of the image information include at least one of the following: pictures, text corresponding to the pictures, and symbols.
  • the display processing module 76 obtains the observation distance between the three-dimensional target display area and the image observation position, determines the area size of the two-dimensional screen display area, and determines the display mode and size of the image information based on the area size and the observation distance.
  • Figure 10 is a structural diagram of an embodiment of an electronic device of the present disclosure. As shown in Figure 10, the electronic device 101 includes one or more processors 1011 and a memory 1012.
  • the processor 1011 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 101 to perform desired functions.
  • Memory 1012 may store one or more computer program products and may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache).
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.
  • One or more computer program products may be stored on the computer-readable storage medium and executed by a processor to implement the image information display method of the various embodiments of the present disclosure described above and/or other desired functions.
  • the electronic device 101 may also include: an input device 1013 and an output device 1014, etc. These components are interconnected through a bus system and/or other forms of connection mechanisms (not shown).
  • The input device 1013 may include, for example, a keyboard, a mouse, and the like.
  • the output device 1014 can output various information to the outside.
  • the output device 1014 may include, for example, a display, a speaker, a printer, a communication network and remote output devices connected thereto, and the like.
  • the electronic device 101 may also include any other appropriate components depending on the specific application.
  • Embodiments of the present disclosure may also be a computer program product, which includes computer program instructions that, when executed by a processor, cause the processor to execute the steps in the image information display method according to various embodiments of the present disclosure described in the above part of this specification.
  • The computer program product may have program code for performing operations of embodiments of the present disclosure written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • embodiments of the present disclosure may also be a computer-readable storage medium having computer program instructions stored thereon.
  • The computer program instructions, when executed by a processor, cause the processor to perform the steps in the image information display method according to various embodiments of the present disclosure described in the "exemplary method" part above in this specification.
  • the computer-readable storage medium may be any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may include, for example, but is not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • The image information display methods, devices, storage media, electronic devices, and program products in the above embodiments can display image information that interacts with the user on real space planes and virtual space planes while the user browses, providing MR information display capabilities and scene-based information; they offer users a spatial scene interactive experience, improve the user's spatial browsing experience, and effectively improve user engagement.
  • the methods and apparatus of the present invention may be implemented in many ways.
  • the method and device of the present invention can be implemented through software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above order for the steps of the method is for illustration only, and the steps of the method of the present invention are not limited to the order specifically described above unless otherwise specifically stated.
  • the present invention can also be implemented as programs recorded in recording media, and these programs include machine-readable instructions for implementing the methods according to the present invention.
  • the present invention also covers recording media storing a program for executing the method according to the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Graphics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to an image information display method and device. The method comprises: obtaining classification information of pixels in a two-dimensional image, and generating a semantic map corresponding to the two-dimensional image; determining, on the basis of the semantic map, a candidate display area corresponding to a target object in the two-dimensional image; obtaining a three-dimensional display plane corresponding to the candidate display area, and selecting a three-dimensional target display area from the three-dimensional display plane corresponding to at least one target object; and, according to display position information and an image observation position, determining a two-dimensional screen display area corresponding to the three-dimensional target display area, and displaying image information.
PCT/CN2023/081391 2022-06-24 2023-03-14 Method and device for displaying image information WO2023246189A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210720304.XA CN114827711B (zh) 2022-06-24 2022-06-24 Image information display method and device
CN202210720304.X 2022-06-24

Publications (1)

Publication Number Publication Date
WO2023246189A1 (fr) 2023-12-28

Family

ID=82522122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/081391 WO2023246189A1 (fr) 2022-06-24 2023-03-14 Method and device for displaying image information

Country Status (2)

Country Link
CN (1) CN114827711B (fr)
WO (1) WO2023246189A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114827711B (zh) * 2022-06-24 2022-09-20 如你所视(北京)科技有限公司 Image information display method and device
CN115631291B (zh) * 2022-11-18 2023-03-10 如你所视(北京)科技有限公司 Real-time relighting method and apparatus, device, and medium for augmented reality

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989594A (zh) * 2015-02-12 2016-10-05 阿里巴巴集团控股有限公司 一种图像区域检测方法及装置
US20170061631A1 (en) * 2015-08-27 2017-03-02 Fujitsu Limited Image processing device and image processing method
EP3299763A1 (fr) * 2015-05-20 2018-03-28 Mitsubishi Electric Corporation Dispositif de production d'image de nuage de points et système d'affichage
CN110400337A (zh) * 2019-07-10 2019-11-01 北京达佳互联信息技术有限公司 图像处理方法、装置、电子设备及存储介质
CN111178191A (zh) * 2019-11-11 2020-05-19 贝壳技术有限公司 信息播放方法、装置、计算机可读存储介质及电子设备
CN112581629A (zh) * 2020-12-09 2021-03-30 中国科学院深圳先进技术研究院 增强现实显示方法、装置、电子设备及存储介质
CN113793255A (zh) * 2021-09-09 2021-12-14 百度在线网络技术(北京)有限公司 用于图像处理的方法、装置、设备、存储介质和程序产品
CN113934297A (zh) * 2021-10-13 2022-01-14 西交利物浦大学 一种基于增强现实的交互方法、装置、电子设备及介质
CN114827711A (zh) * 2022-06-24 2022-07-29 如你所视(北京)科技有限公司 图像信息显示方法和装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030095707A1 (en) * 2001-11-19 2003-05-22 Koninklijke Philips Electronics N.V. Computer vision method and system for blob-based analysis using a probabilistic framework
CN110060230B (zh) * 2019-01-18 2021-11-26 商汤集团有限公司 三维场景分析方法、装置、介质及设备
CN113129362B (zh) * 2021-04-23 2024-05-10 北京地平线机器人技术研发有限公司 一种三维坐标数据的获取方法及装置
CN113902856B (zh) * 2021-11-09 2023-08-25 浙江商汤科技开发有限公司 一种语义标注的方法、装置、电子设备及存储介质


Also Published As

Publication number Publication date
CN114827711B (zh) 2022-09-20
CN114827711A (zh) 2022-07-29

Similar Documents

Publication Publication Date Title
US10755485B2 (en) Augmented reality product preview
Alexiou et al. On the performance of metrics to predict quality in point cloud representations
WO2023246189A1 (fr) 2023-12-28 Method and device for displaying image information
CN111080799A (zh) 基于三维建模的场景漫游方法、系统、装置和存储介质
US10140000B2 (en) Multiscale three-dimensional orientation
US8836728B2 (en) Techniques to magnify images
WO2021093416A1 (fr) Procédé et dispositif de lecture d'informations, support de stockage lisible par ordinateur et dispositif électronique
WO2021018214A1 (fr) Procédé et appareil de traitement d'objet virtuel, et support de stockage et dispositif électronique
CN112017300B (zh) 混合现实图像的处理方法、装置及设备
US11763479B2 (en) Automatic measurements based on object classification
WO2023202349A1 (fr) Procédé et appareil de présentation interactive pour une étiquette tridimensionnelle, ainsi que dispositif, support et produit de programme
TW201417041A (zh) 推擠一模型通過二維場景的系統、方法和電腦程式商品
WO2023103980A1 (fr) Procédé et appareil de présentation de chemins tridimensionnels, support de stockage lisible et dispositif électronique
CN109743566A (zh) 一种用于识别vr视频格式的方法与设备
KR20240074815A (ko) 3d 모델 렌더링 방법 및 장치, 전자 디바이스, 그리고 저장 매체
CN107871338B (zh) 基于场景装饰的实时交互渲染方法
CN113920282B (zh) 图像处理方法和装置、计算机可读存储介质、电子设备
Xuerui Three-dimensional image art design based on dynamic image detection and genetic algorithm
CN107481306B (zh) 一种三维交互的方法
Kim et al. Multimodal visual data registration for web-based visualization in media production
Zhang et al. Sceneviewer: Automating residential photography in virtual environments
WO2022222689A1 (fr) Procédé et appareil de génération de données, et dispositif électronique
WO2023173126A1 (fr) Système et procédé de détection d'objet et de modèles 3d interactifs
US11170043B2 (en) Method for providing visualization of progress during media search
CN114913277A (zh) 一种物体立体交互展示方法、装置、设备及介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23825847

Country of ref document: EP

Kind code of ref document: A1