WO2023246189A1 - Image information display method and apparatus - Google Patents

Image information display method and apparatus

Info

Publication number
WO2023246189A1
WO2023246189A1 (international application PCT/CN2023/081391)
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
image
display
display area
information
Prior art date
Application number
PCT/CN2023/081391
Other languages
English (en)
French (fr)
Inventor
谢独放
李阳
李浩正
王怡丁
焦弟琴
黄晓艺
Original Assignee
如你所视(北京)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 如你所视(北京)科技有限公司
Publication of WO2023246189A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431 - Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N 21/4312 - Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 - Manipulating 3D models or images for computer graphics
    • G06T 19/006 - Mixed reality
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81 - Monomedia components thereof
    • H04N 21/816 - Monomedia components thereof involving special video data, e.g. 3D video

Definitions

  • the present disclosure relates to the field of computer technology, and in particular, to an image information display method and device.
  • VR panoramic technology is an emerging technology. Because it can present three-dimensional space scenes to users at 720 degrees without blind spots, it gives users an immersive visual experience. Users can select rooms through VR viewing, and a simulated decoration plan can reproduce house decoration effects in VR scenes. During VR house viewing, users can determine the position information and perspective information of a virtual observation point in the VR model, determine the displayed observation picture based on that position and perspective information, and see the items in the observation picture, including windows, walls, mirrors, tabletops, TVs and other flat surfaces.
  • Embodiments of the present disclosure provide an image information display method and device.
  • An image information display method is provided, including: obtaining classification information of pixels in a two-dimensional image, and generating a semantic map corresponding to the two-dimensional image based on the classification information; determining, based on the semantic map, a candidate display area corresponding to a target object in the two-dimensional image; obtaining, according to the depth image corresponding to the two-dimensional image, a three-dimensional display plane corresponding to the candidate display area; obtaining a three-dimensional model corresponding to the two-dimensional image, determining an image observation position corresponding to the three-dimensional model, and selecting a three-dimensional target display area in the three-dimensional display plane corresponding to at least one target object based on the image observation position; obtaining display position information and image information corresponding to the three-dimensional target display area, and determining a two-dimensional screen display area corresponding to the three-dimensional target display area according to the display position information and the image observation position; and displaying the image information in the two-dimensional screen display area.
  • Obtaining the classification information of the pixels in the two-dimensional image and generating, based on the classification information, the semantic map corresponding to the two-dimensional image includes: using a trained neural network model to classify at least one pixel in the two-dimensional image and obtain a category label of at least one pixel in the two-dimensional image; and generating the semantic map based on the position information of at least one pixel in the two-dimensional image and the corresponding category label.
  • Determining the candidate display area corresponding to the target object in the two-dimensional image based on the semantic map includes: determining, based on the category label in the semantic map, at least one target area corresponding to the target object in the semantic map; using a preset image connection algorithm to perform image connection processing on multiple target areas corresponding to the target object to generate at least one pixel aggregation cluster; and determining the candidate display area based on the at least one pixel aggregation cluster.
  • Determining the candidate display area according to the at least one pixel aggregation cluster includes: determining whether the number of pixel aggregation clusters is greater than 1; if not, setting the single pixel aggregation cluster as a candidate cluster; if so, scoring at least one pixel aggregation cluster according to preset aggregation cluster scoring factors and determining candidate clusters among the multiple pixel aggregation clusters based on the scores, wherein the aggregation cluster scoring factors include the position distribution and size of the pixel aggregation clusters; setting the candidate clusters as the foreground and the remaining pixels in the semantic map as the background to generate a binary image; and obtaining the first rectangle corresponding to the foreground in the binary image as the candidate display area.
  • Obtaining the three-dimensional display plane corresponding to the candidate display area based on the depth image corresponding to the two-dimensional image includes: converting the two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates based on the depth image; generating a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates; performing plane detection on the three-dimensional point cloud according to a plane detection algorithm; and, if the detection passes, obtaining the three-dimensional display plane corresponding to the three-dimensional point cloud.
  • Selecting a three-dimensional target display area from the three-dimensional display plane corresponding to at least one target object based on the image observation position includes: acquiring, according to the image observation position, display factors of the three-dimensional display plane corresponding to at least one target object; determining a display score of the three-dimensional display plane corresponding to at least one target object based on the display factors; and selecting the three-dimensional target display area according to the display score.
  • The display position information includes the three-dimensional coordinate information of the vertices of the three-dimensional target display area; determining the two-dimensional screen display area corresponding to the three-dimensional target display area based on the display position information and the image observation position includes: determining the two-dimensional coordinate information of the vertices of the three-dimensional target display area based on the three-dimensional vertex coordinate information and the image observation position; and determining, according to the two-dimensional coordinate information, the two-dimensional screen display area in the two-dimensional screen display image corresponding to the three-dimensional model and the image observation position.
  • Displaying the image information in the two-dimensional screen display area includes: obtaining background information of the two-dimensional screen display area and adjusting the display elements of the image information based on the background information, wherein the display elements of the image information include at least one of the following: pictures, text corresponding to the pictures, and symbols.
  • Displaying the image information in the two-dimensional screen display area includes: obtaining the observation distance between the three-dimensional target display area and the image observation position, and determining the area size of the two-dimensional screen display area; and determining the display mode and size of the image information based on the area size and the observation distance.
  • the two-dimensional image includes: a color two-dimensional image corresponding to the interior of the house;
  • the target object includes: at least one of a window, a wall, a mirror, a desktop, and a television.
  • An image information display device is provided, including: an image analysis module, configured to obtain classification information of pixels in a two-dimensional image and generate a semantic map corresponding to the two-dimensional image based on the classification information; a candidate area determination module, configured to determine a candidate display area corresponding to the target object in the two-dimensional image based on the semantic map; a three-dimensional plane acquisition module, configured to obtain the three-dimensional display plane corresponding to the candidate display area based on the depth image corresponding to the two-dimensional image; a target area determination module, configured to obtain the three-dimensional model corresponding to the two-dimensional image, determine the image observation position corresponding to the three-dimensional model, and select a three-dimensional target display area in the three-dimensional display plane corresponding to at least one target object based on the image observation position; a display area determination module, configured to obtain display position information and image information corresponding to the three-dimensional target display area and determine the two-dimensional screen display area corresponding to the three-dimensional target display area according to the display position information and the image observation position; and a display processing module, configured to display the image information in the two-dimensional screen display area.
  • The image analysis module is specifically configured to use a trained neural network model to classify at least one pixel in the two-dimensional image and obtain a category label of at least one pixel in the two-dimensional image, and to generate the semantic map based on the position information of at least one pixel in the two-dimensional image and the corresponding category label.
  • The candidate area determination module includes: a target area determination unit, configured to determine at least one target area corresponding to the target object in the semantic map based on the category label in the semantic map; an area connection processing unit, configured to use a preset image connection algorithm to perform image connection processing on multiple target areas corresponding to the target object and generate at least one pixel aggregation cluster; and a candidate area selection unit, configured to determine the candidate display area based on the at least one pixel aggregation cluster.
  • The candidate area selection unit is configured to determine whether the number of pixel aggregation clusters is greater than 1; if not, to set the single pixel aggregation cluster as a candidate cluster; if so, to score at least one pixel aggregation cluster according to preset aggregation cluster scoring factors and determine candidate clusters among the multiple pixel aggregation clusters based on the scores, wherein the aggregation cluster scoring factors include the position distribution and size of the pixel aggregation clusters; to set the candidate clusters as the foreground and the remaining pixels in the semantic map as the background to generate a binary image; and to obtain the first rectangle corresponding to the foreground in the binary image as the candidate display area.
  • The three-dimensional plane acquisition module includes: a coordinate conversion unit, configured to convert two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates based on the depth image; a point cloud generation unit, configured to generate a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates; a plane detection unit, configured to perform plane detection on the three-dimensional point cloud according to a plane detection algorithm; and a plane determination unit, configured to obtain the three-dimensional display plane corresponding to the three-dimensional point cloud if the detection passes.
  • The target area determination module is specifically configured to obtain, according to the image observation position, display factors of the three-dimensional display plane corresponding to at least one target object, wherein the display factors include the orientation of the three-dimensional display plane and the distance between the three-dimensional display plane and the image observation position; to determine the display score of the three-dimensional display plane corresponding to at least one target object based on the display factors; and to select the three-dimensional target display area according to the display score.
  • The display position information includes the three-dimensional coordinate information of the vertices of the three-dimensional target display area; the display area determination module is specifically configured to determine the two-dimensional coordinate information of the vertices of the three-dimensional target display area based on the three-dimensional vertex coordinate information and the image observation position, and to determine, according to the two-dimensional coordinate information, the two-dimensional screen display area in the two-dimensional screen display image corresponding to the three-dimensional model and the image observation position.
  • The display processing module is configured to obtain background information of the two-dimensional screen display area and adjust the display elements of the image information based on the background information, wherein the display elements of the image information include at least one of the following: pictures, text corresponding to the pictures, and symbols.
  • The display processing module is configured to obtain the observation distance between the three-dimensional target display area and the image observation position, determine the area size of the two-dimensional screen display area, and determine the display mode and size of the image information based on the area size and the observation distance.
  • the two-dimensional image includes: a color two-dimensional image corresponding to the interior of the house;
  • the target object includes: at least one of a window, a wall, a mirror, a desktop, and a television.
  • An electronic device includes: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the above-mentioned method.
  • a computer program product including computer program instructions, wherein the computer program instructions implement the above-mentioned method when executed by a processor.
  • Through the embodiments of the present disclosure, image information for interaction with the user can be displayed on real-space planes and virtual-space planes while the user browses, providing mixed reality (MR) information display capabilities and scene-based information, giving users a spatial scene interaction experience, improving their spatial browsing experience, and effectively improving their perception of the space.
  • Figure 1 is a flow chart of an embodiment of the image information display method of the present disclosure
  • Figure 2 is a flow chart for determining candidate display areas in one embodiment of the image information display method of the present disclosure
  • Figure 3 is a flow chart for determining a three-dimensional display plane in one embodiment of the image information display method of the present disclosure
  • Figure 4 is a flow chart of selecting a three-dimensional target display area in one embodiment of the image information display method of the present disclosure
  • Figure 5 is a flow chart for determining a two-dimensional screen display area in one embodiment of the image information display method of the present disclosure
  • Figure 6A is a schematic diagram of a two-dimensional screen display area
  • Figure 6B is a schematic diagram of displaying image information in the two-dimensional screen display area
  • Figure 7 is a schematic structural diagram of an embodiment of the image information display device of the present disclosure.
  • Figure 8 is a schematic structural diagram of a candidate area determination module in one embodiment of the image information display device of the present disclosure.
  • Figure 9 is a schematic structural diagram of a three-dimensional plane acquisition module in one embodiment of the image information display device of the present disclosure.
  • FIG. 10 is a structural diagram of an embodiment of the electronic device of the present disclosure.
  • Embodiments of the present disclosure may be applied to computer systems/servers that can operate with numerous other general-purpose or special-purpose computing system environments or configurations.
  • Examples of well-known computing systems, environments and/or configurations suitable for use with computer systems/servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
  • Computer systems/servers may be described in the general context of computer system executable instructions, such as program modules, being executed by the computer system.
  • program modules may include routines, programs, object programs, components, logic, data structures, etc., that perform specific tasks or implement specific abstract data types.
  • the computer system/server may be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices linked through a communications network.
  • program modules may be located on local or remote computing system storage media including storage devices.
  • Step numbers in this disclosure, such as "Step 1", "Step 2", "S101", "S102", etc., are only for distinguishing different steps and do not represent an order between steps; the order in which steps with different numbers are executed can be adjusted.
  • Figure 1 is a flow chart of an embodiment of the image information display method of the present disclosure. The method shown in Figure 1 includes the following steps:
  • S101 Obtain classification information of pixels in the two-dimensional image, and generate a semantic map corresponding to the two-dimensional image based on the classification information.
  • the two-dimensional image may be a color two-dimensional image corresponding to the interior of a house.
  • the two-dimensional image may be a color image of an object, a bedroom, a gymnasium interior, etc.
  • the target objects in the two-dimensional image can be windows, walls, mirrors, desktops, TVs, etc.
  • the target objects have a planar structure and can play videos, set pictures, etc. on them.
  • the classification information of pixels in the two-dimensional image includes classification information such as windows, walls, mirrors, desktops, TVs, floors, walls, etc.
  • Several methods can be used to generate semantic maps corresponding to two-dimensional images based on classification information. For example, a trained neural network model is used to classify at least one pixel in the two-dimensional image, and a category label of at least one pixel in the two-dimensional image is obtained.
  • Neural network models can be convolutional neural networks, adversarial neural network models, etc., and can be trained using a variety of existing training methods.
  • Through the neural network model, it can be determined whether pixels in the two-dimensional image belong to windows, walls, mirrors, desktops, TVs, floors, etc., and a category label can be set for at least one pixel in the two-dimensional image.
  • Category tags can include but are not limited to: windows, walls, mirrors, desktops, TVs, floors, walls and other tags.
  • a semantic map is generated based on the position information of at least one pixel in the two-dimensional image and the corresponding category label.
  • A category label corresponding to the pixel can be set at the position of at least one pixel in the two-dimensional image, for example, a label indicating that the pixel belongs to a window, wall, mirror, desktop, TV, floor, etc., to generate the semantic map.
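As an illustrative sketch of this step (not the patent's implementation), the per-pixel class scores produced by a trained segmentation network can be collapsed into a semantic map by taking the argmax over the class axis. The label set and the helper name `semantic_map_from_logits` are assumptions for illustration:

```python
import numpy as np

# Hypothetical label set for indoor scenes; the disclosure lists windows,
# walls, mirrors, desktops, TVs and floors as example categories.
LABELS = ["background", "window", "wall", "mirror", "desktop", "tv", "floor"]

def semantic_map_from_logits(logits: np.ndarray) -> np.ndarray:
    """Collapse per-pixel class scores (H, W, C) into a label map (H, W).

    `logits` stands in for the output of the trained neural network model;
    the argmax over the class axis assigns each pixel its category label.
    """
    if logits.ndim != 3 or logits.shape[2] != len(LABELS):
        raise ValueError("expected (H, W, C) scores with C == len(LABELS)")
    return np.argmax(logits, axis=2).astype(np.int32)
```

The resulting label map, together with the pixel positions it is indexed by, is exactly the "semantic map" the description refers to.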
  • S102 Determine a candidate display area corresponding to the target object in the two-dimensional image based on the semantic map.
  • Candidate display areas for target objects such as windows, walls, mirrors, desktops, TVs, floors, and walls in the two-dimensional image can be determined based on the generated semantic map.
  • S103 Obtain a three-dimensional display plane corresponding to the candidate display area based on the depth image corresponding to the two-dimensional image.
  • S104 Obtain a three-dimensional model corresponding to the two-dimensional image, determine the image observation position corresponding to the three-dimensional model, and select a three-dimensional target display area in the three-dimensional display plane corresponding to at least one target object based on the image observation position.
  • the depth image may be a depth image of a living room, bedroom, gymnasium, etc. taken using a depth camera, etc., and the pixels in the depth image have three-dimensional coordinate information.
  • a three-dimensional model of the living room, bedroom, gymnasium, etc. can be established.
  • This three-dimensional model can support the VR scene display function, that is, it is a VR model, and the three-dimensional space scene can be presented to the user through it.
  • the three-dimensional model can determine the two-dimensional image that needs to be displayed on the two-dimensional screen based on the image observation position.
  • S105 Obtain the display position information and image information corresponding to the three-dimensional target display area, and determine the two-dimensional screen display area corresponding to the three-dimensional target display area based on the display position information and image observation position.
  • The image information may be mixed reality (MR) information, and the image information may be of various types, such as community introductions, environment descriptions, house advantage descriptions and other image information, which can provide users with scene-based information.
  • Image information can be displayed on planes in real space and planes in virtual space, such as windows, walls, mirrors, desktops, TVs, floors, walls and other target objects in the two-dimensional screen display area, providing MR display capabilities to It provides users with an interactive experience of spatial scenes and improves their spatial browsing experience.
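The mapping in step S105 from the vertices of a three-dimensional target display area to a two-dimensional screen area can be sketched with a standard pinhole projection. The patent does not specify a camera model; the intrinsics `fx`, `fy`, `cx`, `cy` and the function name below are illustrative assumptions:

```python
import numpy as np

def project_vertices(vertices_cam: np.ndarray, fx: float, fy: float,
                     cx: float, cy: float) -> np.ndarray:
    """Project 3-D display-area vertices (already expressed in the
    observation/camera frame, z > 0) onto the 2-D screen.

    Returns an (N, 2) array of pixel coordinates; the polygon spanned by
    these points would serve as the two-dimensional screen display area.
    """
    v = np.asarray(vertices_cam, dtype=float)
    z = v[:, 2]
    u = fx * v[:, 0] / z + cx
    w = fy * v[:, 1] / z + cy
    return np.stack([u, w], axis=1)
```

In a real renderer the transform from the three-dimensional model frame into the observation frame would come from the image observation position and viewing direction; here the vertices are assumed to be pre-transformed.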
  • Figure 2 is a flow chart for determining candidate display areas in one embodiment of the image information display method of the present disclosure. The method shown in Figure 2 includes the following steps:
  • S202 Use a preset image connection algorithm to perform image connection processing on multiple target areas corresponding to the target object, and generate at least one pixel aggregation cluster.
  • the image connection algorithm may include a variety of algorithms, such as existing image region growing algorithms, etc.
  • S203 Determine a candidate display area based on at least one pixel aggregation cluster.
  • Aggregation cluster scoring factors include the location distribution and size of pixel aggregation clusters.
  • For example, when the target object is a window, multiple target areas corresponding to the window are determined in the semantic map, and the existing image region growing algorithm is used to perform image connection processing on those target areas to generate multiple pixel aggregation clusters.
  • Aggregation cluster scoring factors include the position distribution and size of pixel aggregation clusters, etc. Scoring standards corresponding to these factors can be set: for example, the larger the pixel aggregation cluster, the higher its score; and the farther a cluster's center is from the expected position, the lower its score. Multiple pixel aggregation clusters are scored according to these criteria, and pixel aggregation clusters with scores greater than a threshold are determined as candidate clusters. The number of candidate clusters can be one or more.
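A minimal sketch of the connection and scoring steps, assuming a 4-connected component pass stands in for the "preset image connection algorithm" and a simple size-minus-center-distance score stands in for the aggregation cluster scoring factors (both choices are illustrative, not taken from the patent):

```python
import numpy as np
from collections import deque

def label_clusters(mask: np.ndarray):
    """4-connected component labelling of a boolean target-object mask.
    Returns (label map, number of clusters); labels start at 1."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                count += 1
                q = deque([(i, j)])
                labels[i, j] = count
                while q:
                    y, x = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = count
                            q.append((ny, nx))
    return labels, count

def score_clusters(labels: np.ndarray, n: int, image_center):
    """Score each cluster by size (bigger is better) and by its centroid's
    distance from the image center (closer is better); the 0.5 weight is an
    illustrative choice."""
    scores = {}
    for k in range(1, n + 1):
        ys, xs = np.nonzero(labels == k)
        dist = float(np.hypot(ys.mean() - image_center[0], xs.mean() - image_center[1]))
        scores[k] = len(ys) - 0.5 * dist
    return scores
```

Clusters whose score exceeds a chosen threshold would then be kept as the candidate clusters.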
  • Set the candidate cluster as the foreground and the remaining pixels in the semantic map as the background to generate a binary map, and obtain the first rectangle corresponding to the foreground in the binary map as the candidate display area.
  • The first rectangle is a relatively large rectangle; for example, the largest rectangle corresponding to the foreground in the binary image can be obtained as the candidate display area.
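Assuming the "first rectangle" is approximated by the axis-aligned bounding rectangle of the foreground (the description also allows taking the largest rectangle corresponding to the foreground), the binary-map step can be sketched as:

```python
import numpy as np

def foreground_rectangle(binary: np.ndarray):
    """Axis-aligned bounding rectangle (top, left, bottom, right), inclusive,
    of the foreground pixels in a binary map; returned as the candidate
    display area. Returns None if the foreground is empty.

    Note: a bounding box is an illustrative simplification; a production
    system might instead compute the largest rectangle inscribed in the
    foreground so the display area contains only foreground pixels.
    """
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())
```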
  • A trained feedforward neural network model is used to process the two-dimensional color image, extract features from the color image, and classify each pixel of the two-dimensional color image based on the features, determining a semantic category label for each pixel; the category labels can include windows, walls, mirrors, desktops, TVs, etc.
  • a semantic map is formed based on the category labels corresponding to the pixels of the original color image.
  • Target objects such as windows, walls, mirrors, desktops, and TVs can be initially screened, their positions in the two-dimensional color image obtained, and target areas generated. The target areas obtained at this point are two-dimensional and scattered, pixel by pixel, and do not constitute enclosed regions.
  • the image region growing algorithm is used to generate connected parts, and adjacent pixels of the same category are aggregated into pixel aggregation clusters.
  • the results are sorted according to scoring criteria such as location distribution and cluster size, and candidate clusters that meet the criteria are selected.
  • Pixel candidate clusters may be irregular in shape and need to be processed into regular shapes.
  • Set the current pixel candidate cluster as the foreground, and set the positions of the remaining image pixels as the background to obtain a binary image.
  • Figure 3 is a flow chart for determining a three-dimensional display plane in one embodiment of the image information display method of the present disclosure. The method shown in Figure 3 includes the following steps:
  • S302 Generate a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates.
  • S303 Perform plane detection on the three-dimensional point cloud according to the plane detection algorithm.
  • plane detection algorithms such as the existing random sampling consensus algorithm, etc.
  • the three-dimensional display plane can be constructed through existing three-dimensional reconstruction technology.
  • the three-dimensional display plane can be multiple areas in space suitable for displaying information, and each area is composed of three-dimensional coordinates of multiple boundary corner points.
  • the candidate display area lacks three-dimensional position information and needs to be processed based on the depth map to obtain the three-dimensional coordinates.
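Lifting the 2D candidate pixels into 3D from the depth map is a standard pinhole back-projection; a sketch assuming the depth camera's intrinsics fx, fy, cx, cy are known from calibration:

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with metric depth to
    camera-frame 3D coordinates (x right, y down, z forward)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

Applying this to every pixel of the candidate display area yields the three-dimensional point cloud used for plane detection.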
  • the three-dimensional plane corresponding to the object is used as the three-dimensional display plane.
  • Figure 4 is a flow chart of selecting a three-dimensional target display area in one embodiment of the image information display method of the present disclosure. The method shown in Figure 4 includes the following steps:
  • the display factors include factors such as the orientation of the three-dimensional display plane and the distance between the three-dimensional display plane and the image observation position.
  • the displayed position information may be the three-dimensional coordinate information of the vertices of the three-dimensional target display area, etc.
  • S402. Determine the display score of the three-dimensional display plane corresponding to at least one target object based on the display factors.
  • S403. Select the three-dimensional target display area according to the display score.
  • display factors of the three-dimensional display planes corresponding to target objects such as windows, walls, mirrors, desktops, TVs, and floors are obtained; the display factors include the orientation of each three-dimensional display plane and its distance from the image observation position.
  • according to the scoring criteria and based on factors such as the orientation of the three-dimensional display plane and the distance between the three-dimensional display plane and the image observation position, the display score of the three-dimensional display plane corresponding to at least one target object is determined; planes whose score exceeds a threshold are determined as three-dimensional target display areas, and there may be one or more such areas.
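The orientation-and-distance scoring described above could be realized as follows; the weighting constants are illustrative assumptions, not values from the patent:

```python
import math

def display_score(plane_normal, plane_center, view_pos,
                  w_facing=1.0, w_dist=0.3):
    """Toy display score: planes facing the observer score higher,
    distant planes score lower. plane_normal is a unit vector."""
    to_viewer = [view_pos[i] - plane_center[i] for i in range(3)]
    dist = math.sqrt(sum(c * c for c in to_viewer))
    if dist == 0:
        return 0.0
    unit = [c / dist for c in to_viewer]
    facing = sum(plane_normal[i] * unit[i] for i in range(3))  # cos(angle)
    return w_facing * max(facing, 0.0) - w_dist * dist
```

Planes scoring above a threshold would become the three-dimensional target display areas.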
  • Figure 5 is a flow chart for determining a two-dimensional screen display area in one embodiment of the image information display method of the present disclosure. The method shown in Figure 5 includes the following steps:
  • the display position information includes vertex three-dimensional coordinate information of the three-dimensional target display area.
  • the three-dimensional target display area may be a display area located on a window, as shown in Figure 6A.
  • the three-dimensional target display area is a rectangular area, and the display position information includes three-dimensional coordinate information of four vertices of this rectangular area.
  • the area surrounded by the four vertices is the rectangular area that needs to display the user interface (User Interface, UI).
  • the image information can be pasted to the corresponding rectangular area of the three-dimensional target display area.
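Mapping the rectangle's 3D vertices into the 2D screen image is a perspective projection from the observation position. A simplified sketch that assumes the camera axes are aligned with the world frame (a full pipeline would also apply the view rotation):

```python
def project_vertex(p, view_pos, fx, fy, cx, cy):
    """Project 3D point p onto the screen for an observer at view_pos
    looking down +z. Returns pixel coordinates, or None if the point
    is behind the observer."""
    x, y, z = (p[i] - view_pos[i] for i in range(3))
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)
```

Projecting all four vertices of the three-dimensional target display area yields the two-dimensional screen display area onto which the UI is pasted.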
  • the observation distance between the three-dimensional target display area and the image observation position is obtained, and the area size of the two-dimensional screen display area is determined. Based on the area size and the observation distance, the display mode and size of the image information are determined.
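One hedged way to derive a display size from the region width D and the observation distance L; the patent's Table 1 is not reproduced in this text, so every constant below is a placeholder:

```python
def ui_scale(region_width, view_dist,
             ref_width=1.0, ref_dist=3.0, lo=0.5, hi=2.0):
    """Scale the MR overlay with the region width and viewing distance,
    clamped so it is never too large or too small on screen."""
    scale = (region_width / ref_width) * (view_dist / ref_dist)
    return max(lo, min(hi, scale))
```

Clamping is the key design choice: without it, a fixed-size overlay appears too large close up and illegibly small from afar, which is the problem the patent's display control addresses.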
  • based on the image information display method of the present disclosure, image information that interacts with the user can be displayed on real-space and virtual-space planes while the user browses, providing MR information display capabilities and scene-based information, offering the user an interactive spatial-scene experience and improving the user's space browsing experience.
  • the present disclosure provides an image information display device, including an image analysis module 71, a candidate area determination module 72, a three-dimensional plane acquisition module 73, a target area determination module 74, a display area determination module 75, and a display processing module 76.
  • the image analysis module 71 obtains classification information of pixels in the two-dimensional image, and generates a semantic map corresponding to the two-dimensional image based on the classification information.
  • the image analysis module 71 uses a trained neural network model to classify at least one pixel in the two-dimensional image and obtains the category label of at least one pixel in the two-dimensional image; based on the position information of at least one pixel in the two-dimensional image and the corresponding category labels, the image analysis module 71 generates the semantic map.
  • the candidate area determination module 72 determines a candidate display area corresponding to the target object in the two-dimensional image based on the semantic map.
  • the three-dimensional plane acquisition module 73 acquires a three-dimensional display plane corresponding to the candidate display area according to the depth image corresponding to the two-dimensional image.
  • the target area determination module 74 obtains a three-dimensional model corresponding to the two-dimensional image, determines an image observation position corresponding to the three-dimensional model, and selects a three-dimensional target display area in a three-dimensional display plane corresponding to at least one target object based on the image observation position.
  • the display area determination module 75 obtains display position information and image information corresponding to the three-dimensional target display area, and determines a two-dimensional screen display area corresponding to the three-dimensional target display area based on the display position information and image observation position.
  • the display processing module 76 performs display processing on the image information in the two-dimensional screen display area.
  • the candidate area selection unit 723 determines whether the number of pixel aggregation clusters is greater than 1. If not, the candidate area selection unit 723 sets this pixel aggregation cluster as the candidate cluster; if so, it scores at least one pixel aggregation cluster according to preset aggregation-cluster scoring factors and determines candidate clusters among the multiple pixel aggregation clusters based on those scores; the aggregation-cluster scoring factors include the location distribution and size of the pixel aggregation clusters.
  • the candidate area selection unit 723 sets the candidate cluster as the foreground and the remaining pixels in the semantic map as the background, generating a binary map.
  • the candidate area selection unit 723 obtains the first rectangle corresponding to the foreground in the binary image as a candidate display area.
  • the three-dimensional plane acquisition module 73 includes a coordinate conversion unit 731 , a point cloud generation unit 732 , a plane detection unit 733 and a plane determination unit 734 .
  • the coordinate conversion unit 731 converts two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates based on the depth image.
  • the point cloud generation unit 732 generates a three-dimensional point cloud corresponding to the candidate display area according to the three-dimensional pixel coordinates.
  • the plane detection unit 733 performs plane detection on the three-dimensional point cloud according to a plane detection algorithm, where the plane detection algorithm includes the random sample consensus (RANSAC) algorithm. If the detection passes, the plane determination unit 734 acquires the three-dimensional display plane corresponding to the three-dimensional point cloud.
  • the display position information includes the three-dimensional coordinate information of the vertex of the three-dimensional target display area; the display area determination module 75 determines the two-dimensional coordinate information of the vertex of the three-dimensional target display area based on the three-dimensional coordinate information of the vertex and the image observation position. The display area determination module 75 determines the two-dimensional screen display area in the two-dimensional screen display image corresponding to the three-dimensional model and the image observation position according to the two-dimensional coordinate information.
  • the display processing module 76 obtains the background information of the two-dimensional screen display area and adjusts the display elements of the image information based on the background information, where the display elements of the image information include at least one of the following: pictures, text corresponding to the pictures, and symbols.
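The background-adaptive styling (white text with an outer glow versus recolored elements) can hinge on the background's luminance. A minimal sketch using the standard relative-luminance formula, with the 0.5 threshold as an assumption:

```python
def text_color_for(bg_rgb):
    """Pick white or dark text from the background's relative luminance,
    one simple way to realize the background-adaptive styling above."""
    r, g, b = (c / 255.0 for c in bg_rgb)
    luminance = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return "white" if luminance < 0.5 else "black"
```

A real implementation would sample the region behind the overlay (or blur it, per the first style described in the patent) rather than a single color.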
  • the display processing module 76 obtains the observation distance between the three-dimensional target display area and the image observation position, determines the area size of the two-dimensional screen display area, and determines the display mode and size of the image information based on the area size and the observation distance.
  • FIG. 10 is a structural diagram of an embodiment of an electronic device of the present disclosure. As shown in FIG. 10 , the electronic device 101 includes one or more processors 1011 and a memory 1012 .
  • the processor 1011 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 101 to perform desired functions.
  • Memory 1012 may store one or more computer program products and may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache).
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.
  • One or more computer program products may be stored on the computer-readable storage medium and executed by a processor to implement the image information display method of the various embodiments of the present disclosure described above and/or other desired functions.
  • the electronic device 101 may also include: an input device 1013 and an output device 1014, etc. These components are interconnected through a bus system and/or other forms of connection mechanisms (not shown).
  • the input device 1013 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 1014 can output various information to the outside.
  • the output device 1014 may include, for example, a display, a speaker, a printer, a communication network and remote output devices connected thereto, and the like.
  • the electronic device 101 may also include any other appropriate components depending on the specific application.
  • embodiments of the present disclosure may also be a computer program product, which includes computer program instructions that, when executed by a processor, cause the processor to execute the steps in the image information display method according to the various embodiments of the present disclosure described above in this specification.
  • the computer program product may have program code for performing operations of embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • embodiments of the present disclosure may also be a computer-readable storage medium having computer program instructions stored thereon.
  • the computer program instructions, when executed by a processor, cause the processor to perform the steps in the image information display method according to the various embodiments of the present disclosure described in the "exemplary method" section above in this specification.
  • the computer-readable storage medium may be any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may include, for example, but is not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the image information display methods, devices, storage media, electronic devices, and program products in the above embodiments can display image information that interacts with the user on real-space and virtual-space planes while the user browses, providing MR information display capabilities and scene-based information; they offer the user an interactive spatial-scene experience, improve the user's space browsing experience, and effectively improve user satisfaction.
  • the methods and apparatus of the present invention may be implemented in many ways.
  • the method and device of the present invention can be implemented through software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above order for the steps of the method is for illustration only, and the steps of the method of the present invention are not limited to the order specifically described above unless otherwise specifically stated.
  • the present invention can also be implemented as programs recorded in recording media, and these programs include machine-readable instructions for implementing the methods according to the present invention.
  • the present invention also covers recording media storing a program for executing the method according to the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Graphics (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

Provided are an image information display method and apparatus. The method includes: obtaining classification information of pixels in a two-dimensional image and generating a semantic map corresponding to the two-dimensional image; determining, based on the semantic map, a candidate display area corresponding to a target object in the two-dimensional image; obtaining a three-dimensional display plane corresponding to the candidate display area, and selecting a three-dimensional target display area among the three-dimensional display planes corresponding to at least one target object; and determining, according to display position information and an image observation position, a two-dimensional screen display area corresponding to the three-dimensional target display area and performing display processing on the image information.

Description

图像信息显示方法和装置
本公开要求在2022年6月24日提交中国专利局、公开号为CN202210720304.X、发明名称为“图像信息显示方法和装置”的中国专利公开的优先权,其全部内容通过引用结合在本公开中。
技术领域
本公开涉及计算机技术领域,尤其涉及一种图像信息显示方法和装置。
背景技术
虚拟现实(Virtual Reality,VR)全景技术是一种新兴技术,由于VR全景技术可以720度无死角的为用户呈现三维空间场景,给用户带来浸入式视觉体验,用户可以通过VR看房挑选房间的模拟装修方案,可以在VR场景中实现房屋装修效果模拟等。用户在进行VR看房过程中,可以通过确定虚拟观察点在VR模型内的位置信息和视角信息,基于位置信息和视角信息确定显示的观察画面,能够看到观察画面中的物品,包括窗户、墙壁、镜面、桌面、电视等具有平面的物品。
发明内容
本公开的实施例提供了一种图像信息显示方法和装置。
根据本公开实施例的第一方面,提供一种图像信息显示方法,包括:获取二维图像中的像素的分类信息,基于所述分类信息生成与所述二维图像相对应的语义图;基于所述语义图确定与所述二维图像中的目标物体相对应的候选显示区域;根据与所述二维图像相对应的深度图像,获取与所述候选显示区域相对应的三维显示平面;获取与所述二维图像相对应的三维模型,确定与所述三维模型相对应的图像观测位置,基于所述图像观测位置在与至少一个目标物体对应的三维显示平面中选取三维目标显示区域;获取与所述三维目标显示区域相对应的显示位置信息以及图像信息,根据所述显示位置信息以及所述图像观测位置,确定与所述三维目标显示区域相对应的二维屏幕显示区域;在所述二维屏幕显示区域中对所述图像信息进行显示处理。
可选地,所述获取二维图像中的像素的分类信息,基于所述分类信息生成与所述二维 图像相对应的语义图包括:使用训练好的神经网络模型对所述二维图像中的至少一个像素进行分类处理,获取所述二维图像中至少一个像素的类别标签;基于所述二维图像中的至少一个像素的位置信息以及对应的类别标签,生成所述语义图。
可选地,所述基于所述语义图确定与所述二维图像中的目标物体相对应的候选显示区域包括:基于所述语义图中的类别标签,在所述语义图中确定与所述目标物体对应的至少一个目标区域;使用预设的图像连通算法将与所述目标物体对应的多个目标区域进行图像连通处理,生成至少一个像素聚合簇;根据所述至少一个像素聚合簇确定所述候选显示区域。
可选地,所述根据所述至少一个像素聚合簇确定所述候选显示区域包括:判断所述像素聚合簇的数量是否大于1,如果否,则将此像素聚合簇设置为候选簇;如果是,则根据预设的聚合簇评分因素对至少一个像素聚合簇进行评分处理,并基于所述至少一个像素聚合簇的评分在多个像素聚合簇中确定候选簇;其中,所述聚合簇评分因素包括:像素聚合簇的位置分布以及大小;将所述候选簇设置为前景并将所述语义图中的其余像素设置为背景,生成二值图;在所述二值图中获取与所述前景相对应的第一矩形,作为所述候选显示区域。
可选地,所述根据与所述二维图像相对应的深度图像,获取与所述候选显示区域相对应的三维显示平面包括:基于所述深度图像将所述候选显示区域中的二维像素坐标转换为对应的三维像素坐标;根据所述三维像素坐标生成与所述候选显示区域相对应的三维点云;根据平面检测算法对所述三维点云进行平面检测;如果通过检测,则获取与所述三维点云相对应的三维显示平面。
可选地,所述基于所述图像观测位置在与至少一个目标物体对应的三维显示平面中选取三维目标显示区域包括:根据所述图像观测位置,获取与至少一个目标物体相对应的三维显示平面相对应的展示因素;其中,所述展示因素包括:三维显示平面的朝向、三维显示平面与所述图像观测位置之间的距离;基于所述展示因素确定与至少一个目标物体相对应的三维显示平面的展示评分;根据所述展示评分选取所述三维目标显示区域。
可选地,所述显示位置信息包括:所述三维目标显示区域的顶点三维坐标信息;所述根据所述显示位置信息以及所述图像观测位置,确定与所述三维目标显示区域相对应的二维屏幕显示区域包括:基于所述顶点三维坐标信息以及所述图像观测位置,确定所述三维目标显示区域的顶点二维坐标信息;根据所述二维坐标信息,在所述三维模型与所述图像观测位置相对应的二维屏幕显示图像中确定所述二维屏幕显示区域。
可选地,所述在所述二维屏幕显示区域中对所述图像信息进行显示处理包括:获取所述二维屏幕显示区域的背景信息,基于所述背景信息对所述图像信息的显示元素进行调整;其中,所述图像信息的显示元素包括以下至少之一:图片、图片对应的文字、符号。
可选地,所述在所述二维屏幕显示区域中对所述图像信息进行显示处理包括:获取所述三维目标显示区域与所述图像观测位置之间的观测距离,并确定所述二维屏幕显示区域的区域大小;基于所述区域大小和所述观测距离,确定所述图像信息的显示方式和大小。
可选地,所述二维图像包括:与房屋室内相对应的彩色二维图像;所述目标物体包括:窗户、墙壁、镜面、桌面、电视中的至少一个。
根据本公开实施例的第二方面,提供一种图像信息显示装置,包括:图像分析模块,用于获取二维图像中的像素的分类信息,基于所述分类信息生成与所述二维图像相对应的语义图;候选区域确定模块,用于基于所述语义图确定与所述二维图像中的目标物体相对应的候选显示区域;三维平面获取模块,用于根据与所述二维图像相对应的深度图像,获取与所述候选显示区域相对应的三维显示平面;目标区域确定模块,用于获取与所述二维图像相对应的三维模型,确定与所述三维模型相对应的图像观测位置,基于所述图像观测位置在与至少一个目标物体对应的三维显示平面中选取三维目标显示区域;显示区域确定模块,用于获取与所述三维目标显示区域相对应的显示位置信息以及图像信息,根据所述显示位置信息以及所述图像观测位置,确定与所述三维目标显示区域相对应的二维屏幕显示区域;显示处理模块,用于在所述二维屏幕显示区域中对所述图像信息进行显示处理。
可选地,所述图像分析模块,具体用于使用训练好的神经网络模型对所述二维图像中的至少一个像素进行分类处理,获取所述二维图像中至少一个像素的类别标签;基于所述二维图像中的至少一个像素的位置信息以及对应的类别标签,生成所述语义图。
可选地,所述候选区域确定模块,包括:目标区域确定单元,用于基于所述语义图中的类别标签,在所述语义图中确定与所述目标物体对应的至少一个目标区域;区域连通处理单元,用于使用预设的图像连通算法将与所述目标物体对应的多个目标区域进行图像连通处理,生成至少一个像素聚合簇;候选区域选取单元,用于根据所述至少一个像素聚合簇确定所述候选显示区域。
可选地,所述候选区域选取单元,用于判断所述像素聚合簇的数量是否大于1,如果否,则将此像素聚合簇设置为候选簇;如果是,则根据预设的聚合簇评分因素对至少一个像素聚合簇进行评分处理,并基于所述至少一个像素聚合簇的评分在多个像素聚合簇中确定候选簇;其中,所述聚合簇评分因素包括:像素聚合簇的位置分布以及大小;将所述候 选簇设置为前景并将所述语义图中的其余像素设置为背景,生成二值图;在所述二值图中获取与所述前景相对应的第一矩形,作为所述候选显示区域。
可选地,所述三维平面获取模块,包括:坐标转换单元,用于基于所述深度图像将所述候选显示区域中的二维像素坐标转换为对应的三维像素坐标;点云生成单元,用于根据所述三维像素坐标生成与所述候选显示区域相对应的三维点云;平面检测单元,用于根据平面检测算法对所述三维点云进行平面检测;平面确定单元,用于如果通过检测,则获取与所述三维点云相对应的三维显示平面。
可选地,所述目标区域确定模块,具体用于根据所述图像观测位置,获取与至少一个目标物体相对应的三维显示平面相对应的展示因素;其中,所述展示因素包括:三维显示平面的朝向、三维显示平面与所述图像观测位置之间的距离;基于所述展示因素确定与至少一个目标物体相对应的三维显示平面的展示评分;根据所述展示评分选取所述三维目标显示区域。
可选地,所述显示位置信息包括:所述三维目标显示区域的顶点三维坐标信息;所述显示区域确定模块,具体用于基于所述顶点三维坐标信息以及所述图像观测位置,确定所述三维目标显示区域的顶点二维坐标信息;根据所述二维坐标信息,在所述三维模型与所述图像观测位置相对应的二维屏幕显示图像中确定所述二维屏幕显示区域。
可选地,所述显示处理模块,用于获取所述二维屏幕显示区域的背景信息,基于所述背景信息对所述图像信息的显示元素进行调整;其中,所述图像信息的显示元素包括以下至少之一:图片、图片对应的文字、符号。
可选地,所述显示处理模块,用于获取所述三维目标显示区域与所述图像观测位置之间的观测距离,并确定所述二维屏幕显示区域的区域大小;基于所述区域大小和所述观测距离,确定所述图像信息的显示方式和大小。
可选地,所述二维图像包括:与房屋室内相对应的彩色二维图像;所述目标物体包括:窗户、墙壁、镜面、桌面、电视中的至少一个。
根据本公开实施例的第三方面,提供一种电子设备,所述电子设备包括:处理器;用于存储所述处理器可执行指令的存储器;所述处理器,用于执行上述的方法。
根据本公开实施例的第四方面,提供一种计算机程序产品,包括计算机程序指令,其特征在于,该计算机程序指令被处理器执行时实现上述所述的方法。
基于本公开上述实施例提供的图像信息显示方法和装置,能够在用户浏览时,在真实空间平面以及虚拟空间平面上展示与用户进行交互的图像信息,提供混合现实技术信息展 示能力以及场景化信息,向用户提供空间场景交互体验,提高了用户的空间浏览体验,有效改善了用户的感受度。
下面通过附图和实施例,对本发明的技术方案做进一步的详细描述。
附图说明
构成说明书的一部分的附图描述了本发明的实施例,并且连同描述一起用于解释本发明的原理。
参照附图,根据下面的详细描述,可以更加清楚地理解本发明,其中:
图1为本公开的图像信息显示方法的一个实施例的流程图;
图2为本公开的图像信息显示方法的一个实施例中的确定候选显示区域的流程图;
图3为本公开的图像信息显示方法的一个实施例中的确定三维显示平面的流程图;
图4为本公开的图像信息显示方法的一个实施例中的选取三维目标显示区域的流程图;
图5为本公开的图像信息显示方法的一个实施例中的确定二维屏幕显示区域的流程图;
图6A为二维屏幕显示区域的示意图,图6B为在二维屏幕显示区域内显示图像信息的示意图;
图7为本公开的图像信息显示装置的一个实施例的结构示意图;
图8为本公开的图像信息显示装置的一个实施例中的候选区域确定模块的结构示意图;
图9为本公开的图像信息显示装置的一个实施例中的三维平面获取模块的结构示意图;
图10是本公开的电子设备的一个实施例的结构图。
具体实施方式
现在将参照附图来详细描述本发明的各种示例性实施例。应注意到:除非另外具体说明,否则在这些实施例中阐述的部件和步骤的相对布置、数字表达式和数值不限制本发明的范围。
同时,应当明白,为了便于描述,附图中所示出的各个部分的尺寸并不是按照实际的比例关系绘制的。
以下对至少一个示例性实施例的描述实际上仅仅是说明性的,决不作为对本发明及其应用或使用的任何限制。
对于相关领域普通技术人员已知的技术、方法和设备可能不作详细讨论,但在适当情况下,所述技术、方法和设备应当被视为说明书的一部分。
应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步讨论。
本发明实施例可以应用于计算机系统/服务器,其可与众多其它通用或专用计算系统环境或配置一起操作。适于与计算机系统/服务器一起使用的众所周知的计算系统、环境和/或配置的例子包括但不限于:个人计算机系统、服务器计算机系统、瘦客户机、厚客户机、手持或膝上设备、基于微处理器的系统、机顶盒、可编程消费电子产品、网络个人电脑、小型计算机系统﹑大型计算机系统和包括上述任何系统的分布式云计算技术环境,等等。
计算机系统/服务器可以在由计算机系统执行的计算机系统可执行指令(诸如程序模块)的一般语境下描述。通常,程序模块可以包括例程、程序、目标程序、组件、逻辑、数据结构等等,它们执行特定的任务或者实现特定的抽象数据类型。计算机系统/服务器可以在分布式云计算环境中实施,分布式云计算环境中,任务是由通过通信网络链接的远程处理设备执行的。在分布式云计算环境中,程序模块可以位于包括存储设备的本地或远程计算系统存储介质上。
本公开中的步骤标号,例如“步骤一”、“步骤二”、“S101”、“S102”等,仅为了区分不同步骤,不代表步骤之间的先后顺序,具有不同标号的步骤在执行时可以调整顺序。
图1为本公开的图像信息显示方法的一个实施例的流程图,如图1所示的方法包括以下步骤:
S101,获取二维图像中的像素的分类信息,基于分类信息生成与二维图像相对应的语义图。
在一个实施例中,二维图像可以为与房屋室内相对应的彩色二维图像,例如,二维图像为客体、卧室、体育馆室内的彩色图像等。二维图像中的目标物体可以为窗户、墙壁、镜面、桌面、电视等,目标物体具有平面结构,能够在其上播放视频、设置图片等。
二维图像中的像素的分类信息包括窗户、墙壁、镜面、桌面、电视、地面、墙壁等分类信息。基于分类信息生成与二维图像相对应的语义图可以使用多种方法。例如,使用训练好的神经网络模型对二维图像中的至少一个像素进行分类处理,获取二维图像中至少一个像素的类别标签。神经网络模型可以卷积神经网络、对抗式神经网络模型等,可以采用现有的多种训练方法进行训练。
将二维图像输入训练好的神经网络模型,通过神经网络模型能够确定二维图像中的像素属于窗户、墙壁、镜面、桌面、电视、地面、墙壁等,并对二维图像中至少一个像素设 置对应的类别标签,类别标签可以包括但不限于:窗户、墙壁、镜面、桌面、电视、地面、墙壁等标签。
基于二维图像中的至少一个像素的位置信息以及对应的类别标签,生成语义图。可以在二维图像中的至少一个像素的所在位置处设置与此像素相对应的类别标签,例如,表征像素属于窗户、墙壁、镜面、桌面、电视、地面、墙壁等的类别标签,生成语义图。
S102,基于语义图确定与二维图像中的目标物体相对应的候选显示区域。
可以根据生成的语义图,确定二维图像中的窗户、墙壁、镜面、桌面、电视、地面、墙壁等目标物体的候选显示区域。
S103,根据与二维图像相对应的深度图像,获取与候选显示区域相对应的三维显示平面。
S104,获取与二维图像相对应的三维模型,确定与三维模型相对应的图像观测位置,基于图像观测位置在与至少一个目标物体对应的三维显示平面中选取三维目标显示区域。
在一个实施例中,深度图像可以为使用深度相机等拍摄的客厅、卧室、体育馆室内等的深度图像,深度图像中的像素具有三维坐标信息。基于深度图像并使用现有的多种方法可以建立客厅、卧室、体育馆室内等的三维模型,此三维模型能够支持VR场景展示功能,即为VR模型,可以通过此三维模型为用户呈现三维空间场景。
用户在观看VR场景展示时,可以设置与三维模型相对应的图像观测位置(虚拟观测点),三维模型能够基于图像观测位置确定需要在二维屏幕上进行显示的二维图像。
S105,获取与三维目标显示区域相对应的显示位置信息以及图像信息,根据显示位置信息以及图像观测位置,确定与三维目标显示区域相对应的二维屏幕显示区域。
S106,在二维屏幕显示区域中对图像信息进行显示处理。
在一个实施例中,图像信息可以为混合现实技术(MR)信息,图像信息可以为多种,例如,小区介绍、环境说明、房屋优势说明等图像信息,能够为用户提供场景化信息。可以在真实空间的平面以及虚拟空间内的平面上,例如,窗户、墙壁、镜面、桌面、电视、地面、墙壁等目标物体上的二维屏幕显示区域中展示图像信息,提供MR展示能力,向用户提供空间场景交互体验,提高了用户的空间浏览体验。
图2为本公开的图像信息显示方法的一个实施例中的确定候选显示区域的流程图,如图2所示的方法包括以下步骤:
S201,基于语义图中的类别标签,在语义图中确定与目标物体对应的至少一个目标区域。
S202,使用预设的图像连通算法将与目标物体对应的多个目标区域进行图像连通处理,生成至少一个像素聚合簇。
可选地,图像连通算法可以包括多种,例如,现有的图像区域生长算法等。
S203,根据至少一个像素聚合簇确定候选显示区域。
判断像素聚合簇的数量是否大于1,如果否,则将此像素聚合簇设置为候选簇;如果是,则根据预设的聚合簇评分因素对至少一个像素聚合簇进行评分处理,并基于至少一个像素聚合簇的评分在多个像素聚合簇中确定候选簇。
聚合簇评分因素包括像素聚合簇的位置分布以及大小。例如,目标物体为窗户等,基于语义图中的类别标签,在语义图中确定与窗户对应的多个目标区域。使用现有的图像区域生长算法将窗户的多个目标区域进行图像连通处理,生成多个像素聚合簇。
聚合簇评分因素包括像素聚合簇的位置分布以及大小等,可以设置与聚合簇评分因素相对应的评分标准,例如,像素聚合簇越大,则聚合簇评分越高;像素聚合簇的位置距离多个像素聚合簇的中心位置的距离越远,则聚合簇评分越低等。根据评分标准分别对多个像素聚合簇进行评分,将分值大于阈值的像素聚合簇确定为候选簇,候选簇的数量可以为一个或多个。将候选簇设置为前景并将语义图中的其余像素设置为背景,生成二值图,在二值图中获取与前景相对应的第一矩形,作为候选显示区域。其中,第一矩形为较大矩形,例如,在二值图中获取与前景相对应的最大矩形,作为候选显示区域。
在一个实施例中,使用训练好的前馈神经网络模型对二维彩色图像进行处理,提取彩色图像中的特征,并基于特征对二维彩色图像中的每个像素进行分类,对每个像素确定一个语义类别标签,类别标签可以包括窗户、墙壁、镜面、桌面、电视等类别标签。基于与原彩色图像素一一对应的类别标签构成一张语义图。
使用语义图可以初步筛选窗户、墙壁、镜面、桌面、电视等目标物体,获取目标物体在二维彩色图像中的位置,生成目标区域,此时获取的目标区域是二维的,且目标区域分布零散、逐像素的,不构成一个合围的区域。
通过把零散的目标区域聚合成连通的部分,并结合深度图拟合三维物体的表面。例如,采用图像区域生长算法生成连通部分,把相邻的同类别的像素聚合成像素聚合簇。在得到的多个像素聚合簇中,根据位置分布、簇大小等评分标准,对结果进行排序,选取符合标准的候选簇。
像素候选簇可能是不规则形状,需要处理成规则形状。把当前像素候选簇设置为前景,其余的图像像素的位置设置为背景,得到一张二值图。对二值图运行使用现有的最大矩形 查找算法,得到规则的矩形,构成一个合围区域,作为候选显示区域。
图3为本公开的图像信息显示方法的一个实施例中的确定三维显示平面的流程图,如图3所示的方法包括以下步骤:
S301,基于深度图像将候选显示区域中的二维像素坐标转换为对应的三维像素坐标。
S302,根据三维像素坐标生成与候选显示区域相对应的三维点云。
S303,根据平面检测算法对三维点云进行平面检测。平面检测算法可以为多种,例如为现有的随机采样一致性算法等。
S304,如果通过检测,则获取与三维点云相对应的三维显示平面。
在一个实施例中,三维显示平面可以通过现有的三维重建技术构造,三维显示平面可以为空间中适合展示信息的多个区域,每个区域由多个边界角点的三维坐标合围构成。候选显示区域缺少三维位置信息,需要基于深度图的处理得到三维坐标。
使用深度图把候选显示区域的至少一个像素对应的二维坐标变换到三维空间,得到对应的三维点云,对三维点云使用随机采样一致性算法进行随机采样一致性检测,能够拟合与目标物体相对应的三维平面,作为三维显示平面。
选取三维目标显示区域可以采用多种方法。图4为本公开的图像信息显示方法的一个实施例中的选取三维目标显示区域的流程图,如图4所示的方法包括以下步骤:
S401,根据图像观测位置,获取与至少一个目标物体相对应的三维显示平面相对应的展示因素。
在一个实施例中,展示因素包括三维显示平面的朝向、三维显示平面与图像观测位置之间的距离等因素。显示位置信息可以为三维目标显示区域的顶点三维坐标信息等。
S402,基于展示因素确定与至少一个目标物体相对应的三维显示平面的展示评分。
S403,根据展示评分选取三维目标显示区域。
在一个实施例中,根据每个与三维模型相对应的图像观测位置(虚拟观测点),获取与窗户、墙壁、镜面、桌面、电视、地面、墙壁等目标物体相对应的三维显示平面的展示因素,展示因素包括三维显示平面的朝向、三维显示平面与图像观测位置之间的距离等。
可以设置展示评分标准,例如,三维显示平面的朝向越偏离屏幕,则展示评分越低;三维显示平面与图像观测位置之间的距离近,则展示评分越高等。根据评分标准并基于三维显示平面的朝向、三维显示平面与图像观测位置之间的距离等因素,确定与至少一个目标物体相对应的三维显示平面的展示评分,将分值大于阈值的三维显示平面确定为三维目标显示区域,三维目标显示区域的数量可以为一个或多个。
图5为本公开的图像信息显示方法的一个实施例中的确定二维屏幕显示区域的流程图,如图5所示的方法包括以下步骤:
S501,基于顶点三维坐标信息以及图像观测位置,确定三维目标显示区域的顶点二维坐标信息。
在一个实施例中,显示位置信息包括三维目标显示区域的顶点三维坐标信息。例如,三维目标显示区域可以为位于窗户上的显示区域,如图6A所示,三维目标显示区域为矩形区域,显示位置信息包括此矩形区域的四个顶点的三维坐标信息。
确定三维目标显示区域的四个顶点,四个顶点围成的区域为需要展示用户界面(User Interface,UI)的矩形区域,可以将图像信息贴到对应的三维目标显示区域的矩形区域。当用户发生游走、旋转等手势时,图像观测位置发生了变化,则确定新的显示位置信息,并根据新的显示位置信息以及图像观测位置,确定三维目标显示区域的顶点二维坐标信息。
S502,根据二维坐标信息,在三维模型与图像观测位置相对应的二维屏幕显示图像中确定二维屏幕显示区域。
获取二维屏幕显示区域的背景信息,基于背景信息对图像信息的显示元素进行调整,图像信息的显示元素包括图片、图片对应的文字、符号等。通过实现MR信息展示,可以将信息与环境联系更加紧密。例如,在三维空间中的窗户、墙壁、镜面、桌面、电视等的二维屏幕显示区域展示图像信息。
使用线性图片配合文案的形式展示信息,能够直观的展示数据信息。为使图像展示更加清晰,信息展示有两种样式:1、信息文字符号等选用白色为主色,搭配外发光,具有黑色半透蒙层或背景模糊效果;2、通过现有的算法识别背景色色值,针对不同背景色色值,信息符号等元素选用不同颜色进行展示。
在一个实施例中,获取三维目标显示区域与图像观测位置之间的观测距离,并确定二维屏幕显示区域的区域大小,基于区域大小和观测距离,确定图像信息的显示方式和大小。
当用户处于VR空间时,随着图像观测位置的变化,与MR信息显示的距离会随之变化,如果MR信息显示的大小为固定的,可能在用户观察时会出现MR信息显示过大或过小的情况,影响整体的展示效果。
MR信息显示在空间的展示效果,受到两个参数的影响,即MR展示区域大小(由于UI效果长宽比一定,以二维屏幕显示区域的区域宽度D进行显示限制)和观察距离L,在用户观测时,通过如下表1进行MR信息显示控制,能够保证MR信息的展示效果,MR信息的展示控制效果如图6B所示。
表1-MR信息的展示控制表
基于本公开的图像信息显示方法,能够在用户浏览时,在真实空间平面以及虚拟空间平面上展示与用户进行交互的图像信息,提供MR信息展示能力以及场景化信息,向用户提供空间场景交互体验,提高了用户的空间浏览体验。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于一计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
在一个实施例中,如图7所示,本公开提供一种图像信息显示装置,包括图像分析模块71、候选区域确定模块72、三维平面获取模块73、目标区域确定模块74、显示区域确定模块75和显示处理模块76。
图像分析模块71获取二维图像中的像素的分类信息,基于分类信息生成与二维图像相对应的语义图。图像分析模块71使用训练好的神经网络模型对二维图像中的至少一个像素进行分类处理,获取二维图像中至少一个像素的类别标签;图像分析模块71基于二维图像中的至少一个像素的位置信息以及对应的类别标签,生成语义图。
候选区域确定模块72基于语义图确定与二维图像中的目标物体相对应的候选显示区域。三维平面获取模块73根据与二维图像相对应的深度图像,获取与候选显示区域相对应的三维显示平面。
目标区域确定模块74获取与二维图像相对应的三维模型,确定与三维模型相对应的图像观测位置,基于图像观测位置在与至少一个目标物体对应的三维显示平面中选取三维目标显示区域。显示区域确定模块75获取与三维目标显示区域相对应的显示位置信息以及图像信息,根据显示位置信息以及图像观测位置,确定与三维目标显示区域相对应的二维屏幕显示区域。显示处理模块76在二维屏幕显示区域中对图像信息进行显示处理。
在一个实施例中,如图8所示,候选区域确定模块72包括目标区域确定单元721、区域连通处理单元722和候选区域选取单元723。目标区域确定单元721基于语义图中的类别标签,在语义图中确定与目标物体对应的至少一个目标区域。区域连通处理单元722使 用预设的图像连通算法将与目标物体对应的多个目标区域进行图像连通处理,生成至少一个像素聚合簇;其中,图像连通算法包括图像区域生长算法。候选区域选取单元723根据至少一个像素聚合簇确定候选显示区域。
候选区域选取单元723判断像素聚合簇的数量是否大于1,如果否,则候选区域选取单元723将此像素聚合簇设置为候选簇;如果是,则候选区域选取单元723根据预设的聚合簇评分因素对至少一个像素聚合簇进行评分处理,并基于至少一个像素聚合簇的评分在多个像素聚合簇中确定候选簇;其中,聚合簇评分因素包括:像素聚合簇的位置分布以及大小。候选区域选取单元723将候选簇设置为前景并将语义图中的其余像素设置为背景,生成二值图。候选区域选取单元723在二值图中获取与前景相对应的第一矩形,作为候选显示区域。
在一个实施例中,如图9所示,三维平面获取模块73包括坐标转换单元731、点云生成单元732、平面检测单元733和平面确定单元734。坐标转换单元731基于深度图像将候选显示区域中的二维像素坐标转换为对应的三维像素坐标。点云生成单元732根据三维像素坐标生成与候选显示区域相对应的三维点云。平面检测单元733根据平面检测算法对三维点云进行平面检测;其中,平面检测算法包括随机采样一致性算法。如果通过检测,则平面确定单元734获取与三维点云相对应的三维显示平面。
在一个实施例中,目标区域确定模块74根据图像观测位置,获取与至少一个目标物体相对应的三维显示平面相对应的展示因素;其中,展示因素包括三维显示平面的朝向、三维显示平面与图像观测位置之间的距离等。目标区域确定模块74基于展示因素确定与至少一个目标物体相对应的三维显示平面的展示评分,根据展示评分选取三维目标显示区域。
显示位置信息包括三维目标显示区域的顶点三维坐标信息;显示区域确定模块75基于顶点三维坐标信息以及图像观测位置,确定三维目标显示区域的顶点二维坐标信息。显示区域确定模块75根据二维坐标信息,在三维模型与图像观测位置相对应的二维屏幕显示图像中确定二维屏幕显示区域。
显示处理模块76获取二维屏幕显示区域的背景信息,基于背景信息对图像信息的显示元素进行调整;其中,图像信息的显示元素包括以下至少之一图片、图片对应的文字、符号等。显示处理模块76获取三维目标显示区域与图像观测位置之间的观测距离,并确定二维屏幕显示区域的区域大小,基于区域大小和观测距离,确定图像信息的显示方式和大小。
图10是本公开的电子设备的一个实施例的结构图,如图10所示,电子设备101包括一个或多个处理器1011和存储器1012。
处理器1011可以是中央处理单元(CPU)或者具有数据处理能力和/或指令执行能力的其他形式的处理单元,并且可以控制电子设备101中的其他组件以执行期望的功能。
存储器1012可以存储一个或多个计算机程序产品，所述存储器可以包括各种形式的计算机可读存储介质，例如易失性存储器和/或非易失性存储器。所述易失性存储器例如可以包括随机存取存储器（RAM）和/或高速缓冲存储器（cache）等。所述非易失性存储器例如可以包括只读存储器（ROM）、硬盘、闪存等。在所述计算机可读存储介质上可以存储一个或多个计算机程序产品，处理器可以运行所述计算机程序产品，以实现上文所述的本公开的各个实施例的图像信息显示方法以及/或者其他期望的功能。
在一个示例中,电子设备101还可以包括:输入装置1013以及输出装置1014等,这些组件通过总线系统和/或其他形式的连接机构(未示出)互连。此外,该输入设备1013还可以包括例如键盘、鼠标等等。该输出装置1014可以向外部输出各种信息。该输出设备1014可以包括例如显示器、扬声器、打印机、以及通信网络及其所连接的远程输出设备等等。
当然,为了简化,图10中仅示出了该电子设备101中与本公开有关的组件中的一些,省略了诸如总线、输入/输出接口等等的组件。除此之外,根据具体应用情况,电子设备101还可以包括任何其他适当的组件。
除了上述方法和设备以外，本公开的实施例还可以是计算机程序产品，其包括计算机程序指令，所述计算机程序指令在被处理器运行时使得所述处理器执行本说明书上述部分中描述的根据本公开各种实施例的图像信息显示方法中的步骤。
计算机程序产品可以以一种或多种程序设计语言的任意组合来编写用于执行本公开实施例操作的程序代码,所述程序设计语言包括面向对象的程序设计语言,诸如Java、C++等,还包括常规的过程式程序设计语言,诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算设备上执行、部分地在用户设备上执行、作为一个独立的软件包执行、部分在用户计算设备上部分在远程计算设备上执行、或者完全在远程计算设备或服务器上执行。
此外,本公开的实施例还可以是计算机可读存储介质,其上存储有计算机程序指令,所述计算机程序指令在被处理器运行时使得所述处理器执行本说明书上述“示例性方法” 部分中描述的根据本公开各种实施例的图像信息显示方法中的步骤。
所述计算机可读存储介质可以采用一个或多个可读介质的任意组合。可读介质可以是可读信号介质或者可读存储介质。可读存储介质例如可以包括但不限于电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。可读存储介质的更具体的例子(非穷举的列举)可以包括:具有一个或者多个导线的电连接、便携式盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。
以上结合具体实施例描述了本公开的基本原理,但是,需要指出的是,在本公开中提及的优点、优势、效果等仅是示例而非限制,不能认为这些优点、优势以及效果等是本公开的各个实施例必须具备的。另外,上述公开的具体细节仅是为了示例的作用和便于理解的作用,而非限制,上述细节并不限制本公开为必须采用上述具体的细节来实现。
上述实施例中的图像信息显示方法、装置以及存储介质、电子设备、程序产品,能够在用户浏览时,在真实空间平面以及虚拟空间平面上展示与用户进行交互的图像信息,提供MR信息展示能力以及场景化信息,向用户提供空间场景交互体验,提高了用户的空间浏览体验,有效改善了用户的感受度。
本说明书中各个实施例均采用递进的方式描述,每个实施例重点说明的都是与其它实施例的不同之处,各个实施例之间相同或相似的部分相互参见即可。对于系统实施例而言,由于其与方法实施例基本对应,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。
可能以许多方式来实现本发明的方法和装置。例如,可通过软件、硬件、固件或者软件、硬件、固件的任何组合来实现本发明的方法和装置。用于所述方法的步骤的上述顺序仅是为了进行说明,本发明的方法的步骤不限于以上具体描述的顺序,除非以其它方式特别说明。此外,在一些实施例中,还可将本发明实施为记录在记录介质中的程序,这些程序包括用于实现根据本发明的方法的机器可读指令。因而,本发明还覆盖存储用于执行根据本发明的方法的程序的记录介质。
本发明的描述是为了示例和描述起见而给出的,而并不是无遗漏的或者将本发明限于所公开的形式。很多修改和变化对于本领域的普通技术人员而言是显然的。选择和描述实施例是为了更好说明本发明的原理和实际应用,并且使本领域的普通技术人员能够理解本发明从而设计适于特定用途的带有各种修改的各种实施例。

Claims (20)

  1. 一种图像信息显示方法,其特征在于,包括:
    获取二维图像中的像素的分类信息,基于所述分类信息生成与所述二维图像相对应的语义图;
    基于所述语义图确定与所述二维图像中的目标物体相对应的候选显示区域;
    根据与所述二维图像相对应的深度图像,获取与所述候选显示区域相对应的三维显示平面;
    获取与所述二维图像相对应的三维模型,确定与所述三维模型相对应的图像观测位置,基于所述图像观测位置在与至少一个目标物体对应的三维显示平面中选取三维目标显示区域;
    获取与所述三维目标显示区域相对应的显示位置信息以及图像信息,根据所述显示位置信息以及所述图像观测位置,确定与所述三维目标显示区域相对应的二维屏幕显示区域;
    在所述二维屏幕显示区域中对所述图像信息进行显示处理。
  2. 如权利要求1所述的方法,其特征在于,所述获取二维图像中的像素的分类信息,基于所述分类信息生成与所述二维图像相对应的语义图,包括:
    使用训练好的神经网络模型对所述二维图像中的至少一个像素进行分类处理,获取所述二维图像中至少一个像素的类别标签;
    基于所述二维图像中的至少一个像素的位置信息以及对应的类别标签,生成所述语义图。
  3. 如权利要求2所述的方法,其特征在于,所述基于所述语义图确定与所述二维图像中的目标物体相对应的候选显示区域,包括:
    基于所述语义图中的类别标签,在所述语义图中确定与所述目标物体对应的至少一个目标区域;
    使用预设的图像连通算法将与所述目标物体对应的多个目标区域进行图像连通处理,生成至少一个像素聚合簇;根据所述至少一个像素聚合簇确定所述候选显示区域。
  4. 如权利要求3所述的方法,其特征在于,所述根据所述至少一个像素聚合簇确定所述候选显示区域,包括:
    判断所述像素聚合簇的数量是否大于1,如果否,则将此像素聚合簇设置为候选簇;如果是,则根据预设的聚合簇评分因素对至少一个像素聚合簇进行评分处理,并基于所述至少一个像素聚合簇的评分在多个像素聚合簇中确定候选簇;其中,所述聚合簇评分因素包括:像素聚合簇的位置分布以及大小;
    将所述候选簇设置为前景并将所述语义图中的其余像素设置为背景,生成二值图;
    在所述二值图中获取与所述前景相对应的第一矩形,作为所述候选显示区域。
  5. 如权利要求4所述的方法,其特征在于,所述根据与所述二维图像相对应的深度图像,获取与所述候选显示区域相对应的三维显示平面,包括:
    基于所述深度图像将所述候选显示区域中的二维像素坐标转换为对应的三维像素坐标;
    根据所述三维像素坐标生成与所述候选显示区域相对应的三维点云;
    根据平面检测算法对所述三维点云进行平面检测;如果通过检测,则获取与所述三维点云相对应的三维显示平面。
  6. 如权利要求5所述的方法,其特征在于,所述基于所述图像观测位置在与至少一个目标物体对应的三维显示平面中选取三维目标显示区域包括:
    根据所述图像观测位置,获取与至少一个目标物体相对应的三维显示平面相对应的展 示因素;其中,所述展示因素包括:三维显示平面的朝向、三维显示平面与所述图像观测位置之间的距离;
    基于所述展示因素确定与至少一个目标物体相对应的三维显示平面的展示评分;
    根据所述展示评分选取所述三维目标显示区域。
  7. 如权利要求6所述的方法,其特征在于,所述显示位置信息包括:所述三维目标显示区域的顶点三维坐标信息;所述根据所述显示位置信息以及所述图像观测位置,确定与所述三维目标显示区域相对应的二维屏幕显示区域,包括:
    基于所述顶点三维坐标信息以及所述图像观测位置,确定所述三维目标显示区域的顶点二维坐标信息;
    根据所述二维坐标信息,在所述三维模型与所述图像观测位置相对应的二维屏幕显示图像中确定所述二维屏幕显示区域。
  8. 如权利要求7所述的方法,其特征在于,所述在所述二维屏幕显示区域中对所述图像信息进行显示处理,包括:
    获取所述二维屏幕显示区域的背景信息,基于所述背景信息对所述图像信息的显示元素进行调整;
    其中,所述图像信息的显示元素包括以下至少之一:图片、图片对应的文字、符号。
  9. 如权利要求7所述的方法,其特征在于,所述在所述二维屏幕显示区域中对所述图像信息进行显示处理,包括:
    获取所述三维目标显示区域与所述图像观测位置之间的观测距离,并确定所述二维屏幕显示区域的区域大小;
    基于所述区域大小和所述观测距离,确定所述图像信息的显示方式和大小。
  10. The method according to any one of claims 1-9, wherein the two-dimensional image comprises a color two-dimensional image corresponding to a house interior, and the target object comprises at least one of: a window, a wall, a mirror surface, a table top, and a television.
  11. An image information display apparatus, characterized by comprising:
    an image analysis module configured to acquire classification information of pixels in a two-dimensional image and generate a semantic map corresponding to the two-dimensional image based on the classification information;
    a candidate area determination module configured to determine, based on the semantic map, a candidate display area corresponding to a target object in the two-dimensional image;
    a three-dimensional plane acquisition module configured to acquire, according to a depth image corresponding to the two-dimensional image, a three-dimensional display plane corresponding to the candidate display area;
    a target area determination module configured to acquire a three-dimensional model corresponding to the two-dimensional image, determine an image observation position corresponding to the three-dimensional model, and select a three-dimensional target display area from the three-dimensional display planes corresponding to at least one target object based on the image observation position;
    a display area determination module configured to acquire display position information and image information corresponding to the three-dimensional target display area and determine, according to the display position information and the image observation position, a two-dimensional screen display area corresponding to the three-dimensional target display area; and
    a display processing module configured to perform display processing on the image information in the two-dimensional screen display area.
  12. The apparatus according to claim 11, wherein the image analysis module is specifically configured to perform classification processing on at least one pixel in the two-dimensional image using a trained neural network model to obtain a class label of the at least one pixel in the two-dimensional image, and to generate the semantic map based on position information of the at least one pixel in the two-dimensional image and the corresponding class label.
  13. The apparatus according to claim 12, wherein the candidate area determination module comprises: a target region determination unit configured to determine, based on the class labels in the semantic map, at least one target region corresponding to the target object in the semantic map; a region connectivity processing unit configured to perform image connectivity processing on a plurality of target regions corresponding to the target object using a preset image connectivity algorithm to generate at least one pixel cluster; and a candidate area selection unit configured to determine the candidate display area according to the at least one pixel cluster.
  14. The apparatus according to claim 13, wherein the candidate area selection unit is configured to determine whether the number of pixel clusters is greater than 1; if not, to set the pixel cluster as a candidate cluster; if so, to perform scoring processing on the at least one pixel cluster according to preset cluster scoring factors and determine a candidate cluster among the plurality of pixel clusters based on the scores of the at least one pixel cluster, wherein the cluster scoring factors comprise the position distribution and the size of a pixel cluster; to set the candidate cluster as foreground and the remaining pixels in the semantic map as background to generate a binary map; and to obtain, in the binary map, a first rectangle corresponding to the foreground as the candidate display area.
  15. The apparatus according to claim 14, wherein the three-dimensional plane acquisition module comprises: a coordinate conversion unit configured to convert, based on the depth image, two-dimensional pixel coordinates in the candidate display area into corresponding three-dimensional pixel coordinates; a point cloud generation unit configured to generate, according to the three-dimensional pixel coordinates, a three-dimensional point cloud corresponding to the candidate display area; a plane detection unit configured to perform plane detection on the three-dimensional point cloud according to a plane detection algorithm; and a plane determination unit configured to obtain, if the detection passes, the three-dimensional display plane corresponding to the three-dimensional point cloud.
  16. The apparatus according to claim 15, wherein the target area determination module is specifically configured to acquire, according to the image observation position, presentation factors corresponding to the three-dimensional display plane corresponding to at least one target object, wherein the presentation factors comprise the orientation of the three-dimensional display plane and the distance between the three-dimensional display plane and the image observation position; to determine, based on the presentation factors, a presentation score of the three-dimensional display plane corresponding to the at least one target object; and to select the three-dimensional target display area according to the presentation score.
  17. The apparatus according to claim 16, wherein the display position information comprises three-dimensional vertex coordinate information of the three-dimensional target display area, and the display area determination module is specifically configured to determine two-dimensional vertex coordinate information of the three-dimensional target display area based on the three-dimensional vertex coordinate information and the image observation position, and to determine, according to the two-dimensional coordinate information, the two-dimensional screen display area in a two-dimensional screen display image of the three-dimensional model corresponding to the image observation position.
  18. The apparatus according to claim 17, wherein the display processing module is configured to acquire background information of the two-dimensional screen display area and adjust display elements of the image information based on the background information, wherein the display elements of the image information comprise at least one of the following: a picture, text corresponding to the picture, and a symbol.
  19. The apparatus according to claim 17, wherein the display processing module is configured to acquire an observation distance between the three-dimensional target display area and the image observation position, determine a region size of the two-dimensional screen display area, and determine a display mode and a display size of the image information based on the region size and the observation distance.
  20. The apparatus according to any one of claims 11-19, wherein the two-dimensional image comprises a color two-dimensional image corresponding to a house interior, and the target object comprises at least one of: a window, a wall, a mirror surface, a table top, and a television.
PCT/CN2023/081391 2022-06-24 2023-03-14 Image information display method and apparatus WO2023246189A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210720304.XA CN114827711B (zh) 2022-06-24 2022-06-24 Image information display method and apparatus
CN202210720304.X 2022-06-24

Publications (1)

Publication Number Publication Date
WO2023246189A1 true WO2023246189A1 (zh) 2023-12-28

Family

ID=82522122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/081391 WO2023246189A1 (zh) 2022-06-24 2023-03-14 图像信息显示方法和装置

Country Status (2)

Country Link
CN (1) CN114827711B (zh)
WO (1) WO2023246189A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114827711B (zh) * 2022-06-24 2022-09-20 如你所视(北京)科技有限公司 Image information display method and apparatus
CN115631291B (zh) * 2022-11-18 2023-03-10 如你所视(北京)科技有限公司 Real-time relighting method and apparatus, device and medium for augmented reality

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989594A (zh) * 2015-02-12 2016-10-05 阿里巴巴集团控股有限公司 Image region detection method and apparatus
US20170061631A1 (en) * 2015-08-27 2017-03-02 Fujitsu Limited Image processing device and image processing method
EP3299763A1 (en) * 2015-05-20 2018-03-28 Mitsubishi Electric Corporation Point-cloud-image generation device and display system
CN110400337A (zh) * 2019-07-10 2019-11-01 北京达佳互联信息技术有限公司 Image processing method and apparatus, electronic device, and storage medium
CN111178191A (zh) * 2019-11-11 2020-05-19 贝壳技术有限公司 Information playback method and apparatus, computer-readable storage medium, and electronic device
CN112581629A (zh) * 2020-12-09 2021-03-30 中国科学院深圳先进技术研究院 Augmented reality display method and apparatus, electronic device, and storage medium
CN113793255A (zh) * 2021-09-09 2021-12-14 百度在线网络技术(北京)有限公司 Method, apparatus, device, storage medium and program product for image processing
CN113934297A (zh) * 2021-10-13 2022-01-14 西交利物浦大学 Augmented reality-based interaction method and apparatus, electronic device, and medium
CN114827711A (zh) * 2022-06-24 2022-07-29 如你所视(北京)科技有限公司 Image information display method and apparatus

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030095707A1 (en) * 2001-11-19 2003-05-22 Koninklijke Philips Electronics N.V. Computer vision method and system for blob-based analysis using a probabilistic pramework
CN110060230B (zh) * 2019-01-18 2021-11-26 商汤集团有限公司 Three-dimensional scene analysis method, apparatus, medium and device
CN113129362B (zh) * 2021-04-23 2024-05-10 北京地平线机器人技术研发有限公司 Method and apparatus for acquiring three-dimensional coordinate data
CN113902856B (zh) * 2021-11-09 2023-08-25 浙江商汤科技开发有限公司 Semantic annotation method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN114827711B (zh) 2022-09-20
CN114827711A (zh) 2022-07-29

Similar Documents

Publication Publication Date Title
US10755485B2 (en) Augmented reality product preview
Alexiou et al. On the performance of metrics to predict quality in point cloud representations
WO2023246189A1 (zh) Image information display method and apparatus
CN111080799A (zh) Scene roaming method, system, apparatus and storage medium based on three-dimensional modeling
US10140000B2 (en) Multiscale three-dimensional orientation
US8836728B2 (en) Techniques to magnify images
WO2021093416A1 (zh) Information playback method and apparatus, computer-readable storage medium, and electronic device
WO2021018214A1 (zh) Virtual object processing method and apparatus, storage medium and electronic device
CN112017300B (zh) Mixed reality image processing method, apparatus and device
US11763479B2 (en) Automatic measurements based on object classification
WO2023202349A1 (zh) Interactive presentation method and apparatus for three-dimensional labels, device, medium and program product
TW201417041A (zh) System, method and computer program product for pushing a model through a two-dimensional scene
WO2023103980A1 (zh) Three-dimensional path display method and apparatus, readable storage medium, and electronic device
CN109743566A (zh) Method and device for identifying VR video format
KR20240074815A (ko) 3D model rendering method and apparatus, electronic device, and storage medium
CN107871338B (zh) Real-time interactive rendering method based on scene decoration
CN113920282B (zh) Image processing method and apparatus, computer-readable storage medium, and electronic device
Xuerui Three-dimensional image art design based on dynamic image detection and genetic algorithm
CN107481306B (zh) Three-dimensional interaction method
Kim et al. Multimodal visual data registration for web-based visualization in media production
Zhang et al. Sceneviewer: Automating residential photography in virtual environments
WO2022222689A1 (zh) Data generation method and apparatus, and electronic device
WO2023173126A1 (en) System and method of object detection and interactive 3d models
US11170043B2 (en) Method for providing visualization of progress during media search
CN114913277A (zh) Method, apparatus, device and medium for stereoscopic interactive display of an object

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23825847

Country of ref document: EP

Kind code of ref document: A1