WO2023222130A1

WO2023222130A1 - Display method and electronic device

Info

Publication number: WO2023222130A1
Application number: PCT/CN2023/095396
Authority: WO
Inventors: 邸皓轩; 李丹洪; 王春晖
Original assignee: 荣耀终端有限公司
Priority date: 2022-05-20
Filing date: 2023-05-19
Publication date: 2023-11-23
Also published as: WO2023222130A9; CN116048243B; CN116048243A; CN118312035A

Abstract

The present application provides a display method. The method may be applied to a terminal device such as a mobile phone or a tablet computer. By implementing the method, the terminal device may display one or more shortcut windows in an unlocked main interface and/or an interface to be unlocked. Each shortcut window is associated with a commonly used interface set by a user. When it is detected that the user gazes at a certain shortcut window, the terminal device may display the commonly used interface associated with the shortcut window, such as a commonly used payment interface, a health code interface, etc. Therefore, the user may quickly acquire related information in the commonly used interface, and a touch operation is not required.

Description

A display method and electronic device

This application claims priority to the Chinese patent application filed with the China Patent Office on May 20, 2022, with application number 202210549347.6 and titled "A display method and electronic device", and to the Chinese patent application filed with the China Patent Office on June 30, 2022, with application number 202210761048.9 and titled "A display method and electronic device", the entire contents of which are incorporated into this application by reference.

Technical field

The present application relates to the field of terminals, and in particular, to a display method and electronic device.

Background art

With the rise of mobile terminals and the maturity of communication technology, people have begun to explore new human-computer interaction methods that are independent of the mouse and keyboard, such as voice control and gesture recognition control, thereby providing users with more diverse and more convenient ways to interact and improving the user experience.

Summary of the invention

The embodiment of the present application provides a display method. By implementing this method, the terminal device can detect the area where the user is looking at the screen, and then display an interface corresponding to the area. In this way, users can quickly obtain information in the above interface without touch operations.

In a first aspect, this application provides a display method applied to an electronic device. The electronic device includes a screen, and the screen includes a first preset area. The method includes: displaying a first interface; when displaying the first interface, collecting, by the electronic device, a first image; determining a first eyeball gaze area of the user based on the first image, where the first eyeball gaze area indicates the screen area that the user looks at when looking at the screen; and when the first eyeball gaze area is within the first preset area, displaying a second interface.

By implementing the method provided in the first aspect, the electronic device can collect images used to determine the user's eye gaze area when displaying an interface. When it is determined through the above image that the user is looking at a preset area, the electronic device can display an interface associated with the area. In this way, the user can quickly control the electronic device to display a certain interface through gaze operations, thereby quickly obtaining services or information provided by the interface.

In conjunction with the method provided in the first aspect, in some embodiments, the screen of the electronic device includes a second preset area, and the second preset area is different from the first preset area. The method further includes: determining a second eyeball gaze area of the user based on the first image, where the position of the second eyeball gaze area on the screen is different from the position of the first eyeball gaze area on the screen; and when the second eyeball gaze area is within the second preset area, displaying a third interface, where the third interface is different from the second interface.

By implementing the method provided by the above embodiments, the electronic device can divide the screen into multiple preset areas. One area can correspond to one interface. When the electronic device detects which area the user is looking at, it can display an interface corresponding to the area. In this way, the user can quickly control the electronic device to display different interfaces by looking at different screen areas.
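The correspondence between preset areas and interfaces described above can be sketched as a simple lookup. This is only an illustrative sketch, not the implementation of this application: the `Rect` type, the `PRESET_AREAS` table, the pixel coordinates, and the interface names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned screen rectangle in pixels (hypothetical representation)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        # Half-open bounds, so adjacent areas never both claim a boundary pixel.
        return self.left <= x < self.right and self.top <= y < self.bottom


# Hypothetical preset areas on a 1080x2340 screen, each tied to one interface.
PRESET_AREAS = {
    Rect(0, 0, 540, 400): "second_interface",    # first preset area
    Rect(540, 0, 1080, 400): "third_interface",  # second preset area
}


def interface_for_gaze(x: int, y: int) -> Optional[str]:
    """Return the interface associated with the gazed-at area, or None."""
    for area, interface in PRESET_AREAS.items():
        if area.contains(x, y):
            return interface
    return None
```

Under these assumptions, a gaze point that falls outside every preset area returns `None`, which corresponds to the device continuing to display the current interface.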

Combined with the method provided in the first aspect, in some embodiments, the second interface and the third interface are interfaces provided by the same application, or the second interface and the third interface are interfaces provided by different applications.

In conjunction with the method provided in the first aspect, in some embodiments, the method further includes: displaying a fourth interface; when displaying the fourth interface, collecting, by the electronic device, a second image; determining a third eyeball gaze area of the user based on the second image; and when the third eyeball gaze area is within the first preset area, displaying a fifth interface, where the fifth interface is different from the second interface.

By implementing the method provided by the above embodiment, when the electronic device displays different main interfaces, the interfaces associated with a screen area of the electronic device may also be different. For example, on the first desktop, the interface associated with the upper right corner area of the electronic device may be the payment interface, and on the second desktop, the interface associated with the upper right corner area of the electronic device may be the ride code interface. In this way, the user can set more interfaces associated with the screen area to meet the user's need to open the interface through gaze operations.

Combined with the method provided in the first aspect, in some embodiments, displaying the second interface when the first eyeball gaze area is within the first preset area includes: displaying the second interface when the first eyeball gaze area is within the first preset area and the duration of gazing at the first preset area is a first duration.

By implementing the method provided by the above embodiments, the electronic device can also monitor the user's gaze duration when detecting the user's eyeball gaze area. When the gaze duration meets the preset duration, the electronic device can display the corresponding interface.

In conjunction with the method provided in the first aspect, in some embodiments, the method further includes: when the first eyeball gaze area is within the first preset area and the duration of gazing at the first preset area is a second duration, displaying a sixth interface.

By implementing the method provided by the above embodiments, the electronic device can also associate a screen area with multiple interfaces, and determine which interface is specifically displayed based on the user's gaze duration.
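Selecting an interface by gaze duration can be sketched as below. The sketch rests on assumptions that the text does not state: the first and second durations are treated as minimum thresholds, the second is taken to be longer than the first, and the values of 1 and 2 seconds and all names are hypothetical.

```python
from typing import Optional

# Hypothetical thresholds in seconds; the text only names a "first duration"
# and a "second duration" without giving values.
FIRST_DURATION_S = 1.0
SECOND_DURATION_S = 2.0


def interface_for_dwell(in_first_area: bool, dwell_s: float) -> Optional[str]:
    """Pick an interface from how long the gaze stayed in the first preset area."""
    if not in_first_area:
        return None
    # Check the longer threshold first so a long gaze is not captured
    # by the shorter one.
    if dwell_s >= SECOND_DURATION_S:
        return "sixth_interface"
    if dwell_s >= FIRST_DURATION_S:
        return "second_interface"
    return None
```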

Combined with the method provided in the first aspect, in some embodiments, the first eyeball gaze area is a cursor point formed by one display unit on the screen, or the first eyeball gaze area is a cursor point or a cursor area formed by multiple display units on the screen.

Combined with the method provided in the first aspect, in some embodiments, the second interface is a non-private interface. The method further includes: displaying an interface to be unlocked; when displaying the interface to be unlocked, collecting, by the electronic device, a third image; determining a fourth eyeball gaze area of the user based on the third image; and when the fourth eyeball gaze area is within the first preset area, displaying the second interface.

Combined with the method provided in the first aspect, in some embodiments, the third interface is a privacy interface, and the method further includes: not displaying the third interface when the fourth eyeball gaze area is within the second preset area.

By implementing the method provided by the above embodiments, the electronic device can also set the privacy type of the associated interface. When the associated interface is a non-private interface, in the locked screen state, after recognizing that the user is looking at the screen area corresponding to the non-private interface, the electronic device can directly display the above-mentioned non-private interface without unlocking. In this way, users can obtain the above-mentioned non-private interface more quickly. When the associated interface is a privacy interface, in the lock screen state, after recognizing that the user is looking at the screen area corresponding to the privacy interface, the electronic device may not display the above-mentioned privacy interface. In this way, electronic devices can avoid privacy leaks and improve user experience when providing users with fast services.

Combined with the method provided in the first aspect, in some embodiments, both the second interface and the third interface are privacy interfaces; the electronic device does not enable the camera to acquire images when displaying the interface to be unlocked.

By implementing the method provided by the above embodiment, when all associated interfaces are privacy interfaces, the electronic device may keep the camera off while the screen is locked, thereby reducing power consumption.
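The lock-screen behavior described in the preceding embodiments can be summarized in a short sketch. It is an assumption-laden illustration rather than the implementation of this application: the `Association` record and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Association:
    """Interface associated with one preset area (hypothetical record)."""
    interface: str
    is_private: bool


def camera_needed(associations: Iterable[Association], locked: bool) -> bool:
    """While locked, capture images only if some associated interface is non-private."""
    assocs = list(associations)
    if not locked:
        return bool(assocs)
    return any(not a.is_private for a in assocs)


def interface_to_show(gazed: Optional[Association], locked: bool) -> Optional[str]:
    """Return the interface to display, withholding privacy interfaces while locked."""
    if gazed is None:
        return None
    if locked and gazed.is_private:
        return None
    return gazed.interface
```

For example, with a payment interface marked as private and a health code interface marked as non-private, `camera_needed` returns `True` on the interface to be unlocked, but gazing at the payment area there displays nothing until the device is unlocked.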

In combination with the method provided in the first aspect, in some embodiments, a first control is displayed in the second preset area of the first interface, and the first control is used to indicate that the second preset area is associated with the third interface.

By implementing the method provided by the above embodiments, the electronic device can display prompt controls in the preset areas that have associated interfaces while it detects the user's eyeball gaze area. A prompt control can indicate to the user the interface associated with the area, as well as the services or information that the interface can provide. In this way, users can intuitively see whether each area has an associated interface and what services or information each interface can provide. On this basis, the user can decide which preset area to gaze at and which associated interface to open.

Combined with the method provided in the first aspect, in some embodiments, the first control is not displayed in the second preset area of the interface to be unlocked.

By implementing the method provided in the above embodiment, when the interface associated with a certain preset area is a privacy interface, the electronic device does not display the prompt control indicating the privacy interface in the lock screen state, which prevents the user from gazing at the preset area to no effect.

Combined with the method provided in the first aspect, in some embodiments, the first control is any one of the following: a thumbnail of the first interface, an icon of an application corresponding to the first interface, or a function icon indicating the service provided by the first interface.

Combined with the method provided in the first aspect, in some embodiments, the duration for which the electronic device collects images is a first preset duration; the electronic device collecting the first image specifically means that the electronic device collects the first image within the first preset duration.

By implementing the method provided by the above embodiments, the terminal device will not detect the user's eyeball gaze area all the time, but will detect it within a preset period of time to save power consumption and avoid camera abuse from affecting user information security.

Combined with the method provided in the first aspect, in some embodiments, the first preset duration is the first 3 seconds of displaying the first interface.

By implementing the method provided in the above embodiment, the terminal device can detect the user's eyeball gaze area during the first 3 seconds of displaying the first interface and determine whether the user is gazing at a preset area of the screen. This meets user needs in most scenarios while reducing power consumption as much as possible.
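The time-limited detection window can be sketched as follows. The 3-second value comes from the embodiment above; the function name, the use of timestamps in seconds, and the half-open interval are assumptions made for illustration.

```python
# From the embodiment above: images are collected during the first 3 seconds
# of displaying the first interface.
CAPTURE_WINDOW_S = 3.0


def should_capture(now_s: float, interface_shown_at_s: float,
                   window_s: float = CAPTURE_WINDOW_S) -> bool:
    """Return True while the display time is still inside the capture window."""
    elapsed = now_s - interface_shown_at_s
    # Assumed half-open interval: capture from display time up to, but not
    # including, the end of the window.
    return 0.0 <= elapsed < window_s
```

Once `should_capture` returns `False`, a caller would stop the camera and stop gaze recognition until the next qualifying interface is displayed.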

Combined with the method provided in the first aspect, in some embodiments, the electronic device collects the first image through a camera module. The camera module includes at least one 2D camera and at least one 3D camera. The 2D camera is used to acquire a two-dimensional image, and the 3D camera is used to acquire an image including depth information; the first image includes the two-dimensional image and the image including depth information.

By implementing the method provided by the above embodiments, the camera module of the terminal device may include multiple cameras, and the multiple cameras include at least one 2D camera and at least one 3D camera. In this way, the terminal device can obtain two-dimensional images and three-dimensional images indicating the gaze position of the user's eyeballs. Combining two-dimensional and three-dimensional images helps improve the precision and accuracy with which the terminal device identifies the user's eyeball gaze area.

Combined with the method provided in the first aspect, in some embodiments, the first image acquired by the camera module is stored in a secure data buffer. Before determining the user's first eyeball gaze area based on the first image, the method further includes: obtaining the first image from the secure data buffer in a trusted execution environment.

By implementing the method provided by the above embodiment, before the terminal device processes the images collected by the camera module, the terminal device can store the images collected by the camera module in the secure data buffer. The image data stored in the secure data buffer can only be transmitted to the eye gaze recognition algorithm through the secure transmission channel provided by the security service, thereby improving the security of the image data.

Combined with the method provided in the first aspect, in some embodiments, the secure data buffer is set at the hardware layer of the electronic device.

In conjunction with the method provided in the first aspect, in some embodiments, determining the user's first eyeball gaze area based on the first image specifically includes: determining feature data of the first image, where the feature data includes one or more of a left eye image, a right eye image, a face image, and face mesh data; and using an eye gaze recognition model to determine the first eyeball gaze area indicated by the feature data, where the eye gaze recognition model is established based on a convolutional neural network.

By implementing the method provided by the above embodiments, the terminal device can obtain left eye images, right eye images, face images, and face mesh data from the two-dimensional images and three-dimensional images collected by the camera module, so as to extract more features and improve recognition precision and accuracy.

Combined with the method provided in the first aspect, in some embodiments, determining the feature data of the first image specifically includes: performing face correction on the first image to obtain a face-corrected first image; and determining the feature data of the first image based on the face-corrected first image.

By implementing the method provided in the above embodiment, before acquiring the left eye image, right eye image, and face image, the terminal device can perform face correction on the images collected by the camera module to improve the accuracy of the left eye image, right eye image, and face image.

Combined with the method provided in the first aspect, in some embodiments, the first interface is any one of a first desktop, a second desktop, or a negative one screen; the fourth interface is any one of the first desktop, the second desktop, or the negative one screen, and is different from the first interface.

By implementing the method provided in the above embodiment, the terminal device can set screen preset areas and their associated interfaces separately for each main interface, such as the first desktop, the second desktop, and the negative one screen. Different main interfaces can reuse the same screen preset area.

Combined with the method provided in the first aspect, in some embodiments, the association between the first preset area and the second interface and the fifth interface is set by the user.

By implementing the method provided by the above embodiments, the user can set, through an interface provided by the electronic device, the interfaces associated with the different screen preset areas of each main interface to meet their own personalized needs.

In a second aspect, the present application provides an electronic device, which includes one or more processors and one or more memories, where the one or more memories are coupled to the one or more processors and are used to store computer program code. The computer program code includes computer instructions. When the one or more processors execute the computer instructions, the electronic device performs the method described in the first aspect and any possible implementation manner of the first aspect.

In a third aspect, embodiments of the present application provide a chip system, which is applied to an electronic device. The chip system includes one or more processors, and the processors are used to call computer instructions to cause the electronic device to execute the method described in the first aspect and any possible implementation manner of the first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium including instructions. When the instructions are run on an electronic device, the electronic device is caused to execute the method described in the first aspect and any possible implementation manner of the first aspect.

In a fifth aspect, the present application provides a computer program product containing instructions. When the computer program product is run on an electronic device, the electronic device is caused to execute the method described in the first aspect and any possible implementation manner of the first aspect.

It can be understood that the electronic device provided by the second aspect, the chip system provided by the third aspect, the computer storage medium provided by the fourth aspect, and the computer program product provided by the fifth aspect are all used to execute the method provided by this application. Therefore, the beneficial effects it can achieve can be referred to the beneficial effects in the corresponding methods, and will not be described again here.

Description of the drawings

Figure 1 is a schematic diagram of an eyeball gaze position provided by an embodiment of the present application;

Figures 2A-2I are a set of user interfaces provided by embodiments of the present application;

Figures 3A-3E are a set of user interfaces provided by embodiments of the present application;

Figures 4A-4D are a set of user interfaces provided by embodiments of the present application;

Figures 5A-5M are a set of user interfaces provided by embodiments of the present application;

Figures 6A-6I are a set of user interfaces provided by embodiments of the present application;

Figures 7A-7C are a set of user interfaces provided by embodiments of the present application;

Figure 8 is a flow chart of a display method provided by an embodiment of the present application;

Figure 9 is a schematic structural diagram of an eye gaze recognition model provided by an embodiment of the present application;

Figure 10 is a flow chart of a face correction method provided by an embodiment of the present application;

Figures 11A-11C are schematic diagrams of a set of face correction methods provided by embodiments of the present application;

Figure 12 is a structural diagram of a convolutional network of an eye gaze recognition model provided by an embodiment of the present application;

Figure 13 is a schematic diagram of a separable convolution technology provided by an embodiment of the present application;

Figure 14 is a schematic system structure diagram of the terminal 100 provided by the embodiment of the present application;

Figure 15 is a schematic diagram of the hardware structure of the terminal 100 provided by the embodiment of the present application.

Detailed description

The terms used in the following embodiments of the present application are only for the purpose of describing specific embodiments and are not intended to limit the present application.

The embodiment of the present application provides a display method. This method can be applied to terminal devices such as mobile phones and tablet computers. Terminal devices such as mobile phones and tablet computers that implement the above method can be recorded as terminal 100. In subsequent embodiments, the terminal 100 will be used to refer to the above-mentioned terminal devices such as mobile phones and tablet computers.

Not limited to mobile phones and tablet computers, the terminal 100 can also be a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, an artificial intelligence (AI) device, a wearable device, a vehicle-mounted device, a smart home device, and/or smart city equipment. The embodiments of this application do not place special restrictions on the specific type of the above terminal.

In a display method provided by an embodiment of the present application, the terminal 100 can display a shortcut window in the main interface after unlocking. Applications frequently used by users can be displayed in the shortcut window, such as the icon, main interface, or common interface of the application. The above-mentioned common interfaces refer to the pages that users frequently open. After detecting successful unlocking, the terminal 100 may detect the user's eyeball gaze position. When it is detected that the user's eyeball gaze position is within the above-mentioned shortcut window area, the terminal 100 may display the main interface or common interface of the application program displayed in the shortcut window.

The layer where the above shortcut window is located is above the layer of the main interface. Therefore, the content displayed in the shortcut window will not be obscured. The above-mentioned user's eyeball gaze position refers to the position where the user's line of sight focuses on the screen of the terminal 100 when the user gazes at the terminal 100 . As shown in Figure 1, a cursor point S may be displayed on the screen of the terminal 100. When the user looks at the cursor point S, the position where the user's sight focuses on the screen shown in Figure 1 is the cursor point S, that is, the position where the user's eyeballs focus is the cursor point S. The cursor point S can be anywhere on the screen. Figure 1 also shows the shortcut window W. When the user's eyeball gaze position is the cursor point S', the terminal 100 can determine that the user's eyeball gaze position is within the shortcut window area W, that is, the user is looking at the shortcut window.

In some embodiments, shortcut windows can also be divided into privacy and non-privacy categories. Shortcut windows marked as private can only be displayed on the main interface after unlocking succeeds. Non-privacy shortcut windows can also be displayed on the interface to be unlocked before unlocking succeeds. In the interface to be unlocked, when it is detected that the user is looking at a non-privacy application displayed in a shortcut window, the terminal 100 may display the main interface or common interface of that application. Whether a shortcut window is private depends on the privacy requirements of the information displayed in the window.

By implementing the above method, users can quickly open commonly used applications and common interfaces in commonly used applications, thereby saving user operations and improving user convenience. At the same time, the user can control whether the terminal 100 opens the above-mentioned common applications or common interfaces through the eye gaze position, further saving user operations. In particular, when the user's hands are occupied, the user controls the terminal 100 to perform a certain action through the eye gaze position, which provides the user with a new interactive control method and improves the user's experience.

The following is a detailed introduction to user scenarios in which the terminal 100 implements the above interaction method based on eye gaze recognition.

FIG. 2A exemplarily shows a user interface of the terminal 100 in a screen-off state.

When the user is not using the terminal 100, the terminal 100 may be in a screen-off state. As shown in FIG. 2A , when the terminal 100 is in the screen-off state, the display of the terminal 100 sleeps and becomes a black screen, but other devices and programs work normally. In other embodiments, when the user is not using the terminal 100, the terminal 100 may also be in the AOD (Always on Display) state. The screen-off AOD state refers to the state of controlling part of the screen to light up without lighting up the entire mobile phone screen, that is, the state of controlling part of the screen to light up based on the screen-off state.

When detecting a user operation to wake up the mobile phone, the terminal 100 can light up the entire screen and display the interface to be unlocked as shown in Figure 2B. The time and date can be displayed on the interface to be unlocked for the user to view. The terminal 100 detects the user operation of waking up the mobile phone, including but not limited to: the user's operation of picking up the mobile phone, the user's operation of waking up the mobile phone through the voice assistant, etc. This embodiment of the present application does not limit this.

After displaying the interface to be unlocked, the terminal 100 can enable the camera to collect and generate image frames. The image frame may include an image of the user's face. Then, the terminal 100 can use the above image frame to perform facial recognition and determine whether the facial image in the above image frame is the facial image of the owner, that is, determine whether the user performing the unlocking operation is the owner himself.

Referring to FIG. 2B, the terminal 100 may be provided with a camera module 210. The camera module 210 of the terminal 100 includes at least a 2D camera and a 3D camera. A 2D camera is a camera that generates two-dimensional images, such as the cameras commonly used on mobile phones to generate RGB images. A 3D camera is a camera that can generate a three-dimensional image or an image including depth information, such as a TOF camera. Compared with 2D cameras, the images generated by 3D cameras also include depth information, that is, the distance between the object being photographed and the 3D camera. Optionally, the camera module 210 may also include multiple 2D cameras and multiple 3D cameras, which is not limited in the embodiments of the present application.

In this embodiment of the present application, when performing face unlocking verification, the camera used by the terminal 100 may be one of the cameras in the above-mentioned camera module 210 . Typically, this camera is the 3D camera in the camera module 210.

When the facial unlocking is successful, that is, when the collected facial image matches the facial image of the owner, the terminal 100 may display the user interface shown in Figures 2C-2D.

First, the terminal 100 may display the unlocking success interface shown in FIG. 2C. The interface may display an icon 211. The icon 211 can be used to prompt the user that the face unlock is successful. Subsequently, the terminal 100 may display the user interface shown in FIG. 2D. This interface may be called the main interface of the terminal 100 .

It can be understood that the unlocking success interface shown in Figure 2C is optional. After confirming that the unlocking is successful, the terminal 100 may also directly display the main interface shown in Figure 2D.

Not limited to the facial unlocking introduced in FIG. 2C above, the terminal 100 can also adopt password unlocking (graphic password, digital password), fingerprint unlocking and other unlocking methods. After the unlocking is successful, the terminal 100 can also display the main interface shown in Figure 2D.

The main interface may include a notification bar 221, a page indicator 222, a frequently used application icon tray 223, and other application icon trays 224.

Wherein: the notification bar may include one or more signal strength indicators (such as signal strength indicator 221A and signal strength indicator 221B) of mobile communication signals (also known as cellular signals), a wireless fidelity (Wi-Fi) signal strength indicator 221C, a battery status indicator 221D, and a time indicator 221E.

The page indicator 222 may be used to indicate the positional relationship of the currently displayed page to other pages. Generally, the main interface of the terminal 100 may include multiple pages. The interface shown in Figure 2D may be one of the above-mentioned multiple pages. The main interface of the terminal 100 also includes other pages. This other page is not shown in Figure 2D. When detecting the user's left and right sliding operations, the terminal 100 may display the above other pages, that is, switch pages. At this time, the page indicator 222 will also change to different forms to indicate different pages. Subsequent embodiments will be introduced in detail.

The frequently used application icon tray 223 may include multiple common application icons (such as a camera application icon, an address book application icon, a phone application icon, and an information application icon), and the frequently used application icons remain displayed when the page is switched. The above common application icons are optional and are not limited in this embodiment of the present application.

The other application icon tray 224 may include a plurality of general application icons, such as a settings application icon, an application market application icon, a gallery application icon, a browser application icon, etc. General application icons may be distributed in other application icon trays 224 on multiple pages of the main interface. The general application icons displayed in the other application icon tray 224 will be changed accordingly when the page is switched. The icon of an application can be a general application icon or a commonly used application icon. When the above icon is placed in the common application icon tray 223, the above icon is a common application icon; when the above icon is placed in the other application icon tray 224, the above icon is a general application icon.

It can be understood that FIG. 2D only illustrates a main interface or a page of a main interface of the terminal 100, and should not be construed as limiting the embodiments of the present application.

Referring to Figure 2E, while displaying the main interface shown in Figure 2D, that is, after the unlocking is successful, the terminal 100 can also display a shortcut window 225 and a shortcut window 226 on top of the layer of the above-mentioned main interface.

The shortcut window 225 can display a thumbnail of the payment interface. The shortcut window 226 can display a thumbnail of the health code interface. It can be understood that, in order to show more clearly how the terminal 100 displays the main interface and the shortcut windows in separate layers, this display process is shown separately in Figure 2D and Figure 2E of the drawings of this application. However, from the user's perspective, after the unlocking is successful, the interface the user sees is actually the interface shown in Figure 2E (the layer of the main interface and the layer of the shortcut windows are displayed at the same time). It can be understood that the main interface may display more or fewer shortcut windows, such as 3, 4, 1, etc., and the embodiment of the present application does not limit this.

Specifically, a first application program may be installed on the terminal 100. The first application can provide payment services to users. After opening the first application, the terminal 100 may display a payment interface. The payment interface can include payment codes, such as payment QR codes, payment barcodes, etc. Users can complete a payment by showing the above payment interface. Opening the first application refers to setting the first application as the foreground application. As shown in FIG. 2E, before the first application is opened, the shortcut window 225 may display a thumbnail of the payment interface to remind the user of the application and commonly used interface associated with the shortcut window 225.

Similarly, a second application program can also be installed on the terminal 100 . After starting the second application, the terminal 100 may display the health code interface. The health code reflecting the user's health status can be displayed on the health code interface. Users can complete the health check by showing the above health code interface. Similarly, before starting the second application, the shortcut window 226 may display a thumbnail of the above health code interface.

The above payment interface can be called a common interface of the first application. The above health code interface can be called a common interface of the second application.

While displaying the main interface and shortcut window, the terminal 100 can collect the user's facial image through the camera module 210 .

At this time, the terminal 100 uses two cameras: a 2D camera and a 3D camera. Of course, the terminal 100 is not limited to one 2D camera and one 3D camera. The terminal 100 can also use more cameras to obtain more facial features of the user, especially eye features, so that the user's eyeball gaze position can subsequently be determined more quickly and accurately.

In the scenario of using face unlocking, the 3D camera of the terminal 100 is already turned on. Therefore, at this time, the terminal 100 only needs to turn on the 2D camera of the camera module 210. In the scenario of using password unlocking or fingerprint unlocking, the cameras of the terminal 100 are turned off. At this time, the terminal 100 needs to turn on both the 2D camera and the 3D camera in the camera module 210.
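The camera selection described above can be sketched as follows. This is only an illustrative sketch: the names `UnlockMethod` and `cameras_to_turn_on` are hypothetical and do not appear in this application, and the sketch assumes that gaze recognition always uses exactly one 2D camera and one 3D camera.

```python
from enum import Enum
from typing import Set


class UnlockMethod(Enum):
    """Hypothetical labels for the unlocking methods named in the text."""
    FACE = "face"
    PASSWORD = "password"
    FINGERPRINT = "fingerprint"


def cameras_to_turn_on(method: UnlockMethod) -> Set[str]:
    """Return the cameras that still have to be turned on after unlocking."""
    needed = {"2D", "3D"}  # assumption: gaze recognition uses both cameras
    # Face unlocking has already turned on the 3D camera.
    already_on = {"3D"} if method is UnlockMethod.FACE else set()
    return needed - already_on
```

With face unlocking only the 2D camera remains to be turned on; with password or fingerprint unlocking both cameras do.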

The time when the terminal 100 collects the user's facial image through the camera module 210 (2D camera, 3D camera) can be recorded as the gaze recognition time. Preferably, the gaze recognition time is the first 3 seconds after the main interface is displayed after successful unlocking. After 3 seconds, the terminal 100 can turn off the camera module 210 to save power consumption. If the gaze recognition time is set too short, for example, 1 second, the image frames collected by the terminal 100 including the user's facial image may be insufficient, which may lead to inaccurate eye gaze recognition results. On the other hand, it is difficult for users to immediately focus on a shortcut window within 1 second after displaying the main interface. Setting the gaze recognition time too long, such as 7 seconds or 10 seconds, will result in excessive power consumption. Of course, it is not limited to 3 seconds. The gaze recognition time can also be set to other values, such as 2.5 seconds, 3.5 seconds, 4 seconds, etc., which are not limited in the embodiments of the present application. Subsequent introductions will take 3 seconds as an example.

Correspondingly, the terminal 100 may also display the shortcut window only within the gaze recognition time. When the camera module 210 is turned off, that is, when the terminal 100 no longer detects the user's eyeball gaze position, the terminal 100 no longer displays the shortcut window to avoid blocking the main interface for a long time and affecting the user experience.
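As a rough illustration of the timing described above, the following sketch ties both the camera state and the visibility of the shortcut windows to the gaze recognition time. The function names and the returned dictionary are assumptions made for illustration; 3 seconds is the example value used in the text.

```python
GAZE_RECOGNITION_SECONDS = 3.0  # example value from the text; other values are possible


def in_gaze_window(elapsed_s: float, window_s: float = GAZE_RECOGNITION_SECONDS) -> bool:
    """True while the time since the main interface was displayed is within the window."""
    return 0.0 <= elapsed_s < window_s


def ui_state(elapsed_s: float) -> dict:
    """Camera and shortcut windows are active only during the gaze recognition time."""
    active = in_gaze_window(elapsed_s)
    return {"camera_on": active, "shortcut_windows_visible": active}
```

For example, `ui_state(1.0)` reports both the camera and the shortcut windows as active, while `ui_state(3.5)` reports both as inactive.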

During the above gaze recognition time, the camera module 210 may continuously collect and generate image frames including the user's facial image. The above image frames include two-dimensional images collected by the 2D camera and three-dimensional images collected by the 3D camera.

Based on the image frames collected during the above gaze recognition time, the terminal 100 can identify the user's eyeball gaze position and determine whether the user is gazing at the shortcut window 225 or the shortcut window 226 .

As shown in FIG. 2F , the terminal 100 may determine that the user is looking at the shortcut window 225 based on the collected image frames. In response to detecting the user action of the user looking at the shortcut window 225, the terminal 100 may open the first application program and display the payment interface corresponding to the shortcut window 225, see FIG. 2G. As shown in Figure 2G, the payment interface displays a payment QR code 231 and related information for providing payment services to users.

As shown in FIG. 2H , the terminal 100 may also determine that the user is looking at the shortcut window 226 based on the collected image frames. In response to detecting the user action of gazing at the shortcut window 226, the terminal 100 may open the second application and display the health code interface corresponding to the shortcut window 226, see FIG. 2I. As shown in Figure 2I, the health code interface displays the health code 232 required for the health check and its related information, so that the user can quickly complete the health check.
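One possible way to map a recognized eyeball gaze position to a shortcut window is a simple rectangle hit test, sketched below. The screen coordinates, the window rectangles, and the names `ShortcutWindow` and `interface_for_gaze` are hypothetical; this application does not specify how the gaze position is matched to a window.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ShortcutWindow:
    window_id: int   # e.g. 225 or 226
    interface: str   # commonly used interface associated with the window
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        """True if the point (x, y) lies inside this window's rectangle."""
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def interface_for_gaze(windows: Iterable[ShortcutWindow], x: float, y: float) -> Optional[str]:
    """Return the interface to display for a gaze position, or None if no window is gazed at."""
    for window in windows:
        if window.contains(x, y):
            return window.interface
    return None


# Assumed layout on a 1080 x 2400 screen: 225 in the upper right, 226 in the lower left.
WINDOWS = [
    ShortcutWindow(225, "payment", left=720, top=200, width=300, height=400),
    ShortcutWindow(226, "health_code", left=60, top=1700, width=300, height=400),
]
```

A gaze position inside the rectangle of window 225 selects the payment interface, one inside window 226 selects the health code interface, and any other position selects nothing.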

In other embodiments, the terminal 100 can also display different interfaces by detecting the user's gaze on a certain area for different lengths of time. For example, referring to FIG. 2D , after entering the main interface, the terminal 100 may detect whether the user is looking at the upper right corner area of the screen. After detecting that the user is gazing at the upper right corner area for a first period of time, such as 2 seconds, the terminal 100 may display the shortcut window 225 . If it is detected that the user is still looking at the upper right corner area, and after reaching the second duration, for example, 3 seconds, the terminal 100 may switch the shortcut window 225 displayed in the upper right corner area to the shortcut window 226.
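The duration-based switching in this embodiment might be sketched as below. The sketch assumes that both durations are measured from the moment the user starts gazing at the upper right corner area, which the text does not state explicitly; the function name is hypothetical.

```python
from typing import Optional

FIRST_DURATION_S = 2.0   # example value from the text
SECOND_DURATION_S = 3.0  # example value from the text


def window_for_dwell(dwell_s: float) -> Optional[int]:
    """Return the shortcut window to show for a continuous gaze of dwell_s seconds."""
    if dwell_s >= SECOND_DURATION_S:
        return 226  # gaze continued past the second duration: switch to window 226
    if dwell_s >= FIRST_DURATION_S:
        return 225  # gaze reached the first duration: show window 225
    return None     # gaze too short: no shortcut window yet
```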

In the scenario where the shortcut window 225 or the shortcut window 226 is displayed, the terminal 100 can also detect the user's touch operation, blink control operation, or head-turn control operation on the above-mentioned window to determine whether to display the interface corresponding to the shortcut window 225 or the shortcut window 226.

By implementing the above method, the user can immediately obtain the common interface of commonly used applications after opening the terminal 100, thereby quickly obtaining the services and information provided by the common interface, such as the payment service provided by the above-mentioned payment interface, and the health code and related information provided by the health code interface.

On the other hand, the user can control the terminal 100 to display the interface corresponding to a shortcut window by looking at the shortcut window, without performing touch operations such as clicking, double-clicking, or long-pressing. This avoids the problem of being unable to control the terminal 100 when the user's hands are occupied, providing convenience to users.

FIG. 3A exemplarily shows a main interface including multiple pages. Among them, each page can be called the main interface.

As shown in Figure 3A, the main interface may include page 30, page 31, and page 32. Page 30 can be called negative one screen. Page 31 may be called the first desktop. Page 32 may be called the second desktop. The page layout of the second desktop is the same as that of the first desktop, which will not be described again here. The number of desktops in the main interface can be increased or reduced according to the user's settings. Only the first desktop, the second desktop, etc. are shown in FIG. 3A.

In FIG. 2D, the main interface displayed by the terminal 100 is actually the first desktop in the main interface shown in FIG. 3A. In some embodiments, after successful unlocking, the terminal 100 first displays the first desktop. In other embodiments, after successful unlocking, the terminal 100 may display the negative one screen, the first desktop, or the second desktop. Optionally, which of these pages the terminal 100 displays depends on the page that was displayed when the user last exited.

Therefore, after displaying the successful unlocking interface shown in FIG. 2C, the terminal 100 may first display the second desktop or the negative one screen, and display the shortcut window 225 and the shortcut window 226 on top of the layer where the second desktop or the negative one screen is located, see Figures 3B and 3C.

Within the first 3 seconds of displaying the main interface shown in Figure 3B (second desktop) or Figure 3C (negative one screen), the terminal 100 can also collect the user's facial image through the camera module 210 and identify whether the user is looking at the shortcut window 225 or the shortcut window 226. When it is recognized that the user is looking at the shortcut window 225, the terminal 100 may display the payment interface shown in FIG. 2G for the user to obtain the payment service provided by the first application. When it is recognized that the user is looking at the shortcut window 226, the terminal 100 may display the health code interface shown in FIG. 2I for the user to obtain the health code 232 and related information provided by the second application, so that the user can quickly complete the health check.

In this way, no matter which main interface is displayed on the terminal 100 after unlocking, the user can obtain the common interface of commonly used applications, and thereby quickly obtain the services and information provided by the common interface to meet their own needs.

In some embodiments, the terminal 100 may also use smaller icons to replace the above shortcut window.

Referring to FIG. 3D, the terminal 100 may display icons 311 and 312. The icon 311 may correspond to the aforementioned shortcut window 225, and the icon 312 may correspond to the aforementioned shortcut window 226. When detecting that the user is gazing at the icon 311 or the icon 312, the terminal 100 may display a payment interface that provides payment services or a health code interface that displays a health code for the user to use.

The above-mentioned icons 311 and 312 not only serve as prompts, but also reduce the obstruction of the main interface, thereby improving the user experience.

Of course, the terminal 100 may also display icons of application programs installed on the terminal 100, such as the application icon 321 and the application icon 322 shown in FIG. 3E. Generally, the above-mentioned applications are applications frequently used by users. After detecting the user's gaze action, the terminal 100 can open the application program, thereby providing the user with a service to quickly open the above-mentioned application program without requiring the user to perform a touch operation.

Users can choose to enable or disable eye gaze recognition. In a scenario where eye gaze recognition is enabled, after the unlocking is completed, the terminal 100 can collect the user's facial image, identify whether the user is looking at a shortcut window, and then determine whether to display the common interface corresponding to the shortcut window, so that the user can quickly and conveniently obtain the information in commonly used interfaces. Conversely, in a scenario where eye gaze recognition is disabled, the terminal 100 will not recognize whether the user is gazing at a shortcut window, and thus will not display the common interface corresponding to the shortcut window.

FIGS. 4A-4D exemplarily illustrate a set of user interfaces for enabling or disabling the eye gaze recognition function.

FIG. 4A exemplarily shows the setting interface on the terminal 100. Multiple setting options may be displayed on the setting interface, such as an account setting option 411, a WLAN option 412, a Bluetooth option 413, a mobile network option 414, etc. In this embodiment of the present application, the setting interface also includes an auxiliary function option 415. The auxiliary function option 415 can be used to set some shortcut operations.

The terminal 100 may detect a user operation on the auxiliary function option 415. In response to the above operation, the terminal 100 may display the user interface shown in FIG. 4B, which is referred to as the auxiliary function setting interface. The interface may display multiple auxiliary function options, such as an accessibility option 421, a one-handed mode option 422, and so on. In this embodiment of the present application, the auxiliary function setting interface also includes a quick start and gesture option 423. The quick start and gesture option 423 can be used to set some gesture actions and eye gaze actions to control interaction.

The terminal 100 may detect a user operation on the quick start and gesture option 423. In response to the above operation, the terminal 100 may display the user interface shown in FIG. 4C, which is referred to as the quick start and gesture setting interface. The interface can display multiple quick start and gesture setting options, such as a smart voice option 431, a screenshot option 432, a screen recording option 433, and a quick call option 434. In this embodiment of the present application, the quick start and gesture setting interface also includes an eye gaze option 435. The eye gaze option 435 can be used to set the area for eye gaze recognition and the corresponding shortcut operations.

Terminal 100 may detect user operations on eye gaze option 435 . In response to the above operation, the terminal 100 may display the user interface shown in FIG. 4D , which is referred to as the eye gaze recognition setting interface. As shown in Figure 4D, the interface can display multiple function options based on eye gaze recognition, such as payment code option 442 and health code option 443.

The payment code option 442 can be used to turn on or off the function of displaying the payment code under eye gaze control. For example, in the scenario where the payment code option 442 is turned on ("ON"), when the unlocking is successful and the main interface is displayed, the terminal 100 can display the shortcut window 225 associated with the payment interface. At the same time, the terminal 100 can also confirm, from the collected image frames containing the user's facial image, whether the user is looking at the shortcut window 225. When it is detected that the user is looking at the shortcut window 225, the terminal 100 can display the payment interface corresponding to the shortcut window 225, including the payment code. In this way, users can quickly and easily obtain the payment code and complete the payment, thereby avoiding a large number of tedious user operations and obtaining a better user experience.

The health code option 443 can be used to turn on or off the function of displaying the health code under eye gaze control. For example, in the scenario where the health code option 443 is turned on ("ON"), when the unlocking is successful and the main interface is displayed, the terminal 100 can display the shortcut window 226 associated with the health code interface. At the same time, the terminal 100 can also confirm, from the collected image frames containing the user's facial image, whether the user is looking at the shortcut window 226. When detecting the user's action of gazing at the shortcut window 226, the terminal 100 may display a health code interface including the health code and related information. In this way, users can quickly and easily obtain health codes and complete health checks, thereby avoiding a large number of tedious user operations.

The eye gaze recognition setting interface shown in FIG. 4D may also include other shortcut function options based on eye gaze, such as notification bar option 444. When the unlocking is successful and the main interface is displayed, the terminal 100 may detect whether the user is looking at the notification bar area at the top of the screen. When detecting that the user looks at the notification bar, the terminal 100 may display a notification interface for the user to check notification messages.

In some embodiments, users can customize the display area of the shortcut window according to their own usage habits and the layout of the main interface, so as to minimize the impact of the shortcut window on the main interface of the terminal 100.

In some embodiments, the eye gaze recognition setting interface may also be shown in Figure 5A. The terminal 100 may detect a user operation on the payment code option 442, and in response to the above operation, the terminal 100 may display the user interface (payment code setting interface) shown in FIG. 5B.

As shown in FIG. 5B , the interface may include buttons 511 and area selection controls 512 . Button 511 can be used to turn on ("ON") or turn off ("OFF") the function of eye gaze control to display the payment code. The area selection control 512 can be used to set the display area of the payment code shortcut window 225 on the screen.

The area selection control 512 may include a control 5121, a control 5122, a control 5123, and a control 5124. By default, when the function of controlling the display of payment codes by eye gaze is turned on, the payment code shortcut window 225 is displayed in the upper right corner area of the screen, corresponding to the display area shown by control 5122. At this time, the icon 5125 (selected icon) can be displayed in the control 5122, indicating the display area (upper right corner area) of the current payment code shortcut window 225 on the screen.

If a display area has been used to display a shortcut window, the icon 5126 (occupied icon) may be displayed in the control corresponding to the display area. For example, the display area shown in the control 5123 may correspond to the health code shortcut window 226. Therefore, the occupied icon may be displayed in the control 5123, indicating that the area in the lower left corner of the screen corresponding to the control 5123 is occupied and can no longer be used to set the payment code shortcut window 225.

Referring to FIG. 5C, the terminal 100 may detect a user operation on the control 5121. In response to the above operation, the terminal 100 may display a selection icon in the control 5121 to indicate the display area (upper left corner area) of the currently selected shortcut window 225 associated with the payment code on the screen. At this time, referring to Figure 5D, when successful unlocking is detected and the main interface is displayed, the shortcut window 225 corresponding to the payment code may be displayed in the upper left corner area above the layer of the main interface.
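The selected and occupied states of the area selection control can be modeled with a small mapping from display area to shortcut window, as in the following sketch. The area names, the `AreaOccupiedError` exception, and the `assign_area` function are assumptions for illustration only and are not defined in this application.

```python
from typing import Dict

# Assumed names for the four display areas shown by controls 5121 to 5124.
AREAS = ("upper_left", "upper_right", "lower_left", "lower_right")


class AreaOccupiedError(ValueError):
    """Raised when the requested display area already hosts another shortcut window."""


def assign_area(assignments: Dict[str, str], window: str, area: str) -> Dict[str, str]:
    """Return a new area-to-window mapping with `window` moved to `area`."""
    if area not in AREAS:
        raise ValueError(f"unknown display area: {area}")
    owner = assignments.get(area)
    if owner is not None and owner != window:
        # Corresponds to the occupied icon: the area cannot be selected.
        raise AreaOccupiedError(f"{area} is occupied by {owner}")
    # Release the area previously used by this window, then take the new one.
    updated = {a: w for a, w in assignments.items() if w != window}
    updated[area] = window
    return updated
```

Starting from the default layout (payment code in the upper right, health code in the lower left), moving the payment code window to `"upper_left"` succeeds, while moving it to `"lower_left"` raises `AreaOccupiedError`.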

Referring to the setting methods shown in Figures 5A-5C, the terminal 100 can also set the display area of the health code shortcut window 226 according to user operations, which will not be described again here.

As shown in Figure 5E, the eye gaze recognition setting interface may also include a control 445. Control 445 can be used to add more shortcut windows, thereby providing users with more services for quickly opening commonly used applications and/or commonly used interfaces.

As shown in Figure 5E, the terminal 100 may detect a user operation on the control 445. In response to the above operation, the terminal 100 may display the user interface (add shortcut window interface) shown in FIG. 5F. The interface may include multiple shortcut window options, such as option 521, option 522, and so on. Option 521 may be used to set a shortcut window 227 associated with the health detection record. After recognizing the user's action of looking at the shortcut window 227, the terminal 100 may display an interface (third interface) including the user's health detection record. For the health detection record shortcut window, refer to Figure 5G. Option 522 can be used to set a shortcut window associated with the electronic ID card. This shortcut window can be associated with the interface that displays the user's electronic ID card, thereby providing a service for quickly opening that interface, which will not be described again here.

The terminal 100 may detect a user operation on option 521. In response to the above operation, the terminal 100 may display the user interface (health detection record setting interface) shown in FIG. 5H. As shown in Figure 5H, button 531 can be used to enable the shortcut window 227 associated with the health detection record. Page controls (control 5321, control 5322, control 5323) can be used to set the page on which the shortcut window 227 is displayed.

Page control 5321 can be used to indicate the negative one screen of the main interface. Page control 5322 may be used to indicate the first desktop of the main interface. Page control 5323 may be used to indicate the second desktop of the main interface. By default, after the shortcut window 227 is enabled, the shortcut window 227 can be set on the first desktop (when the four display areas of the first desktop are not all occupied). At this time, the page control 5322 may also display a check mark to indicate that the shortcut window 227 is currently set on the first desktop. Further, by default, the shortcut window 227 can be set in the lower right corner area of the first desktop, corresponding to the area selection control 5334.

After the setting is completed, the terminal 100 may detect a user operation on the return control 534, and in response to the above operation, the terminal 100 may display the eye gaze recognition setting interface, see FIG. 5I. At this time, the interface also includes a health detection record option 446, which corresponds to the function of displaying the health detection record under eye gaze control.

In this way, after completing the unlocking and displaying the main interface, within the preset gaze recognition time, the terminal 100 can also display the shortcut window 227 associated with the health detection record on top of the layer of the main interface; refer to the shortcut window 227 in Figure 5J. Based on the image frames, collected during the gaze recognition time, that include the user's facial image, the terminal 100 can identify the user's eyeball gaze position and determine whether the user is gazing at the shortcut window 227. In response to detecting the user's action of gazing at the shortcut window 227, the terminal 100 may display the third interface, corresponding to the shortcut window 227, that displays the health detection record.

It is understandable that the terminal 100 can also change the display page and display area of the above-mentioned shortcut window 227 according to user operations to meet the user's personalized display needs, better fit the user's usage habits, and improve the user's usage experience.

For example, referring to FIG. 5K, the terminal 100 may detect a user operation acting on the page control 5323. At this time, the "display area" corresponds to four display areas including the upper left corner and the upper right corner of the second desktop. The terminal 100 may detect a user operation on the area selection control 5333. At this time, the terminal 100 may determine to display the shortcut window 227 in the lower left corner area of the second desktop.

Referring to Figure 5L, after completing the unlocking and displaying the first desktop, within the preset gaze recognition time, the terminal 100 can display the payment code shortcut window 225 and the health code shortcut window 226 on top of the layer of the first desktop. Referring to Figure 5M, after completing the unlocking and displaying the second desktop, within the preset gaze recognition time, the terminal 100 can also display the shortcut window 227 on top of the layer of the second desktop.

It can be understood that when the enabled shortcut windows are set on different pages of the main interface, the terminal 100 can display the corresponding shortcut windows belonging to the page according to the page displayed after unlocking.
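The per-page selection described above can be illustrated with a simple filter over the shortcut window settings. The settings structure, the page names, and the function `windows_for_page` are hypothetical and serve only as a sketch of the behavior.

```python
from typing import Dict, List

# Assumed settings mirroring the example: windows 225 and 226 on the first
# desktop, window 227 on the second desktop.
SETTINGS: Dict[str, dict] = {
    "payment_code": {"enabled": True, "page": "first_desktop", "area": "upper_right"},
    "health_code": {"enabled": True, "page": "first_desktop", "area": "lower_left"},
    "health_record": {"enabled": True, "page": "second_desktop", "area": "lower_left"},
}


def windows_for_page(settings: Dict[str, dict], page: str) -> List[str]:
    """Return the enabled shortcut windows that belong to the page shown after unlocking."""
    return [name for name, cfg in settings.items()
            if cfg["enabled"] and cfg["page"] == page]
```

With these settings, the first desktop shows the payment code and health code shortcut windows, the second desktop shows the health detection record shortcut window, and the negative one screen shows none.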

In some embodiments, the terminal 100 can also set the privacy type (private and non-private) of various shortcut windows. For non-private shortcut windows, the terminal 100 can also display them on the interface to be unlocked.

The terminal 100 can detect the gaze position of the user's eyeballs on the interface to be unlocked, and determine whether the user is gazing at the non-private shortcut window. When detecting that the user looks at the non-private shortcut window, the terminal 100 may display a commonly used interface corresponding to the shortcut window. In this way, the user does not need to complete the unlocking operation, thereby further saving user operations and allowing the user to obtain commonly used applications and/or commonly used interfaces more quickly.

Referring to FIG. 6A, the payment code setting interface may also include a button 611. Button 611 can be used to set the privacy type of the shortcut window 225 associated with the payment code. Button 611 is turned on ("ON") to indicate that payment code shortcut window 225 is private. Conversely, button 611 being turned off ("OFF") may indicate that shortcut window 225 is non-private. As shown in Figure 6A, shortcut window 225 can be set to be private.

Referring to the above process, the shortcut window 226 associated with the health code can also be set to be private or non-private. As shown in Figure 6B, button 612 is turned off ("OFF"), which means that the shortcut window 226 is set to be non-private. Referring to Figure 6C, in the eye gaze recognition setting interface, the option corresponding to a private shortcut window may be accompanied by a security display label 613 to remind the user that the shortcut window is private and will not be displayed on the screen before unlocking.

As shown in FIG. 6D and FIG. 6E , when displaying the interface to be unlocked, the terminal 100 may display the non-private health code shortcut window 226 on top of the layer of the interface to be unlocked. When displaying the health code shortcut window 226, the terminal 100 may collect the user's facial image. Referring to FIG. 6F , the terminal 100 may recognize that the user is looking at the health code shortcut window 226 based on the collected image frame including the user's facial image. In response to the user's action of gazing at the health code shortcut window 226, the terminal 100 may display a health code interface corresponding to the health code shortcut window 226 that displays the health code, see FIG. 6G.

For private shortcut windows, such as the payment code shortcut window 225, the terminal 100 will not display the shortcut window on the interface to be unlocked, to avoid leakage of the payment code.

When the terminal 100 is provided with both a private shortcut window and a non-private shortcut window, the terminal 100 can turn on the camera to identify the user's eyeball gaze position on the interface to be unlocked. When the user's eyeball gaze position is within the non-private shortcut window, the terminal 100 may display the corresponding commonly used interface. When the user's eyeball gaze position is within the area of the private shortcut window, the terminal 100 may not display the corresponding commonly used interface.

When only a private shortcut window is provided in the terminal 100, in the interface to be unlocked, the terminal 100 does not need to turn on the camera to collect the user's facial image and identify the user's eyeball gaze position.

Referring to Figure 6H and Figure 6I, if the terminal 100 completes the unlocking operation first, the terminal 100 can display the main interface. After displaying the main interface, the terminal 100 can display both the non-private health code shortcut window 226 and the private payment code shortcut window 225. That is to say, the terminal 100 displays a private shortcut window only after being unlocked, whereas it can display a non-private shortcut window either before or after unlocking, providing users with a more convenient service of controlling the display of commonly used applications and/or commonly used interfaces.
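The privacy rules above can be summarized in a short sketch: before unlocking only non-private shortcut windows are shown, after unlocking all of them are shown, and the camera needs to be turned on at the interface to be unlocked only if at least one non-private shortcut window exists. The `(name, is_private)` tuple representation and the function names are assumptions for illustration.

```python
from typing import List, Tuple

Window = Tuple[str, bool]  # (name, is_private); hypothetical representation


def visible_windows(windows: List[Window], unlocked: bool) -> List[str]:
    """Private windows are shown only after unlocking; non-private ones are always shown."""
    return [name for name, is_private in windows if unlocked or not is_private]


def camera_needed_before_unlock(windows: List[Window]) -> bool:
    """The camera is needed at the interface to be unlocked only if a non-private window exists."""
    return any(not is_private for _, is_private in windows)
```

For the example in the text, where the payment code window is private and the health code window is non-private, only the health code window is visible before unlocking, and both are visible afterwards.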

In some embodiments, the terminal 100 can also set the display times of various shortcut windows. After the above display times are exceeded, the terminal 100 may not display the shortcut window, but the terminal 100 may still recognize the user's eyeball gaze position and provide services for quickly displaying applications and/or commonly used interfaces.

Specifically, referring to Figure 7A, the health code setting interface may also include a control 711. The control 711 can be used to set the number of times the shortcut window is displayed. For example, the "100 times" displayed in the control 711 can mean that, for the first 100 times the function of displaying the health code under eye gaze control is used, the terminal 100 displays the shortcut window 226 corresponding to the health code to prompt the user.

As shown in Figure 7B, after 100 times, the terminal 100 may not display the shortcut window 226 corresponding to the health code (the dotted line box in Figure 7B represents the area where the user's eyeball gaze is located, and the above dotted line box is not displayed on the screen).

During the gaze recognition time, although the terminal 100 no longer displays the shortcut window 226 corresponding to the health code, the terminal 100 can still collect the user's facial image. If it is detected that the user looks at the lower left corner area of the first desktop, the terminal 100 can still display the corresponding health code interface displaying the health code, see FIG. 7C , so that the user can use the above health code interface to complete the health code verification.

In this way, after using the eye gaze function for a long time, the user becomes familiar with which area of the main interface corresponds to which common interface, and no longer needs the terminal 100 to display the shortcut window corresponding to the common interface in that area. At this time, the terminal 100 may not display the above shortcut window, thereby reducing the shortcut window's obstruction of the main interface and improving the user experience.
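A minimal sketch of the display-count behavior follows, assuming that the count increases once per gaze recognition session; the text does not specify exactly what is counted. The class `GazeArea` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GazeArea:
    """A gaze area whose shortcut window is drawn only for the first max_displays sessions."""
    interface: str
    max_displays: int = 100  # example value shown in control 711
    sessions: int = 0

    def start_session(self) -> bool:
        """Begin a gaze recognition session; return True if the shortcut window is drawn."""
        show_window = self.sessions < self.max_displays
        self.sessions += 1
        return show_window

    def on_gaze(self) -> str:
        """Gazing at the area opens the interface whether or not the window was drawn."""
        return self.interface
```

With `max_displays=2`, the window is drawn in the first two sessions and hidden from the third one on, while `on_gaze()` keeps returning the associated interface.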

Figure 8 exemplarily shows a flow chart of a display method provided by an embodiment of the present application. The following is a detailed introduction to the process of the terminal 100 implementing the above display method with reference to FIG. 8 and the user interface introduced previously.

S101. The terminal 100 detects that the trigger condition for turning on eye gaze recognition is met.

Turning on the camera for a long time to collect the user's facial image and identify the user's eyeball gaze position takes up a lot of resources of the terminal 100 (camera equipment resources and computing resources), and also greatly increases the power consumption of the terminal 100. At the same time, considering privacy and security, the camera of the terminal 100 will not be turned on all the time.

Therefore, the terminal 100 may be preset with some scenarios for enabling eye gaze recognition. When it is detected that the terminal 100 is in the above scene, the terminal 100 will turn on the camera to collect the user's facial image. When the above scene ends, the terminal 100 can turn off the camera and stop collecting the user's facial image to avoid occupying camera resources, save power consumption, and protect user privacy.

Developers can determine, through advance analysis of user habits, the scenarios in which eye gaze recognition needs to be turned on. Typically, such a scenario is one in which the user picks up the mobile phone, or has just unlocked the mobile phone. At this time, the terminal 100 can provide the user with a service for quickly launching a certain application (a commonly used application), so as to save user operations and improve the user experience. Furthermore, the terminal 100 can let the user launch the above applications through eye gaze control, thereby avoiding the inconvenience of performing touch operations when the user's hands are occupied, and further improving the user experience.

Therefore, the above scenarios include but are not limited to: the scenario of lighting up the mobile phone screen and displaying the interface to be unlocked, and the scenario of displaying the main interface (including the first desktop, second desktop, negative screen, etc.) after unlocking.

Corresponding to the above scenario of turning on eye gaze recognition, the triggering conditions for turning on eye gaze recognition include: detecting a user operation to wake up the phone, detecting a user operation to complete unlocking and display the main interface. Among them, the user operation of waking up the mobile phone includes but is not limited to the operation of the user picking up the mobile phone, the operation of the user waking up the mobile phone through the voice assistant, etc.

Referring to the user interfaces shown in Figures 2C-2E, after detecting the completion of the unlocking operation, the terminal 100 can display the main interface shown in Figure 2D. At the same time, the terminal 100 can also display, on a layer above the main interface, the shortcut windows 225 and 226 associated with commonly used applications or with commonly used interfaces in those applications. The above operation of instructing the terminal 100 to display the user interfaces shown in FIGS. 2C to 2E may be referred to as a user operation of completing unlocking and displaying the main interface. At this time, the terminal 100 can turn on the camera to collect the user's facial image, identify the user's eyeball gaze position, and then determine whether the user is looking at the above-mentioned shortcut windows.

Referring to the user interfaces shown in Figures 6D-6E, when it is detected that the terminal 100 is awakened but unlocking is not completed, the terminal 100 can also display, on a layer above the interface to be unlocked, the shortcut windows associated with commonly used applications or with commonly used interfaces in those applications. At this time, the terminal 100 can also turn on the camera to collect the user's facial image, identify the user's eyeball gaze position, and then determine whether the user is looking at the above-mentioned shortcut windows.

S102. The terminal 100 turns on the camera module 210 to collect the user's facial image.

After detecting a user operation to wake up the mobile phone, or detecting a user operation to complete unlocking and display the main interface, the terminal 100 may determine to turn on the eye gaze recognition function.

On the one hand, the terminal 100 may display a shortcut window to prompt the user to open commonly used applications and common interfaces associated with the shortcut window by looking at the shortcut window. On the other hand, the terminal 100 can turn on the camera module 210 and collect the user's facial image to identify whether and which shortcut window the user is looking at.

Referring to the introduction in FIG. 2B, the camera module 210 of the terminal 100 includes at least a 2D camera and a 3D camera. The 2D camera can be used to capture and generate two-dimensional images. The 3D camera can be used to capture and generate three-dimensional images containing depth information. In this way, the terminal 100 can obtain a two-dimensional image and a three-dimensional image of the user's face at the same time. Combining the two-dimensional image and the three-dimensional image, the terminal 100 can obtain richer facial features, especially eye features, so as to identify the user's eyeball gaze position more accurately, and to determine more accurately whether the user is looking at a shortcut window and which shortcut window the user is looking at.

Referring to the introduction in S101, the terminal 100 will not always turn on the camera. Therefore, after turning on the camera module 210, the terminal 100 needs to set a time to turn off the camera module 210.

The terminal 100 can set a gaze recognition time. The gaze recognition time starts at the moment the terminal 100 detects the trigger condition described in S101. The moment at which the gaze recognition time ends depends on its duration. The duration is preset, for example 2.5 seconds, 3 seconds, 3.5 seconds, or 4 seconds, as introduced in Figure 2F. Among these, 3 seconds is the preferred duration of the gaze recognition time. When the gaze recognition time expires, the terminal 100 can turn off the camera module 210, that is, it no longer recognizes the user's eyeball gaze position.

Correspondingly, after the gaze recognition time expires, the terminal 100 may no longer display the shortcut window to avoid blocking the main interface for a long time and affecting the user experience.
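The relationship between the trigger condition, the gaze recognition time, and the camera state can be sketched as follows. This is an illustrative sketch under stated assumptions: the class `GazeWindow` and its method names are hypothetical, time is passed in explicitly as seconds, and the default of 3 seconds is the preferred duration named above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GazeWindow:
    """Hypothetical timer for the gaze recognition time."""
    duration_s: float = 3.0             # preferred duration named in the text
    started_at: Optional[float] = None  # moment the trigger condition was detected

    def start(self, now: float) -> None:
        """Called when the trigger condition of S101 is detected."""
        self.started_at = now

    def camera_should_be_on(self, now: float) -> bool:
        """True only while the gaze recognition time has not expired."""
        if self.started_at is None:
            return False
        return (now - self.started_at) < self.duration_s


window = GazeWindow()
window.start(now=10.0)
# Within the 3-second window the camera (and the shortcut window) stay active;
# once the window expires both are assumed to be turned off.
```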

S103. The terminal 100 determines the user's eyeball gaze position based on the collected image frames including the user's facial image.

The image frames collected and generated by the camera module 210 during the gaze recognition time may be called target input images. The terminal 100 can identify the user's eyeball gaze position using the above-mentioned target input image. Referring to the introduction of FIG. 1 , when the user looks at the terminal 100 , the position where the user's line of sight focuses on the screen of the terminal 100 may be called the eyeball gaze position.

Specifically, after acquiring the target input image, the terminal 100 may input the above image into the eye gaze recognition model. The eye gaze recognition model is a model preset in the terminal 100. The eye gaze recognition model can use an image frame containing the user's facial image to determine the user's eye gaze position; refer to the cursor point S shown in Figure 1. The eye gaze recognition model can output the position coordinates of the eye gaze position on the screen. The structure of the eye gaze recognition model used in this application is introduced in detail with reference to Figure 9 below and is not expanded upon here.

After obtaining the position coordinates of the eye gaze position, the terminal 100 can determine whether and which shortcut window the user is watching based on the above position coordinates, and then determine whether to open commonly used applications and common interfaces associated with the above shortcut window.

Optionally, the eye gaze recognition model can also output the user's eye gaze area. An eye-gaze area can be contracted into an eye-gaze position, and an eye-gaze position can also be expanded into an eye-gaze area. In some examples, a cursor point formed by one display unit on the screen can be called an eye gaze position, and correspondingly, a cursor point or a cursor area formed by multiple display units on the screen can be called an eye gaze area.

After an eye gaze area is output, the terminal 100 can determine whether and which shortcut window the user is looking at by judging the position of the eye gaze area on the screen, and then determine whether to open the commonly used applications and commonly used interfaces associated with the above shortcut window.
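The conversion between an eye gaze position and an eye gaze area can be illustrated with a short sketch. The rectangle representation, the `radius` parameter, and the use of overlap area to judge which window is being looked at are all assumptions made for illustration; the text above does not specify them.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels


def expand(position: Tuple[float, float], radius: float) -> Rect:
    """Expand a gaze position into a square gaze area (assumed shape)."""
    x, y = position
    return (x - radius, y - radius, x + radius, y + radius)


def contract(area: Rect) -> Tuple[float, float]:
    """Contract a gaze area into a gaze position (its centre)."""
    left, top, right, bottom = area
    return ((left + right) / 2.0, (top + bottom) / 2.0)


def overlap(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles (0.0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


gaze_area = expand((100, 200), radius=20)
gaze_point = contract(gaze_area)
```

With an area rather than a point, one possible rule is to select the shortcut window whose rectangle has the largest overlap with the gaze area.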

S104. The terminal 100 determines whether the user is gazing at the shortcut window based on the position coordinates of the eye gaze position and the current interface, and further determines whether to display commonly used applications and common interfaces associated with the above shortcut window.

After determining the position coordinates of the user's eyeball gaze position, combined with the current interface of the terminal 100, the terminal 100 can determine whether the user is gazing at the shortcut window on the current interface.

Referring to FIG. 2F, when the terminal 100 displays the interface shown in FIG. 2F, the interface may be called the current interface of the terminal 100. At this time, based on the collected image frames including the user's facial image, the terminal 100 can determine the position coordinates of the user's eyeball gaze position. Therefore, the terminal 100 can determine the area or control corresponding to the eye gaze position according to the position coordinates.

When the eyeball gaze position is within the shortcut window 225, the terminal 100 can determine that the user is looking at the shortcut window 225; when the eyeball gaze position is within the shortcut window 226, the terminal 100 can determine that the user is looking at the shortcut window 226. In some embodiments, the terminal 100 may also determine that the user's eyeball gaze position corresponds to a certain application icon in the common application icon tray 223 or other application icon trays 224, such as the "Gallery" application and so on. The above-mentioned eye gaze position can also be in a blank area on the screen, which does not correspond to the icons or controls in the main interface, nor to the shortcut window described in this application.

Referring to FIGS. 2F to 2G, when it is determined that the user is looking at the shortcut window 225, the terminal 100 may display the payment interface corresponding to the shortcut window 225. The payment interface is a commonly used interface set by the user. Referring to FIGS. 2H to 2I, when it is determined that the user is looking at the shortcut window 226, the terminal 100 may display the health code interface corresponding to the shortcut window 226, which is used to display the health code. The health code interface is also a commonly used interface set by the user.

In some embodiments, when it is determined that the user is looking at an application icon in the frequently used application icon tray 223 or other application icon trays 224, the terminal 100 may open the application corresponding to the application icon. For example, referring to FIG. 2F, when it is determined that the user is looking at the "Gallery" application icon, the terminal 100 may display the homepage of the "Gallery".

Referring to Figure 3E, the terminal 100 can also display icons of commonly used applications (application icons 321 and 322). When it is determined that the user is looking at the application icon 321 or the application icon 322, the terminal 100 can open the commonly used application corresponding to the application icon 321 or the application icon 322, for example by displaying the home page of that application.

When the eye gaze position is in a blank area of the screen, the terminal 100 may take no action until the gaze recognition time ends, and then turn off the eye gaze recognition function.
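The decision made in S104 amounts to a hit test of the gaze coordinates against the controls of the current interface. A minimal sketch follows; the control names and all rectangle coordinates are made up for illustration and do not come from the figures.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom) in screen pixels

# Hypothetical layout of the current interface (coordinates are assumptions).
CONTROLS: Dict[str, Rect] = {
    "shortcut_window_226": (40, 1800, 500, 2100),
    "shortcut_window_225": (580, 1800, 1040, 2100),
    "gallery_icon": (60, 400, 260, 600),
}


def control_at(gaze: Tuple[float, float],
               controls: Dict[str, Rect] = CONTROLS) -> Optional[str]:
    """Return the control containing the gaze position, or None for a blank area."""
    x, y = gaze
    for name, (left, top, right, bottom) in controls.items():
        if left <= x < right and top <= y < bottom:
            return name
    return None  # blank area: no action until the gaze recognition time ends


target = control_at((100, 1900))
```

A result of `None` corresponds to the blank-area case, in which no interface is opened.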

In some embodiments, referring to FIGS. 7A-7C , the terminal 100 may not display the shortcut window or icon when identifying the user's eyeball gaze position. However, the terminal 100 can still determine the specific area to which the eyeball gaze position belongs based on the position coordinates of the user's eyeball gaze position. The above-mentioned specific areas are preset, such as the upper left corner area, the upper right corner area, the lower left corner area, the lower right corner area, etc. shown in FIG. 7A. Furthermore, based on the common applications and common interfaces associated with the specific area, the terminal 100 can determine which application to open and which interface to display.

For example, referring to FIG. 7B , the terminal 100 can recognize that the user's eyeball gaze position is in the lower left corner area of the screen. Therefore, the terminal 100 can display a health code interface associated with the lower left corner area for displaying the health code. Refer to FIG. 7C .
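When no shortcut window is displayed, the same decision can be made against preset screen regions. The sketch below assumes that each corner region spans a quarter of the screen width and height; this size, the function name `corner_region`, and the mapping table are illustrative assumptions.

```python
from typing import Optional


def corner_region(x: float, y: float, width: int, height: int,
                  fraction: float = 0.25) -> Optional[str]:
    """Return the preset corner region containing (x, y), or None."""
    w, h = width * fraction, height * fraction
    horiz = "left" if x < w else "right" if x >= width - w else None
    vert = "upper" if y < h else "lower" if y >= height - h else None
    if horiz is None or vert is None:
        return None
    return f"{vert}_{horiz}"


# Hypothetical association between a region and a commonly used interface.
REGION_TO_INTERFACE = {"lower_left": "health_code_interface"}

region = corner_region(50, 2300, width=1080, height=2400)
interface = REGION_TO_INTERFACE.get(region)
```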

Figure 9 exemplarily shows the structure of the eye gaze recognition model. The eye gaze recognition model used in the embodiment of the present application will be introduced in detail below with reference to Figure 9 . In the embodiment of this application, the eye gaze recognition model is established based on convolutional neural networks (Convolutional Neural Networks, CNN).

As shown in Figure 9, the eye gaze recognition model may include: a face correction module, a dimensionality reduction module, and a convolutional network module.

(1) Face correction module.

The image frames collected by the camera module 210 and including the user's facial image may first be input into the face correction module. The face correction module can be used to identify whether the facial image in the input image frame is straight. For image frames in which the facial image is not straight (such as head tilt), the face correction module can correct the image frame to make it straight, thereby avoiding subsequent impact on the eye gaze recognition effect.

Figure 10 shows the processing flow of the face correction module performing face correction on the image frames collected by the camera module 210.

S201: Use the facial key point recognition algorithm to determine the facial key points in the image frame T1.

In the embodiment of this application, the key points of the human face include the left eye, the right eye, the nose, the left lip corner, and the right lip corner. The face key point recognition algorithm is existing, such as the Kinect-based face key point recognition algorithm, etc., which will not be described again here.

Referring to FIG. 11A, FIG. 11A exemplarily shows an image frame including a user's facial image, which is denoted as image frame T1. The face correction module can use the face key point recognition algorithm to determine the key points of the face in the image frame T1: left eye a, right eye b, nose c, left lip corner d, and right lip corner e, and determine the coordinate position of each key point; refer to image frame T1 in Figure 11B.

S202: Use the face key points to determine the calibrated line of the image frame T1, and then determine the face deflection angle θ of the image frame T1.

In an upright facial image, the left and right eyes are on the same horizontal line, so the straight line connecting the key point of the left eye and the key point of the right eye (the calibrated line) is parallel to the horizontal line, that is, the face deflection angle θ (the angle formed by the calibrated line and the horizontal line) is 0.

As shown in FIG. 11B , the face correction module can use the recognized coordinate positions of the left eye a and the right eye b to determine the calibrated line L1. Then, based on L1 and the horizontal line, the face correction module can determine the face deflection angle θ of the facial image in the image frame T1.
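The computation of the face deflection angle θ from the two eye key points can be sketched as follows. The sketch assumes image coordinates with x increasing to the right and y increasing downward, and assumes that left eye a has the smaller x coordinate; the function name is hypothetical.

```python
import math
from typing import Tuple


def deflection_angle_deg(left_eye: Tuple[float, float],
                         right_eye: Tuple[float, float]) -> float:
    """Angle between the calibrated line (eye to eye) and the horizontal, in degrees."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


theta_upright = deflection_angle_deg((100, 200), (200, 200))
theta_tilted = deflection_angle_deg((100, 200), (200, 300))
```

Since measured key points carry noise, a practical implementation would probably compare θ against a small tolerance rather than exactly 0; that tolerance is not part of the description above.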

S203: If θ=0°, it is determined that the facial image in the image frame T1 is straight and no correction is needed.

S204: If θ≠0°, it is determined that the facial image in the image frame T1 is not straight. Further, rotation correction is performed on the image frame T1 to obtain an image frame with a straight facial image.

In FIG. 11B , θ≠0, that is, the facial image in the image frame T1 is not straight. At this time, the face correction module can correct the image frame T1 to make the face in the image frame straight.

Specifically, the face correction module can first use the coordinate positions of the left eye a and the right eye b to determine the rotation center point y, and then, taking point y as the rotation center, rotate the image frame T1 by θ° to obtain an image frame with a straight facial image, which is recorded as image frame T2. As shown in Figure 11B, point A can represent the position of the left eye a after rotation, point B the position of the right eye b after rotation, point C the position of the nose c after rotation, point D the position of the left lip corner d after rotation, and point E the position of the right lip corner e after rotation.

It can be understood that when the image frame T1 is rotated, every pixel in the image frame is rotated. The above points A, B, C, D, and E only illustrate the rotation of the key points in the image, and do not mean that only the key points of the face are rotated.
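The rotation correction can be illustrated on the key points alone. The sketch assumes that the rotation center y is the midpoint between the two eyes (the text only says it is determined from the eye coordinates) and rotates by the negative of the deflection angle so that the calibrated line becomes horizontal; in a real image every pixel would be transformed in the same way.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotate_point(p: Point, center: Point, angle_rad: float) -> Point:
    """Rotate p about center by angle_rad using the standard 2D rotation matrix."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    return (center[0] + dx * cos_a - dy * sin_a,
            center[1] + dx * sin_a + dy * cos_a)


def straighten(points: List[Point], left_eye: Point, right_eye: Point) -> List[Point]:
    """Rotate all points so that the line through the eyes becomes horizontal."""
    theta = math.atan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])
    center = ((left_eye[0] + right_eye[0]) / 2.0,   # assumed rotation center y
              (left_eye[1] + right_eye[1]) / 2.0)
    return [rotate_point(p, center, -theta) for p in points]


a, b = (100.0, 200.0), (200.0, 300.0)   # hypothetical eye key points
A, B = straighten([a, b], a, b)
```

After the call, A and B share the same vertical coordinate and keep their original distance, which is what S204 requires of the corrected frame T2.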

S205: Process the image frame in which the facial image has been corrected to obtain a left eye image, a right eye image, a facial image and face grid data. Among them, the face grid data can be used to reflect the position of the face image in the entire image.

Specifically, the face correction module can center on the key points of the face and crop the corrected image frame according to the preset size, thereby obtaining the left eye image, right eye image, and face image corresponding to the image. In determining the face image, the face correction module may determine face mesh data.

Referring to Figure 11C, the face correction module can determine a rectangle of fixed size with the left eye A as the center. The image covered by this rectangle is the left eye image. In the same way, the face correction module can determine the right eye image with the right eye B as the center, and the face image with the nose C as the center. Among them, the size of the left eye image and the right eye image are the same, and the size of the face image and the left eye image are different. After determining the face image, the face correction module can correspondingly obtain the face grid data, that is, the position of the face image in the entire image.

After completing the face correction, the terminal 100 can obtain the corrected image frame of the facial image, and obtain the corresponding left eye image, right eye image, facial image and face mesh data from the above image frame.
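The cropping in S205 can be sketched as follows. The crop sizes, the clamping of the box to the image bounds, and the representation of the face grid data as a coarse binary grid are all assumptions; the text only states that the crops have preset sizes and that the face grid data reflects the position of the face image in the entire image.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def crop_box(center: Tuple[float, float], size: Tuple[int, int],
             image_size: Tuple[int, int]) -> Box:
    """Fixed-size box centred on a key point, shifted to stay inside the image."""
    (cx, cy), (w, h), (img_w, img_h) = center, size, image_size
    left = min(max(int(round(cx - w / 2)), 0), img_w - w)
    top = min(max(int(round(cy - h / 2)), 0), img_h - h)
    return (left, top, left + w, top + h)


def face_grid(face_box: Box, image_size: Tuple[int, int], cells: int = 5) -> List[List[int]]:
    """Binary grid: 1 where a cell centre lies inside the face box (assumed encoding)."""
    left, top, right, bottom = face_box
    img_w, img_h = image_size
    grid = []
    for row in range(cells):
        cy = (row + 0.5) * img_h / cells
        grid.append([
            1 if left <= (col + 0.5) * img_w / cells < right and top <= cy < bottom else 0
            for col in range(cells)
        ])
    return grid


image_size = (100, 100)
face_box = crop_box((50, 50), (40, 40), image_size)      # centred on nose C
left_eye_box = crop_box((35, 40), (20, 20), image_size)  # centred on left eye A
grid = face_grid(face_box, image_size)
```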

(2) Dimensionality reduction module.

The face correction module can input the left eye image, right eye image, facial image and face mesh data output by itself into the dimensionality reduction module. The dimensionality reduction module can be used to reduce the dimensionality of the input left eye image, right eye image, facial image and face grid data to reduce the computational complexity of the convolutional network module and improve the speed of eye gaze recognition. The dimensionality reduction methods used by the dimensionality reduction module include but are not limited to principal component analysis (PCA), downsampling, 1*1 convolution kernel, etc.
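Of the dimensionality reduction methods listed above, downsampling is the simplest to illustrate. The sketch below uses block averaging with a factor of 2; the averaging rule and the factor are assumptions, since the text does not specify how downsampling is performed.

```python
from typing import List


def downsample(image: List[List[float]], factor: int = 2) -> List[List[float]]:
    """Reduce resolution by averaging each factor*factor block of pixels."""
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    result = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
small = downsample(image)  # 4*4 input becomes a 2*2 output
```

Halving each side reduces the number of values passed to the convolutional network module by a factor of four.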

(3) Convolutional network module.

Each dimensionality-reduced input (left eye image, right eye image, face image and face mesh data) can be input to the convolutional network module. The convolutional network module can output the eye gaze position based on the above inputs. In this embodiment of the present application, the structure of the convolutional network in the convolutional network module is shown in Figure 12.

As shown in Figure 12, the convolutional network may include convolution group 1 (CONV1), convolution group 2 (CONV2), and convolution group 3 (CONV3). A convolution group includes: a convolution kernel (Convolution), the activation function PRelu, a pooling kernel (Pooling), and a local response normalization layer (Local Response Normalization, LRN). Among them, the convolution kernel of CONV1 is a 7*7 matrix, and the pooling kernel is a 3*3 matrix; the convolution kernel of CONV2 is a 5*5 matrix, and the pooling kernel is a 3*3 matrix; the convolution kernel of CONV3 is a 3*3 matrix, and the pooling kernel is a 2*2 matrix.

Among them, separable convolution technology can be used to reduce the storage requirements of the convolution kernels (Convolution) and the pooling kernels (Pooling), thereby reducing the overall model's demand for storage space and allowing the model to be deployed on terminal devices.

Specifically, separable convolution technology refers to decomposing an n*n matrix into an n*1 column matrix and a 1*n row matrix for storage, thereby reducing the demand for storage space. Therefore, the eye gaze recognition model used in this application has the advantages of small size and easy deployment, and is suitable for deployment on terminal electronic devices such as mobile phones.

Specifically, referring to Figure 13, matrix A can represent a 3*3 convolution kernel. Assuming that matrix A is stored directly, matrix A needs to occupy 9 storage units. Matrix A can be split into column matrix A1 and row matrix A2 (column matrix A1 × row matrix A2 = matrix A). The column matrix A1 and the row matrix A2 only require 6 storage units.
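The storage saving can be checked with a small numerical sketch. The values of A1 and A2 below are hypothetical (Figure 13 is not reproduced here). Only a matrix that equals the product of one column matrix and one row matrix can be split exactly in this way.

```python
from typing import List


def outer(column: List[int], row: List[int]) -> List[List[int]]:
    """Rebuild the n*n kernel from an n*1 column matrix and a 1*n row matrix."""
    return [[c * r for r in row] for c in column]


def apply_full(kernel: List[List[int]], patch: List[List[int]]) -> int:
    """Apply the full kernel to one patch (sum of element-wise products)."""
    return sum(kernel[i][j] * patch[i][j]
               for i in range(len(kernel)) for j in range(len(kernel[0])))


def apply_separable(column: List[int], row: List[int], patch: List[List[int]]) -> int:
    """Same result using a row pass followed by a column pass."""
    row_pass = [sum(r * v for r, v in zip(row, line)) for line in patch]
    return sum(c * v for c, v in zip(column, row_pass))


A1 = [1, 2, 1]     # column matrix, 3 storage units (hypothetical values)
A2 = [1, 0, -1]    # row matrix, 3 storage units (hypothetical values)
A = outer(A1, A2)  # full 3*3 kernel, 9 storage units

storage_full = len(A) * len(A[0])
storage_separable = len(A1) + len(A2)

patch = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
same = apply_full(A, patch) == apply_separable(A1, A2, patch)
```

In general an n*n kernel needs n*n storage units when stored directly and 2n units when stored as the two factors, so the saving grows with the kernel size (for the 7*7 kernel of CONV1, 49 units versus 14).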

After being processed by CONV1, CONV2, and CONV3, different images can be input to different connection layers for full connection. As shown in Figure 12, the convolutional network may include connection layer 1 (FC1), connection layer 2 (FC2), and connection layer 3 (FC3).

The left eye image and the right eye image can be input into FC1 after passing through CONV1, CONV2, and CONV3. FC1 may include a combination module (concat), a convolution kernel 1201, PRelu, and a fully connected module 1202. Among them, concat can be used to combine left eye images and right eye images. The face image can be input into FC2 after passing through CONV1, CONV2, and CONV3. FC2 may include a convolution kernel 1203, PRelu, a fully connected module 1204, and a fully connected module 1205. FC2 can perform two full connections on face images. The face mesh data can be input into FC3 after passing through CONV1, CONV2, and CONV3. FC3 includes a fully connected module.

Connection layers with different structures are constructed for different types of images (such as left eye, right eye, and face images), which can better capture the features of each type of image and thereby improve the accuracy of the model, so that the terminal 100 can identify the user's eye gaze position more accurately.

Then, the full connection module 1206 can perform another full connection on the left eye image, the right eye image, the face image, and the face grid data, and finally output the position coordinates of the eyeball gaze position. The eyeball gaze position indicates the abscissa and ordinate of the focus of the user's sight on the screen, refer to the cursor point S shown in Figure 1. Furthermore, when the eyeball gaze position is within the control area (control such as icon, window, etc.), the terminal 100 can determine that the user is gazing at the control.

In addition, the convolutional neural network of the eye gaze recognition model used in this application has relatively few parameters. Therefore, the time required to compute and predict the user's eye gaze position with the model is short; that is, the terminal 100 can quickly determine the user's eye gaze position, and then quickly determine whether the user intends to open commonly used applications and commonly used interfaces through eye gaze control.

In this application example:

The first preset area and the second preset area may be any two different areas of the upper left corner area, the upper right corner area, the lower left corner area, and the lower right corner area of the screen;

Referring to Figure 3A, the first interface and the fourth interface can be any two different interfaces among the main interfaces such as the first desktop (page 31), the second desktop (page 32), the negative screen (page 30), etc.;

The second interface, the third interface, the fifth interface, and the sixth interface can be any one of the following interfaces: the payment interface shown in Figure 2G, the health code interface shown in Figure 2I, the health detection record interface shown in Figure 5G, as well as various commonly used interfaces such as the ride code interface shown in the drawings;

Taking as an example a payment interface that is set as private, before the payment interface is displayed, referring to Figure 2D, the shortcut window 225 displayed by the electronic device on the first desktop can be called the first control; referring to Figure 3D, the icon 331 can also be called the first control; alternatively, a shortcut to an application that provides a payment interface may also be called a first control.

Figure 14 is a schematic system structure diagram of the terminal 100 according to the embodiment of the present application.

The layered architecture divides the system into several layers, and each layer has clear roles and division of labor. The layers communicate through software interfaces. In some embodiments, the system is divided into five layers, from top to bottom: application layer (application layer), application framework layer (framework layer), hardware abstraction layer, driver layer and hardware layer.

The application layer can include multiple applications, such as a dialer application, a gallery application, and so on. In this embodiment of the present application, the application layer also includes an eye gaze SDK (software development kit). The system of the terminal 100 and third-party applications installed on the terminal 100 can identify the user's eyeball gaze position by calling the eye gaze SDK.

The framework layer provides application programming interface (API) and programming framework for applications in the application layer. The framework layer includes some predefined functions. In this embodiment of the present application, the framework layer may include a camera service interface and an eyeball gaze service interface. The camera service interface is used to provide an application programming interface and programming framework for using the camera. The eye gaze service interface provides an application programming interface and programming framework that uses the eye gaze recognition model.

The hardware abstraction layer is the interface layer between the framework layer and the driver layer, providing a virtual hardware platform for the operating system. In this embodiment of the present application, the hardware abstraction layer may include a camera hardware abstraction layer and an eye gaze process. The camera hardware abstraction layer can provide virtual hardware for camera device 1 (RGB camera), camera device 2 (TOF camera), or more camera devices. The computation of identifying the user's eye gaze position through the eye gaze recognition module is performed in the eye gaze process.

The driver layer is the layer between hardware and software. The driver layer includes drivers for various hardware. The driver layer may include camera device drivers. The camera device driver is used to drive the sensor of the camera to collect images and drive the image signal processor to preprocess the images.

The hardware layer includes sensors and secure data buffers. Among them, the sensors include RGB camera (ie 2D camera) and TOF camera (ie 3D camera). The camera included in the sensor corresponds to the virtual camera device included in the camera hardware abstraction layer one-to-one. RGB cameras capture and generate 2D images. TOF camera is a depth-sensing camera that can collect and generate 3D images with depth information.

Data collected by the camera is stored in a secure data buffer. When any upper-layer process or application needs the image data collected by the camera, it must obtain the data from the secure data buffer and cannot obtain it by other means. The secure data buffer can therefore prevent abuse of the image data collected by the camera, which is why it is called a secure data buffer.

The software layers introduced above and the modules or interfaces included in each layer run in a rich execution environment (Rich Execution Environment, REE). The terminal 100 also includes a trusted execution environment (Trusted Execution Environment, TEE). Data communication in the TEE is more secure than in the REE.

The TEE can include an eye gaze recognition algorithm module, a trusted application (Trusted Application, TA) module, and a security service module. The eye gaze recognition algorithm module stores the executable code of the eye gaze recognition model. The TA can be used to securely send the recognition results output by the above model to the eye gaze process. The security service module can be used to securely input the image data stored in the secure data buffer to the eye gaze recognition algorithm module.

The following is a detailed description of the interaction method based on eye gaze recognition in the embodiment of the present application in combination with the above hardware structure and system structure:

The terminal 100 detects that the trigger condition for turning on eye gaze recognition is met. Accordingly, the terminal 100 may determine to perform the eye gaze recognition operation.

First, the terminal 100 can call the eye gaze service through the eye gaze SDK.

On the one hand, the eye gaze service can call the camera service of the framework layer to collect and obtain the user's facial image through the camera service. The camera service can send instructions to start the RGB camera and the TOF camera by calling camera device 1 (RGB camera) and camera device 2 (TOF camera) in the camera hardware abstraction layer. The camera hardware abstraction layer sends these instructions to the camera device driver of the driver layer. The camera device driver can start the cameras according to the above instructions. The instruction sent by camera device 1 to the camera device driver can be used to start the RGB camera. The instruction sent by camera device 2 to the camera device driver can be used to start the TOF camera. After the RGB camera and the TOF camera are turned on, they collect light signals, and the image signal processor generates two-dimensional images and three-dimensional images from the resulting electrical signals.

On the other hand, the eye gaze service creates an eye gaze process and initializes the eye gaze recognition model.

Images (two-dimensional images and three-dimensional images) generated by the image signal processor can be stored in a secure data buffer. After the eye gaze process is created and initialized, the image data stored in the secure data buffer can be transmitted to the eye gaze recognition algorithm through the secure transmission channel (TEE) provided by the security service. After receiving the image data, the eye gaze recognition algorithm can input the above image data into the eye gaze recognition model established based on CNN to determine the user's eye gaze position. Then, TA safely returns the above-mentioned eye gaze position to the eye gaze process, and then returns it to the application layer eye gaze SDK through the camera service and eye gaze service.

Finally, the eye gaze SDK can determine, based on the received eye gaze position, the area or the control (such as an icon or a window) that the user is looking at, and then determine the display action associated with that area or control.
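A minimal Python sketch of this last step is given below, assuming that each area or control is a rectangle in screen coordinates. The type PresetArea, the function resolve_gaze, the area names, the coordinates and the action strings are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class PresetArea:
    """A rectangular screen region tied to a display action (hypothetical)."""
    name: str
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive
    action: str   # the display action associated with the region

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def resolve_gaze(position: Tuple[float, float],
                 areas: Sequence[PresetArea]) -> Optional[PresetArea]:
    """Return the first area containing the gaze position, or None."""
    x, y = position
    for area in areas:
        if area.contains(x, y):
            return area
    return None


# Illustrative layout: two shortcut windows side by side.
AREAS = [
    PresetArea("payment_window", 60, 300, 520, 760, "show_payment_code"),
    PresetArea("health_window", 560, 300, 1020, 760, "show_health_code"),
]

hit = resolve_gaze((200.0, 500.0), AREAS)
print(hit.action if hit else "no action")
```

Returning None when no area is hit lets the caller leave the current interface unchanged when the user looks elsewhere.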

Figure 15 shows a schematic diagram of the hardware structure of the terminal 100.

The terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, etc. The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.

It can be understood that the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the terminal 100. In other embodiments of the present application, the terminal 100 may include more or fewer components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently. The components illustrated may be implemented in hardware, software, or a combination of software and hardware.

The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural network processing unit (NPU), etc. Different processing units can be independent devices or can be integrated in one or more processors.

The controller can generate operation control signals based on the instruction operation code and timing signals to complete the control of fetching and executing instructions.

The processor 110 may also be provided with a memory for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving system efficiency.
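The benefit of keeping recently used data close to the processor can be illustrated with a small Python model. This is a sketch only: the class SmallCache, its capacity and the least-recently-used eviction rule are assumptions chosen for illustration, and the application does not state which replacement policy the cache in the processor 110 uses.

```python
from collections import OrderedDict
from typing import Dict


class SmallCache:
    """Toy model of a processor-side cache in front of a slower memory."""

    def __init__(self, capacity: int, backing: Dict[int, int]) -> None:
        self._capacity = capacity
        self._backing = backing  # stands in for main memory
        self._lines: "OrderedDict[int, int]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address: int) -> int:
        if address in self._lines:
            # Recently used data is served directly from the cache.
            self._lines.move_to_end(address)
            self.hits += 1
            return self._lines[address]
        # Miss: fetch from the slower backing memory and keep a copy.
        self.misses += 1
        value = self._backing[address]
        self._lines[address] = value
        if len(self._lines) > self._capacity:
            self._lines.popitem(last=False)  # evict the least recently used entry
        return value


memory = {address: address * 10 for address in range(8)}
cache = SmallCache(capacity=2, backing=memory)
values = [cache.read(a) for a in (0, 1, 0, 2, 1)]
print(values, cache.hits, cache.misses)
```

In this access pattern only the second read of address 0 is a hit; the other four reads go to the backing memory.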

In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.

It can be understood that the interface connection relationships between the modules illustrated in the embodiment of the present invention are only schematic illustrations and do not constitute a structural limitation on the terminal 100. In other embodiments of the present application, the terminal 100 may also adopt interface connection methods different from those in the above embodiments, or a combination of multiple interface connection methods.

The charging management module 140 is used to receive charging input from the charger. The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.

The wireless communication function of the terminal 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.

Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals. The mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied to the terminal 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc. The mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.

A modem processor may include a modulator and a demodulator. Among them, the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
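As a minimal sketch of what a modulator and a demodulator do, the following Python snippet shifts a bit sequence onto a carrier and recovers it again. It uses binary phase-shift keying with a coherent receiver; the scheme, the sample counts and the function names are assumptions for illustration, since the application does not specify a modulation scheme.

```python
import math
from typing import List

SAMPLES_PER_BIT = 16  # hypothetical sampling resolution
CYCLES_PER_BIT = 4    # hypothetical number of carrier cycles per bit


def carrier(n: int) -> float:
    """Carrier sample at index n within one bit period."""
    return math.cos(2 * math.pi * CYCLES_PER_BIT * n / SAMPLES_PER_BIT)


def modulate(bits: List[int]) -> List[float]:
    """Map each baseband bit to +1 or -1 and multiply it onto the carrier."""
    signal: List[float] = []
    for bit in bits:
        level = 1.0 if bit else -1.0
        signal.extend(level * carrier(n) for n in range(SAMPLES_PER_BIT))
    return signal


def demodulate(signal: List[float]) -> List[int]:
    """Correlate each bit period with the carrier; the sign gives the bit."""
    bits: List[int] = []
    for start in range(0, len(signal), SAMPLES_PER_BIT):
        chunk = signal[start:start + SAMPLES_PER_BIT]
        correlation = sum(s * carrier(n) for n, s in enumerate(chunk))
        bits.append(1 if correlation > 0 else 0)
    return bits


sent = [1, 0, 1, 1, 0]
received = demodulate(modulate(sent))
print(received == sent)
```

The sketch ignores noise, filtering and amplification, which in the terminal 100 are handled around the modem processor by the mobile communication module 150.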

The wireless communication module 160 can provide wireless communication solutions applied to the terminal 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) and other technologies. The wireless communication module 160 receives electromagnetic waves via the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency-modulate and amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.

In some embodiments, the antenna 1 of the terminal 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the terminal 100 can communicate with the network and other devices through wireless communication technology. The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the Beidou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).

The terminal 100 implements the display function through the GPU, the display screen 194, and the application processor. The GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.

The display screen 194 is used to display images, videos, etc. The display screen 194 includes a display panel. The display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), etc. In some embodiments, the terminal 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.

In the embodiment of the present application, the terminal 100 uses the display functions provided by the GPU, the display screen 194, and the application processor to display the user interfaces shown in Figures 2A-2I, 3A-3E, 4A-4D, 5A-5M, 6A-6I and 7A-7C.

The terminal 100 can implement the shooting function through the ISP, camera 193, video codec, GPU, display screen 194, application processor, etc. In the embodiment of the present application, the camera 193 includes an RGB camera (2D camera) that generates two-dimensional images and a TOF camera (3D camera) that generates three-dimensional images.

The ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, the light is transmitted to the camera sensor through the lens, the optical signal is converted into an electrical signal, and the camera sensor passes the electrical signal to the ISP for processing, and converts it into an image visible to the naked eye. ISP can also perform algorithm optimization on image noise and brightness. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.

Camera 193 is used to capture still images or video. The object passes through the lens to produce an optical image that is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal. ISP outputs digital image signals to DSP for processing. DSP converts digital image signals into standard RGB, YUV and other format image signals. In some embodiments, the terminal 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
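The conversion between RGB and YUV representations mentioned above can be sketched in Python as follows. The coefficients are the commonly cited BT.601 luma and chroma weights; using this particular variant is an assumption, since the application does not state which conversion the DSP applies, and the function name rgb_to_yuv is hypothetical.

```python
from typing import Tuple


def rgb_to_yuv(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert one RGB pixel to YUV using BT.601-style weights (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted brightness
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v


# A neutral gray pixel carries luma only, so both chroma values are near zero.
print(rgb_to_yuv(128, 128, 128))
print(rgb_to_yuv(255, 0, 0))
```

Separating brightness from color in this way is what allows later stages, such as video codecs, to treat the two kinds of information differently.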

Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. Video codecs are used to compress or decompress digital video. Terminal 100 may support one or more video codecs.

NPU is a neural network (NN) computing processor. By drawing on the structure of biological neural networks, such as the transmission mode between neurons in the human brain, it can quickly process input information and can continuously learn by itself. The NPU can realize intelligent cognitive applications of the terminal 100, such as image recognition, face recognition, speech recognition, text understanding, etc.

In this embodiment of the present application, the terminal 100 collects the user's facial image through the shooting capability provided by the ISP and the camera 193. The terminal 100 can execute the eye gaze recognition algorithm through the NPU, and then identify the user's eye gaze position from the collected facial image.

The internal memory 121 may include one or more random access memories (RAM) and one or more non-volatile memories (NVM).

Random access memory can include static random-access memory (SRAM), dynamic random-access memory (DRAM), synchronous dynamic random-access memory (SDRAM), double data rate synchronous dynamic random-access memory (DDR SDRAM; for example, the fifth generation of DDR SDRAM is generally called DDR5 SDRAM), etc. Non-volatile memory can include disk storage devices and flash memory.

The random access memory can be directly read and written by the processor 110, can be used to store executable programs (such as machine instructions) of the operating system or other running programs, and can also be used to store user and application data, etc. The non-volatile memory can also store executable programs and user and application program data, etc., and can be loaded into the random access memory in advance for direct reading and writing by the processor 110.

In the embodiment of the present application, the application code of the eye gaze SDK can be stored in a non-volatile memory. When the eye gaze SDK is run to call the eye gaze service, the application code of the eye gaze SDK can be loaded into random access memory. Data generated when running the above code can also be stored in random access memory.

The external memory interface 120 can be used to connect an external non-volatile memory to expand the storage capability of the terminal 100 . The external non-volatile memory communicates with the processor 110 through the external memory interface 120 to implement the data storage function. For example, save music, video and other files in external non-volatile memory.

The terminal 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor.

The audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals. The speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals. The user can listen to music or to a hands-free call through the speaker 170A of the terminal 100. The receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals. When the terminal 100 answers a call or plays a voice message, the voice can be heard by bringing the receiver 170B close to the human ear. The microphone 170C, also called a "mic" or "mouthpiece", is used to convert sound signals into electrical signals. In this embodiment of the present application, when the terminal 100 is in the screen-off state or the screen-off AOD state, the terminal 100 can obtain the audio signal in the environment through the microphone 170C, and then determine whether the user's voice wake-up word is detected. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C. The headphone interface 170D is used to connect wired headphones.

The pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals. The gyro sensor 180B may be used to determine the angular velocity of the terminal 100 around three axes (i.e., the x, y, and z axes), and thereby determine the motion posture of the terminal 100. The acceleration sensor 180E can detect the acceleration of the terminal 100 in various directions (generally three axes). Therefore, the acceleration sensor 180E can be used to recognize the posture of the terminal 100. In this embodiment of the present application, when the terminal 100 is in the screen-off state or the screen-off AOD state, the terminal 100 can detect whether the user picks up the mobile phone through the acceleration sensor 180E and the gyro sensor 180B, and then determine whether to light up the screen.
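One simple way to combine the two sensors for pick-up detection is sketched below in Python. It is a heuristic under stated assumptions: the function is_pickup, the thresholds and the rule "acceleration deviates from gravity while the device is rotating" are hypothetical and are not described in the application.

```python
import math
from typing import Tuple

Vector3 = Tuple[float, float, float]
GRAVITY = 9.81  # m/s^2, magnitude measured by an accelerometer at rest


def magnitude(v: Vector3) -> float:
    return math.sqrt(sum(component * component for component in v))


def is_pickup(accel: Vector3, gyro: Vector3,
              accel_delta: float = 1.5, gyro_min: float = 1.0) -> bool:
    """Guess whether the device is being picked up.

    accel is in m/s^2 and gyro in rad/s; both thresholds are assumed values.
    """
    moving = abs(magnitude(accel) - GRAVITY) > accel_delta
    rotating = magnitude(gyro) > gyro_min
    return moving and rotating


print(is_pickup((0.0, 0.0, 9.81), (0.0, 0.0, 0.0)))   # lying flat on a table
print(is_pickup((1.0, 3.0, 12.0), (0.5, 1.2, 0.3)))   # lifted and tilted
```

Requiring both conditions avoids lighting up the screen for pure vibration (acceleration without rotation) or for slow rotation on a flat surface.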

Air pressure sensor 180C is used to measure air pressure. Magnetic sensor 180D includes a Hall sensor. The terminal 100 may use the magnetic sensor 180D to detect the opening and closing of a flip cover. Therefore, in some embodiments, when the terminal 100 is a flip phone, the terminal 100 can detect the opening and closing of the flip cover based on the magnetic sensor 180D, and then determine whether to light up the screen.

Distance sensor 180F is used to measure distance. Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector. The terminal 100 can use the proximity light sensor 180G to detect a scene in which the user holds the terminal 100 close to the user, such as a handset conversation. The ambient light sensor 180L is used to sense ambient light brightness. The terminal 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
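Adaptive brightness adjustment of this kind can be sketched in Python as a mapping from measured illuminance to a backlight level. The logarithmic curve, the level range and the saturation point below are assumptions for illustration; the application does not describe the mapping that the terminal 100 uses.

```python
import math

MIN_LEVEL = 10       # hypothetical dimmest backlight level
MAX_LEVEL = 255      # hypothetical brightest backlight level
MAX_LUX = 10_000.0   # hypothetical illuminance at which brightness saturates


def adaptive_brightness(lux: float) -> int:
    """Map ambient illuminance (lux) to a backlight level on a log scale."""
    clamped = min(max(lux, 1.0), MAX_LUX)
    fraction = math.log10(clamped) / math.log10(MAX_LUX)
    return round(MIN_LEVEL + (MAX_LEVEL - MIN_LEVEL) * fraction)


for lux in (0, 5, 50, 500, 5000, 50000):
    print(lux, adaptive_brightness(lux))
```

A logarithmic curve is used here because perceived brightness changes roughly with the ratio of light levels rather than with their difference.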

Fingerprint sensor 180H is used to collect fingerprints. The terminal 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access application lock and other functions. Temperature sensor 180J is used to detect temperature. Bone conduction sensor 180M can acquire vibration signals.

The touch sensor 180K is also called a "touch device". The touch sensor 180K can be disposed on the display screen 194. The touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch-controlled screen". The touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K. The touch sensor can pass the detected touch operation to the application processor to determine the touch event type. Visual output related to the touch operation may be provided through the display screen 194. In other embodiments, the touch sensor 180K may also be disposed on the surface of the terminal 100 in a position different from that of the display screen 194.

In this embodiment of the present application, the terminal 100 uses the touch sensor 180K to detect whether there is a user operation on the screen, such as clicking, sliding and other operations. Based on the user operation on the screen detected by the touch sensor 180K, the terminal 100 can determine the actions to be performed subsequently, such as running a certain application program, displaying the interface of the application program, and so on.
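How a touch operation might be classified as a click or a slide can be sketched in Python as follows. The type TouchSample, the function classify_touch and the 24-pixel movement threshold are hypothetical; the application only states that the touch event type is determined by the application processor.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TouchSample:
    """Position of a finger at touch-down or touch-up (hypothetical type)."""
    x: float
    y: float


def classify_touch(down: TouchSample, up: TouchSample,
                   slop_px: float = 24.0) -> str:
    """Return "slide" if the finger moved farther than slop_px, else "click"."""
    distance = math.hypot(up.x - down.x, up.y - down.y)
    return "slide" if distance > slop_px else "click"


print(classify_touch(TouchSample(100, 100), TouchSample(102, 101)))
print(classify_touch(TouchSample(100, 100), TouchSample(100, 400)))
```

The small tolerance keeps slight finger movement during a tap from being interpreted as a slide.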

The buttons 190 include a power button, a volume button, etc. The buttons 190 may be mechanical buttons or touch buttons. The motor 191 can generate vibration prompts, both for incoming calls and as touch vibration feedback. The indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, messages, missed calls, notifications, etc.

The SIM card interface 195 is used to connect a SIM card. The terminal 100 can support 1 or N SIM card interfaces.

The term "user interface (UI)" in the description, claims and drawings of this application refers to a media interface for interaction and information exchange between an application program or an operating system and a user. It implements conversion between the internal form of information and a form acceptable to the user. The user interface of an application is source code written in specific computer languages such as Java and extensible markup language (XML). The interface source code is parsed and rendered on the terminal device, and finally presented as content that the user can recognize, such as pictures, text, buttons and other controls. A control, also called a widget, is the basic element of a user interface. Typical controls include toolbars, menu bars, text boxes, buttons, scrollbars, images and text. The properties and contents of controls in the interface are defined through tags or nodes. For example, XML specifies the controls contained in the interface through nodes such as <Textview>, <ImgView>, and <VideoView>. A node corresponds to a control or property in the interface. After parsing and rendering, the node is presented as user-visible content. In addition, many applications, such as hybrid applications, often include web pages in their interfaces. A web page, also known as a page, can be understood as a special control embedded in an application interface. A web page is source code written in specific computer languages, such as hypertext markup language (HTML), cascading style sheets (CSS), and JavaScript (JS). Web page source code can be loaded and displayed as user-recognizable content by a browser or by a web page display component with functions similar to those of a browser. The specific content contained in the web page is also defined through tags or nodes in the web page source code. For example, HTML defines the elements and attributes of the web page through <p>, <img>, <video>, and <canvas>.
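The idea that each node in the interface source code corresponds to a control can be illustrated with a short Python sketch that parses an XML fragment. The node names follow the examples given above; the root element, the attribute names and values, and the function list_controls are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Hypothetical interface description; only the node names come from the text.
LAYOUT = """
<Layout>
    <Textview text="Payment code" />
    <ImgView src="qr_code.png" />
    <VideoView src="intro.mp4" />
</Layout>
"""


def list_controls(layout_xml: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return (node name, attributes) for each control under the root node."""
    root = ET.fromstring(layout_xml.strip())
    return [(node.tag, dict(node.attrib)) for node in root]


for tag, attributes in list_controls(LAYOUT):
    print(tag, attributes)
```

A real rendering pipeline would go on to create a visual element for each node; the sketch stops at the parsing step.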

The commonly used form of user interface is the graphical user interface (GUI), which refers to a user interface related to computer operations that is displayed graphically. It can be an interface element such as an icon, window or control displayed on the display screen of the terminal device. Controls can include visual interface elements such as icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, and widgets.

As used in the specification and appended claims of this application, the singular expressions "a", "an", "said", "the above", "the" and "this" are intended to also include plural expressions unless the context clearly indicates otherwise. It will also be understood that the term "and/or" as used in this application refers to and includes any and all possible combinations of one or more of the listed items. As used in the above embodiments, the term "when" may be interpreted to mean "if..." or "after" or "in response to determining..." or "in response to detecting..." depending on the context. Similarly, depending on the context, the phrase "when determining..." or "if (stated condition or event) is detected" may be interpreted to mean "if it is determined..." or "in response to determining..." or "on detecting (stated condition or event)" or "in response to detecting (stated condition or event)".

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server or data center through wired (such as coaxial cable, optical fiber, digital subscriber line) or wireless (such as infrared, radio, microwave, etc.) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disk, hard disk, tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state drive), etc.

Those of ordinary skill in the art can understand that all or part of the processes of the methods in the above embodiments can be completed by a computer program instructing relevant hardware. The program can be stored in a computer-readable storage medium. When executed, the program can include the processes of the above method embodiments. The aforementioned storage media include ROM, random access memory (RAM), magnetic disks, optical disks and other media that can store program code.

Claims

A display method, applied to an electronic device, the electronic device including a screen, characterized in that the screen of the electronic device includes a first preset area, the method includes:

Display the first interface;

When displaying the first interface, the electronic device collects a first image;

Determine a first eye gaze area of the user based on the first image, the first eye gaze area being used to indicate an area of the screen that the user is looking at when the user is looking at the screen;

When the first eyeball gaze area is within the first preset area, the second interface is displayed.
The method of claim 1, wherein the screen of the electronic device includes a second preset area, and the second preset area is different from the first preset area, and the method further includes:

Determine a second eye gaze area of the user based on the first image, the second eye gaze area having a different position on the screen than the first eye gaze area on the screen;

When the second eyeball gaze area is within the second preset area, a third interface is displayed, and the third interface is different from the second interface.
The method of claim 2, wherein the second interface and the third interface are interfaces provided by the same application, or the second interface and the third interface are interfaces provided by different applications.
The method according to any one of claims 1-3, characterized in that the method further includes:

Display the fourth interface;

When the fourth interface is displayed, the electronic device collects a second image;

Determining a third eye gaze area of the user based on the second image;

When the third eye gaze area is within the first preset area, a fifth interface is displayed, and the fifth interface is different from the second interface.
The method of claim 1, wherein displaying the second interface when the first eyeball gaze area is within the first preset area includes: when the first eyeball gaze area is within the first preset area, and the duration of gazing at the first preset area is a first duration, displaying the second interface.
The method of claim 5, further comprising: when the first eyeball gaze area is within the first preset area, and the duration of gazing at the first preset area is a second duration, displaying a sixth interface.
The method according to any one of claims 1 to 6, characterized in that the first eyeball gaze area is a cursor point formed by one display unit on the screen, or the first eyeball gaze area is a cursor point or cursor area formed by multiple display units on the screen.
The method of claim 2, wherein the second interface is a non-private interface, and the method further includes: displaying an interface to be unlocked; and when displaying the interface to be unlocked, the electronic device collects a third image;

Determining a fourth eye gaze area of the user based on the third image;

When the fourth eyeball gaze position is within the first preset area, the second interface is displayed.
The method according to claim 8, wherein the third interface is a privacy interface, and the method further includes:

When the fourth eyeball gaze position is within the second preset area, the third interface is not displayed.
The method of claim 2, wherein both the second interface and the third interface are privacy interfaces; and the electronic device does not enable a camera to acquire images when displaying an interface to be unlocked.
The method according to any one of claims 8-10, characterized in that a first control is displayed in the second preset area of the first interface, and the first control is used to indicate that the second preset area is associated with the third interface.
The method according to claim 11, characterized in that the first control is not displayed in the second preset area of the interface to be unlocked.
The method according to claim 11 or 12, characterized in that the first control is any one of the following items: a thumbnail of the first interface, an icon of an application corresponding to the first interface, or a function icon indicating a service provided by the first interface.
The method of claim 1, wherein the electronic device collects images within a first preset duration; and the electronic device collecting the first image specifically includes: the electronic device collects the first image within the first preset duration.
The method of claim 14, wherein the first preset duration is 3 seconds before the first interface is displayed.
The method of claim 1, wherein the electronic device includes a camera module, and the electronic device collects the first image through the camera module, and the camera module includes: at least one 2D camera and at least one 3D camera, the 2D camera is used to obtain a two-dimensional image, and the 3D camera is used to obtain an image including depth information; the first image includes the two-dimensional image and the image including depth information.
The method according to claim 16, characterized in that the first image acquired by the camera module is stored in a secure data buffer,

Before determining the user's first eye gaze area based on the first image, the method further includes:

The first image is obtained from the secure data buffer under a trusted execution environment.
The method of claim 17, wherein the secure data buffer is provided at a hardware layer of the electronic device.
The method of claim 1, wherein determining the user's first eye gaze area based on the first image specifically includes:

Determine the characteristic data of the first image, the characteristic data including one or more of a left eye image, a right eye image, a face image, and face grid data;

An eye gaze recognition model is used to determine the first eye gaze area indicated by the feature data, and the eye gaze recognition model is established based on a convolutional neural network.
The method according to claim 19, wherein determining the characteristic data of the first image specifically includes:

Perform face correction on the first image to obtain a first image with an upright face;

Determine the characteristic data of the first image based on the first image with the upright face.
The method of claim 4, wherein the first interface is any one of a first desktop, a second desktop, or a negative screen; and the fourth interface is any one of a first desktop, a second desktop, or a negative screen, and is different from the first interface.
The method according to claim 4, characterized in that the association between the first preset area, the second interface and the fifth interface is set by the user.
An electronic device, characterized by comprising one or more processors and one or more memories; wherein the one or more memories are coupled to the one or more processors, and the one or more memories are used for storing computer program code, the computer program code including computer instructions that, when executed by the one or more processors, cause the method of any one of claims 1-22 to be performed.
A computer-readable storage medium comprising instructions, characterized in that when the instructions are run on an electronic device, the method according to any one of claims 1-22 is executed.